Technical reports
GTVS: boosting the collection of application traffic ground truth
Marco Canini, Wei Li, Andrew W. Moore
April 2009, 20 pages
DOI: 10.48456/tr-748
Abstract
Interesting research in the areas of traffic classification, network monitoring, and application-oriented analysis cannot proceed without real trace data labeled with actual application information. However, hand-labeled traces are an extremely valuable but scarce resource in the traffic monitoring and analysis community, as a result of both privacy concerns and technical difficulties. Hardly any possibility exists for payloaded data to be released to the public, while the intensive labor required to extract the ground-truth application information from the data severely constrains the feasibility of releasing anonymized versions of hand-labeled payloaded data.
The usual way to obtain the ground truth is fragile and inefficient, and its results are not directly comparable from one work to another. This paper proposes and details a methodology that significantly boosts the efficiency of compiling the application traffic ground truth. In contrast with existing work, our approach maintains the high certainty of hand-verification while striving to reduce the time and labor it requires. Further, it is implemented as an easy-to-use tool suite, which is now freely available to the public.
In this paper we present a case study using a 30-minute real data trace to guide readers through our ground-truth classification process. We also present a method, an extension of GTVS, that efficiently classifies HTTP traffic by its purpose.
Full text
PDF (0.3 MB)
BibTeX record
@TechReport{UCAM-CL-TR-748,
  author =      {Canini, Marco and Li, Wei and Moore, Andrew W.},
  title =       {{GTVS: boosting the collection of application traffic ground truth}},
  year =        2009,
  month =       apr,
  url =         {https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-748.pdf},
  institution = {University of Cambridge, Computer Laboratory},
  doi =         {10.48456/tr-748},
  number =      {UCAM-CL-TR-748}
}