Department of Computer Science and Technology

Technical reports

GTVS: boosting the collection of application traffic ground truth

Marco Canini, Wei Li, Andrew W. Moore

April 2009, 20 pages

DOI: 10.48456/tr-748


Interesting research in the areas of traffic classification, network monitoring, and application-oriented analysis cannot proceed without real trace data labeled with actual application information. However, hand-labeled traces are an extremely valuable but scarce resource in the traffic monitoring and analysis community, as a result of both privacy concerns and technical difficulties: there is hardly any possibility of payloaded data being released to the public, while the intensive labor required to extract ground-truth application information from the data severely constrains the feasibility of releasing anonymized versions of hand-labeled payloaded data.

The usual way to obtain the ground truth is fragile, inefficient, and not directly comparable from one work to another. This chapter proposes and details a methodology that significantly boosts the efficiency of compiling the application traffic ground truth. In contrast with other existing work, our approach maintains the high certainty of hand-verification while striving to reduce the time and labor it requires. Further, it is implemented as an easy-to-use, hands-on tool suite, which is now freely available to the public.

In this paper we present a case study using a 30-minute real data trace to guide readers through our ground-truth classification process. We also present a method, an extension of GTVS, that efficiently classifies HTTP traffic by its purpose.

BibTeX record

  author =	 {Canini, Marco and Li, Wei and Moore, Andrew W.},
  title = 	 {{GTVS: boosting the collection of application traffic
         	   ground truth}},
  year = 	 2009,
  month = 	 apr,
  url = 	 {},
  institution =  {University of Cambridge, Computer Laboratory},
  doi = 	 {10.48456/tr-748},
  number = 	 {UCAM-CL-TR-748}