Course pages 2016–17
Social and Technological Network Data Analytics
1 - Data Analysis Report
Every student will analyze an assigned dataset and report the analysis in a document of no more than 4,000 words, in which the results are discussed and justified. The report is worth 70% of the final mark.
Each student is encouraged to come up with original project proposals which should be inspired by the material in the lectures.
A suggested (but not prescriptive) structure for the report is:
- Introduction and background to the idea under investigation, including related work if any.
- Introduction to the dataset and basic/classic analysis.
- Analysis report of the idea under investigation.
- Discussion of limitations and possible other ideas.
Marks will be distributed across these components as follows:
- Background and motivation reporting [10%]
- Basic analysis [30%]
- Analysis results report [50%]
- Discussion of limitations and future work [10%]
The report should be submitted via Moodle.
Deadline for choice: 10th February 2017
Deadline for submission: 13th March 2017
Datasets
Datasets will be assigned on a first-come, first-served basis. Multiple students might be assigned to the same dataset (or a portion of the same dataset).
- Airbnb: Airbnb data of listings, reviews, geographical data of various cities. Information here.
- Facebook: friendship connections within one regional network and information about interaction events between users, such as likes and comments. (657,681 nodes and 1,302,764 undirected edges)
- Foursquare: Foursquare venues with timestamped transitions for New York.
- Human Smugglers Network: network of phone calls made by suspects operating along the Eastern Mediterranean route. It includes 8,943 nodes and 82,979 ties (of which 48 nodes and 10,319 ties constitute the sub-network of suspects). There is a time-stamp for each phone call (Summer 2011). There is also information on the role played by each smuggler and the country of residence.
- Protein-protein interaction: network of protein-protein interactions in budding yeast. Information here.
- Taxi Rides: New York taxi trip records of pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts. Information here.
- Stack Overflow temporal network: temporal network of interactions on the Stack Exchange website Stack Overflow. Network information here.
- Wikipedia: discussion network between Wikipedia users. (2,394,385 nodes and 5,021,410 directed edges)
2 - Presentation
Every student will prepare a brief oral presentation of the project findings. Each presentation MUST last 8 minutes, including questions: thus, a reasonable combination would be a 6-minute talk followed by 2 minutes for questions.
The presentations will take place on 14th March 2017.
Slides must be sent by email to cm542 in PDF format before 23:59 on 13th March 2017.
The presentation is worth 30% of the final mark and it will be evaluated along these categories:
- Presentation skills [20%]
- General knowledge of the subject [20%]
- Discussion of findings [40%]
- Questions and answers [20%]
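As a rough illustration of how the weightings above combine, the sketch below computes an overall mark from per-component scores. It assumes each component is scored on a 0-100 scale and that components combine linearly; the page does not state the scoring scale, and the function and dictionary names are hypothetical.

```python
# Weights taken from the report and presentation breakdowns on this page.
REPORT_WEIGHT = 0.70
PRESENTATION_WEIGHT = 0.30

REPORT_COMPONENTS = {
    "background_and_motivation": 0.10,
    "basic_analysis": 0.30,
    "analysis_results": 0.50,
    "limitations_and_future_work": 0.10,
}

PRESENTATION_COMPONENTS = {
    "presentation_skills": 0.20,
    "general_knowledge": 0.20,
    "discussion_of_findings": 0.40,
    "questions_and_answers": 0.20,
}


def weighted_score(scores, weights):
    """Weighted sum of component scores (assumed to be on a 0-100 scale)."""
    return sum(weights[name] * scores[name] for name in weights)


def final_mark(report_scores, presentation_scores):
    """Combine report (70%) and presentation (30%) into one mark."""
    report = weighted_score(report_scores, REPORT_COMPONENTS)
    presentation = weighted_score(presentation_scores, PRESENTATION_COMPONENTS)
    return REPORT_WEIGHT * report + PRESENTATION_WEIGHT * presentation
```

Under these assumptions, a component's share of the final mark is its section weight times its component weight: the analysis results report, for instance, accounts for 0.70 × 0.50 = 35% of the overall mark.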