Social and Technological Network Analysis
MPhil course - 2011/12
1 - Research paper report
Every student will prepare one report (of approximately 1,500 words) on one assigned research paper. The report is due at the end of the course and it is worth 30% of the final mark.
The report will contain two parts of about 750 words each:
- Critical analysis of the paper, possibly including comparisons with and references to other material presented in the course or found by the student, and comments on how solid the results are (e.g., comments on the evaluation methods or on the analysis applied).
- Discussion of possible future research ideas in the area.
The list below is a selection of mainly recent research papers on social and technological networks. Choose any paper that is still available and email your choice to cm542. Reports must be submitted in PDF format by email to cm542.
- D. Liben-Nowell, J. Kleinberg. The Link Prediction Problem for Social Networks. Proc. CIKM, 2003.
- D. Liben-Nowell, J. Novak, R. Kumar, P. Raghavan, A. Tomkins. Geographic routing in social networks. Proc. Natl. Acad. Sci., 102, 2005.
- J. Leskovec, L. Adamic, B. Huberman. The Dynamics of Viral Marketing. TWEB, 2007.
- J. Leskovec, L. Backstrom, R. Kumar, A. Tomkins. Microscopic evolution of social networks. In Proceedings of KDD’08, pages 462–470, New York, NY, USA, 2008. ACM.
- J. Leskovec, E. Horvitz. Worldwide Buzz: Planetary-Scale Views on an Instant-Messaging Network. Proc. International WWW Conference, 2008.
- M.E.J. Newman. The first-mover advantage in scientific publication. Europhysics Letters 86, 68001, 2008.
- Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, Ben Y. Zhao. User Interactions in Social Networks and their Implications. ACM EuroSys, 2009.
- A. Goyal, F. Bonchi, L.V.S. Lakshmanan. Learning influence probabilities in social networks. In Proc. WSDM, 2010.
- Rongjing Xiang, Jennifer Neville, Monica Rogati. Modeling Relationship Strength in Online Social Networks. In Proc. WWW, 2010.
- Alessandra Sala, Lili Cao, Christo Wilson, Robert Zablit, Haitao Zheng, Ben Zhao. Measurement-calibrated Graph Models for Social Network Experiments. In Proc. WWW, 2010.
- Munmun De Choudhury, Winter Mason, Jake Hofman, Duncan Watts. Inferring Relevant Social Networks from Interpersonal Communication. In Proc. WWW, 2010.
- J. Leskovec, D. Huttenlocher, J. Kleinberg. Signed Networks in Social Media. In Proc. CHI, 2010.
- D.M. Romero and J. Kleinberg. The Directed Closure Process in Hybrid Social-Information Networks, with an Analysis of Link Formation on Twitter. Proc. 4th International AAAI Conference on Weblogs and Social Media, 2010.
- J. Leskovec, D. Huttenlocher, J. Kleinberg. Predicting Positive and Negative Links in Online Social Networks. In Proc. WWW, 2010.
- Shuang-hong Yang, Bo Long, Alex Smola, Narayanan Sadagopan, Zhaohui Zheng and Hongyuan Zha. Like like alike: Joint friendship and interest propagation in social networks. Proceedings of WWW 2011, 2011.
- J. Cheng, D. Romero, B. Meeder, J. Kleinberg. Predicting Reciprocity in Social Networks. Proc. 3rd IEEE Conference on Social Computing, 2011.
- E. Sadikov, M. Medina, J. Leskovec, H. Garcia-Molina. Correcting for Missing Data in Information Cascades. ACM International Conference on Web Search and Data Mining (WSDM), 2011.
- Paul Expert, Tim S. Evans, Vincent D. Blondel, Renaud Lambiotte. Uncovering space-independent communities in spatial networks. Proceedings of the National Academy of Sciences, Vol. 108, No. 19. (10 May 2011), pp. 7663-7668.
- Lars Backstrom and Jon Kleinberg. Network Bucket Testing. Proceedings of WWW 2011, 2011.
- Eytan Bakshy, Jake M Hofman, Duncan J Watts, Winter A Mason. Identifying Influencers on Twitter. ACM International Conference on Web Search and Data Mining (WSDM), 2011.
- Jukka-Pekka Onnela, Samuel Arbesman, Marta C. González, Albert-László Barabási, Nicholas A. Christakis. Geographic Constraints on Social Network Groups. PLoS ONE, Vol. 6, No. 4. (5 April 2011).
- Silvio Lattanzi, Alessandro Panconesi and D. Sivakumar. Milgram-Routing in Social Networks. Proceedings of WWW 2011.
Deadline for choice: 23rd January 2012
Deadline for submission: 10th February 2012
2 - Project
Every student will complete a project consisting of the analysis of an assigned dataset, using NetworkX, according to a set of indicated network measures. The analysis should be reported in a document of about 1,500 words in which the results are discussed and justified. The project is worth 60% of the final mark.
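As a rough illustration of what computing network measures with NetworkX can look like, the sketch below parses a small edge list and reports a few basic statistics. The edge-list format (one whitespace-separated "source target" pair per line, with `#` starting a comment), the toy data and the helper name `basic_measures` are assumptions made for illustration; the assigned datasets may be distributed in a different format.

```python
import networkx as nx

# Toy edge list standing in for a dataset file (assumed format:
# one "source target" pair per line, '#' starts a comment).
EDGE_LINES = [
    "# toy undirected network",
    "1 2",
    "1 3",
    "2 3",
    "3 4",
]


def basic_measures(graph):
    """Return a few standard summary statistics of an undirected graph."""
    return {
        "nodes": graph.number_of_nodes(),
        "edges": graph.number_of_edges(),
        "density": nx.density(graph),
        "average_clustering": nx.average_clustering(graph),
        "components": nx.number_connected_components(graph),
    }


G = nx.parse_edgelist(EDGE_LINES, comments="#", nodetype=int)
stats = basic_measures(G)
for name, value in stats.items():
    print(name, value)
```

For a file on disk, `nx.read_edgelist(path, comments="#", nodetype=int)` plays the same role, and passing `create_using=nx.DiGraph` builds a directed graph instead.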
Each student is encouraged to come up with original project proposals which may be inspired by the ideas reported below.
Amazon: co-purchase product network and all product information. (548,552 products)
- extraction of network communities and analysis of their homogeneity with respect to product categories;
- analysis of potential correlations between network node metrics and product sales/reviews;
- prediction model for product sales given network properties and product characteristics.
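One possible way to approach the first idea is sketched below: communities are extracted with a modularity-based method and, for each community, the share of products belonging to its most common category is reported. The choice of `greedy_modularity_communities`, the purity measure, the helper name `community_purity` and the toy products and categories are all assumptions for illustration, not a prescribed method.

```python
from collections import Counter

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities


def community_purity(communities, category):
    """Fraction of each community's nodes that share its most common category."""
    purities = []
    for members in communities:
        counts = Counter(category[node] for node in members)
        purities.append(counts.most_common(1)[0][1] / len(members))
    return purities


# Toy co-purchase graph: two groups of products joined by a single edge.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("d", "e"), ("d", "f"), ("e", "f"),
    ("c", "d"),
])

# Hypothetical product categories.
category = {"a": "Book", "b": "Book", "c": "Book",
            "d": "Music", "e": "Music", "f": "Book"}

communities = [set(c) for c in greedy_modularity_communities(G)]
purities = community_purity(communities, category)
for members, purity in zip(communities, purities):
    print(sorted(members), round(purity, 2))
```

A purity close to 1 suggests that a community is dominated by a single category; comparing against a random assignment of categories would indicate whether the observed values are meaningful.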
HEP-PH: citation graph among papers in high-energy physics with temporal publication data of each paper. (34,546 nodes and 421,578 directed edges)
- investigation of power-law network structure evolution over time with a generative model;
- analysis of the first-mover advantage for scientific publications;
- analysis/design of ranking algorithms to facilitate search among publications.
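A starting point for studying how the citation structure evolves could be to compare in-degree (citation count) distributions on snapshots of the graph restricted to papers published up to a given year. The sketch below assumes that an edge points from the citing paper to the cited paper and that publication years are available as a dictionary; the toy data and the helper names `snapshot` and `in_degree_distribution` are hypothetical.

```python
from collections import Counter

import networkx as nx


def snapshot(graph, year, up_to):
    """Subgraph induced by the papers published in or before `up_to`."""
    return graph.subgraph([n for n in graph if year[n] <= up_to])


def in_degree_distribution(graph):
    """Map each in-degree value to the number of nodes having it."""
    return Counter(degree for _, degree in graph.in_degree())


# Toy citation graph: an edge u -> v means paper u cites paper v.
G = nx.DiGraph()
G.add_edges_from([("p2", "p1"), ("p3", "p1"), ("p3", "p2"), ("p4", "p1")])

# Hypothetical publication years.
year = {"p1": 1995, "p2": 1996, "p3": 1997, "p4": 1998}

early = in_degree_distribution(snapshot(G, year, 1996))
full = in_degree_distribution(G)
print("up to 1996:", dict(early))
print("all papers:", dict(full))
```

Plotting such distributions on log-log axes for successive years is one way to see whether a heavy tail emerges over time.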
Epinions: trust and distrust signed social network among users on Epinions.com. (131,828 nodes and 841,372 directed edges)
- analysis of the structure of the social network arising from positive, negative and aggregated edges, with an investigation of the correlations among them;
- analysis of social triangles and verification of structural balance theory ("the enemy of my enemy is my friend");
- prediction models of the sign of a social link.
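Structural balance theory classifies a triangle as balanced when it contains an even number of negative edges (zero or two). The sketch below counts triangles by their number of negative edges on a toy signed graph. It assumes that edge direction is ignored and that each edge carries a `sign` attribute equal to +1 or -1; the attribute name and the helper `triangle_sign_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations

import networkx as nx


def triangle_sign_counts(graph):
    """Count triangles by their number of negative edges (0 to 3)."""
    order = {node: i for i, node in enumerate(graph)}
    counts = Counter()
    for u in graph:
        # Only consider neighbours that come after u, so that each
        # triangle is counted once, at its first node.
        later = [v for v in graph[u] if order[v] > order[u]]
        for v, w in combinations(later, 2):
            if graph.has_edge(v, w):
                signs = (graph[u][v]["sign"],
                         graph[u][w]["sign"],
                         graph[v][w]["sign"])
                counts[signs.count(-1)] += 1
    return counts


# Toy signed network on four users.
G = nx.Graph()
G.add_edge("A", "B", sign=+1)
G.add_edge("B", "C", sign=+1)
G.add_edge("A", "C", sign=+1)
G.add_edge("A", "D", sign=-1)
G.add_edge("C", "D", sign=-1)
G.add_edge("B", "D", sign=+1)

counts = triangle_sign_counts(G)
balanced = counts[0] + counts[2]
unbalanced = counts[1] + counts[3]
print("balanced:", balanced, "unbalanced:", unbalanced)
```

To judge whether balanced triangles are over-represented, the observed counts can be compared with those obtained after randomly shuffling the signs over the same edges.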
Facebook: friendship connections within one regional network and information about interaction events between users, such as likes and comments. (657,681 nodes and 1,302,764 undirected edges)
- analysis and comparison of the social network among Facebook users and the network arising from their explicit interactions;
- prediction models of interaction between users;
- analysis of user activity as a function of ego-network properties.
Cond-Mat: collaboration network extracted from the e-print arXiv, covering co-authorship ties from the papers submitted to the Condensed Matter category. (23,133 nodes and 186,936 undirected edges)
- analysis of the social properties of the co-authorship network
- study of correlation between scientific productivity and network position
Enron: email communication network between Enron employees and with external email addresses. (36,692 nodes and 367,662 undirected edges)
- study of the statistical properties of an email network
- analysis of the communication patterns
Roads: road network of Pennsylvania, in which each node is an intersection and each edge is a road. (1,088,092 nodes and 3,083,796 undirected edges)
- characteristics of a planar network
- community detection in a planar network
Web: web hyperlinks between pages in the stanford.edu domain. (281,903 nodes and 2,312,497 directed edges)
- centrality measures in a Web graph
- ranking of Web pages
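As a small illustration of centrality-based ranking, the sketch below computes in-degree and betweenness centrality on a toy directed graph and orders the pages by in-degree. The page names and the choice of these two measures are assumptions for illustration only.

```python
import networkx as nx

# Toy directed web graph: an edge u -> v means page u links to page v.
# Page names are hypothetical.
G = nx.DiGraph()
G.add_edges_from([
    ("home", "about"),
    ("home", "research"),
    ("about", "research"),
    ("research", "home"),
    ("people", "research"),
])

# Fraction of the other pages that link to each page.
in_degree = nx.in_degree_centrality(G)
# Normalised share of shortest paths that pass through each page.
betweenness = nx.betweenness_centrality(G)

# Rank pages from most to least linked-to.
ranking = sorted(G.nodes, key=in_degree.get, reverse=True)
for page in ranking:
    print(page, round(in_degree[page], 3), round(betweenness[page], 3))
```

NetworkX also provides `nx.pagerank` for link-based ranking, which can be compared against these simpler measures.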
Wikipedia: discussion network between Wikipedia users. (2,394,385 nodes and 5,021,410 directed edges)
- statistical properties of a large-scale discussion network
- social network properties of Wikipedia users
Brightkite: social network among Brightkite users, together with a total of 4,491,143 check-ins by these users over the period Apr. 2008 - Oct. 2010. (58,228 nodes and 214,078 undirected edges)
- spatial properties of the social network
- analysis of the relationship between user mobility and social properties
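One basic spatial property is the geographic length of each social link, i.e. the distance between the locations of two connected users. The sketch below computes it with the haversine formula, assuming that each user can be assigned a single (latitude, longitude) position, for instance derived from their check-ins. The user names, coordinates and helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def link_lengths(edges, location):
    """Geographic length of every social link whose endpoints have a location."""
    return {
        (u, v): haversine_km(*location[u], *location[v])
        for u, v in edges
        if u in location and v in location
    }


# Hypothetical users with one (latitude, longitude) position each.
location = {
    "u1": (52.2053, 0.1218),   # roughly Cambridge, UK
    "u2": (51.5074, -0.1278),  # roughly London, UK
}
edges = [("u1", "u2"), ("u1", "u3")]  # u3 has no known location

lengths = link_lengths(edges, location)
for pair, km in lengths.items():
    print(pair, round(km, 1), "km")
```

The distribution of these lengths can then be compared with the distribution of distances between random pairs of users to see how strongly distance constrains friendship.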
Deadline for choice: 13th February 2012
Deadline for submission: 6th March 2012 at noon
3 - Presentation
Every student will prepare a brief oral presentation of the project findings. Each presentation MUST last 8 minutes, including questions: thus, a reasonable combination would be a 6-minute talk followed by 2 minutes for questions. The presentation is worth 10% of the final mark and it will be evaluated along these categories:
- Presentation skills [20%]
- General knowledge of the subject [20%]
- Discussion of findings [40%]
- Questions and answers [20%]
The presentations will take place on 8th March 2012 (11am-1pm) and 9th March 2012 (11am-noon).
Slides must be sent in PDF format before 23:59 on 7th March 2012.