# Research Methods

Lecture 7: graphs, etc

## Exercise 6

In each case you are trying to find the best way to present the data. This may be a graph, chart, or table.

### Data Set 1

This data set contains the number of people who live in eight different types of household in three different regions of Britain. Your job is to present the information in a way that best compares the different types of household for Cambridge, East of England, and England. You want to show all of the categories that are in the data file, but you want particularly to highlight two things: the proportion of people who live in households with children and the proportion of people who live in households that consist of all students.

The data set can be downloaded as a CSV file.
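Because the three regions differ greatly in population, raw counts cannot be compared directly; each count first has to be turned into a proportion of its region's total. The following is a minimal sketch of that step. The household labels and numbers are hypothetical placeholders, not values from the course CSV file, and `proportions` is an illustrative helper name.

```python
def proportions(counts):
    """Convert raw counts per household type into fractions of the regional total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("region has no people recorded")
    return {household: n / total for household, n in counts.items()}


# Hypothetical placeholder counts, NOT taken from the course CSV file.
example_region = {
    "couple with children": 300,
    "lone parent with children": 100,
    "all students": 50,
    "other": 550,
}

shares = proportions(example_region)

# The two figures the exercise asks you to highlight.
with_children = shares["couple with children"] + shares["lone parent with children"]
all_students = shares["all students"]

print(f"people in households with children: {with_children:.1%}")
print(f"people in all-student households:   {all_students:.1%}")
```

Applying the same function to each region would put Cambridge, East of England, and England on a common scale before any chart is drawn.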

### Data Set 2

Present the following data in the most appropriate fashion.

| Country | Area | Population | GDP per person |
|---|---|---|---|
| USA | 3537438 sq. miles | 301139947 | \$43444 |
| Monaco | 1.95 sq. km | 32671 | \$29882.77 |
| Ghana | 230940 sq. km | 22931299 | \$2771 |

The data set can be downloaded as a CSV file.
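One thing worth noticing is that the areas are given in mixed units. A possible first step, sketched below, is to convert everything to square kilometres so the rows can be compared. The figures are copied from the data above; the function and variable names are illustrative, and the output formatting is just one choice, not a recommended final presentation.

```python
# Exact conversion factor: one mile is 1.609344 km, so one square mile is 1.609344**2 sq. km.
SQ_KM_PER_SQ_MILE = 2.589988110336

# (country, area value, area unit, population, GDP per person in dollars)
rows = [
    ("USA", 3537438, "sq. miles", 301139947, 43444.0),
    ("Monaco", 1.95, "sq. km", 32671, 29882.77),
    ("Ghana", 230940, "sq. km", 22931299, 2771.0),
]


def area_in_sq_km(value, unit):
    """Return the area in square kilometres, whatever unit it was given in."""
    if unit == "sq. km":
        return float(value)
    if unit == "sq. miles":
        return value * SQ_KM_PER_SQ_MILE
    raise ValueError(f"unknown area unit: {unit!r}")


normalised = [
    (country, area_in_sq_km(area, unit), population, gdp)
    for country, area, unit, population, gdp in rows
]

# Right-align the numbers and use thousands separators so magnitudes are easy to compare.
print(f"{'Country':<8}{'Area (sq. km)':>16}{'Population':>14}{'GDP/person ($)':>16}")
for country, area, population, gdp in normalised:
    print(f"{country:<8}{area:>16,.2f}{population:>14,}{gdp:>16,.0f}")
```

Whether to round the GDP figures, and to how many significant figures, is a separate presentation decision left to the reader.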

### Data Set 3

This data set contains the number of male and female undergraduate and postgraduate students for the academic years 1968-69 through to 2008-09 (source: Cambridge University Reporter, Special Issue No. 4, 8/10/2009). There are five columns of data: the year, UG men (number of undergraduate male students), UG women (number of undergraduate female students), PG men (number of postgraduate male students), and PG women (number of postgraduate female students).

There is a range of interesting presentations of this data. Having investigated the data, choose a particular "story" that you want to tell and then produce a graph or chart that best presents that story. On the graph or chart, write one sentence that tells me what the "story" is supposed to be.

The data set can be downloaded as a CSV file.

### Data Set 4

This data contains the number of citations to thirty different academic papers as reported by the ISI website and by the Google Scholar website. You are to present this information in a way that best shows the relationship between the number of citations reported by the two websites.

The data set can be downloaded as a CSV file.
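While investigating the data, a numeric summary of the relationship can be a useful companion to whatever chart you choose. The sketch below computes the Pearson correlation coefficient between two lists of counts. The numbers are hypothetical placeholders, not values from the course CSV file, and `pearson` is an illustrative helper name. A single coefficient only measures linear association, so it does not replace looking at the data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical placeholder citation counts, NOT taken from the course CSV file.
isi_counts = [10, 20, 30, 40]
scholar_counts = [25, 45, 65, 85]  # constructed as 2 * ISI + 5, so perfectly linear

r = pearson(isi_counts, scholar_counts)
print(f"Pearson r = {r:.3f}")
```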

## Notes

### Good & Bad Graphs: Concepts

Based on Ross Ihaka's lecture (see above).

• Data content: Small amounts of data do not require graphs. The human brain can easily grasp one, two or three values.
• Data relevance: You cannot produce a good graph from bad data; graphs are only as good as the data they display.
• Complexity: Graphs should be no more complex than the data they display. Avoid "chart junk": irrelevant decoration, unnecessary colour, 3D effects. Use the ink to display the data, not junk.
• Distortion: Graphs should not give a distorted picture of the values they portray.
• Story: Decide what "story" you want the graph to tell. The same data can tell many stories, so ask which one matters for communicating your ideas.
