Computer Laboratory

Research Skills - Graphing exercise

Exercise 4: Graphing

There are two sections to this exercise. The first requires you to correct three graphs. The second requires you to graph two data sets.

Section I - correcting graphs [9 marks]

Each of the graphs in this section has one or more problems that prevent it from being a good graph. For each graph you are required to submit two things:

  1. An explanation, in a single paragraph, of all the problems with the graph as presented.
  2. A good graph of the same data.

Graph 1

Source: data from US census.

Year  Population
1790     3929214
1860    31443321
1890    62979766
1910    92228496
1930   123202624
1950   151325798
1960   179323175
1970   203211926
1980   226545805
1990   248709873
2000   281421906
2010   308745538

Graph 2

Source: "Hematocrit was not validated as a surrogate end point for survival among epoetin-treated hemodialysis patients", Dennis J. Cotter, Kevin Stefanik, Yi Zhang, Mae Thamer, Daniel Scharfstein, James Kaufman, Journal of Clinical Epidemiology 57(10):1086-1095, October 2004.

    Below 30%  30-33%  33-36%  36-39%  Above 39%
Q1        271     245     185     184        177
Q2        344     278     212     195        186
Q3        425     316     247     199        180
Q4        501     354     280     227        196

The table shows the unadjusted one-year mortality rate by hematocrit group (Below 30%, 30-33%, 33-36%, 36-39%, Above 39%) disaggregated by epoetin dose quartile (Q1, Q2, Q3, Q4).
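As an illustration of the table's two dimensions, the following Python sketch stores the rates so that a series can be read off either by dose quartile or by hematocrit group. The names GROUPS, RATES, series_by_quartile and series_by_group are hypothetical, chosen only for this example; the numbers are copied from the table above. It does not suggest which grouping makes the better graph.

```python
# Hematocrit groups, in the column order used by the table.
GROUPS = ["Below 30%", "30-33%", "33-36%", "36-39%", "Above 39%"]

# Unadjusted one-year mortality rates, one row per epoetin dose quartile.
RATES = {
    "Q1": [271, 245, 185, 184, 177],
    "Q2": [344, 278, 212, 195, 186],
    "Q3": [425, 316, 247, 199, 180],
    "Q4": [501, 354, 280, 227, 196],
}


def series_by_quartile(quartile):
    """Return one table row: rate per hematocrit group for a dose quartile."""
    return dict(zip(GROUPS, RATES[quartile]))


def series_by_group(group):
    """Return one table column: rate per dose quartile for a hematocrit group."""
    column = GROUPS.index(group)
    return {quartile: row[column] for quartile, row in RATES.items()}


if __name__ == "__main__":
    print(series_by_quartile("Q1"))
    print(series_by_group("Below 30%"))
```

Each function returns one candidate line or bar series, so either can be handed to whatever plotting tool you prefer.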

Graph 3

Source: data from Cambridge University Reporter, Special Issue No. 4, 8/10/2009.

Academic year 1990-91

undergraduate men    6110
undergraduate women  4217
postgraduate men     2516
postgraduate women   1240

Section II - graphing data sets [6 marks]

For each of these sets of data, you are required to investigate the data, then produce a graph that best presents the "story" that you want to tell from the data. More detailed instructions are provided for each data set.

Data Set 4

This data set contains the number of male and female undergraduate and postgraduate students for the academical years 1968-69 through to 2008-09 (source: Cambridge University Reporter, Special Issue No. 4, 8/10/2009). There are five columns of data: the year, UG men (number of undergraduate male students), UG women (number of undergraduate female students), PG men (number of postgraduate male students), and PG women (number of postgraduate female students).

There is a range of "stories" that you could tell using this data. First, investigate the data. Then choose a particular "story" that you want to tell. Finally, produce a graph that best presents that story. On the graph, write the one sentence that tells me what the "story" is supposed to be.

The data set can be downloaded as a CSV file.
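A minimal Python sketch of the "investigate the data" step is given here, under stated assumptions. The header names in SAMPLE_CSV are assumed from the column descriptions above and may not match the real file. The single sample row reuses the 1990-91 figures from Graph 3 as a stand-in for the downloaded data. The share of women is just one of many derived quantities you might examine, not a recommended "story".

```python
import csv
import io

# Hypothetical stand-in for the downloaded CSV file. The header names are an
# assumption; the one data row is the 1990-91 row quoted under Graph 3.
SAMPLE_CSV = """year,UG men,UG women,PG men,PG women
1990-91,6110,4217,2516,1240
"""

COUNT_COLUMNS = ("UG men", "UG women", "PG men", "PG women")


def load_rows(handle):
    """Read the CSV and convert the four count columns to integers."""
    rows = []
    for record in csv.DictReader(handle):
        row = {"year": record["year"]}
        for column in COUNT_COLUMNS:
            row[column] = int(record[column])
        rows.append(row)
    return rows


def share_women(row):
    """Fraction of all students in a given year who are women."""
    women = row["UG women"] + row["PG women"]
    total = sum(row[column] for column in COUNT_COLUMNS)
    return women / total


rows = load_rows(io.StringIO(SAMPLE_CSV))

if __name__ == "__main__":
    for row in rows:
        print(row["year"], f"{share_women(row):.3f}")
```

To run this on the real data, replace io.StringIO(SAMPLE_CSV) with an open file handle for the downloaded CSV and adjust the header names if they differ.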

Data Set 5

An inventor has designed a novel web-page graphical object specification. He wants to test the time taken to upload and render these objects in his new web browser. He prepares eighty different objects, twenty of size 2kB, twenty of size 4kB, twenty of size 8kB and twenty of size 16kB. He records the time taken between requesting each object and the completion of the rendering of the object. The times are given below in milliseconds.

Your job is to investigate the data and work out what "story" the data tells. Then produce a graph that presents the data as well as possible. On the graph or chart, write the one sentence that tells me what the "story" is supposed to be.

The data set can be downloaded as a CSV file, or copied from the table below.

 2kB   4kB   8kB  16kB
1729  2293  3276  1845
2642  2022  2939  2201
1499  1486  3348  3845
2397  2528  1918  3251
2065  3525  2575  5282
2172  2271  1299  4681
2057  1287  3773  3193
1486  2804  3444  1923
1703  1608  2273  2962
2377  2434  1592  3179
2552  2427  2323  1592
1968  2632  2461  2007
1598  2316  1256  2163
2025  2831  1862  2745
2615  2633  3051  4359
1620  1951  3853  3747
2205  3227  3748  4435
2298  2019  3090  3813
2483  2482  3345  1182
2072  2420  4499  1319

Source: this data set is simulated.
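One possible first step in investigating Data Set 5 is to summarise each size group numerically before choosing a chart. The Python sketch below does that with the standard library only. The names times_ms, summarise and summary are hypothetical, and the choice of statistics is an assumption about what might be useful, not a prescribed method. The timings are copied from the table above.

```python
from statistics import mean, median, stdev

# Time in milliseconds from request to completed rendering, 20 objects per size.
times_ms = {
    "2kB": [1729, 2642, 1499, 2397, 2065, 2172, 2057, 1486, 1703, 2377,
            2552, 1968, 1598, 2025, 2615, 1620, 2205, 2298, 2483, 2072],
    "4kB": [2293, 2022, 1486, 2528, 3525, 2271, 1287, 2804, 1608, 2434,
            2427, 2632, 2316, 2831, 2633, 1951, 3227, 2019, 2482, 2420],
    "8kB": [3276, 2939, 3348, 1918, 2575, 1299, 3773, 3444, 2273, 1592,
            2323, 2461, 1256, 1862, 3051, 3853, 3748, 3090, 3345, 4499],
    "16kB": [1845, 2201, 3845, 3251, 5282, 4681, 3193, 1923, 2962, 3179,
             1592, 2007, 2163, 2745, 4359, 3747, 4435, 3813, 1182, 1319],
}


def summarise(values):
    """Return basic descriptive statistics for one size group."""
    return {
        "n": len(values),
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
        "stdev": stdev(values),  # sample standard deviation
    }


summary = {size: summarise(values) for size, values in times_ms.items()}

if __name__ == "__main__":
    for size, stats in summary.items():
        print(f"{size:>4}: n={stats['n']} min={stats['min']} "
              f"median={stats['median']:.0f} mean={stats['mean']:.0f} "
              f"max={stats['max']} stdev={stats['stdev']:.0f}")
```

Comparing how both the centre and the spread change with object size is a reasonable starting point for deciding what the graph needs to show.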