
Getting Started With Research Resources: Statistics

Information on how to find, evaluate, and ethically cite resources for research, as well as writing and citation help.

Finding and using statistics

For this guide, when we talk about statistics we mean what are called descriptive statistics: “How many? How much? What percent?” Someone else has collected the information; you will benefit from all their hard work. Later on in your coursework, especially if you major in the social or health sciences, you will have the pleasure of taking an entire semester-long class devoted to statistics, where you will learn the techniques of statistical analysis, including inferential statistics: medians and means, standard deviations, chi-square tests, analysis of variance, etc. But for now we are going to concentrate on statistics as finding “numbers of.”

However, numbers without context are meaningless. Numbers become valuable when you can compare one set against another. For the purposes of this guide, statistics will be defined as comparative numerical information. We know that alcohol is responsible for many traffic deaths. In 1998, 15,935 Americans were killed in alcohol-related crashes. That number by itself does not give us a very good picture of the problem of drinking and driving. If we know that in 1998 there was a total of 41,471 traffic fatalities in the U.S. and that 15,935 of them, or 38.4%, were alcohol related, we see that drinking does play a large part in traffic-related deaths.
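The 38.4% figure is simply one number divided by the other. As a minimal sketch (the function name `percent_of` is our own, chosen for illustration), the arithmetic looks like this in Python:

```python
# Figures quoted in the paragraph above (United States, 1998).
total_fatalities = 41_471
alcohol_related = 15_935

def percent_of(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

share = percent_of(alcohol_related, total_fatalities)
print(f"{share}% of 1998 traffic fatalities were alcohol related")
```

The same function works for any "what percent?" question, as long as the part and the whole are counted the same way and for the same year.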

And when we have more numbers to compare, we get an even better context:

Year                             1985     1992     1995     1998
Total traffic fatalities       43,825   39,250   41,817   41,471
Alcohol related fatalities     22,716   17,858   17,247   15,935
% of total fatalities           51.8%    45.5%    41.2%    38.4%

Now that we can compare numbers, we can see trends and ask questions. The total number of alcohol-related fatalities has dropped steadily from 1985 to 1998, while the total number of traffic deaths has been fairly constant. Why, when fewer people are dying from alcohol-related accidents, are more people dying in other car accidents? (Is the total number of drivers increasing?) And why is the number of alcohol-related deaths decreasing? (M.A.D.D.? Stiffer sentencing? Are Americans drinking less?) We are not going to answer our questions in this section, but just concentrate on finding the “numbers.”

Source for the above statistics: United States. Bureau of the Census. Statistical Abstract of the United States: 2000. Washington, D.C.: G.P.O., 2000.
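The claim that more people are dying in other car accidents is not printed in the table; it comes from subtracting the alcohol-related fatalities from the total for each year. Here is a rough sketch of that comparison, with the figures copied from the table above and the helper `percent_change` being our own name:

```python
# Figures from the table above (Statistical Abstract of the United States: 2000).
total = {1985: 43_825, 1992: 39_250, 1995: 41_817, 1998: 41_471}
alcohol = {1985: 22_716, 1992: 17_858, 1995: 17_247, 1998: 15_935}

# Fatalities that were not alcohol related: total minus alcohol related.
other = {year: total[year] - alcohol[year] for year in total}

def percent_change(old, new):
    """Percent change from an old value to a new value."""
    return 100 * (new - old) / old

for year in total:
    pct = 100 * alcohol[year] / total[year]
    print(f"{year}: alcohol related {alcohol[year]:,} ({pct:.1f}%), other {other[year]:,}")

print(f"Alcohol related, 1985 to 1998: {percent_change(alcohol[1985], alcohol[1998]):+.1f}%")
print(f"Other, 1985 to 1998: {percent_change(other[1985], other[1998]):+.1f}%")
```

Between 1985 and 1998 the alcohol-related count falls by roughly 30 percent while the remaining fatalities rise from 21,109 to 25,536, which is what prompts the questions above.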

The statistical sources discussed in this guide can be broken down into three general categories.

The first would be primary sources, where the authors have designed a way to collect the statistics (e.g., a questionnaire) using scientifically appropriate methodology (a topic for your semester-long statistics class) and then published the results. An example of this type of statistical work would be Sex in America: A Definitive Survey by Robert T. Michael, et al. (alas, no examples for you).

The second type of statistical source is a compilation such as Crime in the United States 2000, published by the Federal Bureau of Investigation. Various police agencies around the country keep track of crime in their jurisdictions. These numbers are then reported to the FBI, which publishes them in an annual compilation. Even though the FBI did not gather the statistics, it is still considered the author, and this work would be thought of as a primary source.

The third type of statistical source is a work such as Statistical Abstract of the United States, published by the Bureau of the Census (part of the U.S. Department of Commerce). Think of it as a statistical “greatest hits.” Each year the Bureau of the Census “abstracts” the most relevant statistics gathered by governmental agencies or private organizations (such as our two examples above). Many of the crime statistics in Statistical Abstract come directly from Crime in the United States.

Statistical Abstract then becomes a one-stop shopping experience for statistics on America: How many of us are there? How much money do we make? How do we spend what we make? How do we die?

Remember, it is called Statistical Abstract because the information you are seeing is abstracted (removed) from other statistical works. The original statistical source will always be listed after the abstract. We can make a similar analogy to your earlier use of a subject encyclopedia. You located a brief article on your topic (the “abstract”). At the end of the article was a bibliography, which gave you references to books and articles that discussed your topic in more depth. Statistical Abstract gives you a summary and the source (the “bibliography”) gives a place to look for more detailed information.

Private publishers will also put out compilations of statistics. They will often choose a theme for their statistical work, and then compile all the relevant statistics they can glean from such works as Statistical Abstract. They do not collect any of the data, but they have created what we call “value-enhanced information,” i.e. they have done much of the drudge-work for you already. An example of this type of statistical source would be Statistical Handbook on the American Family. The authors have compiled data on marriage, divorce, children, and a whole range of other family-related numbers gathered from government sources and private organizations.

Strengthen your argument

You can often strengthen your argument through the presentation of comparative numerical data. Good statistics are objective, and can allow a case to be debated on the facts rather than on emotions. Statistics also set the context of your argument. If "school violence" is a bad thing, how widespread is it? In comparison to what? More crimes are committed at home than in school (38% at own or neighbor's home, 12% at school) (Statistical Abstract, 1999).

You will discover statistics as you perform the earlier steps in the research process:  books and articles on your topic often incorporate facts and figures in tables and graphs.  However, you will often have specific points you wish to make, or holes in your paper you need to fill.  When you need numbers, a dedicated source of statistical information is the easiest way of finding them. 

Be sure to keep an eye out for specialized terminology and its implications.  "Crime" statistics can be derived from arrest or conviction reports, crime victimization surveys, or polls on who feels crime is an important issue.  Statistical and other specialized terminology is defined in subject dictionaries. 
 

Quality control - should you trust statistics?

The quality of statistical data depends upon who collects the information, how, when and why. 

Who:  Does the author (organization) have the resources to do a proper survey / data collection?  Does the author have a particular interest in certain results?  Who else might approach the subject differently? 

How:  A census is a complete counting.  This is rare, but does occur with some data collected by the U.S. Government (every 10 years for population, every 5 years for some economic data) and with mortality statistics (every death from certain causes is reported).  A survey is normally based upon a sampling.  A random or scientific sampling is the best representation of the larger group.  Scientific samples will show somewhere their "margins of error."  A "sample" based upon self-selection (e.g. questionnaires or polls conducted online) cannot reflect accurately those people who do not participate.  Polls are sometimes scientific, and sometimes self-selected. 

Example: actual election returns represent a "census" of actual voters.  Scientific polling (a correctly constructed "survey sampling") of voters can give results that match the actual returns (or not, depending on circumstances!).  Web polls of voting preferences will represent the opinions only of those who are online and who care enough to participate--not the electorate as a whole. 
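To give a feel for what a "margin of error" means, here is a minimal sketch using the common textbook approximation for a percentage estimated from a simple random sample. The formula, the 95% confidence level (z = 1.96), and the function name `margin_of_error` are our assumptions for illustration; they are not taken from any particular poll, and a published survey may calculate its margin differently.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumptions (ours, for illustration): a simple random sample drawn
    from a large population, and z = 1.96 for roughly 95% confidence.
    Using proportion = 0.5 gives the widest (most cautious) margin.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (100, 400, 1000):
    points = 100 * margin_of_error(n)
    print(f"sample of {n}: about +/- {points:.1f} percentage points")
```

Notice that a larger sample narrows the margin, but only slowly. The calculation says nothing about self-selected samples such as web polls: no sample size fixes the fact that the participants chose themselves.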

When:  The date of a statistical reference book is NOT the date the statistics were collected.  Very detailed information often takes a long time to appear; polls appear more quickly.

Why:  Consider whether the organization collecting the information has a bias or an agenda. 

Scrolling through the tables listed after “Birth and birth rates” in Statistical Abstract yielded this table:

No. 89. Live Births by Place of Delivery, Median and Low Birth Weight, and Prenatal Care: 1990–1997

[Represents registered births. Excludes births to nonresidents of the United States. For total number of births, see Table 79. See Appendix III]

Item                                               1990     1992     1993     1994     1995     1996     1997
Births attended (1,000):
   In hospital (1)                                4,110    4,022    3,959    3,912    3,861    3,891    3,881
   By physician, not in hospital                     14       10        8        7        6        6        5
   By midwife and other, not in hospital (2)         21       21       20       21       21       20       20
Median birth weight (3)                         7lb 7oz  7lb 7oz  7lb 7oz     (NA)     (NA)  7lb 7oz  7lb 7oz
Percent of births with low weight                   7.0      7.1      7.2      7.3      7.3      7.4      7.5
   White                                            5.7      5.8      6.0      6.1      6.2      6.3      6.5
   Black                                           13.3     13.3     13.3     13.2     13.1     13.0     13.0
Percent of births by period in which prenatal care began:
   1st trimester                                   74.2     77.7     78.9     80.2     81.3     81.9     82.5
   3rd trimester                                    6.0      5.2      4.8      4.4      4.2      4.0      3.9

NA  Not available.  (1) Includes all births in hospitals or institutions and in clinics.  (2) Includes births with attendant not specified.  (3) Median birth weight based on race of mother; prior to 1990 based on race of child.
Source of Tables 89-91:  U.S. National Center for Health Statistics, Vital Statistics of the United States, annual; and unpublished data.


1.  In order to save room, zeros are often left out of the columns. If you look in the upper left-hand side of the table above, after “Births attended” you see the following: (1,000). This is the “unit indicator” and tells you that the numbers you see are in thousands. In 1990 there were not just 4,110 hospital-attended births in this country, but 4,110,000. There were not 21 midwife-attended births, but 21,000. Forgetting the unit indicator can leave you with egg on your face.

2.  Keep an eye out for explanatory footnotes. The footnote for delivery “by midwife and other, not in hospital” states that it “includes births with attendant not specified.” Therefore, not all of the 21,000 births outside the hospital, and not attended by a physician, were attended by a midwife.

3.  In Statistical Abstract, the original source of the statistics is listed at the bottom of the table, or at the bottom of a range of related tables. As discussed above, the source can lead you to more detailed information on your topic. In this case, the source is Vital Statistics of the United States, an annual publication of the U.S. National Center for Health Statistics.

If the source is not listed at the bottom of the table, check the title of the table for a "code," as in this example from Statistical Handbook on the American Family, 2nd edition.

The title of the statistical table is -

D1-15  Women Who Have Had a Child in the Last Year, by Age and Labor Force Status: 1980–1995

When "D1-15" is checked against List of Sources in the back of the book we discover that the statistics originally came from No. 104, U.S. Bureau of the Census, Statistical Abstract of the United States: 1997 (117th Edition). Washington, DC 1997. The source is NOT the editors of  Statistical Handbook on the American Family, Bruce Chadwick & Tim Heaton. 
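Returning to the unit indicator from point 1: if you copy numbers out of a table into a spreadsheet or a script, apply the unit before you do anything else with them. A small sketch, using the 1990 values as printed in Table No. 89 (the function name `actual_count` is ours, for illustration):

```python
# Values as printed in Table No. 89 for 1990; the unit indicator is (1,000).
UNIT = 1_000
printed_1990 = {
    "In hospital": "4,110",
    "By physician, not in hospital": "14",
    "By midwife and other, not in hospital": "21",
}

def actual_count(printed, unit=UNIT):
    """Convert a printed table value such as '4,110' into the real count."""
    return int(printed.replace(",", "")) * unit

for label, printed in printed_1990.items():
    print(f"{label}: {actual_count(printed):,} births")
```

Only the rows under “Births attended (1,000)” take the multiplier. The percentages and birth weights further down the table are already in their final units.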

The author, before writing this section, assumed that midwife-attended births would have increased during the 1960s and 1970s, due to the “hippie” movement and the rise in feminism. But numbers have a good way of changing perceptions, of bolstering your argument and shooting down your opponents'. Statistical Abstract tracks births attended by “midwife and other, not in hospital” back to 1960. Here are the numbers:

Year        1960    1965    1970    1973    1974    1975    1976    1978    1979    1980
Births    94,000  66,000  18,000  16,000  16,000  28,000  32,000  21,000  22,000  24,000

Year        1981    1982    1983    1984    1985    1986    1987    1988    1989    1990
Births    27,000  28,000  29,000  28,000  29,000  27,000  27,000  28,000  22,000  21,000

On one level the assumption was correct. From a low point in 1973, the numbers increased dramatically, then remained stable until 1989. But the assumption is wrong if you look at the numbers for 1960 – 94,000 births!  What happened in a period of ten years to cause the number of midwife-attended births to drop from 94,000 to 18,000?
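A few lines of Python make the size of these swings concrete. This is only a sketch: the figures are copied from the table above, and the helper `percent_change` is our own name.

```python
# Births attended by "midwife and other, not in hospital", from the table above.
births = {
    1960: 94_000, 1965: 66_000, 1970: 18_000, 1973: 16_000, 1974: 16_000,
    1975: 28_000, 1976: 32_000, 1978: 21_000, 1979: 22_000, 1980: 24_000,
    1981: 27_000, 1982: 28_000, 1983: 29_000, 1984: 28_000, 1985: 29_000,
    1986: 27_000, 1987: 27_000, 1988: 28_000, 1989: 22_000, 1990: 21_000,
}

def percent_change(old, new):
    """Percent change from an old value to a new value."""
    return 100 * (new - old) / old

# min/max return the first year that has the smallest/largest count.
low_year = min(births, key=births.get)
high_year = max(births, key=births.get)

print(f"Highest: {births[high_year]:,} in {high_year}")
print(f"Lowest: {births[low_year]:,} in {low_year}")
print(f"1960 to 1970: {percent_change(births[1960], births[1970]):+.1f}%")
print(f"1973 to 1976: {percent_change(births[1973], births[1976]):+.1f}%")
```

The count falls by roughly 81 percent between 1960 and 1970 and then doubles between 1973 and 1976. The arithmetic tells you how big the change was, not why it happened.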

Maybe this is why, in the science fiction novel Fahrenheit 451, firemen burn books to ensure public happiness. Sometimes books, and numbers, create more questions than they answer.


Resources for Statistics

The College has a subscription to SPSS, which is available on computers in the labs on the Duluth campus as well as laptops available for check out in the Library.

YouTube video for SPSS (7 min. 18 sec.)