
How to Lie with Statistics

Nonfiction | Reference/Text Book | Adult | Published in 1954


Chapter Summaries & Analyses

Chapter 1 Summary: “The Sample with the Built-in Bias”

Content Warning: The source material and this guide include discussions of suicide and systemic racism.

Chapter 1 covers the problem of sample bias in statistical data. Darrell Huff explains that it isn’t possible to count every individual when studying a population, so a statistician needs to create a sample representing the whole. To be the most accurate, a representative sample needs to be “one from which every source of bias has been removed” (20). Sample bias, he notes, appears in many different forms.

Huff introduces the chapter’s issue with an example: the reported average income of graduates from a given year. The problem with the statistic, Huff says, comes from biased sampling. Probably not all the graduates were contacted, and those who responded were likely more influential and made more money than those who chose not to. That biases the sample toward one part of the population and yields an inaccurate result.
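The response bias described above can be illustrated with a small simulation. This is a minimal sketch, not anything taken from Huff’s book: the income distribution, the response probabilities, and the function name `simulate_survey` are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate_survey(n_graduates=10_000, seed=0):
    """Compare the true mean income with the mean among survey respondents."""
    rng = random.Random(seed)
    # Hypothetical incomes: log-normal, so a few graduates earn far more than most.
    incomes = [rng.lognormvariate(10.5, 0.6) for _ in range(n_graduates)]
    median_income = statistics.median(incomes)

    # Assumed response rule: higher earners are three times as likely to reply.
    respondents = [
        income
        for income in incomes
        if rng.random() < (0.6 if income > median_income else 0.2)
    ]
    return statistics.mean(incomes), statistics.mean(respondents)


true_mean, reported_mean = simulate_survey()
print(f"true mean income:       {true_mean:,.0f}")
print(f"mean among respondents: {reported_mean:,.0f}")
```

Under these assumed response rates, the mean among respondents comes out higher than the true mean, even though nobody reported a false figure. The sample alone produces the inflated average.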

The following section covers the problem that the reported numbers themselves could be false, even with a decent sample. Huff’s examples here highlight that people often lie in polls to make themselves look better, whether about how much money they make, the media they consume, or their health habits.

Huff points out that invisible biases cause as many problems in statistics as visible ones. Parts of a population can be overlooked or overrepresented because of the biases of the person taking the measurement. In Huff’s examples, these often stem from a lack of access. For instance, he discusses the 1936 presidential election poll of 10 million telephone and Literary Digest subscribers, which is often cited as a statistical disaster for predicting that Alf Landon would defeat Franklin Delano Roosevelt. The results were biased because the sample excluded people who could not afford a magazine subscription or a telephone in that era, so it skewed heavily toward Republican voters. The bias of the people taking the polls can also cause issues; Huff notes that the race of the interviewer and of the person interviewed can produce different results due to tensions that result from systemic racism.

Huff concludes the chapter by discussing random sampling. While truly random samples are ideal, the cost and difficulty of obtaining them mean that stratified random samples are used instead. Huff says biases can never be removed entirely from a sample, and the reader should know where bias could appear in any statistics they encounter. In his examples, he references the work of Dr. Alfred Kinsey on women and sexuality. He notes that Kinsey’s samples, while not random, are large enough to be significant and should be respected, but readers should not infer too much from them.
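Stratified random sampling can be sketched in a few lines: divide the population into groups, then draw randomly within each group in proportion to its share of the whole. The example below is an illustrative sketch only. The population, the three income groups, and the helper `stratified_sample` are hypothetical and do not come from Huff’s book.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, sample_size, seed=0):
    """Draw a random sample whose strata match the population's proportions."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)

    sample = []
    for members in strata.values():
        # Proportional allocation; rounding can shift the total slightly.
        k = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


# Hypothetical population: 70% low, 25% middle, 5% high income.
population = (
    [{"id": i, "group": "low"} for i in range(700)]
    + [{"id": i, "group": "middle"} for i in range(700, 950)]
    + [{"id": i, "group": "high"} for i in range(950, 1000)]
)
sample = stratified_sample(population, lambda p: p["group"], sample_size=100)
counts = {g: sum(p["group"] == g for p in sample) for g in ("low", "middle", "high")}
print(counts)  # {'low': 70, 'middle': 25, 'high': 5}
```

The sketch only works if the proportions of each group are already known and each person can be assigned to a group correctly, which is one place where the bias Huff warns about can creep back in.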

His final note for the chapter is that polls are often not rigged intentionally. Instead, they tend to be biased toward people who are more “appealing” to those creating the poll. This usually means favoring people who are educated, wealthy, and able-bodied.

Chapter 1 Analysis

This chapter serves as an introduction to one of the basic issues of bad statistics: The Importance of Proper Sampling. Because the sample is the root of statistical data, any issues with it corrupt the entire result. His explanation of sampling issues is the most comprehensive statistical guide in the book. He explains how and why sample selection works to introduce the audience to the subject. An understanding of sampling is crucial because, without knowing the significance of sample selection, the reader has no foundation for understanding the rest of the statistics in the book. The first chapter also provides a foundation for the following few chapters, which cover the problems resulting from the collection and analysis of statistical data. Here, he focuses on sample bias and the different ways statisticians can fail to account for bias or intentionally create it.

In this chapter, Huff also introduces the structure he relies on for much of the book. He begins with a contemporary example, which he presents as the reader might encounter it in their own life. He then explains the focus of the chapter before returning to the example, describing how the problem applies, and deconstructing it for his reader. His examples, such as the one at the beginning of the chapter, are topical for the time and place of publication: the US of the 1950s. However, many of them prove outdated when viewed through a modern lens. Examples involving money, for instance, may be challenging to interpret due to inflation since the book’s publication. He devotes a section of the chapter to discussing a study that reported an average salary of “$25,111 a year” (13), and readers must use context to understand that this was considered an “impressive figure” at the time of the study, which underscores The Importance of Critical Thinking. Although some examples lose relevance outside their original context, his conversational language makes the subject matter accessible. This counteracts an issue he discusses throughout the book: Numbers have a mystique that creates distance between them and people who are unfamiliar with statistics.
