Fundamentals of Statistics
 May 20, 2005
 The Five Basic Words of Statistics
 The Branches of Statistics
 Sources of Data
 Sampling Concepts
 Sample Selection Methods
 One-Minute Summary
 Test Yourself
 Answers to Test Yourself Questions
 References

Every day, you encounter numerical information that describes or analyzes some aspect of the world you live in. For example, here are some news items that appeared in the pages of The New York Times during a one-month period:

Between 1969 and 2001, the rate of forearm fractures rose 52% for girls and 32% for boys, with the largest increases among children in early puberty, according to a recent Mayo Clinic study.

Across the New York metropolitan area, the median sales price of a single-family home has risen by 75% since 1998, an increase of more than $140,000.

A study that explored the relationship between the price of a book and the number of copies of a book sold found that raising prices by 1% reduced sales by 4% at BN.com, but reduced sales by only 0.5% at Amazon.com.
Stories such as these would be impossible to understand without statistics, the branch of mathematics that consists of methods of processing and analyzing data to better support rational decision-making processes. Using statistics to better understand the world means more than just producing a new set of numerical information: you must interpret the results by reflecting on their significance and importance to the decision-making process you face. Interpretation also means knowing when to ignore results because they are misleading, are produced by incorrect methods, or just restate the obvious, as this news story "reported" by the comedian David Letterman illustrates:

USA Today has come out with a new survey. Apparently, 3 out of every 4 people make up 75% of the population.
As newer technologies allow people to process and analyze ever-increasing amounts of data, statistics plays an increasingly important part in many decision-making processes today. Reading this chapter will help you understand the fundamentals of statistics and introduce you to concepts that are used throughout this book.
1.1 The Five Basic Words of Statistics
The five words population, sample, parameter, statistic (singular), and variable form the basic vocabulary of statistics. You cannot learn much about statistics unless you first learn the meanings of these five words. 
Population
CONCEPT All the members of a group about which you want to draw a conclusion.
EXAMPLES All U.S. citizens who are currently registered to vote, all patients treated at a particular hospital last year, the entire daily output of a cereal factory's production line.
Sample
CONCEPT The part of the population selected for analysis.
EXAMPLES The registered voters selected to participate in a recent survey concerning their intention to vote in the next election, the patients selected to fill out a patient-satisfaction questionnaire, 100 boxes of cereal selected from a factory's production line.
Parameter
CONCEPT A numerical measure that describes a characteristic of a population.
EXAMPLES The percentage of all registered voters who intend to vote in the next election, the percentage of all patients who are very satisfied with the care they received, the average weight of all the cereal boxes produced on a factory's production line on a particular day.
Statistic
CONCEPT A numerical measure that describes a characteristic of a sample.
EXAMPLES The percentage in a sample of registered voters who intend to vote in the next election, the percentage in a sample of patients who are very satisfied with the care they received, the average weight of a sample of cereal boxes produced on a factory's production line on a particular day.
INTERPRETATION Calculating statistics for a sample is the most common activity, because collecting data for an entire population is impractical in most actual decision-making situations.
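The difference between a parameter and a statistic can be sketched in a few lines of Python. This is a minimal illustration, not a method from the text: the population of cereal-box weights is simulated, and the target weight of 368 grams, the spread of 15 grams, and the population size of 10,000 are all assumed values chosen only for the example.

```python
import random
import statistics

# Hypothetical population: the weight in grams of every cereal box
# produced in one day. The values are simulated, not real data.
rng = random.Random(42)
population = [rng.gauss(368, 15) for _ in range(10_000)]

# Parameter: a numerical measure computed from the whole population.
population_mean = statistics.mean(population)

# Sample: 100 boxes selected from the population without replacement.
sample = rng.sample(population, 100)

# Statistic: the same measure computed from the sample only.
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.1f} g")
print(f"statistic (sample mean):     {sample_mean:.1f} g")
```

In practice you would usually have only the sample, so the statistic is the value you can actually calculate, and the parameter is the value you are trying to learn about.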
Variable
CONCEPT A characteristic of an item or an individual that will be analyzed using statistics.
EXAMPLES Gender, the household income of the citizens who voted in the last presidential election, the publishing category (hardcover, trade paperback, mass-market paperback, textbook) of a book, the number of varieties of a brand of cereal.
INTERPRETATION All the variables taken together form the data of an analysis. Although you may have heard people saying that they are analyzing their data, they are, more precisely, analyzing their variables.
You should distinguish between a variable, such as gender, and its value for an individual, such as male. An observation is all the values for an individual item in the sample. For example, a survey might contain two variables, gender and age. The first observation might be male, 40. The second observation might be female, 45. The third observation might be female, 55. A variable is sometimes known as a column of data because of the convention of entering each observation as a unique row in a table of data. (Likewise, you may hear some refer to an observation as a row of data.)
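The row-and-column convention can be sketched with the three observations from the survey example above. The names `variables`, `observations`, and `columns` are chosen only for this illustration.

```python
# Each tuple is one observation (one row): the values of all
# variables for a single individual in the sample.
variables = ("gender", "age")
observations = [
    ("male", 40),
    ("female", 45),
    ("female", 55),
]

# A variable is a column: the values of one characteristic
# collected across all observations.
columns = {
    name: [row[i] for row in observations]
    for i, name in enumerate(variables)
}

print(columns["gender"])  # ['male', 'female', 'female']
print(columns["age"])     # [40, 45, 55]
```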
Variables can be divided into the following types:
Categorical Variables
CONCEPT The values of these variables are selected from an established list of categories.
SUBTYPES None.
EXAMPLES Gender, a variable that has the categories male and female. Academic major, a variable that might have the categories English, Math, Science, and History, among others.
Numerical Variables
CONCEPT The values of these variables involve a counted or measured value.
SUBTYPES Discrete values are counts of things. Continuous values are measures, and any value can theoretically occur, limited only by the precision of the measuring process.
EXAMPLES The number of previous presidential elections in which a citizen voted, a discrete numerical variable. The household income of a citizen who voted, a continuous variable.
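The type of a variable determines which calculations make sense for it. The sketch below uses invented survey responses (none of the values come from the text) to show that categorical values are tallied by category, while numerical values support arithmetic such as an average.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses, invented for illustration.
academic_major = ["English", "Math", "Science", "Math", "History"]  # categorical
elections_voted = [0, 3, 1, 3, 5]                 # numerical, discrete (a count)
household_income = [41250.00, 58900.50, 73410.25,
                    52000.00, 66780.75]           # numerical, continuous (a measure)

# Categorical values are tallied by category; averaging them has no meaning.
major_counts = Counter(academic_major)

# Numerical values can be averaged.
mean_elections = mean(elections_voted)
mean_income = mean(household_income)

print(major_counts)
print(mean_elections)
print(round(mean_income, 2))
```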
All variables should have an operational definition, that is, a universally accepted meaning that is clear to everyone associated with an analysis. Without operational definitions, confusion can occur. A famous example of such confusion was the tallying of votes in Florida during the 2000 U.S. presidential election, in which, at various times, nine different definitions of a valid ballot were used. (A later analysis [1] determined that three of these definitions, including one pursued by Al Gore, led to margins of victory for George Bush that ranged from 225 to 493 votes and that the six others, including one pursued by George Bush, led to margins of victory for Al Gore that ranged from 42 to 171 votes.)
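To see why an operational definition matters, consider a small sketch in which the same ballots are counted under two different rules. Everything here is hypothetical: the ballot descriptions and both rules are invented for illustration and do not reproduce any of the nine definitions used in Florida.

```python
# Hypothetical ballots, each described by the appearance of its mark.
ballots = ["clean", "hanging", "dimpled", "clean", "dimpled", "blank"]

def valid_strict(mark):
    # Rule A (assumed): only a fully punched mark counts as valid.
    return mark == "clean"

def valid_lenient(mark):
    # Rule B (assumed): any visible attempt to mark counts as valid.
    return mark in {"clean", "hanging", "dimpled"}

# The same data produce different totals under different definitions.
strict_count = sum(valid_strict(m) for m in ballots)
lenient_count = sum(valid_lenient(m) for m in ballots)

print(strict_count, lenient_count)  # 2 5
```

Neither rule is right or wrong in itself. The confusion arises when the people involved in an analysis have not agreed beforehand on which rule applies.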