Lesson 1: Decision Making Under Uncertainty
Descriptive Statistics
In this lesson, we will focus mostly on descriptive statistics (i.e., describing data and creating graphs and charts). We will be learning the basic descriptive statistics concepts using data from a production facility described below.
Case I: Production Line
This data comes from a production facility where five parallel lines are filling boxes of cereal. The target weight of each box is 25 oz. Each production line weighs a box every 30 seconds. You are the shift manager in this facility. Your job is to randomly check the weights of the boxes from each line. If you notice an anomaly (e.g., over- or under-filled boxes), you can stop the line to make an inspection. If the boxes are approximately 25 oz, you let the line continue.
SPSS Data File
Measures of Center
Most sets of data tend to group or cluster around a center point. Measures of central tendency yield information about this area of most common occurrences. In short, they tell us what a typical outcome looks like. The three most common measures of center are the mean, median, and mode.
Mean
- The numerical average, calculated as the sum of all of the data values divided by the number of values.
Example: Last Five Weights, Line 1
Find the mean of the last five weights of Line 1.
Table 1.1. Mean: Last Five Weights, Line 1
Time (minutes)    Line 1 Weight (oz)
544.0             24.85
544.5             25.04
545.0             24.68
545.5             24.83
546.0             24.82
Median
- To find the median, sort the numbers from smallest to largest.
- If there is an odd number of values, the median is the middle value.
- If there is an even number of values, the median is the average of the two middle values.
Example: Last Five Weights, Line 1
Order the numbers from least to greatest.
24.68, 24.82, 24.83, 24.85, 25.04
Median = 24.83
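The mean and median of the five Line 1 weights can be checked with a short script. This is a minimal sketch using Python's standard `statistics` module rather than SPSS; the variable names are illustrative only.

```python
from statistics import mean, median

# Last five Line 1 weights (oz), copied from Table 1.1
weights = [24.85, 25.04, 24.68, 24.83, 24.82]

# Mean: sum of all values divided by the number of values
line1_mean = mean(weights)

# Median: middle value once the data are sorted (odd count, so no averaging)
line1_median = median(weights)

print(f"Sorted: {sorted(weights)}")
print(f"Mean:   {line1_mean:.3f} oz")
print(f"Median: {line1_median:.2f} oz")
```

The mean works out to 124.22 / 5 = 24.844 oz, which is close to the median of 24.83 oz because none of the five weights is extreme.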
Mean vs Median
While the mean (i.e., the average) is the most frequently used measure, it may be misleading at times, especially if there are extreme data points. Consider the following example:
Warren Buffett moves to your street. What happens to the average household income of your neighborhood?
Table 1.3. Neighborhood Household Income
Person    Street 1    Street 2
1         $10,000     $10,000
2         $20,000     $20,000
3         $30,000     $30,000
4         $40,000     $40,000
5         $50,000     $50,000
6         $60,000     $60,000
7         $70,000     $1,000,000
Street 1
- Mean income: $40,000
- Median income: $40,000
Street 2
- Mean income: approximately $173,000
- Median income: $40,000
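The figures for both streets can be reproduced with a few lines of Python. This is a sketch using the standard `statistics` module; the list names `street_1` and `street_2` are chosen here for illustration.

```python
from statistics import mean, median

# Household incomes (dollars) from Table 1.3
street_1 = [10_000, 20_000, 30_000, 40_000, 50_000, 60_000, 70_000]
street_2 = [10_000, 20_000, 30_000, 40_000, 50_000, 60_000, 1_000_000]

# Compare how each measure of center reacts to the single extreme income
for name, incomes in (("Street 1", street_1), ("Street 2", street_2)):
    print(f"{name}: mean = ${mean(incomes):,.0f}, median = ${median(incomes):,.0f}")
```

The exact Street 2 mean is 1,210,000 / 7, or about $172,857, which rounds to the $173,000 quoted above. Replacing one income with an extreme value moves the mean by more than $130,000 while leaving the median unchanged.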
If you look at just the mean, all of a sudden everybody appears to be earning a lot more than they did before Mr. Buffett moved to your street! The median tells the true story, though. Outliers are extreme values in your data (such as Warren Buffett moving onto a regular street). The mean may be misleading in the presence of outliers. However, because the median partitions the data into two halves, it provides a truer picture in the presence of such extreme values.
Mode
- The value that occurs most often.
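As a rough illustration, the mode of the five Line 1 weights from Table 1.1 can be looked up with Python's standard library. This sketch assumes Python 3.8 or later, where `statistics.multimode` is available; it returns every value tied for the highest count.

```python
from collections import Counter
from statistics import multimode

# Last five Line 1 weights (oz), copied from Table 1.1
weights = [24.85, 25.04, 24.68, 24.83, 24.82]

# Count how often each weight occurs
counts = Counter(weights)

# multimode returns all values that share the highest count
modes = multimode(weights)

print(f"Highest count: {max(counts.values())}")
print(f"Modes: {modes}")
```

Every weight occurs exactly once, so all five values tie for the mode.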
How useful is the mode in this instance? Because no value in the data above repeats, the mode does not provide much information about the center of this data.
When would the mode be important? In the world of business, the concept of the mode is often used in determining sizes. For example, shoe manufacturers might produce inexpensive shoes in three widths only: narrow, normal, and wide. Each size represents a modal width. By reducing the number of sizes, companies can reduce costs by limiting machine set-up costs. Similarly, the garment industry produces clothing products in modal sizes.
An interesting application of the mode occurred in the fast food industry, where firms found that consumers typically bought regular drinks when offered regular and large sizes. The industry designed an experiment to test the effect of offering regular, large, and supersize, the last being a size few would ever choose. The result was that consumers chose large more often than regular.
Descriptive Statistics: SPSS Instructions
SPSS: Introduction
Descriptive Statistics: SPSS Instructions Handout