In the study of statistics, there are no absolutes. For example, the question of how much an American adult weighs has no single answer, because weight varies widely across the population. What statistics can do is take a large number of samples and average them to determine the average, or mean, weight of an American adult. Having done that, statistics tries to answer the question of how much variation around this mean can be expected. Many American adults may weigh within 10 lbs. (4.5 kg) of the mean, while a small number may weigh far more or far less, in extreme cases as much as 220 lbs. (100 kg) above the mean. Statistical analysis describes how much variation around the mean to expect, and how common each amount of variation is. American adults will tend to cluster around the mean weight, and the number of people at a given weight trails off as that weight gets further from the mean. When drawn graphically, the data resembles the shape of a bell, hence the name bell curve. Statistics uses the term standard deviation to quantify the variation from the mean. With each additional standard deviation from the mean, the number of samples found at that distance falls off sharply. For data that follows a bell curve, about 68 percent of the samples are within 1 standard deviation of the mean, and about 95 percent are within 2 standard deviations. This 95 percent level is the one commonly used for the margin of error quoted on public opinion polls. To make this easier to grasp for practical use, the concept of Z scores was created. Calculated from the mean and standard deviation of the sample set, the Z score for an individual sample represents how many standard deviations from the mean it lies.
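The 68 and 95 percent figures above can be checked with a short sketch. It assumes the data follows an ideal bell curve (a normal distribution), in which case the fraction within k standard deviations of the mean is erf(k / sqrt(2)). The function name `fraction_within` is made up for this illustration.

```python
import math

def fraction_within(k):
    """Fraction of an ideal normal distribution within k standard deviations of the mean."""
    # Assumption: the data is exactly normally distributed.
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k):.1%}")
# within 1 standard deviation(s): 68.3%
# within 2 standard deviation(s): 95.4%
# within 3 standard deviation(s): 99.7%
```

Real data rarely matches the bell curve exactly, so these percentages are best treated as approximations.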
Steps
- 1. Collect the samples of the variable of interest. Gather a large number of samples to make sure that all reasonable variations from the average are covered. Samples should be randomly chosen. For example, if the variable of interest is the height of palm trees, measuring only palm trees in Florida will give an answer that is meaningful for Florida palm trees only. Palm trees should be selected randomly from around the world to arrive at an answer that is meaningful for palm trees as a whole.
- Determine sample size. The sample must be large enough to give a meaningful answer, but this does not mean that every palm tree in the world must be measured. The need for the most accurate answer possible must be weighed against the enormous task of collecting and processing every possible sample. There is no hard and fast rule, as the right sample size depends on how accurate the answer needs to be. Consult a statistics textbook or online university presentation to get an idea of what sample size is needed for the desired accuracy.
- 2. Find the sample mean. Add together the values of all of the samples. Divide this sum by the number of samples used. This number is the average, or mean, value.
- 3. Determine the standard deviation of the sample. This represents how tightly or loosely the values are grouped around the mean. Find the deviation of each sample from the mean by subtracting the mean from the sample value. Some of these deviations will be negative, which is fine, because squaring them in the next step makes every value non-negative. Square each individual deviation and add the squared values together. Divide this sum of the squares by the number of samples used. The result is the variance. (When the samples are only a subset of a larger population, it is common to divide by one less than the number of samples instead, which gives the sample variance.) Take the square root of the variance. This square root is the standard deviation.
- 4. Calculate the Z scores. A Z score may be calculated for each sample. Subtract the sample group mean from the value of the individual sample of interest. Divide the result of that subtraction by the standard deviation of the sample group. The result of that division is the Z score of the chosen sample, indicating how many standard deviations away from the mean it lies. Z scores can be negative, because the Z score gives not only the distance from the mean but also its direction: a negative Z score means the sample lies below the mean, and a positive Z score means it lies above.
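The mean, standard deviation, and Z score steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name `z_scores` and the sample values are invented for the example. It uses `statistics.pstdev`, which divides the sum of squares by the number of samples as described in step 3. `statistics.stdev` divides by one fewer instead, which is the usual choice when the data is a sample drawn from a larger population.

```python
import statistics

def z_scores(samples):
    """Return (mean, standard deviation, list of Z scores) for a list of numbers."""
    mean = statistics.mean(samples)    # step 2: sum of values / number of samples
    sd = statistics.pstdev(samples)    # step 3: sqrt(sum of squared deviations / n)
    if sd == 0:
        raise ValueError("all samples are identical, so Z scores are undefined")
    # Step 4: (sample - mean) / standard deviation, one score per sample.
    return mean, sd, [(x - mean) / sd for x in samples]

# Hypothetical palm tree heights in meters, made up for illustration.
heights = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, sd, zs = z_scores(heights)
print(mean, sd)  # 5.0 2.0
print(zs)        # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

In this made-up data set, the 9.0 m tree has a Z score of 2.0, meaning it stands two standard deviations above the mean, while the 2.0 m tree lies one and a half standard deviations below it.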
Recent edits by: Drg2345, Veracious