Dattaprabhakar G. answered 07/12/14
Expert Tutor for Stat and Math at all levels
For the mean to make sense, the data values must be on an "interval-ratio" scale. To calculate the mean, add all the data values and divide by the number of data values added.
For the median to make sense, the data values should preferably be on an "interval-ratio" scale (an "ordinal" scale is OK, but only sometimes). To calculate the median, arrange all the data values in non-decreasing (or non-increasing) order and choose the "middle" value. Note that the middle value may not be unique: with an even number of data values there are two middle values. There are some ways to fix that, but they are beyond the scope of this conceptual question.
For the mode to make sense, the data values may be on a nominal scale. A mode is a data value that occurs most frequently in the data. Once again, the mode may not be unique.
The above definitions apply to data actually collected. There are "population" equivalents of these quantities, but they are beyond the scope of this question.
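As a rough illustration of the three definitions, here is a minimal Python sketch using the standard `statistics` module. The data values and the names `values`, `colors`, and so on are made up for this example. Averaging the two middle values is only one common convention for a non-unique median, not the only way to fix it.

```python
from statistics import mean, median, median_low, median_high, multimode

# Made-up interval-ratio data, already in non-decreasing order.
values = [2, 3, 3, 5, 7, 10]

# Mean: add all the data values and divide by how many there are.
sample_mean = mean(values)

# Median: with an even number of values there are two "middle" values.
middle_low = median_low(values)    # lower of the two middle values
middle_high = median_high(values)  # higher of the two middle values
middle_avg = median(values)        # one convention: average the two

# Mode: works even on a nominal scale, and may not be unique.
colors = ["red", "blue", "red", "green", "blue"]
modes = multimode(colors)          # every value tied for most frequent

print(sample_mean, middle_low, middle_high, middle_avg, modes)
```

Here `multimode` returns a list, so a tie between several most frequent values shows up as more than one entry.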
Q: What is variability in data set?
A: When a variable being measured has more than one possible value in the population, there is variability in the population values (if all values in the population are the same, the variability is zero). When there is variability in the population values, the values in a sample MAY come out to be different, depending on the sample you get. Remember, you MAY sometimes get identical values in a sample, so that the sample variability is zero even when there is population variability.
Example: Population: {1, 2, 2, 1, 1, 1}. There is positive population variability.
A possible sample of size 2: {1, 1}. Sample variability is ZERO.
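To see how often a sample hides the population variability, here is a small Python sketch on the population above. It assumes samples of size 2 are drawn without replacement, and the helper `has_variability` is a hypothetical name that simply checks whether the values are not all identical.

```python
from itertools import combinations

population = [1, 2, 2, 1, 1, 1]

def has_variability(values):
    """Return True when the values are not all identical."""
    return len(set(values)) > 1

# Every possible sample of size 2, drawn without replacement (an assumption).
samples = list(combinations(population, 2))

# Samples whose two values are identical show zero sample variability.
constant_samples = [s for s in samples if not has_variability(s)]

print("Population has variability:", has_variability(population))
print(len(constant_samples), "of", len(samples), "samples have zero variability")
```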
It is an important statistical question how to quantify this variability. The RANGE (the difference between the maximum and the minimum), for example, is one possible measure of variability. If you are interested in how the data values are "scattered around" their mean, you have another measure of variability, called the standard deviation. In general, when the data values are on an interval-ratio scale, the variability is small if the spread of the distribution of data values is tight rather than loose (this can also be judged visually from a graph of the data). There are many measures of variability.
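As a quick sketch of the two measures just mentioned, the Python code below compares a "tight" and a "loose" data set with the same mean. Both data sets and the helper `data_range` are invented for this illustration. `statistics.stdev` computes the sample standard deviation (dividing by n - 1); `statistics.pstdev` would be the population version.

```python
from statistics import mean, stdev

# Two made-up data sets with the same mean but different spread.
tight = [9, 10, 10, 11]
loose = [2, 8, 12, 18]

def data_range(values):
    """Range: difference between the maximum and the minimum."""
    return max(values) - min(values)

for name, data in [("tight", tight), ("loose", loose)]:
    # stdev measures how the values are scattered around their mean.
    print(name, "mean:", mean(data),
          "range:", data_range(data),
          "stdev:", round(stdev(data), 2))
```

Both measures come out smaller for the tight data set, which matches what a graph of the two data sets would show.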