Statistical Indices


Basic terminology

Population: The population is the set of individual persons or objects that an investigator is primarily interested in for a research problem. For example, all patients treated at a hospital in a year.

Sample: A sample is the subset of individuals in a population that is actually observed. In other words, a sample is the part of the population selected for analysis. For example, the patients selected to fill out a survey from all the patients treated in a year.

Parameter: A parameter is an unknown numerical summary of the population. For example, the percentage of all the patients who are satisfied with the hospital.

Statistic: A statistic is a known numerical summary of the sample which can be used to make an inference about parameters. For example, the percentage in a sample of patients who are satisfied with the hospital is an estimate of the related parameter.

Variable: A variable is any characteristic that varies from one individual member of the population to another. For example, the gender of patients at a hospital.

Measures of tendency and spread

Average: An average is a value which is typical for a set of numbers. Since it tends to lie centrally within the set arranged according to magnitude, it is also a measure of central tendency.

Median: The median is the middle value of a list of numbers arranged in order from smallest to largest. If the list has an even number of values, the median is the mean of the two middle values.

Mode: A mode is the value that occurs most often. If no number in the list is repeated, then there is no mode for the list.

Range: The range of a list of numbers is the difference between the largest and smallest values.
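As a rough illustration of the median, mode, and range, the following Python sketch computes all three for a small made-up list. The data and the helper name `mode_or_none` are assumptions introduced for this example. The helper follows the convention stated above that a list with no repeated value has no mode, and it assumes a non-empty list.

```python
from collections import Counter
import statistics

def mode_or_none(values):
    """Return the most frequent value, or None if no value is repeated."""
    # most_common(1) yields the (value, count) pair with the highest count.
    value, count = Counter(values).most_common(1)[0]
    return value if count > 1 else None

values = [2, 3, 3, 5, 7, 10]  # illustrative data, not from the text

median_value = statistics.median(values)   # even count: mean of the two middle values
mode_value = mode_or_none(values)          # 3 is the only repeated value
range_value = max(values) - min(values)    # largest minus smallest
```

Here the median is 4.0 (the mean of 3 and 5), the mode is 3, and the range is 8. If several values tie for the highest count, this helper returns only one of them.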

Mean

The arithmetic mean of a set of $n$ numbers is given by $$\overline X=\frac{x_1+x_2+\cdots +x_n}{n}=\frac{\sum_i x_i}{n}.$$

For a set of $n$ distinct numbers $x_1,\dots ,x_n$, with the number $x_i$ occurring $f_i$ times, the mean is given by $$\overline X=\frac{f_1x_1+f_2x_2+\cdots +f_nx_n}{f_1+f_2+\cdots +f_n}=\frac{\sum_i f_ix_i}{\sum_i f_i}.$$

Note that the arithmetic mean of a set of numbers is an average.
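The two forms of the mean can be checked against each other with a short Python sketch. The function names `mean` and `frequency_mean` and the data are hypothetical, chosen only for illustration.

```python
def mean(values):
    """Arithmetic mean: the sum of the values divided by their count."""
    return sum(values) / len(values)

def frequency_mean(values, frequencies):
    """Mean of distinct values x_i, where x_i occurs f_i times."""
    weighted_total = sum(f * x for x, f in zip(values, frequencies))
    return weighted_total / sum(frequencies)

raw = [1, 1, 2, 2, 2, 5]   # illustrative data listed value by value
distinct = [1, 2, 5]       # the same data as distinct values...
counts = [2, 3, 1]         # ...with their frequencies

mean_from_raw = mean(raw)
mean_from_counts = frequency_mean(distinct, counts)
```

Both calls give $13/6$, since the frequency form only groups equal terms of the original sum.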

Deviation

Deviation from the arithmetic mean

The deviation from the arithmetic mean $\overline X$ of each number $x_j$ in a set of numbers $x_1, x_2, \dots , x_n$ is $$D_j=x_j-\overline X,\quad 1\le j \le n.$$
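A minimal Python sketch of the deviations $D_j$, using made-up data and a hypothetical helper named `deviations`:

```python
def deviations(values):
    """Return D_j = x_j - mean for every x_j in values."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]

values = [2, 4, 6, 8]   # illustrative data; the mean is 5
d = deviations(values)  # [-3.0, -1.0, 1.0, 3.0]
total = sum(d)          # the signed deviations cancel
```

The deviations from the mean always sum to zero (up to floating-point rounding), which is why measures of spread use absolute or squared deviations instead of the raw values.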

Mean deviation

The mean deviation (also known as the mean absolute deviation) of a set of numbers $x_1, x_2, \dots , x_n$ is $$\overline D = \frac{|x_1-\overline X|+|x_2-\overline X|+\cdots +|x_n-\overline X|}{n}=\frac{\sum_i |x_i-\overline X|}{n}.$$
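The mean deviation is the sum of the absolute deviations divided by $n$. A small Python sketch, with a hypothetical function name and made-up data:

```python
def mean_absolute_deviation(values):
    """Average of |x_i - mean| over all x_i."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

values = [2, 4, 6, 8]                   # illustrative data; the mean is 5
mad = mean_absolute_deviation(values)   # (3 + 1 + 1 + 3) / 4
```

For this data the result is 2.0. The absolute value must be taken term by term: applying it to the whole sum of deviations would always give zero.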

Standard deviation

The standard deviation of a set of numbers $x_1, x_2, \dots , x_n$ is $$\sigma = \sqrt{\frac{(x_1-\overline X)^2+(x_2-\overline X)^2+\cdots + (x_n-\overline X)^2}{n}}=\sqrt{\frac{\sum_i(x_i-\overline X)^2}{n}}.$$
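The formula above divides by $n$, which corresponds to what Python's standard library calls the population standard deviation. The sketch below implements the formula directly (the name `population_std` and the data are assumptions for illustration) and compares it with `statistics.pstdev`.

```python
import math
import statistics

def population_std(values):
    """Square root of the mean squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

values = [2, 4, 4, 4, 5, 5, 7, 9]         # illustrative data; the mean is 5
sigma = population_std(values)            # sqrt(32 / 8)
library_sigma = statistics.pstdev(values) # standard-library equivalent
```

Both give 2.0 for this data. Note that `statistics.stdev` divides by $n-1$ instead of $n$, so it does not match the formula in this section.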

Variance

The variance of a set of numbers $x_1, x_2, \dots , x_n$ is $$\sigma^2=\frac{\sum_i(x_i-\overline X)^2}{n}.$$

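Note that variance is expressed in the square of the units of the variable being measured (for example, cm$^2$ for heights measured in cm), whereas the standard deviation has the same units as the variable.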
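One way to see the effect of units is to express the same data in two different units and compare the results. The following Python sketch uses hypothetical heights and a hypothetical helper named `population_variance`, which divides by $n$ as in the formula above.

```python
import math

def population_variance(values):
    """Mean squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

heights_m = [1.5, 1.6, 1.7, 1.8]            # hypothetical heights in metres
heights_cm = [100 * h for h in heights_m]   # the same heights in centimetres

var_m = population_variance(heights_m)      # in square metres
var_cm = population_variance(heights_cm)    # in square centimetres
std_m = math.sqrt(var_m)                    # in metres
std_cm = math.sqrt(var_cm)                  # in centimetres
```

Changing the unit by a factor of 100 scales the standard deviation by 100 but the variance by $100^2 = 10{,}000$, which reflects the squared units of variance.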

Additional resources