Descriptive Stats

Descriptive statistics give you a meaningful, quantitative overview of your sample and help you summarize an overwhelming amount of data into something more comprehensible. They are generally a first step in statistical analysis, even though they only reflect what your data are made of.

The mean is certainly one of the most commonly used values to describe a sample or a dataset. It is often called the average and is one of the measures of the central tendency of your sample. The mean is the sum of all the values in your dataset divided by the number of […]

1. Mean
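As a rough illustration of the mean described above, here is a minimal sketch in R. The vector x and its values are made up for the example; mean() is the base R function, and the manual computation is only there to show the definition.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Definition: sum of all values divided by the number of values
manual_mean <- sum(x) / length(x)

# Built-in equivalent
builtin_mean <- mean(x)

print(manual_mean)
print(builtin_mean)
```

Both approaches should agree; in practice mean() is the one to use, and its na.rm argument lets you ignore missing values.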

The median is another measure of the central tendency of your sample. The median is the number “in the middle” of the dataset, i.e. the middle value of a series of values sorted from the smallest to the largest. Provided that your dataset contains an odd number n of entries, the median is the value of the ((n+1)/2)th entry in the […]

2. Median
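The odd-n case of the median can be sketched as follows. The vector x is a made-up example with 7 entries; sort() and median() are base R functions.

```r
# Hypothetical sample with an odd number of entries (n = 7)
x <- c(9, 2, 7, 4, 11, 4, 5)

# Sort from smallest to largest, then pick the ((n + 1) / 2)th entry
sorted_x <- sort(x)
n <- length(x)
manual_median <- sorted_x[(n + 1) / 2]

# Built-in equivalent
builtin_median <- median(x)

print(manual_median)
print(builtin_median)
```

Note that the manual indexing above only covers the odd case; median() also handles an even number of entries.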

The minimum and maximum of a dataset are the smallest and the largest entries, respectively. No surprise here… The range is the difference between the maximum and the minimum, and gives a simple measure of the spread of the data. Note that the range may be expressed as a single value (the actual difference between maximum […]

3. Minimum, maximum and range
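A short sketch of these three quantities in R, using a made-up vector x. One detail worth knowing: the base R function range() returns the minimum and maximum as a pair, so the single-value range has to be computed from it, for example with diff().

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

x_min <- min(x)            # smallest entry
x_max <- max(x)            # largest entry

x_range_pair  <- range(x)        # c(minimum, maximum)
x_range_value <- diff(range(x))  # maximum minus minimum

print(x_range_pair)
print(x_range_value)
```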

The variance corresponds to the average of the squared differences from the mean. Not sure you understand? Take any entry in your dataset, calculate the difference between its value and the mean, and square it. Do the same for all the entries in the dataset. Finally, calculate the average of these […]

4. Variance
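The recipe above can be sketched step by step in R. The vector x is a made-up example. One caveat: the base R function var() computes the sample variance, dividing the sum of squared differences by n - 1 rather than n, so it will not match a plain average of the squared differences exactly.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Step 1: difference between each entry and the mean
deviations <- x - mean(x)

# Step 2: square each difference
squared_dev <- deviations^2

# Step 3a: plain average of the squared differences (divides by n)
var_n <- sum(squared_dev) / length(x)

# Step 3b: sample variance (divides by n - 1), which is what var() returns
var_n_minus_1 <- sum(squared_dev) / (length(x) - 1)

print(var_n)
print(var_n_minus_1)
print(var(x))
```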

The standard deviation is nothing more than the square root of the variance. Unlike the variance, the standard deviation has the same unit as the mean, the median or any entry in the dataset. This makes it very useful as it can be used in combination with the mean to […]

5. Standard Deviation
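A minimal sketch of the relationship between the two, again with a made-up vector x. Both sd() and var() are base R functions, and both use the n - 1 denominator.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Standard deviation as the square root of the variance
manual_sd <- sqrt(var(x))

# Built-in equivalent
builtin_sd <- sd(x)

print(manual_sd)
print(builtin_sd)
```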

Take a dataset, sort the data from smallest to largest and split it into 4 equal subsets. The quartiles are the values that cut the dataset into these 4 parts. The quartiles are named as follows: Q1, the first quartile, below which the lowest 25% of the data in the set can be […]

6. Quartiles
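Quartiles can be obtained in R with the base function quantile(), sketched below on a made-up vector x. Be aware that several definitions of quantiles exist; quantile() uses its default method (type = 7) here, and other methods or other software may return slightly different values on small samples.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Quartiles with R's default quantile method (type = 7)
q <- quantile(x, probs = c(0.25, 0.50, 0.75))
print(q)

# Interquartile range: distance between the third and first quartiles
iqr_value <- IQR(x)
print(iqr_value)
```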

A boxplot is a convenient little plot that sums up the “essentials” about your sample. In R, a boxplot can be built within seconds using the function boxplot(). Not sure what this plot means? Check out the illustration to the right. Here is a quick explanation of what is displayed in this plot. As […]

7. Boxplot
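As a quick sketch, the code below draws a boxplot of a made-up vector x and then prints the numbers behind it. boxplot() and boxplot.stats() are base R functions; the title passed to main is just an example.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Draw the boxplot
boxplot(x, main = "Example boxplot")

# Numbers used to draw it: lower whisker, lower hinge, median,
# upper hinge, upper whisker
box_numbers <- boxplot.stats(x)$stats
print(box_numbers)

# Points lying beyond the whiskers (none expected for this sample)
box_outliers <- boxplot.stats(x)$out
print(box_outliers)
```

The hinges used by boxplot() are close to, but not always identical to, the quartiles returned by quantile().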

R allows installation and use of toolboxes (called packages) made by third parties. Such packages are often useful for specific types of analyses and provide a multitude of functions and possibilities. With regard to descriptive statistics, the package “pastecs” is one of these toolboxes. Note that such a package must be […]

9. Pastecs
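A minimal sketch of how pastecs might be used, assuming the package has already been installed (for example with install.packages("pastecs")). The vector x is made up; stat.desc() is the pastecs function that prints a table of descriptive statistics such as the minimum, maximum, range, median, mean, variance and standard deviation.

```r
# Hypothetical sample (values invented for illustration)
x <- c(2, 4, 4, 5, 7, 9, 11)

# Only call pastecs if it is installed, so the script does not fail otherwise
if (requireNamespace("pastecs", quietly = TRUE)) {
  print(pastecs::stat.desc(x))
} else {
  message("Package 'pastecs' is not installed; install it with install.packages(\"pastecs\")")
}
```

In an interactive session you would more typically load the package once with library(pastecs) and then call stat.desc(x) directly.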