Test for normality – Shapiro-Wilk test


Sometimes, you are just curious to know whether a sample is normally distributed. Sometimes, you NEED to know whether it is or not because normality is a prerequisite for performing a specific test.

To check whether your sample is normally distributed, you may use shapiro.test(), which performs the Shapiro-Wilk normality test. In this test, the null hypothesis H0 states that the sample comes from a normally distributed population. Accordingly, the p-value is the probability of obtaining a test statistic at least as extreme as the one observed if H0 were true; it is not the probability that the sample is normally distributed. A low p-value (typically under 0.05) thus indicates that the sample is unlikely to come from a normal distribution, and H0 is rejected.
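To get a feeling for what a low p-value looks like, here is a minimal sketch on simulated data. The seed, the sample size and the choice of an exponential distribution are my own assumptions, picked only because exponential draws are strongly skewed; they are not part of the example that follows.

```r
# Simulated, clearly non-normal data (assumption: illustration only)
set.seed(1)                           # arbitrary seed, for reproducibility only
skewed_sample <- rexp(200, rate = 1)  # right-skewed exponential draws

result <- shapiro.test(skewed_sample)
alpha <- 0.05                         # conventional significance threshold

# Reject H0 (normality) when the p-value falls below alpha
reject_h0 <- result$p.value < alpha
cat("W =", round(unname(result$statistic), 3), "\n")
cat("Reject H0 at alpha = 0.05:", reject_h0, "\n")
```

With a sample this skewed, the test should reject H0 without difficulty.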

Let’s take an example.

To visualize the sample, we first create a histogram of the vector data1 to which we add a line representing the normal density curve:

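# Sample to be tested
data1 <- c(11.25, 10.00, 9.68, 10.52, 8.77, 9.92, 8.62, 10.21, 9.09, 10.36)
# Histogram on the density scale (prob=TRUE) so that a density curve can be overlaid
hist(data1, col="red", xlim=c(7,13), prob=TRUE)
# curve() evaluates the expression over its own x, so no separate x vector is needed
curve(dnorm(x, mean=mean(data1), sd=sd(data1)), from=7, to=13, col="blue", add=TRUE)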

Now let’s run the test:

shapiro.test(data1)

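The p-value is rather high (above 0.05), so the null hypothesis H0 is NOT rejected: the test finds no evidence that your sample departs from a normal distribution. Keep in mind that this does not prove normality; with only 10 observations, the test has limited power to detect deviations.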
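If you would rather make the decision in code than read it off the printed output, the object returned by shapiro.test() stores the W statistic and the p-value in its statistic and p.value elements. The sketch below assumes the usual 0.05 threshold; the variable names w_stat, p_value and alpha are only illustrative.

```r
# Same sample as in the example above
data1 <- c(11.25, 10.00, 9.68, 10.52, 8.77, 9.92, 8.62, 10.21, 9.09, 10.36)

result <- shapiro.test(data1)
w_stat  <- unname(result$statistic)  # Shapiro-Wilk W statistic
p_value <- result$p.value            # p-value of the test
alpha   <- 0.05                      # assumed significance threshold

if (p_value < alpha) {
  message("H0 rejected: the sample deviates from a normal distribution")
} else {
  message("H0 not rejected: no evidence against normality")
}
```

This is handy when the normality check is one step in a longer script, for instance when deciding between a parametric and a non-parametric test.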