**Repeated measures ANOVA** is a test that resembles one-way ANOVA, as it allows you to check for differences between the means of three or more groups. The essential difference is that the groups are **dependent** (i.e. related): they contain data or measurements from the same individuals. What differs in terms of design may be **when** measurements were taken (measurements were thus repeated in time: before, during and after an experiment, or every day during a given time frame, etc.) or **under which circumstances or conditions** measurements were performed (for example, measurements were done 3 times, each time following application of a new drug). As usual, the null hypothesis states that the means of all groups are equal.

Assumptions are:

- each **individual** is represented by a **measurement in each of the tested groups** (there cannot be any missing value),
- **normality** of distribution,
- **sphericity**, which means that equality of variance is verified when comparing any two groups (**all** possible pairs of groups) in the experimental design,
- groups do not contain **outliers**.
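The first assumption can be checked directly in R: cross-tabulating individuals against groups reveals any missing measurement. Here is a minimal sketch using a small hypothetical long-format dataset (the names `subject`, `time` and `value` are made up for illustration):

```r
# Hypothetical long-format data: 3 subjects, each measured at 3 time points
d <- data.frame(subject = rep(c("s1", "s2", "s3"), times = 3),
                time    = rep(c("t1", "t2", "t3"), each = 3),
                value   = c(5.1, 4.8, 5.3, 6.0, 5.9, 6.2, 7.1, 6.8, 7.0))
# Every subject/time combination must appear exactly once
design.check <- table(d$subject, d$time)
all(design.check == 1)   # TRUE: no missing measurement, the design is complete
```

If any cell of the table is 0, the design is incomplete and the repeated measures ANOVA cannot be run as such.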

As in one-way ANOVA, you may use `aov()`. You may as well fit a linear mixed effects model and thus employ the function `lme()` in the package `nlme`. However, in both cases, we need to tell the function that some of the measurements are not replicates but originate from the same individual. We will thus use the function `Error()` within the function `aov()`, and the argument `random=~1|subjects` in `lme()`.
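To make the parallel concrete, here is a sketch on simulated data (the names `subject`, `group` and `y` are hypothetical). With a balanced, complete design, both calls test the same within-subject effect; since the data are random, the effect will of course not be significant, the point is the syntax:

```r
library(nlme)  # ships with R as a recommended package

set.seed(1)
# Simulated balanced design: 6 subjects, each measured in 3 conditions
d <- data.frame(subject = factor(rep(paste0("s", 1:6), each = 3)),
                group   = factor(rep(c("A", "B", "C"), times = 6)),
                y       = rnorm(18))
# aov(): the Error() term declares that measurements are nested within subjects
summary(aov(y ~ group + Error(subject/group), data = d))
# lme(): a random intercept per subject plays the same role
anova(lme(y ~ group, random = ~ 1 | subject, data = d))
```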

Let’s take an example where 5 rats are weighed 4 times at intervals of 4 weeks (week 8 to week 20). Here is the code for the dataframe:

rat.weight <- c(164,164,158,159,155,220,230,226,227,222,261,275,264,280,272,306,326,320,330,312)
rat.ID <- as.factor(rep(c("rat1","rat2","rat3","rat4","rat5"), 4))
time.point <- as.factor(c(rep("week08",5), rep("week12",5), rep("week16",5), rep("week20",5)))
my.dataframe <- data.frame(rat.ID, time.point, rat.weight)
my.dataframe

Printing `my.dataframe` shows the data in long format: one row per rat and time point, with the columns `rat.ID`, `time.point` and `rat.weight`.

Let’s visualize the boxplots:

plot(rat.weight~time.point, data=my.dataframe, ylab="weight", xlab="time")

Below is a more useful, colored line chart that displays the growth curve for each of the five rats:

with(my.dataframe, interaction.plot(time.point, rat.ID, rat.weight, col=c("black", "red", "blue", "green", "purple"), lty= c(1,2,3,4,5)))

Now, let’s check the assumption of normality of distribution:

shapiro.test(rat.weight[time.point=="week08"])
shapiro.test(rat.weight[time.point=="week12"])
shapiro.test(rat.weight[time.point=="week16"])
shapiro.test(rat.weight[time.point=="week20"])

Everything looks fine: all p-values are above 0.05, so normality of distribution does not constitute a problem.
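The four calls can be condensed with `sapply()`, which loops over the levels of `time.point` and collects one p-value per group (same data as above):

```r
rat.weight <- c(164,164,158,159,155,220,230,226,227,222,
                261,275,264,280,272,306,326,320,330,312)
time.point <- as.factor(c(rep("week08",5), rep("week12",5),
                          rep("week16",5), rep("week20",5)))
# One Shapiro-Wilk p-value per time point
p.values <- sapply(levels(time.point),
                   function(g) shapiro.test(rat.weight[time.point == g])$p.value)
p.values > 0.05   # TRUE for each group: normality is not rejected
```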

**Let’s go for the test with aov() and let’s mention the relationship between observations in Error():**

results <- aov(rat.weight~time.point+Error(rat.ID/time.point), data=my.dataframe)
summary(results)

… and so the test returns a rather large F-value (794) and a very, very small p-value (4.64·10^-14), indicating a main effect of `time.point` on `rat.weight` (unsurprisingly…).

**Let’s go for the test with lme() and let’s mention the relationship between observations with random=:**

Since `lme()` is part of a package, you may need to install and/or load it, if it isn’t already done. Here is how to do it:

install.packages("nlme")
library(nlme)

The following code fits the linear mixed model:

results.lme <- lme(rat.weight~time.point, random=~1|rat.ID, data=my.dataframe)
anova(results.lme)

As for `aov()`, the output shows an F-value close to 794 and a p-value under 0.0001. The null hypothesis can be rejected; there is a significant difference between the means of the groups in `time.point`.

**But what about the assumption of sphericity that we mentioned earlier? How do we test it?**

Sphericity is commonly checked via Mauchly’s test. However, running this test with the corresponding base function (`mauchly.test()`) is a bit complicated in R. In the case of (one-way) repeated measures ANOVA, we commonly use the package `car`, which you may need to install once (`install.packages("car")`) and must load with `library(car)` in each new R session. In fact, there is a way using `car` to run the ANOVA and the Mauchly test at once. This requires that you convert your dataset in a particular way (described in detail in a tutorial by Koji Yatani). Let’s see that by reusing the previous example:

First load the package:

library(car)

Then, convert your dataframe into a matrix. In the following code, `cbind()` picks up all the data in `my.dataframe`, group by group (the groups are `week08`, `week12`, `week16` and `week20`), and builds a matrix.

my.matrix <- with(my.dataframe, cbind(rat.weight[time.point=="week08"],
                                      rat.weight[time.point=="week12"],
                                      rat.weight[time.point=="week16"],
                                      rat.weight[time.point=="week20"]))
my.matrix

Now we build a linear model and a design (by which we name the groups/factors):

model <- lm(my.matrix ~ 1)
design <- factor(c("week08", "week12", "week16", "week20"))

and we eventually run the repeated measures ANOVA via the function `Anova()`:

options(contrasts=c("contr.sum", "contr.poly"))
results <- Anova(model, idata=data.frame(design), idesign=~design, type="III")
summary(results, multivariate=F)

R prints the following output:

In the middle of the output, the Mauchly test for sphericity returns a p-value above 0.05: the null hypothesis stating that there is no difference in variances for any pairwise group comparison is not rejected. We may therefore go ahead with the repeated measures ANOVA, whose result is found at the top of the output. On the line labeled “design”, an F-value and a p-value are given, and they match the results obtained previously.

Now that you know that there are significant differences, you may proceed with an adapted post-hoc test (pairwise comparisons).
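One possible choice, assuming normality holds, is `pairwise.t.test()` with `paired=TRUE` and a multiplicity correction (Bonferroni here; Holm is the function’s default). This is a sketch reusing the rat data from above:

```r
rat.weight <- c(164,164,158,159,155,220,230,226,227,222,
                261,275,264,280,272,306,326,320,330,312)
time.point <- as.factor(c(rep("week08",5), rep("week12",5),
                          rep("week16",5), rep("week20",5)))
# Paired comparisons between all time points; within each level the values
# are ordered by rat (rat1 to rat5), which paired=TRUE relies on
pairwise.t.test(rat.weight, time.point, paired = TRUE,
                p.adjust.method = "bonferroni")
```

With these data, all six adjusted p-values come out below 0.05, consistent with the strong growth effect found by the ANOVA.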

**But what if the assumption of normality of distribution is not met?**

If so, you may use a non-parametric test called **Friedman rank sum test** (or simply Friedman test).

The function to call in R is `friedman.test(data, factor1, factor2)`, where `data` is the vector that contains the measurements, `factor1` is the grouping factor and `factor2` is the blocking factor (where the individual IDs are stored).

Using our example as exposed at the top of this page, the syntax is:

friedman.test(rat.weight, time.point, rat.ID)

and the resulting output is:

The null hypothesis stating that there is no difference between the groups is (again) rejected due to the low p-value.
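After a significant Friedman test, a matching non-parametric post-hoc option is `pairwise.wilcox.test()` with `paired=TRUE`. A small-sample caveat applies, though: with only 5 subjects, an exact two-sided signed-rank p-value can never go below 0.0625, so no pairwise comparison can survive the correction with these data. A sketch on the rat example:

```r
rat.weight <- c(164,164,158,159,155,220,230,226,227,222,
                261,275,264,280,272,306,326,320,330,312)
time.point <- as.factor(c(rep("week08",5), rep("week12",5),
                          rep("week16",5), rep("week20",5)))
# Paired Wilcoxon signed-rank tests between all time points,
# Bonferroni-adjusted (R may warn about ties; p-values are then approximate)
pairwise.wilcox.test(rat.weight, time.point, paired = TRUE,
                     p.adjust.method = "bonferroni")
```

This illustrates why, with very few subjects, rank-based post-hoc tests have little power even when the overall Friedman test is clearly significant.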