The function summarize() (which may also be written summarise()) creates a table in which you will find the result(s) of the summary function(s) you have chosen to apply to a data frame. The summary functions may be:
mean(): which returns the mean of a variable,sd(): which returns the standard deviation of a variable,median(): which returns the median of a variable,min(): which returns the minimum value of a variable,max(): which returns the maximum value of a variable,var(): which returns the variance of a variable,sum(): which returns the sum of a variable,- etc.
To apply one or more of these summary functions to a data frame, you just have to indicate in summarise() which function(s) you want to apply and on which variable of the data frame. The syntax is:
summarise(dataframe, function1(variable), function2(variable), ...)
Alternatively, using pipes, the syntax is:
dataframe %>% summarise(function1(variable), function2(variable), ...)
Let’s use the data frame Orange as an example. The top of the data frame looks like this:
head(Orange)
To calculate the mean and the standard deviation of the variable circumference, we write either
summarise(Orange, mean(circumference), sd(circumference))
OR
Orange %>% summarise(mean(circumference), sd(circumference))
which both result in:

This example actually does not make much sense in terms of biology. Indeed, we have calculated the average of circumference for different trees, but considering measurements performed at 7 different time points… Instead we could calculate the average circumference and standard deviation for each time point described in age by using group_by on the variable age (read more about group_by here).
To calculate the group means and standard deviations of the variable circumference, we write:
Orange %>% group_by(age) %>% summarise(mean(circumference), sd(circumference))
which results in:

Each line in the result table now shows the mean and standard deviation for each of 7 factors in age described in the first column.

