Adèle Mennerat, postdoctoral researcher at the Theoretical Ecology Group, BIO, is interested in the fields of evolutionary and behavioural ecology. In the course BIO201, Adèle uses examples linked to a project she worked on in 2010 – 2012. The following tutorial reuses one of these examples and provides a step-by-step description of the methods she employed to analyse the data.
Type of analysis: Bray-Curtis similarity, data clustering, multidimensional scaling (MDS), SIMPER
Software: PRIMER 6
In brief, the project aims to study how parasitic infection affects host behaviour in wild passerine birds such as Blue Tits Cyanistes caeruleus and Great Tits Parus major. In southern France and Corsica, where researchers from the CEFE CNRS lab in Montpellier study the evolution and ecology of wild passerine birds (more info here), female blue tits garnish their nests with aromatic plants, thus reducing bacterial loads on their offspring. As a result, the chicks grow faster and reach a better condition at fledging; the behaviour resembles a kind of maternal care in the form of preventive medication. In populations of great tits studied by researchers from the EGI in Oxford (more info here), Adèle Mennerat has collaborated with Prof. Ben Sheldon to investigate how avian malaria infection affects the social behaviour of hosts, and how bacteria spread through social networks.
In this tutorial, we will work on two datasets named “communities” and “Sp_richness”. “communities” deals with communities of bacteria sampled from birds and genotyped according to a specific DNA fragment that varies in length between bacterial species; “Sp_richness” contains a summary of bacterial species richness for each sample.
First, you may retrieve the file Datafile_BIO201_apr2015.pwk which contains the datasets by clicking here. You may then open it in PRIMER 6.
This first video (to the right) gives a description of the datasets and the factors which define the samples in this study. The factors are: “Individual” – the ID of the bird, “date” – the day when the individual was captured, “captivity” – indicating whether the sample was taken before or after captivity, “location” – where the individuals were studied, “body_part” – indicating whether the sample was taken from the mouth, feet or wings, and finally “Sp_richness” – the number of bacterial species found in a given sample. Click on the snapshot to watch the video.
We first need to standardize the dataset so that the first column displays the relative abundance of bacterial species instead of raw abundance. A logarithmic transformation of the dataset is then required to give more weight to the rare bacterial species, which are of greater interest in this study. Once these modifications are made, the dataset is ready to be analysed. Click on the snapshot to watch the video.
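The two preparation steps above can be sketched outside PRIMER as follows. This is only an illustration: the abundances are made up, the function names are hypothetical, and the transformation is assumed to be of the log(x + 1) form, which may differ from the exact option chosen in the video.

```python
import math

def standardise(sample):
    """Convert raw abundances to relative abundances (percent of the sample total)."""
    total = sum(sample)
    if total == 0:
        return [0.0 for _ in sample]
    return [100.0 * x / total for x in sample]

def log_transform(sample):
    """Apply log(x + 1) (assumed form), which down-weights dominant species."""
    return [math.log(x + 1.0) for x in sample]

# Hypothetical raw abundances of three bacterial species in one sample.
raw = [90, 9, 1]
relative = standardise(raw)
transformed = log_transform(relative)

# The dominant species outweighs the rare one far less after transformation.
ratio_before = relative[0] / relative[2]
ratio_after = transformed[0] / transformed[2]
print(ratio_before, ratio_after)
```

The ratio between the most and least abundant species shrinks after the transformation, which is what "giving more weight to rare species" means in practice.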
It is now time to start analyzing the data. First, let’s have a look at similarity between samples based on the Bray-Curtis index of similarity. This index ranges from 0 (no species in common) to 1 (identical samples); PRIMER reports it as a percentage, from 0 to 100. Clustering the data using one of the factors (for instance “captivity”) helps to visualise similarities between samples. Multidimensional scaling provides another means to visualise this degree of similarity. This time, two factors are taken into account to “sort” samples in a 2D plot. Click on the snapshot to watch the video.
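To make the index concrete, here is a minimal sketch of the Bray-Curtis similarity between two samples, computed on the 0 to 1 scale. The sample values and the function name are hypothetical; PRIMER performs this calculation for every pair of samples internally.

```python
def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity between two samples, on a 0 to 1 scale."""
    if len(a) != len(b):
        raise ValueError("samples must cover the same species")
    total = sum(a) + sum(b)
    if total == 0:
        raise ValueError("similarity is undefined for two empty samples")
    # Abundance shared by both samples, summed over species.
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / total

# Hypothetical (already transformed) abundances of three species.
mouth = [4.0, 2.0, 0.0]
feet = [1.0, 2.0, 3.0]

print(bray_curtis_similarity(mouth, mouth))  # 1.0
print(bray_curtis_similarity(mouth, feet))   # 0.5
```

Clustering and MDS both start from the matrix of these pairwise values, so samples with a high similarity end up in the same cluster or close together on the 2D plot.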
Let’s now take a look at the second dataset (“Sp_richness”). Using a 2D bubble plot, an association between species richness and body part (the area from which the sample was taken) may be revealed, as clusters of samples with high or low species richness appear on the plot. Click on the snapshot to watch the video.
Returning to the first dataset, we may run the SIMPER tool, which reveals the contribution of each bacterial species to the total similarity of sample groups with respect to specific factors (such as body part). For instance, it may highlight the specificity of some bacterial species to mouth samples. Click on the snapshot to watch the video.
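The idea behind the within-group part of SIMPER can be sketched as follows: the Bray-Curtis similarity of each pair of samples in a group is split into one term per species, and these terms are averaged over all pairs. This is a simplified illustration with made-up values and a hypothetical function name, not PRIMER's implementation, which also reports between-group dissimilarities and cumulative percentages.

```python
from itertools import combinations

def simper_within_group(samples):
    """Average contribution of each species to within-group Bray-Curtis similarity.

    `samples` is a list of equal-length abundance lists. The returned values
    (one per species) sum to the average similarity of the group (0 to 1 scale).
    """
    if len(samples) < 2:
        raise ValueError("at least two samples are needed")
    n_species = len(samples[0])
    contributions = [0.0] * n_species
    pairs = list(combinations(samples, 2))
    for a, b in pairs:
        total = sum(a) + sum(b)
        if total == 0:
            continue  # two empty samples contribute nothing
        for i in range(n_species):
            contributions[i] += 2.0 * min(a[i], b[i]) / total
    return [c / len(pairs) for c in contributions]

# Hypothetical abundances of three species in three mouth samples.
mouth_samples = [
    [4.0, 2.0, 0.0],
    [4.0, 0.0, 2.0],
    [2.0, 2.0, 2.0],
]
contrib = simper_within_group(mouth_samples)
print(contrib)
```

In this toy group the first species is abundant in every sample, so it accounts for most of the average similarity; that is the kind of pattern SIMPER brings out for, say, bacteria typical of mouth samples.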
Finally, this last video shows how to extract a subset of data from the original dataset. This may be useful whenever we want to test a hypothesis that concerns only samples taken from a specific body part, location, etc. Once extracted, the subset of data may be handled in the same way as the whole dataset (clustering, MDS, SIMPER, etc.). Click on the snapshot to watch the video.
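Conceptually, extracting a subset amounts to keeping only the samples whose factors match a given condition. A minimal sketch, assuming each sample is stored with its factor levels; the sample records and the function name are hypothetical, while the factor names follow those described earlier:

```python
# Hypothetical samples, each described by its factors and its abundances.
samples = [
    {"id": "S1", "body_part": "mouth", "captivity": "before", "abundances": [4.0, 2.0, 0.0]},
    {"id": "S2", "body_part": "feet", "captivity": "before", "abundances": [1.0, 2.0, 3.0]},
    {"id": "S3", "body_part": "mouth", "captivity": "after", "abundances": [2.0, 2.0, 2.0]},
    {"id": "S4", "body_part": "wings", "captivity": "after", "abundances": [0.0, 1.0, 5.0]},
]

def select_samples(samples, **criteria):
    """Keep the samples whose factors match every given criterion."""
    return [
        s for s in samples
        if all(s.get(factor) == level for factor, level in criteria.items())
    ]

mouth_only = select_samples(samples, body_part="mouth")
mouth_after = select_samples(samples, body_part="mouth", captivity="after")

print([s["id"] for s in mouth_only])   # ['S1', 'S3']
print([s["id"] for s in mouth_after])  # ['S3']
```

The resulting subset has the same structure as the full dataset, so any later analysis can be applied to it unchanged.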