Entire data frames may be put together either beside each other (thus increasing the number of variables) or below each other (thus increasing the number of cases) into a single, large table. Here we focus on combining data frames beside each other.
One function that can do this is bind_cols(). Note that bind_cols() may be applied only to data frames with the same number of rows (cases), because it matches rows purely by position. If the data frames differ in length, you will have to use another function such as left_join() or right_join() (among others), which match rows on a shared key variable and add NA wherever necessary.
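The following is a minimal sketch of the second case, where the data frames differ in length. The data frames `left` and `right` and their columns `id`, `height`, and `weight` are hypothetical toy data, not part of the Orange example.

```r
library(dplyr)

# Hypothetical toy data: two data frames of different length sharing the key `id`
left  <- tibble(id = 1:4, height = c(10, 12, 9, 15))
right <- tibble(id = c(1L, 2L, 4L), weight = c(3.1, 2.8, 4.0))

# left_join() keeps every row of `left` and matches rows of `right` by `id`;
# id 3 has no partner in `right`, so its weight is filled with NA
joined <- left_join(left, right, by = "id")
joined
```

Because the rows are matched on `id` rather than by position, the result keeps the four rows of `left`, and the single unmatched case receives NA in `weight`.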
To illustrate how bind_cols() works, we will use the data frames Orange and Orange2 as examples. Orange2 is a data frame similar to Orange, with the difference that the values of the variable circumference have been multiplied by 5 using the following line of code:
Orange2 <- Orange %>% mutate(circumference = circumference * 5)
We thus have the following two data frames, Orange and Orange2:
Note that the two data frames have the same three variables: Tree, age, and circumference. We can combine them with the following code:
bind_cols(Orange, Orange2)
This gives us a table with 35 rows (like the original data frames) but 6 columns instead of 3.
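The dimensions stated above can be checked with a short sketch like the one below. Converting to tibbles with as_tibble() and the names `Orange_tbl`, `Orange2_tbl`, and `combined` are choices made for this illustration only; the conversion mainly keeps the printed output compact.

```r
library(dplyr)

# as_tibble() is an optional choice for this sketch, not something bind_cols() requires
Orange_tbl  <- as_tibble(Orange)
Orange2_tbl <- Orange_tbl %>% mutate(circumference = circumference * 5)

combined <- bind_cols(Orange_tbl, Orange2_tbl)

dim(combined)    # number of rows, then number of columns
names(combined)  # the exact repaired names depend on the installed dplyr version
```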
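As you can see, bind_cols() does not try to recognize identical variables. Instead, it copies all the variables of Orange2 to the right of the variables of Orange and then modifies the names that occur more than once so that every column name is unique. How the names are modified depends on the dplyr version: older versions add a digit to a name which has been encountered before (for example Tree1), whereas dplyr 1.0.0 and later append the column position to every duplicated name (for example Tree...1 and Tree...4).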
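If the automatically repaired names are not wanted, one possible workaround is to rename the columns of the second data frame before binding. The sketch below assumes dplyr 1.0.0 or later (for rename_with()); the suffix "_x5" and the object names are hypothetical choices for this illustration.

```r
library(dplyr)

Orange_tbl <- as_tibble(Orange)

# Give every column of the second data frame an explicit, hypothetical suffix
Orange2_tbl <- Orange_tbl %>%
  mutate(circumference = circumference * 5) %>%
  rename_with(~ paste0(.x, "_x5"))

# No name occurs twice any more, so bind_cols() has nothing to repair
combined <- bind_cols(Orange_tbl, Orange2_tbl)
names(combined)
```

Choosing the suffix yourself makes the origin of each column explicit and keeps the result independent of how a given dplyr version repairs duplicated names.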
bind_cols() does not (need to) rename variables whose names occur only once. Here is what happens when the data frames have (at least) one variable which is not common to both. Let’s use Orange3, which is similar to Orange, with the difference that the variable circumference has been renamed to circumferenceNEW using the following code:
Orange3 <- Orange %>% rename(circumferenceNEW = circumference)
head(Orange3)
bind_cols(Orange, Orange3)
We end up with a table of 35 observations and 6 variables, but neither circumference nor circumferenceNEW has been renamed. Tree and age, however, still appear twice, each under a modified name.
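One way to avoid the duplicated Tree and age columns is to keep only the new variable from the second data frame before binding. This is a minimal sketch under the assumption that both data frames list the same cases in the same row order, since bind_cols() matches rows by position; the object names ending in `_tbl` and `combined` are hypothetical.

```r
library(dplyr)

Orange_tbl  <- as_tibble(Orange)
Orange3_tbl <- Orange_tbl %>% rename(circumferenceNEW = circumference)

# Keep only the column that Orange does not already contain, then bind
combined <- bind_cols(Orange_tbl, Orange3_tbl %>% select(circumferenceNEW))
names(combined)
```

The result has the three original variables plus circumferenceNEW, with no repeated names to repair.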