R lets you extract parts of your dataframe. This is useful for isolating individual data elements, columns of variables, or subsets of your sample. Here are a few commands that create subsets from a dataframe.
extracting single data elements
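Type the name of the object followed by square brackets containing the coordinates of the element you need, row first, then column. The following command returns the element found in row 6, column 2:
[code language="r"]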
my.imported.data[6,2]
[/code]
extracting rows or columns
Type the name of the object followed by square brackets containing the coordinates of what you need. Leave one side of the comma empty to keep everything along that dimension: column #x is designated by [,x] while row #y is designated by [y,]:
[code language="r"]
my.imported.data[6,]   # row 6, all columns
my.imported.data[,2]   # column 2, all rows
[/code]
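As a quick illustration of the coordinate notation, here is a minimal sketch on a small made-up dataframe. The object toy.data and its contents are hypothetical and only serve the example; they are not part of the imported data used above.
[code language="r"]
# Hypothetical toy dataframe, for illustration only
toy.data <- data.frame(
  ID   = c("a", "b", "c"),
  Var1 = c(60, 65, 70),
  Var2 = c("control", "treated", "treated"),
  stringsAsFactors = FALSE
)

single.value <- toy.data[2, 3]   # row 2, column 3: one element
second.row   <- toy.data[2, ]    # row 2 with all its columns (still a dataframe)
second.col   <- toy.data[, 2]    # column 2 (Var1) with all its rows (a vector)
[/code]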
Note that you can also use the name of a variable to extract the corresponding column: just replace the column number with the variable name in quotes. Do not omit the comma. With the comma, R returns the column as a plain vector; without it, R returns a one-column dataframe, which prints in a different format, as depicted here:
[code language="r"]
my.imported.data[,"Var1"]   # vector of values
my.imported.data["Var1"]    # one-column dataframe
[/code]
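If you are unsure which of the two forms you have obtained, class() tells you. The sketch below uses a hypothetical dataframe named toy.data, made up for this example.
[code language="r"]
# Hypothetical toy dataframe, for illustration only
toy.data <- data.frame(
  Var1 = c(60, 65, 70),
  Var2 = c("control", "treated", "treated"),
  stringsAsFactors = FALSE
)

with.comma    <- toy.data[, "Var1"]   # comma present: plain vector
without.comma <- toy.data["Var1"]     # comma omitted: one-column dataframe

class(with.comma)      # "numeric"
class(without.comma)   # "data.frame"
[/code]
The vector form is usually the one you want when passing the column to functions such as mean().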
extracting elements in a column based on value in another column
A useful command allows you to find and retrieve data elements based on the value found in a different variable. This is quite convenient when you want to work specifically on the values of a group or subgroup (for instance control vs. treated). In the following example, we want to retrieve the data elements in the Var1 column, but only those for which the corresponding data element in Var2 is treated. Note that this form assumes Var1 and Var2 can be referred to directly by name, for example after attach(my.imported.data); otherwise prefix them with the dataframe name, as in my.imported.data$Var1:
[code language="r"]
Var1[Var2=="treated"]
[/code]
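If the columns are not directly accessible by name, the same selection can be written with $ or with(). This is a minimal sketch on a hypothetical dataframe, toy.data, whose values are invented for the example.
[code language="r"]
# Hypothetical toy dataframe, for illustration only
toy.data <- data.frame(
  Var1 = c(60, 65, 70, 80),
  Var2 = c("control", "treated", "treated", "control"),
  stringsAsFactors = FALSE
)

# Option 1: name the dataframe explicitly with $
treated.values <- toy.data$Var1[toy.data$Var2 == "treated"]

# Option 2: with() evaluates the expression using the dataframe's columns
treated.values.2 <- with(toy.data, Var1[Var2 == "treated"])
[/code]
Both options return the Var1 values of the treated rows only, here 65 and 70.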
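Note the == symbol that we have used here. This MUST be written that way: == tests for equality, whereas a single = does not. Omitting one of the = symbols leads you directly into a problem, because R no longer performs a comparison and typically returns NA instead of the values you wanted, as depicted here:
[code language="r"]
Var1[Var2="treated"]   # wrong: a single = is not a comparison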
[/code]
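The difference can be reproduced on two short vectors. The values below are made up for this sketch; Var1 and Var2 are plain vectors here, not columns of the imported data.
[code language="r"]
# Hypothetical vectors, for illustration only
Var1 <- c(60, 65, 70)
Var2 <- c("control", "treated", "treated")

correct <- Var1[Var2 == "treated"]   # logical test on Var2: returns 65 70
mistake <- Var1[Var2 = "treated"]    # argument name is ignored; R looks for an
                                     # element named "treated" and returns NA
[/code]
No error is raised in the second case, which is what makes this mistake easy to overlook.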
You may use this type of command to filter data based on a threshold in a different variable. In the next example, we want to retrieve the ID of the data for which Var1 is strictly greater than 65 in the first instance, and then greater than or equal to 65 in the second instance:
[code language="r"]
my.imported.data   # print the dataframe to inspect it
ID[Var1>65]        # IDs where Var1 is strictly greater than 65
ID[Var1>=65]       # IDs where Var1 is greater than or equal to 65
[/code]
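The two operators only differ for rows sitting exactly on the threshold. Here is a small sketch, assuming a hypothetical dataframe toy.data in which one row has Var1 equal to 65.
[code language="r"]
# Hypothetical toy dataframe, for illustration only
toy.data <- data.frame(
  ID   = c("a", "b", "c"),
  Var1 = c(60, 65, 70),
  stringsAsFactors = FALSE
)

strictly.over <- toy.data$ID[toy.data$Var1 > 65]    # "c"
at.or.over    <- toy.data$ID[toy.data$Var1 >= 65]   # "b" "c"
[/code]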
You may even combine variables to restrict your selection further. Conditions are combined with the symbol &, which keeps only the elements for which both conditions are true. Here we select data elements based on two variables: Var1>=65 and Var3=="TRUE".
[code language="r"]
my.imported.data                # print the dataframe to inspect it
ID[Var1>=65 & Var3=="TRUE"]     # IDs meeting both conditions
[/code]
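The same combined condition can also be placed in the row position of the brackets to retrieve entire rows rather than IDs only. The sketch below relies on a hypothetical dataframe, toy.data, in which Var3 is stored as text; that storage type is an assumption made for the example.
[code language="r"]
# Hypothetical toy dataframe, for illustration only
toy.data <- data.frame(
  ID   = c("a", "b", "c", "d"),
  Var1 = c(60, 65, 70, 80),
  Var3 = c("TRUE", "FALSE", "TRUE", "TRUE"),
  stringsAsFactors = FALSE
)

# Both conditions must hold for a row to be kept
keep <- toy.data$Var1 >= 65 & toy.data$Var3 == "TRUE"

selected.ids  <- toy.data$ID[keep]   # "c" "d"
selected.rows <- toy.data[keep, ]    # the matching rows, all columns
[/code]
Storing the condition in its own object (keep) makes it easy to reuse the same filter on several columns.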