4. Subscripts of a dataframe


R lets you extract parts of your dataframe. This can be useful to isolate single data elements, columns of variables, or subsets of your sample. Here are a few commands that create subsets from the dataframe.

 

extracting single data elements

Type the name of the object followed by square brackets containing the coordinates of what you need, row number first and column number second:

[code language="r"]
my.imported.data[6,2]
[/code]


extracting rows or columns

To extract a whole row or a whole column, type the name of the object followed by brackets in which one of the two coordinates is left empty. In the command, column x is designated by [,x] while row y is designated by [y,]:

[code language="r"]
my.imported.data[6,]
my.imported.data[,2]
[/code]
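
To make the coordinate logic concrete, here is a minimal sketch on a small made-up dataframe. The object `toy.data` and its values are invented for illustration and are not the `my.imported.data` used in this tutorial:

```r
# Hypothetical toy data; the values are invented for illustration only
toy.data <- data.frame(
  ID   = 1:6,
  Var1 = c(70, 62, 65, 81, 58, 66)
)

element <- toy.data[6, 2]   # row 6, column 2: a single value
row6    <- toy.data[6, ]    # row 6, all columns: a one-row dataframe
col2    <- toy.data[, 2]    # all rows, column 2: a plain vector

print(element)
print(row6)
print(col2)
```

The row coordinate always comes before the comma and the column coordinate after it, so leaving one side empty means "take everything" along that dimension.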


Note that you can use the name of a variable to extract the corresponding column: just replace the column number with the variable name in quotes. Do not omit the comma, because the result then comes back in a different format. With the comma you get the values as a plain vector, whereas without it you get a dataframe containing that single column:

[code language="r"]
my.imported.data[,"Var1"]
my.imported.data["Var1"]
[/code]
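
The difference between the two forms is easiest to see by checking the class of each result. The sketch below uses a hypothetical `toy.data` with invented values:

```r
# Hypothetical toy data; the values are invented for illustration only
toy.data <- data.frame(ID = 1:4, Var1 = c(70, 62, 65, 81))

with.comma    <- toy.data[, "Var1"]   # values only, as a numeric vector
without.comma <- toy.data["Var1"]     # still a dataframe, with one column

print(class(with.comma))
print(class(without.comma))
```

Which form is preferable depends on what comes next: functions such as `mean()` expect the vector, while the one-column dataframe keeps the variable name attached.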


extracting elements in a column based on value in another column

A useful command lets you retrieve data elements based on the value found in a different variable. This is convenient when you want to work specifically on the values of one group or subgroup (for instance control vs. treated). In the following example, we retrieve the data elements in the Var1 column, but only those for which the corresponding data element in Var2 is treated. The command assumes that the columns can be called directly by their names, for instance because the dataframe has been attached with attach():

[code language="r"]
Var1[Var2=="treated"]
[/code]
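
If the columns are not directly accessible by name, the same selection can be written with the `$` operator. This is a minimal sketch on a hypothetical `toy.data` with invented values; it also shows that the condition inside the brackets is simply a vector of TRUE and FALSE:

```r
# Hypothetical toy data; the values are invented for illustration only
toy.data <- data.frame(
  Var1 = c(70, 62, 65, 81),
  Var2 = c("treated", "control", "treated", "control")
)

# Logical vector: TRUE where Var2 is "treated", FALSE elsewhere
is.treated <- toy.data$Var2 == "treated"

# Keep only the Var1 values of the rows flagged TRUE
treated.values <- toy.data$Var1[is.treated]

print(is.treated)
print(treated.values)
```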


Note the == symbol that we have used here. It MUST be written that way; with only one = symbol the expression is no longer a comparison, and R does not return the values you expect:

[code language="r"]
Var1[Var2="treated"]
[/code]
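
The sketch below contrasts the two spellings on two hypothetical vectors with invented values. With a single =, the brackets receive the string "treated" as an index rather than a comparison, and since no element of `Var1` carries that name, the result is a missing value:

```r
# Hypothetical vectors; the values are invented for illustration only
Var1 <- c(70, 62, 65, 81)
Var2 <- c("treated", "control", "treated", "control")

correct  <- Var1[Var2 == "treated"]   # comparison: selects matching elements
mistaken <- Var1[Var2 = "treated"]    # no comparison: looks up the name "treated"

print(correct)
print(mistaken)
```

R raises no error for the second command, which is why this mistake is easy to overlook.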


You may use this type of command to filter data based on a threshold in a different variable. In the next example, we retrieve the ID of the observations for which Var1 is strictly greater than 65 in the first instance, and greater than or equal to 65 in the second instance:

[code language="r"]
my.imported.data
ID[Var1>65]
ID[Var1>=65]
[/code]
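
The two operators only differ for observations that sit exactly on the threshold. Here is a minimal sketch on a hypothetical `toy.data` (invented values) in which one observation has Var1 equal to 65:

```r
# Hypothetical toy data; the values are invented for illustration only
toy.data <- data.frame(
  ID   = c("a", "b", "c", "d"),
  Var1 = c(70, 62, 65, 81),
  stringsAsFactors = FALSE
)

strictly.over <- toy.data$ID[toy.data$Var1 > 65]    # excludes the value 65
at.least      <- toy.data$ID[toy.data$Var1 >= 65]   # includes the value 65

print(strictly.over)
print(at.least)
```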


You may even combine conditions to restrict your selection further. Conditions are combined with the symbol &, which keeps only the data elements for which both conditions are true. Here we select data elements based on two variables: Var1>=65 and Var3=="TRUE".

[code language="r"]
my.imported.data
ID[Var1>=65 & Var3=="TRUE"]
[/code]
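
As a sketch of how the combination works, the example below builds the combined condition step by step on a hypothetical `toy.data` with invented values. Storing Var3 as text is an assumption made here to mirror the quoted "TRUE" in the command above:

```r
# Hypothetical toy data; the values are invented for illustration only
toy.data <- data.frame(
  ID   = c("a", "b", "c", "d"),
  Var1 = c(70, 62, 65, 81),
  Var3 = c("TRUE", "TRUE", "FALSE", "TRUE"),
  stringsAsFactors = FALSE
)

# A row is kept only if BOTH conditions are TRUE for that row
keep <- toy.data$Var1 >= 65 & toy.data$Var3 == "TRUE"

selected <- toy.data$ID[keep]

print(keep)
print(selected)
```

Observation "b" fails the first condition and "c" fails the second, so only "a" and "d" remain.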
