Reading and Writing Data to and from R


Reading files into R

Usually we will be using data already in a file that we need to read into R in order to work on it. R can read data from a variety of file formats, for example files created as text, or in Excel, SPSS or Stata. We will mainly be reading files in text format .txt or .csv (comma-separated, usually created in Excel).

To read an entire data frame directly, the external file will normally have a special form:

  • The first line of the file should have a name for each variable in the data frame.
  • Each additional line of the file has as its first item a row label, followed by the values for each variable.

Here we use the example dataset called airquality, stored as airquality.csv and airquality.txt.

Input file form with names and row labels:

  Ozone Solar.R Wind Temp Month Day
1    41     190  7.4   67     5   1
2    36     118  8.0   72     5   2
3    12     149 12.6   74     5   3
4    18     313 11.5   62     5   4
5    NA      NA 14.3   56     5   5
   ...

By default numeric items (except row labels) are read as numeric variables. This can be changed if necessary.

The function read.table() can then be used to read the data frame directly

     > airqual <- read.table("C:/Desktop/airquality.txt")
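As a rough illustration of the file form described above, the sketch below writes the first five rows to a temporary file and reads them back. The temporary path and the name airqual_demo are assumptions made only so the example runs on any machine; they are not part of the course files.

```r
# Recreate the first five rows of the example file in a temporary location.
# tempfile() is used only so that this sketch does not depend on C:/Desktop.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Ozone Solar.R Wind Temp Month Day",
  "1 41 190 7.4 67 5 1",
  "2 36 118 8.0 72 5 2",
  "3 12 149 12.6 74 5 3",
  "4 18 313 11.5 62 5 4",
  "5 NA NA 14.3 56 5 5"
), path)

# The first line has one fewer entry than the data lines, so read.table()
# treats it as the variable names and uses the first column as row labels.
airqual_demo <- read.table(path)

str(airqual_demo)            # shows the type R chose for each variable
is.na(airqual_demo$Ozone)    # NA entries in the file are read as missing values
```

Because the header line is one entry shorter than the rows, no header argument is needed here.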

Similarly, to read .csv files the read.csv() function can be used to read in the data frame directly

[Note: I have noticed that occasionally you'll need to use a double slash (//) in your path. This seems to depend on the machine.]

> airqual <- read.csv("C:/Desktop/airquality.csv")
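The sketch below shows read.csv() on a small comma-separated file, and one way the default variable types might be changed using the colClasses argument. The temporary file, its contents (without row labels), and the names airqual_csv and airqual_fac are assumptions for illustration only.

```r
# Write a small comma-separated file to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Ozone,Solar.R,Wind,Temp,Month,Day",
  "41,190,7.4,67,5,1",
  "36,118,8.0,72,5,2",
  "NA,NA,14.3,56,5,5"
), csv_path)

# read.csv() assumes a header line and comma separators by default.
airqual_csv <- read.csv(csv_path)

# colClasses overrides the default type of a named column;
# here Month is read as a factor instead of an integer.
airqual_fac <- read.csv(csv_path, colClasses = c(Month = "factor"))

class(airqual_csv$Month)
class(airqual_fac$Month)
```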

In addition, you can read in files using the file.choose() function in R. After typing this command in R, you can manually select the directory and file where your dataset is located.
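A minimal sketch of how file.choose() might be combined with read.csv() follows. The interactive() guard and the name chosen_path are additions for this example, so that the code does not wait for a dialog when run as a script.

```r
# file.choose() opens a file-selection dialog and returns the selected
# path as a character string, which can be passed to read.csv().
if (interactive()) {
  chosen_path <- file.choose()
  airqual <- read.csv(chosen_path)
  head(airqual)   # show the first few rows to confirm the file was read
}
```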

  1. Read the airquality.csv file into R using the read.csv() command.
  2. Read the airquality.txt file into R using the file.choose() command.

Occasionally, you will need to read in data that does not already have column names.  For example, the dataset BOD.txt looks like this:

1    8.3
2   10.3
3   19.0
4   16.0
5   15.6
7   19.8

Initially, there are no column names associated with the dataset.  We can use the colnames() command to assign column names to the dataset.  Suppose that we want to assign the column names "Time" and "demand" to the BOD.txt dataset.  To do so we do the following

> bod <- read.table("BOD.txt", header=F)

> colnames(bod) <- c("Time","demand")

> colnames(bod)

[1] "Time"   "demand"

The first command reads in the dataset; the option "header=F" specifies that there are no column names associated with the dataset. The second command assigns the column names, and the third displays them.
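The same steps can be tried without the BOD.txt file, as sketched below. The temporary file and the names bod_demo and bod_alt are assumptions for this example. The last call shows the col.names argument of read.table(), which is an alternative to colnames() rather than the approach used in this section.

```r
# Write the six BOD rows to a temporary file with no header line.
bod_path <- tempfile(fileext = ".txt")
writeLines(c("1 8.3", "2 10.3", "3 19.0", "4 16.0", "5 15.6", "7 19.8"),
           bod_path)

# With header = FALSE, R supplies the default names V1, V2, ...
bod_demo <- read.table(bod_path, header = FALSE)
colnames(bod_demo)

# Replace the default names with meaningful ones.
colnames(bod_demo) <- c("Time", "demand")
colnames(bod_demo)

# Alternative: supply the names while reading the file.
bod_alt <- read.table(bod_path, header = FALSE,
                      col.names = c("Time", "demand"))
```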

Read in the cars.txt dataset and call it car1.  Make sure you use the "header=F" option to specify that there are no column names associated with the dataset.  Next, assign "speed" and "dist" to be the first and second column names of the car1 dataset.

The two videos below provide a nice explanation of different methods to read data from a spreadsheet into an R dataset.

Import Data, Copy Data from Excel to R, Both .csv and .txt Formats (R Tutorial 1.3) MarinStatsLectures


Importing Data and Working With Data in R (R Tutorial 1.4) MarinStatsLectures


Writing Data to a File


After working with a dataset, we might like to save it for future use. Before we do this, let's first set up a working directory so we know where we can find all our data sets and files later.

Setting up a Directory

In the R window, click on "File" and then on "Change dir". You should then see a box pop up titled "Choose directory". For this course, choose the directory "Desktop" by clicking on "Browse", then select "Desktop" and click "OK". In the future, you may want to create a directory on your computer where you keep your data sets and code for this course.

Alternatively, you can use the setwd() function to assign a working directory.

> setwd("C:/Desktop")

To find out what your current working directory is, type

> getwd()
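A small sketch of switching the working directory and then restoring it is given below. Here tempdir() stands in for a real project folder, and the name old_wd is an assumption for this example.

```r
old_wd <- getwd()      # remember the current working directory

setwd(tempdir())       # tempdir() stands in for a folder such as "C:/Desktop"
getwd()                # now reports the temporary directory

setwd(old_wd)          # switch back to where we started
getwd()
```

Saving the old directory first makes it easy to return to it after working elsewhere.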

Setting Up Working Directories in R (R Tutorial 1.8) MarinStatsLectures


In R, we can write data frames easily to a file, using the write.table() command.

> write.table(cars1, file="cars1.txt", quote=F)

The first argument refers to the data frame to be written to the output file, the second is the name of the output file. By default R will surround each character or factor entry, as well as the row and column names, with quotes, so we use quote=F to leave them out.

Now, let's check whether R created the file on the Desktop, by going to the Desktop and clicking to open the file. You should see a file with three columns, the first giving the index (or row number) and the other two the speed and distance. R by default creates a column of row indices. If we wanted to create a file without the row indices, we would use the command:

> write.table(cars1, file="cars1.txt", quote=F, row.names=F)
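To see the effect of row.names without opening the file by hand, the sketch below writes a few rows and reads the raw lines back. It assumes the built-in cars data frame as a stand-in for cars1; the temporary file and the names cars_demo, with_rows and without_rows are hypothetical.

```r
# Use the first three rows of the built-in cars data frame as a stand-in.
cars_demo <- head(cars, 3)
out_path <- tempfile(fileext = ".txt")

# Default behaviour: a column of row indices is written.
write.table(cars_demo, file = out_path, quote = FALSE)
with_rows <- readLines(out_path)

# row.names = FALSE leaves the row indices out.
write.table(cars_demo, file = out_path, quote = FALSE, row.names = FALSE)
without_rows <- readLines(out_path)

with_rows      # "speed dist" "1 4 2" "2 4 10" "3 7 4"
without_rows   # "speed dist" "4 2"   "4 10"   "7 4"
```

Note that with row indices the header line has one fewer entry than the data lines, which is the form read.table() recognises when reading the file back.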

Datasets in R


Watch the video below for a concise introduction to working with the variables in an R dataset.

Working with Variables and Data in R (R Tutorial 1.5) MarinStatsLectures


Around 100 datasets are supplied with R (in the package datasets), and others are available.

To see the list of datasets currently available use the command:

> data()
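The listing can also be captured as an object instead of being viewed in a separate window. The sketch below restricts the listing to the datasets package; the name available is an assumption for this example.

```r
# data() returns an object whose "results" component is a character matrix
# with one row per dataset and columns including "Item" and "Title".
available <- data(package = "datasets")$results

nrow(available)                          # number of entries listed
head(available[, c("Item", "Title")])    # first few names and descriptions
"CO2" %in% available[, "Item"]           # check that a given dataset is listed
```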

We will first look at a data set on CO2 (carbon dioxide) uptake in grass plants available in R.

> CO2

[Note: capitalization matters here; also, it's the letter O, not zero. Typing this command should display the entire dataset called CO2, which has 84 observations (in rows) and 5 variables (columns).]

To get more information on the variables in the dataset, type in

> help(CO2)
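Besides the help page, a few commands give a quick overview of a dataset without printing all of it. The sketch below applies them to CO2; in the datasets package supplied with R, the concentration and uptake variables are stored under the column names conc and uptake.

```r
dim(CO2)      # number of rows and columns
names(CO2)    # variable names as stored in the data frame
str(CO2)      # type and first few values of each variable
head(CO2)     # first six rows
```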

Evaluate and report the mean and standard deviation of the variables "Concentration" and "Uptake".

Subsetting Data in R With Square Brackets and Logic Statements (R Tutorial 1.6) MarinStatsLectures
