## R Graph Cookbook

# Formatting time series data for plotting

Time series or trend charts are the most common form of line graphs. R offers many ways to plot such data, but the data must first be put into a format that R can understand. In this recipe, we will look at some ways of formatting time series data using base R and some additional packages.

## Getting ready

In addition to the basic R functions, we will also be using the *zoo* package in this recipe. So first we need to install it:

install.packages("zoo")

## How to do it…

Let’s use the *dailysales.csv* example dataset and format its *date* column:

sales<-read.csv("dailysales.csv")

d1<-as.Date(sales$date,"%d/%m/%Y")

d2<-strptime(sales$date,"%d/%m/%Y")

data.class(d1)

[1] "Date"

data.class(d2)

[1] "POSIXt"

## How it works…

We have seen two different functions to convert a character vector into dates. If we did not convert the *date* column, R would not automatically recognize the values in the column as dates. Instead, the column would be treated as a character vector or a factor.

The *as.Date()* function takes at least two arguments: the character vector to be converted to dates and the format in which those dates are written. It returns an object of the Date class, represented as the number of days since 1970-01-01, with negative values for earlier dates. The values in the date column are in a DD/MM/YYYY format (you can verify this by typing *sales$date* at the R prompt), so the matching format string is “*%d/%m/%Y*”, where *%Y* stands for a four-digit year (*%y* matches only a two-digit year). The order of the codes is important: with “*%m/%d/%Y*”, days would be read as months and vice versa. The quotes around the value are also necessary.
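The effect of the format string can be checked on a couple of values. The following is a minimal sketch, assuming DD/MM/YYYY input as described above; the sample strings and the variable names (`raw`, `d`, `swapped`, `two_digit`) are made up for illustration and are not taken from *dailysales.csv*.

```r
# Hypothetical sample values in DD/MM/YYYY form (not from dailysales.csv)
raw <- c("01/02/2010", "15/02/2010")

# %d = day, %m = month, %Y = four-digit year
d <- as.Date(raw, format = "%d/%m/%Y")
format(d)        # "2010-02-01" "2010-02-15"
as.numeric(d)    # number of days since 1970-01-01

# Swapping %d and %m reads the first field as the month:
# "01/02/2010" becomes 2 January, and "15/02/2010" (month 15) becomes NA
swapped <- as.Date(raw, format = "%m/%d/%Y")

# %y consumes only two digits, so "2010" is read as "20" (the year 2020)
# and the trailing "10" is ignored
two_digit <- as.Date(raw, format = "%d/%m/%y")
```

Printing `swapped` and `two_digit` at the prompt is a quick way to spot a format string that does not match the data.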

The *strptime()* function is another way to convert character vectors into dates. However, *strptime()* returns an object of a different class, *POSIXlt*, which is a named list of vectors representing the components of a date and time, such as year, month, day, hour, minute, second, and a few more.

*POSIXlt* is one of the two basic date/time classes in R. The other class, *POSIXct*, represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. *POSIXct* is more convenient for including in data frames, and *POSIXlt* is closer to human-readable forms. Both classes inherit from a virtual class, *POSIXt*. That is why running *data.class()* on d2 earlier returned POSIXt; recent versions of R list the more specific class first and return POSIXlt instead.

*strptime()* also takes a character vector to be converted and the format as arguments.
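The difference between the two classes is easier to see on a single value. This is a small sketch using a made-up timestamp; `tz = "UTC"` is an assumption added here so that the result does not depend on the local time zone.

```r
# Hypothetical timestamp; tz = "UTC" keeps the result independent of locale
lt <- strptime("01/02/2010 13:30", format = "%d/%m/%Y %H:%M", tz = "UTC")

class(lt)   # "POSIXlt" "POSIXt" in recent versions of R
lt$hour     # 13
lt$min      # 30
lt$mon      # 1: months are counted from 0, so February is 1
lt$year     # 110: years are counted from 1900

# The same instant as a POSIXct value: one number, in seconds since 1970
ct <- as.POSIXct(lt)
as.numeric(ct)

# Both objects inherit from the virtual class POSIXt
inherits(lt, "POSIXt") && inherits(ct, "POSIXt")   # TRUE
```

Because a *POSIXct* value is a plain numeric vector with a class attribute, it is usually the easier of the two to store as a data frame column.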

## There’s more…

The *zoo* package is handy for dealing with time series data. The *zoo()* function takes an argument x, which can be a numeric vector, matrix, or factor. It also takes an *order.by* argument which has to be an index vector with unique entries by which the observations in *x* are ordered:

library(zoo)

d3<-zoo(sales$units,as.Date(sales$date,"%d/%m/%Y"))

data.class(d3)

[1] "zoo"

See the help on *DateTimeClasses* to find out more details about the ways dates can be represented in R.

# Plotting date and time on the X axis

In this recipe, we will learn how to plot formatted date or time values on the X axis.

## Getting ready

For the first example, we only need to use the base graphics function *plot()*.

## How to do it…

We will use the *dailysales.csv* example dataset to plot the number of units of a product sold daily in a month:

sales<-read.csv("dailysales.csv")

plot(sales$units~as.Date(sales$date,"%d/%m/%Y"),type="l",
xlab="Date",ylab="Units Sold")

## How it works…

Once we have formatted the series of dates using *as.Date()*, we can simply pass it to the *plot()* function as the x variable in either the *plot(x,y)* or *plot(y~x)* format.

We can also use *strptime()* instead of using *as.Date()*. However, we cannot pass the object returned by *strptime()* to *plot()* in the *plot(y~x)* format. We must use the *plot(x,y)* format as follows:

plot(strptime(sales$date,"%d/%m/%Y"),sales$units,type="l",
xlab="Date",ylab="Units Sold")
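If the formula form is preferred, one possible workaround is to convert the *strptime()* result to *POSIXct* before plotting. The sketch below assumes a small made-up data frame (`demo`) in place of the sales data; the `when` column, `tz = "UTC"`, and the null graphics device are choices made only for this illustration.

```r
# Hypothetical stand-in for the sales data frame
demo <- data.frame(date  = c("01/01/2010", "02/01/2010", "03/01/2010"),
                   units = c(120, 95, 140))

# strptime() gives a list-based POSIXlt object; as.POSIXct() turns it into
# a numeric vector with a class, which can be stored as a column
demo$when <- as.POSIXct(strptime(demo$date, "%d/%m/%Y", tz = "UTC"))

pdf(NULL)  # null device, so the sketch runs without opening a window
plot(units ~ when, data = demo, type = "l",
     xlab = "Date", ylab = "Units Sold")
invisible(dev.off())
```

In an interactive session the `pdf(NULL)` and `dev.off()` lines can be dropped so that the plot appears on screen.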

## There’s more…

We can plot the example using the *zoo()* function as follows (assuming zoo is already installed):

library(zoo)

plot(zoo(sales$units,as.Date(sales$date,"%d/%m/%Y")))

Note that we don’t need to specify x and y separately when plotting using zoo; we can just pass the object returned by *zoo()* to *plot()*. We also need not specify the type as “l”.

Let’s look at another example which has full date and time values on the X axis, instead of just dates. We will use the *openair.csv* example dataset for this example:

air<-read.csv("openair.csv")

plot(air$nox~as.Date(air$date,"%d/%m/%Y %H:%M"),type="l",
xlab="Time", ylab="Concentration (ppb)",
main="Time trend of Oxides of Nitrogen")

The same graph can be made using zoo as follows:

plot(zoo(air$nox,as.Date(air$date,"%d/%m/%Y %H:%M")),
xlab="Time", ylab="Concentration (ppb)",
main="Time trend of Oxides of Nitrogen")
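One point worth checking with date-and-time data is that the Date class stores whole days only, so *as.Date()* discards the hours and minutes even when the format string includes them. The sketch below uses made-up hourly timestamps to compare it with *as.POSIXct()*, which keeps the time of day; the sample values and `tz = "UTC"` are assumptions for illustration, not properties of *openair.csv*.

```r
# Hypothetical hourly timestamps laid out as "%d/%m/%Y %H:%M"
stamps <- c("01/01/2010 00:00", "01/01/2010 01:00", "01/01/2010 02:00")

# Date objects hold whole days, so the three readings collapse to one day
as_dates <- as.Date(stamps, "%d/%m/%Y %H:%M")
length(unique(as_dates))   # 1

# POSIXct keeps the time of day; tz = "UTC" is an assumption for this sketch
as_times <- as.POSIXct(stamps, format = "%d/%m/%Y %H:%M", tz = "UTC")
length(unique(as_times))   # 3
diff(as_times)             # steps of one hour
```

If sub-daily detail matters in the plot, passing a *POSIXct* vector as the x variable (or as the index given to *zoo()*) may be a better fit than a Date vector.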