
Formatting time series data for plotting

Time series or trend charts are the most common form of line graphs. R offers many ways to plot such data, but the dates must first be converted into a form that R recognizes as dates. In this recipe, we will look at some ways of formatting time series data using base R and some additional packages.

Getting ready

In addition to the basic R functions, we will also be using the zoo package in this recipe. So first we need to install it:

install.packages("zoo")


How to do it…

Let’s use the dailysales.csv example dataset and format its date column:

sales<-read.csv("dailysales.csv")

d1<-as.Date(sales$date,"%d/%m/%y")

d2<-strptime(sales$date,"%d/%m/%y")

data.class(d1)
[1] "Date"

data.class(d2)   # older versions of R report "POSIXt" here
[1] "POSIXlt"


How it works…

We have seen two different functions to convert a character vector into dates. If we did not convert the date column, R would not automatically recognize the values in the column as dates. Instead, the column would be treated as a character vector or a factor.
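A quick way to see this is to read a tiny CSV from a string and inspect the column class. The two rows below are invented for illustration and are not taken from dailysales.csv:

```r
# Made-up rows standing in for a CSV file; not the real dataset.
csv_text <- "date,units\n01/01/2010,120\n02/01/2010,135"
demo <- read.csv(text = csv_text)

# "character" in R 4.0.0 and later, "factor" under the older default
class(demo$date)

# After conversion the column is a proper Date vector
demo$date <- as.Date(demo$date, "%d/%m/%Y")
class(demo$date)
```

Until the conversion is done, plotting functions have no way of knowing that the strings should be spaced according to calendar time.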

Here we pass as.Date() two arguments: the character vector to be converted and a format string that describes how the dates are written in that vector. It returns an object of the Date class, represented as the number of days since 1970-01-01, with negative values for earlier dates. The values in the date column are in day/month/year order (you can verify this by typing sales$date at the R prompt), so we specify the format argument as "%d/%m/%y". The order is important: if we used "%m/%d/%y" instead, days would be read as months and vice versa. Case matters too: %y matches a two-digit year and %Y a four-digit year, so if the years in the file are written with four digits, use "%d/%m/%Y". The quotes around the format string are also necessary.
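The effect of the format string is easy to check on a single value. The sketch below uses an invented date string, chosen so that the day and month differ:

```r
x <- "03/04/2010"  # invented sample value

day_first   <- as.Date(x, "%d/%m/%Y")  # 3 April 2010
month_first <- as.Date(x, "%m/%d/%Y")  # 4 March 2010
two_digit   <- as.Date(x, "%d/%m/%y")  # %y consumes only "20", giving the year 2020

format(c(day_first, month_first, two_digit))

# A Date is stored as a day count relative to 1970-01-01
as.numeric(as.Date("1970-01-10"))  # 9
as.numeric(as.Date("1969-12-31"))  # -1
```

None of the three calls raises an error, so a wrong format string tends to show up only as implausible dates on the axis.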

The strptime() function is another way to convert character vectors into dates. However, strptime() returns an object of a different class, POSIXlt, which is a named list of vectors representing the components of a date and time, such as year, month, day, hour, minute, and second, among others.

POSIXlt is one of the two basic date/time classes in R. The other class, POSIXct, represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. POSIXct is more convenient for including in data frames, and POSIXlt is closer to human-readable forms. Both classes inherit from a virtual class, POSIXt, so class(d2) lists POSIXlt and POSIXt together. The data.class() function reports only the first entry of that vector, which is POSIXlt in current versions of R and was POSIXt in older ones.
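The difference between the two classes can be seen by building one value of each. This is a minimal sketch with an invented timestamp; the time zone is fixed to UTC so that the numbers do not depend on the local machine:

```r
lt <- strptime("15/03/2010 14:30", "%d/%m/%Y %H:%M", tz = "UTC")  # invented timestamp
ct <- as.POSIXct(lt)

# POSIXlt: a list of components (year counts from 1900, month from 0)
lt$year + 1900
lt$mon + 1
lt$hour

# POSIXct: a single number of seconds since 1970-01-01 00:00 UTC
as.numeric(ct)

# Both inherit from the virtual class POSIXt
inherits(lt, "POSIXt")
inherits(ct, "POSIXt")
```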

strptime() also takes a character vector to be converted and the format as arguments.

There’s more…

The zoo package is handy for dealing with time series data. The zoo() function takes an argument x, which can be a numeric vector, matrix, or factor. It also takes an order.by argument which has to be an index vector with unique entries by which the observations in x are ordered:

library(zoo)

d3<-zoo(sales$units,as.Date(sales$date,"%d/%m/%y"))

data.class(d3)
[1] "zoo"
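To see how order.by controls the ordering, the following sketch builds a small zoo object from invented values that are deliberately out of date order. It assumes the zoo package is installed, and the data are not from dailysales.csv:

```r
library(zoo)  # assumes the zoo package is installed

# Invented values, deliberately given out of date order
dates <- as.Date(c("03/01/2010", "01/01/2010", "02/01/2010"), "%d/%m/%Y")
units <- c(128, 120, 135)

z <- zoo(units, order.by = dates)

index(z)     # the dates, sorted
coredata(z)  # the values, reordered to match

# Keep only observations from 2 January onwards
window(z, start = as.Date("2010-01-02"))
```

The index() and coredata() functions separate the time index from the observations, which is convenient when only one of the two is needed.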


See the help on DateTimeClasses to find out more details about the ways dates can be represented in R.

Plotting date and time on the X axis

In this recipe, we will learn how to plot formatted date or time values on the X axis.

Getting ready

For the first example, we only need to use the base graphics function plot().

How to do it…

We will use the dailysales.csv example dataset to plot the number of units of a product sold daily in a month:

sales<-read.csv("dailysales.csv")
plot(sales$units~as.Date(sales$date,"%d/%m/%y"),type="l",
xlab="Date",ylab="Units Sold")



How it works…

Once we have formatted the series of dates using as.Date(), we can simply pass it to the plot() function as the x variable in either the plot(x,y) or plot(y~x) format.

We can also use strptime() instead of using as.Date(). However, we cannot pass the object returned by strptime() to plot() in the plot(y~x) format. We must use the plot(x,y) format as follows:

plot(strptime(sales$date,"%d/%m/%Y"),sales$units,type="l",
xlab="Date",ylab="Units Sold")
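If the formula interface is preferred, one option is to convert the strptime() result to POSIXct before plotting. A minimal sketch, using an invented three-row data frame (sales_demo is a hypothetical name, not the real dataset):

```r
# Invented data standing in for dailysales.csv
sales_demo <- data.frame(
  date  = c("01/01/2010", "02/01/2010", "03/01/2010"),
  units = c(120, 135, 128)
)

# POSIXct is a plain numeric vector underneath, so it can be used in a formula
when <- as.POSIXct(strptime(sales_demo$date, "%d/%m/%Y", tz = "UTC"))

plot(sales_demo$units ~ when, type = "l",
     xlab = "Date", ylab = "Units Sold")
```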


There’s more…

We can plot the example using the zoo() function as follows (assuming zoo is already installed):

library(zoo)
plot(zoo(sales$units,as.Date(sales$date,"%d/%m/%y")))


Note that we don’t need to specify x and y separately when plotting using zoo; we can just pass the object returned by zoo() to plot(). We also need not specify the type as “l”.

Let’s look at another example which has full date and time values on the X axis, instead of just dates. We will use the openair.csv example dataset for this example:

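air<-read.csv("openair.csv")

# as.POSIXct() keeps the time of day; as.Date() would discard it.
# tz="UTC" avoids gaps at daylight-saving changes.
plot(air$nox~as.POSIXct(air$date,format="%d/%m/%Y %H:%M",tz="UTC"),type="l",
xlab="Time", ylab="Concentration (ppb)",
main="Time trend of Oxides of Nitrogen")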
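When a column holds times as well as dates, it can help to check what each conversion function keeps. The sketch below uses three invented timestamps from the same day:

```r
stamps <- c("01/01/2010 00:00", "01/01/2010 06:00", "01/01/2010 12:00")  # invented

as_dates <- as.Date(stamps, "%d/%m/%Y %H:%M")
as_times <- as.POSIXct(stamps, format = "%d/%m/%Y %H:%M", tz = "UTC")

length(unique(as_dates))  # 1: the time of day is dropped
length(unique(as_times))  # 3: each timestamp stays distinct
```

For sub-daily data, a POSIXct index keeps every observation at its own position on the X axis, whereas a Date index stacks all readings from one day on the same point.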



The same graph can be made using zoo as follows:

# A POSIXct index keeps each timestamp distinct, as zoo expects
plot(zoo(air$nox,as.POSIXct(air$date,format="%d/%m/%Y %H:%M",tz="UTC")),
xlab="Time", ylab="Concentration (ppb)",
main="Time trend of Oxides of Nitrogen")
