Detailed hands-on recipes for creating the most useful types of graphs in R – from the simplest versions to more advanced applications
- Learn to draw any type of graph or visual data representation in R
- Filled with practical tips and techniques for creating any type of graph you need, not just theoretical explanations
- All examples are accompanied by the corresponding graph images, so you know what the results look like
- Each recipe is independent and contains the complete explanation and code to perform the task as efficiently as possible
Setting colors of points, lines, and bars
In this recipe we will learn the simplest way to change the colors of points, lines, and bars in scatter plots, line plots, histograms, and bar plots.
All you need to try out this recipe is to run R and type the recipe at the command prompt. You can also choose to save the recipe as a script so that you can use it again later on.
How to do it…
The simplest way to change the color of any graph element is by using the col argument. For example, the plot() function takes the col argument:
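As a minimal sketch of this, the block below plots some random values with red points. The data are illustrative (generated with rnorm()), not taken from the recipe's code file.

```r
# Scatter plot of 1000 random values, drawn with red points.
set.seed(42)          # fixed seed so the plot is reproducible
x <- rnorm(1000)      # illustrative data only
plot(x, col = "red")  # col sets the color of the plotted points
```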
If we choose plot type as line, then the color is applied to the plotted line. Let’s use the dailysales.csv example dataset. First, we need to load it:
Sales <- read.csv("dailysales.csv", header=TRUE)
type="l", # Specify type of plot as l for line
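A self-contained sketch of the line plot might look like the following. Because the contents of dailysales.csv are not shown here, it builds a small stand-in data frame instead of reading the file; the column names date and units and all the values are assumptions made for illustration.

```r
# Stand-in for dailysales.csv: the column names `date` and `units`
# and the values themselves are assumed, not taken from the real file.
Sales <- data.frame(
  date  = seq(as.Date("2010-01-01"), by = "day", length.out = 30),
  units = round(5000 + 300 * sin(seq_len(30) / 4))
)

plot(Sales$date, Sales$units,
     type = "l",     # draw a line instead of points
     col  = "blue")  # with type = "l", col colors the line
```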
Similarly, the points() and lines() functions apply the col argument’s value to the plotted points and lines respectively.
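For instance, the following sketch adds a blue line and red points to an empty plot. The data are made up for illustration.

```r
x <- 1:10
y <- x^2                             # illustrative data only

plot(x, y, type = "n")               # set up the axes without drawing the data
lines(x, y, col = "blue")            # col colors the added line
points(x, y, col = "red", pch = 19)  # col colors the added (solid) points
```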
barplot() and hist() also take the col argument and apply it to the bars. So the following code would produce a bar plot with blue bars:
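A minimal sketch, using made-up values rather than the recipe's dataset, could be:

```r
# Illustrative values only; not the recipe's dataset.
values <- c(3, 5, 2, 8, 6)
mids <- barplot(values, col = "blue")  # every bar is filled blue

set.seed(1)
hist(rnorm(1000), col = "blue")        # hist() fills its bars the same way
```

barplot() returns the midpoints of the bars it drew, which is why the result is stored in mids here; it is handy if you later want to add labels above the bars.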
The col argument for boxplot() sets the fill color of the plotted boxes.
How it works…
The col argument applies the specified color to the elements being plotted, based on the plot type. So, if we do not specify a plot type or choose points, the color is applied to points. Similarly, if we choose line as the plot type, the color is applied to the plotted line, and if we use the col argument in the barplot() or hist() commands, the color is applied to the bars.
col accepts names of colors such as red, blue, and black. The colors() (or colours()) function lists all the built-in colors (more than 650) available in R. We can also specify colors as hexadecimal codes such as #FF0000 (for red), #0000FF (for blue), and #000000 (for black). If you have ever made web pages, you will know that these hex codes are used in HTML to represent colors.
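The short sketch below lists the built-in color names and checks that a color name and its hex code describe the same color. It uses col2rgb(), a base R function not mentioned in the recipe, purely as a convenient way to compare the two.

```r
all_names <- colors()   # character vector of the built-in color names
length(all_names)       # how many named colors are available
head(all_names)         # the first few names

# A color name and its hex code resolve to the same red/green/blue values.
red_by_name <- col2rgb("red")
red_by_hex  <- col2rgb("#FF0000")
all(red_by_name == red_by_hex)
```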
col can also take numeric values. When it is set to a numeric value, the color corresponding to that index in the current color palette is used. For example, in the default color palette the first color is black and the second color is red (since R 4.0.0, a softer red, #DF536B). So col=1 and col=2 refer to black and red respectively. Index 0 corresponds to the background color.
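To see which colors the numeric indices map to on your system, you can inspect the current palette with palette(). The sketch below is illustrative, and the exact colors printed depend on your R version and on any palette you have set.

```r
pal <- palette()   # the current palette as a character vector
pal[1:2]           # the colors used for col = 1 and col = 2

x <- 1:2
# The first point is drawn in pal[1], the second in pal[2].
plot(x, x, col = 1:2, pch = 19, cex = 3)
```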
In many settings, col can also take a vector of multiple colors, instead of a single color. This is useful if you wish to use more than one color in a graph. The heat.colors() function takes a number as an argument and returns a vector of that many colors. So heat.colors(5) produces a vector of five colors.
Type the following at the R prompt:
heat.colors(5)
You should get the following output:
[1] "#FF0000FF" "#FF5500FF" "#FFAA00FF" "#FFFF00FF" "#FFFF80FF"
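Those are five colors in hexadecimal format. The last two digits (FF) are the alpha channel and mean fully opaque; newer versions of R may print six-digit codes without this suffix.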
Another way of specifying a vector of colors is to construct one:
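A minimal sketch with one bar per city follows. The data frame is a hypothetical stand-in for the recipe's dataset: only Seattle (first) and Mumbai (last) are named in the text, so the other cities, the Units column, and all the figures are assumptions.

```r
# Hypothetical stand-in data: the middle three cities, the column name
# `Units`, and every value are made up for illustration.
sales <- data.frame(
  City  = c("Seattle", "London", "Tokyo", "Berlin", "Mumbai"),
  Units = c(23, 89, 24, 36, 3),
  stringsAsFactors = FALSE
)

bar_cols <- c("red", "blue", "green", "orange", "pink")  # one color per bar
barplot(sales$Units, names.arg = sales$City, col = bar_cols)
```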
In the example, we set the value of col to c("red","blue","green","orange","pink"), which is a vector of five colors.
We have to take care that the vector has as many colors as there are elements to plot (in this case, bars). If the vector is shorter, R will ‘recycle’ values by repeating colors from the beginning of the vector. For example, if we had only four colors in the previous plot, R would apply the four colors to the first four bars and then reuse the first color for the fifth bar. This is called recycling in R:
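The sketch below shows recycling with four colors and five bars. As before, only Seattle and Mumbai come from the text; the other city names and all the figures are hypothetical. rep_len() is used only to spell out which color each bar ends up with.

```r
# Hypothetical figures; only the first and last city names come from the text.
units <- c(Seattle = 23, London = 89, Tokyo = 24, Berlin = 36, Mumbai = 3)
four_cols <- c("red", "blue", "green", "orange")  # one color too few

barplot(units, col = four_cols)  # the fifth bar reuses the first color

# Spell out the colors as they are recycled across the five bars.
used <- rep_len(four_cols, length(units))
used
```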
In the example, both the bars for the first and last data rows (Seattle and Mumbai) would be of the same color (red), making it difficult to distinguish one from the other.
One good way to ensure that you always have the correct number of colors is to count the elements first and pass that count as an argument to one of the color palette functions. For example, if we did not know the number of cities in the example we have just seen, we could do the following to make sure the number of colors matches the number of bars plotted:
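A sketch of this approach, again on a hypothetical stand-in for the recipe's data (the Units column, the middle three cities, and all values are assumptions), might be:

```r
# Hypothetical stand-in data, as the real dataset is not shown here.
sales <- data.frame(
  City  = c("Seattle", "London", "Tokyo", "Berlin", "Mumbai"),
  Units = c(23, 89, 24, 36, 3),
  stringsAsFactors = FALSE
)

n_cities <- length(sales$City)       # number of bars to be drawn
bar_cols <- heat.colors(n_cities)    # exactly one color per bar
barplot(sales$Units, names.arg = sales$City, col = bar_cols)
```

If rows are later added to or removed from the data, the palette grows or shrinks with it, so no bar shares a color through recycling.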
We used the length() function to find the number of elements in the vector sales$City and passed it as the argument to heat.colors(). So, regardless of the number of cities, we will always have the right number of colors.
In the next four recipes, we will see how to change the colors of other elements. The fourth recipe, where we look at color combinations and palettes, is especially useful.