3 min read

R 4.0.0 was released in source form on Friday, and binaries for Windows, Mac and Linux are available for download now.

As the version number bump suggests, this is a major update to R that makes some significant changes. Some of these changes — particularly the first one listed below — are likely to affect the results of R’s calculations, so I would not recommend running scripts written for prior versions of R without validating them first. In any case, you’ll need to reinstall any packages you were using for R 4.0.0. (You might find this R script useful for checking what packages you have installed for R 3.x.)

You can find the full list of changes and fixes in the NEWS file (it’s long!), but here are the biggest changes:

Imported string data is no long converted to factors. The stringsAsFactors option, which since R’s inception defaulted to TRUE to convert imported string data  to factor objects, is now FALSE. This default was probably the biggest stumbling block for prior users of R: it made statistical modeling a little easier and used a little less memory, but at the expense of confusing behavior on data you probably thought was ordinary strings. This change broke backward compatibility for many packages (mostly now updated on CRAN), and likely affects your own scripts unless you were diligent about including explicit stringsAsFactors declarations in your import function calls.

A new syntax for specifying raw character strings. You can use syntax like r"(any characters except right paren)" to define a literal string. This is particularly useful for HTML code, regular expressions, and other strings that include quotes or backslashes that would otherwise have to be escaped.

An enhanced reference counting system. When you delete an object in R, it usually releases the associated memory back to the operating system. Likewise, if you copy an object with y <- x, R won’t allocate new memory for y unless x is later modified. In prior versions of R, however, that system breaks down if there are more than 2 references to any block of memory. Starting with R 4.0.0, all references will be counted, and so R should reclaim as much memory as possible, reducing R’s overall memory footprint. This will have no impact on how you write R code, but this change make R run faster, especially on systems with limited memory and with slow storage systems.

Normalization of matrix and array types. Conceptually, a matrix is just a 2-dimensional array. But prior versions of R handle matrix and 2-D array objects differently in some cases. In R 4.0.0, matrix objects will formally inherit from the array class, eliminating such inconsistencies.

A refreshed color palette for charts. The base graphics palette for prior versions of R (shown as R3 below) features saturated colors that vary considerably in brightness (for example, yellow doesn’t display as prominently as red). In R 4.0.0, the palette R4 below will be used, with colors of consistent luminance that are easier to distinguish, especially for viewers with color deficiencies. Additional palettes will make it easy to make base graphics charts that match the color scheme of ggplot2 and other graphics systems.

R4 pallette

Performance improvements. The grid graphics system has been revamped (which improves the rendering speed of ggplot2 graphics in particular), socket connections are faster, and various functions have been sped up. 

Cairo graphics devices have been updated to support more fonts and symbols, an improvement particularly relevant to Linux-based users of R.

R version 4 represents a major milestone in the history of R. It’s been just over 20 years since R 1.0.0 was released on February 29 2000, and the history of R extends even further back than that. If you’re interested in the other major milestones, I cover R’s history in this recent talk for the SatRDays DC conference.

For the details on the R 4.0.0 release, including the complete list of changes, check out the announcement at the link below.

R-announce archives: R 4.0.0 is released

  • TAGS
  • R