





















































This article by Alan Mark Berg, the author of Jenkins Continuous Integration Cookbook Second Edition, outlines the main themes surrounding the correct use of a Jenkins server.
Jenkins (http://jenkins-ci.org/) is a Java-based Continuous Integration (CI) server that supports the discovery of defects early in the software cycle. Thanks to over 1,000 plugins, Jenkins communicates with many types of systems, building software and triggering a wide variety of tests.
CI involves making small changes to software and then building and applying quality assurance processes. Defects do not occur only in the code; they also appear in naming conventions, documentation, software design, build scripts, the process of deploying the software to servers, and so on. CI forces the defects to emerge early, rather than waiting for software to be fully produced. The later in the software development life cycle a defect is caught, the more expensive it is to fix. The cost of repair radically increases as soon as the bugs escape to production. Estimates suggest it is 100 to 1,000 times cheaper to capture defects early. Effective use of a CI server, such as Jenkins, could be the difference between enjoying a holiday and working unplanned hours to heroically save the day. And as you can imagine, in my day job as a senior developer with aspirations to quality assurance, I like long boring days, at least for mission-critical production environments.
Jenkins can automate the regular building of software and trigger tests, pulling in the results and failing the build based on defined criteria. Failing early via build failure lowers the costs, increases confidence in the software produced, and has the potential to morph subjective processes into an aggressive metrics-based process that the development team feels is unbiased.
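As a rough illustration of failing on defined criteria, the following sketch is written in R, the scripting language used in the recipe later in this article. It assumes that a build step is treated as failed when its process exits with a non-zero status, and that a running Jenkins build can be detected through the BUILD_NUMBER environment variable. The function name, the threshold, and the sample value are all hypothetical.

```r
# Hypothetical quality gate: the names and the threshold are illustrative only.
max_allowed_failures <- 0

# Return the exit status a build step should report: 0 for pass, 1 for fail.
gate_status <- function(failed_tests, max_allowed = max_allowed_failures) {
  if (failed_tests > max_allowed) 1L else 0L
}

failed_tests <- 3   # stand-in for a value parsed from a real test report
status <- gate_status(failed_tests)
message("Failed tests: ", failed_tests, ", exit status: ", status)

# Only terminate the process when running inside a Jenkins build, which is
# assumed here to be detectable through the BUILD_NUMBER environment variable.
if (nzchar(Sys.getenv("BUILD_NUMBER")) && status != 0L) {
  quit(save = "no", status = status)
}
```

Keeping the decision in a small function makes the criterion easy to review and to reuse across jobs.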
Jenkins is:
In 2002, NIST estimated that software defects were costing America around 60 billion dollars per year (http://www.abeacha.com/NIST_press_release_bugs_cost.htm). Expect the cost to have increased considerably since.
To save money and improve quality, you need to remove defects as early in the software lifecycle as possible. The Jenkins test automation creates a safety net of measurements. Another key benefit is that once you have added tests, it is trivial to develop similar tests for other projects.
Jenkins works well with best practices such as Test Driven Development (TDD) or Behavior Driven Development (BDD). Using TDD, you write tests that fail first and then build the functionality needed to pass the tests. With BDD, the project team writes the description of tests in terms of behavior. This makes the description understandable to a wider audience. The wider audience has more influence over the details of the implementation.
Regression tests increase confidence that you have not broken code while refactoring software. The more coverage of code by tests, the more confidence.
There are a number of good introductions to software metrics. These include a wikibook on the details of the metrics (http://en.wikibooks.org/wiki/Introduction_to_Software_Engineering/Quality/Metrics) and a well-written book by Diomidis Spinellis, Code Quality: The Open Source Perspective.
Remote testing through Jenkins considerably increases the number of dependencies in your infrastructure and thus the maintenance effort. Remote testing is a problem that is domain specific, decreasing the size of the audience that can write tests.
You need to make test writing accessible to a large audience. Embracing the largest possible audience improves the chances that the tests defend the intent of the application.
The technologies highlighted in the Jenkins book include:
Jenkins is not only a CI server, it is also a platform to create extra functionality. Once a few concepts are learned, a programmer can adapt available plugins to their organization's needs.
If you see a feature that is missing, it is normally easier to adapt an existing plugin than to write one from scratch. If you are thinking of adapting a plugin, the plugin tutorial (https://wiki.jenkins-ci.org/display/JENKINS/Plugin+tutorial) is a good starting point. The tutorial also provides relevant background information on the infrastructure you use daily.
There is a large amount of information available on plugins. Here are some key points:
By keeping to Jenkins conventions, the amount of source code you write is minimized and the readability is improved.
The three frameworks that are heavily used in Jenkins are as follows:
In the Jenkins book, we also provide recipes that support maintenance cycles. For large-scale deployments of Jenkins within diverse enterprise infrastructure, proper maintenance of Jenkins is crucial to planning predictable software cycles. Proper maintenance lowers the risk of failures:
Jenkins has many plugins that allow it to integrate easily into complex and diverse environments. If there is a need that is not directly supported, you can always use a scripting language of your choice and wire that into your jobs. In this section, we'll explore the R plugin and see how it can help you generate great graphics.
R is a popular programming language for statistics (http://en.wikipedia.org/wiki/R_programming_language). It has many hundreds of extensions and has a powerful set of graphical capabilities. In this recipe, we will show you how to use the graphical capabilities of R within your Jenkins Jobs and then point you to some excellent starter resources.
For a full list of plugins that improve the UI of Jenkins including Jenkins' graphical capabilities, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugins#Plugins-UIplugins.
Install the R plugin (https://wiki.jenkins-ci.org/display/JENKINS/R+Plugin). Review the R installation documentation (http://cran.r-project.org/doc/manuals/r-release/R-admin.html).
sudo apt-get install r-base
apt-cache search r-cran | less
paste('=======================================')
paste('WORKSPACE: ', Sys.getenv('WORKSPACE'))
paste('BUILD_URL: ', Sys.getenv('BUILD_URL'))
print('ls /var/lib/jenkins/jobs/R-ME/builds/')
paste('BUILD_NUMBER: ', Sys.getenv('BUILD_NUMBER'))
paste('JOB_NAME: ', Sys.getenv('JOB_NAME'))
paste('JENKINS_HOME: ', Sys.getenv('JENKINS_HOME'))
paste('JOB LOCATION: ', Sys.getenv('JENKINS_HOME'), '/jobs/',
      Sys.getenv('JOB_NAME'), '/builds/',
      Sys.getenv('BUILD_NUMBER'), "/test.pdf", sep="")
paste('=======================================')

filename <- paste('pie_', Sys.getenv('BUILD_NUMBER'), '.pdf', sep="")
pdf(file=filename)
slices <- c(1,2,3,3,6,2,2)
labels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")
pie(slices, labels = labels, main="Number of failed jobs for each day of the week")

filename <- paste('freq_', Sys.getenv('BUILD_NUMBER'), '.pdf', sep="")
pdf(file=filename)
Number_OF_LINES_OF_ACTIVE_CODE=rnorm(10000, mean=200, sd=50)
hist(Number_OF_LINES_OF_ACTIVE_CODE, main="Frequency plot of Class Sizes")

filename <- paste('scatter_', Sys.getenv('BUILD_NUMBER'), '.pdf', sep="")
pdf(file=filename)
Y <- rnorm(3000)
plot(Y, main='Random Data within a normal distribution')
The following screenshot is a histogram of the values from the random data generated by the R script during the build process. The data simulates class sizes within a large project:
Another view is a pie chart. The fake data represents the number of failed jobs for each day of the week. If you make this plot against your own values, you might see particularly bad days, such as the day before or after the weekend. This might say something about how developers work or how motivation is distributed through the week.
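To plot your own values, the failures first have to be counted per weekday. The following is a minimal sketch of one way to do that in base R. The dates are hypothetical stand-ins for a real build history, and the output filename is only illustrative.

```r
# Hypothetical dates of failed builds; in practice these would come from
# your own build history.
failed_on <- as.Date(c("2015-03-02", "2015-03-06", "2015-03-06",
                       "2015-03-09", "2015-03-13", "2015-03-14"))

# as.POSIXlt()$wday numbers the days 0 (Sunday) to 6 (Saturday),
# independently of the locale.
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
day_index <- as.POSIXlt(failed_on)$wday + 1

# Count failures per weekday, keeping days with zero failures in the table.
counts <- table(factor(day_names[day_index], levels = day_names))
print(counts)

# Empty days would only add overlapping labels to the pie, so drop them.
slices <- counts[counts > 0]
pdf(file = file.path(tempdir(), "failed_by_weekday.pdf"))
pie(as.numeric(slices), labels = names(slices),
    main = "Number of failed jobs for each day of the week")
dev.off()
```

Using the numeric weekday rather than the weekdays() function keeps the labels in English regardless of the locale of the build server.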
Perform the following steps:
Started by user anonymous
Building in workspace /var/lib/jenkins/workspace/ch4.Powerfull.Visualizations
[ch4.Powerfull.Visualizations] $ Rscript /tmp/hudson6203634518082768146.R
[1] "======================================="
[1] "WORKSPACE: /var/lib/jenkins/workspace/ch4.Powerfull.Visualizations"
[1] "BUILD_URL: "
[1] "ls /var/lib/jenkins/jobs/R-ME/builds/"
[1] "BUILD_NUMBER: 9"
[1] "JOB_NAME: ch4.Powerfull.Visualizations"
[1] "JENKINS_HOME: /var/lib/jenkins"
[1] "JOB LOCATION: /var/lib/jenkins/jobs/ch4.Powerfull.Visualizations/builds/9/test.pdf"
[1] "======================================="
Finished: SUCCESS
Click on Back to Project.
Click on Workspace.
With a few lines of R code, you have generated three different well-presented PDF graphs.
The R plugin ran a script as part of the build. The script printed out the WORKSPACE and other Jenkins environment variables to the console:
paste ('WORKSPACE: ', Sys.getenv('WORKSPACE'))
Next, a filename is set with the build number appended to the pie_ string. This allows the script to generate a different filename each time it is run, as shown:
filename <-paste('pie_',Sys.getenv('BUILD_NUMBER'),'.pdf',sep="")
The script now opens output to the location defined in the filename variable through the pdf(file=filename) command. By default, the output directory is the job's workspace.
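The recipe script never closes its PDF devices and relies on R doing so when the script ends. A slightly more defensive variant, sketched below, closes each device explicitly with dev.off() and builds the path with file.path(). The fallback to a temporary directory and to build number 0 is an assumption added here so that the sketch also runs outside Jenkins.

```r
# Use the Jenkins workspace when available; fall back to a temporary
# directory so the sketch also runs outside Jenkins (an assumption made
# here for convenience).
workspace <- Sys.getenv("WORKSPACE", unset = tempdir())
build_number <- Sys.getenv("BUILD_NUMBER", unset = "0")

filename <- file.path(workspace,
                      paste("pie_", build_number, ".pdf", sep = ""))

pdf(file = filename)   # open the PDF device
pie(c(1, 2, 3), labels = c("a", "b", "c"), main = "Example")
dev.off()              # close the device so the file is completely written

file.exists(filename)
```

Closing the device right after each plot also makes it obvious which plot ends up in which file.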
Next, we define fake data for the graph, representing the number of failed jobs on any given day of the week. Note that in the simulated world, Friday is a bad day:
slices <- c(1,2,3,3,6,2,2)
labels <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday","Saturday","Sunday")
We then plot the pie chart, as follows:
pie(slices, labels = labels, main="Number of failed jobs for each day of the week")
For the second graph, we generated 10,000 pieces of random data within a normal distribution.
The fake data represents the number of lines of active code that ran for a given job:
Number_OF_LINES_OF_ACTIVE_CODE=rnorm(10000, mean=200, sd=50)
The hist command generates a frequency plot:
hist(Number_OF_LINES_OF_ACTIVE_CODE,main="Frequency plot of Class Sizes")
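If you need the numbers behind the frequency plot, for example to log them in the console output, hist can return the binning without drawing anything. The following sketch uses the same simulated distribution as the recipe. The fixed seed, the number of breaks, and the limit of 300 lines are illustrative choices, not part of the original script.

```r
set.seed(1)  # fixed seed so repeated runs give the same simulated data
class_sizes <- rnorm(10000, mean = 200, sd = 50)

# plot = FALSE returns the binning without drawing anything.
h <- hist(class_sizes, breaks = 20, plot = FALSE)
str(h$breaks)   # bin boundaries
str(h$counts)   # number of values in each bin

# Share of simulated classes longer than a hypothetical limit of 300 lines.
mean(class_sizes > 300)
```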
The third graph is a scatter plot with 3,000 data points generated at random within a normal distribution.
This represents a typical sampling process, such as the number of potential defects found using Sonar or FindBugs:
Y <- rnorm(3000)
plot(Y,main='Random Data within a normal distribution')
We will leave it as an exercise for the reader to link real data to the graphing capabilities of R.
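As a possible starting point, the sketch below reads measurements from a CSV file and plots them in the same way as the recipe. The file name, the column names, and the values are hypothetical; the file is written by the sketch itself only so that the example is self-contained.

```r
# Hypothetical input: one row per class with its line count. The file and
# column names are illustrative and are written here only so the sketch
# is self-contained.
csv_file <- file.path(tempdir(), "class_sizes.csv")
write.csv(data.frame(class = c("A", "B", "C", "D"),
                     lines = c(120, 340, 95, 210)),
          csv_file, row.names = FALSE)

# Read the measurements back, as a job would read a report from its workspace.
metrics <- read.csv(csv_file)

out_file <- file.path(tempdir(), "class_sizes.pdf")
pdf(file = out_file)
hist(metrics$lines, main = "Frequency plot of Class Sizes",
     xlab = "Lines of code")
dev.off()
```

In a real job, the CSV would typically be produced by an earlier build step and picked up from the workspace.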
Here are a couple more points for you to think about.
A popular IDE for R is RStudio (http://www.rstudio.com/). The open source edition is free. The feature set includes a source code editor with code completion and syntax highlighting, integrated help, solid debugging features and a slew of other features.
An alternative for the Eclipse environment is the StatET plugin (http://www.walware.de/goto/statet).
The first place to start to learn R is by typing help.start() from the R console. The command launches a browser with an overview of the main documentation.
If you want a description of an R command, typing ? before its name displays detailed help documentation. For example, in the recipe we looked at the use of the rnorm command. Typing ?rnorm produces documentation similar to:
The Normal Distribution
Description
Density, distribution function, quantile function and random generation
for the normal distribution with mean equal to mean and standard deviation equal to sd.
Usage
dnorm(x, mean = 0, sd = 1, log = FALSE)
pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE)
rnorm(n, mean = 0, sd = 1)
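The four functions in this help page are closely related. The short sketch below, which is not part of the recipe, shows pnorm and qnorm acting as inverses and compares a theoretical probability with simulated data drawn using the same mean and sd as the recipe. The seed and the interval of 100 to 300 are arbitrary choices for illustration.

```r
# pnorm and qnorm are inverses of each other.
p <- pnorm(1.96)   # probability of a standard normal value below 1.96
q <- qnorm(p)      # recovers 1.96, up to floating point error

# Theoretical share of values within two standard deviations of the mean.
inside <- pnorm(300, mean = 200, sd = 50) - pnorm(100, mean = 200, sd = 50)

# The same share estimated from simulated data.
set.seed(42)
x <- rnorm(10000, mean = 200, sd = 50)
observed <- mean(x > 100 & x < 300)

print(c(theoretical = inside, observed = observed))
```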
Jenkins is not just a CI server, it is also a vibrant and highly active community. Enlightened self-interest dictates participation. There are a number of ways to do this:
An efficient approach to learning how to use Jenkins effectively is to download and install the server and then try out recipes that you find in books or on the Internet, or that your fellow developers have written. I wish you good fortune and an interesting learning experience.