13 min read

In this article by Luca Milanesio, author of the book Learning Gerrit Code review, we will learn about Gerrit Code revew. GitHub is the world’s largest platform for the free hosting of Git Projects, with over 4.5 million registered developers. We will now provide a step-by-step example of how to connect Gerrit to an external GitHub server so as to share the same set of repositories. Additionally, we will provide guidance on how to use the Gerrit Code Review workflow and GitHub concurrently.

By the end of this article we will have our Gerrit installation fully integrated and ready to be used for both open source public projects and private projects on GitHub.

(For more resources related to this topic, see here.)

GitHub workflow

GitHub has become the most popular website for open source projects, thanks to the migration of some major projects to Git (for example, Eclipse) and new projects adopting it, along with the introduction of the social aspect of software projects that piggybacks on the Facebook hype. The following diagram shows the GitHub collaboration model:

The key aspects of the GitHub workflow are as follows:

  • Each developer pushes to their own repository and pulls from others
  • Developers who want to make a change to another repository, create a fork on GitHub and work on their own clone
  • When forked repositories are ready to be merged, pull requests are sent to the original repository maintainer
  • The pull requests include all of the proposed changes and their associated discussion threads
  • Whenever a pull request is accepted, the change is merged by the maintainer and pushed to their repository on GitHub

 

GitHub controversy

The preceding workflow works very effectively for most open source projects; however, when the projects gets bigger and more complex, the tools provided by GitHub are too unstructured, and a more defined review process with proper tools, additional security, and governance is needed.

In May 2012 Linus Torvalds , the inventor of Git version control, openly criticized GitHub as a commit editing tool directly on the pull request discussion thread: ” I consider GitHub useless for these kinds of things. It’s fine for hosting, but the pull requests and the online commit editing, are just pure garbage ” and additionally, ” the way you can clone a (code repository), make changes on the web, and write total crap commit messages, without GitHub in any way making sure that the end result looks good. ” See https://github.com/torvalds/linux/pull/17#issuecomment-5654674.

Gerrit provides the additional value that Linus Torvalds claimed was missing in the GitHub workflow: Gerrit and GitHub together allows the open source development community to reuse the extended hosting reach and social integration of GitHub with the power of governance of the Gerrit review engine.

GitHub authentication

The list of authentication backends supported by Gerrit does not include GitHub and it cannot be used out of the box, as it does not support OpenID authentication. However, a GitHub plugin for Gerrit has been recently released in order to fill the gaps and allow a seamless integration.

GitHub implements OAuth 2.0 for allowing external applications, such as Gerrit, to integrate using a three-step browser-based authentication. Using this scheme, a user can leverage their existing GitHub account without the need to provision and manage a separate one in Gerrit. Additionally, the Gerrit instance will be able to self-provision the SSH public keys needed for pushing changes for review.

In order for us to use GitHub OAuth authentication with Gerrit, we need to do the following:

  • Build the Gerrit GitHub plugin
  • Install the GitHub OAuth filter into the Gerrit libraries (/lib under the Gerrit site directory)
  • Reconfigure Gerrit to use the HTTP authentication type

 

Building the GitHub plugin

The Gerrit GitHub plugin can be found under the Gerrit plugins/github repository on https://gerrit-review.googlesource.com/#/admin/projects/plugins/github. It is open source under the Apache 2.0 license and can be cloned and built using the Java 6 JDK and Maven.

Refer to the following example:

$ git clone https://gerrit.googlesource.com/plugins/github $ cd github $ mvn install […] [INFO] BUILD SUCCESS [INFO] ——————————————————- [INFO] Total time: 9.591s [INFO] Finished at: Wed Jun 19 18:38:44 BST 2013 [INFO] Final Memory: 12M/145M [INFO] ——————————————————-

The Maven build should generate the following artifacts:

  • github-oauth/target/github-oauth*.jar, the GitHub OAuth library for authenticating Gerrit users
  • github-plugin/target/github-plugin*.jar, the Gerrit plugin for integrating with GitHub repositories and pull requests

Installing GitHub OAuth library

The GitHub OAuth JAR file needs to copied to the Gerrit /lib directory; this is required to allow Gerrit to use it for filtering all HTTP requests and enforcing the GitHub three-step authentication process:

$ cp github-oauth/target/github-oauth-*.jar /opt/gerrit/lib/

Installing GitHub plugin

The GitHub plugin includes the additional support for the overall configuration, the advanced GitHub repositories replication, and the integration of pull requests into the Code Review process.

We now need to install the plugin before running the Gerrit init again so that we can benefit from the simplified automatic configuration steps:

$ cp github-plugin/target/github-plugin-*.jar /opt/gerrit/
plugins/github.jar

Register Gerrit as a GitHub OAuth application

Before going through the Gerrit init, we need to tell GitHub to trust Gerrit as a partner application. This is done through the generation of a ClientId/ClientSecret pair associated to the exact Gerrit URLs that will be used for initiating the 3-step OAuth authentication.

We can register a new application in GitHub through the URL https://github.com/settings/applications/new, where the following three fields are requested:

  • Application name : It is the logical name of the application authorized to access GitHub, for example, Gerrit.
  • Main URL : The Gerrit canonical web URL used for redirecting to GitHub OAuth authentication, for example, https://myhost.mydomain:8443.
  • Callback URL : The URL that GitHub should redirect to when the OAuth authentication is successfully completed, for example, https://myhost.mydomain:8443/oauth.

GitHub will automatically generate a unique pair ClientId/ClientSecret that has to be provided to Gerrit identifying them as a trusted authentication partner.

ClientId/ClientSecret are not GitHub credentials and cannot be used by an interactive user to access any GitHub data or information. They are only used for authorizing the integration between a Gerrit instance and GitHub.

Running Gerrit init to configure GitHub OAuth

We now need to stop Gerrit and go through the init steps again in order to reconfigure the Gerrit authentication. We need to enable HTTP authentication by choosing an HTTP header to be used to verify the user’s credentials, and to go through the GitHub settings wizard to configure the OAuth authentication.

$ /opt/gerrit/bin/gerrit.sh stop Stopping Gerrit Code Review: OK $ cd /opt/gerrit $ java -jar gerrit.war init […] *** User Authentication *** Authentication method []: HTTP RETURN Get username from custom HTTP header [Y/n]? Y RETURN Username HTTP header []: GITHUB_USER RETURN SSO logout URL : /oauth/reset RETURN *** GitHub Integration *** GitHub URL [https://github.com]: RETURN Use GitHub for Gerrit login ? [Y/n]? Y RETURN ClientId []: 384cbe2e8d98192f9799 RETURN ClientSecret []: f82c3f9b3802666f2adcc4 RETURN Initialized /opt/gerrit $ /opt/gerrit/bin/gerrit.sh start Starting Gerrit Code Review: OK

 

Using GitHub login for Gerrit

Gerrit is now fully configured to register and authenticate users through GitHub OAuth. When opening the browser to access any Gerrit web pages, we are automatically redirected to the GitHub for login. If we have already visited and authenticated with GitHub previously, the browser cookie will be automatically recognized and used for the authentication, instead of presenting the GitHub login page. Alternatively, if we do not yet have a GitHub account, we create a new GitHub profile by clicking on the SignUp button.

Once the authentication process is successfully completed, GitHub requests the user’s authorization to grant access to their public profile information. The following screenshot shows GitHub OAuth authorization for Gerrit:

The authorization status is then stored under the user’s GitHub applications preferences on https://github.com/settings/applications.

Finally, GitHub redirects back to Gerrit propagating the user’s profile securely using a one-time code which is used to retrieve the full data profile including username, full name, e-mail, and associated SSH public keys.

Replication to GitHub

The next steps in the Gerrit to GitHub integration is to share the same Git repositories and then keep them up-to-date; this can easily be achieved by using the Gerrit replication plugin.

The standard Gerrit replication is a master-slave, where Gerrit always plays the role of the master node and pushes to remote slaves. We will refer to this scheme as push replication because the actual control of the action is given to Gerrit through a git push operation of new commits and branches.

Configure Gerrit replication plugin

In order to configure push replication we need to enable the Gerrit replication plugin through Gerrit init:

$ /opt/gerrit/bin/gerrit.sh stop Stopping Gerrit Code Review: OK $ cd /opt/gerrit $ java -jar gerrit.war init […] *** Plugins *** Prompt to install core plugins [y/N]? y RETURN Install plugin reviewnotes version 2.7-rc4 [y/N]? RETURN Install plugin commit-message-length-validator version 2.7-rc4 [y/N]? RETURN Install plugin replication version 2.6-rc3 [y/N]? y RETURN Initialized /opt/gerrit $ /opt/gerrit/bin/gerrit.sh start Starting Gerrit Code Review: OK

The Gerrit replication plugin relies on the replication.config file under the /opt/gerrit/etc directory to identify the list of target Git repositories to push to. The configuration syntax is a standard .ini format where each group section represents a target replica slave.

See the following simplest replication.config script for replicating to GitHub:

[remote “github”] url = [email protected]:myorganisation/${name}.git

The preceding configuration enables all of the repositories in Gerrit to be replicated to GitHub under the myorganisa tion GitHub Team account.

Authorizing Gerrit to push to GitHub

Now, that Gerrit knows where to push, we need GitHub to authorize the write operations to its repositories. To do so, we need to upload the SSH public key of the underlying OS user where Gerrit is running to one of the accounts in the GitHub myorganisation team, with the permissions to push to any of the GitHub repositories.

Assuming that Gerrit runs under the OS user gerrit, we can copy and paste the SSH public key values from the ~gerrit/.ssh/id_rsa.pub (or ~gerrit/.ssh/id_dsa.pub) to the Add an SSH Key section of the GitHub account under target URL to be set to: https://github.com/settings/ssh

Start working with Gerrit replication

Everything is now ready to start playing with Gerrit to GitHub replication. Whenever a change to a repository is made on Gerrit, it will be automatically replicated to the corresponding GitHub repository.

In reality there is one additional operation that is needed on the GitHub side: the actual creation of the empty repositories using https://github.com/new associated to the ones created in Gerrit. We need to make sure that we select the organization name and repository name, consistent with the ones defined in Gerrit and in the replication.config file.

Never initialize the repository from GitHub with an empty commit or readme file; otherwise the first replication attempt from Gerrit will result in a conflict and will then fail.

Now GitHub and Gerrit are fully connected and whenever a repository in GitHub matches one of the repositories in Gerrit, it will be linked and synchronized with the latest set of commits pushed in Gerrit. Thanks to the Gerrit-GitHub authentication previously configured, Gerrit and GitHub share the same set of users and the commits authors will be automatically recognized and formatted by GitHub. The following screenshot shows Gerrit commits replicated to GitHub:

Reviewing and merging to GitHub branches

The final goal of the Code Review process is to agree and merge changes to their branches. The merging strategies need to be aligned with real-life scenarios that may arise when using Gerrit and GitHub concurrently.

During the Code Review process the alignment between Gerrit and GitHub was at the change level, not influenced by the evolution of their target branches. Gerrit changes and GitHub pull requests are isolated branches managed by their review lifecycle.

When a change is merged, it needs to align with the latest status of its target branch using a fast-forward, merge, rebase, or cherry-pick strategy. Using the standard Gerrit merge functionality, we can apply the configured project merge strategy to the current status of the target branch on Gerrit. The situation on GitHub may have changed as well, so even if the Gerrit merge has succeeded there is no guarantee that the actual subsequent synchronization to GitHub will do the same!

The GitHub plugin mitigates this risk by implementing a two-phase submit + merge operation for merging opened changes as follows:

  • Phase-1 : The change target branch is checked against its remote peer on GitHub and fast forwarded if needed. If two branches diverge, the submit + merge is aborted and manual merge intervention is requested.
  • Phase-2 : The change is merged on its target branch in Gerrit and an additional ad hoc replication is triggered. If the merge succeeds then the GitHub pull request is marked as completed.

At the end of Phase-2 the Gerrit and GitHub statuses will be completely aligned. The pull request author will then receive the notification that his/her commit has been merged.

Using Gerrit and GitHub on http://gerrithub.io

When using Gerrit and GitHub on the web with public or private repositories, all of the commits are replicated from Gerrit to GitHub, and each one of them has a complete copy of the data. If we are using a Git and collaboration server on GitHub over the Internet, why can’t we do the same for its Gerrit counterpart? Can we avoid installing a standalone instance of Gerrit just for the purpose of going through a formal Code Review?

One hassle-free solution is to use the GerritHub service (http://gerrithub.io), which offers a free Gerrit instance on the cloud already configured and connected with GitHub through the github-plugin and github-oauth authentication library. All of the flows that we have covered in this article are completely automated, including the replication and automatic pull request to change automation. As accounts are shared with GitHub, we do not need to register or create another account to use GerritHub; we can just visit http://gerrithub.io and start using Gerrit Code Review with our existing GitHub projects without having to teach our existing community about a new tool.

GerritHub also includes an initial setup Wizard for the configuration and automation of the Gerrit projects and the option to configure the Gerrit groups using the existing GitHub. Once Gerrit is configured, the Code Review and GitHub can be used seamlessly for achieving maximum control and social reach within your developer community.

Summary

We have now integrated our Gerrit installation with GitHub authentication for a seamless Single-Sign-On experience. Using an existing GitHub account we started using Gerrit replication to automatically mirror all the commits to GitHub repositories, allowing our projects to have an extended reach to external users, free to fork our repositories, and to contribute changes as pull requests.

Finally, we have completed our Code Review in Gerrit and managed the merge to GitHub with a two-phase change submit + merge process to ensure that the target branches on both Gerrit and GitHub have been merged and aligned accordingly. Similarly to GitHub, this Gerrit setup can be leveraged for free on the web without having to manage a separate private instance, thanks to the free set target URL to http://gerrithub.io service available on the cloud.

Resources for Article :


Further resources on this subject:


1 COMMENT

LEAVE A REPLY

Please enter your comment!
Please enter your name here