GitHub introduces ‘Experiments’, a platform to share live demos of their research projects

Yesterday, GitHub introduced Experiments, a platform for sharing live demos of its research projects and the ideas behind them. With this platform, GitHub aims to give end users “insight into their research and inspire them to think audaciously about the future of software development”.

Why has GitHub introduced ‘Experiments’?

Just like Facebook and Google, GitHub regularly conducts research in machine learning, design, and infrastructure. The resulting products are rigorously evaluated for stability, performance, and security, and those that meet the success criteria are released to end users. Experiments will help GitHub share details about its research as it happens.

‘Semantic Code Search’: The first demo published on Experiments

The GitHub researchers also published their first demo of an experiment called Semantic Code Search. This system helps you search code on GitHub using natural language.

How does Semantic Code Search work?

[Diagram: how Semantic Code Search works. Source: GitHub]

Step 1: Learning representations of code

In this step, a sequence-to-sequence model is trained to summarize code using (code, docstring) pairs. The code is the input, and the docstring is the target the model tries to predict.
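To make the training data concrete, here is a minimal sketch of how (code, docstring) pairs might be collected from Python source using the standard `ast` module. The function name `extract_pairs` is hypothetical, and this is not GitHub's actual data pipeline; it assumes Python 3.9 or later for `ast.unparse`.

```python
import ast
import copy


def extract_pairs(source):
    """Return a list of (code_without_docstring, docstring) tuples.

    Hypothetical helper: only functions that have a docstring are kept,
    and the docstring is removed from the code so the model cannot
    simply copy its target from the input.
    """
    pairs = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        docstring = ast.get_docstring(node)
        if not docstring:
            continue
        stripped = copy.deepcopy(node)
        # The docstring is always the first statement of the body.
        # If nothing else remains, keep the function valid with `pass`.
        stripped.body = stripped.body[1:] or [ast.Pass()]
        pairs.append((ast.unparse(stripped), docstring))
    return pairs
```

Each resulting pair would serve as one training example: the code is the input sequence and the docstring is the output sequence.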

Step 2: Learning representations of text phrases

Along with learning representations of code, the researchers wanted to find a suitable representation for short phrases. To achieve this, they trained a neural language model by leveraging the fast.ai library. Using the concat pooling approach, the representations of phrases were extracted from the trained model by summarizing the hidden states.
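As an illustration of concat pooling, the sketch below turns a variable-length sequence of hidden states into one fixed-size vector by concatenating the last hidden state with the element-wise maximum and mean over all time steps. It uses plain Python lists rather than the fast.ai library, and both the function name `concat_pool` and the order of the three parts are assumptions made for this example.

```python
def concat_pool(hidden_states):
    """Summarize a sequence of hidden states as one vector.

    hidden_states: a non-empty list of equal-length lists of floats,
    one per time step. The result has three times the hidden size:
    [last state, max over time, mean over time].
    """
    steps = len(hidden_states)
    last = list(hidden_states[-1])
    dims = range(len(last))
    max_pool = [max(h[d] for h in hidden_states) for d in dims]
    mean_pool = [sum(h[d] for h in hidden_states) / steps for d in dims]
    return last + max_pool + mean_pool
```

Because the output size depends only on the hidden size and not on the phrase length, phrases of different lengths can be compared in the same vector space.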

Step 3: Mapping code representations to the same vector-space as text

In this step, the code representations learned in step 1 were mapped into the same vector space as the text representations. To accomplish this, the researchers fine-tuned the code encoder.
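The idea of mapping one representation onto another can be shown with a deliberately simplified stand-in. Instead of fine-tuning a neural encoder, the sketch below learns a plain linear projection that moves toy code vectors toward their paired text vectors by minimizing squared error with stochastic gradient descent. The names `train_projection` and `project`, the learning rate, and the epoch count are all assumptions for illustration, not the researchers' method.

```python
def project(weights, vec):
    """Apply the linear map: one output value per row of weights."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def train_projection(code_vecs, text_vecs, lr=0.1, epochs=500):
    """Learn weights so that project(weights, code) approximates text.

    code_vecs and text_vecs are paired lists of float vectors.
    """
    in_dim, out_dim = len(code_vecs[0]), len(text_vecs[0])
    weights = [[0.0] * in_dim for _ in range(out_dim)]
    for _ in range(epochs):
        for x, y in zip(code_vecs, text_vecs):
            pred = project(weights, x)
            for o in range(out_dim):
                err = pred[o] - y[o]
                # Gradient step for the squared error of output o.
                for i in range(in_dim):
                    weights[o][i] -= lr * err * x[i]
    return weights
```

After training, a code vector passed through `project` lands near the vector of its description, which is what makes a text query comparable to code.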

Step 4: Creating a semantic search system

The last step is to bring everything together to create a semantic search mechanism. The vectorized version of all code is stored in a database, and nearest neighbor lookups are performed for a vectorized search query.
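A nearest neighbor lookup of this kind can be sketched in a few lines. The example below assumes cosine similarity as the distance measure and a small in-memory dictionary in place of a database; neither detail comes from GitHub's announcement, and the names `cosine_similarity` and `search` are hypothetical. A production system would typically use an approximate nearest neighbor index rather than scanning every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, k=2):
    """Return the k snippet names whose vectors are closest to query_vec.

    index maps a snippet name to its stored vector.
    """
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]
```

Here the query vector would come from the text model in step 2, and the stored vectors from the fine-tuned code encoder in step 3.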

You can read the official announcement at GitHub’s blog. To read in more detail about Semantic Code Search, check out the researchers’ post and also try it on Experiments.


Bhagyashree R
