
BlazingDB announces BlazingSQL, a GPU SQL Engine for NVIDIA’s open source RAPIDS


Yesterday, the BlazingDB team announced BlazingSQL, a new, free version of BlazingDB’s query execution engine built for RAPIDS, NVIDIA’s open-source software.

BlazingSQL queries datasets in enterprise data lakes and loads the results directly into GPU memory as a GPU DataFrame (GDF). The GDF project supports interoperability between GPU applications and defines a common GPU in-memory data layer.

To provide this data lake integration and enable SQL queries, critical open-source libraries were built inside RAPIDS and then layered on a series of modules from BlazingDB. GDF gives users PyGDF or Dask_GDF, each of which offers a simple interface similar to the Pandas DataFrame.

 


BlazingSQL also allows Python developers to execute SQL queries directly on flat files stored in distributed file systems. Moreover, it comes with cuML and cuDNN, GPU-accelerated machine learning and deep learning libraries that work with GDFs. The GPU DataFrame lets developers run complete machine learning workloads inside GPU memory. This reduces the cost of exchanging data between different tools, as well as the transfer overhead over the PCIe bus.
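To make the idea of running SQL directly against a flat file from Python more concrete, here is a minimal sketch. It does not use BlazingSQL or PyGDF, whose APIs are not shown in this post; instead it uses Python's built-in `csv` and `sqlite3` modules on the CPU as a stand-in. The file contents, the `sales` table, and the `query_flat_file` helper are all hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat file contents; in practice this would sit in a data lake.
FLAT_FILE = """region,amount
east,10.5
west,4.0
east,2.5
"""


def query_flat_file(text, sql):
    """Load CSV text into an in-memory table named `sales` and run `sql` on it."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        # Named placeholders are filled from each CSV row's column names.
        conn.executemany(
            "INSERT INTO sales (region, amount) VALUES (:region, :amount)", rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


totals = query_flat_file(
    FLAT_FILE,
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region",
)
print(totals)
```

The workflow has the same shape as the one described above: point at a flat file, issue SQL, and get a tabular result back in Python. With a GPU engine, the result would stay in GPU memory as a GDF rather than being copied back to the host.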

The BlazingDB team has shared a demo and a binary roadmap for the upcoming BlazingSQL releases. BlazingSQL 0.1 uses a PyBlazing connection to execute SQL queries on GDFs loaded through the PyGDF API. It is due for release within the next couple of weeks, before 25th October.

BlazingSQL 0.2 integrates BlazingDB’s FileSystem API, adding the ability to query flat files directly inside existing distributed file systems. It is due for release between 25th October and 8th November.

BlazingSQL 0.3 integrates the distributed scheduler, so that SQL queries are fanned out across multiple GPUs and servers. It is due for release between 8th November and 30th November. Finally, BlazingSQL 0.4 will integrate the distributed, multi-layered cache. No release date has been set for BlazingSQL 0.4, but it is expected in 2018.
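The fan-out planned for the distributed scheduler can be pictured as a scatter-gather aggregation: each worker runs the same partial query on its own slice of the data, and the partial results are merged afterwards. The sketch below is a CPU-only illustration of that general pattern using Python's standard library; it is not BlazingSQL's scheduler, and the partitions, the `sales` table, and the function names are assumptions made for the example.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions; each stands in for the data held by one GPU or server.
PARTITIONS = [
    [("east", 10.0), ("west", 4.0)],
    [("east", 2.0), ("north", 7.0)],
    [("west", 1.0)],
]

PARTIAL_SQL = "SELECT region, SUM(amount) FROM sales GROUP BY region"


def run_on_partition(rows):
    """Run the partial aggregate on one partition, as a single worker would."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        return conn.execute(PARTIAL_SQL).fetchall()
    finally:
        conn.close()


def fan_out(partitions):
    """Scatter the query to every partition, then merge the partial sums."""
    merged = {}
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        for partial in pool.map(run_on_partition, partitions):
            for region, subtotal in partial:
                merged[region] = merged.get(region, 0.0) + subtotal
    return dict(sorted(merged.items()))


result = fan_out(PARTITIONS)
print(result)
```

Sums merge cleanly because adding partial sums gives the global sum. Aggregates such as averages would need each worker to return both a sum and a count so the merge step can combine them correctly.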

For more information, check out the official BlazingDB blog post.


Natasha Mathur

Tech writer at the Packt Hub. Dreamer, book nerd, lover of scented candles, karaoke, and Gilmore Girls.

