The BlazingDB team yesterday announced BlazingSQL, a new, free version of BlazingDB’s query execution engine for RAPIDS, the open-source software by NVIDIA.
BlazingSQL queries datasets in enterprise data lakes and loads the results directly into GPU memory as a GPU DataFrame (GDF). The GPU DataFrame is a project that supports interoperability between GPU applications and defines a common GPU in-memory data layer.
To provide this data lake integration and to enable SQL queries, critical open-source libraries were built inside RAPIDS and then layered on a series of modules from BlazingDB. GDF gives users PyGDF and Dask_GDF, which offer a simple interface similar to the Pandas DataFrame.
BlazingSQL also allows Python developers to execute SQL queries directly on flat files stored in distributed file systems. Moreover, it comes with cuML and cuDNN, GPU-accelerated machine learning and deep learning libraries that use GDFs. The GPU DataFrame lets developers run complete machine learning workloads inside GPU memory. This reduces the cost of exchanging data between different tools, as well as the transfer overhead over the PCIe bus.
The BlazingDB team has given a demo and shared a binary roadmap for the upcoming BlazingSQL releases. BlazingSQL 0.1 uses a PyBlazing connection to execute SQL queries on GDFs loaded by the PyGDF API. It is expected to be released within the next couple of weeks, before 25th October.
BlazingSQL 0.2 integrates BlazingDB’s FileSystem API, which adds the ability to query flat files directly inside existing distributed file systems. It is expected to be released sometime between 25th October and 8th November.
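The idea of running SQL directly against a flat file can be sketched with the Python standard library alone. This is a CPU-only stand-in and not the BlazingSQL or PyBlazing API: SQLite plays the role of the query engine, and the helper `query_flat_file`, the table name `data`, and the sample file `trips.csv` are all hypothetical names chosen for illustration.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def query_flat_file(csv_path, sql, table="data"):
    """Load a CSV file into an in-memory SQLite table and run one SQL query on it.

    Hypothetical helper: SQLite is only a CPU stand-in for a GPU query engine.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)   # first row holds the column names
        rows = list(reader)     # remaining rows are the data

    conn = sqlite3.connect(":memory:")
    try:
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE "{table}" ({columns})')
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


def demo():
    """Write a small sample file, then aggregate it with plain SQL."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "trips.csv"
        path.write_text("city,fare\nLima,10.5\nLima,4.5\nQuito,7.0\n")
        return query_flat_file(
            path,
            "SELECT city, SUM(CAST(fare AS REAL)) FROM data "
            "GROUP BY city ORDER BY city",
        )
```

Calling `demo()` returns one row per city with the summed fare. In the workflow the article describes, the file would live in a distributed file system and the result would arrive as a GPU DataFrame rather than a list of tuples.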
BlazingSQL 0.3 integrates the distributed scheduler, so that SQL queries are fanned out across multiple GPUs and servers. It is expected to be released between 8th November and 30th November. Finally, BlazingSQL 0.4 will integrate the distributed, multi-layered cache. No release date has been assigned for BlazingSQL 0.4, but it is expected to be released in 2018.
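The fan-out planned for the distributed scheduler follows a general scatter-gather pattern, which can be sketched conceptually as follows. This is not BlazingDB's scheduler: threads stand in for GPUs or servers, each partition is a plain list of `(key, value)` pairs, and the function names `partial_sum` and `fan_out_group_sum` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_sum(partition):
    """Worker step: aggregate one partition locally (stands in for one GPU or server)."""
    totals = Counter()
    for key, value in partition:
        totals[key] += value
    return totals


def fan_out_group_sum(partitions, max_workers=4):
    """Scheduler step: send each partition to a worker, then merge the partial results."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for partial in pool.map(partial_sum, partitions):
            merged.update(partial)  # Counter.update adds values key by key
    return dict(merged)
```

For example, `fan_out_group_sum([[("a", 1), ("b", 2)], [("a", 3)]])` computes the equivalent of `SELECT key, SUM(value) ... GROUP BY key` over both partitions. The pattern works because a sum can be computed per partition and merged afterwards; aggregates such as averages need extra bookkeeping (for instance, carrying sums and counts separately).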
For more information, check out the official BlazingDB blog post.