Announcing Databricks Runtime 4.2!

July 25, 2018 - 1:30 am

Databricks has announced Databricks Runtime 4.2, which brings numerous updates and new components for Spark internals and Databricks Delta, along with improvements over the previous version.

Databricks Runtime 4.2 is powered by Apache Spark 2.3, and Databricks recommends adopting it quickly to be ready for the upcoming GA release of Databricks Delta.

Databricks Runtime is a set of software artifacts that run on clusters of machines and improve the usability and performance of big data analytics.

New Features of Databricks Runtime 4.2

Added multi-cluster write support, so users can apply the transactional write features of Databricks Delta from more than one cluster.
Streams can now be written directly to a table registered in the Hive metastore on Databricks Delta using df.writeStream.table(…).
Added a new streaming foreachBatch() for Scala, which lets you define a function that processes the output of every micro-batch using DataFrame operations.
Added streaming foreach() support for Python, which was previously available only in Scala.
Added from_avro/to_avro functions to read and write Avro data within a DataFrame.
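
To make the streaming foreach() item more concrete, here is a minimal Python sketch of a row writer with the open/process/close shape that open-source PySpark accepts in DataStreamWriter.foreach(). It assumes the Python API in Databricks Runtime 4.2 follows that same shape. The class name CollectingWriter, the in-memory list sink, and the simulate_partition helper are hypothetical and exist only so the lifecycle can be tried without a cluster.

```python
class CollectingWriter:
    """Hypothetical writer object with open/process/close methods."""

    def __init__(self, sink):
        # Illustrative in-memory sink (a plain list). On a real cluster the
        # writer is serialized to executors, so an external store is needed.
        self.sink = sink
        self.buffer = []

    def open(self, partition_id, epoch_id):
        # Called once per partition and epoch; True means "process the rows".
        self.buffer = []
        return True

    def process(self, row):
        # Called once per row. Spark Row objects expose asDict(); plain
        # dicts are accepted too so the sketch can run locally.
        record = row.asDict() if hasattr(row, "asDict") else dict(row)
        self.buffer.append(record)

    def close(self, error):
        # Called after the rows are processed; flush only on success.
        if error is None:
            self.sink.extend(self.buffer)
        self.buffer = []


def simulate_partition(writer, rows, partition_id=0, epoch_id=0):
    """Local stand-in for the calls made for one partition of one epoch."""
    error = None
    if writer.open(partition_id, epoch_id):
        try:
            for row in rows:
                writer.process(row)
        except Exception as exc:  # passed on to close(), as a runtime would
            error = exc
    writer.close(error)
    return error


collected = []
simulate_partition(CollectingWriter(collected), [{"id": 1}, {"id": 2}])
# With a streaming DataFrame df, the same object would be passed as
# df.writeStream.foreach(CollectingWriter(...)).start()
```

Buffering rows and flushing them in close() means that a partition which fails part-way through leaves nothing behind in the sink.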

Improvements

All Databricks Delta commands and queries now support referring to a table by its path as an identifier (that is, delta.`/path/to/table`).
DESCRIBE HISTORY now includes the commit ID and is ordered newest to oldest by default.
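
As a rough illustration of the path-based identifier, the Python sketch below builds SQL text that refers to a table as delta.`/path/to/table`. The helper names delta_path_identifier, describe_history_sql and select_all_sql are hypothetical, and the path is the placeholder used above. In a notebook the resulting string would typically be passed to spark.sql(...).

```python
def delta_path_identifier(path):
    """Return the path-based identifier form: delta.`<path>`."""
    if "`" in path:
        # Escaping backticks is out of scope for this sketch.
        raise ValueError("paths containing backticks are not handled here")
    return "delta.`{}`".format(path)


def describe_history_sql(path):
    """Build a DESCRIBE HISTORY statement for a table addressed by path."""
    return "DESCRIBE HISTORY " + delta_path_identifier(path)


def select_all_sql(path):
    """Build a simple query against a table addressed by path."""
    return "SELECT * FROM " + delta_path_identifier(path)


history_query = describe_history_sql("/path/to/table")
# history_query == "DESCRIBE HISTORY delta.`/path/to/table`"
```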

Bug Fixes

Partition-based filtering predicates now operate correctly even when the case of the predicates differs from that of the table.
Fixed a missing-column AnalysisException that occurred when performing equality checks on boolean columns in Databricks Delta tables (for example, booleanValue = true).
CREATE TABLE no longer modifies the transaction log when it is used to create a pointer to an existing table. This prevents unnecessary conflicts with concurrent streams and allows the creation of a metastore pointer to tables where the user only has read access to the data.
Calling display() on a stream with large amounts of data no longer causes an out-of-memory error in the driver.
Long lineages are now truncated, which fixes the StackOverflowError they previously caused while updating the state of a Databricks Delta table.
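
The CREATE TABLE fix concerns statements that register a metastore name for data that already exists at some location. The Python sketch below builds such a statement using the CREATE TABLE ... USING DELTA LOCATION form. The helper name create_pointer_sql, the table name events and the placeholder path are assumptions made for illustration only.

```python
def create_pointer_sql(table_name, location):
    """Build a statement that points a metastore table at existing data."""
    # Keep the sketch conservative: plain identifiers and unquoted paths only.
    if not table_name.replace("_", "").isalnum():
        raise ValueError("table name must be alphanumeric or underscores")
    if "'" in location:
        raise ValueError("locations containing quotes are not handled here")
    return "CREATE TABLE {} USING DELTA LOCATION '{}'".format(
        table_name, location
    )


pointer_stmt = create_pointer_sql("events", "/path/to/table")
# pointer_stmt == "CREATE TABLE events USING DELTA LOCATION '/path/to/table'"
```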

For more details, please read the official release notes published by Databricks.
