.NET for Apache Spark Preview is out now!

Yesterday, at the Spark + AI Summit, Microsoft announced the preview of .NET for Apache Spark, which brings .NET support to Apache Spark. Apache Spark is a popular open source distributed processing engine used for analytics over large data sets. It can also be used for processing real-time streams and batches of data, as well as for machine learning and ad-hoc queries.

.NET for Apache Spark for developers

.NET for Apache Spark aims to make Apache Spark accessible to .NET developers across all Spark APIs. The team plans to develop it in the open, as a .NET Foundation member project, together with the Spark and .NET communities.

.NET for Apache Spark comes with high-performance APIs for using Spark from C# and F#. With these .NET APIs, users can access all aspects of Apache Spark, including streaming, Spark SQL, DataFrames, MLlib, and more. Developers can reuse the skills, code, knowledge, and libraries they already have. The C# and F# language bindings to Spark will be written on a new Spark interop layer that offers easier extensibility. .NET for Apache Spark is compliant with .NET Standard 2.0 and can be used on Linux, macOS, and Windows.

.NET for Apache Spark performance

The first preview version of .NET for Apache Spark performs well on the popular TPC-H benchmark, which consists of a suite of business-oriented queries. The team reports that its performance on this benchmark compares favorably with Python and Scala, and that it is 2 times faster than Python.

What other features can be expected?

In the future, the team aims to simplify the documentation and samples and to work towards native integration with developer tools such as Visual Studio, Visual Studio Code, and Jupyter notebooks. Developers can also expect .NET support for user-defined aggregate functions and .NET idiomatic APIs for C# and F# (e.g., using LINQ for writing queries). The team is also working on adding support for Azure Databricks, Kubernetes, and more, and on making .NET for Apache Spark part of Spark Core.

A few users are excited about this news and expect major improvements from .NET for Apache Spark. One user commented on Hacker News, “I’ve seen the announcement about .NET interior support in Apache Spark some time ago. The benchmarks are interesting and tell the story – in few cases it is faster than Python, but slower than native (for Spark) Scala/JVM. Maybe with Arrow interchange Python’s performance would increase (and for other interpose that would use Array – i.e. for .Net).”

A few others are concerned about the transition, as they would have to move their teams to the new setup. Another user commented, “Indeed the real sad part is you can’t lead teams there early (premature optimization). Everybody seems to make the same rough transition on their own.”

To learn more, check out the official announcement post from Microsoft.

Read Next

Winners for the 2019 .NET Foundation Board of Directors elections are finally declared

Fedora 31 will now come with Mono 5 to offer open-source .NET support

ML.NET 1.0 RC releases with support for TensorFlow models and much more!

Amrata Joshi

