12th Feb 2018 – Data Science News Daily Roundup

DeepMind IMPALA, Dynamometer open-sourced, VoltDB v8.0, and more in today’s top stories around machine learning, deep learning, and data science.

1. DeepMind introduces IMPALA - a new and efficient distributed architecture capable of solving many tasks at the same time

DeepMind has developed a new distributed agent named IMPALA (Importance-Weighted Actor-Learner Architectures) that maximises data throughput using an efficient distributed architecture with TensorFlow.

IMPALA was developed to tackle the challenging DMLab-30 suite. DMLab-30 is a set of environments built with DeepMind Lab, DeepMind’s open-source RL environment. These environments enable any deep RL researcher to test systems on a wide range of tasks, either individually or in a multi-task setting.

IMPALA is inspired by the popular A3C architecture, which uses multiple distributed actors to learn the agent’s parameters. When tested on the DMLab-30 levels, IMPALA was 10 times more data efficient and achieved double the final score compared to distributed A3C. Moreover, IMPALA showed positive transfer when trained in a multi-task setting compared to training in a single-task setting.

To learn more about IMPALA, you can read the research paper.

2. LinkedIn open-sources Dynamometer, a new tool for testing big-data performance

LinkedIn has open-sourced Dynamometer, a tool for stress-testing large Hadoop big-data deployments without using massive amounts of infrastructure.

Using Dynamometer, IT teams can test production workloads and ensure they’ll be able to cope with any changes to their Hadoop clusters. It is designed for those running large-scale Hadoop deployments, as well as those who propose changes to the core Hadoop project and want to ensure new features don’t hurt performance.

Visit the GitHub repo for detailed information on LinkedIn’s Dynamometer.

3. VoltDB Introduces VoltDB v8.0, a Translytical Database for Powering Real-Time Decisions

VoltDB, provider of an enterprise-class translytical database for business-critical applications, announced the latest version (v8.0) of its flagship solution. According to Forrester analyst Mike Gualtieri, a translytical database is a “single unified database that supports transaction and analytics in real time without sacrificing transactional integrity, performance, and scale.”

The new version delivers more predictable, long-tail latency responses based on real-time data and historical intelligence, improving real-time processing and offering self-service analysis.

What’s new in VoltDB v8.0?

Improved Network Security
User-Defined Functions
Common Table Expressions
Kafka Enhancements
Python V3 API

For detailed information on v8.0, read the release notes.

4. Amazon adds encryption at rest to DynamoDB database service

Amazon Web Services Inc. has added a new encryption feature to its DynamoDB database service to help users better secure their data. DynamoDB, Amazon’s NoSQL database service, is designed to store and retrieve unstructured data, and is typically used for big-data workloads and analysis.

With the new update, users can choose to encrypt data stored “at rest,” that is, when the data is not being used. The option is not switched on by default, so users will have to enable it manually when creating a new database table.
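
As a rough illustration of what opting in at table-creation time might look like, here is a minimal Python sketch using boto3’s `create_table` call with the `SSESpecification` parameter. The table name, key name, and capacity values are hypothetical placeholders rather than anything from the AWS announcement, and the call that actually creates the table assumes boto3 is installed and AWS credentials are configured.

```python
from typing import Any, Dict


def build_encrypted_table_request(table_name: str, key_name: str) -> Dict[str, Any]:
    """Build CreateTable parameters with encryption at rest switched on.

    The names and capacity values passed in or set here are illustrative only.
    """
    return {
        "TableName": table_name,
        # A single string-typed partition key, kept as simple as possible.
        "AttributeDefinitions": [
            {"AttributeName": key_name, "AttributeType": "S"}
        ],
        "KeySchema": [{"AttributeName": key_name, "KeyType": "HASH"}],
        # Placeholder throughput values (assumption, not a recommendation).
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 1,
            "WriteCapacityUnits": 1,
        },
        # Encryption at rest is off unless it is requested explicitly.
        "SSESpecification": {"Enabled": True},
    }


def create_encrypted_table(table_name: str, key_name: str) -> Dict[str, Any]:
    """Send the request to DynamoDB; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("dynamodb")
    return client.create_table(
        **build_encrypted_table_request(table_name, key_name)
    )


if __name__ == "__main__":
    # Only prints the request; nothing is sent to AWS here.
    print(build_encrypted_table_request("example-table", "id"))
```

Keeping the request builder separate from the network call makes it easy to inspect or log the parameters before any table is actually created.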

Visit AWS’s official post for a detailed read on this topic.

5. Apache Flink® Master Branch Monthly: New in Flink in January 2018

The Apache Flink team highlighted a selection of features merged into Flink’s master branch during the past month in its “Flink Master Monthly” blog post.

A summary of the merged features: