Elasticsearch 6.5 is here with cross-cluster replication and JDK 11 support

Earlier this week, the Elastic team released version 6.5.0 of Elasticsearch, its open source, distributed, RESTful search and analytics engine. Elasticsearch 6.5.0 introduces cross-cluster replication, source-only snapshots, SQL/ODBC changes, and new security features, among others.

Elasticsearch is built on the Lucene library and provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.

Let’s now discuss these features in Elasticsearch 6.5.0.

Cross-cluster replication

Elasticsearch 6.5.0 introduces cross-cluster replication, a Platinum-level feature. It lets you create an index in a local cluster that follows an index in a remote cluster, or automatically follow indices in a remote cluster whose names match a pattern.
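
To illustrate the two modes described above, the Python sketch below builds the REST requests without sending them. The paths and body fields (`_ccr/follow`, `_ccr/auto_follow`, `remote_cluster`, `leader_index`, `leader_index_patterns`, `follow_index_pattern`) reflect the cross-cluster replication API as best understood here and should be checked against the documentation for your version. The cluster, index, and pattern names are made up.

```python
def follow_request(follower_index, remote_cluster, leader_index):
    """Build (method, path, body) for following a single remote index."""
    path = f"/{follower_index}/_ccr/follow"
    body = {
        "remote_cluster": remote_cluster,  # alias of the remote cluster (assumed to be configured already)
        "leader_index": leader_index,      # index on the remote cluster to follow
    }
    return "PUT", path, body


def auto_follow_request(pattern_name, remote_cluster, leader_patterns, follow_pattern):
    """Build (method, path, body) for auto-following indices that match a pattern."""
    path = f"/_ccr/auto_follow/{pattern_name}"
    body = {
        "remote_cluster": remote_cluster,
        "leader_index_patterns": list(leader_patterns),
        "follow_index_pattern": follow_pattern,
    }
    return "PUT", path, body


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(follow_request("logs-copy", "datacenter-a", "logs"))
    print(auto_follow_request("metrics", "datacenter-a", ["metrics-*"], "{{leader_index}}-copy"))
```

Keeping the request construction separate from the HTTP call makes it easy to inspect the payload before pointing it at a real cluster.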

New source-only snapshots

Elasticsearch 6.5 adds source-only snapshots, which store a minimal amount of information (namely, the _source and index metadata) so that indices can be rebuilt through a reindex operation when necessary. This can reduce the disk space used by snapshots by up to 50%. The trade-off is that a full restore takes longer, as you’ll need to reindex to make the data searchable.
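
As a minimal sketch of that workflow, the Python below builds two request bodies: one that registers a source-only snapshot repository, and one for the reindex step needed after a restore. It assumes the repository type is `source` and that it wraps another repository type through a `delegate_type` setting. The location and index names are hypothetical, and nothing is sent to a cluster.

```python
def source_only_repository(location, delegate_type="fs"):
    """Body for registering a source-only repository (assumed settings layout)."""
    return {
        "type": "source",
        "settings": {
            "delegate_type": delegate_type,  # underlying repository type that holds the data
            "location": location,            # hypothetical backup location
        },
    }


def reindex_body(restored_index, searchable_index):
    """Body for the _reindex call that rebuilds a searchable index after a restore."""
    return {
        "source": {"index": restored_index},
        "dest": {"index": searchable_index},
    }


if __name__ == "__main__":
    print(source_only_repository("my_backup_location"))
    print(reindex_body("restored-logs", "logs-searchable"))
```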

SQL/ODBC changes

An initial (alpha status) ODBC driver has been added in Elasticsearch 6.5.0. Since many BI tools support ODBC, this makes it easier to connect Elasticsearch to your favourite third-party tools while keeping the speed, flexibility, and power of full-text search and relevance.

A few new functions and capabilities have also been added to Elasticsearch’s SQL support. These include ROUND, TRUNCATE, IN, MONTHNAME, DAYNAME, QUARTER, and CONVERT, as well as a number of string manipulation functions such as CONCAT, LEFT, RIGHT, REPEAT, POSITION, LOCATE, REPLACE, SUBSTRING, and INSERT. You can now also query across indices with different mappings, provided that the mapping types are compatible.
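
To show how a few of these functions might be combined, the Python sketch below builds a request for the SQL REST endpoint. The `/_xpack/sql` path is how the endpoint is exposed in the 6.x series as far as we know. The `employees` index and its columns are hypothetical, and the request is built but not sent.

```python
import json

# Hypothetical index and column names; CONCAT, ROUND, MONTHNAME and IN are
# among the additions listed above.
QUERY = (
    "SELECT CONCAT(first_name, last_name) AS name, "
    "ROUND(salary, 0) AS salary, "
    "MONTHNAME(hire_date) AS hired "
    "FROM employees WHERE languages IN (1, 2)"
)


def sql_request(query, fmt="txt"):
    """Build (method, path, body) for a SQL query; the body is a JSON string."""
    return "POST", f"/_xpack/sql?format={fmt}", json.dumps({"query": query})


if __name__ == "__main__":
    method, path, body = sql_request(QUERY)
    print(method, path)
    print(body)
```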

New scriptable token filters

Elasticsearch 6.5 introduces two scriptable token filters: predicate and conditional. The predicate token filter removes tokens that don’t match a script. The conditional token filter builds on the same idea, but instead applies other token filters to the tokens that match a script. Both let you manipulate the data you’re indexing without having to write a Java plugin.

Elasticsearch 6.5 also comes with a new text field type called annotated_text. It lets you use a markdown-like syntax to link pieces of text to different entities, which is useful in applications that rely on natural language processing.
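
The sketch below shows what index analysis settings using both filters could look like, written as a Python dictionary. The filter type names (`predicate_token_filter` and `condition`) and the `token.getTerm()` script call follow the analysis documentation as best recalled here, so treat them as assumptions to verify. The filter and analyzer names and the length thresholds are invented for illustration.

```python
def analysis_settings():
    """Return example index settings with one predicate and one conditional filter."""
    return {
        "analysis": {
            "filter": {
                # Predicate: keep only tokens longer than five characters.
                "long_tokens_only": {
                    "type": "predicate_token_filter",
                    "script": {"source": "token.getTerm().length() > 5"},
                },
                # Conditional: apply "lowercase" only to tokens shorter than five characters.
                "lowercase_short_tokens": {
                    "type": "condition",
                    "filter": ["lowercase"],
                    "script": {"source": "token.getTerm().length() < 5"},
                },
            },
            "analyzer": {
                "keep_long": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["long_tokens_only"],
                },
                "lowercase_short": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase_short_tokens"],
                },
            },
        }
    }


if __name__ == "__main__":
    import json
    print(json.dumps(analysis_settings(), indent=2))
```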
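
To give a feel for the markdown-like syntax, the Python sketch below parses text of the form `[visible text](annotation)` into plain text plus a list of annotations with character offsets. It assumes that multiple annotations are separated by `&` and that values are URL-encoded, which is our reading of the documentation. It is a toy parser written for illustration and is not how Elasticsearch implements the field type.

```python
import re
from urllib.parse import unquote_plus

# Matches [visible text](annotation values); nested brackets are not handled.
ANNOTATION = re.compile(r"\[([^\[\]]*)\]\(([^()]*)\)")


def parse_annotated(text):
    """Return (plain_text, annotations); each annotation is (start, end, value)."""
    pieces = []
    annotations = []
    source_pos = 0   # how far we have consumed the annotated input
    plain_len = 0    # length of the plain text produced so far
    for match in ANNOTATION.finditer(text):
        before = text[source_pos:match.start()]
        pieces.append(before)
        plain_len += len(before)

        label = match.group(1)
        start = plain_len
        pieces.append(label)
        plain_len += len(label)

        # Assumed convention: "&" separates values, values are URL-encoded.
        for value in match.group(2).split("&"):
            annotations.append((start, plain_len, unquote_plus(value)))
        source_pos = match.end()
    pieces.append(text[source_pos:])
    return "".join(pieces), annotations


if __name__ == "__main__":
    print(parse_annotated("Investors in [Apple](Apple+Inc.) rejoiced."))
```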

JDK 11 and G1GC

Elasticsearch 6.5 adds support for JDK 11. It also supports the G1 garbage collector on JDK 10 and later.

Security and audit logging

Elasticsearch 6.5 comes with two new security features: authorization realms and a new audit logging format. Authorization realms enable an authenticating realm to delegate the task of pulling user information (the username, the user’s roles, and so on) to one or more other realms.

Audit logging now uses a fully structured format in which all attributes are named: each log entry is a one-line JSON document, printed on its own line. The attributes are still ordered as they would be in a regular log entry.
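
Because every entry is a single JSON document on its own line, the log can be processed with ordinary JSON tooling. The Python sketch below parses such lines and counts entries by one attribute. The sample entries and attribute names (`event.action`, `user.name`) are illustrative assumptions, not copied from a real audit log.

```python
import json

# Illustrative sample only; real audit entries carry more attributes.
SAMPLE_LINES = [
    '{"@timestamp": "2018-11-14T10:00:00", "event.action": "authentication_failed", "user.name": "alice"}',
    '{"@timestamp": "2018-11-14T10:00:05", "event.action": "access_granted", "user.name": "bob"}',
    '{"@timestamp": "2018-11-14T10:00:09", "event.action": "authentication_failed", "user.name": "carol"}',
]


def parse_audit_lines(lines):
    """Parse one JSON document per non-empty line."""
    entries = []
    for line in lines:
        line = line.strip()
        if line:
            entries.append(json.loads(line))
    return entries


def count_by(entries, attribute):
    """Count entries per value of the given attribute (None if it is missing)."""
    counts = {}
    for entry in entries:
        key = entry.get(attribute)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_by(parse_audit_lines(SAMPLE_LINES), "event.action"))
```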

Multi-bucket analysis

A multi-metric machine learning job analyzes multiple time series together. Elasticsearch 6.5 introduces multi-bucket analysis for machine learning jobs. Here, features from multiple contiguous buckets are used for anomaly detection. The final anomaly score includes a combination of values from both the “standard” single-bucket analysis and the new multi-bucket analysis.

Additionally, Elasticsearch 6.5 comes with an experimental find file structure API, which aims to help discover the structure of a text file. It attempts to read the file and, if it succeeds, returns statistics about the common values of the detected fields, along with mappings that can be used for ingesting the file into Elasticsearch.

For more information, check out the official Elasticsearch 6.5 blog.
