
The following is an excerpt from the book Scala for Machine Learning, Second Edition, written by Patrick R. Nicolas. The book will help you leverage Scala and machine learning to study and construct systems that can learn from data.

This article shows how a mathematical formula can be converted into a machine learning model in Scala in three steps.

Stackable traits enable developers to follow a strict mathematical formalism while implementing a model in Scala. Scientists use a universally accepted template to solve mathematical problems:

  1. Declare the variables relevant to the problem.
  2. Define a model (equation, algorithm, formulas…) as the solution to the problem.
  3. Instantiate the variables and execute the model to solve the problem.

Consider kernel functions as an example: a model that consists of the composition of two mathematical functions, and its potential implementation in Scala.

Step 1 – variable declaration

The implementation consists of wrapping (scoping) the two functions in traits and declaring them as abstract values.

The mathematical formalism is as follows:

f: ℝⁿ → ℝⁿ, g: ℝⁿ → ℝ

The Scala implementation is represented here:

type V = Vector[Double]

trait F { val f: V => V }

trait G { val g: V => Double }

Step 2 – model definition

The model is defined as the composition of the two functions. The stack of traits G, F describes the type of compatible functions that can be composed using the self-referenced constraint self: G with F:

Formalism: h = g ∘ f, that is, h(v) = g(f(v))

The implementation is as follows:

class H { self: G with F =>
  def apply(v: V): Double = g(f(v))
}

Step 3 – instantiation

The model is executed once the variables f and g are instantiated.

The formalism, as read from the implementation that follows, is: f(x) = (e^{x_1}, …, e^{x_n}) and g(x) = x_1 + … + x_n

The implementation is as follows:

import scala.math.exp

val h = new H with G with F {
  val f: V => V = (v: V) => v.map(exp(_))
  val g: V => Double = (v: V) => v.sum
}
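The three steps can be put together in one self-contained program. The sketch below is only an illustration: the object name KernelSketch and the instance norm are hypothetical and do not come from the book. It reuses the model H unchanged and instantiates f and g differently, which suggests how the same model definition can serve several formulas.

```scala
import scala.math.sqrt

object KernelSketch {
  type V = Vector[Double]

  // Step 1: declare the variables as abstract values
  trait F { val f: V => V }
  trait G { val g: V => Double }

  // Step 2: define the model as the composition g(f(v))
  class H { self: G with F =>
    def apply(v: V): Double = g(f(v))
  }

  // Step 3 (hypothetical instantiation): square each component, then take
  // the square root of the sum, which yields the Euclidean norm
  val norm = new H with G with F {
    val f: V => V = (v: V) => v.map(x => x * x)
    val g: V => Double = (v: V) => sqrt(v.sum)
  }

  def main(args: Array[String]): Unit =
    println(norm(Vector(3.0, 4.0))) // sqrt(9 + 16) = 5.0
}
```

Because of the self-type self: G with F, an instantiation such as new H with F that omits G is rejected at compile time, so an incomplete model cannot be executed.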


Lazy value trigger

In the preceding example, the value h(v) = g(f(v)) can be declared as a lazy value. It is then computed automatically on first access, by which time g and f have been initialized.
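One possible reading of this remark is sketched below. The class HLazy, its member value, and the object LazySketch are assumptions made for illustration and are not taken from the book. The input vector is fixed at construction, and the result is held in a lazy val so that it is evaluated only on first access, after the instantiating block has assigned f and g.

```scala
import scala.math.exp

object LazySketch {
  type V = Vector[Double]

  trait F { val f: V => V }
  trait G { val g: V => Double }

  // Hypothetical variant of H that fixes the input v at construction time
  class HLazy(v: V) { self: G with F =>
    // Not evaluated in the constructor: computed on first access,
    // after the instantiating block has initialized f and g
    lazy val value: Double = g(f(v))
  }

  val h = new HLazy(Vector(0.0, 0.0)) with G with F {
    val f: V => V = (x: V) => x.map(exp(_))
    val g: V => Double = (x: V) => x.sum
  }

  def main(args: Array[String]): Unit =
    println(h.value) // exp(0) + exp(0) = 2.0
}
```

With a strict val in place of the lazy val, the expression g(f(v)) would run inside the HLazy constructor, before the block that defines f and g has executed, and would fail with a NullPointerException.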

Scala thus preserves the formalism of mathematical models, which makes it easier for scientists and developers to migrate existing projects written in scientifically oriented languages such as R.


Emulation of R

Many data scientists use R to create models and apply learning strategies. They may consider Scala as an alternative to R in some cases, because Scala preserves the mathematical formalism used in models implemented in R.

In this article, we explained how Scala preserves mathematical formalism. The idea can be extended to the dynamic creation of workflows using traits, for which design patterns such as the cake pattern can be used.

If you enjoyed this excerpt, be sure to check out the book Scala for Machine Learning, Second Edition to gain solid foundational knowledge in machine learning with Scala.


