Getting to know TensorFlow

[box type="note" align="" class="" width=""]

The following book excerpt is from the title Machine Learning Algorithms by Giuseppe Bonaccorso. The book describes important machine learning algorithms commonly used in the field of data science. These algorithms can be used for supervised, unsupervised, semi-supervised, and reinforcement learning. A few of the well-known topics covered in the book are linear regression, logistic regression, SVM, Naive Bayes, K-Means, Random Forest, TensorFlow, and feature engineering.

[/box]

In this article, we look at one of the most important deep learning libraries, TensorFlow, with contextual examples.

Brief Introduction to TensorFlow

TensorFlow is a computational framework created by Google that has become one of the most widely used deep-learning toolkits. It can work with both CPUs and GPUs and already implements most of the operations and structures required to build and train a complex model. TensorFlow can be installed as a Python package on Linux, Mac, and Windows (with or without GPU support); however, we suggest you follow the instructions provided on the website to avoid common mistakes.

The main concept behind TensorFlow is the computational graph: a set of successive operations that transform an input batch into the desired output. The following figure shows a schematic representation of a graph:

getting-to-know-tensorflow-img-0

Starting from the bottom, we have two input nodes (a and b), a transpose operation (that works on b), a matrix multiplication and a mean reduction. The init block is a separate operation, which is formally part of the graph, but it's not directly connected to any other node; therefore it's autonomous (indeed, it's a global initializer).
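
For readers who find it easier to think in terms of arrays, the computation described by this graph can be sketched eagerly in NumPy. This is only an illustration: the shapes and values of a and b, and the order of the operands in the product, are assumptions, since the figure doesn't specify them.

```python
import numpy as np

# Hypothetical inputs: shapes and values are illustrative assumptions
a = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
b = np.array([[5.0, 6.0], [7.0, 8.0]], dtype=np.float32)

b_t = b.T                 # transpose operation (works on b)
product = a @ b_t         # matrix multiplication
result = product.mean()   # mean reduction over all elements

print(result)  # 33.0
```

Unlike NumPy, TensorFlow doesn't compute anything when the operations are declared; the graph is evaluated later, when a session is asked for an output.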

Because this is only a brief introduction, it's useful to list the most important elements needed to work with TensorFlow, so that we can build a few simple examples that show the potential of this framework:

Graph: This represents the computational structure that connects a generic input batch with the output tensors through a directed network made of operations. It's defined as a tf.Graph() instance and is normally used with a Python context manager.
Placeholder: This is a reference to an external variable, which must be explicitly supplied when it's requested for the output of an operation that uses it directly or indirectly. For example, a placeholder can represent a variable x, which is first transformed into its squared value and then summed to a constant value. The output is then x^2 + c, which is materialized by passing a concrete value for x. It's defined as a tf.placeholder() instance.
Variable: An internal variable used to store values which are updated by the algorithm. For example, a variable can be a vector containing the weights of a logistic regression. It's normally initialized before a training process and automatically modified by the built-in optimizers. It's defined as a tf.Variable() instance. A variable can also be used to store elements which must not be considered during training processes; in this case, it must be declared with the parameter trainable=False.
Constant: A constant value defined as a tf.constant() instance.
Operation: A mathematical operation that can work with placeholders, variables, and constants. For example, the multiplication of two matrices is an operation defined as tf.matmul(). Among all operations, gradient calculation is one of the most important. TensorFlow can compute the gradients of a given node in the computational graph with respect to the inputs or to any other node that logically precedes it. We're going to see an example of this operation.
Session: This is a sort of wrapper-interface between TensorFlow and our working environment (for example, Python or C++). When the evaluation of a graph is needed, this macro-operation will be managed by a session, which must be fed with all placeholder values and will produce the required outputs using the requested devices. For our purposes, it's not necessary to go deeper into this concept; however, I invite the reader to retrieve further information from the website or from one of the resources listed at the end of this chapter. It's declared as an instance of tf.Session() or, as we're going to do, an instance of tf.InteractiveSession(). This type of session is particularly useful when working with notebooks or shell commands, because it places itself automatically as the default one.
Device: A physical computational device, such as a CPU or a GPU. It's selected explicitly through tf.device(), which is used as a context manager. When the architecture contains several computational devices, it's possible to split the jobs so as to parallelize many operations. If no device is specified, TensorFlow will use the default one (which is the main CPU or a suitable GPU if all the necessary components are installed).
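
To make these elements concrete, here is a minimal sketch that combines a graph, a device, a placeholder, a constant, a non-trainable variable, and a session to evaluate the x^2 + c example mentioned above. It assumes the TensorFlow 1.x API described in this article; importing it as tensorflow.compat.v1 is our assumption, made so that the same code can also run on TensorFlow 2.x installations. The names x, c, counter, and y, the device string, and the values 2.0 and 3.0 are purely illustrative.

```python
# Assumption: TF 1.x-style API accessed through the compat module
import tensorflow.compat.v1 as tf

graph = tf.Graph()

with graph.as_default():
    # Pin the operations to the first CPU (illustrative choice)
    with tf.device('/cpu:0'):
        x = tf.placeholder(tf.float32, shape=(), name='x')   # external input
        c = tf.constant(2.0, name='c')                       # fixed value
        # A variable excluded from training (hypothetical bookkeeping value)
        counter = tf.Variable(0.0, trainable=False, name='counter')
        y = tf.add(tf.square(x), c, name='y')                # y = x^2 + c
        init = tf.global_variables_initializer()

with tf.Session(graph=graph) as session:
    session.run(init)  # variables must be initialized before being read
    y_value, counter_value = session.run([y, counter], feed_dict={x: 3.0})

print(y_value)  # 11.0
```

Nothing is computed until session.run() is called; the feed dictionary supplies the concrete value for the placeholder at that moment.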

Let's now look at a simple example that computes gradients:

Computing gradients

The option to compute the gradients of all output tensors with respect to any connected input or node is one of the most interesting features of TensorFlow because it allows us to create learning algorithms without worrying about the complexity of all transformations. In this example, we first define a linear dataset representing the function f(x) = x in the range (-100, 100):

>>> import numpy as np
>>> nb_points = 100
>>> X = np.linspace(-nb_points, nb_points, 200, dtype=np.float32)

The corresponding plot is shown in the following figure:

getting-to-know-tensorflow-img-1

Now we want to use TensorFlow to compute the third power of x, together with its first and second derivatives with respect to x.

The first step is defining a graph:

>>> import tensorflow as tf
>>> graph = tf.Graph()

Within the context of this graph, we can define our input placeholder and other operations:

>>> with graph.as_default():
...     Xt = tf.placeholder(tf.float32, shape=(None, 1), name='x')
...     Y = tf.pow(Xt, 3.0, name='x_3')
...     Yd = tf.gradients(Y, Xt, name='dx')
...     Yd2 = tf.gradients(Yd, Xt, name='d2x')

A placeholder is generally defined with a type (first parameter), a shape, and an optional name. We've decided to use the tf.float32 type because single precision is the floating-point type most commonly used on GPUs. Selecting shape=(None, 1) means that it's possible to feed any two-dimensional array whose second dimension is equal to 1. The first operation computes the third power of Xt, working element-wise. The second operation computes the gradients of Y with respect to the input placeholder Xt. The last operation repeats the gradient computation, but in this case it uses Yd, which is the output of the first gradient operation. We can now pass some concrete data to see the results. The first thing to do is create a session connected to this graph:

>>> session = tf.InteractiveSession(graph=graph)

By using this session, we can request any computation through the run() method. All the input parameters must be supplied through a feed dictionary, where each key is a placeholder and each value is the actual array:

>>> X2, dX, d2X = session.run([Y, Yd, Yd2], feed_dict={Xt: X.reshape((nb_points*2, 1))})
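
The reshape in this call turns the one-dimensional array into a single column, matching the placeholder's shape=(None, 1). A small NumPy-only sketch shows the effect on the shapes. Using -1 to let NumPy infer the number of rows is our variation, not part of the original code, and the name X_col is illustrative.

```python
import numpy as np

nb_points = 100
X = np.linspace(-nb_points, nb_points, 200, dtype=np.float32)

print(X.shape)      # (200,)

# -1 asks NumPy to infer the first dimension (here nb_points * 2 = 200)
X_col = X.reshape((-1, 1))

print(X_col.shape)  # (200, 1)
```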

We needed to reshape our array to be compliant with the placeholder. The first argument of run() is a list of tensors that we want to be computed. In this case, we need all operation outputs. The plot of each of them is shown in the following figure:

getting-to-know-tensorflow-img-3

As expected, they represent, respectively, x^3, 3x^2, and 6x.
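
These three expressions can be cross-checked without TensorFlow. The sketch below is not how tf.gradients works internally; it approximates the derivatives with finite differences through np.gradient. The use of float64, the names inner, first_ok, and second_ok, and the tolerances are our assumptions.

```python
import numpy as np

nb_points = 100
# float64 is an assumption made here to limit rounding noise in the check
X = np.linspace(-nb_points, nb_points, 200, dtype=np.float64)

Y = X ** 3
dY = np.gradient(Y, X)     # finite-difference estimate of the first derivative
d2Y = np.gradient(dY, X)   # finite-difference estimate of the second derivative

# Skip the borders, where np.gradient falls back to one-sided differences
inner = slice(2, -2)
first_ok = np.allclose(dY[inner], 3 * X[inner] ** 2, atol=1.5)
second_ok = np.allclose(d2Y[inner], 6 * X[inner], atol=1e-6)

print(first_ok, second_ok)
```

The loose tolerance on the first derivative is deliberate: with a step of about 1, a central difference applied to a cubic is off by roughly the square of the step, whereas the gradients computed by TensorFlow don't depend on the sampling step.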

Further in the book, we look at a slightly more complex example that implements a logistic regression algorithm. Refer to Chapter 14, Brief Introduction to Deep Learning and TensorFlow, of Machine Learning Algorithms to read the complete chapter.