
In this article by Nicholas McClure, the author of the book TensorFlow Machine Learning Cookbook, we will cover basic recipes that show how TensorFlow works, as well as how to access data and additional resources for this book:

  • How TensorFlow works
  • Declaring tensors
  • Using placeholders and variables
  • Working with matrices
  • Declaring operations


Introduction

Google’s TensorFlow engine has a unique way of solving problems, and that approach lets us solve machine learning problems very efficiently. We will cover the basic steps needed to understand how TensorFlow operates; this understanding is essential for following the recipes in the rest of this book.

How TensorFlow works

At first, computation in TensorFlow may seem needlessly complicated. But there is a reason for it: because of how TensorFlow treats computation, developing more complicated algorithms is relatively easy. This recipe will walk you through the pseudocode of how a TensorFlow algorithm usually works.

Getting ready

Currently, TensorFlow is only supported on Mac and Linux distributions; using TensorFlow on Windows requires a virtual machine. Throughout this book we will only concern ourselves with the Python library wrapper of TensorFlow. This book will use Python 3.4+ (https://www.python.org) and TensorFlow 0.7 (https://www.tensorflow.org). While TensorFlow can run on the CPU, it runs faster on a GPU, and it is supported on graphics cards with NVIDIA Compute Capability 3.0+. To run on a GPU, you will also need to download and install the NVIDIA CUDA Toolkit (https://developer.nvidia.com/cuda-downloads). Some of the recipes will also rely on current installations of the Python packages SciPy, NumPy, and Scikit-Learn.
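
If you want to confirm which version of TensorFlow you have installed, a quick check from Python is possible; this is a minimal sketch and assumes TensorFlow imports cleanly:

import tensorflow as tf

# Print the installed TensorFlow version
print(tf.__version__)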

How to do it…

Here we will introduce the general flow of TensorFlow algorithms. Most recipes will follow this outline (a minimal end-to-end sketch follows the list):

  1. Import or generate data: All of our machine-learning algorithms will depend on data. In this book we will either generate data or use an outside source of data. Sometimes it is better to rely on generated data because we will want to know the expected outcome.
  2. Transform and normalize data: The data is usually not in the correct dimension or type that our TensorFlow algorithms expect. We will have to transform our data before we can use it. Most algorithms also expect normalized data, and we will do this here as well. TensorFlow has built-in functions that can normalize the data for you, as follows:
    data = tf.nn.batch_norm_with_global_normalization(...)
  3. Set algorithm parameters: Our algorithms usually have a set of parameters that we hold constant throughout the procedure. For example, this can be the number of iterations, the learning rate, or other fixed parameters of our choosing. It is considered good form to initialize these together so the reader or user can easily find them, as follows:
    learning_rate = 0.01
    
    iterations = 1000
  4. Initialize variables and placeholders: TensorFlow depends on us telling it what it can and cannot modify. TensorFlow will modify the variables during optimization to minimize a loss function. To accomplish this, we feed in data through placeholders. We need to initialize both variables and placeholders with a size and type, so that TensorFlow knows what to expect. See the following code:
    a_var = tf.constant(42)
    
    x_input = tf.placeholder(tf.float32, [None, input_size])
    
    y_input = tf.placeholder(tf.float32, [None, num_classes])
  5. Define the model structure: After we have the data, and have initialized our variables and placeholders, we have to define the model. This is done by building a computational graph. We tell TensorFlow what operations must be done on the variables and placeholders to arrive at our model predictions:
    y_pred = tf.add(tf.mul(x_input, weight_matrix), b_matrix)
  6. Declare the loss functions: After defining the model, we must be able to evaluate the output. This is where we declare the loss function. The loss function is very important as it tells us how far off our predictions are from the actual values:
    loss = tf.reduce_mean(tf.square(y_actual - y_pred))
  7. Initialize and train the model: Now that we have everything in place, we need to create an instance for our graph, feed in the data through the placeholders and let TensorFlow change the variables to better predict our training data. Here is one way to initialize the computational graph:
    with tf.Session(graph=graph) as session:
        ...
        session.run(...)
        ...
    
    Note that we can also initiate our graph with:
    
    session = tf.Session(graph=graph)
    session.run(...)
  8. (Optional) Evaluate the model: Once we have built and trained the model, we should evaluate the model by looking at how well it does with new data through some specified criteria.
  9. (Optional) Predict new outcomes: It is also important to know how to make predictions on new, unseen data. We can do this with all of our models once we have trained them.
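
To make the outline concrete, here is a minimal sketch that strings these steps together for a toy linear model. The data, the single weight A, and the parameter values are illustrative inventions for this sketch, not taken from any recipe; the function names follow the TensorFlow 0.x API used throughout this article.

import numpy as np
import tensorflow as tf

# 1-2. Generate toy data: y is roughly 3x
x_data = np.random.rand(100).astype(np.float32)
y_data = 3.0 * x_data + np.random.normal(0.0, 0.1, 100).astype(np.float32)

# 3. Set algorithm parameters
learning_rate = 0.01
iterations = 1000

# 4. Initialize variables and placeholders
A = tf.Variable(tf.random_normal([1]))        # the weight TensorFlow will learn
x_input = tf.placeholder(tf.float32, [None])
y_input = tf.placeholder(tf.float32, [None])

# 5. Define the model structure
y_pred = tf.mul(x_input, A)

# 6. Declare the loss function
loss = tf.reduce_mean(tf.square(y_input - y_pred))

# 7. Initialize and train the model
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
with tf.Session() as session:
    session.run(tf.initialize_all_variables())
    for i in range(iterations):
        session.run(train_step, feed_dict={x_input: x_data, y_input: y_data})
    print(session.run(A))  # should be close to 3.0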

How it works…

In TensorFlow, we have to set up the data, variables, placeholders, and model before we tell the program to train and change the variables to improve its predictions. TensorFlow accomplishes this through the computational graph. We tell it to minimize a loss function, and TensorFlow does this by modifying the variables in the model. TensorFlow knows how to modify the variables because it keeps track of the computations in the model and automatically computes the gradient for every variable. Because of this, we can see how easy it is to make changes and try different data sources.
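
As a small illustration of this automatic gradient computation (a sketch, assuming the TensorFlow 0.x API), tf.gradients() asks TensorFlow to add the gradient operations to the graph for us:

import tensorflow as tf

x = tf.Variable(3.0)
y = tf.square(x)             # y = x^2, so dy/dx = 2x
grad = tf.gradients(y, [x])  # TensorFlow builds the gradient ops from the graph

sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(sess.run(grad))        # [6.0]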

See also

  • A great place to start is the official TensorFlow Python API documentation: https://www.tensorflow.org/versions/r0.7/api_docs/python/index.html
  • There are also tutorials available: https://www.tensorflow.org/versions/r0.7/tutorials/index.html

Declaring tensors

Getting ready

Tensors are the data structures that TensorFlow operates on in the computational graph. We can declare these tensors as variables or feed them in as placeholders, but first we must know how to create tensors. It is important to point out that merely creating a tensor does not add anything to the computational graph; TensorFlow adds the corresponding graph structures only after we create a variable out of the tensor. See the next section on variables and placeholders for more information.

How to do it…

Here we will cover the main ways to create tensors in TensorFlow (a short runnable sketch follows the list).

  1. Fixed tensors:
    • Creating a zero-filled tensor. Use the following:
      zero_tsr = tf.zeros([row_dim, col_dim])
    • Creating a one-filled tensor. Use the following:
      ones_tsr = tf.ones([row_dim, col_dim])
    • Creating a constant-filled tensor. Use the following:
      filled_tsr = tf.fill([row_dim, col_dim], 42)
    • Creating a tensor out of an existing constant. Use the following:
      constant_tsr = tf.constant([1,2,3])

    Note that the tf.constant() function can be used to broadcast a value into an array, mimicking the behavior of tf.fill(), by writing tf.constant(42, [row_dim, col_dim]).

  2. Tensors of similar shape:
    • We can also initialize variables based on the shape of other tensors, as follows:
      zeros_similar = tf.zeros_like(constant_tsr)
      
      ones_similar = tf.ones_like(constant_tsr)

    Note that since these tensors depend on prior tensors, we must initialize them in order. Attempting to initialize all the tensors at once will result in an error.

  3. Sequence tensors:
    • TensorFlow allows us to specify tensors that contain defined intervals. The following functions behave very similarly to Python's range() and numpy's linspace() outputs. See the following function:
      linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3)
    • The resulting tensor is the sequence [0.0, 0.5, 1.0]. Note that this function includes the specified stop value. See the following function:
      integer_seq_tsr = tf.range(start=6, limit=15, delta=3)
      
      The result is the sequence [6, 9, 12]. Note that this function does not include the limit value.
  4. Random tensors:
    • The following generated random numbers are from a uniform distribution:
      randunif_tsr = tf.random_uniform([row_dim, col_dim], minval=0, maxval=1)
    • Know that this random uniform distribution draws from the interval that includes minval but not maxval (minval <= x < maxval).
    • To get a tensor with random draws from a normal distribution, use the following:
      randnorm_tsr = tf.random_normal([row_dim, col_dim], mean=0.0, stddev=1.0)
    • There are also times when we wish to generate normal random values that are assured to fall within certain bounds. The truncated_normal() function always picks normal values within two standard deviations of the specified mean. See the following:
      truncnorm_tsr = tf.truncated_normal([row_dim, col_dim], mean=0.0, stddev=1.0)
    • We might also be interested in randomizing entries of arrays. To accomplish this there are two functions that help us, random_shuffle() and random_crop(). See the following:
      shuffled_output = tf.random_shuffle(input_tensor)
      
      cropped_output = tf.random_crop(input_tensor, crop_size)
    • Later on in this book, we will be interested in randomly cropping an image of size (height, width, 3), where there are three color channels. To fix a dimension in the cropped_output, you must give it the maximum size in that dimension:
      cropped_image = tf.random_crop(my_image, [height/2, width/2, 3])
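
The one-liners above only declare operations; to see the resulting values, we have to run them in a session. Here is a minimal sketch (the concrete dimensions are illustrative):

import tensorflow as tf

sess = tf.Session()

linear_tsr = tf.linspace(start=0.0, stop=1.0, num=3)
integer_seq_tsr = tf.range(start=6, limit=15, delta=3)
randunif_tsr = tf.random_uniform([2, 3], minval=0, maxval=1)

print(sess.run(linear_tsr))       # [ 0.   0.5  1. ]
print(sess.run(integer_seq_tsr))  # [ 6  9 12]
print(sess.run(randunif_tsr))     # new random draws on every run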

How it works…

Once we have decided how to create the tensors, we may also create the corresponding variables by wrapping the tensor in the Variable() function, as follows (more on this in the next section):

my_var = tf.Variable(tf.zeros([row_dim, col_dim]))

There’s more…

We are not limited to the built-in functions: we can convert any numpy array, Python list, or constant to a tensor using the function convert_to_tensor(). Know that this function also accepts tensors as an input, in case we wish to generalize a computation inside a function.
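
For example, here is a minimal sketch of converting each kind of input (the values are illustrative):

import numpy as np
import tensorflow as tf

sess = tf.Session()

from_list = tf.convert_to_tensor([1., 2., 3.])
from_array = tf.convert_to_tensor(np.array([[1., 2.], [3., 4.]]))
from_tensor = tf.convert_to_tensor(tf.fill([2, 2], 7.0))  # tensors pass through unchanged

print(sess.run(from_array))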

Using placeholders and variables

Getting ready

One of the most important distinctions to make about data is whether it is a placeholder or a variable. Variables are the parameters of the algorithm, and TensorFlow keeps track of how to change these to optimize the algorithm. Placeholders are objects that allow you to feed in data of a specific type and shape, or whose values depend on the results of the computational graph, such as the expected outcome of a computation.

How to do it…

The main way to create a variable is with the Variable() function, which takes a tensor as an input and outputs a variable. This is only the declaration; we still need to initialize the variable. Initializing is what puts the variable, with its corresponding methods, on the computational graph. Here is an example of creating and initializing a variable:

my_var = tf.Variable(tf.zeros([2,3]))

sess = tf.Session()

initialize_op = tf.initialize_all_variables()

sess.run(initialize_op)

To see what the computational graph looks like after creating and initializing a variable, see Figure 1 in the next part of this section, How it works….

Placeholders just hold the position for data to be fed into the graph. Placeholders get data from a feed_dict argument in the session. To put a placeholder in the graph, we must perform at least one operation on it. In the following snippet, we initialize the graph, declare x to be a placeholder, and define y as the identity operation on x, which just returns x. We then create data to feed into the x placeholder and run the identity operation. It is worth noting that TensorFlow will not return a self-referenced placeholder in the feed dictionary. The code is shown below, and the resulting graph is in the next section, How it works…:

sess = tf.Session()

x = tf.placeholder(tf.float32, shape=[2,2])

y = tf.identity(x)

x_vals = np.random.rand(2,2)

sess.run(y, feed_dict={x: x_vals})

# Note that sess.run(x, feed_dict={x: x_vals}) will result in a self-referencing error.

How it works…

The computational graph of initializing a variable as a tensor of zeros is shown in Figure 1:

Figure 1: Here we can see what the computational graph looks like in detail with just one variable, initialized to all zeros. The grey shaded region is a very detailed view of the operations and constants involved. The main computational graph, with less detail, is the smaller graph outside of the grey region in the upper right.

Similarly, the computational graph of feeding a numpy array into a placeholder can be seen in Figure 2:

Figure 2: Here is the computational graph of an initialized placeholder. The grey shaded region is a very detailed view of the operations and constants involved. The main computational graph, with less detail, is the smaller graph outside of the grey region in the upper right.

There’s more…

During the run of the computational graph, we have to tell TensorFlow when to initialize the variables we have created. While each variable has an initializer method, the most common way to do this is with the helper function initialize_all_variables(). This function creates an operation in the graph that initializes all the variables we have created, as follows:

initializer_op = tf.initialize_all_variables()

But if we want to initialize a variable based on the results of initializing another variable, we have to initialize variables in the order we want, as follows:

sess = tf.Session()

first_var = tf.Variable(tf.zeros([2,3]))

sess.run(first_var.initializer)

second_var = tf.Variable(tf.zeros_like(first_var))

# Depends on first_var

sess.run(second_var.initializer)

Working with matrices

Getting ready

Many algorithms depend on matrix operations. TensorFlow gives us easy-to-use operations to perform such matrix calculations. For all of the following examples, we can create a graph session by running the following code:

import numpy as np
import tensorflow as tf
sess = tf.Session()

How to do it…

  1. Creating matrices: We can create two-dimensional matrices from numpy arrays or nested lists, as we described in the earlier section on tensors. We can also use the tensor creation functions and specify a two-dimensional shape for functions like zeros(), ones(), truncated_normal(), and so on:
    • TensorFlow also allows us to create a diagonal matrix from a one-dimensional array or list with the function diag(), as follows:
      identity_matrix = tf.diag([1.0, 1.0, 1.0]) # Identity matrix
      
      A = tf.truncated_normal([2, 3]) # 2x3 random normal matrix
      
      B = tf.fill([2,3], 5.0) # 2x3 constant matrix of 5's
      
      C = tf.random_uniform([3,2]) # 3x2 random uniform matrix
      
      D = tf.convert_to_tensor(np.array([[1., 2., 3.],[-3., -7., -1.],[0., 5., -2.]]))
      
      print(sess.run(identity_matrix))
      
      [[ 1. 0. 0.]
      
      [ 0. 1. 0.]
      
      [ 0. 0. 1.]]
      
      print(sess.run(A))
      
      [[ 0.96751703 0.11397751 -0.3438891 ]
      
      [-0.10132604 -0.8432678   0.29810596]]
      
      print(sess.run(B))
      
      [[ 5. 5. 5.]
      
      [ 5. 5. 5.]]
      
      print(sess.run(C))
      
      [[ 0.33184157 0.08907614]
      
      [ 0.53189191 0.67605299]
      
      [ 0.95889051 0.67061249]]
      
      print(sess.run(D))
      
      [[ 1. 2. 3.]
      
      [-3. -7. -1.]
      
      [ 0. 5. -2.]]

    Note that if we were to run sess.run(C) again, we would get different random values, because the random operations are re-evaluated on every run.

  2. Addition and subtraction use the following functions:
    print(sess.run(A+B))
    
    [[ 4.61596632 5.39771316 4.4325695 ]
    
    [ 3.26702736 5.14477345 4.98265553]]
    
    print(sess.run(B-B))
    
    [[ 0. 0. 0.]
    
    [ 0. 0. 0.]]
    
    Matrix multiplication works as follows:
    
    print(sess.run(tf.matmul(B, identity_matrix)))
    
    [[ 5. 5. 5.]
    
    [ 5. 5. 5.]]
  3. Also, the function matmul() has arguments that specify whether to transpose the inputs before multiplication or whether each matrix is sparse (see the sketch after this list).
  4. Transpose the arguments as follows:
    print(sess.run(tf.transpose(C)))
    
    [[ 0.67124544 0.26766731 0.99068872]
    
    [ 0.25006068 0.86560275 0.58411312]]
  5. Again, it is worth mentioning that the random tensor is re-evaluated, which gives us different values than before.
  6. For the determinant, use the following:
    print(sess.run(tf.matrix_determinant(D)))
    
    -38.0
    • Inverse:
      print(sess.run(tf.matrix_inverse(D)))
      
      [[-0.5       -0.5       -0.5       ]
      
      [ 0.15789474 0.05263158 0.21052632]
      
      [ 0.39473684 0.13157895 0.02631579]]

    Note that the inverse method is based on the Cholesky decomposition if the matrix is symmetric positive definite, and on the LU decomposition otherwise.

  7. Decompositions:
    • Cholesky decomposition, use the following:
      print(sess.run(tf.cholesky(identity_matrix)))
      
      [[ 1. 0. 0.]
      
      [ 0. 1. 0.]
      
      [ 0. 0. 1.]]
  8. Eigenvalues and eigenvectors, use the following code:
    print(sess.run(tf.self_adjoint_eig(D)))
    
    [[-10.65907521 -0.22750691   2.88658212]
    
    [ 0.21749542   0.63250104 -0.74339638]
    
    [ 0.84526515   0.2587998   0.46749277]
    
    [ -0.4880805   0.73004459   0.47834331]]

Note that the function self_adjoint_eig() outputs the eigenvalues in the first row and the corresponding eigenvectors in the remaining rows. In mathematics, this is known as the eigendecomposition of a matrix.
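
To illustrate the transpose arguments of matmul() mentioned in step 3, here is a minimal sketch (B is re-declared as the same 2x3 constant matrix of 5s so the snippet stands alone):

import tensorflow as tf

sess = tf.Session()
B = tf.fill([2, 3], 5.0)

# transpose_b=True multiplies B by its own transpose; each entry of the
# 2x2 result sums 5*5 over the 3 shared columns, giving 75.
print(sess.run(tf.matmul(B, B, transpose_b=True)))
# [[ 75.  75.]
#  [ 75.  75.]]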

How it works…

TensorFlow provides all the tools for us to get started with numerical computations and add such computations to our graphs. This notation might seem quite heavy for simple matrix operations. Remember that we are adding these operations to the graph and telling TensorFlow what tensors to run through those operations.

Declaring operations

Getting ready

Besides the standard arithmetic operations, TensorFlow provides additional operations that we should be aware of, and we should know how to use them before proceeding. Again, we can create a graph session by running the following code:

import tensorflow as tf

sess = tf.Session()

How to do it…

TensorFlow has the standard operations on tensors: add(), sub(), mul(), and div(). Note that all of the operations in this section evaluate their inputs element-wise unless specified otherwise.
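
As a quick sketch of these element-wise operations (the values are illustrative):

import tensorflow as tf

sess = tf.Session()

a = tf.constant([2., 4.])
b = tf.constant([1., 2.])
print(sess.run(tf.add(a, b)))  # [ 3.  6.]
print(sess.run(tf.sub(a, b)))  # [ 1.  2.]
print(sess.run(tf.mul(a, b)))  # [ 2.  8.]
print(sess.run(tf.div(a, b)))  # [ 2.  2.]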

  1. TensorFlow provides some variations of div() and relevant functions.
  2. It is worth mentioning that div() returns the same type as its inputs. This means it really returns the floor of the division (akin to Python 2) if the inputs are integers. To get the Python 3 behavior, which casts integers to floats before dividing and always returns a float, TensorFlow provides the function truediv(), shown as follows:
    print(sess.run(tf.div(3,4)))
    
    0
    
    print(sess.run(tf.truediv(3,4)))
    
    0.75
  3. If we have floats and want integer division, we can use the function floordiv(). Note that this will still return a float, but rounded down to the nearest integer. The function is shown as follows:
    print(sess.run(tf.floordiv(3.0,4.0)))
    
    0.0
  4. Another important function is mod(). This function returns the remainder after division. It is shown as follows:
    print(sess.run(tf.mod(22.0, 5.0)))
    
    2.0
  5. The cross product between two tensors is achieved by the cross() function. Remember that the cross product is only defined for two 3-dimensional vectors, so it only accepts two 3-dimensional tensors. The function is shown as follows:
    print(sess.run(tf.cross([1., 0., 0.], [0., 1., 0.])))
    
    [ 0.  0.  1.]
  6. Here is a compact list of the more common math functions. All of these functions operate element-wise:

     abs()       Absolute value of one input tensor
     ceil()      Ceiling function of one input tensor
     cos()       Cosine function of one input tensor
     exp()       Base e exponential of one input tensor
     floor()     Floor function of one input tensor
     inv()       Multiplicative inverse (1/x) of one input tensor
     log()       Natural logarithm of one input tensor
     maximum()   Element-wise max of two tensors
     minimum()   Element-wise min of two tensors
     neg()       Negative of one input tensor
     pow()       The first tensor raised to the second tensor, element-wise
     round()     Rounds one input tensor
     rsqrt()     One over the square root of one tensor
     sign()      Returns -1, 0, or 1, depending on the sign of the tensor
     sin()       Sine function of one input tensor
     sqrt()      Square root of one input tensor
     square()    Square of one input tensor

  7. Specialty mathematical functions: There are some special math functions used in machine learning that are worth mentioning, and TensorFlow has built-in functions for them. Again, these functions operate element-wise unless specified otherwise (a short sketch follows this list):

     digamma()             Psi function, the derivative of the lgamma() function
     erf()                 Gaussian error function, element-wise, of one tensor
     erfc()                Complementary error function of one tensor
     igamma()              Lower regularized incomplete gamma function
     igammac()             Upper regularized incomplete gamma function
     lbeta()               Natural logarithm of the absolute value of the beta function
     lgamma()              Natural logarithm of the absolute value of the gamma function
     squared_difference()  Computes the square of the difference between two tensors
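
As a quick sketch of two of these specialty functions (the values are illustrative):

import tensorflow as tf

sess = tf.Session()

print(sess.run(tf.erf([0., 1.])))  # erf(0) = 0; erf(1) is roughly 0.8427
print(sess.run(tf.squared_difference([1., 2.], [3., 3.])))  # [ 4.  1.]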

How it works…

It is important to know what functions are available to us to add to our computational graphs. Mostly we will be concerned with the preceding functions. We can also generate many different custom functions as compositions of the preceding, as follows:

# Tangent function (tan(pi/4)=1)

print(sess.run(tf.div(tf.sin(3.1416/4.), tf.cos(3.1416/4.))))

1.0

There’s more…

If we wish to add other operations to our graphs that are not listed here, we must create our own from the preceding functions. Here is an example of an operation not listed above that we can add to our graph:

# Define a custom polynomial function

def custom_polynomial(value):

   # Return 3 * x^2 - x + 10

   return(tf.sub(3 * tf.square(value), value) + 10)

print(sess.run(custom_polynomial(11)))

362

Summary

In this article, we worked through some introductory recipes that cover the basics of TensorFlow.
