Keras is a high-level deep learning library built on top of *theano* and *tensorflow*. It is written in Python and provides a *scikit-learn*-style API for building neural networks. It lets developers build neural networks quickly without working through the mathematical details of tensor algebra, optimization methods, and numerical methods. The key idea behind *keras* is to facilitate fast prototyping and experimentation. In the words of **Francois Chollet**, creator of *keras*, “*Being able to go from idea to result with the least possible delay is the key to doing good research.*”

### Key features of *keras*:

- Either of the *theano* and *tensorflow* backends can be used.
- Supports both CPU and GPU.
- *Keras* is modular in nature: each component of a neural network model is a separate, standalone module, and these modules can be combined to create new models. New modules are easy to add.
- Write only Python code.

### Installation:

*Keras* has the following dependencies:

- *numpy*
- *scipy*
- *pyyaml*
- *hdf5* (for saving/loading models)
- *theano* (for the *theano* backend)
- *tensorflow* (for the *tensorflow* backend)

The easiest way to install *keras* is from the *Python Package Index (PyPI)*:

`sudo pip install keras`

### Example: MNIST digits classification using *keras*

We will learn about the basic functionality of *keras* using an example. We will build a simple neural network for classifying hand-written digits from the MNIST dataset. Classification of hand-written digits was the first big problem where deep learning outshone all the other known methods, and this success set deep learning on its current track.

Let’s start by importing data; we will use the sample of hand-written digits provided with the *scikit-learn* base package. Each sample is a small 8×8 image flattened into a 64-dimensional vector:

```
from sklearn import datasets
mnist = datasets.load_digits()
X = mnist.data
Y = mnist.target
```

Let’s examine the data:

```
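# Shapes of the feature matrix and the label vector
print(X.shape, Y.shape)
# First sample (64 pixel intensities) and its label
print(X[0])
print(Y[0])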
```

Since we are working with *numpy* arrays, let’s import *numpy*:

```
import numpy as np
# set seed
np.random.seed(1234)
```

Now, we’ll split the data into training and test sets by randomly picking 70% of the data points for training and holding out the rest for evaluation:

```
from sklearn.model_selection import train_test_split
train_X, test_X, train_y, test_y = train_test_split(X, Y, train_size=0.7, random_state=0)
```
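To make the idea of a random 70/30 split concrete, here is a minimal sketch in plain *numpy*. It is not how *scikit-learn* implements `train_test_split`; the helper name `split_indices` and its parameters are assumptions made for illustration only.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Hypothetical helper: shuffle row indices and cut them into two groups."""
    rng = np.random.RandomState(seed)
    shuffled = rng.permutation(n_samples)        # random order of 0..n_samples-1
    n_train = int(train_fraction * n_samples)    # size of the training part
    return shuffled[:n_train], shuffled[n_train:]

# Every index lands in exactly one of the two groups.
train_idx, test_idx = split_indices(10, train_fraction=0.7)
print(len(train_idx) + len(test_idx))
```

Indexing the data with these arrays (for example `X[train_idx]`) would then give the two subsets.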

*Keras* requires the labels to be *one-hot-encoded*, i.e., the digit labels 0, 1, 2, etc., need to be converted to vectors like [1,0,0,…], [0,1,0,…], [0,0,1,…], respectively:

```
from keras.utils import np_utils

def one_hot_encode_object_array(arr):
    '''One hot encode a numpy array of objects (e.g. strings)'''
    # Map each distinct label to an integer id, then one-hot encode the ids.
    uniques, ids = np.unique(arr, return_inverse=True)
    return np_utils.to_categorical(ids, len(uniques))

# One hot encode labels for training and test sets.
train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)
```
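For readers who want to see what one-hot encoding produces without installing *keras*, the following is a small *numpy*-only sketch. The function name `one_hot` is a hypothetical stand-in, not part of the *keras* API.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: map integer labels in [0, num_classes) to one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[labels]

encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```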

We are now ready to build a neural network model. Start by importing the relevant classes from *keras*:

```
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils import np_utils
```

In *keras*, we have to specify the structure of the model before we can use it. A **Sequential model** is a linear stack of layers. There are other alternatives in *keras*, but we will stick with the sequential model for simplicity:

`model = Sequential()`

This creates an instance of the *Sequential* class; the model does not contain any layers yet. As stated previously, *keras* is modular and we can add different components to the model via modules. Let’s add a fully connected layer with 32 units. Each unit receives an input from every unit in the input layer, and since the number of input units equals the dimension (64) of the input vectors, the input shape must be (64,). *Keras* uses a *Dense* module to create a fully connected layer:

`model.add(Dense(32, input_shape=(64,)))`
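As a rough illustration of what a fully connected layer computes, the sketch below applies a 64-to-32 affine map to a small batch with *numpy*. The random weights and the helper name `dense_forward` are assumptions for illustration; *keras* chooses its own initialization and manages the weights internally.

```python
import numpy as np

rng = np.random.RandomState(0)
n_inputs, n_units = 64, 32

# Illustrative parameters: one weight per (input, unit) pair plus one bias per unit.
W = rng.normal(scale=0.1, size=(n_inputs, n_units))
b = np.zeros(n_units)

def dense_forward(x, W, b):
    """Hypothetical helper: affine map applied to each row of x."""
    return x @ W + b

batch = rng.rand(5, n_inputs)      # 5 made-up samples with 64 features each
out = dense_forward(batch, W, b)
n_params = W.size + b.size

print(out.shape, n_params)  # (5, 32) 2080
```

The parameter count, 64 × 32 weights plus 32 biases, is what such a layer has to learn during training.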

Next, we add an activation function after the first layer. We will use *sigmoid* activation. Other choices, such as *relu*, are also possible:

`model.add(Activation('sigmoid'))`
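The two activations mentioned above can be written in a few lines of *numpy*. This is only a sketch of the mathematical functions, not of the *keras* internals.

```python
import numpy as np

def sigmoid(z):
    """Squash each value into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Keep positive values and replace negative ones with zero."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))
print(relu(z))  # [0. 0. 2.]
```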

We can add any number of layers this way, but for simplicity we will restrict ourselves to a single hidden layer. Next, add the output layer. Since the output is a 10-dimensional vector (one entry per digit class), the output layer needs 10 units:

`model.add(Dense(10))`

Add an activation for the output layer. In classification tasks, we use *softmax* activation. This provides a probabilistic interpretation for the output labels:

`model.add(Activation('softmax'))`
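To see why *softmax* gives a probabilistic reading of the outputs, here is a minimal *numpy* sketch: each row of raw scores becomes a row of non-negative values that sum to 1. The input scores are made up for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the row maximum avoids overflow in exp."""
    z = np.asarray(z, dtype=float)
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.array([[1.0, 2.0, 3.0],
                   [0.0, 0.0, 0.0]])
probs = softmax(scores)
print(probs.sum(axis=1))  # each row sums to 1
```

Equal scores, as in the second row, yield equal probabilities for every class.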

Next, we need to configure the model. There are some more choices we need to make before we can run the model, e.g., choose an optimization method, loss function, and metric of evaluation:

`model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])`
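The loss named in the call above, categorical cross-entropy, compares one-hot labels with predicted probabilities. The following is a *numpy* sketch of the standard formula; the clipping constant `eps` is an assumption, and the *keras* implementation may differ in such details.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples with two classes each.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])
loss = categorical_crossentropy(y_true, y_pred)
print(loss)
```

The loss shrinks toward zero as the probability assigned to the correct class approaches 1.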

The *compile* method configures the model, and the model is now ready to be trained on data. Similar to *sklearn*, *keras* has a *fit* method for training:

`model.fit(train_X, train_y_ohe, epochs=10, batch_size=30)`

Training neural networks often involves the concept of *minibatching*, which means showing the network a subset of the data, adjusting the weights, and then showing it another subset of the data. When the network has seen all the data once, that’s called an “epoch”. Tuning the minibatch/epoch strategy is a somewhat problem-specific issue.
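The relationship between minibatches and epochs can be sketched as a pair of nested loops. The helper `iterate_minibatches` and the sample count of 100 are hypothetical; this only illustrates the bookkeeping, not what *keras* does internally.

```python
import numpy as np

def iterate_minibatches(n_samples, batch_size, rng):
    """Hypothetical helper: yield index arrays covering a shuffled pass over the data."""
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.RandomState(0)
n_samples, batch_size, n_epochs = 100, 30, 2

batch_sizes = []
for epoch in range(n_epochs):
    for batch_idx in iterate_minibatches(n_samples, batch_size, rng):
        # A real training loop would compute gradients on this batch
        # and update the weights here.
        batch_sizes.append(len(batch_idx))

print(batch_sizes)  # [30, 30, 30, 10, 30, 30, 30, 10]
```

With 100 samples and a batch size of 30, each epoch consists of four weight updates, the last one on a smaller batch.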

After the model has trained, we can compute its accuracy on the held-out test set:

```
loss, accuracy = model.evaluate(test_X, test_y_ohe)
print(accuracy)
```
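Accuracy here means the fraction of samples whose most probable predicted class matches the true class. A minimal *numpy* sketch, using made-up probabilities and a hypothetical helper name, looks like this:

```python
import numpy as np

def accuracy_from_probabilities(y_true_ohe, y_prob):
    """Hypothetical helper: compare the argmax of predictions with the argmax of labels."""
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_ohe, axis=1)
    return float(np.mean(predicted == actual))

# Four made-up samples over three classes; the last prediction is wrong.
y_true_ohe = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.1, 0.2, 0.7],
                   [0.3, 0.6, 0.1]])
acc = accuracy_from_probabilities(y_true_ohe, y_prob)
print(acc)  # 0.75
```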

### Conclusion

We have seen how a neural network can be built using *keras*, and how easy and intuitive the *keras* API is. This is just an introduction, a hello-world program, if you will. There is a lot more functionality in *keras*, including convolutional neural networks, recurrent neural networks, language modeling, deep dream, etc.

### About the author

Janu Verma is a Researcher in the IBM T.J. Watson Research Center, New York. His research interests are in mathematics, machine learning, information visualization, computational biology and healthcare analytics. He has held research positions at Cornell University, Kansas State University, Tata Institute of Fundamental Research, Indian Institute of Science, and Indian Statistical Institute. He has written papers for IEEE Vis, KDD, International Conference on HealthCare Informatics, Computer Graphics and Applications, Nature Genetics, IEEE Sensors Journals, etc. His current focus is on the development of visual analytics systems for prediction and understanding. He advises startups and companies on data science and machine learning in the Delhi-NCR area; email to schedule a meeting.