The development of neural networks was inspired by the activity of the human brain; this type of network is a computational model that mimics the pattern of the human mind. Support vector machines, in contrast, first map the input data into a high-dimensional feature space defined by a kernel function, and then find the optimum hyperplane that separates the training data with the maximum margin. In short, we can think of a support vector machine as a linear algorithm in a high-dimensional space.
In this article, we will cover:

- Training a neural network with neuralnet
- Visualizing a neural network trained by neuralnet
A neural network is constructed from an interconnected group of nodes, which involves an input, connection weights, a processing element, and an output. Neural networks can be applied to many areas, such as classification, clustering, and prediction. To train a neural network in R, you can use neuralnet, which is built to train multilayer perceptrons in the context of regression analysis and contains many flexible functions for training feedforward neural networks. In this recipe, we will introduce how to use neuralnet to train a neural network.
In this recipe, we will use the iris dataset as our example. We will first split the iris dataset into training and testing datasets.
Perform the following steps to train a neural network with neuralnet:
> # Split iris into training (~70%) and testing (~30%) sets;
> # call set.seed() first if you need a reproducible split.
> data(iris)
> ind <- sample(2, nrow(iris), replace = TRUE, prob = c(0.7, 0.3))
> trainset = iris[ind == 1,]
> testset = iris[ind == 2,]
> install.packages("neuralnet")
> library(neuralnet)
> # Add one binary indicator column per species to serve as the labels
> trainset$setosa = trainset$Species == "setosa"
> trainset$virginica = trainset$Species == "virginica"
> trainset$versicolor = trainset$Species == "versicolor"
> # Train a network with a single hidden layer of three neurons
> network = neuralnet(versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainset, hidden=3)
> network
Call: neuralnet(formula = versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = trainset, hidden = 3)
1 repetition was calculated.
Error Reached Threshold Steps
1 0.8156100175 0.009994274769 11063
> network$result.matrix
1
error 0.815610017474
reached.threshold 0.009994274769
steps 11063.000000000000
Intercept.to.1layhid1 1.686593311644
Sepal.Length.to.1layhid1 0.947415215237
Sepal.Width.to.1layhid1 -7.220058260187
Petal.Length.to.1layhid1 1.790333443486
Petal.Width.to.1layhid1 9.943109233330
Intercept.to.1layhid2 1.411026063895
Sepal.Length.to.1layhid2 0.240309549505
Sepal.Width.to.1layhid2 0.480654059973
Petal.Length.to.1layhid2 2.221435192437
Petal.Width.to.1layhid2 0.154879347818
Intercept.to.1layhid3 24.399329878242
Sepal.Length.to.1layhid3 3.313958088512
Sepal.Width.to.1layhid3 5.845670010464
Petal.Length.to.1layhid3 -6.337082722485
Petal.Width.to.1layhid3 -17.990352566695
Intercept.to.versicolor -1.959842102421
1layhid.1.to.versicolor 1.010292389835
1layhid.2.to.versicolor 0.936519720978
1layhid.3.to.versicolor 1.023305801833
Intercept.to.virginica -0.908909982893
1layhid.1.to.virginica -0.009904635231
1layhid.2.to.virginica 1.931747950462
1layhid.3.to.virginica -1.021438938226
Intercept.to.setosa 1.500533827729
1layhid.1.to.setosa -1.001683936613
1layhid.2.to.setosa -0.498758815934
1layhid.3.to.setosa -0.001881935696
> head(network$generalized.weights[[1]])
A neural network is made up of artificial neurons (or nodes). There are three types of neurons within the network: input neurons, hidden neurons, and output neurons. Neurons in the network are connected, and the connection strength between two neurons is called a weight. If the weight is greater than zero, the connection is in an excitation status; otherwise, it is in an inhibition status. Input neurons receive the input information; the higher the input value, the greater the activation. The activation value is then passed through the network according to the weights and transfer functions. The hidden neurons (or output neurons) sum up the incoming activation values and transform the sum with the transfer function. The activation flows through the hidden neurons until it reaches the output nodes. As a result, one can use the output values from the output neurons to classify the data.
Figure: An artificial neural network
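To make this flow of activation concrete, here is a minimal sketch (not part of the recipe) of a single forward pass through one hidden layer. The weight matrices here are randomly generated stand-ins, and the logistic function matches the default transfer function used by neuralnet:
> logistic <- function(x) 1 / (1 + exp(-x))
> # Hypothetical weights for 4 inputs -> 3 hidden neurons
> W_hidden <- matrix(rnorm(12), nrow = 3); b_hidden <- rnorm(3)
> # Hypothetical weights for 3 hidden neurons -> 3 outputs
> W_out <- matrix(rnorm(9), nrow = 3); b_out <- rnorm(3)
> x <- c(5.1, 3.5, 1.4, 0.2)
> hidden <- logistic(W_hidden %*% x + b_hidden)  # weighted sum + transfer
> output <- logistic(W_out %*% hidden + b_out)   # output activations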
The advantages of a neural network are: firstly, it can detect nonlinear relationships between the dependent and independent variables. Secondly, one can efficiently train large datasets using its parallel architecture. Thirdly, it is a nonparametric model, so one can eliminate errors in the estimation of parameters. The main disadvantages of a neural network are that it often converges to a local minimum rather than the global minimum, and that it may overfit when the training process runs for too long.
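One common way to reduce the risk of stopping in a poor local minimum is to train several repetitions of the network from different random starting weights and keep the one with the lowest error; neuralnet supports this through its rep argument. A minimal sketch:
> # Train five repetitions; result.matrix then has one column per repetition
> network_multi <- neuralnet(versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainset, hidden = 3, rep = 5)
> # Inspect the error of each repetition; the smallest is the best fit
> network_multi$result.matrix["error", ]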
In this recipe, we demonstrate how to train a neural network. First, we split the iris dataset into training and testing datasets, and then install the neuralnet package and load the library into an R session. Next, we add the columns versicolor, setosa, and virginica, based on whether the value in the Species column matches each name. We then use the neuralnet function to train the network model. Besides specifying the labels (the columns versicolor, virginica, and setosa) and the training attributes in the formula, we also configure a single hidden layer with three neurons by setting hidden=3.
Then, we examine the basic information about the training process and the trained network saved in network. The output message shows that the training process needed 11,063 steps until all of the absolute partial derivatives of the error function were lower than 0.01 (the value specified by the threshold argument). The error refers to the value of the error function, which by default is the sum of squared errors. To see detailed information, you can access the result.matrix of the built neural network to see the estimated weights. The output reveals that the estimated weights range from roughly -17.99 to 24.40; the intercepts of the first hidden layer are 1.69, 1.41, and 24.40, and the four weights leading to the first hidden neuron are estimated as 0.95 (Sepal.Length), -7.22 (Sepal.Width), 1.79 (Petal.Length), and 9.94 (Petal.Width). Lastly, the trained network stores generalized weights, which express the effect of each covariate. In this recipe, the model generates 12 sets of generalized weights, which are the combinations of the four covariates (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) with the three responses (setosa, virginica, versicolor).
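As a side note beyond the recipe itself, the held-out testset can be classified with the trained network using neuralnet's compute function; the predicted class is the output neuron with the largest activation. A minimal sketch:
> # Feed the four attributes of the testset through the trained network
> result <- compute(network, testset[, 1:4])
> # Columns follow the formula order: versicolor, virginica, setosa
> pred <- c("versicolor", "virginica", "setosa")[apply(result$net.result, 1, which.max)]
> table(pred, testset$Species)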
The neuralnet package provides the plot function to visualize a built neural network and the gwplot function to visualize the generalized weights. In the following recipe, we will cover how to use these two functions.
You need to have completed the previous recipe by training a neural network and having all of the basic information saved in the variable network.
Perform the following steps to visualize the neural network and the generalized weights:
> plot(network)
Figure 10: The plot of trained neural network
> par(mfrow=c(2,2))
> gwplot(network,selected.covariate="Petal.Width")
> gwplot(network,selected.covariate="Sepal.Width")
> gwplot(network,selected.covariate="Petal.Length")
> gwplot(network,selected.covariate="Sepal.Length")
Figure 11: The plot of generalized weights
In this recipe, we demonstrate how to visualize the trained neural network and the generalized weights of each trained attribute. The plot of the network includes the estimated weights, the intercepts, and basic information about the training process. At the bottom of the figure, one can find the overall error and the number of steps required to converge.
If all the generalized weights of a covariate are close to zero on the plot, the covariate has little effect on the response. However, if the overall variance of the generalized weights is greater than one, the covariate has a nonlinear effect.
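To complement the visual inspection, one can compute these variances directly. A small sketch; the columns of the generalized weight matrix correspond to the covariate and response combinations in formula order:
> # Variance of the generalized weights for each covariate/response pair
> gw <- network$generalized.weights[[1]]
> apply(gw, 2, var)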
For more information about gwplot, use the help function:
> ?gwplot