
The development of neural networks was inspired by human brain activity; this type of network is a computational model that mimics the patterns of the human mind. In contrast, support vector machines first map the input data into a high-dimensional feature space defined by the kernel function, and then find the optimum hyperplane that separates the training data by the maximum margin. In short, we can think of a support vector machine as a linear algorithm in a high-dimensional space.

In this article, we will cover:

  • Training a neural network with neuralnet
  • Visualizing a neural network trained by neuralnet


Training a neural network with neuralnet

A neural network is constructed from an interconnected group of nodes, which involves inputs, connection weights, processing elements, and outputs. Neural networks can be applied to many areas, such as classification, clustering, and prediction. To train a neural network in R, you can use the neuralnet package, which is built to train multilayer perceptrons in the context of regression analysis and contains many flexible functions for training feed-forward neural networks. In this recipe, we will introduce how to use neuralnet to train a neural network.

Getting ready

In this recipe, we will use the iris dataset as our example dataset. We will first split the iris dataset into training and testing datasets.

How to do it…

Perform the following steps to train a neural network with neuralnet:

  1. First load the iris dataset and split the data into training and testing datasets:
    > data(iris)
    
    > ind <- sample(2, nrow(iris), replace = TRUE, prob=c(0.7, 0.3))
    
    > trainset = iris[ind == 1,]
    
    > testset = iris[ind == 2,]
  2. Then, install and load the neuralnet package:
    > install.packages("neuralnet")
    
    > library(neuralnet)
  3. Add the columns versicolor, setosa, and virginica, based on whether the value in the Species column matches each name:
    > trainset$setosa = trainset$Species == "setosa"
    
    > trainset$virginica = trainset$Species == "virginica"
    
    > trainset$versicolor = trainset$Species == "versicolor"
  4. Next, train the neural network with the neuralnet function, using three neurons in the single hidden layer. Notice that the results may vary with each training, so you might not get the same result:
    > network = neuralnet(versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainset, hidden=3)
    
    > network
    
    Call: neuralnet(formula = versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, data = trainset, hidden = 3)
    
    1 repetition was calculated.
    
             Error Reached Threshold Steps
    1 0.8156100175    0.009994274769 11063
  5. Now, you can view the summary information by accessing the result.matrix attribute of the built neural network model:
    > network$result.matrix
                                              1
    error                        0.815610017474
    reached.threshold            0.009994274769
    steps                    11063.000000000000
    Intercept.to.1layhid1        1.686593311644
    Sepal.Length.to.1layhid1     0.947415215237
    Sepal.Width.to.1layhid1     -7.220058260187
    Petal.Length.to.1layhid1     1.790333443486
    Petal.Width.to.1layhid1      9.943109233330
    Intercept.to.1layhid2        1.411026063895
    Sepal.Length.to.1layhid2     0.240309549505
    Sepal.Width.to.1layhid2      0.480654059973
    Petal.Length.to.1layhid2     2.221435192437
    Petal.Width.to.1layhid2      0.154879347818
    Intercept.to.1layhid3       24.399329878242
    Sepal.Length.to.1layhid3     3.313958088512
    Sepal.Width.to.1layhid3      5.845670010464
    Petal.Length.to.1layhid3    -6.337082722485
    Petal.Width.to.1layhid3    -17.990352566695
    Intercept.to.versicolor     -1.959842102421
    1layhid.1.to.versicolor      1.010292389835
    1layhid.2.to.versicolor      0.936519720978
    1layhid.3.to.versicolor      1.023305801833
    Intercept.to.virginica      -0.908909982893
    1layhid.1.to.virginica      -0.009904635231
    1layhid.2.to.virginica       1.931747950462
    1layhid.3.to.virginica      -1.021438938226
    Intercept.to.setosa          1.500533827729
    1layhid.1.to.setosa         -1.001683936613
    1layhid.2.to.setosa         -0.498758815934
    1layhid.3.to.setosa         -0.001881935696
  6. Lastly, you can view the generalized weights by accessing them in the network:
    > head(network$generalized.weights[[1]])
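Although the recipe stops at inspecting the fitted weights, you will usually want class predictions for the held-out data. The following is a minimal sketch (not part of the original recipe) that feeds the test covariates through the network with neuralnet's compute function; it assumes the output columns keep the order versicolor, virginica, setosa given in the formula above:

    > # Forward-propagate the four test covariates through the trained network
    > net.result = compute(network, testset[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")])$net.result
    
    > # For each test flower, pick the output neuron with the largest activation
    > prediction = c("versicolor", "virginica", "setosa")[apply(net.result, 1, which.max)]
    
    > # Cross-tabulate predicted labels against the true species
    > table(testset$Species, prediction)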

How it works…

A neural network is a network made up of artificial neurons (or nodes). There are three types of neurons within the network: input neurons, hidden neurons, and output neurons. The neurons are connected, and the connection strength between two neurons is called a weight. If the weight is greater than zero, it is in an excitation status; otherwise, it is in an inhibition status. Input neurons receive the input information; the higher the input value, the greater the activation. The activation value is then passed through the network according to the connection weights and transfer functions. Each hidden neuron (or output neuron) sums up the incoming activation values and modifies the summed value with its transfer function. The activation then flows through the hidden neurons and stops when it reaches the output nodes. As a result, one can use the output values from the output neurons to classify the data.

Figure: An artificial neural network
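To make the forward pass concrete, here is a minimal sketch that computes the activation of the first hidden neuron for a single flower, using rounded weights taken from the result.matrix output above; it assumes the logistic transfer function, which is neuralnet's default act.fct:

    > # A single flower: the intercept term plus its four measurements
    > x = c(1, 5.1, 3.5, 1.4, 0.2)
    
    > # Rounded weights into the first hidden neuron, from result.matrix
    > # (Intercept, Sepal.Length, Sepal.Width, Petal.Length, Petal.Width)
    > w = c(1.687, 0.947, -7.220, 1.790, 9.943)
    
    > # Weighted sum squashed by the logistic transfer function
    > 1 / (1 + exp(-sum(w * x)))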

The advantages of a neural network are as follows: first, it can detect nonlinear relationships between the dependent and independent variables. Second, one can efficiently train large datasets using its parallel architecture. Third, it is a nonparametric model, so errors in the estimation of parameters are avoided. The main disadvantages of a neural network are that it often converges to a local minimum rather than the global minimum, and that it may overfit when the training process runs for too long.
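Because training can settle in a local minimum, one common remedy is to train several repetitions from different random starting weights and keep the best one. The sketch below illustrates this with the rep argument of neuralnet; treat it as an illustration rather than part of the original recipe:

    > # Train five repetitions from different random initial weights
    > net5 = neuralnet(versicolor + virginica + setosa ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width, trainset, hidden=3, rep=5)
    
    > # result.matrix now has one column per repetition; find the smallest error
    > which.min(net5$result.matrix["error", ])
    
    > # Plot the repetition with the smallest error
    > plot(net5, rep="best")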

In this recipe, we demonstrate how to train a neural network. First, we split the iris dataset into training and testing datasets, and then install the neuralnet package and load the library into an R session. Next, we add the columns versicolor, setosa, and virginica, based on whether the value in the Species column matches each name. We then use the neuralnet function to train the network model. Besides specifying the labels (the columns named versicolor, virginica, and setosa) and the training attributes in the function, we also configure the number of hidden neurons (vertices) as three in the single hidden layer.

Then, we examine the basic information about the training process and the trained network saved in network. The output message shows that the training process needed 11,063 steps until all the absolute partial derivatives of the error function were lower than 0.01 (the specified threshold). The error refers to the value of the error function, which is the sum of squared errors by default. To see detailed information, you can access the result.matrix of the built neural network to see the estimated weights. The output reveals that the estimated weights range from -17.99 to 24.40; the intercepts of the three hidden neurons are 1.69, 1.41, and 24.40, and the four weights leading to the first hidden neuron are estimated as 0.95 (Sepal.Length), -7.22 (Sepal.Width), 1.79 (Petal.Length), and 9.94 (Petal.Width). Lastly, the trained network includes generalized weights, which express the effect of each covariate. In this recipe, the model generates 12 sets of generalized weights: the combinations of the four covariates (Sepal.Length, Sepal.Width, Petal.Length, Petal.Width) with the three responses (setosa, virginica, versicolor).
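Because result.matrix is an ordinary numeric matrix, individual estimates can be pulled out by row name, and the generalized weights can be inspected directly; a short sketch:

    > # The overall training error and one estimated weight, selected by row name
    > network$result.matrix["error", 1]
    
    > network$result.matrix["Sepal.Width.to.1layhid1", 1]
    
    > # One row per training sample, one column per covariate-response pair (4 x 3 = 12)
    > dim(network$generalized.weights[[1]])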

See also

  • For a more detailed introduction to neuralnet, refer to the following paper: Günther, F., and Fritsch, S. (2010). neuralnet: Training of neural networks. The R Journal, 2(1), 30-38.

Visualizing a neural network trained by neuralnet

The neuralnet package provides the plot function to visualize a trained neural network and the gwplot function to visualize the generalized weights. In the following recipe, we will cover how to use these two functions.

Getting ready

You need to have completed the previous recipe by training a neural network and saving all the basic information in network.

How to do it…

Perform the following steps to visualize the neural network and the generalized weights:

  1. You can visualize the trained neural network with the plot function:
    > plot(network)

    Figure 10: The plot of the trained neural network

  2. Furthermore, you can use gwplot to visualize the generalized weights:
    > par(mfrow=c(2,2))
    
    > gwplot(network, selected.covariate="Petal.Width")
    
    > gwplot(network, selected.covariate="Sepal.Width")
    
    > gwplot(network, selected.covariate="Petal.Length")
    
    > gwplot(network, selected.covariate="Sepal.Length")

Figure 11: The plot of generalized weights

How it works…

In this recipe, we demonstrate how to visualize the trained neural network and the generalized weights of each trained attribute. The network plot includes the estimated weights, intercepts, and basic information about the training process. At the bottom of the figure, one can find the overall error and the number of steps required to converge.

If all the generalized weights for a covariate are close to zero on the plot, it means the covariate has little effect. However, if the overall variance is greater than one, it means the covariate has a nonlinear effect on the response.
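Rather than judging the spread by eye, you can compute the variance of each column of generalized weights directly; a minimal sketch:

    > # Column variances of the generalized weights; values well above one
    > # suggest a nonlinear effect of the corresponding covariate
    > apply(network$generalized.weights[[1]], 2, var)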

See also

  • For more information about gwplot, one can use the help function to access the following document:
    > ?gwplot

Summary

In this article, we covered how to train a neural network with the neuralnet package and how to visualize the trained network and its generalized weights. To learn more about machine learning with R, see related titles published by Packt Publishing (https://www.packtpub.com/).
