In this article by PKS Prakash and Achyutuni Sri Krishna Rao, authors of R Deep Learning Cookbook, we will learn how to perform logistic regression using TensorFlow. In this recipe, we will cover the application of TensorFlow in setting up a logistic regression model. The example uses a dataset similar to the one used in the H2O model setup.
What is TensorFlow?
TensorFlow is an open source library developed by the Google Brain Team to build numerical computation models using data flow graphs. The core of TensorFlow is written in C++, with a wrapper in Python. The tensorflow package in R gives you access to the TensorFlow API, composed of Python modules, to execute computation models. TensorFlow supports both CPU- and GPU-based computations.
The tensorflow package in R calls the Python TensorFlow API for execution, so TensorFlow must be installed in both R and Python for the package to work. The main dependencies for tensorflow are Python (2.7 or 3.x) with the TensorFlow library installed, pip for managing Python packages, and R (version 3.2 or later).
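As a minimal sketch of the setup, assuming the tensorflow package from CRAN and its install_tensorflow() helper (the exact steps depend on your platform and Python configuration):

```r
# Install the R wrapper package from CRAN
install.packages("tensorflow")

# Use the package helper to install the TensorFlow Python library
library(tensorflow)
install_tensorflow()

# Quick sanity check: build and run a trivial graph
sess <- tf$Session()
sess$run(tf$constant("Hello, TensorFlow!"))
```

If the sanity check prints the greeting, both the R and Python sides of the installation are working.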
Getting ready
The code for this section was created on Linux but can be run on any operating system. To start modeling, load the tensorflow package into the environment. R loads the default TensorFlow environment variables and also imports Python's NumPy library into the np variable:
library("tensorflow") # Load TensorFlow
np <- import("numpy") # Load numpy library
How to do it…
The data is imported using standard R functions, as shown in the following code:
# Loading input and test data
xFeatures = c("Temperature", "Humidity", "Light", "CO2", "HumidityRatio")
yFeatures = "Occupancy"
occupancy_train <- as.matrix(read.csv("datatraining.txt", stringsAsFactors = T))
occupancy_test <- as.matrix(read.csv("datatest.txt", stringsAsFactors = T))

# Subset features for modeling and transform to numeric values
occupancy_train <- apply(occupancy_train[, c(xFeatures, yFeatures)], 2, FUN = as.numeric)
occupancy_test <- apply(occupancy_test[, c(xFeatures, yFeatures)], 2, FUN = as.numeric)

# Data dimensions
nFeatures <- length(xFeatures)
nRow <- nrow(occupancy_train)

# Reset the graph
tf$reset_default_graph()

# Start an interactive session
sess <- tf$InteractiveSession()

# Set up the logistic regression graph
x <- tf$constant(unlist(occupancy_train[, xFeatures]),
                 shape = c(nRow, nFeatures), dtype = np$float32)
W <- tf$Variable(tf$random_uniform(shape(nFeatures, 1L)))
b <- tf$Variable(tf$zeros(shape(1L)))
y <- tf$matmul(x, W) + b

# Set up the cost function and optimizer
y_ <- tf$constant(unlist(occupancy_train[, yFeatures]),
                  dtype = "float32", shape = c(nRow, 1L))
cross_entropy <- tf$reduce_mean(
  tf$nn$sigmoid_cross_entropy_with_logits(labels = y_, logits = y,
                                          name = "cross_entropy"))
optimizer <- tf$train$GradientDescentOptimizer(0.15)$minimize(cross_entropy)

# Initialize variables and start the session
init <- tf$global_variables_initializer()
sess$run(init)

# Running optimization
for (step in 1:5000) {
  sess$run(optimizer)
  if (step %% 20 == 0)
    cat(step, "-", sess$run(W), sess$run(b), "==>",
        sess$run(cross_entropy), "\n")
}
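The cost minimized above is the sigmoid cross-entropy. For a logit z and a label y in {0, 1}, tf$nn$sigmoid_cross_entropy_with_logits computes the numerically stable expression max(z, 0) - z*y + log(1 + exp(-|z|)), which is algebraically equal to -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))]. A minimal base-R sketch (no TensorFlow required; the logits and labels here are made-up values) demonstrating the equivalence:

```r
# Numerically stable sigmoid cross-entropy, matching the formula used by
# tf$nn$sigmoid_cross_entropy_with_logits
sigmoid_xent <- function(z, y) {
  pmax(z, 0) - z * y + log(1 + exp(-abs(z)))
}

# Naive form for comparison: -[y*log(p) + (1-y)*log(1-p)], with p = sigmoid(z)
naive_xent <- function(z, y) {
  p <- 1 / (1 + exp(-z))
  -(y * log(p) + (1 - y) * log(1 - p))
}

z <- c(-2, -0.5, 0.5, 2)   # example logits
y <- c(0, 1, 1, 0)         # example labels
all.equal(sigmoid_xent(z, y), naive_xent(z, y))  # TRUE
mean(sigmoid_xent(z, y))   # the quantity that tf$reduce_mean averages
```

The stable form avoids overflow in exp() for large-magnitude logits, which is why the naive formula is not coded directly inside TensorFlow.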
How it works…
The performance of the model can be evaluated using AUC:
# Performance on train
library(pROC)
ypred <- sess$run(tf$nn$sigmoid(tf$matmul(x, W) + b))
roc_obj <- roc(occupancy_train[, yFeatures], as.numeric(ypred))

# Performance on test
nRowt <- nrow(occupancy_test)
xt <- tf$constant(unlist(occupancy_test[, xFeatures]),
                  shape = c(nRowt, nFeatures), dtype = np$float32)
ypredt <- sess$run(tf$nn$sigmoid(tf$matmul(xt, W) + b))
roc_objt <- roc(occupancy_test[, yFeatures], as.numeric(ypredt))
The ROC curves can be visualized using the plot.roc function from the pROC package, as shown in the screenshot following this command. The performance on the training and testing (holdout) data is very similar.
plot.roc(roc_obj, col = "green", lty = 2, lwd = 2)
plot.roc(roc_objt, add = TRUE, col = "red", lty = 4, lwd = 2)
Performance of logistic regression using TensorFlow
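The AUC reported by pROC can also be cross-checked by hand: it equals the probability that a randomly chosen positive example receives a higher predicted score than a randomly chosen negative one, which is computable from rank sums (the Mann-Whitney U statistic). A small base-R sketch with made-up labels and scores (not the occupancy data):

```r
# AUC from the Mann-Whitney U statistic; ties receive average ranks
auc_rank <- function(labels, scores) {
  r  <- rank(scores)            # average ranks on ties
  n1 <- sum(labels == 1)        # number of positives
  n0 <- sum(labels == 0)        # number of negatives
  (sum(r[labels == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

labels <- c(1, 1, 1, 0, 0)
scores <- c(0.9, 0.8, 0.3, 0.6, 0.1)
auc_rank(labels, scores)   # 5 of the 6 positive/negative pairs are ordered
                           # correctly, giving an AUC of 5/6
```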
Visualizing TensorFlow graphs
TensorFlow graphs can be visualized using TensorBoard. It is a service that utilizes TensorFlow event files to visualize TensorFlow models as graphs. Graph model visualization in TensorBoard is also used to debug TensorFlow models.
Getting ready
TensorBoard can be started using the following command in the terminal:
$ tensorboard --logdir home/log --port 6006
The major parameters for TensorBoard are --logdir, which points to the directory of TensorFlow event files to visualize, and --port, which sets the port on which the service runs (6006 by default).
The preceding command will launch the TensorBoard service on localhost at port 6006, as shown in the following screenshot:
TensorBoard
The tabs in TensorBoard capture the relevant data generated during graph execution.
How to do it…
This section covers how to visualize TensorFlow models and output in TensorBoard.
# Create Writer Obj for log
log_writer = tf$summary$FileWriter('c:/log', sess$graph)
The graph for logistic regression developed using the preceding code is shown in the following screenshot:
Visualization of the logistic regression graph in TensorBoard
# Set up cross entropy for test
nRowt <- nrow(occupancy_test)
xt <- tf$constant(unlist(occupancy_test[, xFeatures]),
                  shape = c(nRowt, nFeatures), dtype = np$float32)
yt <- tf$matmul(xt, W) + b       # test logits
ypredt <- tf$nn$sigmoid(yt)      # test predictions
yt_ <- tf$constant(unlist(occupancy_test[, yFeatures]),
                   dtype = "float32", shape = c(nRowt, 1L))
cross_entropy_tst <- tf$reduce_mean(
  tf$nn$sigmoid_cross_entropy_with_logits(labels = yt_, logits = yt,
                                          name = "cross_entropy_tst"))
# Add summary ops to collect data
w_hist = tf$summary$histogram("weights", W)
b_hist = tf$summary$histogram("biases", b)
crossEntropySummary <- tf$summary$scalar("costFunction", cross_entropy)
crossEntropyTstSummary <- tf$summary$scalar("costFunction_test", cross_entropy_tst)
# Create writer object for log
log_writer = tf$summary$FileWriter('c:/log', sess$graph)

for (step in 1:2500) {
  sess$run(optimizer)
  # Evaluate performance on training and test data every 50 iterations
  if (step %% 50 == 0) {
    ### Performance on train
    ypred <- sess$run(tf$nn$sigmoid(tf$matmul(x, W) + b))
    roc_obj <- roc(occupancy_train[, yFeatures], as.numeric(ypred))
    ### Performance on test
    ypredt <- sess$run(tf$nn$sigmoid(tf$matmul(xt, W) + b))
    roc_objt <- roc(occupancy_test[, yFeatures], as.numeric(ypredt))
    cat("train AUC: ", auc(roc_obj), " Test AUC: ", auc(roc_objt), "\n")
    # Save summaries of biases, weights, and cost functions
    log_writer$add_summary(sess$run(b_hist), global_step = step)
    log_writer$add_summary(sess$run(w_hist), global_step = step)
    log_writer$add_summary(sess$run(crossEntropySummary), global_step = step)
    log_writer$add_summary(sess$run(crossEntropyTstSummary), global_step = step)
  }
}

# Merge all summaries and write a final record to the log
summary = tf$summary$merge_all()
log_writer = tf$summary$FileWriter('c:/log', sess$graph)
summary_str = sess$run(summary)
log_writer$add_summary(summary_str, step)
log_writer$close()
Summary
In this article, we learned how to set up and evaluate a logistic regression model using TensorFlow in R, and how to visualize the model graph and training metrics in TensorBoard.