
With ever-growing data and its accessibility, the applications of Machine Learning are rapidly rising across various industries. However, the supply of trained data scientists has yet to keep pace with the growing ML needs of businesses. Despite abundant resources and software libraries that make building ML models easier, it takes time and experience for a data scientist or ML engineer to master these skill sets.

The need for machine learning is everywhere, and many production enterprise applications are written in C# using tools such as Visual Studio, SQL Server, and Microsoft Azure. Machine Learning with C# blends an understanding of machine learning concepts and techniques with the machine learning tools available for adding intelligent features, such as image and motion detection, Bayes intuition, and deep learning, to C# .NET applications.

This tutorial is an excerpt taken from the book C# Machine Learning Projects written by Yoon Hyup Hwang. In this book, you will learn how to choose a model for your problem, how to evaluate the performance of your models, and how you can use C# to build machine learning models for your future projects.

In today’s post, we will learn how to set up a C# environment for Machine Learning. We will first install and set up Visual Studio and then do the same for two packages (Accord.NET and Deedle).

Setting up Visual Studio for C#

Assuming you have some prior knowledge of C#, we will keep this part brief. In case you need to install Visual Studio for C#, go to https://www.visualstudio.com/downloads/ and download one of the versions of Visual Studio. In this article, we use the Community Edition of Visual Studio 2017. If it prompts you to download .NET Framework before you install Visual Studio, go to https://www.microsoft.com/en-us/download/details.aspx?id=53344 and install it first.

Installing Accord.NET

Accord.NET is a .NET ML framework. On top of machine learning packages, the Accord.NET framework also has mathematics, statistics, computer vision, computer audition, and other scientific computing modules. We are mainly going to use the ML package of the Accord.NET framework.

Once you have installed and set up your Visual Studio, let’s start installing the ML framework for C#, Accord.NET. It is easiest to install it through NuGet. To install it, open the package manager (Tools | NuGet Package Manager | Package Manager Console) and install Accord.MachineLearning and Accord.Controls by typing in the following commands:

PM> Install-Package Accord.MachineLearning
PM> Install-Package Accord.Controls

Now, let’s build a sample ML application using these Accord.NET packages. Open Visual Studio and create a new Console Application under the Visual C# category. Use the preceding commands to install the Accord.NET packages through NuGet and add references to the project. You should see some Accord.NET packages added to your references in the Solution Explorer, and the result should look something like the following screenshot:

The model we are going to build now is a very simple logistic regression model. Given two-dimensional arrays and an expected output, we are going to develop a program that trains a logistic regression classifier and then plots the results, showing the expected output and the actual predictions made by this model. The input and output for this model look like the following:

The code for this sample logistic regression classifier is as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Accord.Controls;
using Accord.Statistics;
using Accord.Statistics.Models.Regression;
using Accord.Statistics.Models.Regression.Fitting;

namespace SampleAccordNETApp
{
    class Program
    {
        static void Main(string[] args)
        {
            double[][] inputs =
            {
                new double[] { 0, 0 },
                new double[] { 0.25, 0.25 },
                new double[] { 0.5, 0.5 },
                new double[] { 1, 1 },
            };

            int[] outputs =
            {
                0,
                0,
                1,
                1,
            };

            // Train a Logistic Regression model
            var learner = new IterativeReweightedLeastSquares<LogisticRegression>()
            {
                MaxIterations = 100
            };
            var logit = learner.Learn(inputs, outputs);

            // Predict output
            bool[] predictions = logit.Decide(inputs);

            // Plot the results
            ScatterplotBox.Show("Expected Results", inputs, outputs);
            ScatterplotBox.Show("Actual Logistic Regression Output", inputs, predictions.ToZeroOne());

            Console.ReadKey();
        }
    }
}

Once you are done writing this code, you can run it by hitting F5 or clicking on the Start button on top. If everything runs smoothly, it should produce the two plots shown in the following figure. If it fails, check for references or typos. You can always right-click on the class name or the light bulb icon to make Visual Studio help you find which packages are missing from the namespace references:

Plots produced by the sample program. Left: actual prediction results, right: expected output

This sample code can be found at the following link: https://github.com/yoonhwang/c-sharp-machine-learning/blob/master/ch.1/SampleAccordNETApp.cs.
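
The plots are one way to compare the expected output with the model's predictions. As a minimal sketch, assuming Accord.NET 3.x, the same comparison can also be printed to the console together with the probability the model assigns to class 1. The class name InspectModel is hypothetical and is not part of the book's sample code; the training data and learner settings are copied from the sample above.

```csharp
using System;
using Accord.Statistics.Models.Regression;
using Accord.Statistics.Models.Regression.Fitting;

namespace SampleAccordNETApp
{
    // Hypothetical helper class, for illustration only
    class InspectModel
    {
        static void Main(string[] args)
        {
            // Same toy inputs and expected outputs as the sample above
            double[][] inputs =
            {
                new double[] { 0, 0 },
                new double[] { 0.25, 0.25 },
                new double[] { 0.5, 0.5 },
                new double[] { 1, 1 },
            };
            int[] outputs = { 0, 0, 1, 1 };

            // Train the logistic regression model as before
            var learner = new IterativeReweightedLeastSquares<LogisticRegression>()
            {
                MaxIterations = 100
            };
            LogisticRegression logit = learner.Learn(inputs, outputs);

            // Hard class decisions and the estimated probability of class 1
            bool[] predictions = logit.Decide(inputs);
            double[] probabilities = logit.Probability(inputs);

            // Print one line per sample so the columns can be compared by eye
            for (int i = 0; i < inputs.Length; i++)
            {
                Console.WriteLine(
                    "x = ({0}, {1})  expected = {2}  predicted = {3}  P(class 1) = {4:F3}",
                    inputs[i][0], inputs[i][1], outputs[i],
                    predictions[i] ? 1 : 0, probabilities[i]);
            }

            Console.ReadKey();
        }
    }
}
```

This version only needs the Accord.MachineLearning package, since it does not open any plot windows.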

Installing Deedle

Deedle is an open source .NET library for data frame programming. Deedle lets you do data manipulation in a way that is similar to R data frames and pandas data frames in Python. We will be using this package to load and manipulate the data for our machine learning projects.

Similar to how we installed Accord.NET, we can install the Deedle package from NuGet. Open the package manager (Tools | NuGet Package Manager | Package Manager Console) and install Deedle using the following command:

PM> Install-Package Deedle

Let’s briefly look at how we can use this package to load data from a CSV file and do simple data manipulations. For more information, you can visit http://bluemountaincapital.github.io/Deedle/ for API documentation and sample code. We are going to use daily AAPL stock price data from 2010 to 2013 for this exercise. You can download this data from the following link: https://github.com/yoonhwang/c-sharp-machine-learning/blob/master/ch.1/table_aapl.csv.

Open Visual Studio and create a new Console Application under the Visual C# category. Use the preceding command to install the Deedle library through NuGet and add references to your project. You should see the Deedle package added to your references in the Solution Explorer.

Now, we are going to load the CSV data into a Deedle data frame and then do some data manipulations. First, we are going to update the index of the data frame with the Date field. Then, we are going to apply some arithmetic operations on the Open and Close columns to calculate the percentage changes from open to close prices. Lastly, we will calculate daily returns by taking the differences between the close and the previous close prices, dividing them by the previous close prices, and then multiplying the result by 100. The code for this sample Deedle program is shown as follows:

using Deedle;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace DeedleApp
{
    class Program
    {
        static void Main(string[] args)
        {
            // Read AAPL stock prices from a CSV file
            var root = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.FullName;
            var aaplData = Frame.ReadCsv(Path.Combine(root, "table_aapl.csv"));
            // Print the data
            Console.WriteLine("-- Raw Data --");
            aaplData.Print();

            // Set Date field as index
            var aapl = aaplData.IndexRows<String>("Date").SortRowsByKey();
            Console.WriteLine("-- After Indexing --");
            aapl.Print();

            // Calculate percent change from open to close
            var openCloseChange =
                ((
                    aapl.GetColumn<double>("Close") - aapl.GetColumn<double>("Open")
                ) / aapl.GetColumn<double>("Open")) * 100.0;
            aapl.AddColumn("openCloseChange", openCloseChange);
            Console.WriteLine("-- Simple Arithmetic Operations --");
            aapl.Print();

            // Shift close prices by one row and calculate daily returns:
            // (close - previous close) / previous close * 100
            var previousClose = aapl.GetColumn<double>("Close").Shift(1);
            var dailyReturn = aapl.Diff(1).GetColumn<double>("Close") / previousClose * 100.0;
            aapl.AddColumn("dailyReturn", dailyReturn);
            Console.WriteLine("-- Shift --");
            aapl.Print();

            Console.ReadKey();
        }
    }
}

When you run this code, you will see the following outputs.

The raw dataset looks like the following:

After indexing this dataset with the date field, you will see the following:

After applying simple arithmetic operations to compute the change rate from open to close, you will see the following:

Finally, after shifting close prices by one row and computing daily returns, you will see the following:

As you can see from this sample Deedle project, we can run various data manipulation operations with one or two lines of code, where it would have required more lines of code to apply the same operations using native C#. We will use the Deedle library frequently throughout this book for data manipulation and feature engineering.
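
For comparison, here is a rough sketch of what the daily return calculation might look like in native C#, without Deedle. The DailyReturns helper and the close prices are made up for illustration and are not part of the book's code. The first entry has no previous close, so this sketch marks it as NaN.

```csharp
using System;

namespace DailyReturnSketch
{
    class Program
    {
        // Percentage change from the previous close to the current close.
        // The first entry has no previous close, so it is set to NaN.
        static double[] DailyReturns(double[] closes)
        {
            var returns = new double[closes.Length];
            for (int i = 0; i < closes.Length; i++)
            {
                returns[i] = i == 0
                    ? double.NaN
                    : (closes[i] - closes[i - 1]) / closes[i - 1] * 100.0;
            }
            return returns;
        }

        static void Main(string[] args)
        {
            // Made-up close prices, for illustration only
            double[] closes = { 100.0, 110.0, 99.0 };
            double[] returns = DailyReturns(closes);

            // Print each close price next to its daily return
            for (int i = 0; i < closes.Length; i++)
            {
                Console.WriteLine("close = {0:F2}  dailyReturn = {1:F2}", closes[i], returns[i]);
            }
        }
    }
}
```

With these made-up prices, the move from 100 to 110 is a 10% gain and the move from 110 to 99 is a 10% loss. Aligning rows by date, handling missing values, and attaching the result as a new column would all need additional hand-written code here.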

This sample Deedle code can be found at the following link: https://github.com/yoonhwang/c-sharp-machine-learning/blob/master/ch.1/DeedleApp.cs.

Thus, in this post, we walked you through how to set up a C# environment for our future ML projects. We built a simple logistic regression classifier using the Accord.NET framework and used the Deedle library to load and manipulate the data.

If you enjoyed reading this post, do check out C# Machine Learning Projects to put your skills into practice and implement your machine learning knowledge in real projects.
