Linear Regression in C#

Linear Regression in C#

In many real-world problems, the goal is not only to understand data but also to make predictions based on it. Whether you are estimating house prices, forecasting sales, or analyzing trends in sensor data, you often need a mathematical model that describes the relationship between input and output variables.

Linear Regression is one of the simplest and most widely used techniques in statistics and machine learning. It assumes that there is a linear relationship between an independent variable (input) and a dependent variable (output), and it tries to find the best-fitting straight line through a set of data points.

What is Linear Regression?

Linear Regression is a predictive modeling technique that estimates the relationship between variables using a linear equation. The goal is to find the line that minimizes the difference between predicted values and actual values.

y = a * x + b

In this equation, a represents the slope of the line, while b represents the intercept. The algorithm adjusts these parameters so that the overall error between predicted and actual values is minimized.

In simple terms, Linear Regression tries to answer the question:

"If x changes, how does y change, and can we predict y for new values of x?"

Where is Linear Regression Used?

Linear Regression is used in a wide variety of fields because of its simplicity and interpretability. Even though it is a basic model, it often provides surprisingly strong baseline results for many problems.

  • Predicting house prices based on size, location, or features
  • Sales forecasting in business analytics
  • Stock market trend estimation (basic models)
  • Weather and temperature prediction
  • Performance analysis in engineering systems
  • Medical and biological data modeling

How Linear Regression Works

The core idea behind Linear Regression is to find the "best fit line" through a set of data points. This line is chosen in such a way that the total error between the predicted values and actual values is minimized.

The most commonly used method for finding this line is called the Least Squares Method. It minimizes the sum of squared differences between predicted and actual values.

Loss = Σ (y_actual - y_predicted)²

By minimizing this loss function, the algorithm finds the optimal values for slope and intercept.

Implementing Linear Regression in C# (From Scratch)

The following example demonstrates a simple implementation of Linear Regression using the Least Squares approach in C#. This implementation is intentionally kept simple for educational purposes.

using System;

public class LinearRegression
{
    public double Slope { get; private set; }
    public double Intercept { get; private set; }

    public void Fit(double[] x, double[] y)
    {
        int n = x.Length;

        double sumX = 0, sumY = 0, sumXY = 0, sumX2 = 0;

        for (int i = 0; i < n; i++)
        {
            sumX += x[i];
            sumY += y[i];
            sumXY += x[i] * y[i];
            sumX2 += x[i] * x[i];
        }

        Slope = (n * sumXY - sumX * sumY) / (n * sumX2 - sumX * sumX);
        Intercept = (sumY - Slope * sumX) / n;
    }

    public double Predict(double x)
    {
        return Slope * x + Intercept;
    }
}

This implementation calculates the best-fit line using direct statistical formulas. While it works well for small datasets, more advanced optimization techniques are used in large-scale machine learning systems.

Using ML.NET for Linear Regression

In real-world applications, developers usually prefer machine learning frameworks instead of implementing regression manually. ML.NET is Microsoft's official machine learning framework for .NET applications.

using Microsoft.ML;
using Microsoft.ML.Data;

public class HouseData
{
    public float Size { get; set; }
    public float Price { get; set; }
}

public class HousePrediction
{
    [ColumnName("Score")]
    public float Price { get; set; }
}

var context = new MLContext();

var data = context.Data.LoadFromEnumerable(new List()
{
    new HouseData { Size = 50, Price = 100 },
    new HouseData { Size = 100, Price = 200 },
    new HouseData { Size = 150, Price = 300 }
});

var pipeline = context.Transforms.Concatenate("Features", "Size")
    .Append(context.Regression.Trainers.Sdca());

var model = pipeline.Fit(data);

Popular Libraries for Regression in .NET

  • ML.NET
  • MathNet.Numerics
  • Accord.NET
  • TensorFlow.NET
  • SharpLearning

Linear Regression vs More Complex Models

Feature Linear Regression Complex ML Models
Interpretability Very High Often Low
Training Speed Fast Slower
Accuracy on Simple Data Good Good
Accuracy on Complex Data Limited High
Implementation Complexity Simple Complex

Summary

Linear Regression is one of the foundational algorithms in machine learning and statistics. Despite its simplicity, it is widely used for prediction, trend analysis, and data modeling. It provides a strong baseline for many real-world problems and is often the first step in understanding more complex machine learning techniques.

In C#, Linear Regression can be implemented manually for learning purposes or applied using powerful libraries like ML.NET for production systems. Understanding how it works internally helps developers build better intuition for more advanced machine learning models.

Contents related to 'Linear Regression in C#'

Quadratic Regression in C#
Quadratic Regression in C#