Sigmoid function and its applications

Starting with the very basics, let's understand: what is the sigmoid function?

Sigmoid is just a mathematical function given by the following expression:

σ(x) = 1 / (1 + e^(−x))

When plotted, it forms an 'S'-like curve ranging between 0 and 1:
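As a quick sanity check on the expression above, we can evaluate it at a few points (a minimal sketch; `sigmoid` here is just the formula written as a Python function):

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^(-x)) -- maps any real number into the interval (0, 1)
    return 1 / (1 + np.exp(-x))

print(sigmoid(0))    # exactly 0.5 at x = 0
print(sigmoid(10))   # close to 1 for large positive x
print(sigmoid(-10))  # close to 0 for large negative x
```

Note how the output saturates toward 0 and 1 at the extremes, which is exactly what makes it useful for mapping arbitrary values to class probabilities.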


In the realm of machine learning, we can use this function to our advantage. It can be used for binary classification problems, where one class is represented by 0 and the other by 1.

One such algorithm is Logistic Regression. Let's walk through the algorithm and see explicitly how the sigmoid plays an important role. For this, let's create an arbitrary dataset with one independent and one dependent variable.

import numpy as np, pandas as pd
import matplotlib.pyplot as plt, seaborn as sns

arr = np.arange(-10, 21)                     # integers from -10 to 20
df = pd.DataFrame({'x': arr})
# class 0 for x <= 0 (first 11 values), class 1 for x >= 1 (remaining 20)
df['y'] = list(np.zeros(11)) + list(np.ones(len(arr) - 11))
print(df.head(), '\n\n', df.tail())

Now, let’s plot this to get a visual perspective.

plt.figure(figsize=(20,8))
sns.scatterplot(x=df.x, y=df.y, hue=df.y, s=200)
plt.grid(alpha=0.8)
plt.show()

We definitely cannot fit a straight line through these data points, since there is no linear relationship. Still, let's assume a weight and an intercept and fit an arbitrary line.

eqn = 0.4 * arr - 1   # arbitrary line: y = 0.4x - 1 (weight = 0.4, intercept = -1)

plt.figure(figsize=(20,8))
sns.scatterplot(x=df.x, y=df.y, hue=df.y, s=200)
plt.plot(df.x, eqn)
plt.grid(alpha=0.8)
plt.show()

We can see that this line is practically of no use. So let's transform the line equation using the sigmoid function; i.e., in the sigmoid expression above, we replace x with the line equation (`eqn` in the code).

sig = 1/(1+np.exp(-eqn))
sig

Now let’s plot these transformed values with respect to the independent variables.

plt.figure(figsize=(20,8))
plt.plot(arr,sig, color='red',marker='o', markerfacecolor='green', markersize=13)
sns.scatterplot(x=df.x, y=df.y, hue=df.y, s=200)

for i in df.x:
    plt.axvline(i, alpha=0.2)
    
for i in sig:
    plt.axhline(i, alpha=0.2)

plt.show()

From the above graph, we can see that the straight line has been transformed into an 'S'-like sigmoid curve ranging between 0 and 1. Each data point can now be projected onto the sigmoid curve, and we can set a threshold above which the predicted class is 1. Let's assume the threshold to be 0.23.

threshold = 0.23

plt.figure(figsize=(20,8))
plt.plot(arr,sig, color='red',marker='o', markerfacecolor='green', markersize=13)
sns.scatterplot(x=df.x, y=df.y, hue=df.y, s=200)

for i in df.x:
    plt.axvline(i, alpha=0.2)
    
for i in sig:
    plt.axhline(i, alpha=0.2)

plt.axhline(threshold, color='magenta')
plt.show()

For future, unseen data, if the sigmoid-transformed value turns out to be greater than the threshold (0.23), we can claim that it belongs to class 1. By the way, logistic regression in the sklearn library by default uses 0.5 as the threshold, above which data points are classified as class 1.
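To tie this back to sklearn, here is a minimal sketch fitting `LogisticRegression` on the same toy dataset. `predict` applies sklearn's default 0.5 cutoff, while `predict_proba` exposes the raw sigmoid outputs so a custom threshold can be applied instead:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

arr = np.arange(-10, 21)
X = arr.reshape(-1, 1)                                   # sklearn expects a 2-D feature array
y = np.concatenate([np.zeros(11), np.ones(len(arr) - 11)])

model = LogisticRegression().fit(X, y)
probs = model.predict_proba(X)[:, 1]   # sigmoid outputs (probability of class 1)
preds = model.predict(X)               # default 0.5 cutoff applied internally

print(preds[:5], preds[-5:])
```

To reproduce the article's custom threshold, threshold the probabilities directly: `custom = (probs >= 0.23).astype(int)`.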

The best sigmoid curve is fit by choosing the ideal values of the weight and the intercept. The process of attaining the optimal values is similar to linear regression: gradient descent is performed by calculating gradients with respect to the corresponding loss function. The loss function for Logistic Regression is called Log Loss, which is given by the following expression:

Log Loss = −(1/N) · Σᵢ [yᵢ · log(pᵢ) + (1 − yᵢ) · log(1 − pᵢ)]

where N is the number of samples, yᵢ is the true label (0 or 1) of the i-th sample, and pᵢ is its predicted probability of belonging to class 1.
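Log Loss can be computed directly from the standard binary cross-entropy definition; a minimal hand-rolled sketch (the clipping guards against log(0) for predictions of exactly 0 or 1):

```python
import numpy as np

def log_loss_manual(y_true, y_pred, eps=1e-15):
    # clip predicted probabilities away from 0 and 1 to avoid log(0)
    p = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0.1, 0.3, 0.8, 0.9])
print(log_loss_manual(y_true, y_pred))
```

Confident, correct predictions (probabilities near the true label) give a loss near 0, while confident wrong predictions are penalized heavily; this is the quantity gradient descent minimizes when fitting the weight and intercept.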

—END—
