Logistic Regression in Plain English

3 minute read | May 17, 2019
Written by: T.J. DeGroat

Logistic regression is a fairly simple yet very powerful algorithm used in data science and machine learning. It is a statistical algorithm that classifies data into one of two outcome categories by fitting an S-shaped logistic curve that estimates how likely each outcome is.

In the typical case, you’re trying to decide for a given data point whether a predicate is true or false. For example, if the data point is a credit card transaction, the predicate might be “this is a valid transaction” versus “this is a fraudulent transaction.” If the data point is a blood test, the predicate might be “this person will react well to this drug” versus “this person will react adversely.” And if the data point is an email message, the predicate might be “this is a real message” versus “this is spam.”

We always start with examples of data points, called training examples. If the predicate is true for an example, we say it’s a positive example; when the predicate is false, it’s a negative example.

The attributes of a given data point are represented numerically and can be imagined as a geometric point in space, like the picture below, where positive (blue) examples are geometrically different from negative (red) examples.
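As a minimal sketch of what "represented numerically" can mean, the snippet below turns one credit card transaction into a list of numbers. The field names (`amount_usd`, `hour_of_day`, `is_foreign`) are hypothetical and chosen only for illustration; a real fraud model would use many more attributes.

```python
from typing import List, NamedTuple


class Transaction(NamedTuple):
    # Hypothetical attributes of one credit card transaction.
    amount_usd: float
    hour_of_day: int
    is_foreign: bool


def to_point(tx: Transaction) -> List[float]:
    """Encode a transaction as a point in 3-dimensional space."""
    return [tx.amount_usd, float(tx.hour_of_day), 1.0 if tx.is_foreign else 0.0]


point = to_point(Transaction(amount_usd=42.5, hour_of_day=14, is_foreign=False))
print(point)  # [42.5, 14.0, 0.0]
```

Each transaction becomes one point, and a training set becomes a cloud of such points.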

The whole game is then:

1) Learn how to geometrically distinguish positive examples from negative examples when given a training set of data points.

2) After you’ve learned, you can make a prediction on a new, as-yet-unseen data point. You just see which of the two groups of points the new point belongs in.

[Figure: logistic regression example, with positive (blue) and negative (red) training points separated by a green sheet.]

So in this picture of training examples, blue points might each represent a valid credit card transaction, and red points might represent a fraudulent transaction.

1) The learning part of the game is to find a placement for the green sheet in the picture that best splits the positive and negative examples in space. This is done mathematically, by computing the orientation and position of the green sheet relative to the example points so as to best separate the two groups.
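One common way to do this computation is gradient descent on the log loss; the article does not say which method is used, so treat the following as a sketch under that assumption. Here the weights `w` play the role of the sheet's orientation and the bias `b` its position. The toy points, the learning rate, and the number of epochs are made up for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(point, w, b):
    """Estimated probability that the predicate is true for this point."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, point)) + b)


def fit(points, labels, learning_rate=0.1, epochs=5000):
    """Find weights w (orientation) and bias b (position) by gradient descent."""
    n, d = len(points), len(points[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(points, labels):
            error = predict_proba(x, w, b) - y  # positive if we overshoot
            for j in range(d):
                grad_w[j] += error * x[j]
            grad_b += error
        # Step against the average gradient of the log loss.
        w = [wi - learning_rate * g / n for wi, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


# Toy training set: label 1 = positive (blue), label 0 = negative (red).
points = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [4.0, 4.0], [4.5, 5.0], [5.0, 4.0]]
labels = [0, 0, 0, 1, 1, 1]

w, b = fit(points, labels)
print("orientation:", w, "position:", b)
```

In practice you would normally reach for a library implementation rather than writing the loop by hand; the point of the sketch is only that "placing the sheet" means choosing a handful of numbers.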

Once you’ve placed the green sheet, positive examples fall on one side of the sheet and negative examples fall on the other side, at least as far as the data allows. Even better, our confidence in, say, the validity of a given credit card transaction is just how far into valid territory the point is: blue points far from the green sheet on the valid side are more confidently valid.
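To make "which side" and "how far" concrete, here is a small sketch that computes the signed distance from a point to the sheet. The values of `w` and `b` below are hypothetical, standing in for whatever the learning step produced: a positive result means the point is on the positive side, and a larger magnitude means it sits deeper on that side.

```python
import math


def signed_distance(point, w, b):
    """Distance from the point to the sheet w.x + b = 0, with a sign for the side."""
    score = sum(wi * xi for wi, xi in zip(w, point)) + b
    return score / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical sheet, not learned from real data.
w, b = [1.0, 1.0], -5.0

deep_positive = signed_distance([4.0, 4.0], w, b)    # well inside the positive side
barely_negative = signed_distance([2.0, 2.5], w, b)  # just across the sheet
print(deep_positive, barely_negative)
```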

2) The prediction part then just takes a new, unseen point and asks (a) which side of the green sheet is this point on, and (b) how deep on that side is it? This gives us the true or false prediction and a sense of how confident we are in that prediction.

In logistic regression, the result of the prediction is always represented as a number between 0 and 1, where 0 means the predicate is false for a data point, 1 means the predicate is true, and 0.5 means we can’t really say, which is just when the point falls on the green sheet.
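The function that turns "how deep on which side" into a number between 0 and 1 is the logistic (sigmoid) function, which gives the method its name. A minimal sketch, where `score` stands for the point's signed position relative to the sheet (weights times attributes, plus bias):

```python
import math


def sigmoid(score):
    """Map any real-valued score to a number strictly between 0 and 1."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)  # this branch avoids overflow for very negative scores
    return e / (1.0 + e)


print(sigmoid(0.0))   # 0.5, the point lies exactly on the sheet
print(sigmoid(3.0))   # above 0.95, far into the positive side
print(sigmoid(-3.0))  # below 0.05, far into the negative side
```

Scores far from zero in either direction push the output toward 1 or 0, while scores near zero stay close to 0.5.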

A little bit in one direction, something like 0.55, means that we predict true with low confidence; a little bit the other way, like 0.45, means we predict false with low confidence. A point far into the blue side yields a prediction like 0.95 and means we’re really confident it’s true; a point far into the red side yields a prediction like 0.05 and means we’re really confident it’s false.
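Reading a prediction this way can be sketched as a small helper that returns the true-or-false call together with how confident it is. The name `describe` and the choice to treat exactly 0.5 as "true" are assumptions made for the example; in a real system the cutoff is a judgment call.

```python
def describe(p, threshold=0.5):
    """Turn a predicted number p in [0, 1] into (prediction, confidence)."""
    prediction = p >= threshold
    # Confidence is the number itself for "true", its complement for "false".
    confidence = p if prediction else 1.0 - p
    return prediction, confidence


for p in (0.55, 0.45, 0.95, 0.05):
    prediction, confidence = describe(p)
    print(p, "->", prediction, "with confidence", round(confidence, 2))
```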

This is called supervised learning because you provide labeled training data first (examples already marked as positive or negative), learn how to distinguish them properly, and are then ready to make predictions on new data. Because so many practical questions come down to a true-or-false prediction with a confidence attached, logistic regression is a staple of data science and supervised machine learning, and a strong foundation in it is very important for an aspiring data scientist.

Thanks to Todd Cass for this explanation of logistic regression.

About T.J. DeGroat

T.J. is a writer and editor waging war against unnecessary capitalization. You can follow him on Twitter @tjdegroat.

