2025-05-16 00:19:43 · English original

How to Learn the Math Needed for Machine Learning | Towards Data Science

Author: Egor Howell

Math can be a scary topic for people.

Many of you want to work in machine learning, but the maths skills needed may seem overwhelming.

I am here to tell you that it’s nowhere near as intimidating as you may think, and to give you a roadmap, resources, and advice on how to learn math effectively.

Let’s get into it!

Do you need maths for machine learning?

I often get asked:

Do you need to know maths to work in machine learning?

The short answer is generally yes, but the depth and extent of maths you need to know depend on the type of role you are going for.

For a research-based role such as:

Research Engineer: an engineer who runs experiments based on research ideas.
Research Scientist: a full-time researcher working on cutting-edge models.
Applied Research Scientist: somewhere between research and industry.

you will need particularly strong maths skills.

It also depends on what company you work for. If you are a machine learning engineer, a data scientist, or in any other tech role at:

DeepMind
Microsoft AI
Meta Research
Google Research

You will also need strong maths skills because you are working in a research lab, akin to a university or college research lab.

In fact, most machine learning and AI research is done at large corporations rather than universities due to the financial costs of running models on massive data, which can be millions of pounds.

For the roles and companies I have mentioned, your maths skills will need to be at least at the level of a bachelor’s degree in a subject such as math, physics, computer science, statistics, or engineering.

However, ideally, you will have a master’s or PhD in one of those subjects, as these degrees teach the research skills needed for these research-based roles or companies.

This may sound disheartening to some of you, but it is simply what the statistics show.

According to a notebook from the 2021 Kaggle Machine Learning & Data Science Survey, the research scientist role is highly popular among PhD and doctorate holders.

In general, the higher your level of education, the more you tend to earn, and level of education tends to correlate with maths knowledge.

However, if you want to work in the industry on production projects, the math skills needed are considerably less. Many people I know working as machine learning engineers and data scientists don’t have a “target” background.

This is because industry is not so “research” intensive. It’s often about determining the optimal business strategy or decision and then implementing that into a machine-learning model.

Sometimes only a simple decision engine is required, and machine learning would be overkill.

High school maths knowledge is usually sufficient for these roles. Still, you may need to brush up on key areas, particularly for interviews or specific specialisms like reinforcement learning or time series, which are quite maths-intensive.

To be honest, the majority of roles are in industry, so the maths skills needed for most people will not be at the PhD or master’s level.

But I would be lying if I said these qualifications do not give you an advantage.

There are three core areas you need to know:

Statistics
Calculus
Linear Algebra

Statistics

I may be slightly biased, but statistics is the most important area you should know and put the most effort into understanding.

Most machine learning originated from statistical learning theory, so by learning statistics you will inherently learn machine learning, or at least its basics.

These are the areas you should study:

Descriptive Statistics: This is all about summarising and portraying your data in the best way, which is useful for general analysis and for diagnosing your models.
- Averages: Mean, Median, Mode
- Spread: Standard Deviation, Variance, Covariance
- Plots: Bar, Line, Pie, Histograms, Error Bars
Probability Distributions: This is the heart of statistics, as a distribution describes how likely each possible outcome of an event is. There are many, and I mean many, distributions, but you certainly don’t need to learn all of them.
- Normal
- Binomial
- Gamma
- Log-normal
- Poisson
- Geometric
Probability Theory: As I said earlier, machine learning is based on statistical learning, which comes from understanding how probability works. The most important concepts are:
- Maximum likelihood estimation
- Central limit theorem
- Bayesian statistics
Hypothesis Testing: Most real-world use cases of data and machine learning revolve around testing. You will test your models in production or carry out an A/B test for your customers; therefore, understanding how to run hypothesis tests is very important.
- Significance Level
- Z-Test
- T-Test
- Chi-Square Test
- Sampling
Modelling & Inference: Models like linear regression, logistic regression, polynomial regression, and regression algorithms in general originally came from statistics, not machine learning.
- Linear Regression
- Logistic Regression
- Polynomial Regression
- Model Residuals
- Model Uncertainty
- Generalised Linear Models
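
To make the hypothesis testing item above a bit more concrete, here is a minimal sketch of a two-sided, two-proportion z-test for an A/B test, using only Python's standard library. The function name `two_proportion_z_test` and the conversion counts are my own made-up illustration, not something from a real experiment, and the test assumes samples large enough for the normal approximation to be reasonable.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    Hypothetical helper for illustration; assumes large samples.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up A/B test counts, purely for illustration.
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
alpha = 0.05  # significance level chosen before running the test
print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject null: {p_value < alpha}")
```

The pieces map onto the list: the significance level is `alpha`, the z-test is the statistic `z`, and the normal distribution supplies the p-value.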

Calculus

Most machine learning algorithms learn through gradient descent in one way or another, and gradient descent has its roots in calculus.
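
As a rough illustration of why calculus matters here, the sketch below runs plain gradient descent on a one-dimensional toy loss, (w - 3)^2, whose derivative is 2(w - 3). The loss, the learning rate, and the function names are assumptions I picked for the example; real models have many parameters and use more elaborate optimisers.

```python
def loss(w):
    """Toy loss with its minimum at w = 3 (chosen only for illustration)."""
    return (w - 3.0) ** 2


def loss_gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w


w_min = gradient_descent(loss_gradient, start=0.0)
print(f"estimated minimiser: {w_min:.6f}, loss there: {loss(w_min):.2e}")
```

Each step moves `w` a little in the direction that lowers the loss, which is exactly what the derivative tells you.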

There are two main areas in calculus you should cover:

Differentiation

What is a derivative?
Derivatives of common functions.
Turning points, maxima, minima and saddle points.
Partial derivatives and multivariable calculus.
Chain and product rules.
Convex vs non-convex differentiable functions.
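
One way to practise the differentiation topics above is to work out a derivative by hand and then check it numerically. The sketch below does this for sin(x^2), where the chain rule gives 2x·cos(x^2). The function, the evaluation point, and the step size `h` are arbitrary choices of mine for the example.

```python
import math


def f(x):
    """Composite function sin(x^2), used to exercise the chain rule."""
    return math.sin(x ** 2)


def f_prime(x):
    """Hand-derived derivative via the chain rule: cos(x^2) * 2x."""
    return math.cos(x ** 2) * 2.0 * x


def numerical_derivative(func, x, h=1e-5):
    """Central finite-difference approximation of the derivative at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


x = 1.3
analytic = f_prime(x)
numeric = numerical_derivative(f, x)
print(f"analytic: {analytic:.8f}, numeric: {numeric:.8f}")
```

If the two numbers disagree noticeably, the hand derivation is the first place to look.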

Integration

What is integration?
Integration by parts and substitution.
The integral of common functions.
Integration of areas and volumes.
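
In the same spirit, an integral worked out by hand can be checked numerically. This is a small sketch using the trapezoid rule to approximate the area under x^2 between 0 and 1, which is exactly 1/3. The helper name `trapezoid` and the number of slices are assumptions made for the example.

```python
def trapezoid(func, a, b, n=1000):
    """Approximate the integral of func over [a, b] using n trapezoids."""
    h = (b - a) / n
    # End points count half; interior points count fully.
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h


approx = trapezoid(lambda x: x ** 2, 0.0, 1.0)
exact = 1.0 / 3.0  # from the antiderivative x^3 / 3 evaluated on [0, 1]
print(f"trapezoid estimate: {approx:.8f}, exact: {exact:.8f}")
```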

Linear Algebra

Linear algebra is used everywhere in machine learning, and a lot in deep learning. Most models represent data and features as matrices and vectors.
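
To show what representing data as matrices and vectors can look like, here is a minimal sketch of a linear model's predictions written as a matrix-vector product in plain Python. The data, the weights, and the helper names `dot` and `mat_vec` are invented for illustration; in practice you would normally use a numerical library instead.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))


def mat_vec(matrix, vector):
    """Matrix-vector product: one dot product per row of the matrix."""
    return [dot(row, vector) for row in matrix]


# Each row is one example, each column is one feature (made-up numbers).
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
w = [0.5, -1.0]  # one weight per feature

predictions = mat_vec(X, w)
print(predictions)
```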

Vectors
- What are vectors
- Magnitude, direction
- Dot product
- Vector product
- Vector operations (addition, subtraction, etc)
Matrices
- What is a matrix
- Trace
- Inverse
- Transpose
- Determinants
- Dot product
- Matrix decomposition
Eigenvalues & Eigenvectors
- Finding eigenvectors
- Eigenvalue decomposition
- Spectrum analysis
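
As a small worked example that ties the trace, the determinant, and eigenvalues together, the sketch below finds the eigenvalues of a symmetric 2x2 matrix from its characteristic polynomial and then checks that A·v = λ·v. The function name and the example matrix are my own choices, and the eigenvector formula used here assumes the off-diagonal entry is non-zero.

```python
from math import sqrt


def eigen_2x2_symmetric(m):
    """Eigenvalues and eigenvectors of a symmetric 2x2 matrix.

    Illustrative helper; assumes the off-diagonal entry is non-zero.
    """
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # Roots of the characteristic polynomial: lambda^2 - trace*lambda + det = 0.
    disc = sqrt(trace ** 2 - 4.0 * det)  # non-negative for a symmetric matrix
    eigenvalues = [(trace + disc) / 2.0, (trace - disc) / 2.0]
    # Solving (A - lambda*I) v = 0 gives v = (b, lambda - a) when b != 0.
    eigenvectors = [[b, lam - a] for lam in eigenvalues]
    return eigenvalues, eigenvectors


A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalues, eigenvectors = eigen_2x2_symmetric(A)

for lam, v in zip(eigenvalues, eigenvectors):
    # Multiply A by v and compare with lambda times v.
    Av = [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]
    print(f"lambda = {lam}, v = {v}, A v = {Av}")
```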

There are loads of resources, and it really comes down to your learning style.

If you are after textbooks, then you can’t go wrong with the following, which are pretty much all you need:

Practical Statistics for Data Scientists: I recommend this book all the time, and for good reason. It is the only textbook you realistically need to learn the statistics for data science and machine learning.
Mathematics for Machine Learning: As the name implies, this textbook will teach you the maths for machine learning. A lot of the information in this book may be overkill, but your maths skills will be excellent if you study everything.

If you want an online course, I have heard good things about the following one.

Mathematics for Machine Learning and Data Science Specialisation: This course is by DeepLearning.AI, the same people who made the Machine Learning Specialisation, arguably the best machine learning course.

Learning Advice

The amount of maths content you need to learn may seem overwhelming, but don’t worry.

The main thing is to break it down step by step.

Pick one of the three: statistics, linear algebra or calculus.

Look at the topics I listed above for that area and choose one resource. It doesn’t have to be any of the ones I recommended above.

That’s the initial work done. Don’t overcomplicate things by looking for the “best resource”, because such a thing doesn’t exist.

Now, start working through the resource, but don’t just blindly read the text or watch the videos.

Actively take notes and document your understanding. I personally write blog posts, which essentially employ the Feynman technique, as I am, in a way, “teaching” others what I know.

Writing blogs may be too much for some people, so just make sure you have good notes, either physically or digitally, that are in your own words and that you can reference later.

The learning process is generally quite simple, and there have been studies done on how to do it effectively. The general gist is:

Do a little bit every day
Review old concepts frequently (spaced repetition)
Document your learning

It’s all about the process; follow it, and you will learn!



Summary

Many aspiring machine learning professionals are intimidated by the mathematical prerequisites required for the field, but it's not as daunting as it seems. This article provides a roadmap and advice on how to effectively learn mathematics necessary for machine learning careers.

It breaks down whether you need math based on your role (research or industry) and company type, highlighting that research-based roles at major tech companies typically require advanced mathematical skills akin to those gained from master’s or PhD programs in fields like math, physics, computer science, statistics, or engineering. For industrial positions focused on practical applications, foundational high school-level mathematics often suffices, though some areas like reinforcement learning may necessitate more advanced knowledge.

The three core areas of focus for machine learning are:

1. Statistics: covering descriptive statistics, probability distributions, hypothesis testing, and modeling.
2. Calculus: focusing on differentiation and integration.
3. Linear Algebra: including vectors, matrices, eigenvalues, and eigenvectors.

Resources like textbooks and online courses are recommended to aid in this learning process, alongside practical advice on breaking down the material into manageable chunks, actively engaging with the content through note-taking or blogging, and employing techniques such as spaced repetition for effective long-term retention.
