Universal workflow of Machine Learning
Defining the problem and assembling a dataset
- What will your input data be?
- What are you trying to predict?
- What type of problem are you facing?
- Is it binary classification? Multi-class classification? Multi-class, multi-label classification?
- Scalar regression? Vector regression?
- Clustering?
- Generation?
- Reinforcement learning?
Choosing a measure of success
==> this guides the choice of a loss function
- Accuracy?
- Precision and recall?
- Customer retention?
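Accuracy, precision, and recall can each be computed from prediction/label pairs. A minimal sketch in plain Python (function names are illustrative, not from any particular library):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)
```

For class-imbalanced problems, precision and recall are far more informative than accuracy, which is why the metric should be chosen before the loss.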
Deciding on an evaluation protocol
- Hold-out validation set
- K-fold cross-validation
- Iterated K-fold cross-validation
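The K-fold protocol partitions the data into K folds and, in each round, validates on one fold while training on the rest. A minimal sketch (assumes the data is already shuffled; drops any remainder samples for simplicity):

```python
def k_fold_splits(data, k):
    """Yield (train, validation) pairs for K-fold cross-validation."""
    fold_size = len(data) // k
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, val
```

Iterated K-fold repeats this whole procedure several times, reshuffling the data before each run, and averages the scores.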
Preparing your data
- Format data as tensors
- Values of these tensors should be scaled to small values ([0,1] or [-1,1])
- Normalize data, if different features take different value ranges (heterogeneous data)
- Feature engineering, especially for small-data problems
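Feature-wise normalization means subtracting each feature's mean and dividing by its standard deviation, with both statistics computed on the training data only, so no information leaks from the test set. A sketch using NumPy:

```python
import numpy as np

def normalize(train, test):
    """Normalize each feature to mean 0 and std 1, using statistics
    computed on the training data only (to avoid leakage into test)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (train - mean) / std, (test - mean) / std
```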
Developing a model that does better than a baseline
The goal is to achieve statistical power: a small model that beats a trivial baseline.
- If you can’t beat a random baseline after trying multiple reasonable architectures, you may need to question your hypothesis.
- If it goes well, then make three key choices to build your first model:
- Last-layer activation
- Loss function
- Optimization configuration
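The usual pairings are: sigmoid last-layer activation with binary crossentropy for binary classification, softmax with categorical crossentropy for multi-class classification, and no activation with MSE for scalar regression. A plain-Python sketch of the two classification activations:

```python
import math

def sigmoid(x):
    """Squashes a score into (0, 1); pairs with binary crossentropy."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Turns a list of scores into a probability distribution;
    pairs with categorical crossentropy. Subtracting the max is a
    standard trick for numerical stability."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```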
Scaling up: developing a model that overfits
To find the border between overfitting and underfitting, you need to overfit first:
- Add layers
- Make the layers bigger
- Train for more epochs
Then regularize and tune the model.
Regularizing your model and tuning your hyperparameters
- Add dropout
- Add or remove layers
- Add L1/L2 regularization
- Try different numbers of units per layer
- Try different learning rates
- Add new features or remove features that are not informative
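Two of the regularizers above can be sketched in a few lines of plain Python (illustrative, not tied to any framework's API): an L2 penalty adds the scaled sum of squared weights to the loss, and dropout randomly zeroes activations at training time.

```python
import random

def l2_penalty(weights, lam=0.01):
    """L2 regularization: lam times the sum of squared weights,
    added to the loss to discourage large weights."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate=0.5, rng=random):
    """Inverted dropout: zero each activation with probability `rate`
    and scale the survivors by 1/keep so the expected sum is unchanged.
    Applied only at training time."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```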
Reference: "Deep Learning with Python", François Chollet