Universal workflow of Machine Learning
Defining the problem and assembling a dataset
- What will your input data be?
- What are you trying to predict?
- What type of problem are you facing?
- Is it binary classification? Multi-class classification? Multi-class, multi-label classification?
- Scalar regression? Vector regression?
- Clustering?
- Generation?
- Reinforcement learning?
Choosing a measure of success
==> this guides the choice of a loss function
- Accuracy?
- Precision and recall?
- Customer retention?
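Accuracy, precision, and recall can each be computed from prediction/label pairs. A minimal sketch in plain Python (function names are illustrative, not from any particular library):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fp), tp / (tp + fn)
```

For class-imbalanced problems, precision and recall are far more informative than accuracy, which is why the metric should be chosen before the loss.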
Deciding on an evaluation protocol
- Hold-out validation set
- K-fold cross-validation
- Iterated K-fold cross-validation
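The K-fold protocol partitions the data into K folds and, in each round, validates on one fold while training on the rest. A minimal sketch (assumes the data is already shuffled; drops any remainder samples for simplicity):

```python
def k_fold_splits(data, k):
    """Yield (train, validation) pairs for K-fold cross-validation."""
    fold_size = len(data) // k
    for i in range(k):
        val = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        yield train, val
```

Iterated K-fold repeats this whole procedure several times, reshuffling the data before each run, and averages the scores.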
Preparing your data
- Format data as tensors
- Values of these tensors should be scaled to small values ([0,1] or [-1,1])
- Normalize data, if different features take different value ranges (heterogeneous data)
- Feature engineering, especially for small-data problems
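Feature-wise normalization means subtracting each feature's mean and dividing by its standard deviation, with both statistics computed on the training data only, so no information leaks from the test set. A sketch using NumPy:

```python
import numpy as np

def normalize(train, test):
    """Normalize each feature to mean 0 and std 1, using statistics
    computed on the training data only (to avoid leakage into test)."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return (train - mean) / std, (test - mean) / std
```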
Developing a model that does better than a baseline
The goal is to achieve statistical power: a small model that beats a trivial baseline.
- If you can’t beat a random baseline after trying multiple reasonable architectures, you may need to question your hypothesis.
- If it goes well, then make three key choices to build your first model:
- Last-layer activation
- Loss function
- Optimization configuration
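The usual pairings are: sigmoid last-layer activation with binary crossentropy for binary classification, softmax with categorical crossentropy for multi-class classification, and no activation with MSE for scalar regression. A plain-Python sketch of the two classification activations:

```python
import math

def sigmoid(x):
    """Squashes a score into (0, 1); pairs with binary crossentropy."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    """Turns a list of scores into a probability distribution;
    pairs with categorical crossentropy. Subtracting the max is a
    standard trick for numerical stability."""
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```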
Scaling up: developing a model that overfits
To find the border between overfitting and underfitting, you need to overfit first:
- Add layers
- Make the layers bigger
- Train for more epochs
Then regularize and tune the model.
Regularizing your model and tuning your hyperparameters
- Add dropout
- Add or remove layers
- Add L1/L2 regularization
- Try different numbers of units per layer
- Try different learning rates
- Add new features or remove features that are not informative
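Two of the regularizers above can be sketched in a few lines of plain Python (illustrative, not tied to any framework's API): an L2 penalty adds the scaled sum of squared weights to the loss, and dropout randomly zeroes activations at training time.

```python
import random

def l2_penalty(weights, lam=0.01):
    """L2 regularization: lam times the sum of squared weights,
    added to the loss to discourage large weights."""
    return lam * sum(w * w for w in weights)

def dropout(activations, rate=0.5, rng=random):
    """Inverted dropout: zero each activation with probability `rate`
    and scale the survivors by 1/keep so the expected sum is unchanged.
    Applied only at training time."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```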
Reference: "Deep Learning with Python", François Chollet