Artificial Intelligence #48: How do we combine statistical thinking and machine learning?
If you work with machine learning algorithms, you often find statistics daunting.
The problem is not a lack of information; the problem is how to put it all together into a concise and pragmatic whole.
Christoph Molnar has a good piece which summarises the idea of data modelling mindsets.
I summarise it below and then attempt to extend this thinking to deep learning.
Data modelling mindsets include Bayesian and frequentist statistics, machine learning and causal inference. While these approaches share common methods and models, they differ in assumptions about the data-generating process and when a model is a good generalization of the real world.
Machine learning minimizes a loss function L by finding the best function f to predict the target Y from the features X. A good machine learning model has a low loss on the test data.
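To make this concrete, here is a minimal Python sketch of the machine learning mindset. The synthetic dataset, the Ridge model and the squared loss are my own illustrative choices, not part of Molnar's summary.

```python
# Machine learning mindset: learn f to predict Y from X by minimizing a loss,
# then judge the model by its loss on held-out test data.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))                                          # features X
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=500)   # target Y

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

f = Ridge(alpha=1.0).fit(X_train, y_train)   # find f by minimizing a (regularized) squared loss
test_loss = mean_squared_error(y_test, f.predict(X_test))
print(f"test loss (MSE): {test_loss:.4f}")   # a good model has a low loss on test data
```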
Statistical inference fits the best parameters of a chosen probability distribution for variables X. A good statistical model has a high goodness-of-fit: the data fits the distribution.
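A minimal sketch of the statistical-inference mindset, assuming a normal model and using a Kolmogorov-Smirnov check as a stand-in for goodness-of-fit (both are my choices, not Molnar's):

```python
# Statistical inference mindset: assume a probability distribution for the data,
# fit its parameters, and check how well the data fit that distribution.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=10.0, scale=2.0, size=1000)   # observed data

mu_hat, sigma_hat = stats.norm.fit(x)            # maximum-likelihood parameter estimates
ks_stat, p_value = stats.kstest(x, "norm", args=(mu_hat, sigma_hat))

print(f"fitted mu={mu_hat:.2f}, sigma={sigma_hat:.2f}")
print(f"KS statistic={ks_stat:.3f}, p-value={p_value:.3f}")  # high p-value: no evidence of poor fit
```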
Bayesian inference assumes that the distribution parameters θ are themselves random variables with a prior (a priori) distribution. A good Bayesian model has a high posterior probability (Bayes factor).
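A minimal sketch of the Bayesian mindset, using a coin-flip example with a conjugate Beta prior; the prior and the counts are illustrative assumptions:

```python
# Bayesian mindset: treat the parameter theta as a random variable with a prior,
# and update it to a posterior after seeing the data.
from scipy import stats

prior_a, prior_b = 2, 2                  # Beta(2, 2) prior over theta = P(heads)
heads, tails = 30, 10                    # observed data

post_a, post_b = prior_a + heads, prior_b + tails   # conjugate Beta posterior
posterior = stats.beta(post_a, post_b)

print(f"posterior mean of theta: {posterior.mean():.3f}")
print(f"95% credible interval: {posterior.interval(0.95)}")
```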
Causal inference operates on the principles of causality, intervention and counterfactuals. A good causal model has a high goodness-of-fit and solid causal assumptions.
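A minimal sketch of why causal assumptions matter: with a confounder Z of treatment T and outcome Y, the naive association is biased, and adjusting for Z (a causal assumption encoded in the model) recovers the true effect. The simulated effect sizes are my own illustrative assumptions.

```python
# Causal mindset: the same data give a misleading answer unless the causal
# structure (here, the confounder Z) is taken into account.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=n)                          # confounder
T = 0.8 * Z + rng.normal(size=n)                # treatment influenced by Z
Y = 1.0 * T + 2.0 * Z + rng.normal(size=n)      # true causal effect of T on Y is 1.0

naive = LinearRegression().fit(T.reshape(-1, 1), Y).coef_[0]
adjusted = LinearRegression().fit(np.column_stack([T, Z]), Y).coef_[0]

print(f"naive estimate:    {naive:.2f}")        # biased upward by the confounder
print(f"adjusted estimate: {adjusted:.2f}")     # close to the true effect of 1.0
```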
He then recommends being pragmatic about the modelling choices: if you need a causal interpretation, use causal models; if only predictive performance matters, pick machine learning; if you want to include prior information about model parameters, choose Bayesian statistics.
The above represents a very pragmatic view of unifying statistical thinking and machine learning, including the various paradigms of statistical thinking.
But we could also extend it to deep learning.
Essentially, the key characteristic of deep learning is representation learning.
Deep learning itself has evolved rapidly into a few key areas.
1) Initially, autoencoders were not taken seriously, but over time we realised that their key feature is the ability to learn a representation. This became significant with variational autoencoders and then with GANs (see the autoencoder sketch after this list).
2) We then saw transformers, based on the attention mechanism. The ability to train transformers in parallel led to large language models like GPT-3 (see the attention sketch after this list).
3) Reinforcement learning continues to evolve.
4) We are also seeing multimodal learning, with models like CLIP.
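To illustrate point 1, here is a minimal PyTorch sketch of representation learning with an autoencoder: the network is trained only to reconstruct its input, and the low-dimensional bottleneck becomes the learned representation. The layer sizes, random data and training loop are illustrative assumptions.

```python
# Representation learning with a tiny autoencoder (PyTorch).
import torch
from torch import nn

class Autoencoder(nn.Module):
    def __init__(self, in_dim=64, latent_dim=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, in_dim))

    def forward(self, x):
        z = self.encoder(x)              # learned representation (the bottleneck)
        return self.decoder(z), z

model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(256, 64)                 # stand-in data
for step in range(200):
    recon, z = model(x)
    loss = loss_fn(recon, x)             # reconstruction loss; no labels needed
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final reconstruction loss:", loss.item())
print("representation shape:", tuple(z.shape))  # (256, 8) latent codes
```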
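And to illustrate point 2, a minimal numpy sketch of the scaled dot-product attention at the heart of transformers: every position attends to every other position, which is what allows the whole sequence to be processed in parallel. The tensor shapes are illustrative assumptions.

```python
# Scaled dot-product attention, the core operation of transformers.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)          # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)          # softmax over the keys
    return weights @ V                                      # weighted sum of the values

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))

out = scaled_dot_product_attention(Q, K, V)
print(out.shape)   # (5, 16): one context-aware vector per position, computed in parallel
```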
So, in many ways, these three worlds are rapidly evolving, with synergies between them: statistics, machine learning and deep learning.
Note that I say statistics is also rapidly evolving because I believe that Bayesian and causal models will play a key role in the future.
Image source: Rutgers University