How ML Differs from Statistics
Classical statistics, as taught in university undergraduate and even graduate courses, starts with descriptive statistics, moves into distribution fitting, and goes all the way to complex multivariate analysis, essentially covering hypothesis testing, correlation, regression, factor analysis, and principal component analysis (PCA).
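To make the contrast concrete, here is a minimal sketch of that classical workflow (descriptive statistics, a hypothesis test, then PCA). The choice of SciPy and scikit-learn and the toy data are my assumptions; the post itself names no libraries or datasets.

```python
# Illustrative classical-statistics workflow: describe, test a hypothesis, reduce dimensions.
import numpy as np
from scipy import stats
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)
group_a = rng.normal(loc=5.0, scale=1.0, size=100)    # toy samples, assumed for illustration
group_b = rng.normal(loc=5.4, scale=1.0, size=100)

print("Means:", group_a.mean(), group_b.mean())        # descriptive statistics
t_stat, p_value = stats.ttest_ind(group_a, group_b)    # hypothesis test
print("t-test p-value:", p_value)

X = rng.normal(size=(100, 5))                          # toy multivariate data
pca = PCA(n_components=2).fit(X)
print("Variance explained by 2 PCs:", pca.explained_variance_ratio_.sum())
```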
Statistics assumes a lot of a priori knowledge about the data and its properties, and it does not involve much trial and error, or even tinkering.
New-age machine learning, by contrast, covers a wide array of techniques and algorithms that learn directly from the data. Deep learning, supervised learning, and reinforcement learning all offer algorithms that learn on their own from large amounts of data: data becomes the input and the model becomes the output, largely without human intervention (except in supervised learning, which still needs labeled examples). This is the real beauty of ML over conventional statistics. Although new-age ML (covering CNNs, deep learning, and reinforcement learning) draws heavily on statistics, cognitive biology, neuroscience, mathematics, and control theory, most ML applications are quite new and have had a large technical and business impact.
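A minimal "data in, model out" sketch makes the point; scikit-learn and the synthetic dataset are assumptions for illustration, not something the post prescribes.

```python
# Data goes in, a fitted model comes out; the human only picks the model family.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)           # data in ...
print(model.score(X_test, y_test))    # ... model (and its accuracy) out
```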
In reinforcement learning, classical optimization functions are used, and the behaviorism pioneered in psychology by Skinner comes into play through "reward and punishment": the behaviour of an RL algorithm is shaped the same way a child's behaviour is shaped by its parents. Dynamic programming from classical optimization (operations research) is then used, together with Bellman's optimality conditions and the Markov Decision Process (MDP) framework.
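Here is a minimal value-iteration sketch that applies the Bellman optimality backup to a toy MDP. The 3-state, 2-action problem, its transition probabilities, and its rewards are all made up for illustration.

```python
# Value iteration on a hypothetical 3-state, 2-action MDP (assumed numbers).
import numpy as np

n_states, n_actions, gamma = 3, 2, 0.9
# P[s, a, s'] = transition probability; R[s, a] = immediate reward (toy values).
P = np.array([[[0.8, 0.2, 0.0], [0.1, 0.0, 0.9]],
              [[0.0, 0.9, 0.1], [0.5, 0.5, 0.0]],
              [[0.0, 0.0, 1.0], [0.3, 0.0, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [5.0, 0.0]])

V = np.zeros(n_states)
for _ in range(200):
    # Bellman optimality backup: V(s) = max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
    Q = R + gamma * P @ V            # shape (n_states, n_actions)
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print("Optimal state values:", V)
print("Greedy policy:", Q.argmax(axis=1))
```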
RL lets you start "learning" with minimal domain or problem knowledge. The algorithm has the power to learn and arrive at its own parameters based on error signals and reward optimization. Algorithms such as temporal-difference (TD) learning, Deep Q-Learning, and actor-critic methods (e.g. A3C) give RL truly domain-independent ways to learn in many new domains without requiring domain knowledge.
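A tabular Q-learning sketch (one form of temporal-difference learning) shows how an agent improves from reward alone, with no domain model. Using Gymnasium's FrozenLake environment here is my assumption purely for illustration.

```python
# Tabular Q-learning: learn from rewards only, via the TD update rule.
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1", is_slippery=False)
Q = np.zeros((env.observation_space.n, env.action_space.n))
alpha, gamma, epsilon = 0.1, 0.99, 0.1

for episode in range(2000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy exploration
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # TD update: nudge Q(s,a) toward reward + gamma * max_a' Q(s',a')
        td_target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state

print("Learned greedy policy:", Q.argmax(axis=1))
```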
The ML tribe (the collection of AI scientists, data analysts, ML practitioners, students, professors, and industry professionals) differs from old-school statistics in many ways. Statistics assumes a lot of knowledge about the system; statistical thinking is in many ways top-down, a priori thinking. ML thinking (the broad umbrella of algorithms in RL and deep learning) is inherently a posteriori, does not assume much, and is bottom-up. As Richard Dawkins puts it, Darwinian thinking is a mindless, purposeless, bottom-up process involving R&D, trial and error, and tinkering all the way. ML resembles our own biological evolution: just as life evolves, ML algorithms are also evolving. The big advantage is that ML algorithms evolve much faster than gradual, slow biological evolution.
ML works a lot like biological processes seen elsewhere in nature. It does not always try to optimize in the classical sense (finding the best possible solution in a large solution space). Instead, it follows a process of sophisticated tinkering: find one sub-optimal solution, then move ahead from there. This keeps learning continuous and, in many ways, autonomous.
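An illustrative hill-climbing sketch captures this "tinkering" style of search; it is my own toy example under an assumed objective, not a method named in the post. The algorithm keeps a sub-optimal solution and accepts any small random change that improves it.

```python
# Keep a sub-optimal solution and repeatedly accept small improving tweaks.
import random

def score(x):
    # Toy objective to maximize; any black-box function would do here.
    return -(x - 3.0) ** 2

x = random.uniform(-10, 10)                 # arbitrary, sub-optimal starting guess
for step in range(10_000):
    candidate = x + random.gauss(0, 0.1)    # small random tweak ("tinkering")
    if score(candidate) > score(x):         # keep the change only if it helps
        x = candidate

print(f"Found x ~ {x:.3f} (true optimum is 3.0)")
```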
Statistics used to require careful sampling, and meticulously planned data cleaning would often precede a rigorous statistical analysis. ML works with the data that already exists and tries to draw inferences from it.
One family of ML algorithms, Bayesian inference, couples basic Bayes' probability with state-space generators such as Monte Carlo simulation, so that you can create simulated data where real data is non-existent or inaccurate. In this way, ML algorithms build a kind of robustness against data-quality problems.
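A minimal Bayes-plus-Monte-Carlo sketch illustrates the idea: infer a coin's bias from a handful of observed flips, then draw Monte Carlo samples from the posterior to simulate additional data. The Beta-Bernoulli model and the tiny dataset are assumptions chosen for simplicity.

```python
# Bayesian inference on scarce data, then Monte Carlo simulation of new data.
import numpy as np

rng = np.random.default_rng(0)

observed = np.array([1, 0, 1, 1, 0, 1, 1])    # small, possibly noisy dataset (assumed)
heads, tails = observed.sum(), len(observed) - observed.sum()

# Beta(1, 1) prior + Bernoulli likelihood gives a Beta posterior in closed form.
posterior_samples = rng.beta(1 + heads, 1 + tails, size=10_000)

# Posterior-predictive simulation: generate synthetic flips where real data is scarce.
simulated_flips = rng.binomial(n=1, p=posterior_samples)

print("Posterior mean bias:", posterior_samples.mean())
print("Simulated heads rate:", simulated_flips.mean())
```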