From p-Significance to Uncertainty

The statistical methods that have been used to analyze data since the first half of the 20th century may no longer be valid in the era of Big Data. The scale of experiments has grown from tens or hundreds of samples to billions, and the number of hypothesis tests performed on the data has risen from single digits to thousands per second in deep-learning pipelines and online platforms.
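To make the scale problem concrete, here is a minimal sketch in Python (using NumPy and SciPy; the numbers and setup are my own illustration, not from any of the papers cited below) of what happens when many hypothesis tests are run on data that contains no real effect at all: at a p < 0.05 threshold, roughly one test in twenty comes back "significant" by chance alone.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Run 10,000 two-sample t-tests on pure noise: there is no real effect anywhere.
n_tests, n_samples = 10_000, 100
false_positives = 0
for _ in range(n_tests):
    a = rng.normal(size=n_samples)
    b = rng.normal(size=n_samples)  # drawn from the same distribution as `a`
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

# Roughly 5% of the tests come back "significant" even though nothing is there.
print(f"{false_positives} of {n_tests} tests significant at p < 0.05 "
      f"({false_positives / n_tests:.1%})")
```

At the scale of thousands of tests per second, "significant" findings of this kind accumulate constantly unless the uncertainty is accounted for.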

Just this week, the American Statistical Association published "Statistical Inference in the 21st Century: A World Beyond p < 0.05", a full journal issue devoted to the p-value and to the difficulties of this new way of understanding data. It comes just three years after "The ASA's Statement on p-Values: Context, Process, and Purpose", in which the association laid out the limitations of the p-value as a statistical tool and some of its impact on research.

It all comes down to the same issue: we need a mechanism for understanding data and drawing conclusions. Apparently that is neither easy nor as straightforward as the classic tea-tasting experiment. Just last week, a new paper on cholesterol and eggs was published that once again changed the accepted view of the correlation between egg consumption, cholesterol, and heart disease (an overview can be found here). In this case, cholesterol and egg consumption are not the real issue; the real issue is that we do not understand the data and its consequences. Not surprisingly, one site ran it under the headline "Statisticians' Call To Arms: Reject Significance And Embrace Uncertainty!". "'The world is much more uncertain than that,' says Nicole Lazar, a professor of statistics at the University of Georgia. She is involved in the latest push to ban the use of the term 'statistical significance.'"

In the same week, Nature published "Scientists rise up against statistical significance": "The trouble is human and cognitive more than it is statistical: bucketing results into ‘statistically significant’ and ‘statistically non-significant’ makes people think that the items assigned in that way are categorically different. The same problems are likely to arise under any proposed statistical alternative that involves dichotomization, whether frequentist, Bayesian or otherwise."
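As an illustration of the dichotomization problem (a hypothetical sketch of my own, not taken from the Nature comment), consider two studies that estimate essentially the same effect with nearly the same precision. A hard p < 0.05 cutoff labels one "significant" and the other "non-significant", even though their confidence intervals are almost indistinguishable:

```python
import numpy as np
from scipy import stats

# Two hypothetical studies estimating the same effect with similar precision.
# Study A: effect 0.20, standard error 0.100  ->  z = 2.00, p ~ 0.046
# Study B: effect 0.20, standard error 0.105  ->  z = 1.90, p ~ 0.057
for label, effect, se in [("Study A", 0.20, 0.100), ("Study B", 0.20, 0.105)]:
    z = effect / se
    p = 2 * stats.norm.sf(abs(z))               # two-sided p-value
    lo, hi = effect - 1.96 * se, effect + 1.96 * se
    verdict = "significant" if p < 0.05 else "non-significant"
    print(f"{label}: effect={effect:.2f}, 95% CI=({lo:.2f}, {hi:.2f}), "
          f"p={p:.3f} -> {verdict}")
```

Reporting the estimates and intervals side by side communicates the shared uncertainty that the significant/non-significant split hides.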

We live in an interesting era in which we are reshaping the way we understand reality (and I am not referring to "fake news"). The data is very tangible, but its meaning is still obscure.
