I am currently thinking about formalizing some statistics (in Coq). One thing I don't understand is the logic of e.g. the Shapiro-Wilk test for normality.

To explain my problem, let's first look at a Kolmogorov-Smirnov (KS) test for normality, which doesn't have this problem. A hypothesis test is in general a contradiction argument: one assumes a certain statistical property of an observation (the null hypothesis), then shows that under this assumption the observation is highly improbable, and concludes that the assumption is likely not true. When I do a KS normality test, I assume as null hypothesis that the distribution is not normal and then show that the distance between the observed distribution and the assumed distribution is so small that this is unlikely. The statistic for the distribution distance derived by Kolmogorov is valid for any continuous distribution, so in essence the logic of such a test is:
"distribution is not normal" -> "distribution is continuous" -> observation is unlikely
from which one can conclude that either premise is likely false.
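The distribution-free part can be seen directly in the statistic itself: D_n depends on the hypothesized CDF F only through the values F(x) at the order statistics, and under any continuous F its null distribution is the same. A minimal sketch in plain Python (the function names are mine, not from any library):

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic D_n = sup_x |F_n(x) - F(x)|.

    The empirical CDF F_n jumps from i/n to (i+1)/n at the (i+1)-th order
    statistic, so for a continuous F the supremum is found by checking both
    sides of each jump."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, abs((i + 1) / n - f), abs(f - i / n))
    return d

# A large D_n is the "observation is unlikely" step of the argument above.
sample = [-1.3, -0.6, -0.1, 0.2, 0.7, 1.4]
d = ks_statistic(sample, normal_cdf)
```

Nothing in `ks_statistic` uses normality; the normal CDF is just one choice of hypothesized continuous distribution, which is exactly why Kolmogorov's null distribution for D_n applies so broadly.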
Now let's look at the Shapiro-Wilk test. The difference from the KS test is that the distribution of the W statistic given in the 1965 paper by Shapiro and Wilk applies only to normal distributions (otherwise one could use the Shapiro-Wilk test for any distribution). So a normality argument based on the W statistic has the logic:
"distribution is not normal" -> "distribution is normal" -> unlikely
where the first premise is the null hypothesis and the second premise is required for applying the W statistic and reaching the "unlikely" conclusion. Again one can conclude that either premise is likely false, but in this case that is not very helpful.
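To make the role of the W statistic concrete, here is a sketch of the closely related Shapiro-Francia variant W' (the exact Shapiro-Wilk coefficients also involve the covariance matrix of the normal order statistics; the Blom approximation below drops that). W' is the squared correlation between the ordered sample and approximate expected standard normal order statistics; values near 1 are consistent with normality. All names are mine, not from the papers:

```python
import math
import statistics

def shapiro_francia_w(sample):
    """Shapiro-Francia W' statistic: squared correlation between the ordered
    sample and (approximate) expected standard normal order statistics.

    Assumption: Blom's approximation Phi^{-1}((i - 3/8) / (n + 1/4)) is used
    in place of the exact expected order statistics."""
    xs = sorted(sample)
    n = len(xs)
    nd = statistics.NormalDist()
    m = [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    norm = math.sqrt(sum(mi * mi for mi in m))
    a = [mi / norm for mi in m]          # normalized scores, sum(a_i^2) = 1
    xbar = sum(xs) / n
    num = sum(ai * xi for ai, xi in zip(a, xs)) ** 2
    den = sum((xi - xbar) ** 2 for xi in xs)
    return num / den

# A sample that lines up with the normal scores gives W' near 1;
# a heavily skewed sample gives a clearly smaller W'.
w_symmetric = shapiro_francia_w([-1.2, -0.5, 0.0, 0.5, 1.2])
w_skewed = shapiro_francia_w([0.0, 0.0, 0.0, 0.0, 10.0])
```

The point relevant to the question: the tabulated null distribution of W (or W') that turns a small value into "unlikely" is derived under the assumption that the sample is normal, which is precisely the second premise above.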
A non-normality test (assuming normality as the null hypothesis) would of course work.
Can someone please cut this knot for me?
Added (and later edited)
How do people work in practice with statistical methods requiring normality tests? The abstract of reference [1] states: "normal distribution ... is an underlying assumption of many statistical procedures". So people do a test with the null hypothesis that the data is normal, and in case the data is not normal this hypothesis is rejected and the method requiring normally distributed data cannot be used.
What happens if the test does not reject the null hypothesis of normality? I would think many people then apply the methods requiring normally distributed data without much further thought, since the scientific procedure for checking that the data is not non-normal was followed.
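The practice described here can be sketched as a decision rule (the method names are placeholders; the second branch is exactly the logically questionable step):

```python
def choose_method(normality_p_value, alpha=0.05):
    """Common (if logically shaky) workflow: if normality is rejected,
    fall back to a nonparametric method; otherwise proceed with the
    parametric method.

    Note that the second branch does not *prove* normality; it only
    records a failure to reject it."""
    if normality_p_value < alpha:
        return "nonparametric"   # e.g. a rank-based test
    return "parametric"          # e.g. a t-test, which assumes normality

choose_method(0.01)  # normality rejected
choose_method(0.60)  # normality not rejected
```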
Is this justified? From a logical point of view it is not, because after a Shapiro-Wilk test we know nothing at all in case the null hypothesis is accepted. Also, as Iosef pointed out (I hope I understood him correctly), statistics claims nothing in this case.
What I wanted to say above is this: in case the null hypothesis is accepted - and I would say this is a frequent use case - some tests really say nothing at all, while other tests still give some information.
What I still don't understand is:
- The connection between a Shapiro-Wilk test and the applicability of methods which require normally distributed data. Reference [1] claims that there is such a connection, but I see it only in a negative sense - maybe it is meant in this way.
- Whether it is possible to know more than nothing at all in case the null hypothesis is accepted, and whether other tests (like KS) are better in this respect than Shapiro-Wilk.
- How to convince a formal logic system like Coq that some methods requiring (approximately) normally distributed data are applicable. As far as I understood Iosef, normality tests are always negative, so they can only show that such methods cannot be applied, but not that they can be applied.
References
- [1] Mohd Razali, Nornadiah & Yap, Bee (2011). Power Comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling Tests. J. Stat. Model. Analytics, 2.
- [2] Shapiro, S. S. & Wilk, M. B. (1965). An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52(3/4), 591-611.