Every single aspect of Earth's science is broken SIMULTANEOUSLY and the stats are PART OF WHAT'S BROKEN.
The 'null hypothesis significance testing' paradigm means that a 10% apparent effect size on a small group and a 3% apparent effect size on a larger group can be treated as a replication having 'reproduced' the original's 'rejection of the null hypothesis' instead of as a failure to reproduce the apparent 10% effect size.
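To make that concrete, here is a minimal sketch with invented numbers, assuming each study's estimate is roughly normally distributed around the true effect. The estimates, standard errors, and function names are all hypothetical and not taken from any real study. Both studies 'reject the null', yet the replication's own data is wildly less likely under a true 10% effect than under a true 3% effect.

```python
import math

def two_sided_p(estimate, std_err):
    """Two-sided p-value against a null of zero effect (normal approximation)."""
    z = abs(estimate / std_err)
    return math.erfc(z / math.sqrt(2))

def log_likelihood(effect, estimate, std_err):
    """Log-likelihood of a true effect size given the observed estimate,
    dropping the constant that does not depend on the effect."""
    return -0.5 * ((estimate - effect) / std_err) ** 2

# Hypothetical numbers, invented for illustration only.
original = {"estimate": 0.10, "std_err": 0.04}     # small group
replication = {"estimate": 0.03, "std_err": 0.01}  # larger group

p_original = two_sided_p(**original)
p_replication = two_sided_p(**replication)

# How much less likely is the replication's data if the true effect
# were 10% rather than 3%?
log_ratio = log_likelihood(0.10, **replication) - log_likelihood(0.03, **replication)
likelihood_ratio = math.exp(log_ratio)

print(f"original p = {p_original:.4f}, replication p = {p_replication:.4f}")
print(f"likelihood ratio (10% vs 3%) on replication data = {likelihood_ratio:.2e}")
```

Under these assumed numbers both p-values fall below 0.05, so the significance-testing framing scores the replication as a success, while the likelihood ratio says the second study is strong evidence against the first study's effect size.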
The 'null hypothesis significance testing' paradigm means that journals treat some findings of an experiment as 'failures' to find a 'statistically significant effect' rather than as valid evidence ruling out some possible effect sizes. That journals then don't publish that evidence against larger effect sizes, because they didn't accept the paper in advance on the basis of its preregistration, is an enormous blatant filter on the presented evidence which no sane society would tolerate for thirty seconds, and also, a giant blatant incentive that is not an accuracy incentive. If you think in likelihood functions there are no failures or successes, there is no 'significant' or 'insignificant' evidence, there is just the data and the summaries of how likely that data is given different states of the world.
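As a rough illustration of a 'failed' study still carrying evidence, the sketch below takes a hypothetical non-significant result and reports how likely that data is under a few candidate effect sizes, relative to the best-fitting candidate. The numbers, the normal approximation, and the list of candidate effects are assumptions made up for this example.

```python
import math

def log_likelihood(effect, estimate, std_err):
    """Log-likelihood of a true effect size given the observed estimate,
    dropping the constant that does not depend on the effect."""
    return -0.5 * ((estimate - effect) / std_err) ** 2

# Hypothetical "failed" study: small estimate, not statistically significant.
estimate, std_err = 0.01, 0.02
z = estimate / std_err  # well under the usual 1.96 cutoff

# Candidate states of the world (true effect sizes), chosen for illustration.
candidate_effects = [0.00, 0.02, 0.05, 0.10]

log_liks = {e: log_likelihood(e, estimate, std_err) for e in candidate_effects}
best = max(log_liks.values())

# Likelihood of the data under each candidate, relative to the best candidate.
relative = {e: math.exp(ll - best) for e, ll in log_liks.items()}

for effect, rel in relative.items():
    print(f"effect = {effect:.2f}: relative likelihood = {rel:.2e}")
```

With these assumed inputs the data barely distinguishes a zero effect from a 2% effect, but it counts heavily against a 10% effect, which is exactly the kind of evidence that goes missing when the study is filed as a failure.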
If you've correctly disentangled your hypotheses, likelihood functions from different experiments just stack. You literally just fucking multiply them together to get the combined update. It becomes enormously easier to accumulate evidence across multiple experiments -
- although, yes, anybody who tries this on Earth will no doubt find that their likelihood function ends up zero-everywhere, because different experiments were done under different conditions, which is itself a vastly important fact that needs to be explicitly accounted for, and which the "rejection of the null hypothesis" and "meta-analysis" paradigms are overwhelmingly failing in practice to turn up before it's too late.
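The stacking step itself is short enough to sketch. The example below assumes the simplest possible setup: every experiment measures the same unknown rate, each outcome is an independent yes/no trial, and the candidate rates sit on a coarse grid. The experiment counts are invented. Multiplying likelihoods is done by adding log-likelihoods, which is the same operation and avoids numerical underflow.

```python
import math

def log_likelihood(theta, successes, trials):
    """Binomial log-likelihood of rate theta, dropping the binomial
    coefficient because it does not depend on theta."""
    return successes * math.log(theta) + (trials - successes) * math.log(1 - theta)

# Candidate states of the world: rates 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]

# Hypothetical experiments as (successes, trials), invented for illustration.
experiments = [(7, 20), (33, 100), (12, 40)]

# Multiplying likelihood functions == adding log-likelihood functions.
combined = [sum(log_likelihood(t, s, n) for s, n in experiments) for t in grid]

# Sanity check: stacking gives the same function as pooling all the raw data.
total_successes = sum(s for s, _ in experiments)
total_trials = sum(n for _, n in experiments)
pooled = [log_likelihood(t, total_successes, total_trials) for t in grid]

best_rate = grid[max(range(len(grid)), key=lambda i: combined[i])]
print(f"best-supported rate on the grid: {best_rate}")
```

The combined function peaks next to the pooled success fraction, and no experiment had to be scored as a success or a failure along the way.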
'Null hypothesis significance testing' rejects the notion of a way reality can be that produces your data. It just says reality isn't like the null hypothesis, now you win, you can publish a paper. That doesn't exactly prompt people to notice if reality is being unlike the null hypothesis in incompatible ways on different occasions.
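One hedged way to notice that kind of incompatibility is to compare how well a single shared state of the world explains all the experiments against how well they are explained when each gets its own. The sketch below does that for two invented yes/no experiments on a grid of candidate rates. It illustrates the idea and is not a standard named test, and the counts are hypothetical.

```python
import math

def log_likelihood(theta, successes, trials):
    """Binomial log-likelihood of rate theta, dropping the binomial
    coefficient because it does not depend on theta."""
    return successes * math.log(theta) + (trials - successes) * math.log(1 - theta)

# Candidate rates 0.001, 0.002, ..., 0.999.
grid = [i / 1000 for i in range(1, 1000)]

# Hypothetical experiments as (successes, trials); each on its own would
# be reported as "an effect exists", but they disagree wildly on its size.
experiments = [(5, 1000), (500, 1000)]

# Best fit if one shared rate has to explain every experiment.
shared_best = max(
    sum(log_likelihood(t, s, n) for s, n in experiments) for t in grid
)

# Best fit if each experiment is allowed its own rate.
separate_best = sum(
    max(log_likelihood(t, s, n) for t in grid) for s, n in experiments
)

# Log-likelihood cost of insisting on a single shared state of the world.
gap = separate_best - shared_best
print(f"log-likelihood gap (separate vs shared): {gap:.1f} nats")
```

A gap of hundreds of nats, as these assumed counts produce, means no single rate comes close to explaining both data sets, so the experiments were probably not measuring the same thing under the same conditions.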
The 'p-value' paradigm's dependence on the experimenter's private state of mind means that people can't just go out and gather more data, when it turns out they didn't get enough data, because the fact that the experimenter has privately internally chosen how much data to gather breaks the p-value paradigm -