“Why, sometimes I’ve believed as many as six impossible things before breakfast.” – The White Queen from Through the Looking Glass by Lewis Carroll.
- Xiao Wang, Boxuan Simen Zhao, Ian A. Roundtree, Zhike Lu, Dali Han, Honghui Ma, Xiaocheng Weng, Kai Chen, Hailing Shi, Chuan He, "N6-methyladenosine Modulates Messenger RNA Translation Efficiency", Cell 161(6), 4 June 2015, 1388–1399.
- Clarence Yu Cheng, Fang-Chieh Chou, Wipapat Kladwang, Siqi Tian, Pablo Cordero, Rhiju Das, "Consistent global structures of complex RNA states through multidimensional chemical mapping", eLife 2015, doi:10.7554/eLife.07600.
10 comments
June 10, 2015 at 1:57 am
Lost
What about p-value’s little brother – Goodness of fit?
People infer parameters by fitting a (mathematical) model to some real data, but do not bother to analyze how reasonable it is to describe the data with their model.
Unfortunately, I have seen some ridiculous cases, even in the most prestigious journals, where one quick look at the figures/data shows that the model is inadequate for the data.
June 23, 2015 at 6:19 pm
kj
Well, that’s kind of the point, isn’t it? All you can say is that the data are consistent with your particular null model, or they aren’t.
The most you can infer from NHST is that “my model was not reasonable”. That’s what a “positive” result means.
June 24, 2015 at 8:00 am
Lost
kj,
I am not sure I got your point.
In the fitting process, one assumes the model is right (or at least good enough) for the data, and just tries to infer parameters given that model.
However, a necessary step afterwards is to check how likely it is that the model, with your best-fit parameters, could have generated the data. That is called goodness of fit.
I have seen cases where people did not bother with this step and published the results of a best fit, even though the best fit is unlikely to explain the data.
In effect, they used a null model that should have been rejected.
[I am not interested in putting here examples, as I am in touch with the authors, and I do not want to be exposed]
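The check Lost describes — fit first, then ask whether the fitted model could plausibly have generated the data — can be sketched with a parametric bootstrap. Everything here is illustrative: the Poisson model, the toy counts, and the dispersion statistic are assumptions for the example, not from any of the papers discussed.

```python
import math
import random
import statistics

random.seed(0)

# Toy data (made up for illustration): counts we choose to model as Poisson.
data = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]

# Step 1: fit -- the Poisson maximum-likelihood estimate is the sample mean.
lam = statistics.mean(data)

# Step 2: goodness of fit via parametric bootstrap.  Ask: could the
# best-fit model plausibly have generated data as over-dispersed as ours?
def dispersion(xs):
    # variance/mean ratio; approximately 1 if the data are truly Poisson
    return statistics.pvariance(xs) / statistics.mean(xs)

def sample_poisson(lam, n):
    # Knuth's multiplication method; fine for small lam
    thresh = math.exp(-lam)
    out = []
    for _ in range(n):
        k, p = 0, 1.0
        while True:
            p *= random.random()
            if p <= thresh:
                break
            k += 1
        out.append(k)
    return out

obs = dispersion(data)
sims = [dispersion(sample_poisson(lam, len(data))) for _ in range(2000)]
gof_p = sum(s >= obs for s in sims) / len(sims)
print(f"lambda-hat = {lam:.2f}, dispersion = {obs:.2f}, GOF p = {gof_p:.3f}")
```

A small GOF p here would mean the best-fit Poisson model is itself unlikely to have produced data this dispersed — exactly the check that gets skipped when only the best-fit parameters are reported.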
June 11, 2015 at 10:35 am
psuedo is ver2
My comments are based only on this blog post; I have not read the original papers.
I think it is reasonable to suspect the red-circled p-value in the first paper, as the box plot clearly shows overlap between the two groups. The p-value can still be close to zero if the sample sizes of the two groups are very big. The authors probably rounded it to 0.0. It would be better to report something like p < 10^{-5}.
However, I’m trying to understand the red-circled p-value in the second paper. Do you mean we should never see a p-value of 1.0? Looking at other entries in the table, some of them are very tiny (less than 10^{-16}), so it is reasonable to think 1.0 is probably a rounded number. Very few people would write a p-value as something like 1.0 − 1.8×10^{-14} (or the impossible 0.9999999999999…).
I don’t understand your point about the red-circled p-values. Yes, it is impossible for a p-value to be exactly 0.0 or 1.0. However, with rounding, a p-value of 1.0 is not a problem, while a p-value of 0.0 would be better reported as something like < 10^{-5}.
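The rounding behavior described above is easy to reproduce. The particular p-values and the 10^{-5} reporting floor below are made-up illustrations, not numbers from either paper.

```python
# Two extreme but perfectly legitimate p-values (illustrative values).
p_small = 3.2e-17
p_large = 1 - 1.8e-14

# Naive rounding to a fixed number of decimals destroys the information:
# both print as the "impossible" endpoints 0.00 and 1.00.
print(f"{p_small:.2f}")   # 0.00
print(f"{p_large:.2f}")   # 1.00

# A report that survives rounding: floor tiny p-values at a stated
# threshold instead of letting them collapse to zero.
def report(p, floor=1e-5):
    return f"p < {floor:g}" if p < floor else f"p = {p:.3g}"

print(report(p_small))   # p < 1e-05
print(report(p_large))   # p = 1
```

As the commenter notes, the rounded 1.0 is harmless (nobody expects the 0.99999999999998 digits), while the rounded 0.0 hides the one thing readers want to know: how small the p-value actually was.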
June 11, 2015 at 5:44 pm
kj
Nitpicking about rounding p-values (how much more information would you have if they had written p = .0000000001235 and p = .999999992395234?) while accepting the NHST paradigm as the statistical framework of ‘omics analyses loses the forest for the trees. The latter has done far more mostly-unnoticed damage than the former ever will.
June 27, 2015 at 9:52 pm
YH
What are the unnoticed damages of the NHST paradigm for ‘omics analyses? I use NHST almost every day, so I am really interested in potential pitfalls of this paradigm.
June 16, 2015 at 3:24 am
Lost
What about a p-value that is smaller than 1e-150?
See here: http://www.nature.com/ncomms/2015/150609/ncomms8412/full/ncomms8412.html?WT.ec_id=NCOMMS-20150610
Many of their p-values are ~1e-100.
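One reason such extreme values deserve scrutiny: p-values much below ~10^{-308} underflow double precision entirely, so naive tail-area computations return an exact 0.0 and honest reporting requires working in log space. A sketch, assuming a normal-tail p-value; the z-score is an arbitrary example, not taken from the paper linked above.

```python
import math

# An extreme z-score (arbitrary, for illustration).
z = 40.0

# Naive upper-tail normal p-value: erfc underflows to exactly 0.0,
# even though the true p-value is positive (~1e-350).
naive_p = math.erfc(z / math.sqrt(2))
print(naive_p)  # 0.0 -- floating-point underflow, not a real zero

# Work in log10 space instead, using the large-z tail asymptotic
# P(Z > z) ~ phi(z)/z (Mills ratio approximation).
log10_p = (-z * z / 2 - math.log(z * math.sqrt(2 * math.pi))) / math.log(10)
print(f"log10(p) ~ {log10_p:.1f}")  # about -349.4, i.e. p ~ 1e-349
```

So a reported p ~ 1e-100 is at least representable, but any pipeline that got there by multiplying or exponentiating raw probabilities is one step from silent underflow; log-space computation is the safe default at these magnitudes.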
June 17, 2015 at 5:55 pm
Jamie Cate
Well, honestly, p-values be damned. I would never trust my (biochemical) life to structural RMSDs as large as those in the second table. Now, if one also had phylogenetic support for the models, then one could imagine spending the time designing follow-up experiments based on these kinds of models.
July 7, 2015 at 6:30 am
Michael Phillips
Lior, would this count as an appropriate use of statistics and p values by a biologist? Love your work, please keep it up.
http://plantbiologyreview.blogspot.com.es/2015/07/fund-my-research-and-win-world-cup.html