In becoming a science-based person, I can imagine a process that involves three tiers. First, you decide that you are going to get your information from reputable sources like scientific journals, and that any other claims you encounter should have similar backing. Second, pushing past the veneer of scientific legitimacy, you decide to look into the claims for yourself. This involves not only getting your information from sources based on scientific journal articles, for example, but also going through the study yourself to determine whether it is a “good” study. Lastly, after having navigated scientific sources for some time, you are able to evaluate claims based on the methodologies and procedures that you would expect the offered evidence to have if it were indeed credible. Because most of us are not scientists and find it hard to invest in the education it would take to reside comfortably in the third tier, I will try to offer some help with the second.

If you would ever consider a career as a science writer or science journalist, there are a few basic techniques that you must master, or at least become proficient in. Among them are learning statistics and how to interpret them, interviewing scientists to get the best information, and translating sometimes complex and technical scientific information into something that a lay audience can digest. Another fundamental skill that you must wield effectively is being able to confidently answer the question, “What is a good study?” To this end, what follows are some basic questions that you should ask yourself when trying to determine the validity of a scientific study. You would find these kinds of questions in any introductory science-writing textbook, and they will become a valuable tool in your skeptical arsenal.

Keep in mind that when you are evaluating a study, the more of these questions you can have answered, the better off you are. However, if you find yourself questioning every single procedure, method, and ethical choice in a study, that may be a red flag in itself. As a properly skeptical consumer of scientific information, a good place to start is what is called the null hypothesis. That is to say, assume that a new medical treatment or physics experiment won’t work. Without being downright cynical, greet every claim with this assumption. Your new motto when faced with a claim in a study or elsewhere should be “show me.”

Is the study large enough to pass statistical muster?

Numbers are very important in this regard. For example, the number of patients that a study includes in a clinical trial says a lot about that trial’s “power” (does the study include enough patients to reliably distinguish between treatments?) and about how far its results can be generalized. Taking a more basic approach, if you were to read in a study that “the majority of US citizens now reject the theory of evolution,” you should find out how many people were surveyed. The statistics work out such that with fewer than around 1,024 people in a nationwide survey, the margin of error climbs above three percent (it shrinks only with the square root of the sample size). In a study that reports a 49/51 split, a margin that large could render the claim worthless.
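
To see how sample size drives that margin, here is a minimal sketch in Python using the standard worst-case formula for a simple random sample (the sample sizes below are purely illustrative):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample of size n.

    p = 0.5 is the worst case, which is what pollsters typically assume
    when quoting a single margin of error for an entire survey.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (100, 400, 1024, 2000):
    print(f"n = {n:4d}  ->  margin of error ≈ ±{margin_of_error(n):.1%}")

# n =  100  ->  margin of error ≈ ±9.8%
# n =  400  ->  margin of error ≈ ±4.9%
# n = 1024  ->  margin of error ≈ ±3.1%
# n = 2000  ->  margin of error ≈ ±2.2%
```

Note that a reported 49/51 split falls comfortably inside a ±3 point margin, so a survey of that size cannot even tell you which side is actually the majority.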

The other side of this question is to determine whether the findings of a study are statistically significant, meaning that there is only an acceptably small chance that the findings were due to random chance alone. The threshold typically used in scientific research is p=0.05. This "p-value" means that, if chance alone were at work, results at least as extreme as the ones observed would turn up only about 1 time in 20. This is because we must assume the null hypothesis is true, and then assess the probability of the outcome given that assumption. (If that bar seems too low, it should be noted that many fields in science have much more rigorous standards; physicists, for example, often require p-values of 0.001 or smaller to validate their findings. Still, even with the less rigorous standard, most scientific findings are expected to be replicated, which weeds out chance occurrences even further.) When evaluating a study, pay close attention to this value. As a general rule, any correlation that has a p-value greater than 0.05 (p>0.05) should not be taken as evidence for anything.
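
Here is a minimal sketch of that logic in Python, with entirely made-up numbers: assume the null hypothesis, simulate many studies under it, and count how often chance alone produces a result at least as extreme as the one observed.

```python
import random

random.seed(42)

# Invented example: 60 of 100 patients recover, and the null hypothesis
# says the true recovery rate is 50% (i.e., the treatment does nothing).
observed_recoveries = 60
n_patients = 100
null_rate = 0.5

trials = 100_000
at_least_as_extreme = 0
for _ in range(trials):
    # Simulate one study in a world where the null hypothesis is true.
    recoveries = sum(random.random() < null_rate for _ in range(n_patients))
    if recoveries >= observed_recoveries:
        at_least_as_extreme += 1

p_value = at_least_as_extreme / trials
print(f"estimated one-sided p-value ≈ {p_value:.3f}")
# Comes out near 0.03: chance alone produces a result this extreme only
# about 1 time in 35, so this invented finding would clear the 0.05 bar.
```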

Is the study designed well? Could unintentional bias have affected the results?

This is hard to determine if you are not familiar with a particular field, but you are still able to ask questions that should help you sort the bad studies from the good. Was there a systematic design to the study that remained the same throughout? What were the specific hypotheses of the study and how did the study test for them? If it was a clinical trial, who were the patients and how were they selected?

More generally, was there a control group? Was the sample population that the study selected representative of the general population? Was the study as “blinded” as possible, meaning that, as far as practical, neither the participants nor the researchers knew who was assigned to which condition? Were there any conflicts of interest that should have been disclosed by the researchers? Funding from a corporation does not automatically mean that the results of a study are false, but it is something that absolutely can bias research.
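
Randomization and blinding are easier to picture with a concrete sketch. The Python below is purely hypothetical (invented participant IDs and arm codes), but it shows the basic idea: assign participants to coded arms at random and keep the key that decodes the arms sealed until the analysis is done.

```python
import random

random.seed(7)

# Hypothetical participant IDs for a small two-arm trial.
participants = [f"patient_{i:03d}" for i in range(1, 21)]

# Assign each participant at random to one of two coded arms, "A" or "B".
arms = ["A", "B"] * (len(participants) // 2)
random.shuffle(arms)
assignments = dict(zip(participants, arms))

# The key linking the codes to "treatment" and "placebo" is held by a
# third party, so neither patients nor clinicians know who got what.
sealed_key = {"A": "treatment", "B": "placebo"}

for pid in participants[:5]:
    print(pid, "-> arm", assignments[pid])  # clinicians see only the code
```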

Did the study last long enough?

This question may not apply to some sciences, but it is especially important in medicine. For example, if a study claims that a new treatment put some cancer patients into remission, the study should also follow those patients for some amount of time afterwards to see if they stayed in remission. If all of the participants died two weeks after the study, you may be getting horrendously skewed conclusions.

Are there any other possible explanations for the findings or reasons to doubt the conclusions?

Remembering that correlation does not prove causation, how does the study frame its findings? Is the association statistically strong? If a causal link is suggested, does the cause indeed precede the effect? Are the associations that are found consistent when other methods are used? Did the study look for other possible explanations, called confounding variables, which could explain the results? For example, a study that claims reading science blogs increases the level of scientific literacy may be leaving out the confounding variable of formal education, which could be driving both.
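
A quick simulation makes the danger concrete. In this minimal sketch (all numbers invented), years of formal education drives both blog reading and scientific literacy; blog reading has no effect on literacy at all, yet the two still come out clearly correlated.

```python
import random
from statistics import correlation  # Python 3.10+

random.seed(1)

# Confounder: years of formal education.
education = [random.gauss(14, 3) for _ in range(1_000)]

# Blog reading and literacy each depend on education plus independent
# noise; by construction, neither has any direct effect on the other.
blog_hours = [0.5 * e + random.gauss(0, 2) for e in education]
literacy   = [2.0 * e + random.gauss(0, 5) for e in education]

print(f"blog_hours vs literacy: r ≈ {correlation(blog_hours, literacy):.2f}")
# A clearly positive correlation appears, even though reading blogs was
# given no causal effect on literacy whatsoever.
```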

(For medical claims) Does a treatment really work?

Could the patient’s improvements be changes that would have occurred in the normal course of their disease? This is a source of great confusion for alternative treatment claims like the ones offered by homeopathic “medicine.” While a patient may feel better after taking a homeopathic remedy, the improvement could have nothing to do with the treatment and instead be just the normal ebb and flow of illness. Studies that take this into account have generally found that homeopathic “medicine” does not work.
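
This “ebb and flow” effect (often described as regression to the mean) is easy to demonstrate. In the minimal sketch below, with all numbers invented, patients only try a remedy on a particularly bad day; a week later they report feeling better on average even though the remedy does literally nothing.

```python
import random
from statistics import mean

random.seed(3)

def severity(baseline):
    """Symptom severity on a given day: personal baseline plus daily noise."""
    return baseline + random.gauss(0, 2)

baselines = [random.gauss(6, 1) for _ in range(10_000)]

before, after = [], []
for b in baselines:
    today = severity(b)
    if today > 8:                  # patients reach for the remedy on a bad day
        before.append(today)
        after.append(severity(b))  # severity a week later; remedy did nothing

print(f"mean severity when taking the remedy: {mean(before):.2f}")
print(f"mean severity a week later:           {mean(after):.2f}")
# Severity falls on average even with no treatment effect at all, because
# patients selected at their worst tend to drift back toward their baseline.
```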

If a treatment is claimed to work, are there any follow-up studies needed to confirm that finding? Are the results applicable to the general population? All of these questions should be answered by the study itself.

Do the conclusions fit other scientific evidence?

Are the results of a study consistent with other findings in that field? If not, why not? Has the study been replicated and confirmed?

Virtually no single study proves anything. Consistency and the preponderance of evidence are what point us in the direction of truth. Of course, the claims of quantum mechanics and other seemingly impossible notions are bizarre at first, but they are then backed up by further research. Contrast this with a pseudoscience like “free energy.” Mountains of evidence, and thermodynamics as a whole, will refute a study claiming to have cracked the free-energy code. A study that goes up against such opposition is not necessarily wrong, but it had better offer some extraordinary evidence to show that it is not.

Do I have the full picture?

How does this research fit into the field as a whole? Does the study leave out some important aspect of the science that would prove it wrong? Is the study even relevant given other findings? As with the previous question, it is important to understand how a finding fits in with other research that has been done. Is it in opposition? Which way is the field moving? Getting the whole picture is critical if you want to understand the importance of a study.

Have the findings been checked by other experts?

This is one of the most important questions that you can ask when looking at a study. Ask yourself: are there experts who disagree with the claims in a study? Why or why not? Are the researchers speaking in an area of their own expertise or have they ventured outside of it? Does the researcher have a good track record when it comes to findings standing up to scrutiny?

Most importantly, as one of the safety nets of science, has the study been through peer review? Is the journal that the study is published in reputable? A study coming out of an obscure journal with no peer review, that is to say, no experts to check over the work of the researchers, is not necessarily wrong but should be highly suspect.

What now?

When looking at scientific studies, you need to ask even more basic questions than whether or not the study was systematically designed. Ask common-sense questions, like whether the data really justify the conclusions. If the researchers have extrapolated beyond the evidence, is it warranted? Does the researcher frankly admit any flaws or limitations of the study? Does the researcher acknowledge that the findings may be tentative and offer important caveats?

If you can get your hands on a copy of the original study, rather than just a press release or the abstract, do it. You may not be able to evaluate all of the procedures and methods, but a good study will be written in a way that answers many of these important questions. Getting good at this kind of evaluation takes practice, but no one ever said science was easy.


Examples from this post were adapted from the book “News and Numbers” by Victor Cohn and Lewis Cope.

You can find a reproducible list of the guidelines above for your use here.

Kyle Hill is the newly appointed JREF research fellow specializing in communication research and human information processing. He writes daily at the Science-Based Life blog and you can follow him on Twitter here.