The Study Scientific

Originally published in the Informanté newspaper on Thursday, 12 May, 2016. 

With the prevalence of social media only increasing year by year, it is quickly becoming the primary news medium for many people. Informanté, while primarily a print newspaper, has had to increase its social media footprint to maintain its reputation as the number one source of news for the Namibian populace, and as it now has in excess of 180 000 likes on Facebook, it can legitimately claim to be the news source that reaches the most Namibians.

But not all news on social media is served via Informanté – a lot of it is simply shared by friends and family. And while I’ve expounded before on the importance of scientific literacy here in Theory of Interest, I’ve seen far too many stories shared where a ‘scientific study’ has indicated a surprising result, often a pleasant one for the person sharing – but when I dug a bit deeper, the flaws in the reporting became evident.

Unfortunately, while our education system is geared towards providing people with a basic level of scientific literacy, it does not equip them to evaluate a scientific study, even well into the tertiary level. The subject in my course that enabled me to evaluate studies was an honours-level subject, and most courses don’t include it at all, since it requires a basic understanding of statistics – an artefact, perhaps, of a misplaced cultural fear of mathematics.

Yet the basic skills required to evaluate, if not necessarily fully understand, a study are not hard to acquire. It is important to be able to sort fact from fancy, as a short excerpt from Edward Tufte’s book “Data Analysis for Politics and Policy” starts to demonstrate:

“One day when I was a junior medical student, a very important Boston surgeon visited the school and delivered a great treatise on a large number of patients who had undergone successful operations for vascular reconstruction. At the end of the lecture, a young student at the back of the room timidly asked, “Do you have any controls?” Well, the great surgeon drew himself up to his full height, hit the desk, and said, “Do you mean did I not operate on half of the patients?” The hall grew very quiet then. The voice at the back of the room very hesitantly replied, “Yes, that’s what I had in mind.” Then the visitor’s fist really came down as he thundered, “Of course not. That would have doomed half of them to their death.” It was quiet then, and one could scarcely hear the small voice ask, “Which half?”” 

That humorous story highlights the importance of a key part of the scientific method – scientific control groups. Science is tested by experimentation, and a study is usually performed on a sample of a population. If I can be permitted to explain with an example close to my heart, consider cardiac medication. If a new pill is developed that may extend the life of a cardiac patient, giving the pill to the whole sample and observing that fewer people die does not establish the pill’s effectiveness – after all, it could simply be that the selected sample lived longer due to a common external factor.

Instead, two samples of the population are selected: one is treated with the new pill, while the other group, the control group, is treated with either existing medication (if the researchers wish to compare its efficacy against existing medication) or a placebo (most commonly a sugar pill). Where possible, these studies are performed blind, to control for the placebo effect, where individuals can feel better simply because they are told they’re receiving treatment. The opposite effect, the nocebo effect, can also occur, where people exhibit symptoms merely because they believe they’ve been exposed to something harmful – ‘WiFi sensitivity’ is an example of the nocebo effect.
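
To make that concrete, here is a toy Python simulation – all numbers invented purely for illustration – showing how a shared external factor lifts survival in both groups, so that only the difference between the treated group and the control group isolates the pill’s effect:

```python
import random

random.seed(1)  # reproducible runs

def one_year_survival(gets_pill, n=1000):
    """Fraction of n simulated patients surviving one year."""
    survived = 0
    for _ in range(n):
        p = 0.70       # baseline survival rate, shared by everyone
        p += 0.05      # a common external factor (a mild winter, say)
        if gets_pill:
            p += 0.10  # the true effect of the pill
        survived += random.random() < p
    return survived / n

treated = one_year_survival(gets_pill=True)
control = one_year_survival(gets_pill=False)

print(f"treated group survival: {treated:.1%}")
print(f"control group survival: {control:.1%}")
print(f"estimated pill effect:  {treated - control:.1%}")
# Without the control group, the external factor's 5% boost
# would have been credited to the pill.
```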

Ideally, a good study would be a double-blind study – double-blind in that neither the patient nor those administering the treatment know which group is receiving the treatment and which the placebo – although in some cases this is not possible. This serves to eliminate bias, intentional or unintentional (a doctor giving away who is on the placebo, whether by accidentally revealing it outright or via body language), and keeps the testing objective.

Thus, simply by reading the original study, you can already evaluate how scientific it is by checking the design of the study presented. But there is another measure you should also be checking that’s been mentioned already – sample size. This is where a bit of basic statistics becomes necessary – but luckily it is not complex.

Studies test whether results are statistically significant. This is usually expressed as a p-value (with p being probability), measured against a confidence level. The two most common levels used are p < 0.05 (testing that we’re 95% confident the effect is not due to random chance) and p < 0.01 (testing that we’re 99% confident the effect is not due to random chance).
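
As a sketch of how such a test looks in practice – with made-up counts, and Fisher’s exact test standing in for whatever test a given study actually uses:

```python
from scipy.stats import fisher_exact  # exact test for a 2x2 contingency table

#                survived  died
treated_group = [850,      150]  # hypothetical counts, for illustration only
control_group = [750,      250]

_, p_value = fisher_exact([treated_group, control_group])

print(f"p = {p_value:.4g}")
if p_value < 0.05:
    print("significant at the 95% confidence level (p < 0.05)")
if p_value < 0.01:
    print("also significant at the 99% confidence level (p < 0.01)")
```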

Next is our confidence interval, or margin of error. With our cardiac pill, this is commonly expressed in results as something like: the pill’s effect on reducing death in patients was 65% (95 percent confidence interval, 39 to 80 percent; P < 0.001). As you can see, as the confidence demanded rises (here 99.9%), the margin of error widens for the same sample size – the interval spans a full 41 percentage points, from 39% to 80%. With the usual 95% and 99% confidence levels, the margin is usually between 1% and 10%.
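
To see that trade-off numerically, here is a small Python sketch using the standard normal approximation for a proportion (again with invented figures, not those of the pill study):

```python
from math import sqrt
from scipy.stats import norm

successes, n = 130, 200  # say 130 of 200 patients improved
p_hat = successes / n

for confidence in (0.95, 0.99, 0.999):
    z = norm.ppf(1 - (1 - confidence) / 2)  # two-sided critical value
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    print(f"{confidence:.1%} confidence: "
          f"{p_hat - half_width:.1%} to {p_hat + half_width:.1%}")
```

Running it shows the interval stretching from about ±6.6 percentage points at 95% confidence to about ±11 points at 99.9%, with the sample size held fixed.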

Armed with these two values, it is possible to calculate the sample size you need for a statistically significant result – but that is usually the mathematics that puts people off. Luckily, sample sizes do not increase linearly with the population size, and it is possible to use precalculated values to check whether the sample size of a study is large enough.

So, for a population of 10 000, at a 95% confidence level, you need a sample size of 370 people for a 5% margin of error, 1332 people for a 2.5% margin of error and 4899 for a 1% margin of error. For 99% confidence, this increases to 622 people, 2098 people and 6239 people respectively. But for a population of 1 000 000, at 95% confidence, you need a sample size of 384 people for 5%, 1534 people for a 2.5% error margin, and 9512 people for 1%.
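
These figures follow from Cochran’s sample-size formula with the finite-population correction – a minimal Python sketch reproduces them (up to rounding):

```python
from scipy.stats import norm

def sample_size(population, confidence, margin):
    """Required sample size for estimating a proportion, worst case p = 0.5."""
    z = norm.ppf(1 - (1 - confidence) / 2)  # two-sided critical value
    n0 = z * z * 0.25 / (margin * margin)   # Cochran's formula
    return round(n0 / (1 + (n0 - 1) / population))  # finite-population correction

for population in (10_000, 1_000_000):
    for confidence in (0.95, 0.99):
        for margin in (0.05, 0.025, 0.01):
            n = sample_size(population, confidence, margin)
            print(f"N = {population:>9,}, {confidence:.0%} ± {margin:.1%}: {n:,}")
```

As the population grows, the finite-population correction fades away, which is why the required sample sizes level off.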

As you can see, sample sizes tend toward a fixed upper limit, and you can develop a rule of thumb. For a study to claim it affects everyone, then, it needs at least 384 participants to be 95% confident of its results with a 5% margin of error, moving up to roughly 16 600 people to be 99% confident with a 1% margin of error.

So when you see a scientific study shared on Facebook claiming that “A Glass Of Red Wine Is The Equivalent To An Hour At The Gym,” read more closely. You’ll see the study was conducted on rats, which means it does not necessarily apply to humans, and that it merely found that one compound in wine mimicked some of the endurance effects of exercise training. Similarly, the study behind the story “Study Says Beer Helps You Lose Weight” tested a single compound found in beer – and was conducted on mice.

Proper scientific literacy is essential to being an informed citizen. Take the time to read a few scientific studies before sharing them. See if they’ve followed procedure and had a control group. Examine the sample size, and be wary when a study was conducted on as few as 20 people.

Don’t just believe a study because it claims something you want to be true. After all, as John Oliver said on his show this Sunday, “In science, you don't just get to cherry-pick the parts that justify what you were going to do anyway. That's religion. You're thinking of religion.”
