Reader Question: What Is A Statistical Model?

Yesterday I did a little venting on how the Ebola epidemic in West Africa was being covered by the media. I referred to the fact that the role of infectious vectors was not being discussed; instead, the discussion was limited to human-to-human contact. That post was “Ebola And Our Forces Overseas.”

In that short piece I stated that statistical models support the assertion that human-to-human contact is the more likely means of transmitting Ebola and other hemorrhagic fevers.

Well, last night I got a question from a reader who wanted to understand what a statistical model is, so here is my response, packaged in as little technical babble as possible.

Statistics is a social science used in arriving at an “objective” description of a population. Statistics attempts to be descriptive and predictive. However, statistics is fraught with problems. For example, the antigun lobby likes to cite statistics suggesting that banning guns will reduce gun homicides; you’ve all seen that. Here is the problem with that statistic.

It is absolutely correct that if you ban guns you will reduce gun homicides, but it is a meaningless statement without looking at what banning guns did to the number of edged-weapon homicides, blunt-object homicides, strangulations and any other means of killing a person. If you want to be judicious about your analysis, you would also need to look at lives lost because individuals could not adequately defend themselves. Although the initial statement is correct, the conclusion drawn from it is way off. This is the reason why statistics is used so often when the uninformed try to have you accept an idea. It’s also true that if you ban cars you will reduce automobile fatalities, and if you ban airplanes you will reduce aviation deaths. It is a meaningless statement without looking at how that action influences substitutes. Do you get my point?

When you talk about a “population” in statistics, you are talking about a finite group of people, or things, that exhibit or have a chance of exhibiting a characteristic, or characteristics, that you are trying to describe; you are not talking about the world population. In virtually all cases, a population is too large to work with, so we take samples, analyze those samples, and make predictions about the “population” on the basis of the samples taken. For example, ammunition manufacturers will select a small lot of shell casings, measure them for length, wall thickness and primer pocket, and from that sample make a prediction about the population of shell casings.
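To make the shell casing example concrete, here is a minimal Python sketch of the idea. Everything in it is an assumption for illustration: the casing lengths are simulated, made-up numbers, and the function name `estimate_from_sample` is hypothetical. It only shows the general pattern of measuring a small random sample and using it to describe the whole population.

```python
import random
import statistics

def estimate_from_sample(population, sample_size, seed=0):
    """Draw a simple random sample and return its mean and standard deviation."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample), statistics.stdev(sample)

# Hypothetical population: 100,000 casing lengths in millimetres (simulated, not real data).
rng = random.Random(42)
population = [rng.gauss(19.15, 0.05) for _ in range(100_000)]

# Measure only 200 casings instead of all 100,000.
sample_mean, sample_sd = estimate_from_sample(population, sample_size=200)
population_mean = statistics.mean(population)

print(f"sample mean:     {sample_mean:.3f} mm")
print(f"population mean: {population_mean:.3f} mm")
```

With a properly random sample, the two printed means should land close together. A biased or very small sample would not give you that.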

However, samples are not perfect; this is particularly true in the case of diseases. People, because of fear, shame or ignorance, will deliberately provide researchers with false information. So, if you have small, bad or poorly selected samples, you will have errors in your analysis of the “population.” The size of samples and the methods used to obtain a sample of the “population” are critically important in describing the “population.” As a result, when we say there is a .8 (80%) chance of coming down with the disease, you have to ask what the confidence interval is. If someone says that you have a .8 chance of being infected but the confidence level behind that statistic is only 60%, there’s a lot of room for error. However, if they say you have a .8 chance of contracting the disease with a confidence level of 93%, now you have something to really worry about.
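Here is a rough Python sketch of why sample size matters so much. It uses the textbook normal-approximation interval for a proportion, which is my choice for illustration and not necessarily what disease researchers use in practice. The counts (16 of 20 and 1,600 of 2,000) and the function name `proportion_confidence_interval` are hypothetical. Both samples give the same .8 estimate, but the range of plausible values around it is very different.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # z-score that leaves (1 - confidence) / 2 in each tail of the normal curve
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    # Clamp to [0, 1] because a proportion cannot fall outside that range
    return max(0.0, p_hat - margin), min(1.0, p_hat + margin)

# Hypothetical counts: both samples show 80% infected.
small = proportion_confidence_interval(16, 20)      # 20 people sampled
large = proportion_confidence_interval(1600, 2000)  # 2,000 people sampled

print(f"n = 20:   {small[0]:.3f} to {small[1]:.3f}")
print(f"n = 2000: {large[0]:.3f} to {large[1]:.3f}")
```

The small sample leaves an interval roughly ten times wider than the large one, even though both report the same .8. That is the room for error you should be asking about.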

The moral of the story is this: never accept a statistic without understanding how the sampling was done, and never accept a probability as gospel without asking what the confidence interval is. When you are dealing with a disease like Ebola, the outside 20%, 10% or 2% is not where you want to be, so take every precaution to protect yourself from both human contact and infectious vectors. You don’t want to be in the 2% that was infected from a mosquito or tick bite!
