Firearms Data, and an Ode to Data Systems

As my statistics teacher always said, while it is easy to lie with statistics, it is a lot easier to lie without statistics. I would add that it is also easy to lie with bad statistics.

I am currently reading an interesting book about guns, with the subtitle “The Data Don’t Lie.” I believe that claim is incorrect. Data can be, and often are, quite inaccurate.

This essay (a) emphasizes the importance of data systems (“surveillance systems”)—not one-off surveys, but data that are collected consistently and comparably across sites and over time; (b) provides examples of the need to recognize the limitations of each system and to keep improving them; and (c) bemoans the lack of both data and funding for statistical analysis of firearm issues.

The Importance of Statistics

I know I am preaching to the choir, but the older I get, the stronger my belief in the importance of statistics becomes. I teach in a public health school, and a few years ago I wrote a book entitled While We Were Sleeping (University of California Press) that describes 64 documented successes in making the world safer by reducing injury and violence. The book also includes descriptions of 36 heroes whom you’ve never heard of, who devoted much of their lives to making your life safer.

Happily for me, since I am a researcher, in virtually all the successes, data and research were important. Data were important in (a) proving that, indeed, there was a serious public health problem; (b) suggesting cost-effective interventions; and (c) evaluating those interventions.

For example, in the mid-1990s, data showed that 16-year-old drivers were at three times the crash risk of 19-year-olds, and 10 times the crash risk of 40-year-olds. How could we get these new drivers the real-world driving experience they needed without killing so many in the process? Data analysis also showed that two of the situations in which these young drivers were at especially high risk were (a) at night and (b) when only other teenagers were in the car.

Graduated licensing programs for new drivers were introduced in a number of states; these programs allowed 16-year-olds to drive, but not in those two high-risk situations. Analyses of the effects of the laws showed a 30% reduction in crashes involving 16-year-olds. Soon all states adopted these graduated licensing programs, and many young lives have been saved.

The Benefits of Surveillance Systems

The first step in the public health approach to any problem is to create “surveillance systems,” which are data systems that collect consistent and comparable data across sites and over time, including detailed descriptions of the circumstances of the event.

In the motor vehicle area, such data systems exist. The most important of these surveillance systems is for fatalities—the Fatality Analysis Reporting System (FARS). Every time there is a motor vehicle death anywhere on United States roads, more than 100 pieces of information are collected consistently and comparably, on issues such as the weather conditions, the type of roadway, whether the motorists were wearing seat belts, and whether the airbags deployed.

These data are publicly available, and there are federal funds for analyzing them. Thus, we know much about which policies and programs are and are not effective in reducing motor vehicle fatalities. FARS data were crucial for supporting the creation and the evaluation of the graduated licensing programs. FARS is one reason that fatalities per mile driven in the U.S. have fallen by more than 85% over the past half-century.

The importance of data systems cannot be overstated. Such systems are crucial in all areas of society. I am an economist by training and can attest to the importance of the creation of the national income accounts in the 1930s. Before that time, policymakers tried to determine the state of the economy and appropriate economic policy using limited and fragmentary information such as stock price indices and freight car loadings. At the end of the 20th century, the U.S. Department of Commerce declared the development of the national income accounts (e.g., Gross Domestic Product [GDP], Consumer Price Index [CPI]) its “achievement of the century.”

Criminal justice has created useful surveillance systems such as the National Crime Victimization Survey (NCVS), which measures the incidence of reported and unreported crime, and the FBI’s Uniform Crime Report (UCR), a census of offenses known to the police.

The FARS data system of the National Highway Traffic Safety Administration has allowed researchers to evaluate the impact of many interventions, including changes in speed limits, motorcycle helmet-wearing laws, the minimum legal drinking age, and airbag requirements, and to monitor trends in motor vehicle-related deaths, both overall and among particular groups and localities. Unfortunately, many other injury areas (drowning, falls, suicide, poisonings) do not have surveillance systems that are as good.

The National Violent Death Reporting System

Fortunately, an excellent data system on violent injuries, the National Violent Death Reporting System (NVDRS), was created in the past two decades to provide statistics on homicides and suicides, and all deaths from firearms. Under the auspices of the Centers for Disease Control and Prevention (CDC), the system assembles data from multiple sources, including death certificates, police reports, medical examiner/coroner reports, and sometimes crime lab information. The data are combined consistently and comparably, and include narratives from both medical examiner/coroner reports and police reports.
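As a rough illustration of what assembling several sources into one record per incident can look like, here is a minimal Python sketch. The field names, the shared incident identifier, and the sample values are all hypothetical; this is not the NVDRS schema or its actual linkage procedure.

```python
from collections import defaultdict

# Hypothetical source records; field names are illustrative, not the NVDRS schema.
death_certificates = [
    {"incident_id": "A1", "manner": "homicide", "age": 34},
    {"incident_id": "B2", "manner": "suicide", "age": 19},
]
police_reports = [
    {"incident_id": "A1", "relationship": "ex-boyfriend", "weapon": "handgun"},
]
examiner_reports = [
    {"incident_id": "A1", "alcohol_positive": False},
    {"incident_id": "B2", "alcohol_positive": True},
]


def link_sources(**sources):
    """Combine records from several sources into one record per incident.

    Each keyword argument is a source name mapped to a list of dicts that
    share an 'incident_id' key. Fields are prefixed with the source name so
    that no source silently overwrites another.
    """
    linked = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            incident = linked[record["incident_id"]]
            for field, value in record.items():
                if field != "incident_id":
                    incident[f"{source_name}.{field}"] = value
    return dict(linked)


linked = link_sources(
    death_certificate=death_certificates,
    police=police_reports,
    examiner=examiner_reports,
)
for incident_id, fields in sorted(linked.items()):
    print(incident_id, fields)
```

Keeping the source name on every field makes it easy to see which details (here, the victim-offender relationship) exist only because a second source was linked in.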

The NVDRS has many advantages over its main component parts. For example, NVDRS categorizes homicide deaths better than do the Supplementary Homicide Reports (SHR, the police reporting system). NVDRS is the best source for data on homicides involving intimate partner violence (IPV). Unlike the SHR, NVDRS correctly classifies the homicide of women by their ex-boyfriends as IPV and includes all the victims of IPV (e.g., the children who are killed by the husband when he kills his wife).

In addition, NVDRS has a category of homicide-followed-by-suicide, which is missing from the SHR. From NVDRS, we now know that while only 5% of homicides are followed by suicides, over half the time that a man kills his current or former intimate partner with a gun, he immediately kills himself.

NVDRS also provides, for the first time, a data system for both suicides and accidental gun deaths that gives detailed information about the incident; the Vital Statistics system provides only the information available from death certificates. NVDRS adds a great deal of useful information, such as education, veteran status, and the presence of alcohol or other drugs in the victim’s body. We now know that fewer than 5% of suicide decedents under age 18 test positive for alcohol, while 1/3 of decedents aged 18–20 (still under the legal drinking age) test positive, about the same rate as adults. We also learn that while suicide prevention efforts directed at 18- to 24-year-olds focus on college campuses, over 80% of the suicide decedents in this age group are not college students.

State NVDRS data are being used to target prevention programs and reduce deaths. In one state, NVDRS data showed that many older adult suicide victims had visited their physicians within the month before the suicide, leading to a program to help physicians identify and treat suicidal older adults. In another state, almost half of all female homicide victims were killed by intimate partners; many of the victims had contacted legal authorities for protection, and the state is now improving its intervention protocols. In another state, 40% of suicide victims had recent criminal histories, and courts are now trying to take a more proactive role in suicide prevention.

Two specific examples demonstrate the usefulness of the NVDRS, over and above the information already available from its component systems (SHR and Vital Statistics). These involve police killings of civilians and accidental gun deaths of children.

Police Killings

Compared to police in other developed countries, police in the U.S. are in great danger. The U.S. has four times the civilian population of Germany, for example, but 100 times the number of law enforcement officers killed on the job each year. The overwhelming majority of the murdered U.S. police officers are killed by civilians with guns. The FBI data system for police being killed is considered an excellent one. However, that is not the case for the killings by police.

It has now been shown that the majority of police killings of civilians in the U.S. are not reported as such in the FBI’s SHR, and the Vital Statistics system misses almost as many. We now know that many more American civilians are killed by police than anyone realized: at least 1,000 per year. For every German civilian killed by German police, our police kill more than 100 Americans.
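Both comparisons with Germany are stated as raw counts. A small Python sketch, using only the approximate ratios given in the text (about four times the population, about 100 times the deaths), shows the per-capita comparison they imply. The function name is illustrative.

```python
def per_capita_ratio(count_ratio, population_ratio):
    """Ratio of per-capita rates implied by a ratio of counts and a ratio of populations."""
    if population_ratio <= 0:
        raise ValueError("population_ratio must be positive")
    return count_ratio / population_ratio


# Approximate figures from the text: 100 times the deaths, 4 times the population.
us_vs_germany = per_capita_ratio(count_ratio=100, population_ratio=4)
print(f"Per-capita rate is about {us_vs_germany:.0f} times higher")
```

On these rounded figures, the per-capita rate in the U.S. is roughly 25 times the German rate for both officers killed and civilians killed by police.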

The SHR and Vital Statistics databases are excellent in many respects, but not for police killings. These two data systems not only miss many police killings, but miss them in a non-random way. As a result, the studies that have used these systems to try to explain variations across states in police killings of civilians are not to be trusted.
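To see why non-random undercounting is so damaging for cross-state comparisons, consider a minimal Python sketch. Every number in it is invented purely for illustration; the states, rates, and capture shares are hypothetical and do not come from any data system.

```python
# Hypothetical numbers, invented for illustration only.
# name: (true killings per million residents, share captured by the data system)
states = {
    "State A": (3.0, 0.8),
    "State B": (3.0, 0.3),
    "State C": (5.0, 0.3),
}


def observed_rate(true_rate, capture_share):
    """Rate that appears in the data when only a share of events is recorded."""
    return true_rate * capture_share


observed = {name: observed_rate(*values) for name, values in states.items()}
for name, rate in observed.items():
    true_rate = states[name][0]
    print(f"{name}: true rate {true_rate:.1f}, observed rate {rate:.1f}")
```

In this made-up example, States A and B have the same true rate but look very different in the data, and State C, which has the highest true rate, appears to have a lower rate than State A. Any analysis of the observed rates would be explaining differences in reporting, not differences in killings.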

Currently, the only comprehensive data system available for police killings of civilians is the NVDRS, but until very recently, that system was available for only 18 states. Criminologists have also examined the detailed data on police killings of civilians from a few big-city police departments. Those findings are of interest, but data from the NVDRS show that big-city police killings represent only 1/4 of police killings of civilians in the U.S. NVDRS estimates of the number of police killings are comparable to recent crowdsourcing estimates (obtained from compiling newspaper reports), but NVDRS data are collected consistently and comparably, and are thus far more useful for research than the newspaper accounts.

Unintentional Firearm Fatalities to Children

Gun advocates point out that “only” about 60 children (aged 0–14) are killed unintentionally with firearms each year in the U.S. These data come from the Vital Statistics system. But while Vital Statistics is a complete census of all violent deaths, it does not do a good job of distinguishing gun homicides from gun accidents.

A boy finds his dad’s semi-automatic pistol, takes out the magazine (which has the bullets), and believes the gun is unloaded. He plays with his younger brother, pulls the trigger, and bam, the bullet in the chamber fires and his younger brother lies dead. This is typically (but not always) reported as manslaughter, and classified as a homicide on the death certificate. Yes, the child pulled the trigger intentionally, but he thought the gun was unloaded. It was an accident. The child had no intention of hurting his brother, whom he loves. Including such incidents as the unintentional deaths that they are, NVDRS data indicate that there are 60–80% more accidental child firearm fatalities than reported in the Vital Statistics.

The studies—including one of my own—that used Vital Statistics data to explain differences in accidental gun deaths across states, or the effects of changes in state-level laws on accidental gun deaths, are not to be trusted. The Vital Statistics system is an excellent data system, but its statistics on unintentional firearm fatalities to children are not any good.

A researcher frequently cited by the gun lobby claims that “about two thirds of accidental deaths to children are not shots fired by other little kids but rather adult males with criminal backgrounds.” He never cites a source, but if this were true, then prevention efforts should probably focus on these adult males. Fortunately, now we have good data from the National Violent Death Reporting System.

When we examine those data, we find that children are usually the shooters, not adults. About 1/3 of these unintentional firearm fatalities are self-inflicted by the child victim; in 1/3 of the cases, other children are the shooters; in 1/6 slightly older teens (e.g., a 15-year-old brother) are the shooters; and only 1/6 of the time is the shooter an adult. These adults are mostly parents who sometimes shoot their children unintentionally while hunting or cleaning the firearm. There is no indication that these adults typically have criminal backgrounds.
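A quick check of the approximate shares just listed, sketched in Python with exact fractions, shows how they add up. The category labels and the grouping into "children" and "minors" are illustrative, and the shares themselves are the rounded figures from the paragraph above.

```python
from fractions import Fraction

# Approximate shooter shares as stated in the text.
shooter_shares = {
    "child victim (self-inflicted)": Fraction(1, 3),
    "other child": Fraction(1, 3),
    "slightly older teen": Fraction(1, 6),
    "adult": Fraction(1, 6),
}

total = sum(shooter_shares.values())
children = (
    shooter_shares["child victim (self-inflicted)"]
    + shooter_shares["other child"]
)
minors = children + shooter_shares["slightly older teen"]

print(f"total: {total}, children: {children}, children and teens: {minors}")
```

The shares account for all cases: about 2/3 of the shooters are children, about 5/6 are children or slightly older teens, and only about 1/6 are adults, far from the "two thirds" claimed for adult males.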

Getting good data helps inform cost-effective prevention measures. An analysis of NVDRS data shows that unlike other child age groups, children aged 2–4 typically shoot themselves. Thus, we should not only keep these toddlers away from guns (e.g., lock guns securely), but if guns are accessible, they should be child-proof. Similar to child-proof aspirin bottles—which dramatically reduced the serious problem of unintentional child poisoning—child-proof guns can be used by adults, but are very difficult for children to fire. More than a century ago, D.B. Wesson of Smith & Wesson fame produced such guns to reduce the danger of accidental injury to children.

Another easy technological solution for many unintentional firearm fatalities of children is to have magazine safeties on all semi-automatics, to ensure that when the magazine is removed, the firearm will not fire. Some firearms already have this feature.

Probably the most effective technological change would be the manufacture of “smart” guns that will fire only when used by the authorized owner.

The Need for More and Better Data on Firearm Issues

Because of limited federal funding, not all states are part of the NVDRS system. The lack of a 50-state NVDRS is a symptom of a much larger problem. Unfortunately for science and prevention, the gun lobby has successfully suppressed both firearm-related data and the funding of scientific research on firearms issues. Much useful data are simply not collected, while other data are collected but deliberately withheld from researchers.

The CDC excludes gun information from many of its data systems. For example, the Behavioral Risk Factor Surveillance System (BRFSS), which was created in the mid-1980s, is the largest continually collected health survey in the world, with all 50 states participating. More than 400,000 individuals provide information each year on such issues as sleep, alcohol and tobacco use, immunizations, falls, seatbelt use, oral health, HIV, and other health-related issues.

The BRFSS provides much of our knowledge about trends in behavioral health issues. Questions on guns were discontinued in 2004, even though gun fatalities at the time were the second leading cause of injury death, following only motor vehicle-related injuries. Thus, we no longer have good information about household gun prevalence in states, or how guns are stored. That makes it difficult to determine the effect of many state-level gun control laws.

Types of firearms information collected but deliberately withheld from researchers include Bureau of Alcohol, Tobacco, Firearms, and Explosives (ATF) gun-tracing data and state-level concealed-carry permit data.

In addition, federal funding for firearms research is incredibly low. Allies of the gun lobby in Congress have cowed the CDC. The CDC provides virtually no funding for firearms research, staff are afraid to say the word “guns” at public health meetings, and the CDC director remains silent about guns, even as mass shootings at schools and elsewhere increase. It has been estimated that public funding for gun research is 2% of what would be expected given the size of the public health problem.

Not surprisingly, there has been relatively little academic research on this public health issue compared to the enormity of the problem, so not nearly enough is known about many important issues, including gun theft, gun training, gun storage, guns on college campuses, open gun carrying, concealed carry, gun intimidation, gun use in self-defense, guns at work, liability laws and guns, insurance and guns, gun trafficking, straw purchases, guns in suburbia, police discretion, machine guns, effects of gun laws, etc. When reporters delve into any specific firearms issue, they are astounded at how little is known once one scratches the surface.

For example, at this writing, even though it is estimated that 300,000 to 500,000 guns are stolen each year, we have just this year published the very first journal article focusing on gun theft. There is still little known about the who, why, when, where, and how of gun theft—the most common way that guns enter the illegal market.


Over the years, I have become increasingly convinced of the importance of good data systems. I had some involvement with the creation of the NVDRS. Looking back on it, that may well be the most important activity of my research career. I wish I had had more knowledge about data systems then; I might have been able to make a more substantive contribution.

I know how to analyze data. For example, my research group has used NVDRS data for many studies, including child homicide perpetration, infant homicide victimization, and helium suicides. We recognize the limitations of this and other surveillance systems, but we have no easy way of helping to make them better.

Most surveillance systems have serious, often well-known deficiencies. The GDP (our national income statement) measure, for example, is far from ideal. Unlike all business “income statements,” the GDP fails to register losses in inventory. It takes no notice when we use up our scarce resources, such as oil, coal, natural gas, ground water, clean air, fish, buffalo, or beaver. It makes no record when hurricanes or floods destroy homes or factories. It would be difficult for a business to succeed using such an incomplete measure of income.

Crime data systems also have severe limitations. For example, not only does the UCR miss crimes not reported to police, but some police do not report all crimes in their areas to the UCR. The NCVS fails to ask crime victims directly if they used a gun or other specific weapons in self-defense.

While there are sporadic attempts to improve these data systems, we do not appear to have a systemic approach, as a society, to improving the data. There seems to be nothing like a “continuous quality improvement” approach to ensuring that these systems become and remain topnotch. The problem, I believe, is that these data systems and the scientists who create them are not given enough credit or respect.

At Harvard, there are literally hundreds of courses on how to analyze data. These are important courses, but there does not seem to be a single course on how to create, manage, and improve data systems. I believe this is generally true at most other U.S. universities. Yet we all know the notion of “garbage in, garbage out.” I believe we as a society should be devoting more resources to training statisticians on the important topic of data system development and improvement, and paying more attention to the creation and maintenance of good surveillance systems.

Further Reading

Barber, C., Azrael, D., and Hemenway, D. 2013. A truly national National Violent Death Reporting System. Injury Prevention 19:225–26.

Hemenway, D. 2009. While We Were Sleeping: Success Stories in Injury and Violence Prevention. Berkeley, CA: University of California Press.

Hemenway, D., Barber, C.W., Gallagher, S.S., and Azrael, D.R. 2009. Creating a national violent death reporting system: a successful beginning. American Journal of Preventive Medicine 37:68–71.

Kellermann, A.L., and Rivara, F.P. 2013. Silencing the science on gun research. JAMA 309:549–50.

About the Author

David Hemenway is professor of health policy at the Harvard School of Public Health and director of the Harvard Injury Control Research Center. In 2012, he was recognized by the Centers for Disease Control as one of the 20 “most influential injury and violence professionals over the past 20 years.”
