WC.com

Tuesday, May 29, 2018

Approaches to Data

Statistics are fascinating, and they are often the subject of quips and criticism. Disraeli said, "There are three types of lies: lies, damn lies, and statistics." Twain said that "facts are stubborn things, but statistics are pliable." And, though it is possible that pliability is a reality, I wonder if the real flexibility of statistics comes down to our individual perspectives.

There was a cacophony last summer when it entered the public consciousness that American death rates were increasing. The reporting increased in 2017, but the news had broken in 2016, as reported by the New York Times. The Times hinted at drug overdose, suicide, Alzheimer's disease, and heart disease. It cautioned that the figures were preliminary.

In 2017, the focus had cleared somewhat. The "primary causes" of increasing death rates were "drug overdoses and alcohol-related deaths," according to the Washington Post. The Post noted that before 2010 "Opioid and alcohol-related deaths were primarily observed among whites who lived in small cities and rural areas." But, since 2010 death rates for "urban demographics" including "whites, blacks and Hispanics" increased. 

Similarly, in 2017 the Washington Post noted that the U.S. fertility rate in 2016 "hit a historic low." That article noted a perceived trend in declining birth rates, particularly among "teens and twenty-somethings." Conversely, "the birthrate for women in their 30s and 40s increased." The paper predicted potential "economic and cultural turmoil" if the trend continued. It noted, without citing any supporting evidence or source, that the goal should be a "replacement level" of births so that the population overall "neither grows nor shrinks." 

Recently, CBS News reported that the trend continued in 2017 with birth rates declining for "women in their teens, 20s and - surprisingly - their 30s." While CBS acknowledged that decreases have been the trend since 2014, it noted that "2017 saw the greatest year-to-year drop." So, U.S. birth rates have purportedly dropped to levels not seen since 1987.

CBS prognosticated that millennials may be to blame (must we always blame everything on the millennials?). It suggested that this generation has "shifting attitudes" regarding starting families. It also suggested that "changes in the immigrant population" could be a cause. It noted that "Asians are making up a larger proportion of immigrants, and they have typically had fewer children." CBS mentioned the "replacement level," concluding that the "U.S. now stands less than the standard benchmark for replacement."

Early in 2018, the New York Times reported that the fertility "replacement rate in developed countries is around 2.1." That story noted the U.S. rate for a period ending September 2017 was 1.77, but acknowledged the U.S. rate's "most recent peak" was "2.12 in 2007." The Times blames the postponement of marriage (perhaps a veiled Millennial dig). But, it contends that fertility rates "have not changed very much over the last 15 years" when the analysis is "controlled for marital status." The Times also credits contraception (noting that "emergency contraception" use has increased, a process "some consider to be abortions but are not counted in official abortion statistics"). 

A commentator on Fox News opined that "those having the most children are least able to pay for their upbringing," citing 2015 birth rate statistics delineating rates among various income strata. The commentator concedes that income does not equate to quality parenting, but contends that low income may mean "struggle," and that there may be reliance on "taxpayers to finance their upbringing." 

Though one of those is admittedly an "opinion" piece, perhaps they each bring a perspective without doing the topic justice. Pew Research brings yet another perspective. In its January 2018 analysis, "Is U.S. Fertility at an All-Time Low? It Depends," Pew suggests that "hand-wringing" may be premature, and explains that the analysis of fertility rate is "complicated," more so than perhaps the news media has explained. Pew outlines three different fertility measures, and concedes that none is " 'right' or 'wrong,' but each tells a different story." None of the measures is driven by the sheer size of the population (each expresses births relative to the number of women, resulting in a rate rather than a raw total).

Pew notes that the "general fertility rate," or GFR, "is affected by changes in the age distribution among women of childbearing age." This measure will be higher when the "share of women in their peak childbearing years" is higher (and vice versa). And, it notes that the GFR has decreased "in part" due to the "great recession" that we borrowed, stimulated, and TARPed our way through from 2007 to 2015. In other words, Pew hypothesizes that the economy has contributed to lower birth rates among younger Americans. Younger Americans are at the beginning of their economic independence, recession-experienced, and wary. Pew seems to put this on younger people's experience, and not on their "Millennial" label.
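To see why the age mix matters, consider a minimal sketch. It assumes the conventional definition of the GFR (births per 1,000 women aged 15 to 44), and every number in it is invented for illustration; none of it is Pew or government data. The two hypothetical populations behave identically at every age, yet they produce different GFRs simply because one has more women in the peak childbearing years.

```python
# Hypothetical age-specific birth rates (births per 1,000 women per year).
# These values are invented for illustration only.
AGE_SPECIFIC_RATES = {
    "15-19": 20.0,
    "20-24": 70.0,
    "25-29": 100.0,
    "30-34": 100.0,
    "35-39": 50.0,
    "40-44": 10.0,
}


def general_fertility_rate(women_by_age, rates_per_1000=AGE_SPECIFIC_RATES):
    """Return births per 1,000 women aged 15-44 for a given age mix."""
    births = sum(
        women_by_age[age] * rates_per_1000[age] / 1000.0 for age in women_by_age
    )
    women = sum(women_by_age.values())
    return 1000.0 * births / women


# Two hypothetical populations of 600,000 women each.
# Same birth rates at every age; only the age distribution differs.
peak_heavy = {
    "15-19": 50_000, "20-24": 100_000, "25-29": 150_000,
    "30-34": 150_000, "35-39": 100_000, "40-44": 50_000,
}
edge_heavy = {
    "15-19": 150_000, "20-24": 100_000, "25-29": 50_000,
    "30-34": 50_000, "35-39": 100_000, "40-44": 150_000,
}

gfr_peak = general_fertility_rate(peak_heavy)
gfr_edge = general_fertility_rate(edge_heavy)

print(f"GFR with more women in peak years: {gfr_peak:.1f}")
print(f"GFR with fewer women in peak years: {gfr_edge:.1f}")
```

Nothing about anyone's decisions differs between the two groups, so a falling GFR alone cannot tell us whether attitudes have changed or the population has simply aged.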

Pew also encourages us to focus on a second measure of the fertility rate, the "completed fertility rate," or CFR. This one ignores the age of the mother at the time of birth. The CFR measures "the number of children a woman has in her lifetime." That measure demonstrated its recent American nadir in 2006. Thus, the CFR does not support the claim that U.S. fertility is currently at the "30-year low" proclaimed by CBS News.

The third measure is more important, perhaps. Pew explains that the "total fertility rate," or TFR, is an "estimate of lifetime fertility." This one does not count per se but instead uses current fertility patterns to estimate or forecast what birthrates will be. And, it is this TFR that is "most commonly used to characterize 'replacement fertility.'" Thus, while some reports have perhaps been focused on the GFR and perhaps "hand-wringing" about the "replacement level," there has been a seeming disconnect. The news has perhaps mixed two concepts, or at least has not explained well that it is TFR and "replacement level" that generally go together. None of the cited news stories explain the GFR/TFR distinction, nor that "replacement level" is more related to TFR (a prognostication, not fact).
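The arithmetic behind a TFR can be sketched briefly. This assumes the usual textbook construction, in which the current birth rate for each five-year age group is multiplied by five and the results are summed, as though one hypothetical woman lived through every age at today's rates. The rates below are invented for illustration, and the 2.1 benchmark is the "around 2.1" figure the Times cited.

```python
# Hypothetical current birth rates (births per 1,000 women per year) by age group.
# These values are invented for illustration only.
CURRENT_RATES_PER_1000 = {
    "15-19": 20.0,
    "20-24": 70.0,
    "25-29": 100.0,
    "30-34": 100.0,
    "35-39": 50.0,
    "40-44": 10.0,
}

# Approximate replacement benchmark for developed countries, per the Times story.
REPLACEMENT_LEVEL = 2.1


def total_fertility_rate(rates_per_1000, years_per_group=5):
    """Estimate lifetime births per woman if today's rates held at every age."""
    return years_per_group * sum(rates_per_1000.values()) / 1000.0


tfr = total_fertility_rate(CURRENT_RATES_PER_1000)
print(f"Estimated TFR: {tfr:.2f} children per woman")
print(f"Below replacement level: {tfr < REPLACEMENT_LEVEL}")
```

The result is a projection built from a single year's behavior. No actual woman has lived that hypothetical life, which is why the figure is an estimate and not a count.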

Pew contends that TFR has been overestimated at times, including during the "baby boom." Likewise, it suggests that "today’s TFRs may be underestimating what completed fertility" will actually look like. Pew notes that while GFR and TFR have been decreasing, the CFR in America "has risen slightly." Pew suggests that overall, during child-bearing years, the rate of birth may not be decreasing as portrayed. The age-defined measure (GFR) and the prognostication (TFR) are decreasing, while the actual lifetime number of births per woman (CFR) is not decreasing.

This analysis supports the view that current birth figures are decreasing, but that over time births will perhaps be consistent with historic figures. The Pew perspective suggests that those births will merely occur later in the mother's life, thus leading to lower rates today (while these mothers-to-be are younger) and a later demonstration of higher rates.
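A deliberately stylized sketch may help illustrate the postponement point. It is a toy model of my own construction, not Pew's method: every woman in every birth cohort ends up with exactly two children, but women born in or after a hypothetical cutoff year (1980 here) have them five years later in life. The single-year snapshot dips even though lifetime fertility never changes.

```python
# Toy model: all numbers are hypothetical and chosen for easy arithmetic.
BIRTHS_PER_1000 = 200   # births per 1,000 women in each year of the window
WINDOW_YEARS = 10       # each cohort has its births over a 10-year window
SHIFT_COHORT = 1980     # hypothetical first cohort that postpones childbearing


def window_start(birth_year):
    """Age at which a cohort starts having children (20, or 25 if postponing)."""
    return 20 if birth_year < SHIFT_COHORT else 25


def cohort_rate(birth_year, age):
    """Births per 1,000 women of a given cohort at a given age."""
    start = window_start(birth_year)
    return BIRTHS_PER_1000 if start <= age < start + WINDOW_YEARS else 0


def completed_fertility(birth_year):
    """Lifetime births per woman for one cohort (the CFR idea)."""
    return sum(cohort_rate(birth_year, age) for age in range(15, 45)) / 1000


def period_tfr(calendar_year):
    """Single-year snapshot across all ages (the TFR idea)."""
    return sum(
        cohort_rate(calendar_year - age, age) for age in range(15, 45)
    ) / 1000


for year in (1995, 2005, 2015):
    print(year, "snapshot estimate:", period_tfr(year))

for cohort in (1970, 1980, 1990):
    print(cohort, "cohort lifetime births per woman:", completed_fertility(cohort))
```

In this toy setup the snapshot falls to 1.0 in 2005 and returns to 2.0 by 2015, while every cohort finishes with 2.0 children per woman. Real populations are far messier, but the sketch shows how a delay alone can look like a decline.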

The entire discussion illustrates some critical points on statistical analysis. First, deciding what to count is critical. An example is the inclusion or non-inclusion of emergency contraception in the abortion statistics. Second, there may be more than one way to analyze data even after the data set is defined. The example is the three measures explained by Pew. Third, reporting may be disconnected or mis-connected. An example is the news media "hand-wringing" over "replacement level," based on a metric (GFR) that is not "most commonly" used for "replacement level."

A final note worthy of consideration is that Pew, and the news stories, attempt to delineate social contributors ("Millennials"), economic factors (Great Recession), demographics (ethnicity, marital status, etc.), and science (birth control, emergency or not). Thus, the statisticians are identifying trends, and striving to connect their mathematical conclusions to human and emotional foundations. But, understanding math is perhaps far simpler than understanding and predicting individual human emotions and decisions.

The media's analyses of both the birth rate and the death rate strive to combine social analysis with statistical analysis. The statistical analysis is capable of measuring only what is, and perhaps conjecturing thereon what is likely to be. The social analysis is an attempt to understand why the counts and calculations are what they are.

The figures are mathematically precise, but nonetheless amenable to misinterpretation. How many are born and how many die can be counted. It should be precise and scientific, though Pew instructs us it is not. But more so, perhaps the interpretation of statistics, the "why," is so dependent upon sociology, perspective, and definition that Twain perceived them as "pliable," or flexible, and Disraeli derided them as somehow worse than "damn lies." And, the discussion suggests that for some, a measure of the "why" springs from perceptions, prejudices, and oversimplified social conclusions.