Using Neighborhood-Level Census Data to Predict Diabetes Progression in Patients with Laboratory-Defined Prediabetes


Julie A Schmittdiel, PhD; Wendy T Dyer, MS;
Cassondra J Marshall, DrPH, MPH;
Roberta Bivins, PhD

Perm J 2018;22:18-096
E-pub: 10/05/2018


Context: Research on predictors of clinical outcomes usually focuses on the impact of individual patient factors, despite known relationships between neighborhood environment and health.
Objective: To determine whether US census information on where a patient resides is associated with diabetes development among patients with prediabetes.
Design: Retrospective cohort study of all 157,752 patients aged 18 years or older from Kaiser Permanente Northern California with laboratory-defined prediabetes (fasting plasma glucose, 100 mg/dL-125 mg/dL, and/or glycated hemoglobin, 5.7%-6.4%). Using logistic regression controlling for age, sex, race/ethnicity, blood glucose levels, and body mass index, we assessed whether census data on education, income, and percentage of households receiving benefits through the US Department of Agriculture’s Supplemental Nutrition Assistance Program (SNAP) were associated with diabetes development.
Main Outcome Measure: Progression to diabetes within 36 months.
Results: Patients were more likely to progress to diabetes if they lived in an area where less than 16% of adults had obtained a bachelor’s degree or higher (odds ratio [OR] = 1.22, 95% confidence interval [CI] = 1.09-1.36), where median annual income was below $79,999 (OR = 1.16, 95% CI = 1.03-1.31), or where SNAP benefits were received by 10% or more of households (OR = 1.24, 95% CI = 1.10-1.41).
Conclusion: Area-level socioeconomic and food assistance data predict the development of diabetes, even after adjusting for traditional individual demographic and clinical factors. Clinical interventions should take these factors into account, and health care systems should consider addressing social needs and community resources as a path to improving health outcomes.

Introduction

Up to one-third of Americans have prediabetes,1 a state of elevated blood glucose levels that increases the risk of development of Type 2 diabetes. Clinical trials such as the Diabetes Prevention Program have shown that lifestyle changes and initiation of metformin therapy in patients with prediabetes can prevent or delay the onset of Type 2 diabetes2-4 and that these prevention efforts may be cost-effective and improve health outcomes.5,6

Understanding the potential predictors for development of diabetes and other chronic conditions can help clinicians and health care systems design interventions and target clinical responses to patients at elevated disease risk.7-9 However, most studies of diabetes risk focus on individual patient-level factors10-17 and do not consider patient social context. Recent research suggests a relationship between the characteristics of where individuals reside and their short-term and long-term health outcomes,18-20 specifically diabetes risk and development.21,22 Most of the neighborhood information collected in these studies comes from data sources that are not readily available on a national scale, such as regional or small-scale national surveys, or it requires additional computational analysis, such as geographic information system mapping. Census data, which are readily available for linkage at the patient level in Kaiser Permanente (KP) and other health care systems, are rarely leveraged systematically in predicting patient health risk and are often not incorporated into diabetes prediction tools used in primary care practice.23 The importance of census block-level and tract-level data in predicting diabetes risk is largely unknown.

The purpose of this study is to determine whether US census data on where a patient resides is associated with the development of diabetes in a prediabetes population after adjustment for traditional demographic and clinical factors.

Research Design and Methods

This retrospective cohort study analyzed data from KP Northern California (KPNC), a large integrated health care delivery system with more than 4 million members. The primary data source was the integrated electronic health record (EHR), which combines diagnosis, utilization, pharmacy, and laboratory records. We identified all patients aged 18 years and older with laboratory-defined prediabetes (fasting plasma glucose [FPG] of 100 mg/dL-125 mg/dL and/or glycated hemoglobin [HbA1C] of 5.7%-6.4%) diagnosed between January 1, 2006, and December 31, 2010.24-27 To create an incident prediabetes cohort, we then excluded all patients who had tested in this range in the 2 years prior, those with a preexisting diagnosis of diabetes or prediabetes during this period, and those whose prediabetes converted to diabetes within the first 6 months. Patients were required to have at least 2 years of continuous Health Plan enrollment before the index laboratory date (ie, first elevated FPG or HbA1C value) and for 36 months after the index date. Further information on this cohort is available elsewhere.26,27 These laboratory values were obtained from the KPNC EHR, along with other patient demographic and clinical characteristics.
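The laboratory thresholds above can be written as a small check. The following is only an illustrative sketch in Python (the study itself used SAS). The function name and argument names are hypothetical, and the sketch covers the laboratory range test alone, not the exclusions for prior testing, preexisting diagnoses, early conversion, or enrollment.

```python
from typing import Optional

# Ranges as stated in the text; the names below are hypothetical.
FPG_RANGE_MG_DL = (100.0, 125.0)
HBA1C_RANGE_PCT = (5.7, 6.4)


def is_lab_defined_prediabetes(fpg_mg_dl: Optional[float] = None,
                               hba1c_pct: Optional[float] = None) -> bool:
    """Return True if either available laboratory value is in the prediabetes range.

    A missing value (None) is treated as "not tested", which mirrors the
    "and/or" wording of the definition.
    """
    fpg_hit = (fpg_mg_dl is not None
               and FPG_RANGE_MG_DL[0] <= fpg_mg_dl <= FPG_RANGE_MG_DL[1])
    a1c_hit = (hba1c_pct is not None
               and HBA1C_RANGE_PCT[0] <= hba1c_pct <= HBA1C_RANGE_PCT[1])
    return fpg_hit or a1c_hit
```

A real cohort definition would apply this check to the first qualifying laboratory result (the index date) and then apply the exclusion and enrollment rules described above.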

The primary source for the census data used in this study is the American Community Survey 5-year Summary File for 2006 to 2010. The American Community Survey is conducted as part of the US Census Bureau’s Decennial Census Program, which is designed to provide demographic, socioeconomic, and housing data on the US population for geographic areas in the US, including Puerto Rico.28 A subset of these variables is included in the KP Virtual Data Warehouse and is available for epidemiologic and health services research and quality improvement in all KP Regions. We included Virtual Data Warehouse census block-level variables from the 2010 census on the education level of the adult population (aged ≥ 25 years) and median household income. We also included Virtual Data Warehouse census tract-level data on the percentage of households receiving food assistance through the US Department of Agriculture’s Supplemental Nutrition Assistance Program (SNAP). These census variables were chosen for inclusion on the basis of research suggesting a relationship between access to food and diabetes outcomes.29-33
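To illustrate how area-level variables at two geographic levels might be attached to one patient record, here is a minimal Python sketch. It assumes each patient address has already been geocoded to a 12-digit census block group identifier (GEOID), and it relies on the Census Bureau convention that the first 11 digits of that identifier are the tract GEOID. The field names, lookup tables, and values are hypothetical and are not taken from the Virtual Data Warehouse.

```python
from typing import Dict, Mapping, Optional


def tract_geoid(block_group_geoid: str) -> str:
    """Derive the 11-digit tract GEOID from a 12-digit block group GEOID."""
    if len(block_group_geoid) != 12 or not block_group_geoid.isdigit():
        raise ValueError("expected a 12-digit block group GEOID")
    return block_group_geoid[:11]


def attach_census(block_group_geoid: str,
                  block_group_vars: Mapping[str, Mapping[str, float]],
                  tract_vars: Mapping[str, Mapping[str, float]]
                  ) -> Dict[str, Optional[float]]:
    """Look up education and income by block group and SNAP receipt by tract.

    Missing areas yield None rather than an error so that unmatched
    patients can be counted and handled explicitly.
    """
    bg = block_group_vars.get(block_group_geoid, {})
    tr = tract_vars.get(tract_geoid(block_group_geoid), {})
    return {
        "pct_bachelors": bg.get("pct_bachelors"),
        "median_income": bg.get("median_income"),
        "pct_snap": tr.get("pct_snap"),
    }
```

Keeping the block group and tract tables separate reflects the fact that the SNAP variable was available at a coarser geographic level than education and income.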

Statistical Analyses

To examine the relationship of demographic, clinical, and census variables with diabetes progression, we used logistic regression to obtain estimates of odds ratios (ORs) with 95% confidence intervals (CIs). The logistic regression model included patient age, sex, race/ethnicity, body mass index (BMI), index FPG or HbA1C laboratory result, census block-level median household income, census block-level percentage of adults with a bachelor’s degree or higher, and census tract-level percentage of households receiving SNAP benefits. We also estimated a model without the census variables, using only age, sex, race/ethnicity, BMI, and index blood glucose laboratory results, to compare the C statistic of this pared-down model with that of the main model described. All analyses were performed using SAS Version 9.3 (SAS Institute, Inc, Cary, NC). This study was approved by the KPNC institutional review board.
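For readers less familiar with how ORs and CIs follow from a logistic regression, the sketch below shows the standard Wald conversion: the OR is the exponential of a coefficient, and the 95% CI is the exponential of the coefficient plus or minus 1.96 standard errors. This is a generic Python illustration, not the authors' SAS code. The function names are hypothetical, and whether the study used Wald or another interval type is not stated in the text.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(beta: float, se: float,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Recover the implied standard error of the log odds ratio from a Wald CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)
```

Because a Wald interval is symmetric on the log scale, the reported OR should sit near the geometric mean of its interval limits, which gives a quick consistency check on published estimates such as the OR of 1.22 (95% CI = 1.09-1.36) given in the abstract.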

Results

The cohort included 157,752 patients with prediabetes, with a mean age of 57 years (standard deviation = 14 years); 50% were women, and 59% were non-Hispanic white (Table 1). An average of 4.1% of households received food assistance through SNAP, with a higher proportion receiving assistance in the group in whom diabetes developed in the 36-month observation window (5.1%). In the logistic regression model (Table 2), those aged 40 years and older had statistically significantly higher ORs for diabetes development compared with those aged 18 to 29 years. Black/African American, Asian, Hispanic, American Indian/Alaska Native, and Native Hawaiian/Pacific Islander patients all had statistically significantly higher ORs for developing diabetes compared with whites. Overweight/obesity, an index FPG value above 110 mg/dL, and an index HbA1C greater than 6.0% also were independently and significantly associated with diabetes developing within 36 months of prediabetes identification.

After adjustment for these patient-level characteristics, patients with prediabetes were also more likely to progress to diabetes if they lived in an area where 45% or less of the adult population had obtained a bachelor’s degree or higher (eg, OR = 1.22; 95% CI = 1.09-1.36 for block groups with < 16% obtaining a bachelor’s degree or higher). Patients with prediabetes living in areas where median household incomes were $50,000 to $79,999 had higher odds of progression to diabetes compared with those living in areas with median incomes of $120,000 or more (OR = 1.16; 95% CI = 1.03-1.31). Our results also showed that patients living in an area where SNAP benefits were received by 10% or more of households had higher odds of progression to diabetes within 36 months (OR = 1.24; 95% CI = 1.10-1.41). The C statistic for the model that included the census information indicated slightly higher predictive value than that for the model with age, sex, BMI, blood glucose, and race/ethnicity only (0.77 vs 0.76, data not shown).
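The C statistic compared above is the probability that a randomly chosen patient who progressed to diabetes received a higher predicted risk than a randomly chosen patient who did not. The following Python sketch computes it directly from that definition. It is meant only to make the quantity concrete, it is not how the study computed it, and the function name is hypothetical. The pairwise loop is quadratic, so a rank-based method would be preferable for a cohort of this size.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Concordance probability for a binary outcome.

    Counts (case, non-case) pairs in which the case has the higher
    predicted risk; tied predictions count as half a concordant pair.
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    noncases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for case_score in cases:
        for noncase_score in noncases:
            if case_score > noncase_score:
                concordant += 1.0
            elif case_score == noncase_score:
                concordant += 0.5
    return concordant / (len(cases) * len(noncases))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so a change from 0.76 to 0.77 represents a modest gain in the model's ability to rank patients by risk.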

Discussion

Most studies that examine predictors of diabetes risk focus exclusively on individual-level demographic and clinical factors. Area-level socioeconomic characteristics are rarely included, despite evidence that the socioeconomic characteristics of a person’s residential area are strong determinants of health status. Prior research that addressed the impact of neighborhood factors on health and diabetes risk derived information from small and nonrepresentative data sources that are not readily available on a large scale and/or that involve additional computational analysis such as geographic information system mapping.18-22,29,31,32 Our study added socioeconomic and food assistance information from the US census to traditional predictors of diabetes progression and found that education, income, and receipt of SNAP benefits were all significant predictors of progression to diabetes within 36 months. This finding suggests that leveraging readily available census data may improve the ability to predict diabetes progression and help physicians and Health Plans target prevention strategies to those who need them most. Other research has suggested that using EHR-based information as a tool for targeting diabetes prevention outreach can improve preventive care.34 The findings of this study, based on EHR data and other administrative data readily available on KP patients, support the assertion that these data can be used to identify patients who may be at high risk of diabetes development. These results also suggest that supplementing EHR data with census data readily available to Health Plans may add useful information to these efforts.

More than 3 trillion dollars are spent on health care in the US each year, representing 18% of the country’s gross domestic product.30 Most of these resources are traditionally focused on providing direct medical care, with less spent on addressing patients’ social and economic needs or environmental conditions that contribute to health status.30 Recent research has postulated that rebalancing some of these resources to address social needs may improve health care and health equity in the US.30 Results of our study, which show that socioeconomic factors and food assistance needs are directly associated with worse health outcomes, suggest that directing health care resources toward social needs may improve the health of the US population.

Previous research findings suggest that food insecurity (defined as a limited access to nutritious food based on cost) is associated with a wide range of chronic diseases and their complications, and that increasing access to healthy foods may improve the health of patients and their families.29-33,35 We found that the percentage of households receiving SNAP benefits in a patient’s neighborhood is significantly related to that individual patient’s risk of developing diabetes, independently of other factors, including neighborhood income and educational attainment. This finding suggests that addressing food needs and food insecurity may reduce diabetes risk as well.

Health care systems may have a direct role to play in addressing these community and individual social needs. The UK National Health Service was founded in part on the idea that preventing disease required a holistic approach that incorporated attention to environment, well-being, diet, housing, and clinical care.36 The UK National Health Service currently allows its general practitioners to employ “social” prescribing for direct provision of healthy foods and other nonnutritional services as well.37 Research evidence has shown that social prescribing of healthy foods, fruits, and vegetables through patient discounts on fruit and vegetable purchases reinforces the link between food intake and health.38 Health care policy leaders in the US have recently suggested that a “place-based” approach that makes health care delivery and public health systems accountable for improving population health might be a promising avenue for American health care policy as well.39 Our current findings underscore the important effect of place on individual patient disease risk. Efforts by US health care systems to directly address food insecurity and increase access to healthy eating resources may be important strategies for improving population-level disease prevention and care.

This study has limitations that should be noted. Although the inclusion of census variables on education, income, and food assistance increased the predictive power of the logistic regression model predicting progression to diabetes among patients with prediabetes, it did so by a relatively small amount. It is possible that individual-level socioeconomic data and social needs data would have increased the predictive power of these variables for diabetes progression. Future research should work to collect more refined measures of both individual and community-level socioeconomic indicators, social needs, and resource measures on a systematic basis; to further understand the relationship between place of residence and socioeconomic factors; and to incorporate them into planning patient care. Although prior work has suggested that greater neighborhood “walkability” may also be a place-based variable related to lower rates of diabetes incidence,40 this variable was not available for inclusion in our analysis. We limited our inclusion of census data to 3 variables on the basis of the current literature; it is possible that other census variables are also significantly associated with diabetes progression.

Furthermore, our findings are from a single health care delivery system within 1 state (California), which may limit generalizability. The percentage of households in a census tract receiving SNAP assistance in our sample (4.09%) was less than the percentage receiving SNAP assistance in the State of California as a whole (7.4%, with a margin of error of 0.1%)41; this may limit generalizability as well. Finally, our results show the statistical significance of including census data in a model for only 1 outcome (diabetes progression). Future research and quality improvement efforts should test the predictive power of using census data and other information on social and resource needs on a wider range of patient-centered health outcomes.

Conclusion

Census information on socioeconomic status and receipt of public food assistance predict diabetes development in patients with prediabetes, even after adjusting for traditional individual demographic and clinical factors. Clinical interventions should take these factors into account, and health care systems should consider addressing social needs and community resources as a path to improving individual and population-level health outcomes.

Disclosure Statement

The author(s) have no conflicts of interest to disclose.

Author Contributions

Julie A Schmittdiel, PhD, supervised all aspects of conceptualizing the study design and data analysis and wrote the first draft of the manuscript. Cassondra J Marshall, DrPH, MPH, contributed to creating the conceptual framework, interpreting the analysis results, and drafting the manuscript. Wendy T Dyer, MS, performed all data analysis and assisted in drafting the manuscript. Roberta Bivins, PhD, contributed to creating the conceptual framework, interpreting the analysis results, and drafting the manuscript. All authors approved the final version of the manuscript submitted.


Kathleen Louden, ELS, of Louden Health Communications provided editorial assistance.

How to Cite this Article

Schmittdiel JA, Dyer WT, Marshall CJ, Bivins R. Using neighborhood-level census data to predict diabetes progression in patients with laboratory-defined prediabetes. Perm J 2018;22:18-096. DOI:

1.    Cowie CC, Rust KF, Byrd-Holt DD, et al. Prevalence of diabetes and impaired fasting glucose in adults in the US population: National Health and Nutrition Examination Survey 1999-2002. Diabetes Care 2006 Jun;29(6):1263-8. DOI:
    2.    Tuomilehto J, Lindström J, Eriksson JG, et al; Finnish Diabetes Prevention Study Group. Prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. N Engl J Med 2001 May 3;344(18):1343-50. DOI:
    3.    Knowler WC, Barrett-Connor E, Fowler SE; Diabetes Prevention Program Research Group. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. N Engl J Med 2002 Feb 7;346(6):393-402. DOI:
    4.    Pan XR, Li GW, Hu YH, et al. Effects of diet and exercise in preventing NIDDM in people with impaired glucose tolerance. The Da Qing IGT and Diabetes Study. Diabetes Care 1997 Apr;20(4):537-44. DOI:
    5.    Hoerger TJ, Hicks KA, Sorenson SW, et al. Cost-effectiveness of screening for pre-diabetes among overweight and obese US adults. Diabetes Care 2007 Nov;30(11):2874-9. DOI:
    6.    Herman WH, Edelstein SL, Ratner RE, et al; Diabetes Prevention Program Research Group. Effectiveness and cost-effectiveness of diabetes prevention among adherent participants. Am J Manag Care 2013 Mar;19(3):194-202.
    7.    Cichosz SL, Johansen MD, Hejlesen O. Toward big data analytics: Review of predictive models in management of diabetes and its complications. J Diabetes Sci Technol 2015 Oct 14;10(1):27-34. DOI:
    8.    Lagani V, Koumakis L, Chiarugi F, Lakasing E, Tsamardinos I. A systematic review of predictive risk models for diabetes complications based on large scale clinical studies. J Diabetes Complications 2013 Jul-Aug;27(4):407-13. DOI:
    9.    Raghupathi W, Raghupathi V. Big data analytics in healthcare: Promise and potential. Health Inf Sci Syst 2014 Feb 7;2:3. DOI:
    10.    Hippisley-Cox J, Coupland C, Robson J, Sheikh A, Brindle P. Predicting risk of type 2 diabetes in England and Wales: Prospective derivation and validation of QDScore. BMJ 2009 Mar 17;338:b880. DOI:
    11.    Schulze MB, Hoffmann K, Boeing H, et al. An accurate risk score based on anthropometric, dietary, and lifestyle factors to predict the development of type 2 diabetes. Diabetes Care 2007 Mar;30(3):510-5. DOI:
    12.    Tuomilehto J, Lindström J, Hellmich M, et al. Development and validation of a risk-score model for subjects with impaired glucose tolerance for the assessment of the risk of type 2 diabetes mellitus—The STOP-NIDDM risk-score. Diabetes Res Clin Pract 2010 Feb;87(2):267-74. DOI:
    13.    Gray LJ, Davies MJ, Hiles S, et al. Detection of impaired glucose regulation and/or type 2 diabetes mellitus, using primary care electronic data, in a multiethnic UK community setting. Diabetologia 2012 Apr;55(4):959-66. DOI:
    14.    Anderson JP, Parikh JR, Shenfeld DK, et al. Reverse engineering and evaluation of prediction models for progression to type 2 diabetes: An application of machine learning using electronic health records. J Diabetes Sci Technol 2015 Dec 20;10(1):6-18. DOI:
    15.    Chen X, Wu Z, Chen Y, et al. Risk score model of type 2 diabetes prediction for rural Chinese adults: The Rural Deqing Cohort Study. J Endocrinol Invest 2017 Oct;40(10):1115-23. DOI:
    16.    Sung KC, Ryu S, Sung JW, et al. Inflammation in the prediction of type 2 diabetes and hypertension in healthy adults. Arch Med Res 2017 Aug;48(6):535-45. DOI:
    17.    Leong A, Daya N, Porneala B, et al. Prediction of type 2 diabetes by hemoglobin A1C in two community-based cohorts. Diabetes Care 2018 Jan;41(1):60-8. DOI:
    18.    Dwyer-Lindgren L, Bertozzi-Villa A, Stubbs RW, et al. Inequalities in life expectancy among US counties, 1980 to 2014: Temporal trends and key drivers. JAMA Intern Med 2017 Jul 1;177(7):1003-11. DOI:
    19.    Auchincloss AH, Mujahid MS, Shen M, Michos ED, Whitt-Glover MC, Diez Roux AV. Neighborhood health-promoting resources and obesity risk (the multi-ethnic study of atherosclerosis). Obesity (Silver Spring) 2013 Mar;21(3):621-8. DOI:
    20.    Kaiser P, Diez Roux AV, Mujahid M, et al. Neighborhood environments and incident hypertension in the multi-ethnic study of atherosclerosis. Am J Epidemiol 2016 Jun 1;183(11):988-97. DOI:
    21.    Liu L, Núñez AE. Multilevel and urban health modeling of risk factors for diabetes mellitus: A new insight into public health and preventive medicine. Adv Prev Med 2014;2014:246049. DOI:
    22.    Christine PJ, Auchincloss AH, Bertoni AG, et al. Longitudinal associations between neighborhood physical and social environments and incident type 2 diabetes: the Multi-Ethnic Study of Atherosclerosis (MESA). JAMA Intern Med 2015 Aug;175(8):1311-20. DOI:
    23.    Glauber H, Vollmer WM, Nichols GA. A simple model for predicting two-year risk of diabetes development in individuals with prediabetes. Perm J 2018;22:17-050. DOI:
    24.    American Diabetes Association. Standards of medical care in diabetes—2006. Diabetes Care 2006 Jan;29 Suppl 1:S4-42. Erratum in: Diabetes Care 2006 May;29(5):1192.
    25.    American Diabetes Association. Standards of medical care in diabetes—2013. Diabetes Care 2013 Jan;36 Suppl 1:S11-66. DOI:
    26.    Schmittdiel JA, Adams SR, Segal J, et al. Novel use and utility of integrated electronic health records to assess rates of prediabetes recognition and treatment: Brief report from an Integrated electronic health records pilot study. Diabetes Care 2014 Feb;37(2):565-8. DOI:
    27.    Marshall C, Adams S, Dyer W, Schmittdiel J. Opportunities to reduce diabetes risk in women of reproductive age: Assessment and treatment of prediabetes within a large integrated delivery system. Womens Health Issues 2017 Nov-Dec;27(6):666-72. DOI:
    28.    American community survey (ACS) [Internet]. Washington, DC: United States Census Bureau; 2018 [cited 2018 Apr 20]. Available from:
    29.    Gabert R, Thomson B, Gakidou E, Roth G. Identify high-risk neighborhoods using electronic medical records: A population-based approach for targeting diabetes prevention and treatment interventions. PLoS One 2016 Jul 27;11(7):e0159227. DOI:
    30.    Kindig DA, Milstein B. A balanced investment portfolio for equitable health and well-being is an imperative, and within reach. Health Aff (Millwood) 2018 Apr;37(4):579-84. DOI:
    31.    Seligman HK, Laraia BA, Kushel MB. Food insecurity is associated with chronic disease among low-income NHANES participants. J Nutr 2010 Feb;140(2):304-10. DOI:
    32.    Jilcott SB, Laraia BA, Evenson KR, Lowenstein LM, Ammerman AS. A guide for developing intervention tools addressing environmental factors to improve diet and physical activity. Health Promot Pract 2007 Apr;8(2):192-204. DOI:
    33.    Gundersen C, Ziliak JP. Food insecurity and health outcomes. Health Aff (Millwood) 2015 Nov;34(11):1830-9. DOI:
    34.    Berkowitz SA, Karter AJ, Corbie-Smith G, et al. Food insecurity, food “deserts,” and glycemic control in patients with diabetes: A longitudinal analysis. Diabetes Care 2018 Jun;41(6):1188-95. DOI:
    35.    Berkowitz SA, Terranova J, Hill C, et al. Meal delivery programs reduce the use of costly health care in dually eligible Medicare and Medicaid beneficiaries. Health Aff (Millwood) 2018 Apr;37(4):535-42. DOI:
    36.    Jameson W. The place of nutrition in a public health program. Am J Public Health Nations Health 1947 Nov;37(11):1371-5.
    37.    Pasha-Robinson L, Matthews-King A. Help a Hungry Child: NHS doctors to pilot food prescriptions as poverty soars [Internet]. London, UK: The Independent; 2017 Dec 24 [cited 2018 Apr 20]. Available from:
    38.    Kearney M, Bradbury C, Ellahi B, Hodgson M, Thurston M. Mainstreaming prevention: Prescribing fruit and vegetables as a brief intervention in primary care. Public Health 2005 Nov;119(11):981-6. DOI:
    39.    Briggs ADM, Alderwick H, Fisher ES. Overcoming challenges to US payment reform: Could a place-based approach help? JAMA 2018 Apr 17;319(15):1545-6. DOI:
    40.    Creatore MI, Glazier RH, Moineddin R, et al. Association of neighborhood walkability with change in overweight, obesity, and diabetes. JAMA 2016 May 24-31;315(20):2211-20. DOI:
    41.    Loveless TA. Food stamp/Supplemental Nutrition Assistance Program (SNAP) receipt in the past 12 months for households by state: 2009 and 2010. Washington, DC: United States Census Bureau; 2011 Nov [cited 2018 Apr 26]. Available from:





