Place Matters: Regional Risks in Breast Cancer Mortality
The Influence of Clinical and Structural Factors on In-Hospital Breast Cancer Mortality Among Black Women
Executive Summary
Problem: Black women in the United States experience disproportionately high breast cancer mortality, yet the specific drivers of in-hospital death, particularly the role of structural inequities beyond clinical severity, remain underexamined.
Methodology
This study analyzes 5,934 breast cancer hospitalizations among non-Hispanic Black women from the 2021 National Inpatient Sample (NIS) using a machine learning framework.
- Data Preparation: Extraction of breast cancer hospitalizations using ICD-10-CM C50 codes (malignant neoplasm of breast). Cleaning and recoding of clinical and structural variables. Application of discharge weights to produce nationally representative estimates. Encoding of categorical variables and normalization of numeric variables.
- Exploratory Data Analysis: Descriptive statistics of demographic, clinical, and structural variables. Comparative analysis across geographic regions and socioeconomic strata.
- Statistical & Machine Learning Models: The following predictive models were trained and compared:
  - Logistic Regression (L1/L2) – Best Model – Baseline statistical model for inference and prediction; achieved the highest cross-validated performance (CV AUC ≈ 0.71)
  - Elastic Net – Handles multicollinearity and feature selection, but showed reduced performance due to overfitting (CV AUC ≈ 0.62)
  - Random Forest – Captures nonlinear relationships and interactions (CV AUC ≈ 0.66)
  - Gradient Boosting (GBM) – Enhances predictive performance; strongest nonlinear model but below logistic regression (CV AUC ≈ 0.68)
  - XGBoost – Optimized ensemble learning for imbalanced data; comparable to GBM but did not outperform logistic regression (CV AUC ≈ 0.67)
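The data preparation step listed above (selecting C50 hospitalizations and applying discharge weights) can be illustrated with a minimal Python sketch. The column names (I10_DX1, FEMALE, RACE, DIED, DISCWT) follow HCUP NIS naming conventions but are assumptions here, as are the toy records and the choice to match on the principal diagnosis only; the study's actual extraction code is not part of this summary.

```python
# Toy records with HCUP NIS-style column names (assumed for illustration;
# the values are made up and are not study data).
records = [
    {"I10_DX1": "C50911", "FEMALE": 1, "RACE": 2, "DIED": 0, "DISCWT": 5.0},
    {"I10_DX1": "C50412", "FEMALE": 1, "RACE": 2, "DIED": 1, "DISCWT": 4.8},
    {"I10_DX1": "I2510",  "FEMALE": 1, "RACE": 2, "DIED": 0, "DISCWT": 5.1},  # not breast cancer
    {"I10_DX1": "C50211", "FEMALE": 1, "RACE": 1, "DIED": 0, "DISCWT": 5.0},  # outside the cohort
    {"I10_DX1": "C50919", "FEMALE": 1, "RACE": 2, "DIED": 0, "DISCWT": 5.2},
]


def in_cohort(rec):
    """Keep female, Black (RACE == 2 in HCUP coding) stays with a C50 diagnosis.

    Assumption: only the principal diagnosis is checked; the study may also
    have searched secondary diagnosis fields.
    """
    return (
        rec["I10_DX1"].startswith("C50")
        and rec["FEMALE"] == 1
        and rec["RACE"] == 2
    )


def weighted_mortality(recs):
    """Discharge-weighted share of hospitalizations that ended in death."""
    total_weight = sum(r["DISCWT"] for r in recs)
    death_weight = sum(r["DISCWT"] * r["DIED"] for r in recs)
    return death_weight / total_weight


cohort = [r for r in records if in_cohort(r)]
rate = weighted_mortality(cohort)
print(f"{len(cohort)} sampled stays, weighted in-hospital mortality = {rate:.1%}")
```

Weighting each stay by its discharge weight, rather than counting rows, is what allows the sample to stand in for national totals.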
Training/Validation Split: 70% / 30% | Evaluation Metric: Area Under the ROC Curve (AUC)
Feature Importance: Assessed using permutation importance.
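As a rough illustration of how the evaluation metric and permutation importance fit together, the sketch below computes AUC from first principles and then measures how much AUC drops when one feature column is shuffled. The scoring function, feature layout, and toy data are hypothetical stand-ins, not the study's fitted model; in practice a library routine would normally be used instead.

```python
import random


def auc(y_true, scores):
    """Probability that a random positive case outscores a random negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in AUC after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = auc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - auc(y, [predict(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def predict(row):
    """Hypothetical model: the score depends on feature 0 only."""
    return 0.8 * row[0] + 0.0 * row[1]


# Toy data: feature 0 tracks the outcome, feature 1 is noise.
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 1], [0, 0], [1, 1]]
y = [1, 1, 0, 0, 1, 0, 0, 0]

baseline, importances = permutation_importance(predict, X, y)
print("baseline AUC:", baseline)
print("mean AUC drop per feature:", importances)
```

A feature the model ignores shows no drop at all, while a feature the model relies on shows a positive drop; this is the sense in which permutation importance ranks predictors.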
Key Features
• Race-Specific Analysis: Focus exclusively on non-Hispanic Black women.
• Nationally Representative Estimates: Use of NIS discharge weights.
• Integration of Clinical and Structural Determinants.
• Machine Learning–Driven Risk Prediction.
• Structural Inequity Gap: Quantifies mortality differences attributable to systemic factors.
• Regional Analysis: Highlights disparities across U.S. Census divisions.
Key Findings
- In-Hospital Mortality Rate: 5.3% among Black women.
- Strongest Clinical Predictor: Metastatic disease (OR ≈ 2.18).
- Key Structural Predictors:
  - Geographic Region: Highest risk in the Deep South (Census Division 6, East South Central) and lowest in Division 8 (Mountain).
  - Insurance Status: Self-pay and Medicaid associated with higher mortality.
  - Socioeconomic Status: Lower ZIP income quartiles linked to increased risk.
  - Admission Type: Non-elective admissions significantly increase mortality.
- Structural Inequity Gap: Up to a 37-percentage-point difference in predicted mortality between high- and low-risk regions. A model-based dashboard lets users explore the inequity gap.
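One way to read the odds ratio and the structural inequity gap together is sketched below: under a logistic model, two patients with the same clinical profile but different structural circumstances receive different predicted probabilities, and the gap is the difference between them in percentage points. Only the metastatic coefficient is tied to the reported OR ≈ 2.18; the intercept and all structural coefficients are hypothetical placeholders, so the printed gap is illustrative and does not reproduce the study's 37-point figure.

```python
import math


def sigmoid(z):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


# Log-odds coefficients. Only "metastatic" comes from the text (OR of about 2.18);
# every other value is a hypothetical placeholder, not a study estimate.
COEF = {
    "intercept": -3.5,
    "metastatic": math.log(2.18),
    "high_risk_region": 0.9,
    "self_pay": 0.5,
    "lowest_income_quartile": 0.3,
    "non_elective": 0.8,
}


def predicted_mortality(profile):
    """Predicted probability of in-hospital death for a dict of 0/1 indicators."""
    z = COEF["intercept"] + sum(COEF[name] * value for name, value in profile.items())
    return sigmoid(z)


structural = ["high_risk_region", "self_pay", "lowest_income_quartile", "non_elective"]
clinical = {"metastatic": 1}  # same clinical severity in both profiles

low_risk = {**clinical, **{name: 0 for name in structural}}
high_risk = {**clinical, **{name: 1 for name in structural}}

p_low = predicted_mortality(low_risk)
p_high = predicted_mortality(high_risk)
gap_pp = 100 * (p_high - p_low)

print(f"low-risk profile:  {p_low:.1%}")
print(f"high-risk profile: {p_high:.1%}")
print(f"structural gap:    {gap_pp:.1f} percentage points")
```

Holding the clinical indicators fixed while toggling only the structural ones is what lets the difference be attributed to structural factors rather than disease severity.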
Breast Cancer Mortality in Black Women – Datastory
Breast Cancer Mortality in Black Women – Interactive Model Based Dashboard
Summary: Black women in the US experience disproportionately high breast cancer mortality. This model-based dashboard visualizes structural inequity gaps estimated from models of 5,934 breast cancer hospitalizations among Black women (2021 NIS), including the in-hospital mortality rate (5.3%), the effect of metastatic disease (OR ≈ 2.18), and geographic and structural risk factors.
💡 User instructions: Hover over data points to see exact values. Use filters (if enabled) to explore by region, age, or socioeconomic strata.