Annually, more than 40 million noncardiac surgeries take place in the US,1 with 1%-3% of patients experiencing a major adverse cardiovascular event (MACE) such as acute myocardial infarction (AMI) or cardiac arrest postoperatively.2 Such patients are at markedly increased risk of both perioperative and long-term death.2-5
Over the past 40 years, efforts to model the risk of cardiac complications after noncardiac surgery have examined relationships between preoperative risk factors and postoperative cardiovascular events. The resulting risk-stratification tools, such as the Lee Revised Cardiac Risk Index (RCRI), have been used to inform perioperative care, including strategies for risk factor management prior to surgery, testing for cardiac events after surgery, and decisions regarding postoperative disposition.6 However, tools used in practice have not incorporated intraoperative data on hemodynamics or medication administration in the transition to postoperative care, which is often provided by nonsurgical clinicians such as hospitalists. Presently, there is active debate about the optimal approach to postoperative evaluation and management of MACE, particularly with regard to indications for cardiac biomarker testing after surgery in patients without signs or symptoms of acute cardiac syndromes. The lack of consensus is reflected in differences among guidelines for postoperative cardiac biomarker testing across professional societies in Europe, Canada, and the United States.7-9
In this study, we examined whether the addition of intraoperative data to preoperative data (together, perioperative data) improved prediction of MACE after noncardiac surgery when compared with RCRI. Additionally, to investigate how such a model could be applied in practice, we compared risk stratification based on our model to a published risk factor–based guideline algorithm for postoperative cardiac biomarker testing.7 In particular, we evaluated to what extent patients recommended for postoperative cardiac biomarkers under the risk factor–based guideline algorithm would be reclassified as low risk by the model using perioperative data. Conducting biomarker tests on these patients would potentially represent low-value care. We hypothesized that adding intraoperative data would (a) lead to improved prediction of MACE complications when compared with RCRI and (b) more effectively identify, compared with a risk factor–based guideline algorithm, patients for whom cardiac biomarker testing would or would not be clinically meaningful.
We followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) reporting guideline.10
Baseline, preoperative, and intraoperative data were collected for patients undergoing surgery between January 2014 and April 2018 within the University of Pennsylvania Health System (UPHS) electronic health record (EHR), and these data were then integrated into a comprehensive perioperative dataset (data containing administrative, preoperative, intraoperative, and postoperative information related to surgeries) created through a collaboration with the Multicenter Perioperative Outcomes Group.11 The University of Pennsylvania Institutional Review Board approved this study.
Patients aged 18 years or older who underwent inpatient major noncardiac surgery across four tertiary academic medical centers within UPHS in Pennsylvania during the study period were included in the cohort (see Appendix for inclusion/exclusion criteria).12,13 Noncardiac surgery was identified using primary Current Procedural Terminology (CPT) code specification ranges for noncardiac surgeries 10021-32999 and 34001-69990. The study sample was divided randomly into a training set (60%), a validation set (20%), and a test set (20%),14 with similar rates of MACE in the resulting sets. We used the holdout test set for all final analyses to avoid overfitting during model selection.
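As an illustration of the 60/20/20 random split described above, the following minimal Python sketch partitions a list of case identifiers. The function names, the seed, and the toy identifiers are hypothetical and are not taken from the study code; the sketch only shows one way such a split can be made and checked.

```python
import random


def split_cohort(ids, seed=0, fractions=(0.6, 0.2, 0.2)):
    """Randomly partition ids into training, validation, and test lists."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n = len(shuffled)
    n_train = int(round(fractions[0] * n))
    n_valid = int(round(fractions[1] * n))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


def event_rate(ids, outcomes):
    """Share of cases in ids with the outcome (outcomes maps id -> 0 or 1)."""
    return sum(outcomes[i] for i in ids) / len(ids)


# Toy usage with 1,000 hypothetical case identifiers.
train, valid, test = split_cohort(range(1000), seed=42)
```

Calling `event_rate` on each subset is one way to confirm that outcome rates are similar across the resulting sets.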
The composite outcome used to develop the risk-stratification models was in-hospital MACE after major noncardiac surgery. Following prior literature, MACE was defined using billing codes for ST-elevation/non–ST-elevation myocardial infarction (STEMI/NSTEMI, ICD-9-CM 410.xx, ICD-10-CM I21.xx), cardiac arrest (ICD-9-CM 427.5, ICD-10-CM I46.x, I97.121), or all-cause in-hospital death.2,15-17
Variables were selected from baseline administrative, preoperative clinical, and intraoperative clinical data sources (full list in Appendix). Baseline variables included demographics, insurance type, and Elixhauser comorbidities.18,19 Preoperative variables included surgery type, laboratory results, and American Society of Anesthesiologists (ASA) Physical Status classification.20 Intraoperative variables included vital signs, estimated blood loss, fluid administration, and vasopressor use. We winsorized outlier values and used multiple imputation to address missingness. Rates of missing data can be found in Appendix Table 1.
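Winsorization of outlier values can be sketched as follows. This is a minimal illustration, not the study's implementation: the percentile cutoffs, the nearest-rank percentile rule, and the example heart-rate values are all assumptions, because the text does not state which limits were used. Missing values are left untouched here, and the multiple imputation step is not shown.

```python
def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clamp values outside the given percentiles; None (missing) is left as-is.

    The default cutoffs are illustrative assumptions, not the study's settings.
    """
    observed = sorted(v for v in values if v is not None)
    if not observed:
        return list(values)

    def percentile(p):
        # Nearest-rank percentile on the sorted observed values.
        k = int(round(p / 100.0 * (len(observed) - 1)))
        return observed[max(0, min(len(observed) - 1, k))]

    lo, hi = percentile(lower_pct), percentile(upper_pct)
    return [v if v is None else min(max(v, lo), hi) for v in values]


# Hypothetical intraoperative heart rates with two implausible readings.
heart_rates = [62, 70, 75, 80, 300, None, 5, 68, 72, 77, 74]
cleaned = winsorize(heart_rates, lower_pct=10, upper_pct=90)
```

In this toy example the readings of 300 and 5 are pulled in to the 90th and 10th percentile values, while the missing reading stays missing for later imputation.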
Risk-Stratification Models Used as Comparisons
Briefly, RCRI variables include the presence of high-risk surgery,21 comorbid cardiovascular diseases (ie, ischemic heart disease, congestive heart failure, and cerebrovascular disease), preoperative use of insulin, and elevated preoperative serum creatinine.6 RCRI uses the inputs to calculate a point score that equates to different risk strata and is based on a stepwise logistic regression model with postoperative cardiovascular complications as the dependent outcome variable. For this study, we implemented the weighted version of the RCRI algorithm and computed the point scores (Appendix).6,7,22
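For orientation, a simplified, unweighted point count over the six RCRI inputs listed above might look like the sketch below. This is not the weighted version implemented in the study, whose weights are given in the Appendix. The class and function names are hypothetical, and each input is assumed to be already coded as present or absent. The high-risk cutoff of 1 or more points follows the RCRI score classification used in this study's comparisons.

```python
from dataclasses import dataclass, astuple


@dataclass
class RcriFactors:
    """One boolean per RCRI input (hypothetical field names)."""
    high_risk_surgery: bool
    ischemic_heart_disease: bool
    congestive_heart_failure: bool
    cerebrovascular_disease: bool
    preoperative_insulin: bool
    elevated_creatinine: bool


def rcri_points(factors):
    """Unweighted count of risk factors present (0 to 6)."""
    return sum(bool(x) for x in astuple(factors))


def rcri_high_risk(factors):
    """High risk if the score is 1 or greater, low risk if it is 0."""
    return rcri_points(factors) >= 1


# Toy patients: one with two risk factors, one with none.
patient_a = RcriFactors(True, False, False, False, True, False)
patient_b = RcriFactors(False, False, False, False, False, False)
```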
We also applied to each patient in the study sample a risk factor–based algorithm for postoperative cardiac biomarker testing published in the 2017 Canadian Cardiovascular Society (CCS) guidelines.7 Specifically, this algorithm recommends daily troponin surveillance for 48 to 72 hours after surgery among patients who (1) have an elevated NT-proBNP/BNP measurement or no NT-proBNP/BNP measurement before surgery, (2) have an RCRI score of 1 or greater, (3) are aged 65 years or older, (4) are aged 45 to 64 years with significant cardiovascular disease and are undergoing elective surgery, or (5) are aged 18 to 64 years with significant cardiovascular disease and are undergoing semiurgent, urgent, or emergent surgery.
We compared patient characteristics and outcomes between those who did and those who did not experience MACE during hospitalization. Chi-square tests were used to compare categorical variables, and Mann-Whitney tests were used to compare continuous variables.
To create the perioperative risk-stratification model based on baseline, preoperative, and intraoperative data, we used a logistic regression with elastic net selection using a dichotomous dependent variable indicating MACE and independent variables described earlier. This perioperative model was fit on the training set and the model coefficients were then applied to the patients in the test set. The area under the receiver operating characteristic curve (AUC) was reported and the outcomes were reported by predicted risk decile, with higher deciles indicating higher risk (ie, higher numbers of patients with MACE outcomes in higher deciles implied better risk stratification). Because predicted risk of postoperative MACE may not have been distributed evenly across deciles, we also examined the distribution of the predicted probability of MACE and examined the number of patients below thresholds of risk corresponding to 0.1% or less, 0.25% or less, 0.5% or less, and 1% or less. These thresholds were chosen because they were close to the overall rate of MACE within our cohort.
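The evaluation steps in this paragraph (AUC, outcomes by predicted risk decile, and the share of patients at or below fixed risk thresholds) can be sketched in plain Python as follows. The elastic net fit itself is not reproduced here. The function names and the ten-patient toy data are hypothetical and serve only to show the calculations.

```python
def auc(y_true, y_score):
    """AUC as the probability that a random event is scored above a random
    non-event, with ties counted as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def events_by_decile(y_true, y_score):
    """List of (n_patients, n_events) for deciles 1 (lowest risk) to 10."""
    order = sorted(range(len(y_score)), key=lambda i: y_score[i])
    n = len(order)
    table = []
    for d in range(10):
        members = order[d * n // 10:(d + 1) * n // 10]
        table.append((len(members), sum(y_true[i] for i in members)))
    return table


def share_at_or_below(y_score, thresholds=(0.001, 0.0025, 0.005, 0.01)):
    """Fraction of patients with predicted risk at or below each threshold."""
    return {t: sum(s <= t for s in y_score) / len(y_score) for t in thresholds}


# Toy predicted risks and outcomes for ten hypothetical patients.
scores = [0.0005, 0.0008, 0.002, 0.003, 0.004, 0.006, 0.009, 0.02, 0.05, 0.30]
events = [0, 0, 0, 0, 0, 0, 0, 1, 0, 1]

toy_auc = auc(events, scores)
toy_deciles = events_by_decile(events, scores)
toy_shares = share_at_or_below(scores)
```

Better risk stratification shows up as events concentrated in the last entries of the decile table, which is the pattern the decile analysis is designed to reveal.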
We tested for differences in predictive performance between the RCRI logistic regression model AUC and the perioperative model AUC using DeLong’s test.23 Additionally, we illustrated differences between the perioperative and RCRI models’ performance in two ways by stratifying patients into deciles based on predicted risk. First, we compared rates of MACE and MACE component events by predicted decile of the perioperative and RCRI models. Second, we further classified patients as RCRI high or low risk (per RCRI score classification in which RCRI score of 1 or greater is high risk and RCRI score of 0 is low risk) and examined numbers of surgical cases and MACE complications within these categories stratified by perioperative model predicted decile.
To compare the perioperative model’s performance with that of a risk factor–based guideline algorithm, we classified patients according to CCS guidelines as high risk (those for whom the CCS guidelines algorithm would recommend postoperative troponin surveillance testing) and low risk (those for whom the CCS guidelines algorithm would not recommend surveillance testing). We also used a logistic regression to examine whether the predicted risk from our model was independently associated with MACE above and beyond the testing recommendation of the CCS guidelines algorithm. This model used MACE as the dependent variable and model-predicted risk and a CCS guidelines–defined high-risk indicator as predictors. We computed the association between a 10–percentage point increase in predicted risk and observed MACE outcome rates.24
In sensitivity analyses, we used a random forest machine learning classifier to test an alternate model specification, used complete case analysis, varied RCRI thresholds, and limited the sample to patients aged 50 years or older. We also varied the penalty parameter in the elastic net model and plotted AUC versus the number of variables included to examine parsimonious models. SAS v9.4 (SAS Institute Inc) was used for the main analyses. Data preparation and sensitivity analyses were performed in Python v3.6 with Pandas v0.24.2 and Scikit-learn v0.19.1.
Patients who underwent major noncardiac surgery in our sample (n = 72,909) had a mean age of approximately 56 years; 58% were female, 66% were of White race, and 26% were of Black race. The most common procedures were orthopedic surgery (33%) and general surgery (20%). Those who experienced MACE (n = 558; 0.77%) differed along several characteristics (Table 1). For example, those with MACE were older (mean age, 65.4 vs 55.4 years; P < .001) and less likely to be female (41.9% vs 58.3%; P < .001).
Model Performance After Intraoperative Data Inclusion
In the perioperative model combining preoperative and intraoperative data, 26 variables were included after elastic net selection (Appendix Table 2). Model discrimination in the test set of patients demonstrated an AUC of 0.88 (95% CI, 0.85-0.92; Figure). When examining outcome rates by predicted decile, the outcome rates of in-hospital MACE complications were higher in the highest decile than in the lowest decile, notably with 58 of 92 (63%) cases with MACE complications within the top decile of predicted risk (Table 2). The majority of patients had low predicted risk of MACE, with 5,309 (36.1%), 8,796 (59.7%), 11,335 (77.0%), and 12,972 (88.1%) below the risk thresholds of 0.1%, 0.25%, 0.5%, and 1.0%, respectively. The associated MACE rates were 0.04%, 0.10%, 0.17%, and 0.25% (average rate in sample was 0.63%) (Appendix Table 3).
Model Performance Comparisons
The perioperative model AUC of 0.88 was higher than RCRI’s AUC of 0.79 (95% CI, 0.74-0.84; P < .001). MACE complications were more concentrated in the top decile of predicted risk of the perioperative model than in that of the RCRI model (58 vs 43 of 92 events, respectively; 63% vs 47%; Table 2). Furthermore, there were fewer cases with MACE complications in the low-risk deciles (ie, deciles 1 to 5) of the perioperative model than in those of the RCRI model. These relative differences were also consistent for the MACE component outcomes of STEMI/NSTEMI, cardiac arrest, and in-hospital death.
There was substantial heterogeneity in the perioperative model’s predicted risk among patients classified as either RCRI low risk or RCRI high risk (ie, each category included patients with low and high predicted risk; Table 3). Patients in the bottom (low-risk) five deciles of the perioperative model’s predicted risk who were in the RCRI model’s high-risk group were very unlikely to experience MACE complications (3 out of 722 cases; 0.42%). Furthermore, among those who were classified as low risk by the RCRI model but were in the top decile of the perioperative model’s predicted risk, the MACE complication rate was 3.5% (8 out of 229), which was approximately 6 times the sample mean MACE complication rate.
The perioperative model identified more patients as low risk than did the CCS guidelines’ risk factor–based algorithm (Table 3). For example, 2,341 of the patients the CCS guidelines algorithm identified as high risk were in the bottom 50% of the perioperative model’s predicted risk for experiencing MACE (below a 0.18% chance of a MACE complication); only four of these patients (0.17%) actually experienced MACE. This indicates that 2,341 of the 7,597 (31%) patients classified as high risk by the CCS guidelines algorithm would have been recommended for postoperative troponin testing based on preoperative risk factors alone, even though the perioperative model identified them as low risk and all but four did not go on to experience MACE. Regression results indicated that both CCS guidelines risk-factor classification and the perioperative model’s predicted risk were predictive of MACE outcomes. A change in the perioperative model’s predicted risk of 10 percentage points was associated with an increase in the probability of a MACE outcome of 0.45 percentage points (95% CI, 0.35-0.55 percentage points; P < .001), and moving from the CCS guidelines’ low-risk category to the high-risk category was associated with an increase in the probability of MACE of 0.96 percentage points (95% CI, 0.75-1.16 percentage points; P < .001).
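The reclassification count reported above can be expressed as a short calculation. The sketch below is illustrative only: the function name and the four-patient toy inputs are hypothetical, while the final two lines simply recompute the reported percentages from the counts given in the text (2,341 of 7,597 patients, 4 events).

```python
def reclassified_low_risk(ccs_high, predicted_risk, mace, cutoff):
    """Among guideline high-risk patients, count those whose model-predicted
    risk falls below the cutoff, and the events observed among them.

    Returns (n_guideline_high_risk, n_reclassified_low_risk, n_events).
    """
    flagged = [i for i, high in enumerate(ccs_high) if high]
    low = [i for i in flagged if predicted_risk[i] < cutoff]
    events = sum(mace[i] for i in low)
    return len(flagged), len(low), events


# Toy data for four hypothetical patients; 0.0018 mirrors the 0.18% cutoff.
toy_counts = reclassified_low_risk(
    ccs_high=[True, True, False, True],
    predicted_risk=[0.001, 0.02, 0.0005, 0.0015],
    mace=[0, 1, 0, 0],
    cutoff=0.0018,
)

# Percentages recomputed from the counts reported in the text.
share_reclassified_pct = round(100 * 2341 / 7597)  # 31
event_rate_pct = round(100 * 4 / 2341, 2)          # 0.17
```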
Results were consistent with the main analysis across all sensitivity analyses (Appendix Tables 4-7). Parsimonious models with as few as eight variables retained strong predictive power (AUC, 0.870; Appendix Figure 1 and Table 8).
In this study, the addition of intraoperative data improved risk stratification for MACE complications when compared with standard risk tools such as RCRI. This approach also outperformed a guidelines-based approach and identified additional patients at low risk of cardiovascular complications. This study has three main implications.
First, this study demonstrated the additional value of combining intraoperative data with preoperative data in risk prediction for postoperative cardiovascular events. The intraoperative data most strongly associated with MACE, which likely were responsible for the performance improvement, included administration of medications (eg, sodium bicarbonate or calcium chloride) and blood products (eg, platelets and packed red blood cells), vitals (ie, heart rate), and intraoperative procedures (ie, arterial line placement); all model variables and coefficients are reported in Appendix Table 9. The risk-stratification model using intraoperative clinical data outperformed validated standard models such as RCRI. While this model should not be used in causal inference and cannot be used to inform decisions about risk-benefit tradeoffs of undergoing surgery, its improved performance relative to prior models highlights the potential in using real-time data. Preliminary illustrative analysis demonstrated that parsimonious models with as few as eight variables perform well, and their implementation as risk scores in EHRs is likely straightforward (Appendix Table 8). This is particularly important for longitudinal care in the hospital, in which patients frequently are cared for by multiple clinical services and experience handoffs. For example, many orthopedic surgery patients with significant medical comorbidity are managed postoperatively by hospitalist physicians after initial surgical care.
Second, our study aligns well with the cardiac risk-stratification literature more broadly. For example, the patient characteristics and clinical variables most associated with cardiovascular complications were age, history of ischemic heart disease, American Society of Anesthesiologists physical status, use of intraoperative sodium bicarbonate or vasopressors, lowest intraoperative heart rate measured, and lowest intraoperative mean arterial pressure measured. While many of these variables overlap with those included in the RCRI model, others (such as American Society of Anesthesiologists physical status) are not included in RCRI but have been shown to be important in risk prediction in other studies using different data variables.6,25,26
Third, we illustrated a clinical application of this model in identifying patients at low risk of cardiovascular complications, although benefit may extend to other patients as well. This is particularly germane to clinicians who frequently manage patients in the postsurgical or postprocedural setting. Moreover, the clinical relevance to these clinicians is underscored by the lack of consensus among professional societies across Europe, Canada, and the United States about which subgroups of patients undergoing noncardiac surgery should receive postoperative cardiac biomarker surveillance testing in the 48 to 72 hours after surgery.6-9 This may be caused in part by differences in clinical objectives. For example, the CCS guidelines in part aim to detect myocardial injury after noncardiac surgery (MINS) up to 30 days after surgery, which may be more sensitive to myocardial injury but less strongly associated with outcomes like MACE. The results of this study suggest that adopting such risk factor–based testing would likely lead to additional testing of low-risk patients, which may represent low-value surveillance tests. For example, there were 2,257 patients without postoperative cardiac biomarker testing in our data who would have been categorized as high risk by risk factor guidelines and therefore recommended to receive at least one postoperative cardiac biomarker surveillance test but were classified as low-risk individuals using a predicted probability of MACE less than 0.18% per our perioperative risk stratification model (Appendix Table 4). If each of these patients received one troponin biomarker test, the associated cost increase would be $372,405 (using the $165 cost per test reported at our institution). These costs would multiply if daily surveillance troponin biomarker tests were ordered for 48 to 72 hours after surgery, as recommended by the risk factor–based testing guidelines. Such routine surveillance would be a departure from testing at clinician discretion, which may avoid low-value testing.
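The cost figure above follows from simple multiplication, shown here as a small sketch. The patient count and the per-test cost come from the text; the function name is hypothetical, and the assumption that daily testing over 48 to 72 hours corresponds to two or three tests per patient is ours, not a figure stated in the guidelines as quoted here.

```python
PATIENTS_RECLASSIFIED = 2257  # patients reported in the text
COST_PER_TEST_USD = 165       # institutional cost per troponin test (from the text)


def surveillance_cost(n_patients, tests_per_patient, cost_per_test=COST_PER_TEST_USD):
    """Total cost in US dollars of surveillance testing for a patient group."""
    return n_patients * tests_per_patient * cost_per_test


# One test per patient reproduces the reported figure of $372,405.
single_test_cost = surveillance_cost(PATIENTS_RECLASSIFIED, 1)

# Assumption: daily testing for 48 to 72 hours means 2 or 3 tests per patient.
two_test_cost = surveillance_cost(PATIENTS_RECLASSIFIED, 2)
three_test_cost = surveillance_cost(PATIENTS_RECLASSIFIED, 3)
```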
Applying the perioperative model developed in this paper to clinical practice still requires several steps. The technical aspects of finding a parsimonious model that can be implemented in the EHR are likely quite straightforward. Our preliminary analysis illustrates that doing so will not require accessing large numbers of intraoperative variables. Perhaps more important steps include prospective validation of the safety, usability, and clinical benefit of such an algorithm-based risk score.27
The study has several limitations. First, it was an observational study using EHR data subject to missingness and data quality issues that may have persisted despite our methods. Furthermore, EHR data are not generated randomly, and unmeasured variables observed by clinicians but not by researchers could confound the results. However, our approach used the statistical model to examine risk, not causal inference. Second, this is a single-institution study, and the availability of EHR data, as well as practice patterns, may vary at other institutions. Furthermore, it is possible that performance of the RCRI score, the model fitting RCRI classification of high vs low risk on the sample data, and our model’s performance may not generalize to other clinical settings. However, we utilized data from multiple hospitals within a health system with different surgery and anesthesia groups and providers, and a similar AUC was reported for RCRI in the original validation study.6 Third, our follow-up period was limited to the hospital setting, and we did not capture longitudinal outcomes, such as 30-day MACE. This may impact the ability to risk stratify for other important longer-term outcomes, limit clinical utility, and hinder comparability to other studies. Fourth, results may vary for other important cardiovascular outcomes that may be more sensitive to myocardial injury, such as MINS. Fifth, we used a limited number of modeling strategies.
Addition of intraoperative data to preoperative data improves prediction of cardiovascular complications after noncardiac surgery. Improving the identification of patients at low risk for such complications could potentially be applied to reduce unnecessary postoperative cardiac biomarker testing after noncardiac surgery, but it will require further validation in prospective clinical settings.
Dr Navathe reports grants from the following entities: Hawaii Medical Service Association, Anthem Public Policy Institute, Commonwealth Fund, Oscar Health, Cigna Corporation, Robert Wood Johnson Foundation, Donaghue Foundation, Pennsylvania Department of Health, Ochsner Health System, United Healthcare, Blue Cross Blue Shield of NC, Blue Shield of CA; personal fees from the following: Navvis Healthcare, Agathos, Inc, Navahealth, YNHHSC/CORE, Maine Health Accountable Care Organization, Maine Department of Health and Human Services, National University Health System - Singapore, Ministry of Health - Singapore, Social Security Administration - France, Elsevier Press, Medicare Payment Advisory Commission, Cleveland Clinic, Embedded Healthcare; and other support from Integrated Services, Inc, outside of the submitted work. Dr Volpp reports grants from Humana during the conduct of the study; grants from Hawaii Medical Services Agency, Discovery (South Africa), Merck, Weight Watchers, and CVS outside of the submitted work; he has received consulting income from CVS and VALHealth and is a principal in VALHealth, a behavioral economics consulting firm. Dr Holmes receives funding from the Pennsylvania Department of Health, US Public Health Service, and the Cardiovascular Medicine Research and Education Foundation. All other authors declare no conflicts of interest.
2019 Academy Health Annual Research Meeting, Poster Abstract Presentation, June 2 to June 4, 2019, Washington, DC.
This project was funded, in part, under a grant with the Pennsylvania Department of Health. This research was independent from the funder. The funder had no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the article for publication. The department specifically disclaims responsibility for any analyses, interpretations, or conclusions.