Gastric cancer is the fourth most prevalent cancer worldwide. The ability to accurately predict surgery-related morbidity and mortality is critical in deciding both the timing of surgery and choice of surgical procedure. The aim of this study is to compare the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems for predicting surgical morbidity and mortality in Chinese gastric cancer patients, as well as to create new scoring systems to achieve better prediction.
Data from 612 gastric cancer patients undergoing gastrectomy between January 2007 and December 2011 were included in this study. The predictive abilities of the four scoring systems were compared by examining observed-to-expected (O/E) ratios, the receiver operating characteristic curve, Student t test, and χ2 test results.
The observed complication rate of 34% (n = 208) did not differ significantly from the rate of 36.6% (n = 208) predicted by the POSSUM scoring system (O/E ratio = 0.93). The observed mortality rate was 2.9% (n = 18). For predicting mortality, POSSUM had an O/E ratio of 0.34 as compared with p-POSSUM (O/E ratio = 0.91), o-POSSUM (O/E ratio = 1.26), and APACHE II (O/E ratio = 0.28).
The POSSUM scoring system performed well with respect to predicting morbidity risk following gastric cancer resection. For predicting postoperative mortality, p-POSSUM and o-POSSUM exhibited superior performance relative to POSSUM and APACHE II.
gastric cancer;morbidity;mortality;receiver operating characteristic curve;scoring systems
Postoperative complications and mortality rates following gastric cancer surgery remain comparatively high, which is mainly attributable to the extent of surgery and degree of technical experience.1 ; 2 Postoperative complications are a serious concern for clinicians treating gastric cancer surgically. There are many controversies regarding gastric cancer surgery, compounded by the lack of studies investigating early postoperative complications associated with gastric cancer surgery.3 Therefore, research into early postoperative complications and mortality may be beneficial in providing reference points that improve the success of surgical procedures for gastric cancer treatment. Comparison of surgical outcomes is made difficult by vague explanations of postoperative complications and a lack of standard auditing methods. As a result, numerous scoring systems have been developed to predict postoperative mortality and morbidity outcomes.
The Physiological and Operative Severity Score for the enumeration of Mortality and morbidity (POSSUM) was developed in 1991 for use within general operative practice.4 POSSUM evaluates 12 preoperative physiological variables and six operative variables using a 4-grade scoring system, with results analyzed using linear or exponential methods. The POSSUM scoring system has been reported to overestimate mortality, particularly in low-risk patients.5 ; 6 To address this problem across a number of surgical procedures, modifications of the POSSUM scoring system have been proposed, including Portsmouth-POSSUM (p-POSSUM)7 and oesophagogastric-POSSUM (o-POSSUM).8 p-POSSUM revises both the constant and the weightings of the regression equation to predict in-patient mortality. Numerous researchers have found that p-POSSUM predicts mortality more accurately than POSSUM.7 ; 9 By contrast, o-POSSUM was designed to predict only postoperative mortality. These three scoring systems can predict the actual mortality rate to a certain degree; however, because they were developed for broad applicability, their ability to accurately predict mortality for a specific patient population is suboptimal.10
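The relationship between the POSSUM scores and the predicted risk can be sketched as a logistic equation. The coefficients below are the commonly cited published values for POSSUM and p-POSSUM; they are not stated in this article and are included here as an assumption for illustration. The function names are hypothetical, and o-POSSUM is omitted because its equation uses a different set of predictors.

```python
import math


def logistic_risk(intercept, ps_coef, os_coef, physiological_score, operative_score):
    """Turn a linear predictor on the log-odds scale into a probability (0 to 1)."""
    log_odds = (intercept
                + ps_coef * physiological_score
                + os_coef * operative_score)
    return 1.0 / (1.0 + math.exp(-log_odds))


# Assumed coefficients (commonly cited literature values, not from this article).
def possum_morbidity(physiological_score, operative_score):
    return logistic_risk(-5.91, 0.16, 0.19, physiological_score, operative_score)


def possum_mortality(physiological_score, operative_score):
    return logistic_risk(-7.04, 0.13, 0.16, physiological_score, operative_score)


def p_possum_mortality(physiological_score, operative_score):
    return logistic_risk(-9.065, 0.1692, 0.1550, physiological_score, operative_score)


if __name__ == "__main__":
    # Hypothetical patient: physiological score 20, operative severity score 15.
    ps, os_ = 20, 15
    print(f"POSSUM morbidity risk:   {possum_morbidity(ps, os_):.3f}")
    print(f"POSSUM mortality risk:   {possum_mortality(ps, os_):.3f}")
    print(f"p-POSSUM mortality risk: {p_possum_mortality(ps, os_):.3f}")
```

For the same pair of scores, the p-POSSUM equation returns a lower mortality risk than the original POSSUM equation in the lower score ranges, which is consistent with its purpose of correcting overprediction in low-risk patients.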
The APACHE scoring system evaluates disease severity by quantifying 34 physiological variables.11 The initial APACHE system was replaced by APACHE II, which reduced the number of variables to 12, including both physiological and laboratory measurements, and added variables for age and prior health status.12 The APACHE II scoring system was designed primarily for use in intensive care units, with evidence suggesting that it can predict perioperative events in patients undergoing a variety of surgical procedures.13
Few studies have examined whether the POSSUM grading system functions well for the preoperative evaluation of Chinese gastric cancer patients. The objective of this study was to investigate the ability of the POSSUM scoring system to predict postoperative morbidity, as well as to compare the POSSUM, p-POSSUM, o-POSSUM, and APACHE II scoring systems for their ability to predict mortality in patients undergoing curative surgical resection for gastric cancer.
A total of 612 patients with gastric cancer who underwent total gastrectomy (n = 112) or subtotal gastrectomy (n = 500) between January 2008 and December 2012 at the First Affiliated Hospital and Affiliated Tumor Hospital, Guangxi Medical University, Nanning, China were included in this study. The study protocol was approved by the Ethics Review Committee of Guangxi Medical University, and written or verbal informed consent was provided by either the patients or their family members. All patients underwent preoperative esophagogastroduodenoscopy with biopsy, and histopathologic examination was performed either preoperatively or intraoperatively. The following exclusion criteria were applied: presence of nongastric cancer pathologies, nonstandard gastric surgery, staging laparoscopy, diagnosis of unresectable cancer after laparotomy, and palliative surgery. Surgical procedures were classified as emergency, urgent, scheduled, or elective according to the National Confidential Enquiry into Perioperative Death classification.14 Elective operations were performed in 510 patients (83.3%) and 102 surgeries (16.7%) were classified as urgent (Table 1). Analysis of in-patient morbidity or mortality was the sole endpoint of this study.
Table 1. Age, body mass index, and surgical methods by type of procedure.
|Variable|Elective|Urgent (a)|Total|
|No. of patients (n)|510|102|612|
|Body mass index (kg/m2)|25.1 ± 2.0|24.8 ± 2.2|25.1 ± 2.0|
|Age (y)|61.2 ± 13.2|64.0 ± 12.9|61.7 ± 13.1|
|Methods of operation| | | |
|Total gastrectomy|100 (19.6)|12 (11.8)|112 (18.3)|
|Proximal subtotal gastrectomy|220 (43.1)|40 (39.2)|260 (42.5)|
|Distal subtotal gastrectomy|190 (37.3)|50 (49.0)|240 (39.2)|
Data are presented as n (%) or mean ± SD.
a. Urgent procedures were defined as surgical interventions within the first 24 hours after admission and all others were defined as elective procedures.
The following data were collected through comprehensive review of original patient records: age, gender, site of neoplasia, body mass index, presence of distant metastasis, and surgical protocol performed. Total gastrectomy is performed for lesions of the proximal stomach, when the tumor is localized to the gastric body, or when the tumor is located in the distal third of the stomach but a 5 cm proximal margin of safety cannot be obtained. For cases with involvement of the cardia or the gastroesophageal junction, an esophagogastrectomy may be required. Patients with malignancies situated in the distal third of the stomach are also candidates for a subtotal gastrectomy. D2 lymphadenectomy was performed in patients younger than 70 years; otherwise, a D1 lymphadenectomy was performed. Postoperative morbidity and mortality were documented for 30 days and covered all adverse events, both surgical and nonsurgical complications. All complications were further stratified into (1) local infection, which appeared at the surgical wound site without systemic involvement; (2) systemic complications, defined as those affecting the entire body; and (3) those needing additional surgical procedures. Death occurring within 30 days of surgery was defined as an operative mortality.
Variable descriptions and statistical analyses were performed using SPSS, version 19.0 (IBM Corp., Armonk, NY, USA). Quantitative variables were expressed as the mean and standard deviation (SD) after normality of their distribution had been confirmed. Categorical variables were presented as percentages with 95% confidence intervals (CI) and were compared using the χ2 test. Continuous variables were presented as the mean ± SD and compared using the Student t test. A p value < 0.05 was taken to indicate that the discriminatory abilities of different scoring systems were significantly different. The mortality and morbidity rates predicted by the scoring systems were compared with observed mortality and morbidity rates, wherein primary and secondary outcomes were mortality and morbidity, respectively.
To assess prediction accuracy, receiver operating characteristic (ROC) curves were generated for each scoring system, with sensitivity plotted on the Y-axis and 1 − specificity plotted on the X-axis. The area under the ROC curve (AUC) was considered to be a more reliable method for examining the properties of a diagnostic test and was used to compare the diagnostic abilities of the four scoring systems. Exponential analysis methods were applied for the POSSUM scoring system.15 The observed-to-expected (O/E) operative morbidity (mortality) ratio for POSSUM was calculated, with the O/E value representing the ratio of observed morbidity (mortality) to predicted morbidity (mortality). An O/E value of 1 indicates ideal predictive ability of a scoring system. An O/E ratio <1 indicates lower morbidity than expected, while an O/E ratio >1 indicates greater morbidity than expected.3 ; 16
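As an illustration of what the AUC measures, the sketch below computes it directly as the probability that a randomly chosen patient who had the event received a higher predicted risk than a randomly chosen patient who did not, counting ties as one half. This is the rank-based (Mann-Whitney) formulation, not the SPSS procedure used in the study; the function name and the toy data are hypothetical.

```python
def auc_from_scores(predicted_risk, outcome):
    """Rank-based AUC: share of (event, non-event) pairs in which the event
    case has the higher predicted risk; tied pairs count as 0.5."""
    events = [r for r, y in zip(predicted_risk, outcome) if y == 1]
    non_events = [r for r, y in zip(predicted_risk, outcome) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical toy data: predicted mortality risk and observed death (1) or survival (0).
risk = [0.05, 0.10, 0.10, 0.30, 0.60, 0.80]
died = [0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print(f"AUC = {auc_from_scores(risk, died):.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without the outcome. The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred.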
A total of 612 patients who underwent gastrectomy due to gastric cancer were included in this study. Age, sex, body mass index, and surgical methods are shown in Table 1 and details of postoperative complications are described in Table 2. The overall results and the annual ratio monitoring of our results as compared to the expected outcomes are shown in Table 3.
Table 2. Postoperative complications.
|Complication|Elective (n = 510)|Urgent (n = 102)|Total (n = 612)|
|Wound hemorrhage|6 (3.6)|2 (2.8)|8 (3.3)|
|Wound infection|28 (16.7)|12 (16.7)|40 (16.7)|
|Wound dehiscence|26 (15.5)|14 (19.4)|40 (16.7)|
|Respiratory failure|8 (4.8)|6 (8.3)|14 (5.8)|
|Deep hemorrhage|8 (4.8)|4 (5.6)|12 (5.0)|
|Chest infection|30 (17.9)|16 (22.2)|46 (19.2)|
|Urinary infection|20 (11.9)|14 (19.4)|34 (14.2)|
|Septicemia|6 (3.6)|2 (2.8)|8 (3.3)|
|Pyrexia of unknown origin|4 (2.4)|-|4 (1.7)|
|Cardiac failure|6 (3.6)|2 (2.8)|8 (3.3)|
|Impaired renal function|6 (3.6)|-|6 (2.5)|
|Hypotension|6 (3.6)|-|6 (2.5)|
|Anastomotic leak|10 (6.0)|-|10 (4.2)|
|Intestinal obstruction|4 (2.4)|-|4 (1.7)|
|Total complications (n)|168|72|240|
|Re-operation|12 (7.1)|-|12 (5.0)|
Data are presented as n (%).
Table 3. Observed outcomes compared with expected outcomes.
|Outcome|Elective (n = 510)|O:E|p|Urgent (n = 102)|O:E|p|Total (n = 612)|O:E|p|
|Observed mortality|14 (2.7)| | |4 (3.9)| | |18 (2.9)| | |
|Observed morbidity|160 (31.4)|-|-|48 (47.1)|-|-|208 (34.0)|-|-|
Data are presented as n (%) or %.
APACHE II = Acute Physiology and Chronic Health Evaluation; NS = not statistically significant (p > 0.05); O:E = observed-expected ratio; POSSUM = Physiological and Operative Severity Score for enumeration of mortality and morbidity; o-POSSUM = oesophagogastric-POSSUM; p-POSSUM = Portsmouth-POSSUM.
Patients ranged in age from 21 years to 78 years (mean ± SD = 61.7 ± 13.1 years). The total number of complications exceeded the number of patients with complications because some patients experienced multiple complications. In the overall study population, 208 (34.0%) patients experienced a total of 240 complications. Among the patients undergoing elective surgery, 160 (31.4%) had one or more complications, as compared with 48 (47.1%) of patients classified as having acute operations. The most common local complications were wound infection and wound dehiscence, and the most common systemic complications were chest and urinary tract infections.
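The difference in complication rates between elective and acute operations can be examined with the χ2 test named in the statistical methods. The sketch below is a minimal Pearson χ2 test for a 2 × 2 table without continuity correction; the article does not state which variant was used or report a p value for this comparison, so this is an assumption for illustration only, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]]; returns the statistic and its p value (1 df)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = numerator / denominator
    # With 1 degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts reported in the article: patients with and without complications.
elective_with, elective_without = 160, 510 - 160
acute_with, acute_without = 48, 102 - 48

chi2, p_value = chi_square_2x2(elective_with, elective_without,
                               acute_with, acute_without)

if __name__ == "__main__":
    print(f"chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

A statistics package may apply a continuity correction or an exact test for small cells, so its output can differ slightly from this uncorrected statistic.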
The expected mortalities predicted by the four scoring systems were as follows: POSSUM, 8.5%; p-POSSUM, 3.2%; o-POSSUM, 2.3%; and APACHE II, 10.3%. The observed incidence of mortality was 2.9%, yielding O/E mortality ratios of 0.34, 0.91, 1.26, and 0.28 for POSSUM, p-POSSUM, o-POSSUM, and APACHE II, respectively. In the overall patient population, the POSSUM and APACHE II scoring systems predicted significantly higher mortality incidence (p < 0.01) relative to the observed incidence of postoperative mortality (Table 3).
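The O/E ratios follow directly from the reported rates. The short sketch below recomputes them from the observed mortality of 2.9% and the expected mortalities listed above; the variable and function names are hypothetical.

```python
def oe_ratio(observed_rate, expected_rate):
    """Observed-to-expected ratio; 1.0 means the prediction matched the outcome."""
    return observed_rate / expected_rate


observed_mortality = 2.9  # percent, 18 of 612 patients

# Expected mortality (percent) as reported for each scoring system.
expected_mortality = {
    "POSSUM": 8.5,
    "p-POSSUM": 3.2,
    "o-POSSUM": 2.3,
    "APACHE II": 10.3,
}

ratios = {
    name: round(oe_ratio(observed_mortality, expected), 2)
    for name, expected in expected_mortality.items()
}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        direction = "overpredicts" if ratio < 1 else "underpredicts"
        print(f"{name}: O/E = {ratio:.2f} ({direction} mortality)")
```

Ratios well below 1 (POSSUM and APACHE II) mean that far more deaths were predicted than occurred, whereas the ratios for p-POSSUM and o-POSSUM lie much closer to the ideal value of 1.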
Mean morbidity, as predicted by the POSSUM scoring system, was 36.6% with an O/E ratio of 0.93. The morbidity rate predicted by the POSSUM scoring system was not significantly different from that observed in cases of both elective and acute surgeries. The POSSUM scoring system AUC was 0.787 for predicting the morbidity rate (Figure 1) and AUC analysis demonstrated that the o-POSSUM scoring system was significantly better at predicting postoperative mortality as compared with the POSSUM, p-POSSUM, or APACHE II scoring systems (Table 4, p < 0.05 for each comparison; Figure 2).
Figure 1. ROC curve for the POSSUM scoring system for predicting the rate of postoperative morbidity (i.e., complications) in patients undergoing surgical resection for gastric cancer (n = 612). Diagonal segments are produced by ties. AUC = 0.79. AUC = area under the curve; POSSUM = Physiological and Operative Severity Score for the enUmeration of mortality and morbidity; ROC = receiver operating characteristic.
Table 4. AUC of the scoring systems for predicting postoperative mortality.
|Scoring system|AUC|SE|Asymptotic 95% CI, lower limit|Asymptotic 95% CI, upper limit|
APACHE II = Acute Physiology and Chronic Health Evaluation; AUC = area under curve; CI = confidence interval; POSSUM = Physiological and Operative Severity Score for enumeration of mortality and morbidity; o-POSSUM = oesophagogastric-POSSUM; p-POSSUM = Portsmouth-POSSUM; SE = standard error.
Figure 2. ROC curves for the four scoring systems used to predict postoperative mortality in patients undergoing surgical resection for gastric cancer (n = 612). ROC = receiver operating characteristic.
Rates of perioperative in-patient mortality and morbidity are important objective indices commonly used to evaluate the quality of surgical institutions. Therefore, preoperative assessment and predictions of postoperative outcomes are useful for reducing the morbidity and mortality associated with a given surgical procedure. The development of new drugs and improvements in equipment and methodologies has helped minimize the risks associated with anaesthesiological and surgical procedures. Despite this, acceptable levels of risk have not decreased significantly due to significant numbers of elderly patients and patients in poor health undergoing extensive surgical procedures. Thus, a more accurate scoring system to predict postoperative morbidity and mortality, particularly mortality, is required to facilitate optimal postoperative care for surgical patients. While the POSSUM, p-POSSUM, and o-POSSUM scoring systems are all based on the studies of patients in the United Kingdom, it has been suggested that these systems may also be suitable for patients in other countries.5 ; 17
POSSUM was originally designed for use in all general surgery cases and, therefore, also accounts for very minor complications. In cases of gastric surgery, the majority of these complications are negligible; however, accounting for them results in a considerable increase in the morbidity rate. In the present study, POSSUM exhibited superior performance, as the observed morbidity rate closely approximated the estimated morbidity rate (O/E ratios of 0.96 and 0.83 for elective and acute surgeries, respectively), and demonstrated reasonable discriminatory power for predicting postoperative morbidity (AUC 0.79). Ugolini et al18 reported an O/E ratio of 0.72 when using the POSSUM scoring system to predict morbidity, while another study calculated an O/E ratio closer to 1,19 suggesting the usefulness of the POSSUM scoring system in predicting postoperative morbidity.
By contrast, the POSSUM scoring system was not reliable at predicting postoperative mortality (O/E ratios of 0.39 and 0.25 in the elective and acute settings, respectively). The poor predictive value of the POSSUM scoring system with respect to mortality in these patients may reflect the original design of the model, which was based on data from general operative patients. In gastric cancer patients, POSSUM overpredicted the operative severity score, resulting in elevated risk prediction. This discrepancy may be due to the inability of the chest X-ray, electrocardiogram, and Glasgow coma scale components of the scoring system to accurately reflect the level of risk of gastric cancer surgery. Alternatively, decreases in observed risk may be a function of increasing use of minimally invasive operative techniques and improvements in perioperative care. The selection of an operation that may carry an increased risk of morbidity and mortality is a critical issue for surgeons. Western patients often have associated obesity and circulatory and pulmonary comorbidities. Thus, European and American surgeons seldom perform D2 radical gastric cancer resections. By contrast, D2 radical gastric cancer resection has long been considered a standard surgical procedure in China and Japan. Different surgical methods influence the incidence of surgical complications. Total gastrectomy is associated with more surgical morbidities, such as anastomotic leakage and intra-abdominal abscess, than subtotal gastrectomy.
In the present study, the p-POSSUM scoring system yielded a near-ideal O/E ratio of 1.08 for elective procedures and an O/E ratio of 0.54 in the acute setting. However, p-POSSUM significantly overpredicted mortality in the acute setting. As reported by Tekkis et al,20 the mortality of elderly patients and patients undergoing emergency surgery as predicted by p-POSSUM was lower than the observed mortality, whereas overestimation usually persisted for low-risk groups, such as the young and those undergoing elective surgery. Although p-POSSUM was the most useful risk-prediction model for esophageal resections, it significantly overpredicted the risk associated with gastric resections.
O-POSSUM was developed specifically for gastroesophageal surgery and performed well in predicting mortality following gastric cancer surgery. In this study, the o-POSSUM scoring system yielded an O/E ratio of 1.50 for elective operations and slightly overpredicted mortality for acute surgical procedures. Higher prediction of mortality in older patients and exclusion of operative variables, such as blood loss, may have contributed to o-POSSUM overprediction of mortality; however, decreased score values may be another reason. The study data were obtained close to the time of operation, after patient serum indices, such as renal function, haemoglobin, and leukocyte levels, had been corrected by preoperative treatment. The present study confirmed operative mortality rates to be highest in elderly patients, those with high American Society of Anesthesiologists grades, and in those requiring emergency surgery.
The APACHE II scoring system consists of an acute physiology score, which includes 12 physiological measurements, as well as points for age and chronic health status. The present study did not observe a satisfactory predictive value for the APACHE II scoring system in predicting postoperative mortality in patients undergoing surgical resection for gastric cancer. The O/E ratios in the elective and acute settings were both very low, with this scoring system overpredicting mortality by 7.4% in total. The APACHE II system has been described as being flexible with no significant differences in the prediction of outcomes between elective and acute surgery.21
O-POSSUM demonstrated adequate prognostic ability, with an AUC of 0.89 in contrast with the POSSUM and APACHE II scoring systems (AUC of 0.62 and 0.63, respectively). However, improvement is still needed in the future, given that none of these scoring systems yielded an AUC value exceeding 0.9 for operations with varying levels of severity.
The advantages of these scoring systems include the requirement for simple calculations and small amounts of data. Such models are developed and validated on an international level. The POSSUM, p-POSSUM, and o-POSSUM scoring systems will remain the standard systems, as they offer the best prediction models for risk adjustments, in general, and for esophagus and gastric surgeries, specifically.
There are some limitations to the present study. First, these scoring systems do not account for cardiological findings, including acute ischaemic electrocardiographic alterations, presence of severe arrhythmias, nutritional status of the patient, or a history of recent myocardial infarction, all of which increase operative risk.22 This index of postoperative risk is adequate for intensive care unit patients, but has the limitation of requiring 24-hour surveillance.23 Second, bias may exist, as D2 lymphadenectomy was performed in patients younger than 70 years and D1 lymphadenectomy was performed in patients aged 70 years and older. A more extensive lymphadenectomy would increase both morbidity and mortality for gastric cancer patients. Third, the study was a retrospective analysis of prospectively collected data, which is a limitation. The findings were derived from two hospitals; however, our results must be validated by conducting similar analyses in an independent center. Gastric cancer surgery is highly heterogeneous in the extent and incidence of postoperative complications. The work presented in this study indicates that the POSSUM system, combined with complication stratification, offers a valid algorithm to analyze postoperative complications in gastric cancer patients; however, a prospective study involving a larger cohort of patients is necessary to confirm this result.
The POSSUM scoring system performed well in predicting morbidity risk following gastric cancer resection. The p-POSSUM and o-POSSUM scoring systems were identified as being better predictors of postoperative mortality relative to the POSSUM and APACHE II scoring systems. As shown by ROC-curve analysis, the o-POSSUM scoring system was able to more accurately predict postoperative mortality relative to other scoring systems. To improve surgery risk estimates, novel prediction models must be developed for risk adjustment.