Yanqi Huang, Lan He, Zhenhui Li, Xin Chen, Chu Han, Ke Zhao, Yuan Zhang,Jinrong Qu, Yun Mao, Changhong Liang1,, Zaiyi Liu1,,3
1The Second School of Clinical Medicine, Southern Medical University, Guangzhou 510515, China; 2 Department of Radiology, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; 3Guangdong Provincial Key Laboratory of Artificial Intelligence in Medical Image Analysis and Application, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou 510080, China; 4Department of Radiology, the Third Affiliated Hospital of Kunming Medical University, Yunnan Cancer Hospital, Yunnan Cancer Center, Kunming 650118, China; 5Department of Radiology, Guangzhou First People’s Hospital, School of Medicine,South China University of Technology, Guangzhou 510180, China; 6 Department of Radiology, the Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Zhengzhou 450003, China; 7Department of Radiology, the First Affiliated Hospital of Chongqing Medical University, Chongqing 400042, China
Abstract Objective: This study aimed to establish a method to predict the overall survival (OS) of patients with stage I-III colorectal cancer (CRC) through coupling radiomics analysis of CT images with the measurement of tumor ecosystem diversification.Methods: We retrospectively identified 161 consecutive patients with stage I-III CRC who had underwent radical resection as a training cohort. A total of 248 patients were recruited for temporary independent validation as external validation cohort 1, with 103 patients from an external institute as the external validation cohort 2. CT image features to describe tumor spatial heterogeneity leveraging the measurement of diversification of tumor ecosystem, were extracted to build a marker, termed the EcoRad signature. Multivariate Cox regression was used to assess the EcoRad signature, with a prediction model constructed to demonstrate its incremental value to the traditional staging system for OS prediction.Results: The EcoRad signature was significantly associated with OS in the training cohort [hazard ratio(HR)=6.670; 95% confidence interval (95% CI): 3.433-12.956; P<0.001), external validation cohort 1 (HR=2.866;95% CI: 1.646-4.990; P<0.001) and external validation cohort 2 (HR=3.342; 95% CI: 1.289-8.663; P=0.002).Incorporating the EcoRad signature into the prediction model presented a higher prediction ability (P<0.001) with respect to the C-index (0.813, 95% CI: 0.804-0.822 in the training cohort; 0.758, 95% CI: 0.751-0.765 in the external validation cohort 1; and 0.746, 95% CI: 0.722-0.770 in external validation cohort 2), compared with the reference model that only incorporated tumor, node, metastasis (TNM) system, as well as a better calibration,improved reclassification and superior clinical usefulness.Conclusions: This study establishes a method to measure the spatial heterogeneity of CRC through coupling radiomics analysis with measurement of diversification of the tumor ecosystem, and suggests that this approach could effectively predict OS and could be used as a supplement for risk stratification among stage I-III CRC patients.
Keywords: Radiomics; tumor ecosystem; spatial heterogeneity; survival prediction; colorectal cancer
Colorectal cancer (CRC), as the third most common type of cancer, comprises comprehensive heterogeneity with respect to prognosis, which represents a major challenge to decision making at diagnosis (1). Traditional prediction of survival outcome achieved by tumor staging [American Joint Committee on Cancer (AJCC)/The Union for International Cancer Control (UICC)-TNM classification], although powerful, assumes homogeneity within each stage and fails to provide adequate prognostic information (2). There are rapidly growing insights into the extraordinarily spatial and temporal heterogeneity at different levels (genes, proteins, cells, microenvironment,tissues and organs) in CRCs (3,4). This emphasizes the unmet need for improved characterization of tumor heterogeneity for better risk stratification to enable tailored treatment regimens in the era of personalized medicine.
Computed tomography (CT) is the modality of choice for initial workup, staging, and assessment of surveillance according to the evidence-based guidelines (5). Remarkable advances in high-throughput technology have revolutionized the field of CT imaging interpretation,putting novel quantitative image analysis methods,particularly radiomics, under the spotlight (6-9). The focus of on-going research effort to apply radiomics approach in oncology has greatly expanded from primarily a diagnostic tool to a more central role to assist in individualized prognostication by reflecting underlying biological information (10-13). Promising results have been reported in several cancer settings including CRC, with encouraging data continually to emerging (10,14,15).
In traditional radiomic analysis, features were typically extracted as simple “average value” measurements of region-of-interest (ROI) or volume of interest (VOI) of the tumor (16,17). Given that accumulating observations have highlighted the microenvironmental heterogeneity as a crucial determinant of tumor prognosis, sub-regional analysis methods capable of quantifying the spatial variability in each of the radiomic features within the tumor volume may hold promise for an in-depth understanding of the variability in patient prognosis. Previous studies have shown how measurements of diversification of the tumor ecosystem can be used to quantify the spatial heterogeneity of the tumor microenvironment based on high-throughput pathology image analysis of cancer and microenvironmental cells, with implications for prognosis and management of cancers (18,19). Several ecological measures derived through the integration of ecological statistics with an automated computational pathology approach have been proposed (18,20,21). For example, one of the proposed ecological measures, the ecosystem diversity index (EDI), was computed by examining the global differences in cellular diversity among local regions using unsupervised Gaussian mixture clustering of the regional diversity features and successfully predicting outcomes among patients with high-grade breast cancer(18). Inspired by the successful application of the above ecological method of analyzing microenvironmental spatial heterogeneity at the micro scale in digital pathology, it is of interest to extend its application using sub-regional analysis at the macroscopic level for intra-tumor heterogeneity assessment in radiomics. We hypothesized that combining high-throughput sub-regional analysis with ecological measures to derive a comprehensive picture of the variability of tumor composition along the spatial dimensions may be effective to fully release the prognostic potential of CT images. However, to the best of our knowledge, there are no established methods to define such quantitative measures and demonstrate their clinical relevance. Accordingly, this study aimed to establish an approach to quantifying tumor spatial heterogeneity through coupling sub-regional radiomics analysis with measurement of diversification of the tumor ecosystem based on CT images and predict OS in patients with CRC.
The study program was approved by the Research Ethical Committee and Institutional Review Board of the Guangdong Provincial People’s Hospital [No.GDREC2017031H(R1)], with requirement of informed consent waived. Consecutive patients who were pathologically confirmed with stage I-III CRC and who underwent radical primary resection were retrospectively identified. Those with 1) unavailable preoperative CT images; 2) multiple primary malignancies; 3) neo-adjuvant therapy performed; 4) hereditary non-polyposis CRC or familial adenomatous polyposis; or 5) contaminant local side effects (perforation, bleeding and abscess formation)were excluded. The final patient cohorts included a training cohort of 161 eligible patients treated at Guangdong Provincial People’s Hospital between March 1,2009 and August 31, 2012, and a temporary validation cohort of 248 patients treated at Guangdong Provincial People’s Hospital between September 1, 2012 and December 31, 2014. An external validation was performed using a separate cohort of 103 patients collected at Yunnan Cancer Hospital between January 1, 2013 and April 30,2015.
The outcome of interest in this study was overall survival(OS), defined as the time from surgery to death due to any cause. Patients were postoperatively followed up with abdominal CT every 6-12 months for the first 2 years and then annually, according to the follow-up protocol of Guangdong Provincial People’s Hospital. The follow-up duration was measured from the time of surgery to the last follow-up date, with information regarding the survival status at the last follow-up recorded. Patients were followed up to ensure a minimum of 36 months’ follow-up after the surgery.
Baseline information was collected, including demographic information (age and sex) and the TNM stage. The 7th edition of AJCC TNM staging manual was used to classify the patients.
Source code of proposed method can be found in the following Github repository, https://github.com/helan0811/Radiomics-project_CRC. Imaging data and clinical information are not publicly available for patient privacy purposes, but are available from the corresponding author upon reasonable request.
All patients underwent contrast-enhanced abdominal CT using one of the two multi-detector row CT (MDCT)systems (GE Lightspeed Ultra 8; GE Healthcare, Hino,Japan or 64-slice LightSpeed VCT, GE Medical systems,Milwaukee, WI, USA). The acquisition parameters are as follows: 120 kV; 160 mAs; 0.5- or 0.4-second rotation time;detector collimation: 8×2.5 mm or 64×0.625 mm; field of view, 350 mm × 350 mm; matrix, 512×512. After routine non-enhanced CT, arterial and portal venous-phase contrast-enhanced CT was performed after 22 s and 60 s delay following intravenous administration of 90-100 mL of iodinated contrast material (Ultravist 370, Bayer Schering Pharma, Berlin, Germany) at a rate of 3.0 or 3.5 mL/s with a pump injector (Ulrich CT Plus 150, Ulrich Medical, Ulm, Germany). Contrast-enhanced CT was reconstructed with reconstruction thickness of 2.5 mm.Portal venous-phase CT images were retrieved from the picture archiving and communication system (PACS)(Carestream, Canada) for image feature extraction because of well differentiation of tumor tissue from adjacent normal bowel wall.
The in-house MATLAB-based feature extraction algorithm was applied to all patients, following the definitions and guidelines from the Imaging Biomarkers Standardization Initiative (IBSI) (22).
Inspired by the concept of EDI proposed by Natrajanet al.(18)., we extracted EcoRad features to measure intratumoral spatial heterogeneity based on pre-treatment contrast-enhanced CT images (Figure 1). First, the tumor contour was delineated to obtain VOI. Then, on each slice,the tumor region was divided into multiple nonoverlapping sub-regions of n × n pixels. A total of 45 radomics features, including gray-level histogram features and gray-level co-occurrence matrix (GLCM) features,were then extracted from each sub-region (Supplementary materials). Gaussian mixture model (GMM) unsupervised clustering in an unbiased manner was used to identify the optimum number of clusters (K) of each radiomics feature in the whole tumor volume (23). The GMM unsupervised clustering was given by
whereD=d1,d2,···,dnrepresents the image features forregions in a tumor.,andwere the mean,variance, and weight of a Gaussian distribution.K, the number of clusters, determined using the Bayesian information criterion, was designated as the “EcoRad feature”. The range of K values was set from 1 to 5 to avoid an insufficient sample size per cluster. In this manner, 45 EcoRad features (Supplementary Table S1) that could provide side information regarding the spatial distribution of each radiomic feature within the tumor volume, thereby reflecting the degree of heterogeneity in the whole tumor volume, were derived in our study.
Figure 1 The workflow showing EcoRad feature extraction.
A detailed approach to selecting the optimal sub-region size is described inSupplementary materials. The correlation between each of the K values and the number of sub-regions was calculated using the Pearson correlation coefficient to test if the K value was independent from the number of sub-regions (Supplementary materials).
As a benchmark for the utility of the EcoRad features, we also extracted 45 volume-averaged radiomics features using the same calculation algorithm, as presented inSupplementary materials.
Given that the delineation of the tumor and normal bowel wall could be challenging, we assessed the reproducibility of the EcoRad features extraction. We randomly selected 50 patients for reproducibility analysis by using intra- and inter-correlation coefficients (ICCs). For the assessment of inter-observer agreement of the EcoRad features, two radiologists with 15 years (Reader 1) and 10 years (Reader 2) of experience in abdominal CT interpretation performed the VOI delineation for EcoRad features extraction, in a blind fashion, respectively. Reader 1 repeated the delineation procedure with an interval of two weeks to assess the intra-observer agreement.
The least absolute shrinkage and selection operator(LASSO) Cox method, suitable for the regression of highdimensional data, was used to select the most relevant EcoRad features based on the association between each feature and OS in the training cohort. The selected features were constructed into an EcoRad signature, via a linear combination weighted by their respective coefficients. Forty five volume-averaged radiomics features were also subjected to LASSO Cox analysis to select relevant features, as to construct a volume-averaged radiomics signature.
The potential association of the EcoRad signature with OS was first assessed in the training cohort and then validated in the external validation cohort 1 and the external validation cohort 2 by using Kaplan-Meier survival analysis.Patients were classified into high-risk and low-risk groups according to the EcoRad-score, with the optimal cut-off for dichotomizing the EcoRad-score identified by the timedependent relative operating characteristic (ROC) analysis at 3 years. The same fixed threshold was then applied to the validation cohorts. Survival curves were assessed using the log-rank test.
The discrimination performance of the EcoRad signature, as a continuous variable, for OS prediction was assessed with respect to the Harrell’s C-index and integrated area under the ROC curve (iAUC), and the time-dependent area under the ROC curve (tAUC) was presented (24-26).
The evaluation of the EcoRad signature as an independent marker was performed by integrating candidate risk factors into the multivariable Cox proportional hazards analysis using a backward stepwise approach. Hazard ratios (HRs) and 95% confidence intervals (95% CI) were computed to determine the correlation with OS. Candidate variables adjusted in the multivariable models include age, gender, T stage and N stage.
To demonstrate the incremental value of the EcoRad signature to the traditional staging system for individualized prediction of OS, a reference model was built based on independent clinicopathological factors in the above multivariate survival analyses, followed by the construction of a radiomics model to integrate the EcoRad signature with independent clinicopathological factors in the training cohort.
The incremental value of the EcoRad signature was first assessed by comparing the discrimination performance of the radiomics model and reference model, with respect to the Harrell’s C-index and iAUC, and the tAUC was presented. The two-tailed Student’sttest for dependent samples was used to compare two Harrell’s C-indices. The calibration of the radiomics model and reference model was tested using the Gr?nnesby and Borgan goodness-of-fit statistic by comparing the observed and expected events based on the Kaplan-Meier 3-year estimates, with smaller values of the test statistic (i.e. non-significant P values)suggesting better calibration (27). Calibration performance of the two models was also visualized by using the calibration curves.
To quantify the improvement of usefulness offered by the EcoRad signature intuitively, a net reclassification improvement (NRI) calculation was also applied. NRI, by assessing the correctness of movement of predicted probabilities, could quantify the improvement in performance provided by the predictor to an existing model.
We also used the decision curve analysis to evaluate the net benefits of the prediction models, which assesses the benefits of correctly detecting people who will have an event compared with the harms from a false-positive classification. The net benefit of a prediction model at a given risk threshold is calculated by calculating the difference between the proportion of true positives and the proportion of false positives multiplied by the odds of the risk. The net benefits across a range of threshold probabilities were calculated and compared with alternative strategies such as not treating anyone or treating everyone.The radiomics model that integrates the EcoRad signature was then visualized as a nomogram.
The median and interquartile range (IQR) were reported for the continuous variables. Frequency and percentage were reported for categorical variables. The reported statistical significance levels were all two-sided, with statistical significance set at 0.05. Statistical analyses were conducted using R software (version 3.4.1; R Foundation for Statistical Computing, Vienna, Austria). The LASSO Cox regression analysis was done using the “glmnet”package. The ROC analyses were done using the“survivalROC” package. The multivariable Cox regression analysis, calibration and nomogram plots were done with the “rms” package. The calculation of the C-index was performed with the “Hmisc” package, and the calculation of NRI was performed with the “survIDINRI” package.The calculation of the iAUC and tAUC was performed with the “risksetROC” package. The acquisition of the decision curve analysis (DCA) was performed with the“stdca.R” package.
Detailed patient characteristics for the training cohort,external validation cohort 1, and external validation cohort 2 are listed inTable 1. The median follow-up time was 100(IQR, 83-111) months for the training cohort, 64 (IQR,49-71) months for the external validation cohort 1, and 59(IQR, 51-70) months for the external validation cohort 2.
A sub-region size of 4 × 4 was utilized in our study based on the results described inSupplementary Table S2. In addition,the correlation mapping (Supplementary Figure S1) showed that the K values were independent of the number of subregions, as determined by the Pearson’s correlation coefficient (allr<0.75).
There were no statistically significant differences in the EcoRad features extraction between Reader 1 and Reader 2(inter-observer ICCs ranged from 0.713 to 0.980; all P<0.01), nor between Reader 1’s twice extraction (intraobserver ICCs ranged from 0.711 to 0.971; all P<0.01).
Nine EcoRad features were selected using the LASSO Cox regression method to construct the EcoRad signature(Supplementary Figure S2). The formula of the EcoRad signature was as follow:
The EcoRad-score was calculated for each patient using the above formula. An optimal cut-off of the EcoRad-score was determined based on the time-dependent ROC analysis(Figure 2). As shown inFigure 2, the point with the maximal sensitivity and specificity was selected as the cutoff point (EcoRad-score=0.291), and accordingly the patients were divided into two risk groups, the low-score group as the low-risk gourp (EcoRad score≤0.291) and high-score group as the high-risk group (EcoRad score>0.291). The same threshold was then applied to thevalidation cohorts.
Table 1 Distribution of EcoRad score and clinicopathological characteristics
Figure 2 Time-dependent receiver-operating characteristic curve at 3 years in the training cohort.
The 3-year OS and 5-year OS were 97.9% (95% CI,95.0%-100%) and 94.7% (95% CI, 90.3%-99.3%) for the low-score group, and 68.7% (95% CI, 58.4%-80.7%) and 59.7% (95% CI, 49.0%-72.7%) for the high-score group in the training cohort (unadjusted HR: 6.670, 95% CI:3.433-12.956; P<0.001;Figure 3A), respectively. Similar results suggesting that lower EcoRad-scores were associated with improved OS were also observed in the external validation cohort 1 (HR: 2.866, 95% CI:1.646-4.990; P<0.001;Figure 3B) and external validation cohort 2 (HR: 3.342, 95% CI: 1.289-8.663; P=0.002;Figure 3C).
For the discrimination performance, the EcoRad signature provided a C-index of 0.749 (95% CI:0.738-0.760) in the training cohort, 0.683 (95% CI:0.674-0.692) in the external validation cohort 1, and 0.716(95% CI: 0.694-0.738) in the external validation cohort 2.The iAUC was 0.734 in the training cohort, 0.725 in the external validation cohort 1 and 0.735 in the external validation cohort 2. The tAUC of the EcoRad signature is shown inFigure 4.
Unexpectedly, no relevant volume-averaged features could be selected to build a volume-averaged radiomics signature (Supplementary Figure S3). Therefore, univariate Cox regression analysis was also conducted to assess potential associations between each single EcoRad feature/each single volume-averaged feature and OS, and the results also showed that three EcoRad features were significantly associated with OS, with other two showing marginal significance, while none of the volume-averaged features reached statistical significance (Supplementary materials,Supplementary Table S3).
Multivariate analysis revealed additional prognostic value of the EcoRad signature given by any of the clinicopathologic variables tested including age, sex, T stage and N stage. As shown inTable 2, HRs of the EcoRad signature for survival were higher than those of the other independent risk factors (T and N stage). The reference model integrated only the T and N stages.
Figure 3 Kaplan-Meier curves for overall survival according to the EcoRad signature in (A) training cohort (unadjusted HR: 6.670, 95%CI: 3.433-12.956; P<0.001); (B) external validation cohort 1 (HR: 2.866; 95% CI: 1.646-4.990; P<0.001); and (C) external validation cohort 2 (HR: 3.342, 95% CI: 1.289-8.663; P=0.002). Patients were stratified into high EcoRad-score and low EcoRad-score group according to individual EcoRad-score based on the cut-off value 0.291 identified in the training cohort. HR, hazard ratio; 95% CI, 95% confidence interval.
Figure 4 Time-dependent AUC measured from one month to five years at 1-month intervals to reflect the prediction performance of the radiomics model and reference model at each time point in (A) training cohort; (B) external validation cohort 1; and (C) external validation cohort 2. A reference model was developed with only the T stage and N stage, and the radiomics model was constructed to integrate the EcoRad signature.
Table 3summarizes the discrimination performance of the radiomic model and the reference model, with respect to C-index, 3-year AUC, 5-year AUC and iAUC. The radiomics model showed better discrimination performance than the reference model in terms of the C-index (0.813,95% CI: 0.804-0.822vs.0.755, 95% CI: 0.745-0.765,P<0.001 in the training cohort; 0.758, 95% CI:0.751-0.765vs. 0.684, 95% CI: 0.677-0.691, P<0.001 in the external validation cohort 1; and 0.746, 95% CI:0.722-0.770vs.0.683, 95% CI: 0.662-0704, P<0.001 in the external validation cohort 2) and iAUC (0.841vs.0.763 for the training cohort; 0.785vs.0.664 for the external validation cohort 1; and 0.790vs.0.669 for the external validation cohort 2). tAUCs that demonstrate the improvement in OS prediction model after adding EcoRad signature to the reference model could be depicted inFigure 4.
As for the calibration performance, Gr?nnesby and Borgan tests suggested that both models demonstrated good calibration (radiomics model: training cohort χ2=2.593, P=0.107; external validation cohort 1 χ2=2.911,P=0.088; external validation cohort 2 χ2=0.231, P=0.631;reference model: training cohort χ2=0.445, P=0.505;external validation cohort 1 χ2=0.451, P=0.502; external validation cohort 2 χ2=0.150, P=0.699). The calibration plots showed that both the radiomics model and referencemodel showed good agreement in each patient cohort between the model prediction and actual observation for the 3-year OS (Figure 5). The integration of the EcoRad signature to the reference model showed improved reclassification performance (NRI: 0.341, 95% CI:0.098-0.653, P=0.007 in training cohort; 0.401, 95% CI:0.190-0.591, P<0.001 in external validation cohort 1;0.470, 95% CI: 0.041-0.701, P<0.033 in external validation cohort 2). The decision curve analysis showed that the radiomics model provided a range of risk thresholds that yielded a positive net benefit compared with the reference model, as shown inFigure 6.
Table 2 Association of predictors with overall survival in training cohort
Table 3 Prediction performance of radiomics model and reference model in all patient cohorts
The radiomics model that integrates the EcoRad signature was visualized as a nomogram (Figure 7), which could be used as an easy-to-use tool for probability scoring for the 3-year OS.
In this study, we established a method to predict OS through coupling radiomics analysis of CT images with the measurement of tumor ecosystem diversification in patients with stage I-III CRC. The derived marker, EcoRad signature, was identified as an independent prognostic predictor of OS and could serve as a good supplement to the traditional TNM staging system for risk stratification in patients with CRC.
Figure 5 Calibration curves for predicting overall survival (OS) at each time point in (A) training cohort; (B) external validation cohort 1;and (C) external validation cohort 2. The signature/model-predicted OS is plotted on the x-axis; the actual OS is plotted on the y-axis. A plot along the 45-degree line would indicate a perfect calibration model in which the predicted probabilities are identical to the actual outcomes. The radiomics model showed the best agreement between predictions and observed outcomes compared with that of the reference model and the EcoRad signature.
Figure 6 Decision curve analysis at 3 years of overall survival for the radiomics model and reference model in (A) training cohort; (B)external validation cohort 1; and (C) external validation cohort 2. The vertical axis shows the net benefit. The horizontal axis displays the correspondence between the risk threshold probabilities.
Figure 7 Visualization of the radiomics model as a prognostic nomogram.
Interest in radiomics analysis leveraging CT images has led to substantial efforts in prognostic image marker identification in the field of oncology. Through univariable analysis and subsequent multivariable analysis, we identified the CT-based EcoRad signature as an independent prognostic factor for OS in patients with CRC. These findings were in high concordance with previous reports that have revealed the potential of radiomics analysis on conventional standard-of-care CT scans for risk stratification in patients with CRC (14,28-32). However, challenges that need to be overcome before the clinical translation of these preliminary reports could not be overlooked, including issues regarding the univariable analytic design to identify the association between individual features and patient survival (33-36). As demonstrated in the results, instead of reporting univariable associations of radiomic features with survival outcomes, we constructed an EcoRad signature by integrating a panel of selected features as a prognostic marker that demonstrated adequate prediction performance for patient survival. This is in line with the report by Daiet al., who successfully developed a radiomics signature consisting of 13 volume-averaged features to predict OS in stage I-III colon cancer (29). Although good performance of the developed signature in predicting OS has been demonstrated (AUC: 0.768; 95% CI,0.745-0.791), no external validation sets were used to substantiate their prognostic performance in the above study. In conducting this analysis, we validated the prognostic performance of the derived EcoRad signature using external validation cohorts to guarantee its repeatability and reliability, which is therefore another point that strengthens its clinical relevance. Another recent study was conducted on stage II-III colon cancer to identify surrogate markers of OS, with results suggesting that the CT-based radiomics marker could serve as a useful supplement for individualized risk stratification (28). In the above study, a CT-based radiomic nomogram was constructed for survival prediction, yielding a C-index of 0.780 (95% CI: 0.734-0.847) in the training cohort and 0.678 (95% CI: 0.622-0.768) in the external validation cohort, which was lower than the prediction results obtained in our study, although disease-free survival was used as the main endpoint rather than OS. Besides, unlike the above studies (28,29) in which two-dimensional (2D)ROIs were segmented along the lesion contour in the largest cross-sectional area in a single image slice, our study delineated the VOIs in all layers to reconstruct the 3D morphology of the tumor, as a volumetric analysis could provide a more global picture of the tumor microenvironment, making the analysis more stable and representative than 2D analysis.
Typically, region- or volume-averaged features are extracted and linked to available data on survival outcomes(37). Radiomic analyses can also be performed on subregions of the tumor (habitats). Indeed, there is an ongoing effort to delineate tumor heterogeneity, such as the“habitat” imaging, which was obtained by the combination of multiparametric MR images to create one set of phenotypic micro-environmental heterogeneity maps based on their similarities and differences (clustering of voxels with specific colors produces regions that reflect different physiologic microenvironments) (38). The quantification of between-sub-region diversity in tumors using CT images,however, remains relatively unexplored. This study reports our preliminary experience on coupling radiomics analysis of CT images with the measurement of tumor ecosystem diversification using sub-regional analysis. Recent developments in computer vision have enabled ecological statistics to be directly applied to digital pathology,providing quantitative spatial heterogeneity measures that are predictive of prognosis in breast cancer (18,21,39).While pathomics provides quantitative information at the micro scale, radiomics enables quantification of tissue anatomic and functional heterogeneity at the macroscopic level. Therefore, inspired by the successful application of the ecological method of analyzing microenvironmental spatial heterogeneity at the micro scale in digital pathology,this study extends its application using sub-regional analysis at the macroscopic level for intra-tumor heterogeneity assessment in radiomics. Sub-regional EcoRad features were generated in our approach to quantify “betweenregion” diversity, instead of extracting simple “average value” measurements within the tumor to describe “inregion” diversity. A low value of the EcoRad feature indicates a relatively homogeneous distribution pattern,whereas a high value suggests a diverse pattern. As a benchmark for the utility of the EcoRad features, we also extracted 45 volume-averaged radiomics features as a benchmark for the utility of the EcoRad features.Unexpectedly, none of the volume-averaged features could be selected to build a volume-averaged prognostic radiomics signature. Univariate Cox regression analysis showed that five EcoRad features were associated with OS,while no volume-averaged features reached statistical significance. These findings suggest that, with our subregional analysis approach, CT data could be better used to depict tumor spatial heterogeneity while enabling survival prediction. Moreover, like other omics technologies,radiomics analysis has enabled the high-throughput measurement of thousands of features. Although the emergence of related data-driven markers is promising for clinical management, the generalizability of these markers has been challenged because of the overfitting problem,since it is methodologically fraught to identify robust markers when the number of features is far larger than the study population (7,8). In our study, we extracted smallscale EcoRad features. This will reduce variability to ensure broad application. Our study provides empirical evidence on how radiological images could be mined through coupling radiomics analysis of CT images with ecological measures at the macroscopic level to reveal prognostic information.
Taking a step forward, integrating the EcoRad signature provided incremental prognostic value to the current staging system of CRC. Similarly, previous studies found that incorporating radiomics data into the prediction model showed incremental value to the TNM staging system for individual survival estimation (15,40,41). Altogether, these findings suggest that the radiomics data could be used as a supplement to current clinical risk stratification over existing staging systems. Potential clinical utility is another important factor for evaluating the value of prognostic model. In this study, we conducted decision curve analysis for OS prediction at 3-year for the radiomics model and the reference model, as illustrated inFigure 6. The DCA curves suggested that the radiomics model adds net benefit compared with the “treat all” or “treat none” strategy. In addition, the radiomics model brought significantly more benefit (a positive net benefit) than the reference model across most of the range of the threshold probability,indicating that the integration of the radiomics signature with the traditional staging system into a prediction model provides notably more clinical utility.
This study has a few limitations. First, clinical considerations, including information regarding treatment after the surgery, need to be considered when integrating the prediction model into the clinic. Hence, a large-scale,independent, prospective validation cohort with treatment information available is warranted to assess the generalizability of the reported findings. Second, manual segmentation was performed to extract tumor regions from CT images, which are prone to inter-reader variability and are much more time consuming. Further effort should be made to develop improved segmentation tools that can provide fast, accurate, and less time-consuming workflows.
This study constructed an EcoRad signature that could predict OS and could be used as a supplement to the TNM system for individualized risk stratification in stage I-III CRC patients. This suggests that coupling CT radiomics analysis with measurement of diversification of the tumor ecosystem facilitates the identification of markers that may enable better understanding of tumor heterogeneity and pave the way for precision medicine.
This study was supported by the National Key R&D Program of China (No. 2021YFF1201003); the Key R&D Program of Guangdong Province, China (No.2021B0101420006); the National Science Fund for Distinguished Young Scholars (No. 81925023 and 82071892); the National Natural Science Foundation of China (No. 81771912 and 82071892); and the National Natural Science Foundation for Young Scientists of China(No. 81701782 and 81901910).
Conflicts of Interest: The authors have no conflicts of interest to declare.
Supplementary materials
Radiomics feature extraction algorithms
Three Laplacian of Gaussian filters (filter parameter=1.0, 1.5, 2.0, 2.5, respectively) were applied to retrieve quantitative values for the following radiomics features within each sub-region. Values of these radiomics features were also obtained without filtration (filter parameter=0).
The Laplacian of the Gaussian filter (?2G) distribution is given by
x,ydenote the spatial coordinates of the pixel, andσis the value of the filter parameter.
A total of 45 radiomic features were extracted, including gray-level histogram features and gray-level co-occurrence matrix(GLCM) features.
Gray-level histogram features
X(i) indicates the intensity of the gray leveli, andNdenotes the sum of the pixels in the image.
Gray-level co-occurrence matrix (GLCM) features
A matrixP(i,j) indicates the relative frequency with the intensity values of two pixels (iandj) at the distances ofi=1 and inj=0o direction. Ngis the number of discrete intensity levels in the image;x,ydenotes the spatial coordinates of the pixel.μ,μx(i),μy(j) is the mean ofP(i,j),Px(i),Py(j), andσx(i),σy(j) is the standard deviation ofPx(i),Py(j), respectively.
Optimization for choice of sub-region size
Region of interest (ROI) on each slice was divided into multiple non-overlapping square sub-regions with equal sizes (n×npixels). Size of sub-region needs to be small enough to ensure sufficient numbers of sub-region for the following clustering process, yet sufficiently large to measure intratumoral heterogeneity. Therefore, the optimal value ofnwas explored within the range from 3 to 10.We used 3 as the minimum value ofnconsidering that it is the minimum unit area to ensure accurate calculation of features, and 10 as the maximum value as to avoid small samples of sub-regions in each cluster. To select the optimal sub-region size, the performance of EcoRad signature based on different sub-region sizes for survival stratification(high-riskvs.low-risk) with respect to the Harrell concordance index (C-index).
Correlation between each of K values and number of sub-regions
The correlation between each of K values and number of sub-regions was calculated using the Pearson correlation coefficient.
Univariable analyses to assess association between each radiomics feature and overall survival
Univariable analyses to assess the association between each single EcoRad feature/each single volume-averaged feature and OS were conducted with the Cox proportional hazards model, and results are listed inSupplementary Table S3.
Figure S1 Correlation map between each of K values and the number of sub-regions. K values were independent from the number of subregions, as determined by the Pearson’s correlation coefficient (all r<0.75). The brighter the red (blue) color, the higher (lower) the Pearson’s correlation coefficient value.
Figure S2 EcoRad feature selection using LASSO Cox regression model. (A) Tuning parameter (λ) selection in LASSO model used 10-fold cross-validation via minimum criteria. The partial likelihood deviance was plotted versus log (λ). The dotted vertical lines were drawn at the optimal values using the minimum criteria and the 1-SE criteria. A value λ of 0.0399 with log (λ)=-3.22 was chosen (the minimum criteria)according to ten-fold cross-validation; (B) LASSO coefficient profiles of 45 EcoRad features. A coefficient profile plot was produced against the log-lambda sequence. The vertical line was drawn at the value selected using ten-fold cross-validation, where the optimal λ resulted in 9 non-zero coefficients. LASSO, the least absolute shrinkage and selection operator.
Figure S3 Volume-averaged radimics feature selection using LASSO Cox regression analysis. (A) Tuning parameter (λ) selection in LASSO model used 10-fold cross-validation via minimum criteria. The partial likelihood deviance was plotted versus log (λ). The dotted vertical lines were drawn at the optimal values using the minimum criteria and the 1-SE criteria. A value λ of 0.0588 with log (λ)=-2.834 was chosen(the minimum criteria) according to ten-fold cross-validation; (B) LASSO coefficient profiles of 45 EcoRad features. A coefficient profile plot was produced against the log-lambda sequence. The vertical line was drawn at the value selected using ten-fold cross-validation, where the optimal λ resulted in no non-zero coefficients. LASSO, the least absolute shrinkage and selection operator.
Table S1 Calculation of 45 radiomic features within each square in EcoRad feature extraction
Table S2 Performance of EcoRad signature based on different sub-region sizes for OS prediction
Table S3 Univariable analyses of EcoRad feature and volume-averaged feature by Cox proportional hazards model
Chinese Journal of Cancer Research2022年1期