Quirino Lai, Gabriele Spoletini, Gianluca Mennini, Zoe Larghi Laureiro, Diamantis I Tsilimigras, Timothy Michael Pawlik, Massimo Rossi
Abstract
Key Words: Deep learning; Artificial neuronal network; Recurrence; Liver transplantation; Resection; Hepatocellular cancer
Hepatocellular carcinoma (HCC) is the most common primary liver malignancy and the third most common cause of cancer-related death worldwide. Surgery, in the form of liver transplantation and resection, is the mainstay of treatment as the only potentially curative treatment option. Ablation has emerged as an alternative treatment to resection for small tumors. In contrast, intra-arterial treatments and chemotherapy can offer disease control and be used as part of a multimodal therapeutic strategy[1].
Many factors affect survival following the treatment of HCC. Among them, we can consider background liver condition, radiologic and histologic characteristics of the tumor, biologic markers, and comorbidities.
Traditionally, conventional linear models, such as survival analysis and Cox proportional hazards models, have been used to evaluate the prognosis of HCC[2-4]. Nevertheless, linear systems can have considerable limitations and often fail to capture the complexity of the interactions among clinicopathological characteristics[5]. To overcome such constraints, artificial intelligence (AI) has been employed with growing interest in healthcare research during the last decade, in particular through deep learning (DL) techniques applied to artificial neural networks (ANN)[6]. An ANN is a mathematical model that resembles the structure and function of a biological neural system using computer technology. It consists of a highly interconnected set of units organized into an input layer (the data to be analyzed), one or more hidden layers that process the data, and an output layer that provides the outcomes. The peculiarity of an ANN is that it can be trained by exposing the network to examples of input/output pairs, thus improving its reliability[7]. During DL, the model reassigns a different weight to the connections within each hidden layer. The ANN learns from errors by comparing each generated output with the desired output: the error is backpropagated, and the existing weights between connections are modified accordingly. Once learning is complete, the ANN can create connections and make predictions on datasets that have not been observed before.
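The training loop described above can be illustrated with a minimal sketch. It is not the architecture of any study in this review: the single hidden layer, the sigmoid activations, the squared-error loss, the learning rate, and the six toy cases (two hypothetical normalized inputs and a binary outcome) are all assumptions chosen only to show how an output error is backpropagated into weight updates.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w1, b1, w2, b2):
    """Return the hidden activations and the network output for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    out = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, out


def mean_loss(samples, targets, params):
    """Mean squared-error loss of the network over all cases."""
    return sum(0.5 * (forward(x, *params)[1] - y) ** 2
               for x, y in zip(samples, targets)) / len(samples)


def train(samples, targets, n_hidden=3, lr=1.0, epochs=3000, seed=0):
    """Full-batch gradient descent with backpropagation (illustrative only)."""
    rng = random.Random(seed)
    n_in, n = len(samples[0]), len(samples)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        g_w1 = [[0.0] * n_in for _ in range(n_hidden)]
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in zip(samples, targets):
            hidden, out = forward(x, w1, b1, w2, b2)
            # Error at the output unit, scaled by the sigmoid derivative.
            d_out = (out - y) * out * (1.0 - out)
            for j in range(n_hidden):
                # Error propagated back to hidden unit j.
                d_hid = d_out * w2[j] * hidden[j] * (1.0 - hidden[j])
                g_w2[j] += d_out * hidden[j]
                g_b1[j] += d_hid
                for i in range(n_in):
                    g_w1[j][i] += d_hid * x[i]
            g_b2 += d_out
        # Move every weight against its average gradient.
        for j in range(n_hidden):
            w2[j] -= lr * g_w2[j] / n
            b1[j] -= lr * g_b1[j] / n
            for i in range(n_in):
                w1[j][i] -= lr * g_w1[j][i] / n
        b2 -= lr * g_b2 / n
    return w1, b1, w2, b2


# Hypothetical toy data: two normalized inputs per case, outcome 1 = event.
samples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.7, 0.8), (0.8, 0.6), (0.9, 0.9)]
targets = [0, 0, 0, 1, 1, 1]
params = train(samples, targets)
```

Calling `forward((0.9, 0.9), *params)[1]` returns the trained network's output for a previously unseen input. In a real study the inputs would be clinical variables and performance would be assessed on held-out patients rather than on the training cases.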
AI has been used to build models to predict a variety of outcomes related to HCC, such as tumor diagnosis, pathology characteristics, response to treatment, and survival[7,8]. With the growing availability of big data from fields such as genomics, AI can unravel otherwise hidden connections between tumor elements because of the increasing computational power of modern technology[9].
The objective of the current study was to systematically review the application of AI and DL in the prediction of survival among patients who were treated for HCC, as well as compare the performance of AI methods relative to linear prediction models.
A systematic review of the published literature focused on the prognostic impact of AI in the management of HCC was undertaken. The search strategy followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines[10].
The specific research question formulated in the present study includes the following PICO components: (1) Patient: Patient with a confirmed HCC; (2) Intervention: Evaluation of HCC treatment using AI; (3) Comparison: Evaluation of HCC treatment without using AI; and (4) Outcome: Patient death and/or tumor recurrence. A search of the PubMed and Cochrane Central Register of Controlled Trials Databases was conducted using the following terms: (Artificial intelligence OR deep learning) AND (HCC OR hepatocellular carcinoma OR hepatocellular cancer). The search period was from "1985/01/01" to "2020/02/29".
The systematic qualitative review included only English-language studies of human patients. Published reports were excluded if they: (1) Reported data on animal models; (2) Lacked sufficient clinical details; or (3) Contained non-primary source data (e.g., review articles, non-clinical studies, letters to the editor, expert opinions, and conference summaries). In the case of studies originating from the same center, possible overlapping of clinical cases was examined, and the most informative study was considered eligible.
Following a full-text review of the eligible studies, two independent authors (Lai Q and Larghi Laureiro Z) performed the data extraction and crosschecked all outcomes. During the selection of articles and extraction of the data, potential discrepancies were resolved following a consensus with a third reviewer (Mennini G). Collected data included the first author of the publication, year of publication, country, number of reported cases, research question/purpose, the method used, and key findings.
Selected studies were systematically reviewed with the intent to identify potential sources of bias. The quality of the papers was assessed using the Risk of Bias In Non-randomized Studies of Interventions tool[11].
The PRISMA flow diagram schematically depicts the article selection process (Figure 1). Among the 598 articles screened, a total of 127 studies reported on the use of AI in HCC. Among these articles, only 9 (7.1%) studies referred to the use of AI in the prediction of survival among patients with HCC and were included in this review[12-20]. Other studies using AI in HCC were excluded; specifically, these studies reported on the use of AI for the diagnosis of the tumor (n = 76, 59.8%), identification of specific genes or pathways (n = 17, 13.4%), prediction of tumor response after therapy (n = 16, 12.6%), and the prediction of pathological aspects (n = 9, 7.1%) (Figure 2). All studies included in the analytic cohort were published in the last decade except for one that was published in 1995[12]. All articles were from Asia; five studies were based on a population from Taiwan[13-17], two from China[18,20], one from Japan[12], and one from India[19].
Figure 1 Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart of the literature search and study selection.
Results from the qualitative assessment of the included studies are depicted in Figure 3. Six studies had a low risk of bias, while two studies were at high risk for bias, mainly due to the presence of potential confounders. In one study, due to the absence of clear data explaining the characteristics of the comparison groups, the risk of bias was unclear.
Data extracted from the nine eligible articles are reported in detail in Table 1. The largest studies were based on the same population of patients from the Taiwan Bureau of National Health Insurance. All patients had a diagnosis of a malignant neoplasm of the liver and underwent a hepatectomy between 1998 and 2009 (n = 22926)[14,15]. In all other studies, the sample size was smaller than 1000 cases, and in two studies, it was smaller than 100[12,17].
The use of ANN in populations of patients who underwent surgery was reported in six articles[12-16,18]. The outcomes investigated included in-hospital postoperative mortality[14], long-term overall survival[12,15,16,18], and disease-free survival after hepatic resection[13]. The remaining studies used AI systems other than ANN. Specifically, a support vector machine was used to develop predictive models for the recurrence of HCC following radiofrequency ablation[17]. In addition, an Artificial Plant Optimization algorithm was evaluated for its effectiveness and efficiency in predicting HCC recurrence[19]. Peritumoral radiomics was used to predict early recurrence after curative-intent resection or ablation of HCC[20].
A cohort was used in the majority of studies to train the AI network[12-16,18,20]; in one study, a double five-fold cross-validation loop method was adopted[17]. In all studies, AI demonstrated superior predictive performance compared with other traditional models. In several studies, the ANN outperformed logistic regression or Cox regression models[13-16,18]. In all cases, the prediction accuracy of the AI models expressed as the areas under the curve was significantly improved compared with traditional statistical techniques[13-16,18].
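As a rough illustration of the two evaluation ideas mentioned here, the sketch below computes an area under the curve by pairwise comparison and generates a five-fold split. It is a simplified stand-in rather than the procedure of any included study: the function names, the toy labels and scores, and the single (non-nested) cross-validation loop are assumptions, whereas the study cited above used a double five-fold loop.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve: the fraction of (event, non-event) pairs in
    which the event case receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each case is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# Hypothetical predictions from two models on the same four patients.
labels = [0, 0, 1, 1]
scores_model_a = [0.1, 0.4, 0.35, 0.8]
scores_model_b = [0.2, 0.6, 0.3, 0.5]

auc_a = auc(labels, scores_model_a)  # 3 of 4 pairs ranked correctly
auc_b = auc(labels, scores_model_b)  # 2 of 4 pairs ranked correctly
splits = list(k_fold_indices(10, k=5))
```

A higher value means the model ranks patients who experienced the event above those who did not more often. Whether a difference between two such values is statistically significant requires a separate test that this sketch does not cover.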
Table 1 Articles focused on the role of artificial intelligence in the prediction of survival
Figure 2 Different articles exploring the impact of artificial intelligence as diagnostic or prognostic tool in the setting of hepatocellular carcinoma management. AI: Artificial intelligence; HCC: Hepatocellular carcinoma; LRT: Locoregional therapy.
Figure 3 Results of the Risk of Bias In Non-randomized Studies of Interventions tool for the extracted articles.
The use of AI in healthcare began in the early 1970s and has gained increased acceptance over the last decades. In particular, the development of AI in medical research and its clinical applications have gained popularity, in part because of the widespread use of AI in almost all fields of human life[21]. The current literature search revealed that many AI studies focused on diagnosis, and on the application of AI to distinguish the radiological features of HCC. The diagnostic discrimination of benign vs malignant liver masses was the objective of a previous systematic review, which noted that AI could differentiate liver cancer and, in particular, HCC from other lesions better than other methods such as Bayesian models and image inspection by expert radiologists[8]. The present systematic review is important because it is the first to summarize the ability of AI systems to predict patient survival following treatment of HCC. Our results revealed that different types of AI methods have been employed in the existing studies, with heterogeneous patient sample sizes. The majority of the included studies (n = 6/9) utilized ANN for the analysis of predictors of post-treatment survival, which is in line with the results of other systematic reviews on the prediction of outcomes[22,23]. Considering the need for more accurate prediction, investigators have compared AI techniques with traditional linear models to optimize treatment decision-making. Although several prediction models have utilized both pre- and postoperative variables, these models have not proved useful in clinical decision-making since they require information that is available only after resection or other treatment. In contrast, models with only preoperative variables can help guide treatment strategies in the preoperative setting[24,25].
Importantly, our systematic review revealed that the prediction of survival using AI methodology was highly accurate and remained robust in studies with limited sample sizes, although current knowledge in prediction modeling using AI has noted that AI performs better when applied to larger sample sizes[26]. Although the reason for the consistent high predictive accuracy of AI models is multifactorial, the complexity of AI models (e.g., a higher number of events per variable) further reinforces the superiority of their performance, which might explain the outstanding results even when used in smaller size studies[27].
Reproducibility and applicability of AI models in clinical practice and across different centers might be questioned due to the difficulties in acquiring and utilizing dedicated software to process the data. In addition, as ANN learns from examples, one may argue that ANN needs to be trained before it can be applied to datasets that differ from the one it was initially built on. Nevertheless, what emerged from this systematic review was that AI could be an outstanding adjunct to conventional linear systems of analysis to predict post-treatment survival. Cucchetti et al[7] made their ANN available online so that other centers can test and possibly enrich their model, which aims to predict HCC tumor grade and micro-vascular invasion preoperatively. In addition, when applied to other aspects of HCC, AI is particularly useful for exploring interconnections of big data such as in genomics. ANN combined with genotyping for microsatellite mutations/deletions was able to predict HCC recurrence after liver transplantation with 85% accuracy in the center where the model was developed, and with 89.5% accuracy when examined in data from another center[28]. AI applied to radiomics is increasingly investigated: Machine learning has been used to provide a quantitative interpretation of computed tomography scans to reclassify indeterminate nodules, potentially avoiding biopsy and improving patient safety[29]. Similarly, neural network algorithms have been built with the intent to objectively and reproducibly provide liver imaging reporting and data system categories concordant with the expert radiologists' classification[30].
One of the downsides associated with the application of ANN in clinical practice might be the disproportionate number of input factors per patient (too many, e.g., thousands of proteins for gene expression) relative to the number of patients (too few). The risk of overfitting the dataset can be mitigated by strictly filtering out potentially irrelevant variables[31]. In particular, selecting the variables to use as input factors in ANN using traditional statistics has been employed as a strategy to improve efficiency and reduce redundancy of the AI model, as confirmed by all of the studies using ANNs included in this systematic review. When analyzing cancer patient data (i.e., too many dimensions for a relatively small number of samples), combining DL with other machine learning techniques has been used to identify prognostic gene signatures and differentiate between better and worse prognosis in patients with various types of tumors including HCC[32].
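The idea of screening candidate inputs with traditional statistics before they reach the network can be sketched as a simple univariate filter. The criterion used here (absolute Pearson correlation with the outcome at or above a fixed threshold), the threshold value, and the variable names are all assumptions for illustration. The included studies are not described in enough detail here to reproduce their actual selection rules.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant variable."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_features(columns, outcome, threshold=0.5):
    """Keep the names of variables whose |r| with the outcome reaches the threshold."""
    return [name for name, values in columns.items()
            if abs(pearson(values, outcome)) >= threshold]


# Hypothetical candidate variables for six patients (outcome 1 = event).
outcome = [0, 0, 0, 1, 1, 1]
columns = {
    "marker_a": [1, 2, 3, 4, 5, 6],   # rises with the outcome
    "noise":    [2, 5, 3, 3, 5, 2],   # unrelated to the outcome
    "marker_c": [9, 8, 7, 3, 2, 1],   # falls with the outcome
    "constant": [1, 1, 1, 1, 1, 1],   # carries no information
}

selected = select_features(columns, outcome)
```

Only the retained variables would then be passed to the network as inputs. A filter of this kind should be fitted on the training cases alone, otherwise the selection step itself can leak information from the test cases.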
Artificial intelligence can provide an enhanced prediction of survival following treatment of HCC compared with conventional linear models. The use of AI can be particularly helpful to process large amounts of data, as well as help identify patterns and associations that are not evident with traditional techniques given the complexity of biological systems. AI has a promising role in healthcare research and its application to HCC. As an increasing amount of data becomes available per patient, it is important to identify to what extent AI can help guide clinical decision-making and optimize the prediction of long-term outcomes based on the unique characteristics of each patient.
World Journal of Gastroenterology, 2020, Issue 42