
        Machine learning models and over-fitting considerations

        2022-12-28 09:00:28
        World Journal of Gastroenterology 2022, Issue 5

        Paris Charilaou, Robert Battat

        Abstract Machine learning models may outperform traditional statistical regression algorithms for predicting clinical outcomes. Proper validation when building such models and tuning their underlying algorithms is necessary to avoid over-fitting and poor generalizability, to which smaller datasets are more prone. In an effort to educate readers interested in artificial intelligence and model-building based on machine-learning algorithms, we outline important details on cross-validation techniques that can enhance the performance and generalizability of such models.

        Key Words: Machine learning; Over-fitting; Cross-validation; Hyper-parameter tuning

        TO THE EDITOR

        Con et al[1] explore artificial intelligence (AI) in a classification problem of predicting biochemical remission of Crohn’s disease at 12 mo post-induction with infliximab or adalimumab. They illustrate that, after applying appropriate machine learning (ML) methodologies, ML methods outperform conventional multivariable logistic regression (a statistical learning algorithm). The area under the curve (AUC) was the chosen performance metric for comparison, and cross-validation was performed.

        Their study elucidates a few important points regarding the utilization of ML. First, they used repeated k-fold cross-validation, which is primarily utilized to prevent over-fitting of the models. While common in ML, this technique has not traditionally been used with conventional regression models in the literature. Especially in small datasets, such as in their study (n = 146), linear (and, in the case of neural networks, non-linear) relationships risk being “learned” by chance, leading to poor generalization of the models when applied to previously “unseen” or future data points. It was evident from their analysis that the “naïve” AUCs (from training the model on all the data) were significantly higher than the mean cross-validated AUCs in all 3 models, suggestive of “over-fitting” when one does not cross-validate. Smaller datasets tend to be more susceptible to over-fitting, as they are less likely to accurately represent the population in question.
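To make the procedure concrete, repeated k-fold cross-validation can be sketched in pure Python (the fold count, number of repeats, and seed below are illustrative, not taken from the study): each repeat reshuffles the sample indices and partitions them into k non-overlapping held-out folds, so every observation is tested exactly once per repeat.

```python
import random

def repeated_kfold_indices(n, k=5, repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated k-fold CV.

    Each repeat reshuffles the n sample indices, then splits them into
    k roughly equal, non-overlapping test folds; the remaining indices
    form the corresponding training fold.
    """
    rng = random.Random(seed)
    indices = list(range(n))
    for _ in range(repeats):
        rng.shuffle(indices)
        # Distribute the remainder so fold sizes differ by at most one.
        fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
        start = 0
        for size in fold_sizes:
            test = indices[start:start + size]
            train = indices[:start] + indices[start + size:]
            yield train, test
            start += size

# Example with n = 146 patients (as in the study) and k = 5:
splits = list(repeated_kfold_indices(146, k=5, repeats=10))
print(len(splits))  # → 50 train/test splits (10 repeats × 5 folds)
```

The cross-validated performance metric (e.g., AUC) is then the mean over all held-out folds, rather than a single optimistic estimate from the full training data.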

        Second, the authors utilized “hyper-parameter tuning” for their neural network models, where the otherwise arbitrarily selected “settings” (or hyper-parameters, such as the number of inner neuron layers and the number of neurons per layer) of the neural network are chosen based on performance. Hyper-parameters cannot be “learned” or “optimized” by simply fitting the model (as happens with predictor coefficients); the only way to discover the best values is to fit the model with various combinations and assess its performance. The combinations can be evaluated stochastically (randomly or via a Bayes-based approach) or using a grid approach (e.g., for 3 hyper-parameters that each take 5 potential values, there are 5 × 5 × 5 = 5³ = 125 combinations to evaluate) over k times. One may ask: if one were to fit a model 125 × k times on 146 observations, is there not a risk of over-fitting the “optimal” hyper-parameter values? To avoid such a problem, nested k-fold cross-validation must be performed: within each repeated k-fold training data subset, a sub-k-fold “inner” training/validation must be done to evaluate each hyper-parameter combination. In this way, we overcome potential bias toward optimistic model performance, which can occur when we use the same cross-validation procedure and dataset both to tune the hyper-parameters and to evaluate the model’s performance metrics (e.g., AUC)[2]. The authors did not elaborate on how the hyper-parameter tuning was performed.
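A skeleton of this nested scheme may help readers see where the “inner” and “outer” loops sit. The grid below is hypothetical (the hyper-parameter names and values are illustrative, not from the study), and the fold generators and scoring function are placeholders for whatever model and splitter one uses:

```python
import itertools

# Hypothetical grid: 3 hyper-parameters with 5 candidate values each.
grid = {
    "hidden_layers": [1, 2, 3, 4, 5],
    "neurons":       [4, 8, 16, 32, 64],
    "learning_rate": [0.3, 0.1, 0.03, 0.01, 0.003],
}
combos = list(itertools.product(*grid.values()))
print(len(combos))  # → 125, i.e., 5 × 5 × 5 = 5³

def nested_cv(outer_folds, inner_folds, combos, fit_and_score):
    """Skeleton of nested k-fold cross-validation.

    outer_folds: iterable of (outer_train, outer_test) data splits
    inner_folds: function mapping outer_train -> (train, validation) splits
    fit_and_score: function (combo, train_data, test_data) -> metric (e.g., AUC)
    """
    outer_scores = []
    for outer_train, outer_test in outer_folds:
        # Inner loop: pick hyper-parameters using ONLY the outer training data.
        best_combo = max(
            combos,
            key=lambda c: sum(fit_and_score(c, tr, va)
                              for tr, va in inner_folds(outer_train)),
        )
        # Score the winning combination on the untouched outer test fold.
        outer_scores.append(fit_and_score(best_combo, outer_train, outer_test))
    return sum(outer_scores) / len(outer_scores)
```

The key property is that the outer test fold never influences hyper-parameter selection, which is what removes the optimistic bias described above.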

        Another point to consider in k-fold cross-validation in small datasets is the number of k-folds used, specifically in classification problems (i.e., yes/no binary outcomes). In this study[1], the outcome prevalence was 64% (n ≈ 93). With a chosen k = 5, the training folds would comprise 80% of the data, leading to approximately 74 positive cases of biochemical remission. The number of positive outcomes in each training fold must be considered, especially in logistic regression, where the rule of thumb recommends at least ten positive events per independent predictor to minimize over-fitting[3]. In this study[1], six predictors were eventually used in the multivariable model, making over-fitting less likely from a model-specification standpoint. Finally, k-folds are recommended to be stratified by the outcome, so that the outcome prevalence is equal among the training and testing folds. This becomes crucial when the prevalence of the outcome of interest is < 10%-20% (an imbalanced classification problem). While imbalanced classification is not an issue in this study[1], the authors did not mention whether they used outcome-stratified k-folds.
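The events-per-variable arithmetic above can be checked in a few lines, using the numbers reported in the study (prevalence, fold count, and predictor count); the variable names are our own illustration:

```python
# Back-of-the-envelope check: n = 146, prevalence = 64%, k = 5, six predictors.
n, prevalence, k = 146, 0.64, 5

positives = round(n * prevalence)                    # ≈ 93 patients in remission
train_fraction = (k - 1) / k                         # each training fold holds 80%
events_per_fold = round(positives * train_fraction)  # ≈ 74 positive events

predictors = 6
epv = events_per_fold / predictors                   # events per variable
print(positives, events_per_fold, round(epv, 1))     # → 93 74 12.3
```

With the ≥ 10 events-per-predictor rule of thumb, roughly 74 events in each training fold would support up to about seven predictors, consistent with the six used in the study.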

        Lastly, the endpoint utilized, CRP normalization, has poor specificity for endoscopic inflammation in Crohn’s disease[4]. More robust endpoints would include endoscopic inflammation and/or deep remission using validated disease activity indices[5].

        We congratulate the authors for their effort, which serves both as a proof-of-concept for using ML to improve prediction of outcomes in IBD and as an outline of methodologies that reduce over-fitting. In general, with the advent of AI and specifically ML-based models in IBD[6], it is important to recognize that while we now have the tools to construct more accurate models and enhance precision medicine, most ML-based models, such as artificial neural networks, are not intuitively interpretable (i.e., they are “black boxes”). Efforts in “explainable AI” are under way[7], hopefully eliminating the “black-box” concept in future clinical decision tools. Applying these to validated disease activity assessments will be essential for prediction models in future studies.
