Classifiers. So that each classifier identified roughly the same number of EDs as the expert annotation it was trained on, the identification thresholds on the sigmoid output were adjusted to 0.48, 0.45, 0.35, and 0.55. Performance was moderate on most metrics (Tab. 1): B-ACC 0.76–0.82, R-AUC 0.52–0.72, PR-AUC 0.51–0.65, F0.5 0.53–0.75, MCC 0.51–0.72, sensitivity 0.54–0.75, and precision 0.55–0.75. Most of the EEG was non-ED, which was reflected in higher scores for ACC (0.94–0.96), specificity (0.97–0.98), and NPV (0.97–0.98). The best overall result was obtained for classifier U. The metrics were also calculated for the comparison between the experts, using expert 1 as ground truth. Most of these values were somewhat lower than those of the classifiers.

Tab. 1. Results for the classifiers. Values are averages over folds (± standard deviation). E12: using expert 1 as ground truth and comparing with expert 2; E1: expert 1; E2: expert 2; U: union of expert 1 and 2; I: intersection of expert 1 and 2; Data: indicates which annotation was used for training and evaluation; ACC: accuracy; B-ACC: balanced accuracy; R-AUC: area under the receiver operating characteristic curve; PR-AUC: area under the precision-recall curve; MCC: Matthews correlation coefficient; Sens: sensitivity; Spec: specificity; Prec: precision; NPV: negative predictive value. Blue indicates comparison between expert 1 and 2; green indicates results from testing the classifiers; light green indicates results from training the classifiers.

Data ACC B-ACC R-AUC PR-AUC F0.5 MCC Sens Spec Prec NPV
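To make the threshold-dependent metrics listed above concrete, the following is a minimal Python sketch of how they can be derived from a thresholded sigmoid output. It is an illustration under stated assumptions, not the study's implementation: the function names `counts_at_threshold` and `binary_metrics` are hypothetical, the counts in the usage note are invented for the example, and the sketch assumes every denominator is non-zero. R-AUC and PR-AUC are omitted because they are computed over all thresholds rather than at a single one.

```python
import math


def counts_at_threshold(scores, labels, threshold):
    """Count TP, FP, TN, FN when a sample is called ED if score >= threshold.

    `scores` are sigmoid outputs in [0, 1]; `labels` are 1 (ED) or 0 (non-ED).
    Hypothetical helper, not taken from the study.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def binary_metrics(tp, fp, tn, fn, beta=0.5):
    """Compute the single-threshold metrics from confusion-matrix counts.

    Assumes all denominators are non-zero (both classes present and
    at least one positive prediction).
    """
    sens = tp / (tp + fn)            # sensitivity (recall)
    spec = tn / (tn + fp)            # specificity
    prec = tp / (tp + fp)            # precision
    npv = tn / (tn + fn)             # negative predictive value
    acc = (tp + tn) / (tp + fp + tn + fn)
    b_acc = (sens + spec) / 2        # balanced accuracy
    b2 = beta ** 2
    f_beta = (1 + b2) * prec * sens / (b2 * prec + sens)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "acc": acc, "b_acc": b_acc, "f_beta": f_beta, "mcc": mcc,
        "sens": sens, "spec": spec, "prec": prec, "npv": npv,
    }
```

With invented counts such as `binary_metrics(tp=60, fp=30, tn=880, fn=30)`, accuracy is high because true negatives dominate, while sensitivity, precision, and MCC are clearly lower. This is the same pattern the paragraph attributes to most of the EEG being non-ED.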