The solutions will be compared using F1-score, area under the curve (AUC), and accuracy. F1-score will be used to rank the submissions, whereas accuracy and AUC will be reported as additional measures.
F1-score is the harmonic mean of precision and recall, where precision is the fraction of predicted positives that are true positives and recall is the fraction of actual positives that are predicted as positive.
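As an illustration of the definition above, the following is a minimal sketch of the computation, assuming binary labels in which 1 marks the positive class. The function name `precision_recall_f1` and the convention of returning 0.0 when a denominator is zero are assumptions made for this example, not part of the evaluation protocol.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels (1 = positive).

    Hypothetical helper for illustration; assumes labels are 0 or 1.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # Assumed convention: a metric with an empty denominator is 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy labels, invented purely for demonstration.
    print(precision_recall_f1([1, 1, 1, 0, 0], [1, 1, 0, 1, 0]))
```

Because the harmonic mean is dominated by the smaller of its two arguments, a submission cannot obtain a high F1-score by maximizing precision or recall alone.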
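The receiver operating characteristic (ROC) curve plots the true positive rate against the false positive rate as the decision threshold varies; the area under this curve (AUC) summarizes the trade-off in a single number, where 1.0 corresponds to a perfect ranking of positives above negatives and 0.5 to chance-level ranking.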
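One way to make the AUC concrete is its rank interpretation: it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison, assuming binary labels (1 = positive) and real-valued prediction scores; the function name `roc_auc` is hypothetical, and this quadratic-time version is meant for illustration rather than as the scoring code used in the evaluation.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Hypothetical helper for illustration; assumes labels are 0 or 1.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy scores, invented purely for demonstration.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Unlike F1-score and accuracy, this measure is computed from the continuous scores and does not depend on a particular decision threshold.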
Accuracy measures the percentage of diagnostic predictions that exactly match the ground truth.
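For completeness, accuracy can be sketched in a few lines. The function name `accuracy` is hypothetical, and the result is returned as a fraction in [0, 1]; multiplying by 100 gives the percentage.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the ground-truth labels.

    Hypothetical helper for illustration.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("accuracy is undefined for an empty set of cases")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


if __name__ == "__main__":
    # Toy labels, invented purely for demonstration.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
```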