Methods

Training and Evaluation Design

LesionShiftAI uses a shared pipeline across model families so cross-model comparisons remain tied to the same preprocessing, splitting, and evaluation protocol.

Data Flow

Dataset Protocol

Training Domain

ISIC 2019 samples are split into train and validation partitions using deterministic seeding configured in YAML.
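A deterministic, seed-driven split can be sketched as below; the seed and validation fraction shown are illustrative, not the project's actual YAML values.

```python
import random

def split_indices(n_samples, val_fraction, seed):
    # Shuffle sample indices with a fixed seed so every run of the
    # pipeline reproduces the identical train/validation partition.
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_val = int(n_samples * val_fraction)
    return idx[n_val:], idx[:n_val]  # (train, validation)

# Hypothetical values; in the pipeline these would come from the YAML config.
train_idx, val_idx = split_indices(n_samples=100, val_fraction=0.2, seed=42)
```

Because the shuffle is driven only by the configured seed, re-running the pipeline (or a different model family) always sees the same partitions.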

External Domain

HAM10000 is held out as an external test domain and never used for model selection.

Preprocessing

Shared image transforms and DataLoader settings ensure consistent feature-space assumptions across all pipelines.
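One way to enforce that consistency is to define the transform and loader settings once and have every pipeline import them. The names and values below are illustrative assumptions (ImageNet normalization statistics are standard for pretrained backbones), not the project's actual configuration.

```python
# Single source of truth for preprocessing, shared by every model pipeline.
SHARED_TRANSFORMS = {
    "resize": (224, 224),
    "normalize_mean": (0.485, 0.456, 0.406),
    "normalize_std": (0.229, 0.224, 0.225),
}

SHARED_LOADER = {"batch_size": 32, "num_workers": 4, "pin_memory": True}

def loader_kwargs(shuffle):
    # Each pipeline chooses only whether to shuffle; all other DataLoader
    # settings come from the shared base so feature-space assumptions match.
    return {**SHARED_LOADER, "shuffle": shuffle}
```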

Models

Compared Pipelines

Baseline CNN

A single ResNet50 backbone used as the benchmark, giving a direct measurement of validation-to-external transfer.

Ensemble CNN

Five fold-specific ResNet50 members are trained and merged by averaging their predicted malignancy probabilities, testing robustness under domain shift.
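The merge step reduces to averaging each member's predicted malignancy probability per image. A minimal numpy sketch (the probabilities below are made up for illustration):

```python
import numpy as np

def ensemble_malignancy(member_probs):
    # member_probs: one row of per-image malignancy probabilities per
    # fold-specific model; the ensemble score is the per-image mean.
    probs = np.asarray(member_probs, dtype=float)
    return probs.mean(axis=0)

# Hypothetical outputs from five fold-specific members for three images.
fold_probs = [
    [0.90, 0.20, 0.60],
    [0.80, 0.10, 0.50],
    [0.85, 0.30, 0.55],
    [0.95, 0.15, 0.65],
    [0.90, 0.25, 0.60],
]
ensemble_scores = ensemble_malignancy(fold_probs)
```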

Vision Transformer (ViT-B16)

ViT-B16 initialized from pretrained weights with warmup and minimum-learning-rate control for stable fine-tuning.
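A common realization of warmup plus a minimum-learning-rate floor is linear warmup followed by cosine decay. This sketch assumes that shape, which may differ from the project's exact schedule; all numeric values are illustrative.

```python
import math

def lr_at_step(step, total_steps, warmup_steps, base_lr, min_lr):
    # Linear warmup: ramp from ~0 up to base_lr over warmup_steps.
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    # Cosine decay from base_lr down to the min_lr floor, never below it.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return min_lr + (base_lr - min_lr) * cosine
```

The warmup phase avoids large early updates that can destabilize pretrained ViT weights, while the floor keeps late-stage fine-tuning from stalling entirely.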

Vision Transformer (ViT-L16)

ViT-L16 initialized from pretrained weights to test higher-capacity transfer under the same protocol.

Evaluation

Metric and Artifact Policy

  • Core metrics: accuracy, precision, recall, F1, ROC AUC, PR AUC.
  • Confusion terms: TN, FP, FN, TP are exported for each split.
  • Curve artifacts: split-level ROC/PR PNG files and JSON payloads.
  • Generalization gap: validation minus external-test value, reported per metric.
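The gap computation itself is a per-metric subtraction, where a positive gap indicates performance lost when moving from validation to the external domain. The metric values below are hypothetical.

```python
def generalization_gap(val_metrics, ext_metrics):
    # Positive gap = drop from the validation split to the external test set.
    return {name: val_metrics[name] - ext_metrics[name] for name in val_metrics}

# Hypothetical metric values for illustration only.
val = {"accuracy": 0.91, "roc_auc": 0.96}
ext = {"accuracy": 0.84, "roc_auc": 0.90}
gap = generalization_gap(val, ext)
```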