Robust & Data-Efficient Learning in Medical AI

Segment Anything Model under Noisy Medical Radiology Imaging Conditions

Project Overview

Recent advances in Segment Anything Models (SAM) have demonstrated strong generalization across natural images, and more recently, across medical imaging domains through variants such as MedSAM and SAM2. However, the robustness of these models under realistic noisy medical imaging conditions remains insufficiently studied.

This project aims to systematically evaluate the robustness, stability, and failure modes of SAM-based models on 2D medical imaging datasets under controlled noise perturbations, simulating common acquisition artifacts encountered in clinical practice.

The final objective is to produce a high-quality benchmark study suitable for submission to a CVPR workshop, followed by an extended version targeting a Q1 journal.

Research Objectives

1. Robustness Evaluation

Assess how different SAM-based models perform when medical images are corrupted by various noise and artifact types.

2. Noise Sensitivity Analysis

Quantify segmentation degradation under increasing noise levels and identify noise types that most significantly impact performance.

3. Model Comparison

Benchmark and compare:

Original SAM
SAM2
Medical-domain adaptations (e.g., MedSAM and other public variants)

4. Visualization and Interpretability

Provide qualitative visualizations to reveal:

Failure modes
Boundary instability
Confidence degradation under noise

5. Scientific Dissemination

Package the findings into:

A CVPR workshop paper (short-term milestone)
An extended Q1 journal submission (long-term milestone)

Methodology

1. Dataset Preparation

Select 2D medical imaging datasets (e.g., X-ray, fundus, ultrasound, CT slices, or MRI slices).
Ensure consistent modality in the first phase to enable controlled analysis.

2. Noise Injection & Artifact Simulation

Synthetic noise and artifact perturbations will be introduced during preprocessing, such as:

Gaussian noise
Poisson noise
Salt-and-pepper noise
Motion blur
Intensity inhomogeneity
Low-contrast degradation

Noise levels will be systematically varied to simulate mild to severe acquisition artifacts.

3. Model Benchmarking

Apply SAM-based models using prompt-based and/or automatic settings.
Keep inference settings consistent to ensure fair comparison.
Evaluate robustness across noise types and intensities.

4. Evaluation Metrics

Dice coefficient
IoU (Jaccard index)
Boundary-based metrics (e.g., Hausdorff Distance)
Stability metrics across noise levels
Optional confidence/entropy-based analysis

5. Visualization & Analysis

Side-by-side qualitative comparisons
Failure case analysis
Sensitivity plots showing performance degradation trends

Timeline (6 Months Total)

Phase 1: CVPR Workshop Target (Months 1–3)

Goal: Produce a concise, well-controlled benchmark study.

Run experiments on 2 datasets of the same modality
Evaluate a limited but representative set of:
- SAM models
- Noise types
Generate quantitative results and qualitative visualizations
Write and submit a CVPR workshop paper

Compute Resources:

Google Colab
Kaggle

Phase 2: Q1 Journal Target (Months 4–6)

Goal: Extend and deepen the study for journal-level contribution.

Expand to 4–5 datasets, possibly across modalities
Include additional:
- SAM variants
- Conventional models
- Extended types of noise injection methods
- Robustness and stability metrics
Add deeper analysis and discussion on clinical relevance
Prepare and submit to a Q1 journal