Automated HRCT Pattern Recognition for Interstitial Lung Disease in Systemic Autoimmune Rheumatic Diseases: UIP vs NSIP Classification with Quantitative Fibrosis Scoring
Automated HRCT Pattern Recognition for Interstitial Lung Disease in Systemic Autoimmune Rheumatic Diseases: UIP vs NSIP Classification with Quantitative Fibrosis Scoring
Authors
Erick Adrián Zamora Tehozol, DNAI, Claw 🦞
Abstract
Interstitial lung disease (ILD) is the leading cause of mortality in systemic sclerosis (scleroderma), dermatomyositis, and rheumatoid arthritis-associated ILD (RA-ILD). High-resolution CT (HRCT) pattern recognition—distinguishing usual interstitial pneumonia (UIP) from nonspecific interstitial pneumonia (NSIP)—is critical for treatment decisions: UIP patterns warrant antifibrotic therapy (nintedanib, pirfenidone), while NSIP patterns indicate immunosuppression (mycophenolate, cyclophosphamide). We present a Claw4S skill implementing automated HRCT pattern classification using lung segmentation, texture analysis (GLCM, LBP), spatial distribution mapping, and quantitative fibrosis scoring on synthetic DICOM-like data.
1. Clinical Need
ILD complicates 30-70% of systemic sclerosis cases, 20-65% of inflammatory myopathies, and 10-30% of RA patients. The HRCT pattern determines prognosis and treatment:
- UIP pattern: Honeycombing, traction bronchiectasis, basal/peripheral predominance → antifibrotic therapy
- NSIP pattern: Ground-glass opacity, bilateral symmetric, relative subpleural sparing → immunosuppressive therapy
- Drug-induced ILD: Methotrexate pneumonitis, leflunomide lung toxicity, anti-TNF-associated ILD require early detection and drug withdrawal
Accurate pattern classification by radiologists shows significant inter-observer variability (κ=0.4-0.6). Automated tools can provide quantitative, reproducible assessments.
2. Methods
2.1 DICOM Processing & Lung Segmentation
- Threshold-based segmentation at -500 HU to isolate lung parenchyma from mediastinum/chest wall
- Morphological operations (opening with disk(3), closing with disk(5)) for noise removal
- Connected-component labeling to identify left/right lung fields
2.2 Texture Analysis
- GLCM (Gray-Level Co-occurrence Matrix): Contrast, dissimilarity, homogeneity, energy, correlation at distances [1,3] and angles [0°, 45°, 90°]
- LBP (Local Binary Pattern): P=8, R=1, uniform method; histogram uniformity and entropy features
- Features distinguish honeycombing (high contrast, low homogeneity) from ground-glass (moderate contrast, higher homogeneity)
2.3 UIP vs NSIP Classification
Rule-based classifier incorporating:
- Spatial gradient: Basal-to-apical abnormality ratio (UIP: strong basal predominance, gradient >0.05)
- Honeycombing score: High-density clusters (>-200 HU) in peripheral lower lobes
- Texture heterogeneity: GLCM contrast threshold
- Distribution uniformity: Bilateral symmetric involvement favors NSIP
2.4 Quantitative Fibrosis Scoring
- Percentage of lung volume with HU > -600 (abnormal parenchyma)
- HU distribution analysis: mean, standard deviation, percentiles (P25, P75)
- Correlates with pulmonary function (FVC): each 5% CT progression ≈ 5-10% FVC decline
2.5 Progression Tracking
- Serial CT comparison at defined intervals
- Annualized rate of fibrosis change
- Classification: Progressive (>2% increase), Stable (±2%), Improving (<-2%)
2.6 Drug-Induced ILD Detection
- Risk stratification for common rheumatologic drugs: methotrexate (high), leflunomide (moderate), anti-TNF agents (moderate)
- Diffuse pattern detection combined with medication history
- Automated alerts with recommended actions
3. Implementation
Python skill using numpy, scipy (ndimage), scikit-image (GLCM, LBP, morphology). Fully executable with synthetic DICOM-like data generating realistic CT volumes with embedded UIP/NSIP patterns.
4. References
- Raghu G, et al. An Official ATS/ERS/JRS/ALAT Statement: Idiopathic Pulmonary Fibrosis. Am J Respir Crit Care Med. 2011;183(6):788-824.
- Travis WD, et al. An Official ATS/ERS Statement: Update of the International Multidisciplinary Classification of the Idiopathic Interstitial Pneumonias. Am J Respir Crit Care Med. 2013;188(6):733-748.
- Lynch DA, et al. Diagnostic Criteria for Idiopathic Pulmonary Fibrosis: A Fleischner Society White Paper. Lancet Respir Med. 2018;6(2):138-153.
- Distler O, et al. Nintedanib for Systemic Sclerosis-Associated ILD. N Engl J Med. 2019;380(26):2518-2528.
- Tashkin DP, et al. Mycophenolate Mofetil versus Oral Cyclophosphamide in Scleroderma-Related ILD. N Engl J Med. 2016;354(25):2655-2666.
- Conway R, et al. Methotrexate and Lung Disease in RA. Nat Rev Rheumatol. 2014;10:474-482.
Reproducibility: Skill File
Use this skill file to reproduce the research with an AI agent.
```python
#!/usr/bin/env python3
"""
Automated HRCT Pattern Recognition for ILD in Systemic Autoimmune Rheumatic Diseases
UIP vs NSIP Classification with Quantitative Fibrosis Scoring
Dependencies: numpy, scipy, scikit-image
Uses synthetic DICOM-like data for demonstration.
"""
import numpy as np
from scipy import ndimage
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern
from skimage.filters import threshold_otsu
from skimage.morphology import binary_closing, binary_opening, disk
import json, sys
# ── Synthetic DICOM-like CT slice generator ──
def generate_synthetic_ct(pattern="UIP", slices=20, size=256):
"""Generate synthetic HRCT volume (HU-like values). Air≈-1000, lung≈-800, tissue≈0, bone≈1000."""
vol = np.full((slices, size, size), -1000.0)
cy, cx = size//2, size//2
Y, X = np.ogrid[:size, :size]
for s in range(slices):
# Lung fields (elliptical)
lung_l = ((X-(cx-40))**2/(70**2) + (Y-cy)**2/(90**2)) < 1
lung_r = ((X-(cx+40))**2/(70**2) + (Y-cy)**2/(90**2)) < 1
vol[s][lung_l | lung_r] = -800 + np.random.normal(0, 30, np.sum(lung_l | lung_r))
# Body wall
body = ((X-cx)**2 + (Y-cy)**2) < (115**2)
wall = body & ~lung_l & ~lung_r
vol[s][wall] = np.random.normal(40, 15, np.sum(wall))
z_frac = s / slices # 0=apex, 1=base
if pattern == "UIP":
# Honeycombing: basal-predominant, peripheral
if z_frac > 0.5:
intensity = (z_frac - 0.5) * 2 # stronger at base
for lung_mask in [lung_l, lung_r]:
dist = ndimage.distance_transform_edt(lung_mask)
peripheral = lung_mask & (dist < 15) & (dist > 2)
n_cysts = int(30 * intensity)
coords = np.argwhere(peripheral)
if len(coords) > n_cysts:
idx = np.random.choice(len(coords), n_cysts, replace=False)
for i in idx:
r, c = coords[i]
rr = max(2, int(np.random.exponential(3)))
mask_c = ((X-c)**2 + (Y-r)**2) < rr**2
vol[s][mask_c & lung_mask] = np.random.choice([-600, -200, 100], p=[0.3,0.4,0.3])
# Traction bronchiectasis
if z_frac > 0.6:
for xoff in [-40, 40]:
bx = cx + xoff + np.random.randint(-5, 5)
by = cy + np.random.randint(10, 40)
for dy in range(-15, 15):
vol[s, by+dy, bx-1:bx+2] = -950
elif pattern == "NSIP":
# Ground-glass opacity: bilateral, symmetric, all zones but lower > upper
gg_intensity = 0.3 + 0.7 * z_frac
for lung_mask in [lung_l, lung_r]:
dist = ndimage.distance_transform_edt(lung_mask)
peripheral = lung_mask & (dist < 25)
n_gg = int(np.sum(peripheral) * 0.15 * gg_intensity)
coords = np.argwhere(peripheral)
if len(coords) > n_gg:
idx = np.random.choice(len(coords), n_gg, replace=False)
vol[s, coords[idx,0], coords[idx,1]] = -500 + np.random.normal(0, 50, n_gg)
elif pattern == "drug_induced":
# Diffuse, non-specific, all zones
for lung_mask in [lung_l, lung_r]:
n_aff = int(np.sum(lung_mask) * 0.1)
coords = np.argwhere(lung_mask)
idx = np.random.choice(len(coords), min(n_aff, len(coords)), replace=False)
vol[s, coords[idx,0], coords[idx,1]] = -400 + np.random.normal(0, 80, len(idx))
return vol
# ── Lung Segmentation ──
def segment_lungs(ct_slice):
"""Threshold-based lung segmentation with morphological cleanup."""
binary = ct_slice < -500
binary = binary_opening(binary, disk(3))
binary = binary_closing(binary, disk(5))
labeled, n = ndimage.label(binary)
if n < 2:
return binary
sizes = ndimage.sum(binary, labeled, range(1, n+1))
top2 = np.argsort(sizes)[-2:] + 1
return np.isin(labeled, top2)
# ── Texture Features (GLCM + LBP) ──
def extract_texture_features(ct_slice, lung_mask):
"""Extract GLCM and LBP features from lung parenchyma."""
roi = ct_slice.copy()
roi[~lung_mask] = -1000
# Normalize to 0-255 for GLCM
normalized = np.clip((roi + 1000) / 2000 * 255, 0, 255).astype(np.uint8)
glcm = graycomatrix(normalized, distances=[1,3], angles=[0, np.pi/4, np.pi/2], levels=256, symmetric=True, normed=True)
features = {
'contrast': float(np.mean(graycoprops(glcm, 'contrast'))),
'dissimilarity': float(np.mean(graycoprops(glcm, 'dissimilarity'))),
'homogeneity': float(np.mean(graycoprops(glcm, 'homogeneity'))),
'energy': float(np.mean(graycoprops(glcm, 'energy'))),
'correlation': float(np.mean(graycoprops(glcm, 'correlation'))),
}
lbp = local_binary_pattern(normalized, P=8, R=1, method='uniform')
lbp_hist, _ = np.histogram(lbp[lung_mask], bins=10, density=True)
features['lbp_uniformity'] = float(np.max(lbp_hist))
features['lbp_entropy'] = float(-np.sum(lbp_hist[lbp_hist>0] * np.log2(lbp_hist[lbp_hist>0])))
return features
# ── Quantitative Fibrosis Score ──
def compute_fibrosis_score(volume, lung_masks):
"""Compute % affected lung volume and HU distribution metrics."""
total_lung = 0
abnormal = 0
hu_values = []
for s in range(volume.shape[0]):
mask = lung_masks[s]
lung_hu = volume[s][mask]
total_lung += np.sum(mask)
# Abnormal: HU > -600 (ground glass/fibrosis) within lung
abn = lung_hu > -600
abnormal += np.sum(abn)
hu_values.extend(lung_hu.tolist())
hu_arr = np.array(hu_values)
return {
'total_lung_voxels': int(total_lung),
'abnormal_voxels': int(abnormal),
'fibrosis_percentage': round(100 * abnormal / max(total_lung, 1), 2),
'mean_hu': round(float(np.mean(hu_arr)), 1),
'std_hu': round(float(np.std(hu_arr)), 1),
'hu_p25': round(float(np.percentile(hu_arr, 25)), 1),
'hu_p75': round(float(np.percentile(hu_arr, 75)), 1),
}
# ── UIP vs NSIP Classifier ──
def classify_pattern(volume, lung_masks):
"""Rule-based UIP vs NSIP classification using spatial distribution + texture."""
n_slices = volume.shape[0]
upper = range(0, n_slices//3)
lower = range(2*n_slices//3, n_slices)
def zone_abnormality(slices_range):
abn = 0; total = 0
for s in slices_range:
m = lung_masks[s]; lu = volume[s][m]
total += len(lu); abn += np.sum(lu > -600)
return abn / max(total, 1)
upper_abn = zone_abnormality(upper)
lower_abn = zone_abnormality(lower)
gradient = lower_abn - upper_abn
# Check honeycombing (high-density clusters in periphery)
honeycomb_score = 0
for s in range(2*n_slices//3, n_slices):
m = lung_masks[s]
high_density = (volume[s] > -200) & m
honeycomb_score += np.sum(high_density)
# Texture from mid slice
mid = n_slices // 2
tex = extract_texture_features(volume[mid], lung_masks[mid])
# Classification logic (ATS/ERS-inspired)
score_uip = 0; score_nsip = 0
if gradient > 0.05: score_uip += 2 # Basal predominance
if honeycomb_score > 50: score_uip += 3 # Honeycombing
if tex['contrast'] > 100: score_uip += 1 # Heterogeneous
if gradient < 0.03: score_nsip += 2 # Diffuse
if tex['homogeneity'] > 0.3: score_nsip += 1 # More uniform (GGO)
if upper_abn > 0.02: score_nsip += 1 # Upper zone involvement
pattern = "UIP" if score_uip > score_nsip else "NSIP"
confidence = abs(score_uip - score_nsip) / max(score_uip + score_nsip, 1)
return {
'classification': pattern,
'confidence': round(confidence, 3),
'uip_score': score_uip, 'nsip_score': score_nsip,
'basal_gradient': round(gradient, 4),
'honeycomb_score': int(honeycomb_score),
'texture': tex,
'treatment_implication': 'Antifibrotic (nintedanib/pirfenidone)' if pattern == 'UIP' else 'Immunosuppression (mycophenolate/cyclophosphamide)',
}
# ── Drug-Induced ILD Screening ──
def screen_drug_induced_ild(volume, lung_masks, medications=None):
"""Flag patterns consistent with drug-induced pneumonitis."""
meds = medications or []
risk_drugs = {'methotrexate': 'high', 'leflunomide': 'moderate', 'adalimumab': 'moderate',
'infliximab': 'moderate', 'etanercept': 'low', 'sulfasalazine': 'low'}
fibrosis = compute_fibrosis_score(volume, lung_masks)
alerts = []
for med in meds:
med_l = med.lower()
if med_l in risk_drugs:
alerts.append({'drug': med, 'ild_risk': risk_drugs[med_l],
'action': f'Consider holding {med} if new/worsening ILD; urgent PFTs recommended'})
diffuse = fibrosis['fibrosis_percentage'] > 5
return {'drug_alerts': alerts, 'diffuse_pattern': diffuse, 'fibrosis': fibrosis,
'recommendation': 'Drug-induced ILD possible — correlate with medication timeline' if alerts and diffuse else 'No high-risk drug-ILD pattern detected'}
# ── Progression Tracking ──
def track_progression(vol1, masks1, vol2, masks2, months_between=6):
"""Compare two timepoints for disease progression."""
f1 = compute_fibrosis_score(vol1, masks1)
f2 = compute_fibrosis_score(vol2, masks2)
delta = f2['fibrosis_percentage'] - f1['fibrosis_percentage']
annual_rate = delta * (12 / max(months_between, 1))
return {
'baseline_fibrosis_pct': f1['fibrosis_percentage'],
'followup_fibrosis_pct': f2['fibrosis_percentage'],
'absolute_change': round(delta, 2),
'annualized_rate': round(annual_rate, 2),
'progression': 'Progressive' if delta > 2 else 'Stable' if abs(delta) <= 2 else 'Improving',
'fvc_correlation_note': 'Each 5% CT progression ≈ 5-10% FVC decline (Travis et al.)',
}
# ── MAIN DEMO ──
if __name__ == "__main__":
np.random.seed(42)
print("="*70)
print("HRCT ILD Pattern Recognition — Rheumatology CT Lung Analysis")
print("="*70)
for pat in ["UIP", "NSIP"]:
print(f"\n{'─'*50}")
print(f" Synthetic {pat} Patient")
print(f"{'─'*50}")
vol = generate_synthetic_ct(pattern=pat, slices=20, size=128)
masks = [segment_lungs(vol[s]) for s in range(vol.shape[0])]
result = classify_pattern(vol, masks)
fibrosis = compute_fibrosis_score(vol, masks)
print(f" Classification: {result['classification']} (confidence: {result['confidence']})")
print(f" UIP score: {result['uip_score']} | NSIP score: {result['nsip_score']}")
print(f" Basal gradient: {result['basal_gradient']}")
print(f" Fibrosis: {fibrosis['fibrosis_percentage']}% lung volume")
print(f" Mean HU: {fibrosis['mean_hu']} ± {fibrosis['std_hu']}")
print(f" → Treatment: {result['treatment_implication']}")
# Drug-induced screening
print(f"\n{'─'*50}")
print(" Drug-Induced ILD Screening")
print(f"{'─'*50}")
drug_vol = generate_synthetic_ct(pattern="drug_induced", slices=20, size=128)
drug_masks = [segment_lungs(drug_vol[s]) for s in range(drug_vol.shape[0])]
drug_result = screen_drug_induced_ild(drug_vol, drug_masks, medications=["methotrexate", "leflunomide"])
print(f" {drug_result['recommendation']}")
for a in drug_result['drug_alerts']:
print(f" ⚠ {a['drug']}: {a['ild_risk']} risk — {a['action']}")
# Progression
print(f"\n{'─'*50}")
print(" 6-Month Progression Tracking")
print(f"{'─'*50}")
vol_t0 = generate_synthetic_ct(pattern="NSIP", slices=20, size=128)
masks_t0 = [segment_lungs(vol_t0[s]) for s in range(vol_t0.shape[0])]
np.random.seed(99)
vol_t1 = generate_synthetic_ct(pattern="UIP", slices=20, size=128)
masks_t1 = [segment_lungs(vol_t1[s]) for s in range(vol_t1.shape[0])]
prog = track_progression(vol_t0, masks_t0, vol_t1, masks_t1, months_between=6)
print(f" Baseline: {prog['baseline_fibrosis_pct']}% → Follow-up: {prog['followup_fibrosis_pct']}%")
print(f" Status: {prog['progression']} (annualized: {prog['annualized_rate']}%/yr)")
print(f" {prog['fvc_correlation_note']}")
print(f"\n{'='*70}")
print("Analysis complete. All results generated from synthetic data.")
print("="*70)
```

