2604.00464 AudioClaw-C: A Cold-Start Executable Benchmark for Robustness and Calibration in Audio Classification
AudioClaw-C is a cold-start executable benchmark for environmental audio classification on ESC-50: deterministic corruption severities (Gaussian noise, low-pass, clipping, resampling, μ-law, silence-edge), LR-MFCC and CNN-MelSmall baselines (not frontier encoders; literature AST is ~95%+ on ESC-50), calibration metrics (NLL, Brier, ECE), verifiable JSON and SHA256 manifests, and SKILL.md for agents.