We analyze sparse attention patterns in autoregressive language models across 8 architectures ranging from 125M to 70B parameters. Using a novel attention topology metric based on persistent homology, we discover that attention heads in layers 12 and beyond converge to masks that align with document structure elements (paragraphs, sections, lists) with 0.
We demonstrate that membership inference attacks against fine-tuned large language models achieve 0.95 AUC using only output token probabilities, without access to model parameters or gradients.
This paper investigates the econometric foundations underlying synthetic control methods fail when pre-treatment fit is below r² = 0.85: a placebo-based calibration.
We present new results on hamilton cycles with applications to hypergraphs. Our main theorem establishes sharp bounds that improve upon the best previously known results, settling a conjecture in the affirmative for the cases considered.
Diffusion models have achieved state-of-the-art image generation quality as measured by FID and IS scores. However, we demonstrate that these metrics mask a critical failure mode: anatomically implausible human hands.
Distributed tracing is foundational to microservice observability, yet its performance overhead is poorly quantified, particularly at tail latencies. We instrument 23 production microservice deployments across 4 organizations, measuring tracing overhead at the 50th, 95th, and 99th percentiles of CPU utilization.
Continual learning methods are universally evaluated under a discrete task-boundary assumption, where distribution shifts occur instantaneously between clearly delineated tasks. We argue this assumption is ecologically invalid and demonstrate that five leading continual learning methods (EWC, SI, PackNet, ER, DER++) fail catastrophically when task boundaries are gradual.
We present new results on ramsey theory with applications to sat solvers. Our main theorem establishes sharp bounds that improve upon the best previously known results, settling a conjecture in the affirmative for the cases considered.
We establish a new result in algebraic geometry and combinatorics: derived categories of cubic fourfolds containing a plane are equivalent to k3 surfaces of degree 14. Our proof introduces a novel filtration technique combined with deformation-theoretic arguments that resolve a long-standing open question in the field.
We establish new results concerning supersingular surfaces in the context of artin invariant, resolving a question that has remained open since it was first posed in the literature. Our approach combines techniques from crystalline cohomology with careful analysis of degeneration phenomena to construct explicit examples and derive sharp bounds.
We establish a new result in algebraic geometry and combinatorics: the number of antichains in the boolean lattice 2^[n] grows as 2^(1.0000134 * binom(n, n/2)) for large n.
We present new results on sphere packing with applications to energy minimization. Our main theorem establishes sharp bounds that improve upon the best previously known results, settling a conjecture in the affirmative for the cases considered.
We establish a new result in algebraic geometry and combinatorics: moduli spaces of stable maps to p^1 x p^1 have picard number exactly 3 for genus g >= 2. Our proof introduces a novel filtration technique combined with deformation-theoretic arguments that resolve a long-standing open question in the field.
We present new results on graph reconstruction with applications to reconstruction conjecture. Our main theorem establishes sharp bounds that improve upon the best previously known results, settling a conjecture in the affirmative for the cases considered.
We conduct the largest study to date on coreference, analyzing 38,271 instances across 17 datasets spanning multiple domains. Our key finding is that clinical nlp accounts for 17.
We establish a new result in algebraic geometry and combinatorics: the hodge conjecture holds for codimension-2 cycles on abelian fourfolds with cm by q(zeta_5). Our proof introduces a novel filtration technique combined with deformation-theoretic arguments that resolve a long-standing open question in the field.
We empirically characterize how inference-time compute scales with task performance for agentic AI workloads. Across 14 agentic benchmarks spanning web navigation, code generation with tool use, and multi-step reasoning, we find that performance follows a power law with exponent 0.
Foundation models for zero-shot object detection, including CLIP-based detectors and Grounding DINO, have achieved remarkable performance on natural image benchmarks. However, their deployment in industrial quality inspection remains largely untested.
We present new results on oriented coloring with applications to planar graphs. Our main theorem establishes sharp bounds that improve upon the best previously known results, settling a conjecture in the affirmative for the cases considered.
We establish a new result in algebraic geometry and combinatorics: explicit height bounds for rational points on bielliptic curves of genus 2 over q. Our proof introduces a novel filtration technique combined with deformation-theoretic arguments that resolve a long-standing open question in the field.