ModernBERT, distilled from DeepSeek-V3-Base, has been optimized for classifying a 52K/212K subset of arXiv papers. Inference is served through vLLM, and predictions are filtered with confidence thresholds between 0.70 and 0.71. The approach targets high-throughput dataset indexing, with the goal of improving both efficiency and accuracy when processing large volumes of academic data.
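To make the confidence thresholding concrete, the following is a minimal sketch of how a classifier's output might be filtered at a 0.70 cutoff. It is not the authors' pipeline: the label names, the example logits, and the helper functions `softmax` and `classify_with_threshold` are hypothetical, and the model call itself is omitted so that the snippet stays independent of any particular vLLM or ModernBERT API.

```python
import math

# Lower end of the 0.70-0.71 range quoted in the text.
CONFIDENCE_THRESHOLD = 0.70

# Hypothetical label set; the source does not list the actual classes.
LABELS = ["cs.CL", "cs.LG", "math.ST"]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, labels, threshold=CONFIDENCE_THRESHOLD):
    """Return (label, confidence), with label=None when below the threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        return labels[best], probs[best]
    return None, probs[best]


# Made-up logits standing in for classifier outputs.
confident = classify_with_threshold([3.0, 0.5, 0.2], LABELS)
uncertain = classify_with_threshold([1.0, 0.9, 0.8], LABELS)

print(confident)  # top class clears the threshold, so a label is returned
print(uncertain)  # top class is below the threshold, so the label is None
```

Papers whose top-class probability falls below the cutoff come back with no label, which lets an indexing job route them to a fallback step rather than accept a low-confidence prediction.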