Tencent's Chronicles-OCR Tests AI on Ancient Scripts

Tencent, in collaboration with the SSV Digital Culture Lab and the Institute of Information Engineering at the Chinese Academy of Sciences, has released Chronicles-OCR, a benchmark for evaluating AI models on ancient scripts. The benchmark covers the "Seven Transformations of Script," comprises 2,800 annotated images, and quantifies recognition difficulty across script styles ranging from oracle bone script to cursive script. An evaluation of 28 leading multimodal large language models found that most failed to recognize ancient scripts accurately: core metrics for models such as GPT-5 and Gemini 2.5 Pro were near zero, and even the best model scored only 16.5. Even when bounding boxes were added manually, accuracy peaked at 27.1%, and Gemini 3.1 Pro reached just 14.0% on oracle bone script. The study finds that modern AI models struggle with non-standardized, noisy ancient media, often mistaking substrate textures for character strokes. Enabling reasoning mode also reduced accuracy, amplifying errors rather than correcting them.