(Updated on Dec 7, 2012)

Math Retrieval Subtask

NTICR-sandbox.tar.gz (7.2G) 100,000 documents converted to XHTML+MathML with the LaTeXML converter
NTCIR-Math-formula-search.xml (98K) Topics for formula search
NTCIR-Math-fulltext-search.xml (66K) Topics for full-text search
NTCIR-Math-open-mir.xml (52K) Topics for open MIR
topics.pdf (271K) Documentation for search topics

Math Understanding Subtask

Description_Extraction_Ver01.zip (3.8M) arxiv_msc_10.zip

  • 10 papers from the arXiv.org dataset (the same papers as in the initial dataset, with updated annotations)
  • 25 additional papers from the arXiv.org dataset
  • Annotator agreement report
  • Formats of the dataset and submissions
  • Evaluation script
Description_Extraction_test.zip (216K) 10 unannotated papers for evaluation