Task Definition

NTCIR Math Track: mathematical content access based on both natural language text and mathematical formulae.

  • Math Retrieval Subtask
  • Math Understanding Subtask
  • (NEW) Free Subtask

Math Retrieval Subtask

Subtask Goal

    Given a document collection, retrieve relevant mathematical formulae or documents for a given query.

Dataset

  • Scientific articles from the ArXiv e-print server
  • Converted into XML+MathML by the arXMLiv project
  • 10,000 docs for the preliminary study
  • Additional 100,000 docs for the formal run
  • Each run should contain up to 100 formulae per query. Each group can submit up to 4 runs.
  • (NOTE: Only the 100,000 docs will be used for the evaluation.)
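
The two submission limits above (up to 100 formulae per query, up to 4 runs per group) can be checked before submitting. The following is a minimal sketch only: the in-memory layout of a run (a dict from run name to a list of (query_id, formula_id) pairs) and the name check_runs are assumptions made for illustration, not the official run file format.

```python
from collections import defaultdict

MAX_RESULTS_PER_QUERY = 100  # limit stated in the task description
MAX_RUNS_PER_GROUP = 4       # limit stated in the task description


def check_runs(runs):
    """Return a list of human-readable problems found in a group's runs.

    `runs` maps a run name to a list of (query_id, formula_id) pairs.
    This layout is hypothetical; adapt it to the actual submission format.
    """
    problems = []
    if len(runs) > MAX_RUNS_PER_GROUP:
        problems.append(
            f"{len(runs)} runs submitted; at most {MAX_RUNS_PER_GROUP} allowed"
        )
    for name, results in runs.items():
        # Count how many results each query has within this run.
        counts = defaultdict(int)
        for query_id, _formula_id in results:
            counts[query_id] += 1
        for query_id, n in sorted(counts.items()):
            if n > MAX_RESULTS_PER_QUERY:
                problems.append(
                    f"run {name!r}, query {query_id!r}: {n} results; "
                    f"at most {MAX_RESULTS_PER_QUERY} allowed"
                )
    return problems
```

An empty list means that neither limit is exceeded; the check says nothing about whether the formula identifiers themselves are valid.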

Search Types

    The Math Retrieval task uses the above 110,000 documents and covers the following three search scenarios:
  • Formula Search
      Search the formula database of the dataset using formula queries.
  • Full-Text Search
      Search the document collection using hybrid queries containing both keywords and formulae.
  • Open Information Retrieval
      Search the document collection using free textual queries.
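
The three scenarios differ mainly in what a query contains. As an illustration only, the sketch below models a query as a small record and maps it to one of the scenarios. The Query class, its field names, and the use of LaTeX-like strings for formulae are all assumptions; this section does not specify the actual query format.

```python
from dataclasses import dataclass, field


@dataclass
class Query:
    """Hypothetical in-memory query; not the official topic format."""
    keywords: list = field(default_factory=list)   # keyword strings
    formulae: list = field(default_factory=list)   # formula strings (placeholder encoding)
    free_text: str = ""                            # free textual query


def search_type(query):
    """Map a query to one of the three search scenarios described above."""
    if query.formulae and query.keywords:
        return "Full-Text Search"            # hybrid: keywords plus formulae
    if query.formulae:
        return "Formula Search"              # formula queries only
    if query.free_text:
        return "Open Information Retrieval"  # free textual query
    raise ValueError("query matches none of the three scenarios")


print(search_type(Query(formulae=["x^2 + y^2 = z^2"])))
print(search_type(Query(keywords=["Pythagorean"], formulae=["x^2 + y^2 = z^2"])))
print(search_type(Query(free_text="integer solutions of sums of squares")))
```

A keywords-only query is rejected here because none of the three scenarios describes it; a real system might route it differently.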

Sample queries

Notes on the evaluation


Math Understanding Subtask
Subtask Goal

  • Extract natural language descriptions of mathematical expressions in a document for their semantic interpretation.

Dataset

  • Initial Data Set
  • Development Data
    • 35 annotated papers selected from the ArXiv.org dataset.
    • The dataset contains 10 papers from the initial dataset with updated annotations.
  • Evaluation Data
    • 10 unannotated papers selected from the ArXiv.org dataset.
    • (The submission period is assumed to be about five days.)

Notes

  • All the mathematical expressions in the dataset are expressed using MathML Parallel Markup.
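
In MathML parallel markup, a semantics element carries one encoding of an expression as its first child and further encodings in annotation-xml children. The sketch below separates the two using Python's standard xml.etree.ElementTree. It assumes that the presentation markup comes first and that the content markup sits in an annotation-xml element with encoding="MathML-Content"; the SAMPLE fragment is hand-written for illustration and is not taken from the dataset.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"
NS = {"m": MATHML_NS}

# Hand-written example of parallel markup for "x + 1" (not from the dataset).
SAMPLE = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><plus/><ci>x</ci><cn>1</cn></apply>
    </annotation-xml>
  </semantics>
</math>
"""


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return element.tag.split("}", 1)[-1]


def split_parallel_markup(math_xml):
    """Return (presentation, content) elements, or None where one is missing.

    Assumes presentation markup is the first child of <semantics> and content
    markup is inside <annotation-xml encoding="MathML-Content">.
    """
    root = ET.fromstring(math_xml.strip())
    semantics = root.find("m:semantics", NS)
    if semantics is None or len(semantics) == 0:
        return None, None
    presentation = semantics[0]
    content = None
    for annotation in semantics.findall("m:annotation-xml", NS):
        if annotation.get("encoding") == "MathML-Content" and len(annotation) > 0:
            content = annotation[0]
            break
    return presentation, content


presentation, content = split_parallel_markup(SAMPLE)
print(local_name(presentation))            # root of the presentation tree
print([local_name(c) for c in content])    # operator and operands of the content tree
```

Keeping both trees is useful for this subtask, since the presentation tree reflects how an expression appears in the text while the content tree encodes its intended structure.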

Math Free Subtask

We welcome contributions to the Free Task on any topic related to the Math Retrieval and Math Understanding tasks. Participants are encouraged to use the provided NTCIR Math Task dataset in their Free Task contributions, but they may use any other dataset(s).

Participation in the NTCIR Math Task is not a prerequisite for participating in the Free Task. To participate in the Free Task, please submit your paper by April 1, 2013.