Task Definition
NTCIR Math Track: mathematical content access based on both natural language text and mathematical formulae.
- Math Retrieval Subtask
- Math Understanding Subtask
- (NEW) Free Subtask
Math Retrieval Subtask
Subtask Goal
- Given a document collection, retrieve relevant mathematical formulae or documents for a given query.
Dataset
- Scientific articles from the arXiv e-print server
- Converted into XML+MathML by the arXMLiv project
- 10,000 docs for the preliminary study
- Additional 100,000 docs for the formal run
- Each run should contain up to 100 formulae per query. Each group can submit up to 4 runs.
- (NOTE: Only the 100,000 docs will be used for the evaluation.)
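The submission limits above (up to 100 formulae per query, up to 4 runs per group) can be checked before submitting. The following is a minimal sketch only: the in-memory layout (a dictionary mapping run names to per-query result lists) and the names `check_submission`, `MAX_RUNS_PER_GROUP`, and `MAX_FORMULAE_PER_QUERY` are assumptions made for illustration and are not part of any official NTCIR run format.

```python
# Limits taken from the task description; everything else is hypothetical.
MAX_RUNS_PER_GROUP = 4
MAX_FORMULAE_PER_QUERY = 100


def check_submission(runs):
    """Return a list of violation messages (empty if the submission is fine).

    `runs` is assumed to map a run name to a dict that maps each query id
    to the ranked list of formula ids returned for that query.
    """
    problems = []
    if len(runs) > MAX_RUNS_PER_GROUP:
        problems.append(
            f"{len(runs)} runs submitted; at most {MAX_RUNS_PER_GROUP} allowed"
        )
    for run_name, results in runs.items():
        for query_id, formulae in results.items():
            if len(formulae) > MAX_FORMULAE_PER_QUERY:
                problems.append(
                    f"{run_name}/{query_id}: {len(formulae)} formulae; "
                    f"at most {MAX_FORMULAE_PER_QUERY} allowed"
                )
    return problems
```

Calling `check_submission` on a group's runs returns an empty list when both limits are respected, and one message per violated limit otherwise.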
Search Types
- The Math Retrieval Subtask uses the above 110,000 documents and covers the following three search scenarios:
- Formula Search: Search with formula queries within the formula database built from the dataset.
- Full-Text Search: Search the document collection using hybrid queries containing both keywords and formulae.
- Open Information Retrieval: Search the document collection using free textual queries.
Sample queries
Notes on the evaluation
- Please refer here
Math Understanding Subtask
Subtask Goal
- Extract natural language descriptions of mathematical expressions in a document for their semantic interpretation.
Dataset
- Initial Data Set
- 10 annotated papers selected from the arXiv.org dataset.
- Click here to get more information
- Development Data
- 35 annotated papers selected from the arXiv.org dataset.
- This set includes 10 papers from the initial data set, with updated annotations.
- Evaluation Data
- 10 unannotated papers selected from the arXiv.org dataset.
- (A submission period of about five days is assumed.)
Notes
- All the mathematical expressions in the dataset are expressed using MathML Parallel Markup.
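In MathML parallel markup, a `semantics` element pairs one encoding of a formula with alternative encodings held in `annotation-xml` children. The sketch below shows one way to separate the two branches using Python's standard library. It assumes that the presentation markup is the first child of `semantics` and that the content markup sits in an `annotation-xml` element with `encoding="MathML-Content"`; the dataset's actual layout may differ. The sample formula and the function name `split_parallel_markup` are illustrative only.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical sample: x + 1 in presentation and content markup.
SAMPLE = """
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <semantics>
    <mrow><mi>x</mi><mo>+</mo><mn>1</mn></mrow>
    <annotation-xml encoding="MathML-Content">
      <apply><plus/><ci>x</ci><cn>1</cn></apply>
    </annotation-xml>
  </semantics>
</math>
"""


def split_parallel_markup(math_xml):
    """Return (presentation, content) elements from a <math> string.

    `content` is None when no MathML-Content annotation is present.
    """
    root = ET.fromstring(math_xml.strip())
    ns = {"m": MATHML_NS}
    semantics = root.find("m:semantics", ns)
    if semantics is None or len(semantics) == 0:
        raise ValueError("no non-empty <semantics> element found")
    # Assumption: the first child carries the presentation markup.
    presentation = semantics[0]
    content = None
    for annotation in semantics.findall("m:annotation-xml", ns):
        if annotation.get("encoding") == "MathML-Content" and len(annotation) > 0:
            content = annotation[0]
            break
    return presentation, content


def local_name(element):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return element.tag.split("}")[-1]
```

For the sample above, `split_parallel_markup(SAMPLE)` yields the `mrow` element as the presentation branch and the `apply` element as the content branch.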
Math Free Subtask
We welcome contributions to the Free Subtask on any topic related to the Math Retrieval and Math Understanding subtasks. Although participants are encouraged to use the provided NTCIR Math Task dataset in their Free Subtask contributions, they may use any other dataset(s).
Participation in the NTCIR Math Task is not a prerequisite for the Free Subtask. To participate in the Free Subtask, please submit your paper by April 1, 2013.