BNC 1000 Test Set

Gold-standard parse trees for 1,000 sentences from the British National Corpus, annotated according to the Penn Treebank bracketing guidelines and checked using Markus Dickinson's treebank annotation error detection software (Dickinson, 2008).

Download HERE.
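
Because the trees follow the Penn Treebank bracketing guidelines, a small reader is enough to inspect them. The following is a minimal sketch, assuming each tree is stored as a parenthesised Penn Treebank-style string; the exact layout of the downloaded file is not described on this page, and the function names (tokenize, parse_tree, leaves) and the example sentence are illustrative only, not taken from the test set.

```python
def tokenize(text):
    """Split a bracketed tree string into parentheses and symbols."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples.

    Children are either further (label, children) tuples or plain
    strings (the words). An empty label is allowed, since Penn
    Treebank files often wrap each tree in an extra unlabelled bracket.
    """
    tokens = tokenize(text)
    pos = 0

    def read():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        pos += 1
        label = ""
        if pos < len(tokens) and tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets")
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read()
    if pos != len(tokens):
        raise ValueError("unexpected material after the tree")
    return tree


def leaves(tree):
    """Return the words of a tree in left-to-right order."""
    _, children = tree
    words = []
    for child in children:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words


# Invented example sentence, not taken from the BNC test set.
example = "(S (NP (DT The) (NN parser)) (VP (VBZ works)) (. .))"
tree = parse_tree(example)
print(tree[0])
print(" ".join(leaves(tree)))
```

The nested-tuple representation is chosen only to keep the sketch free of dependencies; an established treebank library may be preferable for real evaluation work.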

For more information see:

Jennifer Foster and Josef van Genabith, 2008. Parser Evaluation and the BNC: Evaluating 4 constituency parsers with 3 metrics. Proceedings of LREC-2008. Marrakech, Morocco. PDF

Markus Dickinson and Jennifer Foster, 2009. Similarity Rules! Exploring Methods for Ad-hoc Rule Detection. Proceedings of TLT-2009. Groningen, The Netherlands. PDF


Click here to access an older version of the test set. This is the version that was used to obtain the parsing results in Foster and van Genabith (2008). The 2009 version (see above) is the result of applying Markus Dickinson's annotation error detection software to this older version.

