Gold Standard Clause Samples

Gold Standard. For testing and evaluation purposes, the gold standard of (▇▇▇▇▇▇▇▇ et al., 2006b) has been used, which consists of SCFs and their relative frequencies for 183 general-language verbs, based on approximately 250 manually annotated sentences per verb. The verbs were selected randomly, subject to the restriction that they take multiple SCFs. The gold standard includes 116 SCFs. Details of the method and the experiment are given in ▇▇▇▇▇▇, ▇▇▇▇▇▇▇ and ▇▇▇▇▇▇▇▇ (2012).
Gold Standard. This uses the same setup as the solution method, but after solving the first training tasks, workers are shown the correct answers for those tasks and informed that the tasks had been used for training purposes. ▇▇▇▇▇▇ et al. [2011] used this method as a quality control mechanism rather than as a training method.
Gold Standard. To train NER algorithms and assess the accuracy of the models, a manually tagged collection of documents is needed. This is called a gold standard, and at the start of the project the data set created in the ARIADNE project was used (Vlachidis et al., 2017). This data set consists of eight documents, 355k tokens, and 20k entities across nine categories. The set was annotated by hand by highlighting spans in the Microsoft Word word processor. The highlighted entities were then extracted from the eXtensible Markup Language (XML) of the Word file and converted to the BIO file format. However, when we started experiments with this data set, we found some inconsistencies and issues in the annotations that might be causing low F1 scores on the NER task. These problems with the data set have also been described by ▇▇▇▇▇▇▇▇▇ et al. (2017). To try to improve our system, we created a new data set, optimally annotated for NER, which we describe further in the next chapter.
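The conversion described above, from highlighted character spans to token-level BIO labels, can be sketched as follows. This is a minimal illustration under assumed inputs (the function name `spans_to_bio` and the token/span tuple layout are hypothetical, not taken from the ARIADNE pipeline): each token carries its character offsets, and any token fully covered by a highlighted span receives a B- tag (first token) or I- tag (continuation).

```python
def spans_to_bio(tokens, spans):
    """Convert highlighted character spans to token-level BIO labels.

    tokens: list of (text, char_start, char_end) tuples
    spans:  list of (char_start, char_end, category) tuples
    Returns one BIO label per token, e.g. "B-PER", "I-PER", "O".
    """
    labels = ["O"] * len(tokens)
    for s_start, s_end, category in spans:
        inside = False  # tracks whether we are continuing an entity
        for i, (_, t_start, t_end) in enumerate(tokens):
            # A token belongs to the span if it lies entirely within it.
            if t_start >= s_start and t_end <= s_end:
                labels[i] = ("I-" if inside else "B-") + category
                inside = True
    return labels


# Example: "John Smith lives" with "John Smith" highlighted as a PER entity.
tokens = [("John", 0, 4), ("Smith", 5, 10), ("lives", 11, 16)]
spans = [(0, 10, "PER")]
print(spans_to_bio(tokens, spans))  # ['B-PER', 'I-PER', 'O']
```

In the BIO scheme, the B-/I- distinction lets adjacent entities of the same category remain separable, which is why a plain per-token category label would not suffice.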