The Orthographic Agreement applied to Unitex graphs and dictionaries • February 26th, 2016
Abstract: The Orthographic Agreement of 1990 proposes unifying the orthography of Portuguese, the language of the members of the CPLP (Community of Portuguese Language Countries), namely Brazil, Portugal, Angola, Cape Verde, São Tomé and Príncipe, East Timor, Guinea-Bissau, and Mozambique. Since the Agreement changed the spelling of part of the Portuguese lexicon, Portuguese dictionaries need to be revised and adapted to the new rules. Our work contributes to the development of NLP (Natural Language Processing) tools such as Unitex. Unitex is a software package developed by the Linguistics and Computing team of the Université Paris-Est Marne-la-Vallée (PAUMIER, 2002) that, among other functions, supports searching for regular expressions in large corpora. It is characterized by its embedded large-coverage electronic dictionaries, whose lexicons require constant revision and maintenance.