Improving transcription agreement of non-native English speech corpus transcribed by non-natives
Transcription Agreement • July 1st, 2011
This paper proposes an economical and effective phonetic transcription method for handling a large corpus of non-native English speech. The method yields consistent transcription agreement even though the corpus is transcribed by non-native speakers. To minimize confusion during transcription, the Korean transcribers are given, for reference, forced-aligned phone sequences and a set of candidate mispronunciation phones that Korean L2 learners are expected to produce. The proposed method is evaluated by measuring transcription agreement using Fleiss’ kappa as well as percentage agreement. Furthermore, transcription consistency is analyzed by comparing it with that of an English corpus transcribed by native speakers. As a result, a transcription agreement of 0.869 is achieved, whereas the Buckeye corpus, transcribed by native speakers, shows a transcription agreement of 0.803.
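The two agreement measures named above can be illustrated with a minimal sketch. It assumes every segment is labeled by the same number of transcribers and treats percentage agreement as the mean proportion of agreeing transcriber pairs per segment; the paper may define percentage agreement differently. The function name `fleiss_kappa` and the toy phone labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Return (kappa, observed_agreement) for a list of labeled segments.

    ratings: list of segments, each a list with one label per transcriber.
    Assumption: every segment has the same number of transcribers (>= 2).
    """
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("no segments to score")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(r) != n_raters for r in ratings):
        raise ValueError("each segment needs the same number (>= 2) of labels")

    category_totals = Counter()
    observed = 0.0
    for labels in ratings:
        counts = Counter(labels)
        category_totals.update(counts)
        # Proportion of transcriber pairs that agree on this segment.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        observed += agreeing / (n_raters * (n_raters - 1))
    observed /= n_items

    # Chance agreement from the overall label distribution.
    total = n_items * n_raters
    expected = sum((c / total) ** 2 for c in category_totals.values())
    if expected == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")

    return (observed - expected) / (1.0 - expected), observed


if __name__ == "__main__":
    # Hypothetical toy data: four segments, two transcribers each.
    toy = [["ae", "ae"], ["ae", "eh"], ["eh", "eh"], ["eh", "eh"]]
    kappa, agreement = fleiss_kappa(toy)
    print(f"kappa={kappa:.3f}, pairwise agreement={agreement:.3f}")
```

Kappa corrects the observed agreement for the agreement expected by chance, which is why it is reported alongside the raw percentage.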