Embedding Models Used Sample Clauses

Embedding Models Used. The test on conversational datasets involved two embeddings: ELMo and Flair. The ELMo model has 2 layers of bidirectional LSTM with a hidden state size of 1024 and an output projection size of 128; concatenating the outputs of the forward and backward layers yields an output embedding dimension of 256. The average embedding of the three output layers was used. The same approach was applied to the Flair model, which has 1 layer of bidirectional LSTM with a hidden size of 1024, producing an output embedding dimension of 2048. The models were prepared on our dataset in two ways: trained and tuned. Trained models were trained directly on the conversational dataset; tuned models are word embedding models originally trained on a large dataset (the 1-billion-word corpus) and then tuned (training continued) on our conversational dataset.
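The dimension arithmetic above can be sketched with placeholder arrays (a minimal illustration, not the actual models: the shapes are taken from the text, while the sequence length and all variable names are hypothetical):

```python
import numpy as np

SEQ_LEN = 10  # hypothetical sentence length

# ELMo: each of the three output layers projects to 128 per direction;
# concatenating forward and backward outputs gives 256 per layer.
elmo_layers = [np.random.randn(SEQ_LEN, 2 * 128) for _ in range(3)]
elmo_embedding = np.mean(elmo_layers, axis=0)  # average the three layers
assert elmo_embedding.shape == (SEQ_LEN, 256)

# Flair: one bidirectional LSTM layer with hidden size 1024 per direction;
# concatenating forward and backward states gives a 2048-dim embedding.
flair_fwd = np.random.randn(SEQ_LEN, 1024)
flair_bwd = np.random.randn(SEQ_LEN, 1024)
flair_embedding = np.concatenate([flair_fwd, flair_bwd], axis=-1)
assert flair_embedding.shape == (SEQ_LEN, 2048)
```

The assertions confirm the two stated output sizes: 256 for the layer-averaged ELMo embedding and 2048 for the concatenated Flair embedding.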