Sparse and Unbiased Subnetworks

Sparse and Unbiased Subnetworks shows the results of mask training with the OOD training data. The general patterns on the paraphrase identification and fact verification datasets are essentially the same as on the NLI datasets. Although the identified subnetworks do not reach 100% accuracy on PAWS and FEVER-Symmetric as they do on HANS, they substantially narrow the gap between OOD and ID performance compared with the full ▇▇▇▇. An exception is Symm2, where the upper bound of SRNets does not appear to be very high. This is probably because we do not have enough examples (708 in total) to represent the data distribution of the FEVER-Symmetric dataset. We therefore conjecture that the existence of sparse and unbiased subnetworks might be ubiquitous.