PLM Backbone. We mainly experiment with the XXXX-base-uncased model [6]. It has roughly 110M parameters in total, and 84M parameters in the Transformer layers. As described in Section 3.1, we derive the subnetworks from the Transformer layers and report sparsity levels relative to the 84M parameters. To generalize our conclusions to other PLMs, we also consider two variants of the XXXX family, namely XxXXXXx-base and XXXX-large, the results of which can be found in Appendix C.5.
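Since sparsity is reported relative to the Transformer-layer parameters only (84M of the ~110M total, excluding embeddings), the bookkeeping can be sketched as follows. This is an illustrative sketch with hypothetical parameter names and toy counts, not the paper's actual model or code.

```python
# Hypothetical sketch: compute sparsity over Transformer-layer parameters
# only, excluding embeddings from the denominator. Names and counts are
# illustrative, not taken from the actual PLM.

def transformer_sparsity(param_counts, pruned_counts, prefix="encoder."):
    """Fraction of pruned weights among parameters whose name starts with
    `prefix` (i.e. the Transformer layers); embeddings are excluded."""
    total = sum(n for name, n in param_counts.items() if name.startswith(prefix))
    pruned = sum(n for name, n in pruned_counts.items() if name.startswith(prefix))
    return pruned / total

# Toy counts: the embedding entry does not affect the reported sparsity.
params = {"embeddings.word": 100, "encoder.layer0.attn": 400, "encoder.layer0.ffn": 600}
pruned = {"embeddings.word": 0, "encoder.layer0.attn": 200, "encoder.layer0.ffn": 300}
print(transformer_sparsity(params, pruned))  # 500 / 1000 = 0.5
```

Measuring relative to the 84M Transformer parameters rather than the full 110M keeps sparsity levels comparable across backbones whose embedding sizes differ.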