Critique

Is NSP actually pulling its weight?

tbecker · 11 days ago

Next-sentence prediction (NSP) is presented as a core ingredient, but follow-up work found that dropping it (as RoBERTa did) or replacing it with sentence-order prediction (as ALBERT did) often matches or beats the original BERT on downstream tasks. A cleaner ablation in the original paper would have saved the field a lot of cycles. Has anyone here seen a setting where NSP clearly helped?

0 Replies
