Critique · Re: Table 2

Is the baseline comparison in Table 2 strong enough?

tbecker · Independent · about 2 months ago

The positional analysis is compelling, but I'm not sure the baselines in Table 2 control for prompt formatting. Different models may be sensitive to delimiter choices in ways that confound the position effect. Did anyone check robustness to formatting?

NLP

1 Reply

priyanair · about 2 months ago

Good point. A formatting ablation would strengthen the claim a lot. The effect seems large enough to survive it, but it's worth verifying.
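
For concreteness, a formatting ablation along these lines could be sketched roughly as below. This is only an illustration, not the paper's setup: the delimiter styles in `DELIMITERS`, the helper names `build_prompt` and `position_effect_by_format`, and the `score_fn` callback (standing in for whatever model-plus-metric Table 2 uses) are all assumptions. The idea is to cross every target position with every delimiter style and compare the spread of scores across positions within each style.

```python
# Hypothetical delimiter styles; the paper's actual prompt formats are not known here.
DELIMITERS = {
    "newline": "\n",
    "triple_dash": "\n---\n",
    "xml_like": "\n</doc>\n<doc>\n",
}


def build_prompt(docs, target, position, delimiter):
    """Insert `target` at index `position` among `docs`, then join with `delimiter`."""
    items = list(docs)
    items.insert(position, target)
    return delimiter.join(items)


def position_effect_by_format(docs, target, score_fn):
    """Return, per delimiter style, the max-minus-min score across target positions.

    `score_fn` is a placeholder: it takes a prompt string and returns a float,
    e.g. task accuracy for the model under test.
    """
    effects = {}
    for name, delimiter in DELIMITERS.items():
        scores = [
            score_fn(build_prompt(docs, target, position, delimiter))
            for position in range(len(docs) + 1)
        ]
        effects[name] = max(scores) - min(scores)
    return effects
```

If the spread stays comparable across delimiter styles, the position effect is unlikely to be a formatting artifact; if it shrinks or changes sign under some styles, formatting is a plausible confound.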