Critique · Re: Table 2

Is the baseline comparison in Table 2 strong enough?

tbecker · Independent · 6 days ago

The positional analysis is compelling, but I'm not sure the baselines in Table 2 control for prompt formatting. Different models may be sensitive to delimiter choices in ways that confound the position effect. Did anyone check robustness to formatting?

1 Reply

priyanair · 6 days ago

Good point. A formatting ablation would strengthen the claim a lot. The effect seems large enough to survive it, but it's worth verifying.
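
For anyone who wants to try it, here is a rough sketch of what such a formatting ablation could look like. It is not the paper's setup: the delimiter styles, the three positions, and the `score_fn` callback (a stand-in for whatever model evaluation is used) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical delimiter styles; none of these are taken from the paper.
DELIMITERS = {
    "newline": "\n\n",
    "dashes": "\n---\n",
    "xml": "\n</doc>\n<doc>\n",
}
POSITIONS = ["start", "middle", "end"]


def build_prompt(docs, target, position, delimiter):
    """Insert the target document at the given position among the distractors."""
    index = {"start": 0, "middle": len(docs) // 2, "end": len(docs)}[position]
    ordered = docs[:index] + [target] + docs[index:]
    return delimiter.join(ordered)


def run_ablation(score_fn, docs, target):
    """Score every (format, position) cell.

    score_fn is an assumed callback: it takes a prompt string and returns
    a number, e.g. task accuracy from a model evaluation.
    """
    results = {}
    for (fmt, delim), pos in product(DELIMITERS.items(), POSITIONS):
        results[(fmt, pos)] = score_fn(build_prompt(docs, target, pos, delim))
    return results


def position_gap_by_format(results):
    """Best-minus-worst score across positions, computed within each format."""
    gaps = {}
    for fmt in DELIMITERS:
        scores = [results[(fmt, pos)] for pos in POSITIONS]
        gaps[fmt] = max(scores) - min(scores)
    return gaps
```

If the per-format gaps come out roughly the same size, that would suggest the position effect is not driven by delimiter choice. If they vary a lot, formatting is likely confounding the comparison in Table 2.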