Critique
Which layers actually matter — the ablations feel incomplete
The paper applies LoRA mostly to the attention projection matrices and reports it's enough. But the comparison of where to adapt (Wq/Wv vs MLP) could be stronger — equalized parameter budget across placements would make the claim that 'attention is what matters' much more convincing.