Question

What does 'selective' mean here, intuitively?

seoyeon · 13 days ago

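As I understand it, earlier SSMs use fixed dynamics that are independent of the input, whereas Mamba makes the parameters depend on the input. So is the core idea that the model chooses, based on the input, "which tokens to remember and which to let pass by"? And how is that different from attention's content-based routing?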

1 Reply

Accepted answer

lenaf · 13 days ago

Right. Selection makes the SSM parameters input-dependent, so the model gets content-based gating, as attention does, while keeping the linear-time recurrence. The difference from attention is that there is no explicit all-pairs interaction: information from earlier tokens has to flow through the fixed-size compressed state.
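
To make the gating intuition concrete, here is a rough toy sketch, not Mamba's actual implementation. It assumes a scalar state and a single sigmoid gate; the function names, the `(value, relevance)` token layout, and the gate parameters `w` and `b` are all made up for illustration. The real model uses a vector-valued state and input-dependent SSM parameters rather than one scalar gate.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fixed_scan(tokens, decay=0.5):
    """Input-independent recurrence: the same decay is applied at every step."""
    h = 0.0
    states = []
    for value, _relevance in tokens:  # relevance is ignored on purpose
        h = decay * h + (1.0 - decay) * value
        states.append(h)
    return states


def selective_scan(tokens, w=10.0, b=-5.0):
    """Input-dependent recurrence: the gate is computed from the token itself.

    w and b are hypothetical gate parameters chosen for this toy example.
    """
    h = 0.0
    states = []
    for value, relevance in tokens:
        g = sigmoid(w * relevance + b)  # near 1: overwrite state, near 0: keep it
        h = (1.0 - g) * h + g * value
        states.append(h)
    return states


# One "important" token followed by three filler tokens.
tokens = [(1.0, 1.0), (0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
print("fixed:    ", fixed_scan(tokens))
print("selective:", selective_scan(tokens))
```

In this toy run the fixed recurrence halves its memory of the first token at every step, while the selective one keeps its state above 0.9 because the filler tokens produce a gate close to zero. Both are a single left-to-right pass over the sequence, and in both cases the only thing a later step can see of an earlier token is whatever survived in `h`. That is the contrast with attention, where each position can look at every earlier token directly.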