Tag: Linear Sequence Modeling
All the articles with the tag "Linear Sequence Modeling".
- MoM: Linear Sequence Modeling with Mixture-of-Memories
The Mixture-of-Memories (MoM) architecture introduces multiple independent memory states with a routing mechanism, which increases memory capacity and reduces interference in linear sequence modeling. It achieves significant performance gains over other linear models on recall-intensive tasks and approaches Transformer performance at larger scales while maintaining efficiency.
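
To make the idea of "multiple independent memory states with a routing mechanism" concrete, here is a toy sketch in plain Python. It is not the article's or the paper's implementation. The class name `ToyMixtureOfMemories`, the outer-product update, the top-k softmax router, and the choice to route on the key vector are all assumptions made for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMixtureOfMemories:
    """Hypothetical sketch: independent matrix memories plus a top-k router."""

    def __init__(self, dim, num_memories=4, top_k=2, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.top_k = top_k
        # One dim x dim matrix state per memory, all starting at zero.
        self.memories = [
            [[0.0] * dim for _ in range(dim)] for _ in range(num_memories)
        ]
        # Assumed router: one random score vector per memory (untrained).
        self.router = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_memories)
        ]

    def route(self, x):
        """Return the indices of the top-k memories and their mixing weights."""
        scores = [sum(w * xi for w, xi in zip(row, x)) for row in self.router]
        order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = order[: self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return chosen, weights

    def step(self, q, k, v):
        """Write (k, v) into the routed memories, then read them with q."""
        chosen, weights = self.route(k)  # assumption: route on the key
        out = [0.0] * self.dim
        for i, w in zip(chosen, weights):
            mem = self.memories[i]
            # Linear-attention style write: M_i += k v^T (outer product).
            for a in range(self.dim):
                for b in range(self.dim):
                    mem[a][b] += k[a] * v[b]
            # Read: add w * (q^T M_i) to the output.
            for b in range(self.dim):
                out[b] += w * sum(q[a] * mem[a][b] for a in range(self.dim))
        return out, chosen


if __name__ == "__main__":
    mom = ToyMixtureOfMemories(dim=2, num_memories=3, top_k=1)
    out, chosen = mom.step(q=[1.0, 0.0], k=[1.0, 0.0], v=[0.0, 1.0])
    print("routed to memory", chosen, "output", out)
```

In this sketch, memories the router does not select are left untouched by a token. That is one plausible reading of how separate states could reduce interference between unrelated pieces of information.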