Tag: RNN

All the articles with the tag "RNN".

ATLAS: Learning to Optimally Memorize the Context at Test Time

Published: 31 May, 2025 at 11:22 AM

86.98 🤔

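This paper proposes Atlas, a high-capacity long-term memory module that optimizes context memorization using a sliding-window Omega rule and the Muon optimizer, significantly outperforming Transformers and modern RNNs on language modeling and long-context understanding tasks.

Compact Recurrent Transformer with Persistent Memory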

Published: 9 May, 2025 at 11:06 AM

66.84 🤔

This paper introduces the Compact Recurrent Transformer (CRT), which combines shallow Transformers with RNNs to process long sequences efficiently using a single persistent memory vector. CRT matches or outperforms full-length Transformers and Transformer-XL on language and video tasks at significantly reduced computational cost.