Tag: Prediction
All the articles with the tag "Prediction".
-
Compact Recurrent Transformer with Persistent Memory
This paper introduces the Compact Recurrent Transformer (CRT), which combines shallow Transformers with RNNs to process long sequences efficiently using a single persistent memory vector. It matches or outperforms full-length Transformers and Transformer-XL on language and video tasks at significantly lower computational cost.
-
SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference
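This work proposes SpargeAttn, a universal sparse attention mechanism that accelerates inference for a variety of models through a two-stage online filter and quantization, while preserving end-to-end performance without loss.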
-
SuperARC: An Agnostic Test for Narrow, General, and Super Intelligence Based On the Principles of Recursive Compression and Algorithmic Probability
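This paper proposes the SuperARC test framework, which uses the principles of algorithmic probability and Kolmogorov complexity to design an objective evaluation method for AGI and ASI, proves that recursive compression is equivalent to prediction, and demonstrates the limitations of LLMs.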