Posts

All the articles I've posted.

Agentic AI: The Era of Semantic Decoding

Published: 8 May, 2025 at 12:27 AM

89.68 🤔

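This paper proposes a semantic decoding perspective that frames the collaboration among large language models, humans, and tools as an optimization process in semantic space, and explores a new computational paradigm for AI systems through the exchange of semantic tokens and the design of semantic decoding algorithms.

MELON: Provable Indirect Prompt Injection Defense via Masked Re-execution and Tool Comparison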

Published: 8 May, 2025 at 10:22 AM

89.40 🤔

MELON introduces a novel training-free defense against indirect prompt injection attacks on LLM agents by detecting independence of tool calls from user inputs through masked re-execution, achieving superior attack prevention (0.24% ASR on GPT-4o) and utility preservation (58.78% UA on GPT-4o) compared to existing methods.

MoM: Linear Sequence Modeling with Mixture-of-Memories

Published: 8 May, 2025 at 06:19 PM

89.33 🤔

The Mixture-of-Memories (MoM) architecture introduces multiple independent memory states with a routing mechanism to enhance memory capacity and reduce interference in linear sequence modeling, achieving significant performance gains over other linear models on recall-intensive tasks and nearing Transformer performance at larger scales while maintaining efficiency.

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

Published: 8 May, 2025 at 10:22 AM

89.20 🤔

The Video Prediction Policy (VPP) introduces a novel generalist robot policy that leverages predictive visual representations from fine-tuned video diffusion models to learn implicit inverse dynamics, achieving significant improvements of 41.5% on the Calvin ABC→D benchmark and 31.6% in real-world dexterous manipulation tasks over state-of-the-art baselines.

Always Skip Attention

Published: 8 May, 2025 at 11:06 AM

89.20 🤔

This paper theoretically demonstrates the ill-conditioning of Self-Attention Blocks in Vision Transformers without skip connections, highlights their role as regularizers, and proposes Token Graying (SVD and DCT) to improve input token conditioning, achieving modest performance gains in supervised and self-supervised tasks.
