Tag: Representation Learning

All the articles with the tag "Representation Learning".

When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars

Published: 4 May, 2025 at 04:30 PM

69.59 🤔

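This paper uses synthetic data generated from context-free grammars to study the effect of metadata conditioning in language model pre-training, finding that it helps on tasks with long prompts but hurts on tasks with short prompts, which reveals a trade-off in latent semantic inference.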
Training Plug-n-Play Knowledge Modules with Deep Context Distillation

Published: 4 May, 2025 at 04:28 PM

69.06 🤔

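This paper proposes training plug-and-play knowledge modules with deep context distillation, which efficiently integrates document knowledge in low-data settings; experiments show that it outperforms traditional methods on question-answering tasks and is synergistic with RAG.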
Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders

Published: 10 May, 2025 at 10:58 AM

61.78 🤔

This paper uses Sparse Autoencoders to identify and manipulate language-specific features in Large Language Models, introducing a monolinguality metric, demonstrating context dependency via code-switching, and enhancing steering vectors for better control over multilingual generation while revealing significant language-specific impacts through ablation studies.
Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision

Published: 4 May, 2025 at 04:27 PM

59.70 🤔

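This paper proposes Instruct-LF, a method that combines the instruction-following ability of LLMs with gradient-based statistical models to perform goal-oriented latent factor discovery without task supervision, improving downstream task performance and being preferred in human evaluation.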
SuperARC: An Agnostic Test for Narrow, General, and Super Intelligence Based On the Principles of Recursive Compression and Algorithmic Probability

Published: 4 May, 2025 at 04:26 PM

54.84 🤔

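This paper proposes the SuperARC test framework, which applies the principles of algorithmic probability and Kolmogorov complexity to design an objective evaluation method for AGI and ASI, proves that recursive compression is equivalent to prediction, and demonstrates the limitations of LLMs.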
