Posts

All the articles I've posted.

LZ Penalty: An information-theoretic repetition penalty for autoregressive language models

Published: 6 May, 2025 at 11:19 PM

67.26 🤔

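This paper proposes the LZ penalty, which dynamically adjusts the sampling distribution of an autoregressive language model based on changes in codelength under the LZ77 compression algorithm. It effectively eliminates degenerate repetition under greedy decoding while preserving performance on reasoning benchmarks.
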
Small or Large? Zero-Shot or Finetuned? Guiding Language Model Choice for Specialized Applications in Healthcare

Published: 4 May, 2025 at 04:31 PM

67.24 🤔

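Through empirical experiments, this paper offers guidance on choosing language models for specialized healthcare applications. It highlights the significant advantages of fine-tuned small language models and domain-specific pretraining, which allow them to outperform zero-shot large language models on specific tasks.
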
Compact Recurrent Transformer with Persistent Memory

Published: 9 May, 2025 at 11:06 AM

66.84 🤔

This paper introduces the Compact Recurrent Transformer (CRT), which combines shallow Transformers with RNNs to efficiently process long sequences using a single persistent memory vector, achieving superior or comparable performance to full-length Transformers and Transformer-XL on language and video tasks with significantly reduced computational cost.

X-Fusion: Introducing New Modality to Frozen Large Language Models

Published: 4 May, 2025 at 04:31 PM

66.52 🤔

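This paper proposes the X-Fusion framework, which freezes the LLM's parameters and adds a dual-tower structure to efficiently enable multimodal understanding and generation while preserving the original language capabilities.
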
Prompt-Based Cost-Effective Evaluation and Operation of ChatGPT as a Computer Programming Teaching Assistant

Published: 4 May, 2025 at 04:26 PM

66.50 🤔

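By designing prompt templates based on in-context learning (ICL) and chain-of-thought (CoT), this paper achieves cost-effective evaluation and operation of ChatGPT in programming education, significantly reducing the need for manual evaluation and improving the structured analysis of feedback.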
