Tag: Supervised Learning

All the articles with the tag "Supervised Learning".

RL in Name Only? Analyzing the Structural Assumptions in RL post-training for LLMs

Published: 22 May, 2025 at 11:16 AM

89.06 🤔

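Through theoretical and empirical analysis, this paper shows that the MDP structural assumptions used in current RL post-training for LLMs (e.g., GRPO) reduce it to filtered iterative supervised fine-tuning, and that the growth in response length stems from a bias in reward assignment rather than from improved reasoning ability.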
Deep Learning for On-Street Parking Violation Prediction

Published: 14 May, 2025 at 11:08 AM

89.01 🤔

This paper develops a Deep Learning model with a novel data smoothing technique to predict fine-grained on-street parking violation rates in Thessaloniki, Greece, using indirect features like weather and time, achieving improved accuracy (MAE of 0.146) over baseline methods.
Large Language Models are Miscalibrated In-Context Learners

Published: 25 May, 2025 at 11:24 AM

88.84 🤔

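This paper presents an in-depth analysis of the calibration of large language models in low-resource settings, shows that in-context learning (ICL) does not consistently improve calibration, and proposes a self-ensembling method that substantially improves calibration (reducing ECE by 43% on average) while maintaining or slightly improving task performance.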
Nonparametric learning of covariate-based Markov jump processes using RKHS techniques

Published: 8 May, 2025 at 12:18 AM

88.71 🤔

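This paper proposes a nonparametric method based on reproducing kernel Hilbert spaces (RKHS) that models covariate-driven nonlinear transition rates in continuous-time Markov chains (CTMCs) under both frequentist and Bayesian frameworks, significantly improving the accuracy of individualized state-transition predictions.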
Making Small Language Models Efficient Reasoners: Intervention, Supervision, Reinforcement

Published: 17 May, 2025 at 11:04 AM

88.64 🤔

This paper introduces Temperature Scaling (TS) and Trace Length Control for Dynamic Reasoning (TLDR) to enhance token efficiency in small language models, achieving up to 50% reduction in response length with minimal accuracy loss across multiple reasoning benchmarks.
