Tag: Robustness

All the articles with the tag "Robustness".

Wasserstein Distributionally Robust Nonparametric Regression

Published: 16 May, 2025 at 11:36 AM

88.71 🤔

This paper introduces a Wasserstein Distributionally Robust Optimization framework for nonparametric regression. Using Lipschitz-constrained feedforward neural networks, it derives non-asymptotic error bounds for the local worst-case risk under model misspecification and demonstrates robustness through simulations and an application to the MNIST dataset.

Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs

Published: 17 May, 2025 at 11:17 PM

88.54 🤔

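This paper identifies contextual entrainment, a mechanistic bias of language models toward tokens that appear in the prompt, and uses a differentiable masking method to locate entrainment heads, offering a new perspective for understanding and mitigating distraction.

Hide & Seek: Transformer Symmetries Obscure Sharpness & Riemannian Geometry Finds It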

Published: 14 May, 2025 at 11:09 AM

88.35 🤔

This paper introduces geodesic sharpness, a novel measure that uses Riemannian geometry to account for transformer symmetries on a quotient manifold, and shows that it correlates more strongly with generalization than traditional adaptive sharpness across diagonal networks, vision transformers, and language models.

Does quantization affect models' performance on long-context tasks?

Published: 2 Jun, 2025 at 11:34 AM

87.84 🤔

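This paper systematically evaluates how quantization affects large language models on long-context tasks, finding that 8-bit quantization largely preserves accuracy (a drop of about 0.8%) while 4-bit quantization causes substantial losses (up to 59%). The effects vary by model, task, and language, which underscores the need for caution when applying quantization in long-context and multilingual settings.

Sparse-Group Boosting with Balanced Selection Frequencies: A Simulation-Based Approach and R Implementation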

Published: 8 May, 2025 at 10:25 AM

87.75 🤔

This paper introduces sparse-group boosting and a simulation-based group balancing algorithm within the 'sgboost' R package to mitigate variable selection bias in high-dimensional grouped data, demonstrating improved fairness and interpretability through simulations and ecological data analysis.
