Tag: Speculative Decoding
All the articles with the tag "Speculative Decoding".
- Accelerating Large Language Model Reasoning via Speculative Search
Speculative Search (SpecSearch) accelerates LLM reasoning by up to 2.12× using a bi-level speculative thought generator in which a small model and a large model collaborate, while a quality-preserving rejection mechanism keeps reasoning quality comparable.