From d883f6e3c411803d06f1eb4c34e93f4550eb84a1 Mon Sep 17 00:00:00 2001
From: Lanxiang Hu
Date: Sun, 3 Mar 2024 01:57:42 -0800
Subject: [PATCH] add redirection to home

---
 hugo.yaml          |  2 +-
 public/index.html  | 18 +++++++++++++++++-
 public/sitemap.xml |  8 ++++----
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/hugo.yaml b/hugo.yaml
index f2b52ef..0ce29f3 100644
--- a/hugo.yaml
+++ b/hugo.yaml
@@ -28,7 +28,7 @@ menu:
       weight: 10
     - identifier: Blogs
       name: Blogs
-      url:
+      url: ''
       weight: 30
     - identifier: People
      name: People
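
Note on the hugo.yaml hunk: giving the Blogs entry `url: ''` presumably lets the theme resolve the menu link against the site's base URL, i.e. the home page, which is the redirection named in the subject line. A minimal sketch of the resulting menu block; the `main` key is an assumption, since the hunk context shows only `menu:`:

```yaml
menu:
  main:                 # assumed key; not visible in the hunk context
    - identifier: Blogs
      name: Blogs
      url: ''           # empty url resolves to the base URL, linking Blogs to home
      weight: 30
```
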
diff --git a/public/index.html b/public/index.html
index 820727a..e910494 100644
--- a/public/index.html
+++ b/public/index.html
@@ -144,7 +144,23 @@
 [one removed line and the inserted cards' HTML markup were lost in extraction; the hunk adds two blog-post cards to the homepage, whose text content follows]

+Consistency Large Language Models: A Family of Efficient Parallel Decoders
+
+TL;DR: In this blog, we introduce consistency large language models (CLLMs), a new family of models developed with our proposed techniques to reduce inference latency by efficiently decoding $n$ tokens in parallel. This decoding method is called Jacobi decoding, which improves inference efficiency by breaking the sequential nature of conventional auto-regressive (AR) decoding. CLLMs are trained with the objective of performing efficient Jacobi decoding by mapping any randomly initialized $n$-token sequence to the correctly predicted sequence in as few steps as possible.
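
The Jacobi decoding loop this TL;DR describes can be sketched as follows. This is a minimal illustration, not code from the post; `logits_fn` is a hypothetical stand-in for one parallel forward pass of the LLM that returns next-token logits for every input position:

```python
import numpy as np

def jacobi_decode(logits_fn, prompt, n, vocab_size, max_iters=100, seed=0):
    # Greedy Jacobi decoding: refine an n-token guess in parallel until it
    # reaches the fixed point that greedy autoregressive decoding would produce.
    rng = np.random.default_rng(seed)
    guess = rng.integers(0, vocab_size, size=n)      # randomly initialized n-token sequence
    for _ in range(max_iters):
        tokens = np.concatenate([prompt, guess])     # prompt: 1-D array of token ids
        logits = logits_fn(tokens)                   # one forward pass scores every position
        # Row i of `logits` predicts the token at position i + 1, so the n
        # refreshed guesses come from rows len(prompt)-1 .. len(prompt)+n-2.
        new_guess = logits[len(prompt) - 1 : len(prompt) + n - 1].argmax(axis=-1)
        if np.array_equal(new_guess, guess):         # fixed point: identical to AR output
            break
        guess = new_guess
    return guess
```

Vanilla Jacobi decoding can take up to $n$ such iterations to converge; CLLMs are trained so the loop reaches its fixed point in as few iterations as possible.
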
+Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
+
+TL;DR: We introduce lookahead decoding, a new, exact, and parallel decoding algorithm to accelerate LLM inference. Lookahead decoding breaks the sequential dependency in autoregressive decoding by concurrently extracting and verifying n-grams directly with the LLM, using the Jacobi iteration method. It works without a draft model or a data store, and it decreases the number of decoding steps linearly in the log(FLOPs) used per decoding step.
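
As a companion sketch, the verification half of lookahead decoding can be illustrated like this. Again `logits_fn` is the same hypothetical parallel-forward-pass stand-in as above; in the real algorithm, n-gram generation (from Jacobi trajectories) and verification share one batched forward pass rather than running separately:

```python
import numpy as np

def verify_ngram(logits_fn, confirmed, ngram):
    # Verification branch of lookahead decoding: count how many tokens of a
    # candidate n-gram the model itself would produce greedily, so that every
    # verified token can be emitted in a single decoding step. Exactness is
    # preserved because only model-confirmed tokens are accepted.
    tokens = np.concatenate([confirmed, ngram])
    preds = logits_fn(tokens).argmax(axis=-1)  # greedy next-token choice per position
    accepted = 0
    for j, tok in enumerate(ngram):
        # ngram[j] sits at absolute position len(confirmed) + j and is verified
        # when the model predicts it from everything that precedes it.
        if preds[len(confirmed) + j - 1] == tok:
            accepted += 1
        else:
            break
    return ngram[:accepted]
```
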