Add Float8QuantizedTensor (AQT subclass) and replace to_affine_quantized_floatx with to_affine_quantized_float8 in quantization APIs #1599

danielvegamyhre · 2025-01-22T19:35:29Z

Context

Currently, AQT has the method from_hp_to_floatx for float8 quantization, and from_hp_to_fpx for low-precision floating-point data types like fp6 (technically it can support fp1-fp7).

from_hp_to_floatx reuses from_hp_to_intx, which in turn uses the generic quantization primitives.

Overall, the current float8 path is a bit confusing for developers, due to both the naming ("floatx") and the use of generic functions that take a number of params unrelated to float8 quantization.
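To make the point about unrelated params concrete, here is a minimal pure-Python sketch of what a float8-only primitive needs: a scale and a cast, with no zero point and no integer quant_min/quant_max. All function names below (round_to_e4m3, choose_scale_float8, quantize_float8, dequantize_float8) are hypothetical and are not torchao APIs. The e4m3fn constants are the standard ones, the saturating clamp is a simplifying assumption, and only finite inputs are handled.

```python
import math

E4M3_MAX = 448.0        # largest finite float8 e4m3fn value
E4M3_MIN_EXP = -6       # exponent of the smallest normal e4m3fn value
E4M3_MANTISSA_BITS = 3  # explicit mantissa bits in e4m3fn


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest e4m3fn-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp returns mag = m * 2**e with 0.5 <= m < 1, so floor(log2(mag)) == e - 1.
    # Clamping the exponent makes small values fall onto the subnormal grid.
    exp = max(math.frexp(mag)[1] - 1, E4M3_MIN_EXP)
    step = 2.0 ** (exp - E4M3_MANTISSA_BITS)  # spacing of representable values
    return sign * min(round(mag / step) * step, E4M3_MAX)


def choose_scale_float8(values):
    """Per-tensor scale that maps the largest magnitude onto E4M3_MAX."""
    amax = max(abs(v) for v in values)
    return amax / E4M3_MAX if amax > 0 else 1.0


def quantize_float8(values, scale):
    """Scale-only quantization: no zero point, no integer range."""
    return [round_to_e4m3(v / scale) for v in values]


def dequantize_float8(quantized, scale):
    """Undo the scaling applied by quantize_float8."""
    return [q * scale for q in quantized]


values = [0.5, -2.0, 7.0]
scale = choose_scale_float8(values)
quantized = quantize_float8(values, scale)
restored = dequantize_float8(quantized, scale)
```

The whole interface is a list of values and one scale, which is the kind of narrow surface a float8-specific primitive can offer compared with a generic affine one.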

Summary of changes

The goal of this PR stack is to refactor this to have a clean separation of concerns and simpler internal API surfaces for code used in float8 quantization for inference.

Specifically:

1. Separate quantization primitives for float8
2. Integrate those new quant primitives into AQT
3. Integrate new AQT methods into float8 quantization APIs (this PR)

Note: I will add float8 static quantization in a separate set of PRs.

danielvegamyhre · 2025-01-22T19:35:30Z

Stack from ghstack (oldest at bottom):

-> Add Float8QuantizedTensor (AQT subclass) and replace to_affine_quantized_floatx with to_affine_quantized_float8 in quantization APIs #1599
integrate new float8 quantization primitives into AQT #1598
add separate quantization primitives for float8 #1597

pytorch-bot · 2025-01-22T19:35:33Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1599

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 2f15cc1 with merge base 32d9b0b:

NEW FAILURES - The following jobs have failed:

Run Regression Tests / test-nightly (CPU Nightly, linux.4xlarge, --pre torch --index-url https://download.pytorch.org/wh... / linux-job (gh)
RuntimeError: Command docker exec -t 84ed1b6238c85976efd407f2dccfb3e367a35c6459cd1613a0be45cb65fb0ba0 /exec failed with exit code 1
Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
RuntimeError: Command docker exec -t 10b1e1a3e3ec936f8a6e94f598ea3637e9c886a117fc6a5bb06ba4438414fdb1 /exec failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2025-01-22T20:58:33Z

thanks, we also want to split out Float8 (and floatx) specific AQT implementations as well, I talked to @jainapurva before

danielvegamyhre · 2025-01-22T21:10:59Z

> thanks, we also want to split out Float8 (and floatx) specific AQT implementations as well, I talked to @jainapurva before

Yep that makes sense, when I talked to her earlier she said she is planning to create these AQT subclasses, so I decided to do this part of the refactor.

jainapurva · 2025-01-23T00:50:55Z

torchao/dtypes/__init__.py

@@ -38,6 +34,7 @@
    "to_affine_quantized_fpx",
    "to_affine_quantized_floatx",


Please remove floatx; float8 should replace floatx.

danielvegamyhre · 2025-01-23 (reply)

Oh, I left it in since it's still in use in other parts of the codebase (autoquant, autoquant v2), and I wasn't sure if I should be touching those. Is it OK to replace all instances across the whole codebase?

jainapurva · 2025-01-23T01:30:27Z

> thanks, we also want to split out Float8 (and floatx) specific AQT implementations as well, I talked to @jainapurva before

> Yep that makes sense, when I talked to her earlier she said she is planning to create these AQT subclasses, so I decided to do this part of the refactor.

Yes, we want all the instances replaced. Autoquant is using it for Float8, so it would be better to rename it to float8.

danielvegamyhre · 2025-01-23T16:44:55Z

> thanks, we also want to split out Float8 (and floatx) specific AQT implementations as well, I talked to @jainapurva before

> Yep that makes sense, when I talked to her earlier she said she is planning to create these AQT subclasses, so I decided to do this part of the refactor.

> Yes, we want all the instances replaced. Autoquant is using it for Float8, so it would be better to rename it to float8.

Done!

vkuzo · 2025-01-23T17:53:50Z

torchao/dtypes/float8/float8_layout.py

@@ -209,19 +215,64 @@ def __repr__(self):
        )


+class Float8QuantizedTensor(AffineQuantizedTensor):


I'm not a fan of this: it introduces one more abstraction (Float8QuantizedTensor) while keeping the complexity of AffineQuantizedTensor. I think either staying with AQT or just writing a float8 tensor without using AQT would be more attractive.

Interesting - cc @jainapurva @jerryzh168 thoughts on this?

For context, AQT subclassing was part of a BE effort for the week; I'll share the doc with you internally.

Removing the AQT abstraction is easy; the only reason I felt like keeping it was consistency across all dtypes. Though I do agree that it adds another level of abstraction.
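The trade-off discussed in this thread can be sketched in plain Python. This is only an illustration under stated assumptions: GenericQuantizedTensor, Float8Subclass and Float8Standalone are hypothetical stand-ins, not the torchao classes, and the fields on the generic class are chosen just to show how inherited state accumulates.

```python
from dataclasses import dataclass, fields


class GenericQuantizedTensor:
    """Hypothetical stand-in for a generic affine-quantized container."""

    def __init__(self, data, scale, zero_point=0.0, quant_min=None, quant_max=None):
        self.data = list(data)
        self.scale = scale
        self.zero_point = zero_point
        self.quant_min = quant_min
        self.quant_max = quant_max

    def dequantize(self):
        return [(q - self.zero_point) * self.scale for q in self.data]


class Float8Subclass(GenericQuantizedTensor):
    """Option A: a subclass with a narrower constructor; generic state is still inherited."""

    def __init__(self, data, scale):
        super().__init__(data, scale)


@dataclass(frozen=True)
class Float8Standalone:
    """Option B: a standalone type that stores only what float8 needs."""

    data: tuple
    scale: float

    def dequantize(self):
        return [q * self.scale for q in self.data]


a = Float8Subclass([32.0, -128.0], scale=0.015625)
b = Float8Standalone((32.0, -128.0), scale=0.015625)
```

Both objects dequantize to the same values, but the subclass instance still carries zero_point, quant_min and quant_max, while the standalone type has only data and scale. The subclass keeps one hierarchy across dtypes, which is the consistency argument made above.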

danielvegamyhre · 2025-01-23T20:55:53Z

Discussed offline, closing until internal discussions are finalized.

Update a1becad

danielvegamyhre mentioned this pull request Jan 22, 2025: add separate quantization primitives for float8 #1597 (Merged)

facebook-github-bot added the CLA Signed label Jan 22, 2025

danielvegamyhre mentioned this pull request Jan 22, 2025: integrate new float8 quantization primitives into AQT #1598 (Open)

danielvegamyhre added the quantize and topic: improvement labels Jan 22, 2025

danielvegamyhre requested a review from jainapurva January 22, 2025 19:36

Update 4a1c1ea

danielvegamyhre removed the request for review from jainapurva January 22, 2025 20:16

Update 745529e

Update 33c958e

Update f0fa1d9

Update 76ae4bc

Update 0bc1899

danielvegamyhre requested review from jainapurva and jerryzh168 January 22, 2025 21:14

jainapurva reviewed Jan 23, 2025

Update 3255653

Update fdbd828

Update c29f835

Update 5d23bda

Update d9d7bc6

Update 288dff3

Update 48b01e1

Update 488fc6f

danielvegamyhre changed the title from "replace to_affine_quantized_floatx with to_affine_quantized_float8 in quantization APIs" to "Add Float8QuantizedTensor (AQT subclass) and replace to_affine_quantized_floatx with to_affine_quantized_float8 in quantization APIs" Jan 23, 2025

Update 6d583af

vkuzo reviewed Jan 23, 2025

Update dc97eba

Update 0d8cdd9

Update 2f15cc1

danielvegamyhre closed this Jan 23, 2025
