
Commit
fix(docs): fix all the broken links
baptistecolle committed Jan 17, 2025
1 parent d77901a commit a338ce7
Showing 8 changed files with 40 additions and 33 deletions.
docs/source/howto/advanced-tgi-serving.mdx (1 addition, 1 deletion)

@@ -6,7 +6,7 @@

We recommend using Jetstream with TGI for the best performance. If for some reason you want to use the Pytorch/XLA backend instead, you can set the `JETSTREAM_PT_DISABLE=1` environment variable.

-For more information, see our discussion on the [difference between jetstream and pytorch XLA](./conceptual_guide/difference)
+For more information, see our discussion on the [difference between jetstream and pytorch XLA](../conceptual_guides/difference_between_jetstream_and_xla)

## Quantization
When using Jetstream Pytorch engine, it is possible to enable quantization to reduce the memory footprint and increase the throughput. To enable quantization, set the `QUANTIZATION=1` environment variable. For instance, on a 2x4 TPU v5e (16GB per chip * 8 = 128 GB per pod), you can serve models up to 70B parameters, such as Llama 3.3-70B. The quantization is done in `int8` on the fly as the weight loads. As with any quantization option, you can expect a small drop in the model accuracy. Without the quantization option enabled, the model is served in bf16.
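The two environment flags mentioned above are set like any other environment variable of the TGI serving process; a minimal sketch, assuming you export them before (or pass them to) your usual launch command:

```bash
# Minimal sketch: flags described in the docs above; the launch command itself depends on your deployment.
export QUANTIZATION=1            # quantize weights to int8 on the fly as they load (Jetstream Pytorch engine)
# export JETSTREAM_PT_DISABLE=1  # uncomment only if you want the Pytorch/XLA backend instead of Jetstream
```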
docs/source/howto/installation_inside_a_container.mdx (2 additions, 6 deletions)

@@ -57,10 +57,6 @@ python3 -c "import torch_xla.core.xla_model as xm; print(xm.xla_device())"
You should see output indicating the XLA device is available (e.g., `xla:0`).
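For a slightly stronger sanity check than printing the device name, the same idea can be extended to run a small op on the TPU; a sketch, assuming `torch` and `torch_xla` are installed in the container:

```bash
# Sketch: compile and run a trivial op on the XLA device, then copy the result back to CPU.
python3 -c "
import torch
import torch_xla.core.xla_model as xm

device = xm.xla_device()
x = torch.ones(2, 2, device=device)
print(device, (x + x).cpu())
"
```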

## Next Steps

-After setting up your container, you can:
-- Start training models using Optimum-TPU
-- Run inference workloads
-- Access TPU-specific features and optimizations
-
-For more details on using Optimum-TPU, refer to our [documentation](https://huggingface.co/docs/optimum/tpu/overview).
+- Start training models using Optimum-TPU. Refer to our [training example section](../howto/more_examples).
+- Run inference workloads. Check out our [serving guide](../howto/serving).
docs/source/howto/serving.mdx (1 addition, 1 deletion)

@@ -8,7 +8,7 @@ For a list of supported models, check the [Supported Models page](../supported-a

## Deploy TGI on a Cloud TPU Instance

-This guide assumes you have a Cloud TPU instance running. If not, please refer to our [deployment guide](./deploy).
+This guide assumes you have a Cloud TPU instance running. If not, please refer to our [deployment guide](../tutorials/tpu_setup).

You have two options for deploying TGI:
1. Use our pre-built TGI image (recommended)
docs/source/howto/training.mdx (1 addition, 1 deletion)

@@ -12,7 +12,7 @@ See [Supported Models](../supported-architectures).

Before starting the training process, ensure you have:

-1. A configured Google Cloud TPU instance (see [Deployment Guide](./deploy))
+1. A configured Google Cloud TPU instance (see [Deployment Guide](../tutorials/tpu_setup))
2. Optimum-TPU installed with PyTorch/XLA support:
```bash
pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html
docs/source/index.mdx (26 additions, 12 deletions)

@@ -27,8 +27,7 @@ If you are here to start using HuggingFace products on TPU, then you are in the
The API provides the overall same user-experience as HuggingFace transformers with the minimum amount of changes required to target performance for inference and training.

Optimum TPU is meant to reduce as much as possible the friction in order to leverage Google Cloud TPU accelerators.
-As such, we provide a pip installable package to make sure everyone can get easily started.
-```bash
+As such, we provide a pip installable package to make sure everyone can get easily started.```bash
pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html
```

@@ -39,26 +38,41 @@ TPUs excel at large-scale machine learning workloads with matrix computations, e
Optimum-TPU serves as the bridge between the HuggingFace ecosystem and Google Cloud TPU hardware. It dramatically simplifies what would otherwise be a complex integration process, providing an intuitive interface that abstracts away TPU-specific implementation details while maintaining high performance. Through automated optimizations, efficient batching strategies, intelligent memory management and more, Optimum-TPU ensures your models run at peak efficiency on TPU hardware. The framework's deep integration with the HuggingFace Hub catalog of models and datasets enables easy deployment and fine-tuning of state-of-the-art models with the familiar ease of use of HuggingFace libraries while maximizing TPU hardware capabilities.

<div class="mt-10">
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 md:gap-y-4 md:gap-x-5">
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./tutorials/overview">
<div class="w-full flex flex-col space-y-4 md:space-y-0 md:grid md:grid-cols-2 lg:grid-cols-2 md:gap-y-4 md:gap-x-5">
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./tutorials/tpu_setup">
<div class="w-full text-center bg-gradient-to-br from-blue-400 to-blue-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
Tutorials
</div>
<p class="text-gray-700">
-Learn the basics and become familiar with deploying transformers on Google TPUs.
-Start here if you are using 🤗 Optimum-TPU for the first time!
+Learn the basics and become familiar with deploying transformers on Google TPUs.
+Start here if you are using 🤗 Optimum-TPU for the first time!
</p>
</a>
-<a
-class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg"
-href="./howto/overview"
->
+<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./howto/serving">
<div class="w-full text-center bg-gradient-to-br from-indigo-400 to-indigo-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
How-to guides
</div>
<p class="text-gray-700">
-Practical guides to help you achieve a specific goal. Take a look at these guides to learn how to use 🤗 Optimum-TPU
-to solve real-world problems.
+Practical guides to help you achieve a specific goal. Take a look at these guides to learn how to use 🤗 Optimum-TPU
+to solve real-world problems.
</p>
</a>
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./conceptual_guides/tpu_hardware_support">
<div class="w-full text-center bg-gradient-to-br from-green-400 to-green-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
Conceptual Guides
</div>
<p class="text-gray-700">
Deep dives into key concepts behind TPU optimization, architecture, and best practices.
Understand how TPUs work and how to maximize their potential.
</p>
</a>
<a class="!no-underline border dark:border-gray-700 p-5 rounded-lg shadow hover:shadow-lg" href="./reference/fsdp_v2">
<div class="w-full text-center bg-gradient-to-br from-purple-400 to-purple-500 rounded-lg py-1.5 font-semibold mb-5 text-white text-lg leading-relaxed">
Reference
</div>
<p class="text-gray-700">
Technical descriptions of how the classes and methods of 🤗 Optimum-TPU work.
Detailed API documentation, configuration options, and implementation details.
</p>
</a>
</div>
docs/source/tutorials/inference_on_tpu.mdx (1 addition, 1 deletion)

@@ -5,7 +5,7 @@ This tutorial guides you through setting up and running inference on TPU using T
## Prerequisites

Before starting, ensure you have:
-- A running TPU instance (see [TPU Setup Guide](./tpu_setup))
+- A running TPU instance (see [TPU Setup Guide](../tutorials/tpu_setup))
- SSH access to your TPU instance
- A HuggingFace account

docs/source/tutorials/tpu_setup.mdx (3 additions, 3 deletions)

@@ -36,7 +36,7 @@ Click the "Create" button to setup your TPU instance.

1. Select TPU type:
- We will use a TPU `v5e-8` (corresponds to a v5litepod8). This is a TPU node containing 8 v5e TPU chips
-- For detailed specifications about TPU types, refer to our [TPU hardware types documentation](../conceptual_guides/tpu_hardware_support.mdx)
+- For detailed specifications about TPU types, refer to our [TPU hardware types documentation](../conceptual_guides/tpu_hardware_support)

2. Choose a runtime:
- Select `v2-alpha-tpuv5-lite` runtime
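The console flow above can also be reproduced from the command line; a rough sketch using the gcloud CLI, assuming it is installed and authenticated, with the TPU name and zone as placeholders to replace with your own values:

```bash
# Sketch: CLI equivalent of the console choices above; check flag values against current gcloud docs.
gcloud compute tpus tpu-vm create my-tpu \
  --zone=us-west4-a \
  --accelerator-type=v5litepod-8 \
  --version=v2-alpha-tpuv5-lite
```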
@@ -72,5 +72,5 @@ Now that you have a working TPU environment, you can start using it for AI workl
- Learn how to start training ML models on TPU

Choose the tutorial that best matches your immediate needs:
-- For deploying existing models, start with our [model serving tutorial](../tutorials/inference_on_tpu.mdx)
-- For training new models, begin with our [model training tutorial](../tutorials/training_on_tpu.mdx)
+- For deploying existing models, start with our [model serving tutorial](../tutorials/inference_on_tpu)
+- For training new models, begin with our [model training tutorial](../tutorials/training_on_tpu)
docs/source/tutorials/training_on_tpu.mdx (5 additions, 8 deletions)

@@ -4,20 +4,17 @@ This tutorial walks you through setting up and running model training on TPU usi

## Prerequisites

-Before starting, ensure you have:
-- A running TPU instance (see [TPU Setup Guide](./setup))
-- Docker installed on your TPU instance
-- HuggingFace authentication token
-- Basic familiarity with Jupyter notebooks
+Before starting, ensure you have a running TPU instance (see [TPU Setup Guide](../tutorials/tpu_setup.mdx))

## Environment Setup
-First, create and activate a virtual environment:
+
+First, create and activate a virtual environment:
```bash
python -m venv .venv
source .venv/bin/activate
```

Install the required packages:
```bash
# Install optimum-tpu with PyTorch/XLA support
pip install optimum-tpu -f https://storage.googleapis.com/libtpu-releases/index.html
@@ -112,6 +109,6 @@ You should now see the loss decrease during training. When the training is done,

## Next Steps
Continue your TPU training journey by exploring:
-- More complex training scenarios in our [examples](./howto/more_examples)
-- Different [model architectures supported by Optimum TPU](../supported-architectures.mdx)
+- More complex training scenarios in our [examples](../howto/more_examples)
+- Different [model architectures supported by Optimum TPU](../supported-architectures)
