Merge pull request #313 from mhd-medfa/main
Correct Typographical Error in Vision Transformer Introduction
johko authored Jul 17, 2024
2 parents 0118f94 + 08f347a commit 3db7572
Showing 1 changed file with 1 addition and 1 deletion.
@@ -2,7 +2,7 @@

## Introduction

- As the Transformers architecture scaled well in Natural Language Processing, the same architecture was applied to images by creating small patches of the image and treating them as tokens. The result was a Vision Transformer (Vision Transformers). Before we get started with transfer learning / fine-tuning concepts, let's compare Convolutional Neural Networks (CNNs) with Vision Transformers.
+ As the Transformers architecture scaled well in Natural Language Processing, the same architecture was applied to images by creating small patches of the image and treating them as tokens. The result was a Vision Transformer (ViT). Before we get started with transfer learning / fine-tuning concepts, let's compare Convolutional Neural Networks (CNNs) with Vision Transformers.
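
To make "creating small patches of the image and treating them as tokens" concrete, here is a minimal NumPy sketch. The function name `image_to_patches`, the 224x224 RGB input, and the patch size of 16 are assumptions chosen for illustration and are not taken from the course text. A full ViT would also pass each flattened patch through a learned linear projection and add position embeddings, which this sketch leaves out.

```python
import numpy as np


def image_to_patches(image: np.ndarray, patch_size: int) -> np.ndarray:
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row (left to right, top to bottom).
    """
    height, width, channels = image.shape
    if height % patch_size or width % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")

    rows, cols = height // patch_size, width // patch_size
    # Expose the patch grid: (rows, patch_size, cols, patch_size, C).
    patches = image.reshape(rows, patch_size, cols, patch_size, channels)
    # Bring the two grid axes together: (rows, cols, patch_size, patch_size, C).
    patches = patches.transpose(0, 2, 1, 3, 4)
    # Flatten each patch into a single vector, one "token" per patch.
    return patches.reshape(rows * cols, patch_size * patch_size * channels)


# Hypothetical input: a 224x224 RGB image filled with placeholder values.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = image_to_patches(image, patch_size=16)
print(tokens.shape)  # (196, 768)
```

Each of the 196 rows plays the role that a token embedding plays in an NLP Transformer: the sequence length is the number of patches, and the feature size is the number of pixel values in one patch.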

### CNN vs Vision Transformers: Inductive Bias
