This benchmark is about reading pure PDF files, that is, neither scanned documents nor documents to which OCR has been applied.

Benchmark machine: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz

Input documents:

| # | Name | File Size | Pages |
|---|---|---|---|
1 | 2201.00214 | 2.4MiB | 22 |
2 | GeoTopo-book | 5.1MiB | 117 |
3 | 2201.00151 | 1.5MiB | 12 |
4 | 1707.09725 | 7.0MiB | 134 |
5 | 2201.00021 | 2.6MiB | 10 |
6 | 2201.00037 | 2.9MiB | 33 |
7 | 2201.00069 | 14.7MiB | 15 |
8 | 2201.00178 | 2.3MiB | 16 |
9 | 2201.00201 | 1.3MiB | 9 |
10 | 1602.06541 | 2.9MiB | 16 |
11 | 2201.00200 | 284.8KiB | 7 |
12 | 2201.00022 | 1.1MiB | 11 |
13 | 2201.00029 | 797.6KiB | 12 |
14 | 1601.03642 | 1004.9KiB | 8 |

Libraries:

| Name | Last PyPI Release | License | Version | Dependencies |
|---|---|---|---|---|
Borb | 2023-06-23 | AGPL/Commercial | 2.1.16 | |
pypdfium2 | 2023-07-04 | Apache-2.0 or BSD-3-Clause | 4.18.0 | PDFium (Foxit/Google) |
pdfminer.six | 2022-11-05 | MIT/X | 20221105 | |
pdfplumber | 2023-07-29 | MIT | 0.10.2 | pdfminer.six |
pdfrw | 2017-09-18 | MIT | 0.4 | |
pdftotext | - | GPL | 0.86.1 | build-essential libpoppler-cpp-dev pkg-config python3-dev |
PyMuPDF | 2023-08-24 | GNU AFFERO GPL 3.0 / Commercial | 1.23.1 | MuPDF |
pypdf | 2023-08-26 | BSD 3-Clause | 3.15.4 | |
Tika | 2023-01-01 | Apache v2 | 2.6.0 | Apache Tika |
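
Text extraction speed (seconds per document; columns 1 to 14 are the input documents listed above):

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|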
1 | PyMuPDF | 0.1s | 0.4s | 0.2s | 0.2s | 0.2s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s |
2 | pypdfium2 | 0.2s | 1.9s | 0.2s | 0.2s | 0.2s | 0.0s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s |
3 | pdftotext | 0.3s | 0.8s | 1.0s | 0.3s | 0.8s | 0.1s | 0.2s | 0.2s | 0.1s | 0.0s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
4 | Tika | 1.1s | 12.9s | 0.9s | 0.6s | 0.4s | 0.1s | 0.3s | 0.2s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.0s |
5 | pypdf | 2.6s | 18.7s | 4.8s | 5.3s | 2.3s | 0.7s | 0.9s | 0.4s | 0.5s | 0.3s | 0.6s | 0.5s | 0.4s | 0.4s | 0.2s |
6 | pdfminer.six | 4.5s | 26.0s | 12.9s | 8.0s | 4.6s | 1.3s | 2.1s | 1.0s | 1.2s | 0.8s | 1.5s | 0.9s | 0.9s | 0.6s | 0.6s |
7 | pdfplumber | 6.7s | 41.7s | 10.9s | 11.5s | 8.4s | 2.4s | 4.3s | 2.0s | 1.9s | 1.9s | 2.7s | 1.8s | 1.7s | 1.0s | 1.2s |
8 | Borb | 34.7s | 111.2s | 105.0s | 1.4s | 87.2s | 21.1s | 7.4s | 83.5s | 16.4s | 20.3s | 5.4s | 3.4s | 18.8s | 3.2s | 2.1s |
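
The timings above are per-document durations plus their mean. The benchmark's own harness is not shown here, so the following is only a minimal sketch of how such a row could be produced. It assumes a hypothetical `operation` callable wrapping whichever library is under test; `time_per_document`, `format_row`, and `fake_extract` are illustrative names, not part of any listed library.

```python
import time
from typing import Callable, Dict


def time_per_document(
    operation: Callable[[bytes], object],
    documents: Dict[str, bytes],
) -> Dict[str, float]:
    """Run `operation` once per document; return wall-clock seconds by name."""
    timings: Dict[str, float] = {}
    for name, data in documents.items():
        start = time.perf_counter()
        operation(data)
        timings[name] = time.perf_counter() - start
    return timings


def format_row(library: str, timings: Dict[str, float]) -> str:
    """Format one table row: library, mean, then each document to 0.1s."""
    mean = sum(timings.values()) / len(timings)
    cells = [library, f"{mean:.1f}s"] + [f"{t:.1f}s" for t in timings.values()]
    return " | ".join(cells) + " |"


# Hypothetical stand-in for a real extractor so the sketch runs anywhere.
def fake_extract(data: bytes) -> str:
    return data.decode("latin-1")


documents = {
    "doc-a": b"%PDF-1.7 small",
    "doc-b": b"%PDF-1.7 " + b"x" * 100_000,
}
timings = time_per_document(fake_extract, documents)
row = format_row("fake", timings)
print(row)
```

A single run per document, as sketched here, includes one-off costs such as library or runtime start-up, which may be one reason the first column is often the slowest.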

Image extraction speed (seconds per document):

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PyMuPDF | 0.5s | 0.3s | 0.5s | 0.0s | 1.7s | 0.4s | 0.0s | 3.2s | 0.4s | 0.4s | 0.1s | 0.0s | 0.3s | 0.2s | 0.0s |
2 | pypdf | 2.8s | 16.4s | 2.1s | 0.8s | 9.2s | 1.1s | 0.0s | 6.7s | 0.9s | 0.9s | 0.4s | 0.0s | 0.7s | 0.2s | 0.1s |
3 | pdfminer.six | 6.5s | 31.8s | 13.7s | 9.2s | 24.0s | 1.5s | 2.3s | 1.5s | 1.4s | 0.9s | 1.5s | 0.9s | 1.0s | 0.6s | 0.5s |
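
The Average column appears to be the arithmetic mean of the 14 per-document values. The sketch below checks that assumption for the PyMuPDF row of the table above; `parse_row` is a hypothetical helper written for this illustration. Because the per-document cells are already rounded to 0.1s, the recomputed mean can differ slightly from the reported one.

```python
from typing import List, Tuple


def parse_row(row: str) -> Tuple[str, float, List[float]]:
    """Split a result row into (library, reported average, per-document values)."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    _rank, library, average, *per_document = cells
    to_seconds = lambda cell: float(cell.rstrip("s"))
    return library, to_seconds(average), [to_seconds(c) for c in per_document]


row = (
    "1 | PyMuPDF | 0.5s | 0.3s | 0.5s | 0.0s | 1.7s | 0.4s | 0.0s | 3.2s "
    "| 0.4s | 0.4s | 0.1s | 0.0s | 0.3s | 0.2s | 0.0s |"
)
library, reported, values = parse_row(row)
mean = sum(values) / len(values)  # 7.5 s over 14 documents
print(f"{library}: reported {reported:.1f}s, recomputed {mean:.2f}s")
```

The same check gives about 2.82s for the pypdf row and 6.49s for the pdfminer.six row, both of which round to the reported averages.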

Watermarking speed (seconds per document):

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PyMuPDF | 0.0s | 0.0s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s | 0.0s |
2 | pdfrw | 0.1s | 0.0s | 0.4s | 0.0s | 0.3s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.1s | 0.0s | 0.1s | 0.0s | 0.0s |
3 | pypdf | 0.4s | 0.6s | 1.7s | 0.4s | 0.9s | 0.2s | 0.3s | 0.4s | 0.3s | 0.2s | 0.3s | 0.1s | 0.2s | 0.0s | 0.2s |

Watermarking file size (size of the watermarked output per document):

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pdfrw | 3.4MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.1MB | 0.8MB | 1.0MB |
2 | pypdf | 3.5MB | 2.5MB | 5.7MB | 1.6MB | 7.3MB | 2.7MB | 3.1MB | 15.4MB | 2.4MB | 1.3MB | 3.0MB | 0.3MB | 1.1MB | 0.8MB | 1.0MB |
3 | PyMuPDF | 3.7MB | 2.7MB | 6.8MB | 1.7MB | 8.5MB | 2.8MB | 3.4MB | 15.5MB | 2.5MB | 1.4MB | 3.2MB | 0.3MB | 1.2MB | 0.9MB | 1.1MB |
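
The sizes in the table above are given in MB, while the input document table uses MiB and KiB, so the two cannot be compared directly. The sketch below converts a few input sizes for comparison. It assumes that MB here means 10^6 bytes and MiB means 2^20 bytes, which the source does not state; the values are copied from the two tables.

```python
MIB = 1024 ** 2  # bytes per mebibyte
MB = 1000 ** 2   # bytes per megabyte (assumed meaning of "MB" in the table)


def mib_to_mb(size_mib: float) -> float:
    """Convert a size in MiB to MB."""
    return size_mib * MIB / MB


# Input sizes (MiB) and pdfrw output sizes (MB) for documents 1 and 7.
input_mib = {1: 2.4, 7: 14.7}
pdfrw_output_mb = {1: 2.5, 7: 15.4}

for doc, size_mib in input_mib.items():
    size_mb = mib_to_mb(size_mib)
    delta = pdfrw_output_mb[doc] - size_mb
    print(f"document {doc}: input {size_mb:.1f}MB, output {pdfrw_output_mb[doc]}MB, delta {delta:+.2f}MB")
```

Under this assumption the 14.7MiB input is about 15.4MB, so for that document the output is essentially the same size as the input. All values are rounded to one decimal, so small differences are not meaningful.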

Text extraction quality (per document, higher is better):

| # | Library | Average | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | pypdfium2 | 98% | 99% | 97% | 94% | 99% | 98% | 96% | 99% | 98% | 99% | 99% | 98% | 98% | 99% | 99% |
2 | pypdf | 97% | 98% | 93% | 94% | 98% | 98% | 96% | 97% | 98% | 99% | 99% | 98% | 98% | 98% | 99% |
3 | PyMuPDF | 97% | 98% | 96% | 93% | 97% | 98% | 96% | 98% | 98% | 98% | 98% | 97% | 97% | 98% | 99% |
4 | Tika | 96% | 99% | 98% | 92% | 97% | 98% | 96% | 93% | 97% | 98% | 93% | 98% | 93% | 98% | 96% |
5 | pdftotext | 93% | 96% | 93% | 91% | 94% | 92% | 96% | 96% | 96% | 97% | 83% | 94% | 96% | 96% | 79% |
6 | pdfminer.six | 90% | 95% | 79% | 86% | 92% | 86% | 93% | 95% | 93% | 92% | 92% | 93% | 86% | 98% | 86% |
7 | pdfplumber | 75% | 94% | 84% | 61% | 97% | 61% | 93% | 61% | 89% | 57% | 59% | 67% | 59% | 98% | 67% |
8 | Borb | 45% | 70% | 79% | 0% | 40% | 48% | 92% | 0% | 64% | 51% | 41% | 55% | 43% | 0% | 53% |