Implement loading of the yolox_l0.05.onnx model from local disk, based on reading the unstructured source code. #41

Open
ggservice007 opened this issue Mar 13, 2024 · 5 comments
@ggservice007
what

Implement loading of the yolox_l0.05.onnx model from local disk, based on reading the unstructured source code.

why

Currently, unstructured tries to load unstructuredio/yolo_x_layout/yolox_l0.05.onnx by downloading it from Hugging Face if
it cannot be found locally.

@bjwswang
Collaborator

@ggservice007 please show the example code for this

@wangxinbiao
Collaborator

wangxinbiao commented Mar 13, 2024

dependencies

unstructured==0.12.0
unstructured-inference==0.7.21
unstructured.pytesseract==0.3.12
pdf2image==1.17.0
pdfminer.six==20231228
pikepdf==8.13.0

apt-get install poppler-utils

example

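from unstructured.partition.pdf import partition_pdf

# Placeholder paths (not part of the original comment); replace with real ones.
file_path = "example.pdf"
output_dir = "extracted_images"

# The "hi_res" strategy is the code path that loads the layout model
# discussed in this issue.
elements = partition_pdf(
    filename=file_path,
    strategy="hi_res",
    extract_images_in_pdf=True,
    extract_image_block_output_dir=output_dir,
)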

unstructured
https://github.com/Unstructured-IO/unstructured/blob/main/unstructured/partition/pdf.py#L136

@wangxinbiao
Collaborator

wangxinbiao commented Mar 13, 2024

The downloaded model files are stored under /root/.cache/huggingface/hub by default; the path can be changed by setting the HF_HUB_CACHE environment variable.
@ggservice007 @bjwswang
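A minimal sketch of the environment-variable approach, for illustration only. The helper name `configure_hf_cache` and its `offline` flag are hypothetical. `HF_HUB_CACHE` and `HF_HUB_OFFLINE` are environment variables read by huggingface_hub, and they should be set before unstructured (and therefore huggingface_hub) is imported.

```python
import os
from pathlib import Path


def configure_hf_cache(cache_dir: str, offline: bool = False) -> Path:
    """Point the Hugging Face hub cache at cache_dir (hypothetical helper).

    Call this before importing unstructured or huggingface_hub, because
    the cache location is read from the environment at import time.
    """
    path = Path(cache_dir).expanduser().resolve()
    path.mkdir(parents=True, exist_ok=True)
    os.environ["HF_HUB_CACHE"] = str(path)
    if offline:
        # Ask huggingface_hub to use only files already in the cache.
        os.environ["HF_HUB_OFFLINE"] = "1"
    return path


# Example usage (the directory name is an assumption):
# configure_hf_cache("/data/models/hf_cache", offline=True)
# from unstructured.partition.pdf import partition_pdf
```

With `offline=True`, the model has to be present in the cache already; otherwise the lookup is expected to fail instead of downloading.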

@wangxinbiao
Collaborator

wangxinbiao commented Mar 13, 2024

When using unstructured, large images raise an error:

PIL.Image.DecompressionBombError: Image size (8284731418 pixels) exceeds limit of 178956970 pixels, could be decompression bomb DOS attack.
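The limit in that message comes from Pillow: the error is raised once an image has more than twice `Image.MAX_IMAGE_PIXELS` pixels. The sketch below computes the smallest limit that would let a given image through and applies it if Pillow is installed. The helper names are hypothetical, and raising the limit weakens Pillow's protection against decompression bombs, so it is only reasonable for trusted inputs.

```python
# Pillow's default value for Image.MAX_IMAGE_PIXELS.
DEFAULT_MAX_IMAGE_PIXELS = 1024 * 1024 * 1024 // 4 // 3


def min_limit_for(pixels: int) -> int:
    """Smallest MAX_IMAGE_PIXELS that avoids DecompressionBombError.

    Pillow raises the error when pixels > 2 * MAX_IMAGE_PIXELS, so the
    limit must be at least ceil(pixels / 2).
    """
    return (pixels + 1) // 2


def raise_pillow_limit(pixels: int) -> bool:
    """Raise Pillow's limit for trusted input; return False if Pillow is missing."""
    try:
        from PIL import Image
    except ImportError:
        return False
    Image.MAX_IMAGE_PIXELS = min_limit_for(pixels)
    return True


# Example usage with the pixel count from the error message above:
# raise_pillow_limit(8284731418)
```

Pillow may still emit a `DecompressionBombWarning` for images above `MAX_IMAGE_PIXELS` itself. Setting `Image.MAX_IMAGE_PIXELS = None` disables the check entirely.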

@bjwswang
Collaborator

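The downloaded model files are stored under /root/.cache/huggingface/hub by default; the path can be changed by setting the HF_HUB_CACHE environment variable.
@ggservice007 @bjwswang

Can the path be set directly?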
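One way to picture "setting the path directly" is a local-first lookup that runs before any download is attempted. This is only a sketch under stated assumptions: `resolve_local_model` and the `YOLOX_MODEL_PATH` environment variable are hypothetical names and not part of unstructured or unstructured-inference.

```python
import os
from pathlib import Path
from typing import Optional

MODEL_FILENAME = "yolox_l0.05.onnx"


def resolve_local_model(
    explicit_path: Optional[str] = None,
    env_var: str = "YOLOX_MODEL_PATH",  # hypothetical variable name
) -> Optional[Path]:
    """Return the local model file if one is configured and exists, else None.

    The argument takes precedence over the environment variable. Either
    may point at the .onnx file itself or at a directory containing it.
    """
    candidate = explicit_path or os.environ.get(env_var)
    if not candidate:
        return None
    path = Path(candidate).expanduser()
    if path.is_dir():
        path = path / MODEL_FILENAME
    return path if path.is_file() else None


# Example usage (the directory is an assumption):
# model_path = resolve_local_model("/data/models")
# if model_path is None:
#     ...  # fall back to the existing Hugging Face download
```

Returning `None` rather than raising keeps the current download behavior available as a fallback when no local file is configured.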
