bug/OCRAgentGoogleVision takes 1 positional argument but 2 were given #3659
Comments
I’m experiencing the same issue. The issue arises in the `get_instance` method:

    @staticmethod
    @functools.lru_cache(maxsize=None)
    def get_instance(ocr_agent_module: str, language: str) -> "OCRAgent":
        module_name, class_name = ocr_agent_module.rsplit(".", 1)
        if module_name not in OCR_AGENT_MODULES_WHITELIST:
            raise ValueError(
                f"Environment variable OCR_AGENT module name {module_name} must be set to a "
                f"whitelisted module part of {OCR_AGENT_MODULES_WHITELIST}."
            )
        try:
            module = importlib.import_module(module_name)
            loaded_class = getattr(module, class_name)
            return loaded_class(language)  # <--- This is where the issue occurs
        except (ImportError, AttributeError) as e:
            logger.error(f"Failed to get OCRAgent instance: {e}")
            raise RuntimeError(
                "Could not get the OCRAgent instance. Please check the OCR package and the "
                "OCR_AGENT environment variable."
            )

However, the `OCRAgentGoogleVision` constructor does not accept a `language` argument, so this call fails. I'm willing to submit a PR to address this, but I want to know what the desired approach would be. Some possible options are:
Hi @DavidBlore, Thank you for your willingness to submit a PR to address this issue. After considering the options you've presented, I believe the most suitable approach would be:
Implementation suggestion:
This change allows users to specify a language if needed, but defaults to English ('en') if not provided, similar to other OCR agents. Next steps:
This PR addresses issue #3659 by adding an optional `language` parameter to the `OCRAgentGoogleVision` class constructor. This parameter serves as a "language hint" for the `document_text_detection` method in the `ImageAnnotatorClient`. For more information on language hints, refer to the [Google Cloud Vision documentation](https://cloud.google.com/vision/docs/languages).

**Default Behavior**: The language parameter defaults to None, allowing Google Cloud Vision to auto-detect the language, as recommended in their documentation.

**Purpose**: This change is necessary because the `OCRAgent`'s `get_instance` method expects all `OCRAgent`s to include a language parameter in their constructors.

**Context on Issue:** When trying to parse a PDF with `OCR_AGENT=unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision`, an error occurs in the `get_instance` method. The method expects a `language` parameter, which the current `OCRAgentGoogleVision` constructor does not support, leading to a positional argument error.

Co-authored-by: Christine Straub
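The shape of the change described in the PR can be sketched roughly as follows. This is a minimal illustration, not the merged code: `OCRAgentSketch` and `image_context` are hypothetical names, and the assumption that the hint is passed as a `language_hints` list inside an image context is mine, not something stated in this thread.

```python
from typing import Optional


class OCRAgentSketch:
    """Hypothetical stand-in; not the real OCRAgentGoogleVision."""

    def __init__(self, language: Optional[str] = None):
        # Accepting an optional language keeps the constructor compatible
        # with a factory that calls `loaded_class(language)`.
        self.language = language

    def image_context(self) -> Optional[dict]:
        # Assumption: no hint means the service auto-detects the language.
        if self.language is None:
            return None
        # Assumption: the hint is forwarded as a list of language codes.
        return {"language_hints": [self.language]}


no_hint = OCRAgentSketch()
with_hint = OCRAgentSketch("en")
print(no_hint.image_context())
print(with_hint.image_context())
```

With a default of `None`, existing call sites that construct the agent without arguments keep working, while the factory call with a positional language no longer raises.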
Describe the bug
Try to parse a PDF with `OCR_AGENT=unstructured.partition.utils.ocr_models.google_vision_ocr.OCRAgentGoogleVision`.
To Reproduce
Provide a code snippet that reproduces the issue.
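The failure mode in the title can be illustrated without Google Cloud credentials or `unstructured` installed. The sketch below uses hypothetical stand-ins (`AgentWithoutLanguage`, `get_instance_sketch`) that only mimic a factory passing one positional argument to a constructor that accepts none; it is not the reporter's actual reproduction.

```python
class AgentWithoutLanguage:
    """Hypothetical stand-in for an agent whose constructor takes no arguments."""

    def __init__(self):
        pass


def get_instance_sketch(agent_class, language):
    # Mimics a factory that always passes the language positionally.
    return agent_class(language)


def reproduce():
    """Return the TypeError message raised by the mismatched call, if any."""
    try:
        get_instance_sketch(AgentWithoutLanguage, "eng")
    except TypeError as exc:
        return str(exc)
    return None


error_message = reproduce()
print(error_message)
```

The message counts `self` as the first positional argument, which is why one explicit argument is reported as two.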
Expected behavior
No error
Environment Info
OS version: Linux-6.8.0-45-generic-x86_64-with-glibc2.39
Python version: 3.11.4
unstructured version: 0.15.14.dev1
unstructured-inference version: 0.7.36
pytesseract is not installed
Torch version: 2.4.1
Detectron2 is not installed
PaddleOCR version: None
Libmagic version: file-5.45
magic file from /etc/magic:/usr/share/misc/magic
Additional context
Add any other context about the problem here.