Image processing #697

Open

renwuliang opened this issue Oct 22, 2024 · 1 comment

renwuliang commented Oct 22, 2024

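Hello, author. How does the image processing part work? How do you efficiently locate the valid regions in an image and recognize the text in them? What kind of algorithms are needed?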

Owner

hiroi-sora commented Oct 23, 2024

How does the image processing part work?

This project uses open-source OCR engines such as PaddleOCR and RapidOCR as its core text recognition components.

How do you efficiently locate the valid regions in an image and recognize the text in them? What kind of algorithms are needed?

For the technical details, please refer to the documentation and official websites of the projects above, PaddleOCR in particular.

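In short, an OCR engine has three parts: det (text detection) finds the regions of the image that may contain text, cls (direction classification) corrects the text orientation, and rec (text recognition) reads the sentence in each small region. The main model structure is a CRNN with an encoder-decoder architecture, which uses CTC to produce variable-length sequence output.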
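To illustrate how CTC turns a fixed number of per-time-step predictions into a string of variable length, here is a minimal sketch of greedy CTC decoding in plain Python. It is not code from this project or from PaddleOCR. The function name `ctc_greedy_decode`, the two-character `charset`, the choice of index 0 as the blank symbol, and the `example_scores` values are all assumptions made for illustration.

```python
from typing import List, Sequence

BLANK = 0  # assumption: class index 0 is the CTC blank symbol


def ctc_greedy_decode(scores: Sequence[Sequence[float]], charset: str) -> str:
    """Greedy CTC decoding: argmax per time step, merge repeats, drop blanks."""
    # Pick the highest-scoring class index at every time step.
    best_path = [max(range(len(step)), key=step.__getitem__) for step in scores]

    decoded: List[str] = []
    previous = BLANK
    for index in best_path:
        # Emit a character only when the index changes and is not the blank.
        if index != previous and index != BLANK:
            decoded.append(charset[index - 1])  # class k maps to charset[k - 1]
        previous = index
    return "".join(decoded)


# Hypothetical recognizer output: 7 time steps, 3 classes (blank, "a", "b").
example_scores = [
    [0.10, 0.80, 0.10],  # a
    [0.20, 0.70, 0.10],  # a (repeat, merged with the previous step)
    [0.90, 0.05, 0.05],  # blank (separates the two "a" characters)
    [0.10, 0.60, 0.30],  # a
    [0.10, 0.20, 0.70],  # b
    [0.10, 0.10, 0.80],  # b (repeat, merged)
    [0.80, 0.10, 0.10],  # blank
]

print(ctc_greedy_decode(example_scores, "ab"))  # prints "aab"
```

Seven time steps collapse to three characters here. The blank between the second and fourth steps is what lets the same character appear twice in a row in the output. Real engines may use beam search or other decoding strategies instead of this greedy rule.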
