Inaccurate recognition: the title is always assigned to the right column #14308

xiaohongri · 2024-12-02T09:03:26Z

🔎 Search before asking

I have searched the PaddleOCR Docs and found no similar bug report.
I have searched the PaddleOCR Issues and found no similar bug report.
I have searched the PaddleOCR Discussions and found no similar bug report.

🐛 Bug (problem description)

Recognition is inaccurate: the title is always assigned to the right column.

🏃‍♂️ Environment

paddleocr --image_dir=./png_test/5 --type=structure --recovery=true --formula=true --recovery_to_markdown=true --lang=ch --output=./2

🌰 Minimal Reproducible Example

Input image:

Recognition result:

xiaohongri · 2024-12-02T09:05:07Z

Could this be caused by the automatic two-column splitting?

GreatV · 2024-12-02T13:10:32Z

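Regarding the issue "Inaccurate recognition: the title is always assigned to the right column", here are some possible causes and solutions:

Cause analysis

Layout analysis model configuration
- Your command uses type=structure without setting layout_model_dir, so the default layout analysis model is used. That model may not handle this particular layout well, causing the title region to be misclassified or grouped incorrectly.
Detection box position deviation
- Detection boxes can be affected by image quality, text layout, or model parameters (such as det_db_thresh and det_db_box_thresh), which may cause the title to be assigned to the wrong group.
Language model issues
- With lang=ch, the language-specific models may not fit well, especially for mixed-language content or complex layouts.
Conflict between recovery mode and layout classification
- You enabled recovery=true and recovery_to_markdown=true, which may cause text to be combined and arranged incorrectly during layout recovery.

Solutions

Here are several directions to check and adjust step by step:

1. Use a more suitable layout analysis model

The default layout analysis model (for example, picodet_lcnet_x1_0_fgd_layout_infer) may not suit your scenario. Consider the following:
- For Chinese documents and complex layouts: layout_model_dir="picodet_lcnet_x1_0_fgd_layout_cdla_infer" (the layout model trained on the CDLA Chinese document dataset)
- ch_ppstructure_mobile_v2.0_SLANet_infer is a table structure recognition model, so it is set through table_model_dir rather than layout_model_dir.

Modified command:

paddleocr --image_dir=./png_test/5 --type=structure --layout_model_dir="picodet_lcnet_x1_0_fgd_layout_cdla_infer" --recovery=true --formula=true --recovery_to_markdown=true --lang=ch --output=./2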

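2. Adjust detection parameters

Tune the detection box settings:
- Increase det_db_thresh to reduce interference from noisy boxes.
- Raise det_db_box_thresh so that only higher-confidence boxes are kept.

Example: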

paddleocr --det_db_thresh=0.4 --det_db_box_thresh=0.7 --image_dir=./png_test/5 --type=structure --recovery=true --formula=true --recovery_to_markdown=true --lang=ch --output=./2
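
As a rough illustration of what these two thresholds control, the sketch below mimics DB-style post-processing on a tiny probability map: det_db_thresh binarizes the map, and det_db_box_thresh drops candidate boxes whose mean probability is too low. The function names, the box format, and the toy data are assumptions made for illustration; this is not PaddleOCR's actual code.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2), end-exclusive.
Box = Tuple[int, int, int, int]


def binarize(prob_map: List[List[float]], thresh: float) -> List[List[int]]:
    """Mark pixels whose text probability exceeds `thresh` (the role of det_db_thresh)."""
    return [[1 if p > thresh else 0 for p in row] for row in prob_map]


def box_score(prob_map: List[List[float]], box: Box) -> float:
    """Mean text probability of the pixels inside the box."""
    x1, y1, x2, y2 = box
    values = [prob_map[y][x] for y in range(y1, y2) for x in range(x1, x2)]
    return sum(values) / len(values)


def keep_boxes(prob_map: List[List[float]], boxes: List[Box], box_thresh: float) -> List[Box]:
    """Keep boxes whose score reaches `box_thresh` (the role of det_db_box_thresh)."""
    return [b for b in boxes if box_score(prob_map, b) >= box_thresh]


# Toy probability map: a confident region on the left, a weaker one on the right.
prob_map = [
    [0.90, 0.90, 0.35, 0.65, 0.65],
    [0.90, 0.90, 0.10, 0.65, 0.65],
]
strong_box: Box = (0, 0, 2, 2)
weak_box: Box = (3, 0, 5, 2)

print(binarize(prob_map, 0.3)[0])
print(binarize(prob_map, 0.4)[0])
print(keep_boxes(prob_map, [strong_box, weak_box], 0.6))
print(keep_boxes(prob_map, [strong_box, weak_box], 0.7))
```

In this toy case, raising the first threshold from 0.3 to 0.4 removes the 0.35 pixel from the binary map, and raising the second from 0.6 to 0.7 discards the weaker box. Neither threshold changes how boxes are later ordered into columns.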

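3. Disable or adjust layout recovery

If recovery=true causes layout errors, try disabling recovery mode and outputting the detection and recognition results directly: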
```
paddleocr --image_dir=./png_test/5 --type=structure --lang=ch --output=./2
```

Alternatively, use --recovery_to_docx=true instead of the Markdown recovery mode to avoid layout conflicts:

paddleocr --image_dir=./png_test/5 --type=structure --recovery=true --recovery_to_docx=true --lang=ch --output=./2

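4. Check data preprocessing

Check the quality of the input image and avoid blur or skew.
If the title is noticeably offset, try adjusting the image manually or applying a perspective correction tool to improve the layout.

Other related discussions and issues

Similar issues
- Issue: Inaccurate figure and table detection in PDFs: discusses the layout analysis model misdetecting tables and titles.
- Discussion: Inaccurate figure and table detection in PDFs: offers model selection suggestions for improving layout detection.

Summary

Adjusting the layout analysis model, detection parameters, and recovery mode can help with the title being grouped incorrectly. If the problem persists, please provide more detailed input images and output logs for further troubleshooting.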

Response generated by 🤖 feifei-bot | chatgpt-4o-latest

xiaohongri · 2024-12-03T03:19:21Z

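I tried all of these methods and none of them really worked. Also, if I remove the layout recovery option recovery=true, no Word document is output at all.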

GreatV · 2024-12-03T03:29:23Z

This is probably a bug. I'll look into it when I have time.

xiejingchao · 2024-12-11T09:19:30Z

The rule in the sorted_layout_boxes method in recovery_to_doc.py is fairly simple, so it easily misjudges a single-column layout as two-column.
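
To show how a simple position-based rule can produce this symptom, here is a hypothetical sketch. It is not the actual sorted_layout_boxes code; the thresholds, function names, and box format are assumptions chosen only to illustrate the failure mode, where a centered title starts right of the quarter-width line and ends past the page midline, so it is treated as a right-column block.

```python
from typing import Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[int, int, int, int]


def naive_column(box: Box, page_width: int) -> str:
    """Hypothetical threshold rule that looks only at the box's x extent."""
    x1, _, x2, _ = box
    if x1 < page_width / 4 and x2 < 3 * page_width / 4:
        return "left"
    if x1 > page_width / 4 and x2 > page_width / 2:
        return "right"
    return "single"


def centered_aware_column(box: Box, page_width: int, tol: float = 0.05) -> str:
    """One possible refinement: a box centered on the page is treated as single-column."""
    x1, _, x2, _ = box
    center = (x1 + x2) / 2
    if abs(center - page_width / 2) <= tol * page_width:
        return "single"
    return naive_column(box, page_width)


PAGE_WIDTH = 1000
title: Box = (300, 40, 700, 90)          # centered title on a single-column page
left_block: Box = (50, 100, 480, 500)    # typical left-column paragraph
right_block: Box = (520, 100, 950, 500)  # typical right-column paragraph

for name, box in [("title", title), ("left", left_block), ("right", right_block)]:
    print(name, naive_column(box, PAGE_WIDTH), centered_aware_column(box, PAGE_WIDTH))
```

With these example coordinates the naive rule labels the centered title as "right", while the centered-aware variant labels it "single" and leaves the real left and right blocks unchanged. The 5% tolerance is an arbitrary illustrative value.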

SWHL assigned GreatV Dec 5, 2024

GreatV mentioned this issue Dec 16, 2024

enhancing recovery_to_doc #14396

Open

xiaohongri closed this as completed Dec 20, 2024

paddle-bot bot added the status/close label Dec 20, 2024
