Add image label to output MD file. #52

tungsten106 · 2023-12-27T14:52:04Z

This uses the PyMuPDF package to extract image bounding boxes, sorts them by y-position, and adds a Markdown-formatted image label as text to the output Markdown file.
Image data is saved in the metadata.json file under the key "image" as a dict of the form {img_path: img_byte_content}; each image can then be written to its path by convert_single.py.
Not all pictures in a PDF can be identified (for example, the image on page 2 of Multi-column CNN), as noted by @yachty66. Technically that is not a picture, though: it is a figure built from text boxes, arrows, etc. I am unsure how to resolve this at the moment.
Hope this helps :)
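
The approach described in this comment can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function names, the `(y0, content)` tuples, and the label format are hypothetical, and base64 is an assumption made here because raw bytes cannot be written to JSON directly (the PR does not show how it encodes them).

```python
import base64
import json


def image_label(img_path):
    """Return a Markdown image label for a path (label format is an assumption)."""
    return f"![{img_path}]({img_path})"


def merge_image_labels(text_blocks, images):
    """Interleave image labels with text blocks by top y-coordinate.

    text_blocks: list of (y0, text) tuples for one page.
    images: list of (y0, img_path) tuples for the same page.
    """
    items = [(y0, 0, text) for y0, text in text_blocks]
    items += [(y0, 1, image_label(path)) for y0, path in images]
    # Sort by y-position; on ties, text comes before the image label.
    items.sort(key=lambda item: (item[0], item[1]))
    return [content for _, _, content in items]


def images_to_metadata(image_bytes):
    """Store {img_path: bytes} under the "image" key as base64 strings."""
    encoded = {
        path: base64.b64encode(data).decode("ascii")
        for path, data in image_bytes.items()
    }
    return {"image": encoded}


def metadata_to_images(metadata):
    """Recover {img_path: bytes} from the "image" key of the metadata."""
    return {
        path: base64.b64decode(text)
        for path, text in metadata.get("image", {}).items()
    }


if __name__ == "__main__":
    blocks = [(50.0, "Introduction"), (300.0, "Results")]
    imgs = [(120.0, "page0_img0.png")]
    print("\n\n".join(merge_image_labels(blocks, imgs)))
    print(json.dumps(images_to_metadata({"page0_img0.png": b"\x89PNG"})))
```

A converter script could then iterate over `metadata_to_images(...)` and write each value to its path in binary mode.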

VikParuchuri · 2023-12-28T02:01:38Z

@tungsten106 Thanks so much for this! It was on my list of functionality to add soon. I'll take a look next week (after the holiday).

VikParuchuri · 2024-01-02T19:32:44Z

@tungsten106 I'd love to review this, but the diffs seem to have issues (entire file is shown as deleted, with all the lines also shown as added). I'm having a hard time seeing what was changed. Do you know why this is happening with the diffs?

tungsten106 · 2024-01-03T16:23:33Z

> @tungsten106 I'd love to review this, but the diffs seem to have issues (entire file is shown as deleted, with all the lines also shown as added). I'm having a hard time seeing what was changed. Do you know why this is happening with the diffs?

It is probably caused by the end-of-line sequence setting in VS Code on Windows. I have changed it from CRLF back to LF, so the diff should work now.
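
For readers wondering why a line-ending change makes a whole file show as rewritten: a diff compares lines byte for byte, so a file saved with CRLF differs from its LF version on every line even when the visible text is identical. The sketch below only illustrates that effect; the helper names are hypothetical and are not part of this PR.

```python
def normalize_to_lf(data):
    """Convert CRLF (and any stray CR) line endings in bytes to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def count_changed_lines(old, new):
    """Count corresponding lines that differ, with line endings included."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    return sum(a != b for a, b in zip(old_lines, new_lines))


if __name__ == "__main__":
    committed = b"import os\nprint(os.getcwd())\n"
    saved_on_windows = b"import os\r\nprint(os.getcwd())\r\n"
    # Every line differs although the visible text is identical.
    print(count_changed_lines(committed, saved_on_windows))
    # After normalization, no line differs.
    print(count_changed_lines(committed, normalize_to_lf(saved_on_windows)))
```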

OmriNach · 2024-01-04T14:58:03Z

Following to know when this is implemented. With GPT4V out, the focus is on multimodal retrieval systems. Since marker outperforms most pdf readers, the addition of images would make it very valuable for general purpose pdf loading for this purpose.

morizin · 2024-01-24T13:43:19Z

> Not all pictures in a PDF can be identified (for example, the image on page 2 of Multi-column CNN), as noted by @yachty66. Technically that is not a picture, though: it is a figure built from text boxes, arrows, etc. I am unsure how to resolve this at the moment.

Why can't we do something like get the bounding box, take a screenshot of that region, and add it?
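
One way to read this suggestion, as a rough sketch: take the bounding boxes of the elements that make up the figure (text boxes, arrows, and so on), compute the rectangle that encloses them, and render only that region. The helper below is hypothetical and assumes the caller has already decided which elements belong to the figure, which is the hard part and is not solved here. With PyMuPDF, the resulting rectangle could then be passed as the `clip` argument of `Page.get_pixmap` to rasterize just that area.

```python
def union_bbox(boxes, margin=0.0):
    """Return the (x0, y0, x1, y1) rectangle enclosing all given boxes.

    boxes: iterable of (x0, y0, x1, y1) tuples in page coordinates.
    margin: optional padding added on every side.
    """
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    x0 = min(box[0] for box in boxes) - margin
    y0 = min(box[1] for box in boxes) - margin
    x1 = max(box[2] for box in boxes) + margin
    y1 = max(box[3] for box in boxes) + margin
    return (x0, y0, x1, y1)


if __name__ == "__main__":
    # Hypothetical boxes for two text labels and an arrow in one diagram.
    figure_parts = [
        (100.0, 200.0, 180.0, 230.0),
        (260.0, 200.0, 340.0, 230.0),
        (180.0, 210.0, 260.0, 220.0),
    ]
    print(union_bbox(figure_parts, margin=5.0))
```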

CBIhalsen · 2024-01-29T19:56:38Z

After images are added, it would be good to also add a translation function to the project, along with the ability to right-click an image and select GPT-4-vision to answer questions about it. That would make it a great essay tool.

catalystK · 2024-03-12T17:37:44Z

Is the image extraction feature included in the latest code? As of today, I cloned the master branch (as there is no release) and ran it,
but I couldn't get the images in the output .md file. I thought the MD file would have the images embedded in it, but I didn't find any.
Should I set a variable to extract images and embed them into the output MD file?

Is this feature upcoming?

Also, is there any way I can run this on Hugging Face and deploy it there? Could you create something similar, some remote solution?

tungsten106 added 3 commits December 25, 2023 22:35

set up environment on mac M1, cpu

3955deb

add image label (markdown format) to blocks, markers.extract_text.get…

927f4b7

…_single_page_blocks() modified

debug, test with Multi-column CNN file

4ce7ca0

tungsten106 added 3 commits January 4, 2024 00:07

change end of line sequence

b9e6e59

change end of line sequence to LF

fb3301d

remove sketch files

34bed7c

wciq1208 mentioned this pull request Jul 9, 2024

Crashed in a multi-threaded environment #225

Closed
