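Is it possible to input an image and directly output its content and relevance (that is, without clicking)? I want to use this to generate image descriptions for easier search #13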

Open
wacdev opened this issue Apr 30, 2023 · 6 comments

Comments

@wacdev

wacdev commented Apr 30, 2023

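Is it possible to input an image and directly output its content and relevance (that is, without clicking)? I want to use this to generate image descriptions for easier search.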

@ttengwang
Owner

@wacdev Thanks for the question. We have added the "Caption everything in a paragraph" feature.

@wanghaisheng

@ttengwang Does this "Caption everything in a paragraph" feature rely on OpenAI's ChatGPT?
Can we use Bing GPT instead?

@ttengwang
Owner

Yes, a ChatGPT-like LLM is required for paragraph generation. It is fine to replace it with another GPT-style model, as long as an API is available to facilitate the integration.
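
To illustrate what "replace it with another GPT" could look like, here is a minimal sketch, not the actual Caption-Anything code. It assumes the paragraph step boils down to turning a list of region captions into one prompt and sending it to some text-generation backend. The names `TextGenerator`, `build_paragraph_prompt`, `caption_paragraph`, and `fake_backend` are hypothetical and exist only for this example; the prompt wording is likewise an assumption.

```python
from typing import Callable, Sequence

# Any function mapping a prompt string to a completion string.
# Hypothetical interface for illustration; not part of Caption-Anything.
TextGenerator = Callable[[str], str]


def build_paragraph_prompt(region_captions: Sequence[str]) -> str:
    """Join per-region captions into a single instruction for an LLM."""
    bullet_list = "\n".join(
        f"- {caption.strip()}" for caption in region_captions if caption.strip()
    )
    return (
        "Combine the following region descriptions of one image "
        "into a single coherent paragraph:\n" + bullet_list
    )


def caption_paragraph(
    region_captions: Sequence[str], generate: TextGenerator
) -> str:
    """Ask the supplied backend to turn region captions into a paragraph."""
    if not region_captions:
        raise ValueError("at least one region caption is required")
    return generate(build_paragraph_prompt(region_captions)).strip()


def fake_backend(prompt: str) -> str:
    """Offline stand-in; a real backend would call an LLM API here."""
    items = [line[2:] for line in prompt.splitlines() if line.startswith("- ")]
    return "The image shows " + ", ".join(items) + "."


if __name__ == "__main__":
    print(caption_paragraph(["a dog on grass", "a red ball"], fake_backend))
```

Because the backend is just a callable, swapping one LLM for another would mean writing a different function with the same `str -> str` shape and passing it as `generate`; nothing else in the sketch needs to change.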

@wanghaisheng

wanghaisheng commented May 18, 2023

@ttengwang
Another question:
Without a click to drive the prompt, what input does GPT consume?
Can you add an explanation to the existing click-driven image, like this one?
https://github.com/ttengwang/Caption-Anything/blob/main/assets/demo1.png
Finally, I just want to thank you for your work; it definitely gives me confidence and a great start for catching up.
Over the last 2 years I have been digging into audio description services, which I want to integrate with an affordable wearable camera to aid visually impaired people in their daily lives.

Audio description (also referred to as “description” or “video description”) is defined as “the verbal depiction of key visual elements in media and live productions.” AD is meant to provide information on visual content that is considered essential to the comprehension of the program.

@ttengwang
Owner

ttengwang commented May 18, 2023

Thank you so much for your kind words and encouragement. It truly means a lot to our team. You can check out our technical report for more details: https://arxiv.org/pdf/2305.02677.pdf

The description of "paragraph generation" is at the bottom of page 5.


@wanghaisheng

Got it, @ttengwang.


3 participants