How to Achieve Real-Time Continuous Narration as Shown in Figure 2 of the Paper? #40

hongming21 · 2024-10-13T08:55:36Z

Hi,
I am trying to replicate the continuous real-time narration of the video as shown in Figure 2 of your paper. I am using the demo code and the demo video (cooking.mp4), and at time 0, I input the query: "Describe what I am doing in real time."

However, I receive only a single response, after which inference ends. The response is also repetitive and contains errors, as shown below:

Response: (Video Time = 0s) Assistant: You adjust the camera. Assistant: You walk around the room. (Sorry, the last response is wrong) You walk around the room. (Sorry, the last response is wrong) You adjust the camera. You walk around the room. (Sorry, the last response is wrong) You walk around the room. (Sorry, the last response is wrong) You adjust the camera. You walk around the room. (Sorry, the last response is wrong)

Could you kindly provide guidance on how to implement continuous real-time narration as demonstrated in the paper?

Thank you!
