Added video processing section (Unit 7 - Multimodal Based Video Model) #355

1kmmk1 · 2024-10-05T05:54:02Z

Added multimodal-based-video-models.mdx to the video processing section. This document gives an overview of multimodal video architectures that integrate different modalities into a unified representation space.

Part of Proposed Outline Revision for Unit 7. Video & Video Processing #348

Who can review? (Initial)

@jungnerd @cjfghk5697 @mreraser and anyone who wants to review!

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

Co-authored-by: Jiwook Han <[email protected]>

johko

Thanks for the contribution and sorry for the late review.
It was a really nice read and I feel like I've got a better idea about multimodal video models now ;)

I mainly left formatting suggestions (mostly repetitive). Also, please make sure to add the pictures :)

johko · 2024-10-26T19:46:54Z

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

@@ -0,0 +1,120 @@
+# Multimodal Based Video Models[[mutilmodal-based-video-models]]


It is really nice that you thought about adding anchors, but they are not needed unless you want to link back to a heading from within your own file.
In general, the hf-doc-builder creates anchors for every heading automatically, so you can remove them :)
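
If it helps, here is a minimal sketch of how the explicit anchors could be stripped in bulk. It assumes the anchors always appear as a trailing `[[...]]` on a Markdown heading line, as in the hunk above. The function name `strip_heading_anchors` is hypothetical and not part of hf-doc-builder, and the sketch does not skip fenced code blocks, so review the result before committing.

```python
import re

# Matches a heading line that ends with an explicit [[anchor-id]] suffix.
# Group 1 captures the heading without the anchor.
_ANCHOR_RE = re.compile(r"^(#{1,6}\s+.*?)\s*\[\[[^\[\]]+\]\]\s*$")


def strip_heading_anchors(text: str) -> str:
    """Remove trailing [[...]] anchors from heading lines; keep other lines as-is."""
    cleaned = []
    for line in text.splitlines():
        match = _ANCHOR_RE.match(line)
        cleaned.append(match.group(1) if match else line)
    return "\n".join(cleaned)


if __name__ == "__main__":
    sample = (
        "# Multimodal Based Video Models[[mutilmodal-based-video-models]]\n"
        "\n"
        "Body text with [[brackets]] is left alone."
    )
    print(strip_heading_anchors(sample))
```

Only lines that start with `#` are touched, so double brackets in body text stay intact.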

johko · 2024-10-26T19:48:54Z

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

+5. Depth Modality: Represents the 3D spatial information of the video.
+6. Sensor Modality: In some applications, videos may include modalities like temperature or biometric data.
+
+/* Modality Overview Image */


Is there supposed to be an actual image here?

johko · 2024-10-26T19:52:31Z

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

+
+- **Overview**
+
+/* VideoBERT Overview Image */


Again the image question ;)

johko · 2024-10-26T19:53:03Z

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

+
+- **Overview**
+
+/* VATT Overview Image */ 


Image? (don't want to annoy you, just make sure you don't miss these :) )

johko · 2024-10-26T20:09:33Z

chapters/en/unit7/video-processing/multimodal-based-video-models.mdx

+
+- **Overview**
+
+/* ImageBind Overview Image */


1kmmk1

Hello, @johko! First of all, sorry for the late reply. I really appreciate your detailed review! I committed most of your suggestions, removed the anchor, and added the images.

…ls.mdx Co-authored-by: Johannes Kolbe <[email protected]>

Co-authored-by: Johannes Kolbe <[email protected]>

johko

Thanks for the changes 🙂
Now it looks good to me 👍

ATaylorAerospace

Great additions. LGTM!

1kmmk1 added 3 commits October 5, 2024 14:24

add: unit7/multimodal-based-video-models

3cd2067

edits: ImageBind

e5922dc

edit: anchor links

8c14349

1kmmk1 marked this pull request as ready for review October 5, 2024 05:59

1kmmk1 requested review from merveenoyan and johko as code owners October 5, 2024 05:59