Towards Real-Time Text2Video via CLIP-Guided, Pixel-Level Optimization

Peter Schaldenbrand, Zhixuan Liu, and Jean Oh. The Robotics Institute, Carnegie Mellon University

This repository presents an approach to generating videos from a series of language descriptions of the video. We currently only have a Colab implementation, which is linked above.

Please message Peter at pschalde at andrew dot cmu dot edu with any questions or make a GitHub issue. Thanks!

pschaldenbrand/Text2Video
