Nvidia Partners with Cornell University to Unveil AI Video Generation Model – VideoLDM

Yusuf Balogun
https://mssg.me/q19uh
Yusuf is a recent law graduate and freelance journalist with a special interest in tech reporting. He joined the tech sphere in 2019 and has written several articles. An aspiring health law expert, he believes in tech innovation and hopes to use it to help solve global health challenges.

The emergence of artificial intelligence (AI) has been one of the most significant technological advancements of the 21st century. From self-driving cars to virtual assistants and chatbots, AI has become ubiquitous in our daily lives. Its effects on many industries and on society at large are immense.

Adding to these developments, the American graphics processing unit maker Nvidia, in partnership with researchers from Cornell University, has unveiled an AI video generation model named VideoLDM. The model can generate high-resolution videos from text descriptions.

VideoLDM: Nvidia AI Video Generation Model

From a text description, the model can create videos with a resolution of up to 2048 x 1280 pixels at 24 frames per second and a maximum runtime of 4.7 seconds. The model is built on Stable Diffusion. Of its 4.1 billion parameters, only 2.7 billion were trained on video.
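To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted above and reads the frame rate as 24 frames per second. The constant names, the function name, and the choice to round the frame count to the nearest whole frame are illustrative assumptions, not details published by Nvidia.

```python
# Figures as quoted in the article; everything else here is illustrative.
WIDTH, HEIGHT = 2048, 1280        # maximum resolution in pixels
FPS = 24                          # frames per second
MAX_SECONDS = 4.7                 # maximum clip length
TOTAL_PARAMS = 4.1e9              # total parameter count
VIDEO_TRAINED_PARAMS = 2.7e9      # parameters trained on video


def clip_stats():
    """Derive a few simple quantities from the quoted figures."""
    # Assumption: round to the nearest whole frame.
    frames = round(FPS * MAX_SECONDS)
    pixels_per_frame = WIDTH * HEIGHT
    return {
        "frames": frames,
        "pixels_per_frame": pixels_per_frame,
        "pixels_per_clip": frames * pixels_per_frame,
        "video_share": VIDEO_TRAINED_PARAMS / TOTAL_PARAMS,
    }


if __name__ == "__main__":
    stats = clip_stats()
    print(f"Frames in a full-length clip: {stats['frames']}")
    print(f"Pixels per frame: {stats['pixels_per_frame']:,}")
    print(f"Pixels per clip: {stats['pixels_per_clip']:,}")
    print(f"Share of parameters trained on video: {stats['video_share']:.1%}")
```

On these assumptions, a full-length clip works out to a little over a hundred frames, and roughly two thirds of the parameters were trained on video.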

That parameter count is quite modest by the standards of modern AI. Using the Latent Diffusion Model (LDM) approach, the engineers were able to produce a wide range of high-definition videos that are both diverse and temporally consistent.

VideoLDM Features

The research team from Nvidia and Cornell University highlights two features of the model: personalized video generation and convolutional-in-time synthesis. For personalized text-to-video, the temporal layers trained in VideoLDM are inserted into image LDM backbones that were previously fine-tuned on a set of images using DreamBooth.

Slightly longer clips can be produced without noticeable quality loss by applying the learned temporal layers convolutionally in time. The model can also generate videos of driving scenes. These driving videos can last up to 5 minutes and have a resolution of 1024×512 pixels.

A specific driving scenario can be simulated by using bounding boxes to lay out a scene of interest, synthesizing a matching starting frame, and then generating plausible videos from it. The model can also generate several plausible rollouts from a single initial frame, giving multimodal predictions of how a driving scenario might unfold.

The research is being presented at the Conference on Computer Vision and Pattern Recognition (CVPR), taking place June 18-22 in Vancouver. The neural network is currently only a research project, and it is unknown when NVIDIA will make something similar available to the general public.

Source: Nvidia
