Nvidia and the MIT Computer Science & Artificial Intelligence Laboratory (CSAIL) have open-sourced their video-to-video synthesis model. The method uses a generative adversarial learning framework to generate high-resolution, photorealistic and temporally coherent videos from various input formats, including segmentation masks, sketches and poses.
Video-to-video synthesis has received less research attention than image-to-image translation. It aims to solve the low visual quality and temporal incoherence of video results produced by existing image synthesis approaches. The research group proposed a novel video-to-video synthesis approach capable of synthesizing 2K-resolution videos of street scenes up to 30 seconds long.
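To make "temporal incoherence" concrete, the sketch below scores how much pixel values change between consecutive frames. It is only an illustration: the function name `mean_frame_difference` and the toy 2x2 grayscale frames are assumptions made for this example, and this is not the evaluation metric used by the authors. Raw frame differences also grow with genuine motion, so a measure like this is only a rough proxy for flicker.

```python
def mean_frame_difference(frames):
    """Average absolute per-pixel change between consecutive frames.

    `frames` is a list of 2D lists of grayscale values (hypothetical toy
    format used only for this illustration).
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    total = 0.0
    count = 0
    # Compare each frame with the one that follows it.
    for prev, curr in zip(frames, frames[1:]):
        for row_prev, row_curr in zip(prev, curr):
            for a, b in zip(row_prev, row_curr):
                total += abs(a - b)
                count += 1
    return total / count


# A static scene whose brightness drifts slowly from frame to frame.
steady = [[[10, 10], [10, 10]], [[11, 11], [11, 11]], [[12, 12], [12, 12]]]
# The same static scene, but the middle frame was rendered very differently.
flicker = [[[10, 10], [10, 10]], [[60, 60], [60, 60]], [[10, 10], [10, 10]]]

print(mean_frame_difference(steady))
print(mean_frame_difference(flicker))
```

The flickering sequence scores far higher than the steady one, which is the kind of frame-to-frame inconsistency that synthesizing each frame independently can introduce.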
The authors performed extensive experimental validation on various datasets, and the model outperformed existing approaches both quantitatively and qualitatively. When the method was extended to multimodal video synthesis, it produced videos with different visual properties from identical input data while maintaining high resolution and coherence.
The researchers suggested the model may be improved in the future by adding 3D cues such as depth maps to better synthesize turning cars; using object tracking to ensure an object maintains its colour and appearance throughout the video; and training with coarser semantic labels to solve issues in semantic manipulation.
The Video-to-Video Synthesis paper is on arXiv, and the team’s model and data can be found on the GitHub page.