Keynote

Synthesia: From computer vision research to real-world AI avatars

Lourdes Agapito ⋅ Vittorio Ferrari

2024 Keynote

Abstract

Synthesia is one of Europe's newest billion-euro startups. Its core technology is script-to-video: realistic AI avatars delivering compelling presentations to the virtual camera. Used by more than 50,000 companies worldwide, including 400 of the Fortune 500, it is computer vision technology that operates in the real world.

Lourdes Agapito and Vittorio Ferrari will talk about the development of this technology from computer vision research papers to real-world product, and about the current and future directions of their research.

Speakers

Lourdes Agapito

Lourdes Agapito holds the position of Professor of 3D Vision at the Department of Computer Science, University College London (UCL) and is co-founder of Synthesia. Her research in computer vision has consistently focused on the inference of 3D information from single images or videos acquired from a moving camera. She received her BSc, MSc and PhD degrees from the Universidad Complutense de Madrid (Spain). In 1997 she joined the Robotics Research Group at the University of Oxford as an EU Marie Curie Postdoctoral Fellow. In 2001 she was appointed as Lecturer at Queen Mary University of London where she held an ERC Starting Grant to focus on theoretical and practical aspects of deformable 3D reconstruction from monocular sequences. In 2013 she joined the Department of Computer Science at UCL and was promoted to full professor in 2015. Lourdes has served as Program Chair for CVPR 2016 and ICCV 2023, serves regularly as Area Chair for the top Computer Vision conferences (CVPR, ICCV, ECCV) and was keynote speaker at ICRA 2017 and ICLR 2021. In 2017 she co-founded Synthesia, a recent generative AI unicorn and the world’s largest AI video generation platform that allows users to create professional videos directly in the browser, removing the physical constraints of conventional production.

Vittorio Ferrari

Vittorio Ferrari is the Director of Science at Synthesia, where he leads R&D groups developing cutting-edge generative AI technology. Previously he built and led multiple research groups on computer vision and machine learning at Google (Principal Scientist), the University of Edinburgh (Full Professor), and ETH Zurich (Assistant Professor). He has co-authored over 160 scientific papers and won the best paper award at the European Conference on Computer Vision in 2012 for his work on large-scale segmentation. He received the prestigious ERC Starting Grant, also in 2012. He led the creation of Open Images, one of the most widely adopted computer vision datasets worldwide. While at Google his groups contributed technology to several major products (with launches on, e.g., the Pixel phone, Google Photos, and Google Lens). He was a Program Chair for ECCV 2018 and a General Chair for ECCV 2020. He is an Associate Editor of IEEE Transactions on Pattern Analysis and Machine Intelligence, and formerly of the International Journal of Computer Vision. His recent research interests are in 3D Deep Learning and Vision+Language models.
