Professor Nathan Jacobs
Washington University in St. Louis, USA
Talk Title: Learning to Map Anything, Anywhere, Anytime
Abstract: What might it sound like here? How would you describe this place? Would it be unusual to see a large mammal if I took an early morning walk? These are all questions that are inherently spatial in nature and difficult to answer precisely. This talk explores a new approach to multi-modal remote sensing that shows how we might build a system that supports answering such questions at a global scale, enabling us to understand the Earth with a level of semantic, spatial, and temporal resolution that was previously impossible.
Professor Yaser Sheikh
Director of Meta Reality Labs, Pittsburgh, USA
Talk Title: Photorealistic Telepresence
Abstract: Telepresence has the potential to bring billions of people into augmented, mixed, and virtual reality (AR/MR/VR). It is the next step in the evolution of telecommunication, from telegraphy to telephony to videoconferencing. In this talk, I will describe early steps taken at Meta Reality Labs Pittsburgh towards achieving photorealistic telepresence: real-time social interactions in AR/VR with avatars that look like you, move like you, and sound like you. If successful, photorealistic telepresence will create pressure for the concurrent development of the next generation of algorithms and computing platforms for computer vision and computer graphics. In particular, I will introduce codec avatars: the use of neural networks to unify the computer vision (inference) and computer graphics (rendering) problems in signal transmission and reception. The creation of codec avatars requires capture systems of unprecedented 3D sensing resolution, which I will also describe.
Bio: Yaser Sheikh is the Vice President and founding director of Meta Reality Labs in Pittsburgh, devoted to achieving photorealistic social interactions in augmented and virtual reality. He is a consulting professor at the Robotics Institute, Carnegie Mellon University, where he directed the Perceptual Computing Lab, which produced OpenPose and the Panoptic Studio. His research broadly focuses on machine perception and rendering of social behavior, spanning sub-disciplines in computer vision, computer graphics, and machine learning. He has served as an associate editor for the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) and has regularly served as a senior program committee member for SIGGRAPH, CVPR, and ICCV. His research has been featured by various news and media outlets including The New York Times, BBC, CBS, WIRED, and The Verge. With colleagues and students, he has won the Hillman Fellowship (2004), the Honda Initiation Award (2010), Popular Science's "Best of What's New" Award (2014), as well as several conference best paper and demo awards (CVPR, ECCV, WACV, ICML).
Prof. Simon Lucey
Director of AIML, Professor SCS, University of Adelaide, Australia
Talk Title: The Rise of Neural Priors
Abstract: The performance of an AI is nearly always associated with the amount of data you have at your disposal. Self-supervised machine learning can help – mitigating tedious human supervision – but the need for massive training datasets in modern AI seems unquenchable. Sometimes it is not the amount of data, but the mismatch of statistics between the train and test sets – commonly referred to as bias – that limits the utility of an AI. In this talk I will explore a new direction based on the concept of a "neural prior" that relies on no training dataset whatsoever. A neural prior speaks to the remarkable ability of neural networks to both memorise training examples and generalise to unseen testing examples. Though never explicitly enforced, the chosen architecture of a neural network applies an implicit neural prior to regularise its predictions. It is this property we will leverage for problems that historically suffer from a paucity of training data or out-of-distribution bias. We will demonstrate the practical application of neural priors to augmented reality, autonomous driving, and noisy signal recovery – with many of these outputs already being taken up in industry.
Bio: Simon Lucey, Ph.D., is the Director of the Australian Institute for Machine Learning (AIML) and a professor in the School of Computer Science (SCS) at the University of Adelaide. Prior to this he was an associate research professor at Carnegie Mellon University's Robotics Institute (RI) in Pittsburgh, USA, where he spent over 10 years as an academic. He was also Principal Research Scientist at the autonomous vehicle company Argo AI from 2017 to 2022. He has received various career awards, including an Australian Research Council Future Fellowship (2009-2013). Simon's research interests span computer vision, machine learning, and robotics. He enjoys drawing inspiration from AI researchers of the past in attempting to unlock the computational and mathematical models that underlie the processes of visual perception.
Winthrop Prof. Mohammed Bennamoun
The University of Western Australia, Australia
Talk Title: 3D Vision for Intelligent Robots
Abstract: In structured settings like industrial environments, robotic technology has exhibited remarkable efficiency. However, its deployment in dynamic and less predictable environments, such as domestic settings, remains a challenge. Robots often surpass human abilities in areas like agility, power, and precision. Yet, they still encounter difficulties in tasks like object and person identification, linguistic interpretation, manual dexterity, and social interaction and understanding. The quest for computer vision systems mirroring human visual abilities has been arduous. Two primary obstacles have been: (i) the absence of 3D sensors that can parallel the human eye's capability to concurrently record visual attributes (e.g., colour and texture) and the dynamic surface shapes of objects, and (ii) the lack of real-time data processing algorithms. However, with the recent emergence of cost-effective 3D sensors, there has been a surge in the creation of functional 3D systems. These span from 3D biometric systems, e.g., for face recognition, to home robotic systems that assist the elderly with mild cognitive impairment.
The objective of the talk is to describe a few 3D computer vision projects and tools used towards the development of a platform for assistive robotics in messy living environments. Various systems, along with their applications and motivations, will be described, including 3D object recognition, 3D face/ear biometrics, grasping of unknown objects, and systems to estimate the 3D pose of a person.
Bio: Mohammed Bennamoun is a Winthrop Professor in the Department of Computer Science and Software Engineering at the University of Western Australia (UWA) and is a researcher in computer vision, machine/deep learning, robotics, and signal/speech processing. He has published 4 books (available on Amazon), 1 edited book, 1 encyclopedia article, 14 book chapters, 220+ journal papers, 290+ conference publications, and 16 invited and keynote publications. His h-index is 73 and his number of citations is 26,200+ (Google Scholar). He has been awarded 82+ competitive research grants from the Australian Research Council and numerous other government, UWA, and industry sources. He has successfully supervised 40+ PhD students to completion. He won the Best Supervisor of the Year Award at Queensland University of Technology (1998), received awards for research supervision at UWA (2008 and 2016), and received the Vice-Chancellor's Award for mentorship (2016). He has delivered conference tutorials at major conferences, including IEEE CVPR 2016, Interspeech 2014, IEEE ICASSP, and ECCV. He was also invited to give a tutorial at the International Summer School on Deep Learning (DeepLearn 2017).