Event

Talks on Computer Vision

Location

Date

Type

Talk 1

Scale is religion?

Intelligent robots do not just respond to commands; they imagine what you meant, what you wanted, and what you believed. And they do this while learning from very little data and running on a chip in your living room. In this talk, I will present recent advances in generative modeling that aim to equip embodied agents with efficient “brains” that run faster, learn from less data, use more efficient models, and imagine possible futures under uncertainty.

Speaker

Vicky Kalogeiton

Vicky Kalogeiton is a Professor at École Polytechnique, IP Paris and an ELLIS member. Previously, she was a research fellow at the University of Oxford, and she obtained her PhD from Inria and Edinburgh University. Part of her thesis won the best poster award from the University of Grenoble Alpes, and her master's thesis won the best master's thesis award from DUTh. Since 2021, projects she supervised have received several awards, including spotlights at NeurIPS 2025 and CVPR 2024, a student honorable mention award at ACCV 2022, and the best paper award at ICCV-W 2021. She will be Program Chair for CVPR 2027 and Diversity Chair for ICCV 2025. Since 2021, she has served regularly as an Area Chair at major vision conferences (outstanding Area Chair at ACCV 2022); before that, she served as a reviewer and was recognized six times as an outstanding reviewer. Her research interests focus on generative AI using visual data, text, and audio.

Talk 2

Finding needles in a haystack 

This talk focuses on image-to-image retrieval at the finest level of granularity, i.e., the instance level, where the objective is to identify specific objects rather than broad categories. I will introduce ILIAS (CVPR 2025), a new large-scale benchmark designed to expose open challenges in this domain, such as retrieving small or heavily occluded objects within cluttered scenes. Building on the insights from ILIAS, I will present three distinct approaches that rely on local representations, each with different characteristics: (1) transformer-based architectures optimized for instance-level retrieval (AMES, ECCV 2024), (2) a lightweight, interpretable model with strong inductive biases for robust domain generalization (ELVIS, ongoing work), and (3) a training-free strategy to index multimodal language models for image similarity estimation (ongoing work). These methods reflect diverse design philosophies and trade-offs. The discussion will emphasize large-scale retrieval and, in particular, the critical balance between performance and memory efficiency.

Speaker

Giorgos Tolias

Giorgos Tolias is an Associate Professor at CTU in Prague and leads a research team within the Visual Recognition Group (VRG). He received his PhD in 2013 from NTUA, Greece, under the supervision of Yannis Avrithis and Stefanos Kollias. From 2014 to 2015, he was a postdoctoral researcher at Inria Rennes, France, working with Hervé Jégou, and later joined CTU in Prague for a postdoc with Ondřej Chum. He received a Best Science Paper Award – Honorable Mention at BMVC 2017 and a Junior Star Starting Grant (2021–2025) from the Czech Science Foundation. His research focuses on computer vision, with emphasis on visual representation learning and instance-level recognition.