Watch the “Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval” Session at ECCV 2022



Post author: CRiTERIA
Post published: 7 November 2022
Post category: Events / News / Publications

On 16 October 2022, Damianos Galanopoulos and Vasileios Mezaris of the Information Technologies Institute, CERTH, presented their paper “Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval” at the ECCV 2022 Workshop on AI for Creative Video Editing and Understanding.

You can now watch the recording of the paper presentation, which is available on YouTube.

“Are All Combinations Equal? Combining Textual and Visual Features with Multiple Space Learning for Text-Based Video Retrieval” by Damianos Galanopoulos and Vasileios Mezaris (Proc. ECCV 2022 Workshop on AI for Creative Video Editing and Understanding (CVEU), Oct. 2022):

In this paper we tackle the cross-modal video retrieval problem and, more specifically, we focus on text-to-video retrieval. We investigate how to optimally combine multiple diverse textual and visual features into feature pairs that lead to generating multiple joint feature spaces, which encode text-video pairs into comparable representations. To learn these representations our proposed network architecture is trained by following a multiple space learning procedure. Moreover, at the retrieval stage, we introduce additional softmax operations for revising the inferred query-video similarities. Extensive experiments in several setups based on three large-scale datasets (IACC.3, V3C1, and MSR-VTT) lead to conclusions on how to best combine text-visual features and document the performance of the proposed network.
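To make the retrieval-stage idea in the abstract more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the similarities from several joint spaces are simply summed, and that the revision step multiplies each query-video similarity by a softmax weight computed over all queries for that video. The function names (`combined_similarity`, `revise_with_softmax`), the `temperature` parameter, and the toy embeddings are hypothetical and chosen only for illustration.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def combined_similarity(query_embs, video_embs):
    """Sum dot-product similarities over all joint spaces (assumed fusion rule).

    query_embs[s][q] is the embedding of query q in space s;
    video_embs[s][v] is the embedding of video v in the same space s.
    Returns a matrix sims[q][v].
    """
    n_queries = len(query_embs[0])
    n_videos = len(video_embs[0])
    sims = [[0.0] * n_videos for _ in range(n_queries)]
    for q_space, v_space in zip(query_embs, video_embs):
        for qi, q in enumerate(q_space):
            for vi, v in enumerate(v_space):
                sims[qi][vi] += dot(q, v)
    return sims


def revise_with_softmax(sims, temperature=1.0):
    """Reweight each similarity by a softmax taken across queries, per video.

    This is an assumed form of the "additional softmax operations";
    the paper's exact formulation may differ.
    """
    n_queries, n_videos = len(sims), len(sims[0])
    revised = [[0.0] * n_videos for _ in range(n_queries)]
    for vi in range(n_videos):
        column = [sims[qi][vi] / temperature for qi in range(n_queries)]
        weights = softmax(column)
        for qi in range(n_queries):
            revised[qi][vi] = sims[qi][vi] * weights[qi]
    return revised


# Toy data: 2 joint spaces, 2 queries, 2 videos, 2-dimensional embeddings.
query_embs = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
video_embs = [[[1, 0], [0, 1]], [[0.5, 0.5], [0, 1]]]

sims = combined_similarity(query_embs, video_embs)
revised = revise_with_softmax(sims)
```

In this sketch, a video that is similar to many queries has its similarity mass spread across them, so a query only keeps a high revised score for videos that match it more strongly than they match the other queries. Note that this requires a batch of queries to be available at retrieval time.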

Download the Publication

This paper is available in our publications portal.

Tags: Video

