Content-based Image and Video Analysis

Several connected neurons suspended in a grey void. — Photo: Colourbox.de

When will search engines for multimedia data match the quality of text search on the WWW? What is needed to get there? How well does searching in images and videos already work today? How is a face or superimposed text in an image recognised automatically? Which basic methods of image/video processing and machine learning does this require? The aim of the lecture is to answer these questions.

Automatic content-based analysis of images, sound and video is the prerequisite for successfully searching for information in large image and video databases (keyword: multimedia information retrieval). The lecture gives an overview of this exciting field, which has been intensively researched for a little more than a decade.

The following topics will be covered:

Basics of image and video processing
Machine learning
Basics of deep neural networks (CNN, LSTM)
Cut detection
Image recognition
Similarity search
Image segmentation
Person recognition
Text spotting