My approach to applying computer vision to cinema is to build a database of film shots by logging the algorithmically salient elements of each frame and finding correspondences between them.
To start, I will develop detection schemes for optical flow and SIFT feature tracking using OpenCV.
Optical Flow
Optical flow is a method for measuring movement across a sequence of images. In cinematic terms, if an actor moves from camera left to camera right over the course of a shot, optical flow will describe the amount and direction of the actor’s movement between consecutive frames of film. If the camera is static, and the actor is simply walking across the set, only the movement of the actor will be measured, since the background will not change.
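To make the idea concrete, here is a minimal sketch of motion estimation by block matching: a small patch from one frame is compared against shifted positions in the next frame, and the shift with the smallest pixel difference is taken as the movement. This is a deliberately simple stand-in, not the Lucas-Kanade or Farneback methods that OpenCV provides, and the names Frame, Motion, estimateMotion, and makeFrameWithSquare are hypothetical helpers invented for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// A grayscale frame stored row by row; pixel values range from 0 to 255.
struct Frame {
    int width;
    int height;
    std::vector<int> pixels;

    int at(int x, int y) const { return pixels[y * width + x]; }
};

// Displacement of a patch between two frames, in pixels.
struct Motion {
    int dx;
    int dy;
};

// Sum of absolute differences between the patch anchored at (x, y) in `prev`
// and the same-sized patch anchored at (x + dx, y + dy) in `next`.
long patchDifference(const Frame& prev, const Frame& next,
                     int x, int y, int dx, int dy, int patchSize) {
    long total = 0;
    for (int py = 0; py < patchSize; ++py) {
        for (int px = 0; px < patchSize; ++px) {
            total += std::abs(prev.at(x + px, y + py) -
                              next.at(x + dx + px, y + dy + py));
        }
    }
    return total;
}

// Try every shift within `radius` pixels and keep the one that matches best.
// The search starts from "no movement" and only accepts a strictly better
// shift, so a patch on an unchanged background reports zero motion.
Motion estimateMotion(const Frame& prev, const Frame& next,
                      int x, int y, int patchSize, int radius) {
    assert(prev.width == next.width && prev.height == next.height);
    assert(x >= 0 && y >= 0 &&
           x + patchSize <= prev.width && y + patchSize <= prev.height);

    Motion best{0, 0};
    long bestCost = patchDifference(prev, next, x, y, 0, 0, patchSize);
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            int nx = x + dx;
            int ny = y + dy;
            // Skip shifts that would move the patch outside the frame.
            if (nx < 0 || ny < 0 ||
                nx + patchSize > next.width || ny + patchSize > next.height) {
                continue;
            }
            long cost = patchDifference(prev, next, x, y, dx, dy, patchSize);
            if (cost < bestCost) {
                bestCost = cost;
                best = {dx, dy};
            }
        }
    }
    return best;
}

// Test helper: a black frame containing one bright square (the "actor").
Frame makeFrameWithSquare(int width, int height,
                          int squareX, int squareY, int size) {
    Frame frame{width, height, std::vector<int>(width * height, 0)};
    for (int y = squareY; y < squareY + size; ++y) {
        for (int x = squareX; x < squareX + size; ++x) {
            frame.pixels[y * width + x] = 255;
        }
    }
    return frame;
}
```

Calling estimateMotion on a patch that covers the bright square reports how far the square moved between the two frames, while a patch on the black background reports no movement, which mirrors the static-camera case described above.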
SIFT
SIFT feature tracking is a patented algorithm developed by David Lowe [1]. At its core, it is a very robust method for finding parts of an image within other images. For instance, it could be used to detect a prop that appears in two different shots. The SIFT acronym stands for Scale-Invariant Feature Transform; essentially, this means that the algorithm is designed to tolerate differences in scale and in-plane rotation between the two images, and to remain fairly robust to changes in lighting, partial occlusion, and moderate 3D rotation. For instance, if in one shot we see a close-up of a gun on a table, we should be able to use SIFT to find that same gun in the background of a wide shot, even though the gun is now rotated, poorly lit, and partially obscured by an actor’s hand.
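The matching step can be sketched without the rest of the algorithm. SIFT summarizes each feature as a descriptor vector (128 numbers in Lowe's formulation), and two features are paired when one descriptor is clearly the nearest neighbor of the other. The sketch below assumes the descriptors have already been computed and only shows nearest-neighbor matching with Lowe's ratio test; Descriptor, squaredDistance, and matchDescriptors are hypothetical names, and the 0.8 threshold is the value suggested in Lowe's paper.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One feature descriptor. Real SIFT descriptors have 128 entries; shorter
// vectors are used here only to keep the illustration readable.
using Descriptor = std::vector<float>;

// Squared Euclidean distance between two descriptors of equal length.
float squaredDistance(const Descriptor& a, const Descriptor& b) {
    assert(a.size() == b.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// For each descriptor in `query`, returns the index of its match in `train`,
// or -1 when the best candidate is not clearly better than the runner-up.
std::vector<int> matchDescriptors(const std::vector<Descriptor>& query,
                                  const std::vector<Descriptor>& train,
                                  float ratio = 0.8f) {
    std::vector<int> matches(query.size(), -1);
    // The ratio test needs a best and a second-best candidate.
    if (train.size() < 2) {
        return matches;
    }
    for (std::size_t i = 0; i < query.size(); ++i) {
        float best = std::numeric_limits<float>::infinity();
        float second = std::numeric_limits<float>::infinity();
        int bestIndex = -1;
        for (std::size_t j = 0; j < train.size(); ++j) {
            float d = squaredDistance(query[i], train[j]);
            if (d < best) {
                second = best;
                best = d;
                bestIndex = static_cast<int>(j);
            } else if (d < second) {
                second = d;
            }
        }
        // Distances are squared, so the ratio is squared as well.
        if (best < ratio * ratio * second) {
            matches[i] = bestIndex;
        }
    }
    return matches;
}
```

The ratio test is what keeps ambiguous features from producing false matches: a descriptor that sits almost equally close to two candidates is discarded rather than assigned to either one.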
Purpose
The reason for analyzing optical flow in films is that movement has been the basis of cinema since the early chronophotography of Eadweard Muybridge and Étienne-Jules Marey. In narrative filmmaking, continuity editing makes explicit use of movement to suture shots through match-on-action cuts. One use of optical flow would be to find similar movements across a database of film shots.
Using SIFT to analyze films also addresses the narrative continuity of filmmaking, which largely relies on the repetition of visual elements. For instance, we could compile a database of props and scenery within a film and then use that data to map the repetitive use of those elements throughout the narrative.
Features
Both optical flow and SIFT rely on different forms of feature tracking to achieve their results. A feature in computer vision is simply a specific part of an image that carries enough unique information for an algorithm to analyze, and in our case, to recognize its reappearance in sequential frames or different shots. Optical flow has been developed to track the frame-by-frame movement of features within a shot, whereas SIFT was developed to recognize a pattern of features that repeats between shots.
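One way to see what "enough unique information" means is to score a small window by how strongly its brightness changes in two independent directions. The sketch below uses the Shi-Tomasi corner measure (the smaller eigenvalue of the window's gradient matrix) as an illustration; this is not how SIFT selects its features, and GrayImage, featureScore, and makeQuadrantImage are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// A grayscale image stored row by row.
struct GrayImage {
    int width;
    int height;
    std::vector<double> pixels;

    double at(int x, int y) const { return pixels[y * width + x]; }
};

// Shi-Tomasi score for the 3x3 window centered at (cx, cy): the smaller
// eigenvalue of the 2x2 matrix built from summed gradient products.
// Flat areas and straight edges score near zero; corners score high.
double featureScore(const GrayImage& img, int cx, int cy) {
    // The window needs a one-pixel border for the central differences.
    assert(cx >= 2 && cy >= 2 &&
           cx <= img.width - 3 && cy <= img.height - 3);

    double sxx = 0.0;
    double sxy = 0.0;
    double syy = 0.0;
    for (int y = cy - 1; y <= cy + 1; ++y) {
        for (int x = cx - 1; x <= cx + 1; ++x) {
            double ix = (img.at(x + 1, y) - img.at(x - 1, y)) / 2.0;
            double iy = (img.at(x, y + 1) - img.at(x, y - 1)) / 2.0;
            sxx += ix * ix;
            sxy += ix * iy;
            syy += iy * iy;
        }
    }
    double mean = (sxx + syy) / 2.0;
    double diff = (sxx - syy) / 2.0;
    return mean - std::sqrt(diff * diff + sxy * sxy);
}

// Test helper: a black square image whose lower-right quadrant is bright,
// giving one corner, two straight edges, and large flat areas.
GrayImage makeQuadrantImage(int size) {
    GrayImage img{size, size, std::vector<double>(size * size, 0.0)};
    for (int y = size / 2; y < size; ++y) {
        for (int x = size / 2; x < size; ++x) {
            img.pixels[y * size + x] = 255.0;
        }
    }
    return img;
}
```

On the quadrant test image, the window over the corner scores far higher than windows over a flat area or along a straight edge, which is why corner-like regions make dependable features: they can be located unambiguously in both directions.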
Next
In the next post, I will document my attempt to implement SIFT feature tracking in C++ using openFrameworks and the ofxOpenCv and ofxCv addons.