Computer Vision
Spring 2011


Lecture slides* (Szeliski readings and supplementary material are listed under each lecture)

Introduction (pdf)

Filters and scale (pdf)
  Szeliski: Ch. 1, 3.2, 3.5; Ch. 3.5.4 (optional)
  - Steerable filters [Freeman & Adelson], Sect. 1-2
  - Image pyramids [Adelson & Anderson] (optional)

Image features (pdf)
  Szeliski: Ch. 4.1
  - SIFT [Lowe], Sect. 1-6.2 (optional)
  - Multi-image matching [Brown] (optional)

Edges and lines (pdf)
  Szeliski: Ch. 4.2-4.3
  - Edge detection [Cipolla & Gee]

Introduction to cameras and projection (pdf)
  Szeliski: Ch. 2.1
  - Hartley & Zisserman note, p. 1-39 (optional)
  - Soft introduction [Hartley & Zisserman] (optional)

Motion and warping (pdf)
  Szeliski: Ch. 8.1-8.3, 3.6
  - Creating panoramas [Szeliski & Shum] (optional)

Optical flow (pdf)
  Szeliski: Ch. 8.4
  - Optical flow [Wikipedia] (optional)
  - Optical flow [Fleet & Weiss] (optional)

Segmentation (pdf)
  Szeliski: Ch. 5.2-5.4

Segmentation II (pdf)
  Szeliski: Ch. 5.1
  - Active Appearance Models, Sect. 1-2.3
  - A tutorial on PCA (optional)

Camera calibration and projective geometry (pdf)
  Szeliski: Ch. 6.1-6.3 (optional)
  - Projective Geometry [Mundy & Zisserman] (read 23.1-23.5, 23.10)
  - Camera calibration from vanishing points [Cipolla, Drummond & Robertson] (optional)

Structure from motion and stereo vision (pdf)
  Szeliski: Ch. 7.1-7.2, 11.1-11.2
  - Projective Geometry [Mundy & Zisserman] (23.11) (optional)
  - Hartley & Zisserman note, p. 1-39 (optional)

Light, color and 3D modeling (pdf)
  Szeliski: Ch. 2.2, 12.1-12.2, 12.7

Recognition
  Szeliski: Ch. 14.1-14.2 (optional)

*Note that the lecture slides are included in the curriculum.

After taking the computer vision course you should be able to…

  • Filters (Szeliski 3.2)

    • Define correlation and convolution

    • Mention that convolution is associative and commutative (why is that useful?)

    • Explain the principles of superposition and shift invariance

    • Define and motivate the use of separable filters

    • Define basic filter kernels (e.g., Gaussian, Laplacian, Laplacian of Gaussian, Sobel)

    • Mention typical applications of basic filters (e.g., noise removal, low- and high-pass filtering, image differentiation, and edge detection)

    • Explain the difference between a “steerable filter” and a “basis filter”

    • Define the smoothed directional derivative filter (i.e., the steerable version of the derivative of Gaussian filter)

    • Explain and motivate the use of “template matching”
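
To make the convolution objectives concrete, here is a minimal, hypothetical Python sketch (not part of the curriculum) of 1-D discrete convolution. It illustrates commutativity; associativity is what lets you pre-combine a chain of filters into one kernel, and separability lets an n-by-n 2-D filter be applied as two 1-D passes (2n instead of n^2 multiplies per pixel).

```python
def conv1d(f, g):
    """Full discrete convolution of two 1-D sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

signal = [0.0, 1.0, 2.0, 1.0, 0.0]
box = [1 / 3, 1 / 3, 1 / 3]          # simple 3-tap averaging (low-pass) filter

a = conv1d(signal, box)
b = conv1d(box, signal)
# convolution is commutative: f * g == g * f (up to floating-point rounding)
commutative = all(abs(x - y) < 1e-12 for x, y in zip(a, b))
```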

  • Scale (Szeliski 3.5)

    • Explain the role of an interpolation kernel

    • Explain the role of a smoothing kernel (in the context of decimation)

    • Explain how to construct a Gaussian pyramid

    • Explain how to construct a Laplacian pyramid

    • Mention typical applications of image pyramids
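
As an illustration of pyramid construction (a hypothetical sketch on a 1-D signal; real pyramids do the same per row and column), each Gaussian level is a blur-then-decimate of the previous one; a Laplacian level would be the difference between a Gaussian level and the upsampled next-coarser level.

```python
def blur(s):
    # 3-tap binomial smoothing kernel [1/4, 1/2, 1/4] with edge replication
    padded = [s[0]] + s + [s[-1]]
    return [0.25 * padded[i - 1] + 0.5 * padded[i] + 0.25 * padded[i + 1]
            for i in range(1, len(padded) - 1)]

def pyramid(s, n_levels):
    """Gaussian pyramid: repeatedly smooth, then keep every other sample."""
    out = [s]
    for _ in range(n_levels - 1):
        s = blur(s)[::2]          # smoothing before decimation avoids aliasing
        out.append(s)
    return out

levels = pyramid([float(i % 4) for i in range(16)], 3)
# level lengths halve each time: 16, 8, 4
```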

  • Image features (Szeliski 4.1)

    • Explain what an “interest point” is, and what such points can be used for

    • Describe different strategies for “feature detection”, particularly how to detect corners in an image (e.g., using the Harris corner detector)

    • Explain the terms “scale invariance” and “rotational invariance”

    • Describe strategies for obtaining scale invariance and rotational invariance (e.g., automatic scale selection)

    • Describe different strategies for “feature description” (e.g., SIFT)

    • Describe the overall principle of “feature matching”

    • Describe the overall principle of RANSAC (also relevant for motion estimation)

    • Explain the role of “receiver operating characteristic” (ROC) curves
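
For the Harris detector, a minimal hypothetical sketch: the response R = det(M) - k * trace(M)^2 is computed from the structure tensor M, the windowed sum of outer products of image gradients. At a corner both eigenvalues of M are large (R > 0); on an edge only one is (R < 0).

```python
def harris_response(img, x, y, win=1, k=0.04):
    """Harris corner response at (x, y) from a (2*win+1)^2 gradient window."""
    sxx = sxy = syy = 0.0
    for j in range(y - win, y + win + 1):
        for i in range(x - win, x + win + 1):
            ix = (img[j][i + 1] - img[j][i - 1]) / 2.0   # central differences
            iy = (img[j + 1][i] - img[j - 1][i]) / 2.0
            sxx += ix * ix; sxy += ix * iy; syy += iy * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# a bright square on a dark background
img = [[0.0] * 8 for _ in range(8)]
for j in range(3, 8):
    for i in range(3, 8):
        img[j][i] = 1.0

corner = harris_response(img, 3, 3)   # at the square's corner: positive R
edge = harris_response(img, 5, 3)     # on the square's edge: negative R
```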

  • Edges and lines (Szeliski 4.2-4.3)

    • Explain how to detect edges using the gradient operator

    • Explain how to detect edges using the Laplacian of Gaussian operator

    • Explain how the Canny operator works (most importantly, you should be able to describe what is meant by non-maximum suppression and hysteresis)

    • Describe different strategies for linking edges

    • Explain the basic principle of successive approximation of lines/curves

    • Explain the basic principle of the Hough transform, and why this representation can be useful when lines in the real world are broken up into pieces
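
A minimal, hypothetical sketch of the Hough transform for lines: each edge point votes for every (rho, theta) line through it, so collinear points accumulate in one cell of the accumulator even when the line is broken into disjoint segments.

```python
import math

def hough_lines(points, n_theta=180, rho_res=1.0):
    """Accumulate votes in (rho, theta) space; theta is indexed in degrees."""
    acc = {}
    for (x, y) in points:
        for t in range(n_theta):
            theta = t * math.pi / n_theta
            rho = round((x * math.cos(theta) + y * math.sin(theta)) / rho_res)
            acc[(rho, t)] = acc.get((rho, t), 0) + 1
    return acc

# two disjoint segments of the same line y = x (rho = 0, theta = 135 degrees)
pts = [(i, i) for i in range(5)] + [(i, i) for i in range(20, 25)]
acc = hough_lines(pts)
best = max(acc, key=acc.get)          # the shared line wins despite the gap
```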

  • Cameras and projection (Szeliski 2.1)

    • You should be familiar with the geometric primitives described in section 2.1.1, as well as the basic spatial transformations (also called motion models) described in sections 2.1.2 and 2.1.3.

    • Define and explain the difference between orthographic and perspective projection

    • Describe the geometry of the camera model/pinhole camera (e.g., center of projection, projection plane, principal point, focal length, etc.)

    • Describe the role of the calibration matrix (K) and the camera intrinsics

    • Describe the camera matrix (P) and the role of the camera extrinsics

    • Define what is meant by a homography or projective transformation (e.g., the mapping between two images of a planar scene, or between any two images taken from the same center of projection)

    • Mention at least one strategy for calibrating a camera and more generally how to solve for an unknown homography
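
A minimal hypothetical sketch of the pinhole projection x = K [R | t] X in homogeneous coordinates: rotate and translate the world point into camera coordinates, apply the calibration matrix K (intrinsics), then perform the perspective divide.

```python
def project(K, R, t, X):
    """Project a 3-D point X through a pinhole camera with intrinsics K."""
    # extrinsics: camera coordinates Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # intrinsics: homogeneous image coordinates x = K Xc
    x = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (x[0] / x[2], x[1] / x[2])          # perspective divide

f, cx, cy = 500.0, 320.0, 240.0                # focal length and principal point
K = [[f, 0, cx], [0, f, cy], [0, 0, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]         # camera at origin, looking down +z
u, v = project(K, I3, [0.0, 0.0, 0.0], [0.1, 0.2, 1.0])
# the point lands at (cx + f*0.1, cy + f*0.2) = (370, 340)
```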

  • Motion and optical flow (Szeliski 8.1-8.4)

    • Explain the basic principles of translational alignment, including different error metrics.

    • Motivate and explain the basic concept of hierarchical motion estimation

    • Know that translation in the image domain corresponds to a linear phase shift in the Fourier (or frequency) domain

    • Explain the basic concept of incremental refinement (Ch. 8.1.3)

    • Describe the aperture problem

    • Motivate and explain the basic concept of parametric motion models and spline-based motion

    • Define and explain what the “brightness constancy assumption” is

    • Describe at least one strategy for estimating optical flow

    • Describe the overall principle of RANSAC (also relevant for image matching)
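
A minimal hypothetical sketch of Lucas-Kanade flow at a single pixel: each pixel in a window contributes one brightness-constancy constraint Ix*u + Iy*v + It = 0, and the flow (u, v) is the least-squares solution of the resulting 2x2 normal equations. A singular (or near-singular) system is exactly the aperture problem.

```python
def lucas_kanade(img0, img1, x, y, win=1):
    """Estimate flow (u, v) at (x, y) from a (2*win+1)^2 window."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for j in range(y - win, y + win + 1):
        for i in range(x - win, x + win + 1):
            ix = (img0[j][i + 1] - img0[j][i - 1]) / 2.0   # spatial gradients
            iy = (img0[j + 1][i] - img0[j - 1][i]) / 2.0
            it = img1[j][i] - img0[j][i]                   # temporal difference
            a11 += ix * ix; a12 += ix * iy; a22 += iy * iy
            b1 -= ix * it;  b2 -= iy * it
    det = a11 * a22 - a12 * a12      # det near 0 signals the aperture problem
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# a quadratic bowl image, translated right by exactly one pixel
img0 = [[(i - 3) ** 2 + (j - 3) ** 2 for i in range(8)] for j in range(8)]
img1 = [[(i - 4) ** 2 + (j - 3) ** 2 for i in range(8)] for j in range(8)]
u, v = lucas_kanade(img0, img1, 3, 3)          # recovers (1, 0)
```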

  • Warping (Szeliski 3.6)

    • Explain the basic concept of forward and inverse warping (advantages/disadvantages)

    • Describe the main types of parametric transformation (e.g., translation and rigid/Euclidean)

    • Know the basic concept of mesh-based warping
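
A minimal hypothetical sketch of inverse warping: for every output pixel, map backwards through the inverse transform and resample the source with bilinear interpolation. Unlike forward warping, this leaves no holes in the output.

```python
import math

def inverse_warp(src, inv_map, out_h, out_w):
    """Resample src into an out_h-by-out_w image via the inverse mapping."""
    h, w = len(src), len(src[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            sx, sy = inv_map(x, y)            # where did this pixel come from?
            x0, y0 = math.floor(sx), math.floor(sy)
            if 0 <= x0 < w - 1 and 0 <= y0 < h - 1:
                ax, ay = sx - x0, sy - y0     # bilinear interpolation weights
                out[y][x] = ((1 - ax) * (1 - ay) * src[y0][x0]
                             + ax * (1 - ay) * src[y0][x0 + 1]
                             + (1 - ax) * ay * src[y0 + 1][x0]
                             + ax * ay * src[y0 + 1][x0 + 1])
    return out

src = [[float(x) for x in range(4)] for _ in range(4)]   # horizontal ramp
# translate the image right by half a pixel: the inverse map subtracts 0.5
out = inverse_warp(src, lambda x, y: (x - 0.5, y), 4, 4)
```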

  • Segmentation (Szeliski 5.1-5.4 + supplemental material)

    • Explain the basic concept of snakes (active contour models)

    • Explain the main differences between “level sets” and “snakes”

    • Explain the difference between divisive clustering and agglomerative clustering

    • Describe the K-means algorithm

    • Describe the “mean shift” algorithm

    • Explain the basic idea of segmentation using “normalized cuts”

    • Explain the basic idea of “active appearance models” (see supplemental material)
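
A minimal hypothetical sketch of K-means on scalar values (for image segmentation the samples would be pixel colors, or color-plus-position vectors): alternate between assigning each sample to its nearest center and moving each center to the mean of its assigned samples.

```python
def kmeans_1d(xs, centers, iters=10):
    """Lloyd's algorithm on scalars; returns the converged centers."""
    for _ in range(iters):
        # assignment step: each sample goes to its nearest center
        groups = [[] for _ in centers]
        for x in xs:
            k = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
            groups[k].append(x)
        # update step: each center moves to the mean of its samples
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

xs = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]           # two well-separated clusters
centers = kmeans_1d(xs, [0.0, 1.0])            # converges to the two means
```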

  • Camera calibration and projective geometry (Projective Geometry [Mundy & Zisserman])

    • Explain the geometry under perspective viewing (e.g., view point, rays, vanishing points, and horizon)

    • Mention examples of properties/structures that do, and do not, remain invariant under perspective viewing (e.g., line intersection, the cross-ratio, and circles)

    • Explain the difference between perspective transformation and projective transformation

    • Explain the basic elements of projective geometry (i.e., the projective plane, projective lines, point and line duality, ideal points and lines)

    • Write down the projective transformation in both homogeneous and Cartesian coordinates

    • Explain how to estimate a projective transformation from four points

    • Define the cross-ratio and mention examples of what it can be used for
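
A minimal hypothetical sketch of the key invariant: the cross-ratio of four collinear points is unchanged by any 1-D projective transformation x -> (a x + b) / (c x + d), which is why it can be used to recover measurements from perspective images.

```python
def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points (one common convention)."""
    return ((a - c) * (b - d)) / ((a - d) * (b - c))

def projective(x, h11=2.0, h12=1.0, h21=0.5, h22=3.0):
    # a 1-D projective map with arbitrarily chosen (hypothetical) coefficients
    return (h11 * x + h12) / (h21 * x + h22)

pts = [0.0, 1.0, 2.0, 4.0]
before = cross_ratio(*pts)
after = cross_ratio(*[projective(x) for x in pts])
# before == after (up to floating-point rounding): the cross-ratio is invariant
```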

  • Structure from motion (Szeliski 7.1-7.2)

    • Explain what is meant by triangulation

    • Explain the epipolar geometry for two camera views (e.g., epipoles, epipolar lines, essential matrix, fundamental matrix)

    • Define the SFM problem and mention strategies for solving it (e.g., singular value decomposition)
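
A minimal hypothetical sketch of the epipolar constraint for two views: with relative rotation R and translation t between the cameras, the essential matrix is E = [t]_x R, and corresponding normalized image points satisfy x2^T E x1 = 0 (note that E is only defined up to scale, so the sign convention does not affect the constraint).

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]_x such that [t]_x v = t x v."""
    tx, ty, tz = t
    return [[0, -tz, ty], [tz, 0, -tx], [-ty, tx, 0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# camera 2 is translated by t relative to camera 1 (R = identity here)
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [1.0, 0.0, 0.0]
E = matmul(cross_matrix(t), R)

X = [0.3, -0.2, 2.0]                         # a 3-D point in camera-1 frame
x1 = [X[0] / X[2], X[1] / X[2], 1.0]         # normalized image coords, view 1
Xc2 = [X[0] - t[0], X[1] - t[1], X[2] - t[2]]
x2 = [Xc2[0] / Xc2[2], Xc2[1] / Xc2[2], 1.0]
residual = sum(x2[i] * sum(E[i][j] * x1[j] for j in range(3)) for i in range(3))
# residual is 0 (up to rounding): the two rays satisfy the epipolar constraint
```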

  • Stereo (Szeliski 11.1-11.2)

    • Define the terms “disparity” and “depth”

    • Explain what is meant by image rectification and why this is useful

    • Describe the basic principle of the “plane sweep” method

    • Mention at least one strategy for matching two stereo images
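
Two minimal hypothetical sketches for the stereo objectives: winner-take-all block matching along a rectified scanline (one simple matching strategy, minimizing the sum of absolute differences), and the disparity-to-depth relation Z = f * B / d.

```python
def best_disparity(left, right, x, max_d, win=1):
    """SAD block matching on one scanline: pick the disparity with lowest cost."""
    def sad(d):
        return sum(abs(left[x + k] - right[x + k - d])
                   for k in range(-win, win + 1))
    return min(range(max_d + 1), key=sad)

def depth_from_disparity(f, baseline, d):
    # rectified stereo: disparity d = xl - xr, depth Z = f * B / d
    return f * baseline / d

left = [0, 0, 1, 5, 1, 0, 0, 0, 0, 0]
right = [left[i + 2] for i in range(len(left) - 2)]   # true disparity is 2
d = best_disparity(left, right, 3, 2)
Z = depth_from_disparity(500.0, 0.1, d)               # f = 500 px, B = 0.1 m
```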

  • Light and color (Szeliski 2.2)

    • Explain what is meant by an “environment map” (L)

    • Explain what is meant by foreshortening

    • Describe the Bidirectional Reflectance Distribution Function

    • Mention examples of how to calculate the light exiting a surface point p under a given lighting condition (diffuse reflection, specular reflection, and Phong shading)
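
A minimal hypothetical sketch of diffuse (Lambertian) reflection: the exiting intensity is I = albedo * max(0, n . l), and the cosine term is exactly the foreshortening factor. (A Phong model would add a specular term depending on the viewing direction.)

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lambert(albedo, n, l):
    """Diffuse shading: albedo times the clamped cosine of the light angle."""
    return albedo * max(0.0, dot(normalize(n), normalize(l)))

# light from straight above a horizontal surface: full intensity (0.8)
head_on = lambert(0.8, [0, 0, 1], [0, 0, 1])
# light at 60 degrees from the normal: cos(60) = 0.5 foreshortening (0.4)
oblique = lambert(0.8, [0, 0, 1], [0, math.sqrt(3), 1])
```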

  • 3D reconstruction (Szeliski 12.1-12.2, 12.7)

    • Explain what is meant by “Shape from X”

    • Explain how photometric stereo works
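
A minimal hypothetical sketch of photometric stereo at one pixel: with three known, non-coplanar (unit) light directions and Lambertian reflectance, the observed intensities I_k = albedo * (n . l_k) form a 3x3 linear system L g = I whose solution g = albedo * n yields both albedo (|g|) and the surface normal (g / |g|). This assumes every light actually illuminates the point, so no clamping is needed.

```python
import math

def solve3(L, I):
    """Solve the 3x3 linear system L g = I by Cramer's rule."""
    def det(M):
        return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
                - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
                + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
    D = det(L)
    g = []
    for c in range(3):
        M = [row[:] for row in L]
        for r in range(3):
            M[r][c] = I[r]
        g.append(det(M) / D)
    return g

# three unit light directions (hypothetical rig) and a known surface
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
albedo, n = 0.5, [0.0, 0.0, 1.0]
I = [albedo * sum(n[i] * l[i] for i in range(3)) for l in lights]  # observations

g = solve3(lights, I)                               # g = albedo * n
recovered_albedo = math.sqrt(sum(x * x for x in g))
```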