CS448f: Assignment 3

UPDATE

By popular vote, the due date for this assignment has been pushed back two days, to Thursday Oct 22 at 11:59pm. This is your last chance to use late days, so if you have two late days left the due date is effectively Saturday night.

Also, remember that the speed comparison is against the original implementation, which used the slow GaussianBlur. So the FastBlur alone should buy you enough time to do something interesting.

Your task

Your task this week is to improve ImageStack's alignment algorithm. The align algorithm in the code you have is completely broken, and you should start by replacing your Alignment.cpp with this copy, which fixes the bugs.

ImageStack's current algorithm "align" warps the top image on the stack to match the second image on the stack. align takes one argument, which must be "translate", "similarity", "affine", "perspective", or "rigid", and constrains the warp to be of that type. For this assignment, we'll use perspective warps, because they are the most general - if you can get reliable perspective warps working, everything else should also work.

ImageStack -load basic_a.jpg -load basic_b.jpg -align perspective -save basic_ab.jpg

Here are the input images basic_a.jpg and basic_b.jpg, and the output basic_ab.jpg produced by the line above:

basic_a.jpg
basic_b.jpg
basic_ab.jpg - this is basic_b.jpg warped to match basic_a.jpg
We can tell they match up by averaging them and observing that there is no ghosting:
ImageStack -load basic_ab.jpg -load basic_a.jpg -add -scale 0.5 -downsample -display
We can also inspect the absolute difference between the two. The large black area indicates the images match in that region.
ImageStack -load basic_ab.jpg -load basic_a.jpg -subtract -abs -downsample -display

This case was a success! ImageStack doesn't always succeed however. In fact, for the test set of images we have prepared for you, which includes a variety of tricky cases, ImageStack almost always fails. Your task is to improve the -align operation so that it works on as many of the test cases as possible.

Alignment can be broken into several phases. First feature points must be chosen. The current algorithm uses a difference of two Gaussians of fixed size. Then feature descriptors must be extracted for use in comparing two feature points. Currently these are 7x7 windows around each feature point, and features are compared using the sum of squared differences of these windows. Finally, once a list of good correspondences between features has been found, RANSAC is used to solve for a good warp. You should make -align better by improving the first two stages (feature detection and descriptor computation). You may alter the RANSAC stage if you wish, but we advise against it. Specifically, there are two sections of code you should modify. First there is the code that finds feature points:

// We first find some corner features by
// converting to grayscale ...
vector<float> grayMatrix;
for (int i = 0; i < im.channels; i++) {
    grayMatrix.push_back(1.0f/im.channels);
}
Image gray = ColorMatrix::apply(im, grayMatrix);

// ... then taking the difference of two Gaussians ...
Image blurry = GaussianBlur::apply(gray, 0, 3, 3);
Image blurrier = GaussianBlur::apply(blurry, 0, 5, 5);
Subtract::apply(blurry, blurrier); // blurry now holds the difference of Gaussians

// ... then finding local maxima.
vector<LocalMaxima::Maximum> maxima = LocalMaxima::apply(blurry, false, true, true, 0.00001);

// We sort the maxima to put the strongest ones at the front
::std::sort(maxima.begin(), maxima.end());

// We reject maxima too close to the image edges - it makes it hard
// to extract descriptors
for (int i = (int)maxima.size()-1, j = 0; i >= 0 && j < 256; i--, j++) {
    if (maxima[i].x < 3 || maxima[i].x > im.width-4 ||
        maxima[i].y < 3 || maxima[i].y > im.height-4) {
        j--;
        continue;
    }
    corners.push_back(Feature(maxima[i], im));
}

printf("%u corner features found\n", (unsigned int)corners.size());

We advise replacing the portion which takes a difference of two Gaussians with something more robust. You can probably leave the stuff about finding and sorting the local maxima unchanged.

Next there's the code that creates a feature and compares two features for similarity. Note that it just uses image patches of a fixed size as descriptors. This is probably not a great descriptor. You should make an explicit descriptor vector for your features, with an appropriate distance function between descriptors.

// An image feature is a local maximum in the image (after
// applying some corner detector filter), with a descriptor
// vector. In this case we use a 7x7 patch.
struct Feature : public LocalMaxima::Maximum {
public:
    Feature(LocalMaxima::Maximum m, Window im) {
        x = m.x;
        y = m.y;
        t = floor(m.t + 0.5);
        int ix = (int)(x + 0.5);
        int iy = (int)(y + 0.5);
        value = m.value;
        assert((ix > 2 && ix < im.width-3 &&
                iy > 2 && iy < im.height-3 &&
                t >= 0 && t < im.frames),
               "Requested feature out of bounds\n");
        patch = Window(im,
                       (int)t, ix-3, iy-3,
                       1, 7, 7);
        usage = 0;
    }

    // Distance between two features is just the sum of squared differences between the two patches.
    float distance(Feature *other) {
        float dist = 0;
        for (int t = 0; t < patch.frames; t++) {
            for (int y = 0; y < patch.height; y++) {
                float *thisPtr = patch(t, 0, y);
                float *otherPtr = other->patch(t, 0, y);
                for (int x = 0; x < patch.channels * patch.width; x++) {
                    float d = *thisPtr++ - *otherPtr++;
                    dist += d*d;
                }
            }
        }
        return dist;
    }

    // It's useful to keep track of how many times any one given
    // feature is used, so we don't depend too heavily on a
    // single feature.
    int usage;

    Window patch;
};
Although this assignment still places a bound on speed relative to the current image alignment implementation, the emphasis this week is correctness. Your task is to choose better image features to feed into the optimization stage of the algorithm, so that your version of -align works on as many of our test cases as possible.

Getting Started

You should all have ImageStack working from the previous assignments, but if for some reason you don't, refer to the instructions in Assignment 1. Download this version of Alignment.cpp and replace your existing version (the existing version is buggy). Don't bother to make a new version of -align for this assignment, just modify the one that's there. Remember to include in your submission any other files you happen to modify.

Grading Criteria

20% of your grade is for clean readable code - the point of making -align better is so we'll all have a better image alignment algorithm. If we can't read the code, we can't really benefit from it.

20% of your grade is for making an image alignment algorithm that is at least as fast as the original image alignment. If yours is the same speed or faster, you will receive full credit for speed. If yours is slower, your score for this requirement, out of 20%, will be calculated as follows: 20% - ((new_time/old_time - 1) x 5%). So if you take twice the time, you still get 15% for this requirement. If you take five times as long, you get zero for this requirement. Having your algorithm run a bit slower might be a worthwhile tradeoff for more accuracy - it is up to you to decide how you'd like to maximize your score. To time your algorithm, we will run the following:

ImageStack -load basic_a.jpg -load basic_b.jpg -time --align perspective

60% of your grade is for making an image alignment algorithm that works on more of our test cases than the current algorithm.

Here's how we will test accuracy:

ImageStack -load basic_a.jpg -load basic_b.jpg -align perspective \
           -save basic_ab.jpg -subtract -abs -save diff_basic_ab.jpg
This generates two files: one is the output image, and the other is the difference between the image we are warping to and the output image; the difference image provides a visualization of alignment error. We will visually inspect the output and error for 8 test cases, which can be downloaded here. Each a & b image pair represents one test case, and we will assign your algorithm's performance a score out of 2 points for each test case. You will receive 0 points for a wacky/completely wrong output, 1 point if your alignment is fairly close but could definitely be better, and 2 points if the alignment is as good as possible given the input images, or within several pixels of as good as possible. Note that for the motion blur example there's no single perfect alignment, so for that case we'll be pretty lenient.

Due Date and Submission

Email your modified ImageStack files, in a zip archive, to cs448f-aut0910-staff@lists.stanford.edu by 11:59pm on Thursday Oct 22, with the subject "cs448f assignment 3 submission".

Competition

We'll take the most accurate entries, and have a run-off in class between them to see whose is the best. The entrants in the competition will be the top students with respect to all 8 test cases, but the winner of the competition will be decided by new, possibly more challenging image pairs. As always, there will be fabulous, edible prizes. If several entries solve the same number of test cases, ties will be broken using running time.