The final project is a chance to dig deeper into one of the topics we have already explored in the homework, or to make an independent exploration of some topic that we didn't have time to cover. The amount you choose to attempt is partly up to you; more ambitious work will be rewarded with the highest grades. In all cases you should keep a running journal of your efforts as you proceed, to document what you have attempted (even if it didn't work).
The sections below describe several options for the project. You can also devise your own, subject to my approval, as described in the final section.
Option 1: Better OCR
Our OCR engine in assignment #4 worked up to a point, but was not as reliable as one might like. For this option you will experiment with ways to make it work better. Feel free to try out different ideas, documenting them in your journal as you go.
The basic idea is to combine multiple modes in order to increase reliability.
- Begin with the template matching. Our initial attempt was brittle, and failed whenever there was a single pixel mismatch. To make things more forgiving, experiment with upsampling. If you resize everything by a factor of two or three, then areas that were previously one pixel thick will have some margin for error. Instead of going to a skeleton for the template, try an erosion (or thinning) of one or two pixels. What works the best? The goal for this step is to identify all possible candidates for a letter. We'll distinguish between them in the later steps.
- Use the morphological features to pick between several matching templates. Consider different combinations, including dimensionless ratios of two basic properties (for example, area divided by the square of the perimeter).
- Also experiment with chamfer distance as a comparison tool. Although it would be slow, you could compare every component to every template based on chamfer distance. Alternately, use it to break ties identified by one of the methods above.
- You may have other ideas. Be creative. Can you come up with something that works really well?
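As a starting point for the ideas above, here is a minimal sketch of the building blocks: upsampling, an erosion-based forgiving match, a dimensionless shape ratio, and chamfer distance. It assumes NumPy and SciPy, which may not be the libraries you used in assignment #4. All function names are illustrative, and each function assumes that the component and the template have already been cropped to boolean arrays of the same shape, with ink pixels set to True. The perimeter estimate is a crude boundary-pixel count, not a true contour length.

```python
import numpy as np
from scipy import ndimage


def upsample(mask, factor=2):
    """Enlarge a binary mask by an integer factor using pixel replication."""
    block = np.ones((factor, factor), dtype=np.uint8)
    return np.kron(mask.astype(np.uint8), block).astype(bool)


def forgiving_match(component, template, erode_px=1):
    """True if the eroded template fits entirely inside the component.

    Eroding the template first means a few stray pixel mismatches along
    the letter boundary no longer cause the match to fail.
    """
    core = ndimage.binary_erosion(template, iterations=erode_px)
    return bool(core.any() and not np.any(core & ~component))


def compactness(mask):
    """Dimensionless ratio: area / perimeter**2 (perimeter = boundary pixels)."""
    area = mask.sum()
    boundary = mask & ~ndimage.binary_erosion(mask)
    return area / boundary.sum() ** 2


def chamfer_distance(a, b):
    """Symmetric chamfer distance between two non-empty boolean masks."""
    # distance_transform_edt measures distance to the nearest zero pixel,
    # so inverting a mask gives the distance to its nearest ink pixel.
    dist_to_a = ndimage.distance_transform_edt(~a)
    dist_to_b = ndimage.distance_transform_edt(~b)
    return 0.5 * (dist_to_b[a].mean() + dist_to_a[b].mean())
```

One possible use: keep every template for which `forgiving_match` is true as a candidate, then choose the candidate with the smallest `chamfer_distance`, or the one whose `compactness` is closest to that of the component.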
If you wish, you can experiment with additional text images using different fonts, letter sizes, etc. How robust is your method to these changes?
When you are done, write a short summary of what you attempted and the results you achieved. Speculate on what it would take to do even better.
Possible more ambitious project: I developed a flexible template model as part of my research. You could write a Python implementation and test it out for the OCR task.
Option 2: Image Panoramas
For this option you can weave together strands from several topics we studied to make a complete package. There are a couple of ways to go about it. You can pick one to follow, or try more than one thing to see what works best. Record everything that you try in your journal. You should work with pictures you have taken yourself -- decide what you want a panorama of, and go for it.
The first step is to figure out the registration transformations between photographs. You can use the cylindrical coordinates we studied in class (assuming that you know or can figure out the focal length of your camera -- there are resources that will help you do this from a picture of an object of known size at a known distance). Alternately, you can experiment with perspective transformations (homographies). We actually saw these already -- they are the billboard rectification transformations from homework 2. (More advanced options also exist, should you wish to explore them: read this paper for details. For example, it is possible to adjust all the transformation values between all images simultaneously through an approach called bundle adjustment.)
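To make the two registration routes concrete, here is a minimal sketch using only NumPy. It assumes a simple pinhole camera, a cylindrical mapping of the usual form (horizontal position becomes f times the viewing angle), and a homography fitted to hand-picked point correspondences with the direct linear transform. The function names and the coordinate conventions are illustrative and may differ from the ones we used in class.

```python
import numpy as np


def focal_length_pixels(object_px, object_size, distance):
    """Pinhole estimate of focal length in pixels.

    object_px: size of the object in the photo, in pixels.
    object_size, distance: real size and distance, in the same units.
    """
    return object_px * distance / object_size


def to_cylindrical(x, y, f, center):
    """Map image-plane pixel coordinates to cylindrical pixel coordinates."""
    cx, cy = center
    xs = np.asarray(x, dtype=float) - cx
    ys = np.asarray(y, dtype=float) - cy
    theta = np.arctan2(xs, f)        # angle around the cylinder axis
    h = ys / np.hypot(xs, f)         # height on a unit-radius cylinder
    return f * theta + cx, f * h + cy


def apply_homography(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]


def fit_homography(src, dst):
    """Direct linear transform from four or more point pairs, each (N, 2)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]               # assumes H[2, 2] is not close to zero
```

With real photographs the correspondences will be noisy, so in practice you would probably want more than four pairs and some way to reject bad matches.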
The next step is to perform the image compositing so as to generate a pleasing result. This should probably include exposure adjustment to match the average intensities of the overlapping images. You can also implement the seam-carving method that we alluded to in class. Finally, once the two images have been composited together, you may wish to trim them to produce a result with no black margins.
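For the exposure adjustment and trimming steps, a minimal sketch might look like the following. It assumes NumPy, images that have already been registered onto a common canvas as float arrays, and boolean masks marking which canvas pixels each image covers. The function names are illustrative. Note that the trim shown here only crops to the bounding box of valid pixels, so a warped panorama may still have black corners that need a tighter crop.

```python
import numpy as np


def match_exposure(img_a, img_b, overlap):
    """Scale img_b so its mean intensity in the overlap matches img_a's.

    img_a, img_b: float arrays of the same shape, already registered.
    overlap: 2-D boolean mask of pixels covered by both images.
    Returns the adjusted copy of img_b and the gain that was applied.
    """
    gain = img_a[overlap].mean() / img_b[overlap].mean()
    return img_b * gain, gain


def trim_to_valid(composite, valid):
    """Crop the composite to the bounding box of its valid pixels.

    valid: 2-D boolean mask, True where at least one image contributed.
    """
    rows = np.flatnonzero(valid.any(axis=1))
    cols = np.flatnonzero(valid.any(axis=0))
    return composite[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
```

A single global gain is the simplest choice. If the brightness difference varies across the overlap, a per-channel gain or a gradual blend across the seam may look better.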
If you wish, you can try other types of compositing besides panoramas. For example, to take a closeup composite of a large building, you might take multiple sweeps across its facade before putting them all together in a single image. You will not be able to use the cylindrical projection in this case, since the camera does not stay level.
Possible more ambitious project: to prevent ghosting, apply a local image warp based on optical flow. We computed optical flow in assignment #5 for the simple case of stereo disparity. In the image matching context, you'll need to find the best match in two dimensions, up to some small maximum shift. Using the mean (or median) shift value over some local window, resample the original image so that it overlaps better with the other.
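The two-dimensional search described above can be sketched as a brute-force block match. This is only an illustration, assuming NumPy, grayscale images of the same shape, and a sum-of-squared-differences cost. The function names, patch size, and grid spacing are all hypothetical choices, and the final resampling step is left out.

```python
import numpy as np


def best_shift(ref, moving, top, left, size=8, max_shift=3):
    """Find the (dy, dx) within +/- max_shift that best aligns one patch.

    The patch ref[top:top+size, left:left+size] is compared against
    moving[top+dy:..., left+dx:...] using the sum of squared differences.
    """
    patch = ref[top:top + size, left:left + size].astype(float)
    best, best_err = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            r, c = top + dy, left + dx
            # Skip shifts that would fall outside the moving image.
            if r < 0 or c < 0:
                continue
            if r + size > moving.shape[0] or c + size > moving.shape[1]:
                continue
            cand = moving[r:r + size, c:c + size].astype(float)
            err = np.sum((patch - cand) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best


def median_shift(ref, moving, step=8, size=8, max_shift=3):
    """Median (dy, dx) over a grid of patches covering the image interior."""
    shifts = [
        best_shift(ref, moving, t, l, size, max_shift)
        for t in range(max_shift, ref.shape[0] - size - max_shift + 1, step)
        for l in range(max_shift, ref.shape[1] - size - max_shift + 1, step)
    ]
    return tuple(np.median(np.array(shifts), axis=0))
```

To get a local warp rather than one global shift, you could call `median_shift` on sub-windows of the overlap region and resample each window by its own shift.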
When you are done, write a short summary of the technique you implemented. Outline all the steps and why you included them. Show the results of your method on several examples, including at least some with pictures you have taken yourself.
Option 3: Something Else
I'm open to other projects you may wish to propose. These will require my approval, so that I can make sure that they seem feasible and sufficiently challenging. Any of these are possible:
- An extension of one of our other homework projects.
- A project exploring some other image processing technique, either from the textbook or from a research paper.
- A live application that uses some image processing technique in service of a fun effect. Ideally, think of something that could go on display in the CS display case on the first floor of Ford.
The main rules here are that you receive my approval in advance (by April 24 at the latest) and that you carefully document everything you do in your journal. At the end, you will write a report summarizing the project and what you achieved, with examples.
Deliverables
When your project is complete, you should turn in the following:
- Your summary report. This should be a written document, including sample results and a list of any references/resources you consulted during the project.
- Your development journal (may be interspersed with the code in a Jupyter notebook, if you wish)
- Your code (may be combined with the journal if you wish)
- Any additional files necessary to run your code