Learning for Pattern Alignment


Relevant Papers

Fully Convolutional Networks for Semantic Segmentation: https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf

SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation: https://arxiv.org/pdf/1511.00561.pdf


  • Source: George Washington dataset (GW20)
    • Binarized word images (GW20BinaryWordImagesV3.mat)

Fall 2018

Network architecture



All files in howelab\George Washington\PatternAlignment


  • gsmooth.m (From Nick): Performs 2D Gaussian smoothing on a matrix
    • Parameters: m (matrix), sigma, radius (default: 2.5*sigma), method (default: none)
    • Output: sm (smoothed matrix)
  • warpGT.m (warp code based on addManuscriptNoise.m): Warps the original image using a smoothed noise field, and calculates a ground-truth "warp field" based on moving one pixel "back" from the warped image to the original image
    • Functions used: gsmooth
    • Parameters: img (matrix to be warped); optional: max_dims (maximum dimensions in the image set, used to pad all images to the same size)
    • Output: img (image plus optional padding), warpimg (warped image), gt (warp field)
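The smoothing and warping steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the lab's gsmooth.m or warpGT.m: the toy image, the values of sigma and amp, the helper smooth2, and the choice to store the ground truth as per-pixel (dx, dy) offsets are all hypothetical. Only the default radius of 2.5*sigma comes from the notes above.

```matlab
% Hypothetical sketch; not the lab's gsmooth.m or warpGT.m.
rng(0);                                        % reproducible noise
img = zeros(40, 60);  img(15:25, 10:50) = 1;   % toy binary "word" image

sigma  = 4;                                    % assumed smoothing scale
radius = ceil(2.5 * sigma);                    % default radius from the notes
x = -radius:radius;
g = exp(-x.^2 / (2 * sigma^2));
g = g / sum(g);                                % normalized 1-D Gaussian kernel

% Separable 2-D Gaussian smoothing: filter columns, then rows.
smooth2 = @(m) conv2(g, g, m, 'same');

amp = 10;                                      % assumed noise amplitude scale
dx = amp * smooth2(randn(size(img)));          % horizontal displacement field
dy = amp * smooth2(randn(size(img)));          % vertical displacement field

[X, Y] = meshgrid(1:size(img, 2), 1:size(img, 1));
% Backward warp: each output pixel samples the original at (X+dx, Y+dy);
% samples that fall outside the image are filled with 0.
warpimg = interp2(X, Y, img, X + dx, Y + dy, 'linear', 0);

% One possible ground truth: per-pixel offsets pointing from the warped
% image back to the sampled location in the original image.
gt = cat(3, dx, dy);
```

Because the warp samples the original at displaced coordinates, the same displacement field doubles as a map from warped pixels back to the original, which is one plausible reading of the "warp field" described above.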


  • generateData.m: Generates training and test data (will change once lineMask is added)
    • Functions used: warpGT
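As a rough illustration of the padding and splitting that data generation involves, the sketch below pads a toy image set to its maximum dimensions and splits it 80/20. It is not generateData.m: the toy images, the zero padding at the bottom and right, the random permutation, and the variable names are assumptions. The 80/20 ratio simply mirrors the 8000/2000 and 400/100 counts recorded on this page, and the call to warpGT is omitted.

```matlab
% Hypothetical sketch; stands in for generateData.m, which is not shown here.
rng(1);
nImages = 10;                            % toy count for illustration only
imgs = cell(nImages, 1);
for k = 1:nImages
    imgs{k} = rand(20 + randi(10), 40 + randi(30)) > 0.5;   % toy binary images
end

% Largest height and width in the set, used to pad every image to one size.
sizes    = cell2mat(cellfun(@(m) size(m), imgs, 'UniformOutput', false));
max_dims = max(sizes, [], 1);

padded = false([max_dims, nImages]);
for k = 1:nImages
    [h, w] = size(imgs{k});
    padded(1:h, 1:w, k) = imgs{k};       % assumed: zero padding at bottom/right
end

% 80/20 train/test split over a random permutation of the images.
perm     = randperm(nImages);
nTrain   = round(0.8 * nImages);
trainIdx = perm(1:nTrain);
testIdx  = perm(nTrain + 1:end);
```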

Other files

  • train\
    • Training data (8000 padded/warped/ground truth)
  • test\
    • Test data (2000 padded/warped/ground truth)
  • 112718_cnn.mat
    • Trained network (FCN architecture): 400 training examples, 100 test examples
  • 120718_cnn.mat
    • Trained network (FCN architecture): 8000 training examples, 2000 test examples

Spring 2019

J-Term: diversify the data and prepare scripts/experiments to run while at Smith

  • Think about network architecture: why did a shallower network outperform a deeper one?
  • Create an automatic evaluation script
  • Clean up the PatternAlignment folder

Work-in-progress: lineMask.m

  • Creates a matrix mask split along a line, with one "side" filled with 1s and the other with 0s
    • Challenges: choosing a line-drawing algorithm; extending the line to the edges of the matrix (ensuring it goes through the word for balance?); determining the correct class balance for use in generateData.m
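One way to sidestep the line-drawing question is a half-plane test: label every pixel by which side of an infinite line it falls on. The sketch below illustrates that alternative and is not the work-in-progress lineMask.m. The matrix size, the choice of the matrix centre as the point the line passes through, the random angle, and all variable names are assumptions.

```matlab
% Hypothetical sketch; not the work-in-progress lineMask.m.
rng(2);
rows = 40;  cols = 60;

% Point the line passes through (here the matrix centre, as a stand-in
% for "through the word") and a random direction.
r0 = (rows + 1) / 2;
c0 = (cols + 1) / 2;
theta = pi * rand();                     % line angle in [0, pi)

[C, R] = meshgrid(1:cols, 1:rows);
% Signed side of each pixel relative to the line: the 2-D cross product of
% the line direction with the vector from (r0, c0) to the pixel.
side = cos(theta) * (R - r0) - sin(theta) * (C - c0);
mask = side >= 0;                        % 1s on one side, 0s on the other

balance = mean(mask(:));                 % fraction of 1s, for class balance
```

Because the line is treated as infinite, the mask reaches the matrix edges without any extra work, and a line through the centre splits the pixels roughly evenly. Passing the line through a point on the word instead would change the balance, which is where the class-balance question above comes in.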