Handwriting Recognition & Deep Learning; Minyue Dai
Contents
Week 01: Jun05 - Jun11
Jun05
- Install Tensorflow and the system used in the GAN article
- On Windows, Tensorflow only works on Python3, but the system Digits only works on Python2.
- Successfully install Tensorflow on Python3.5
- BLSTM (Bidirectional Long Short-Term Memory)
- This is a modified recurrent neural network that can make use of both distant and nearby context in both directions.
- LSTM Blog
- Combine Syriac image data from Spring2017 into a single file
- In SyriacGenesis/GenesisSync3820_4540.mat
- 721 new images with matched tags
Jun06
- Rename the Syriac Image data file
- Find how to solve the Python Version Problem of Tensorflow&Digits
- Author's Github contains source code of the GAN he uses
- To run Digits, the only way might be to use Linux, because the Digits the author uses is a modified version.
- and public version of digits does not support digits now
- 1. Use linux and run digits platform
- 2. Use windows and the source code of tensorflow
- 3. Both solutions need understanding of Tensorflow Library
- Implementation of BLSTM
Jun07
- Finish the BLSTM paper in handwriting recognition
- Finish the Tensorflow Tutorial
- H:\Summer2017\RNN\tensorflowpractice.py
- This is a copy of a BLSTM example from the Tensorflow Tutorial, with comments added on the functions used. I also wrote a multi-layer BLSTM.
- Important Functions
- tensorflow.contrib.rnn.BasicLSTMCell: Build an instance of a basic LSTM layer.
- tensorflow.contrib.rnn.static_bidirectional_rnn: Build a single-layer BLSTM
- tensorflow.contrib.rnn.stack_bidirectional_rnn: Build a multi-layer BLSTM
Jun08
- Better Understanding of LSTM and Implementation
- Test Syriac Letter Data in BLSTM
Jun09
- Visualization of Tensorflow
- After running the program, type tensorboard --logdir=/Self-defined graph directory/ in command line to start TensorBoard
- Then type http://localhost:6006/ in browser
- SCALARS: The graphs of variables, such as cost and accuracy.
- GRAPHS: The visualization of the model's structure
- Read image data in Tensorflow
Week 02: Jun12 - Jun18
Jun12
- Work on Autoencoding
- Encode given data through hidden units, similar to PCA
- Impose sparsity on hidden units, which means each unit should be inactive for most input
- ρ is a "sparsity parameter", typically a small value close to zero (say ρ = 0.05). In other words, we would like the average activation of each hidden neuron to be close to 0.05.
- Blog Post on Autoencoding
- Tensorflow Example for Autoencoder
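The sparsity constraint above is commonly expressed as a KL-divergence penalty between the target activation ρ and each hidden unit's average activation. A minimal sketch in plain Python, assuming activations lie in (0, 1) as with sigmoid units; the function name and the sample activation values are illustrative and not taken from the project scripts:

```python
import math

def kl_sparsity_penalty(rho, rho_hat):
    """Sum of KL(rho || rho_hat_j) over hidden units.

    rho:     target average activation (e.g. 0.05)
    rho_hat: observed average activation of each hidden unit, each in (0, 1)
    """
    total = 0.0
    for r in rho_hat:
        total += rho * math.log(rho / r) + (1 - rho) * math.log((1 - rho) / (1 - r))
    return total

# Units that already average 0.05 incur no penalty;
# units that are active about half the time are penalized.
on_target = kl_sparsity_penalty(0.05, [0.05, 0.05, 0.05])
too_active = kl_sparsity_penalty(0.05, [0.5, 0.4, 0.6])
```

In training, this penalty would be added to the reconstruction cost with a weighting coefficient chosen by experiment.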
Jun13
- Test Syriac Letter on BLSTM Model
- Need to read same-size Syriac Letter images into a binarized numpy array and save them as array.
- Syriac Image file:
- Local: C:\Syriac\CharSet\CharSet
- Remote: H:\Syriac\CharSet.zip
- Images with "focus" in their names are all the same size
- Python code that converts the images in P1 to a numpy array and saves it: H:\Summer2017\SyriacLetter.py
- Numpy Array data file of images in P1: C:\Syriac\FocusArray\Image\P1_focus.npy
- save numpy array data in npy file
- Pretrained Model on Caffe2 and Tensorflow
Jun14
- Transform Syriac Letter Image to Numpy Array
- H:\Summer2017\AllSyriacLetter.py: Read image in P2-P5 into Numpy Array and save them
- C:\Syriac\CharSet\FocusArray\Image\SyriacFocus.npy: Data File for all Syriac Image array
- Local: C:\Syriac\CharSet\FocusArray\Image
- Remote: H:\Summer2017\FocusArray\Image
- Test Syriac Letter Data in AutoEncoder
- For efficiency, just test image in P1: P1_focus.npy
- H:\Summer2017\AutoEncoder\SyriacAutoEncoder.py: The cost function is defined as squared error, which does not perform well on Syriac letters. For the AutoEncoder, it is better to add a KL divergence term for sparsity.
- Read the code file of GAN (Generative Adversarial Networks)
- H:\Summer2017\GAN\GAN.py: The code file of GAN, which just contains some functions and classes and only works with the DIGITS platform
Jun15
- Read the Syriac Letter Image label and transform them as Numpy Array Data
- H:\Summer2017\SyriacLetterLabel.py: Use the number-letter file CharSetLabels.txt to generate a binary code for each letter, then write the binary codes as numpy array data and save them
- Local: C:\Syriac\
- Remote: H:\Summer2017\Syriac\
- CharSetLabels.txt: Image name number - Letter data file
- SyriacBinaryCode.txt: Letter Name - Binary Code data file
- FocusArray\Label\: All Image Label File(All in binary code)
- Pn_focus.npy: Image label data for the data in folder Pn
- SyriacFocusLabel.npy: Image label data for SyriacFocus.npy(All Syriac Letters)
- Test the labeled Syriac Letter Data on BLSTM
- H:\Summer2017\RNN\SyriacBLSTM.py: Both the single-layer and the multi-layer BLSTM perform badly; training does not converge.
- Find GAN Paper
Jun16
- Find GAN Paper
- GAN Fundamental Paper
- Yann LeCun Blog on GAN
- GAN Tutorial from its Author
- GAN Tutorial Blog Post
- Code for Blog Post
- Transform handreg Syriac Letter Data and Label into Numpy array
- This is the better sliced data in size of 59*58
- C:\Syriac\OldChar: This is the image data
- C:\Syriac\OldCharArray\Image: This is the Numpy data for Image
- C:\Syriac\OldCharArray\Label: This is the Numpy data for Label
- Test handreg Syriac Letter Data on AutoEncoder
- The model works on handreg data, but it performs much worse than it does on the mnist data set:
- 1. mnist images are much smaller (28*28) than handreg Syriac images (59*58) ---- Enlarging the NN still performs badly
- 2. the sparsity issue, 80% of the output should be 0 ---- add the NN's weights to the cost or add -0.05 at the initialization of weights; both work well.
- Model and Code File:
- H:\Summer2017\AutoEncoder\oldSyriacAutoEncoder.py: The 2-layer NN AutoEncoder codefile for handreg data set
- Result(Image) (H:\Summer2017\AutoEncoderOldCharImage\):
- AutoEncoder_oldSyraic1024.png: Result for just squared error as cost and 1024-512 cells in NN
- AutoEncoder_oldSyraic-0.05.png: Result for just squared error as cost and -0.05 for all weights at initialization
- AutoEncoder_oldSyraic0.1Weightcost.png: Result for just squared error and 0.1*mean(weights) as cost
- AutoEncoder_oldSyraic0.2Weightcost.png: Result for just squared error and 0.2*mean(weights) as cost
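The weight-cost variants listed above can be written as a single cost function. A minimal sketch in plain Python that follows the file-name descriptions literally (squared error plus a coefficient times the mean of the weights). The function names and sample values are hypothetical, and this log does not say whether oldSyriacAutoEncoder.py averages raw or absolute weights:

```python
def squared_error(target, output):
    """Mean of the squared differences between target and reconstruction."""
    return sum((t - o) ** 2 for t, o in zip(target, output)) / len(target)

def cost_with_weight_term(target, output, weights, weight_coeff=0.1):
    """Squared error plus weight_coeff * mean(weights).

    Assumption: the raw (signed) mean is used, as the file names suggest.
    """
    mean_weight = sum(weights) / len(weights)
    return squared_error(target, output) + weight_coeff * mean_weight

# Toy values for illustration only.
target = [0.0, 0.0, 0.0, 1.0]
output = [0.2, 0.1, 0.0, 0.7]
weights = [0.5, -0.1, 0.2]
cost = cost_with_weight_term(target, output, weights, weight_coeff=0.1)
```

Changing weight_coeff to 0.2 gives the second variant named in the result files.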
Week 03: Jun19 - Jun25
Jun19
- Test handreg Syriac Letter Data on BLSTM classification model
- H:\Summer2017\RNN\handregBLSTM.py: BLSTM Model for handreg Syriac Letter Data
- Training a single-layer BLSTM on 10000 training samples gives about 30% accuracy on the testing data
- Multi-layer BLSTM performs even worse
- Then change the optimizer and the accuracy rises to over 95%, but it then drops dramatically during training. This problem might be caused by the order of the data.
- Build CNN AutoEncoder
Jun20
- Shuffle the handreg Syriac data and test it on BLSTM again
- H:\Summer2017\DataPrepare\shuffleSyriac.py: Shuffle and save handreg Syriac Data
- shuffleSyriacReg.npy: shuffled Syriac data
- shuffleSyriacRegLabel.npy: shuffled Syriac data's label
- For the shuffled data, the plot of accuracy grows normally, but the testing accuracy is just 55% after 100,000 training.
- Also test on CNN, the accuracy is similar.
- When working with only P1 data, the accuracy is over 95%
- Find the problem of Syriac Data and relabel all of them
- The issue is that the letter-to-code dictionary is randomized on each labeling run.
- Sort the letter list to make sure the order of the list will be the same for all files
- Relabel all data files
- Retest data on BLSTM and the model works well
- Now the accuracy on test data for handreg Syriac Letter classification is about 99% (needs about 10,000 steps)
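The relabeling fix above (sort the letter list so every file gets the same code) can be sketched as follows. This is a minimal illustration that assumes the "binary code" is a one-hot vector; the function name and the sample letter names are hypothetical and are not read from CharSetLabels.txt:

```python
def build_letter_codes(letters):
    """Map each distinct letter name to a one-hot code.

    Sorting the distinct names fixes the index of every letter, so the
    mapping no longer depends on the order in which letters are encountered.
    """
    ordered = sorted(set(letters))
    codes = {}
    for index, letter in enumerate(ordered):
        code = [0] * len(ordered)
        code[index] = 1
        codes[letter] = code
    return codes

# Two label lists with the same letters in a different order
# now produce identical codes.
labels_p1 = ["Beth", "Alaph", "Gamal", "Alaph"]
labels_p2 = ["Gamal", "Beth", "Alaph"]
codes_p1 = build_letter_codes(labels_p1)
codes_p2 = build_letter_codes(labels_p2)
```

The codes only agree across files when each call sees the same set of letters, so in practice the mapping would be built once from the complete letter list and reused for every folder.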
Jun21
- Read and Test GAN
- Test the given GAN for mnist data set, it takes a long time to train
- Test the pretrained model and it performs pretty well on mnist data set
- This code example only uses CNN, but most complicated models from papers use deCNN.
Jun22
- Check GPU
- How to use GPU in tensorflow
- Tensorflow-GPU install document
- Current GPU: GeForce GTX 480, Compute Capability: 2.0
- Required Compute Capability for Tensorflow-GPU: >=3.0
- CUDA Path: C:\Users\howelab\AppData\Local\Temp\CUDA
- Test GAN on handreg Data
- GANhandreg.py: It works but is really slow. (This model doesn't use deCNN)
- Save Binarized Context Syriac Data
- C:\Syriac\Bcontext: Image Size: 60*60
Jun23
- Check GAN on handreg Syriac Data
- The model runs slowly on CPU. It completes about 6000 iterations in 18 hours.
- Test GAN on smaller data
- shufflesmallSyriac10000.npy: The first 10,000 shuffled handreg samples, resized to 29*29
- GANtestsmall.py: The model file
- It trains much more quickly, but it still takes a long time.
- It's hard to say whether the model works, because training takes a long time.
Week 04: Jun 26 - Jul 02
Jun26
- Create AutoEncoder model's generator with CNN and DeCNN
- tf.nn.conv2d
- output width and height: out = ceil(in / stride) (with "same" padding)
- W(filter) = [height, width, in_channels, out_channels]
- tf.nn.conv2d_transpose
- output width and height: out = in * stride
- W(filter) = [height, width, out_channels, in_channels]
- CNNAutoEncoder_2layers.py: 3-layer CNN and DeCNN Autoencoder (shared weights) on mnist
- Drop the fully-connected layer and add a convolutional layer
- Code Size [4 4 4] = 64; Original Size = 28*28 = 784
- Use Xavier initialization for weights rather than normal-distribution
- Xavier Initialization
- Save and restore model
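The size rules noted above for tf.nn.conv2d and tf.nn.conv2d_transpose can be checked with a few lines of arithmetic. A minimal sketch in plain Python; the helper names are hypothetical, and the stride of 2 per layer and the 4 output channels are assumptions chosen because they reproduce the [4 4 4] = 64 code size from a 28*28 input, not values confirmed by the script:

```python
import math

def conv_same_output(size, stride):
    """Output width/height of a convolution with "same" padding."""
    return math.ceil(size / stride)

def conv_transpose_output(size, stride):
    """Output width/height of a transposed convolution, per the rule out = in * stride."""
    return size * stride

# Three stride-2 convolutions on a 28*28 mnist image (assumed strides).
size = 28
encoder_sizes = []
for _ in range(3):
    size = conv_same_output(size, 2)
    encoder_sizes.append(size)

# Assumed 4 channels in the last layer: 4 * 4 * 4 values in the code.
code_size = encoder_sizes[-1] * encoder_sizes[-1] * 4
```

Under these assumptions the code is 64 values against 784 input pixels. Note that 4 * 2 = 8 rather than 7, so mirroring the encoder on the way back relies on the explicit output_shape argument that tf.nn.conv2d_transpose takes.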
Jun27
- Test CNN Autoencoder on handreg Syriac Data set
- CNNAutoEncoder_oldSyriac.py: 3-layer CNN AutoEncoder
- trainedCNNAutoEn_oldSyriac\CNNAutoEncoder_oldSyriac.ckpt: Trained model (5 epochs, 600,00 training data)
- AutoEncoder_oldSyriac_CNN.png: Example(Test Data) for the CNN AutoEncoder
- This model only uses CNN and DeCNN without pooling and performs really well, much better than fully connected networks. Also, CNN and DeCNN share parameters and they can be used in GAN.
Jun28
- Implement GAN with deCNN
- DeGAN_testsmall.py: DeGAN on small handreg Syriac data
- Discriminator: CNN - CNN - FullyConnected - FullyConnected
- Generator: FullyConnected - DeCNN - DeCNN - DeCNN
- Training is much faster, but the model does not converge
Jun29
- Try to make DeGAN on handreg Syriac works
- Add batch_norm to each layer; the model works much better, but the generated images have a weird "grid" pattern, which may be caused by the batch_norm function.
Week 05: Jul 17 - Jul 23
Jul17
- Try to build an encoder in the discriminator of GAN
- GAN/DeAutoGAN_testsmall.py: Train on small(29*28) handreg Syriac data set. The discriminator will also return a z vector and D is trained on Diff(real_image,rebuilt_image)
- GANimage\DeAutoGAN_small_random.png: The first line is original image, the second line is the rebuilt image, and the third line is the fake image. It seems the autoencoder works well, but the GAN does not perform well.
- Try to build conditional GAN
- conditionalGAN\CDAutoGAN.py: This is the conditional GAN with autoencoder trained on the handreg Syriac data set.
- CDGANimage\CDGAN_handreg_30960period.png: The result shows that the autoencoder performs well, but the generator does not converge.
- CDGAN for mnist
Jul18
- Build CDGAN without autoencoder
- In the article, the author first trains a good CDGAN and then uses transfer learning to build an autoencoder from the CDGAN's weights. The author actually builds two models: a CDGAN to generate fake images and discriminate images, and an autoencoder to get the z vector of a real image.
- Article for CDGAN
- conditionalGAN\CDGAN.py: The CDGAN without autoencoder, trained on handreg Syriac dataset.
- CDGANimage\CDGAN_handreg_noauto_3440period.png: The result shows the model does not converge.
- Other structures to get z vector of real image: Adversarially Learned Inference and learned similarity metric
- Adversarially Learned Inference
- The idea is to generate a z vector for each real image and a fake image for each z vector input, so the discriminator input is both an image and a z vector rather than just an image
- Paper
- Webpage and Github Code
- learned similarity metric
- The idea is building encoder and decoder rather than just generator, and the input of discriminator is rebuilt real image through autoencoder rather than real image itself.
- Paper
- Github Code
Jul19
- Build CDGAN on mnist data, exactly same as the author's model
- Script from Author to Rebuild his work
- conditionalGAN\CDGAN_mnist.py: The model which is exactly the same as author's model.
- CDGANimage\CDGAN_mnist_noauto_10000period.png: The result shows it works, but the quality is much worse. The cause might be:
- 1. It needs more training periods
- 2. The learning rate is not optimal
- 3. The initialization of parameters, especially weights for layers, is not optimal.
- Try Autoencoder for learned similarity metric on mnist dataset
- AutoLSM\AutoLSM_mnist.py: Incomplete
Jul20
- Improve the reproduced CDGAN on mnist
- It seems the author constrains the stddev of the weights to 0.02, which means all weight parameters should be close to 0, but the weight initialization I use is Xavier, which can make the parameters too small or too large.
- conditionalGAN\CDGAN_mnist.py: Change the initialization of weights, and it performs really well.
- conditionalGAN\pretrainedCDGANmnist12400\model.ckpt: The model parameters
- CDGANimage\modifiedCDGAN_mnist_noauto_12400period.png: The fake image examples
- tensorboard --logdir=/tmp/tensorflow_logs/CDGANmnist2: Run in command line and open http://ford343-r08838:6006 to see the loss curve in learning
- conditionalGAN\trainedCDGAN_mnist.py: Get and run the trained models
- Build the Autoencoder based on the trained CDGAN on mnist
- Change the output size of the discriminator from 1 to the length of the z vector to form the encoder, and then train the model starting from the weights of the trained CDGAN. Freeze the variables in the generator.
- conditionalGAN\CDGANAuto_transfer\model.ckpt: Parameters from trained CDGAN for training
- AutoCDGAN\AutoCDGAN_mnist.py: Train Autoencoder based on trained CDGAN and it performs really well.
- Find a problem of loading a subset of trained variables, find solution at RalphMao's answer
- AutoCDGANimage\AutoCDGAN_mnist_1000period.png: The test result. It works really well. The first row is original image, and the second row is the rebuilt image.
- AutoCDGAN\pretrainedAutoCDGAN_mnist\model.ckpt: The model is saved.
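The initialization issue noted above (a fixed stddev of 0.02 versus Xavier) comes down to how Xavier scales with layer size. A minimal sketch in plain Python of the standard Xavier/Glorot standard deviation, sqrt(2 / (fan_in + fan_out)); the layer sizes below are hypothetical examples and are not read from CDGAN_mnist.py:

```python
import math

def xavier_stddev(fan_in, fan_out):
    """Standard deviation of Xavier/Glorot-initialized weights."""
    return math.sqrt(2.0 / (fan_in + fan_out))

FIXED_STDDEV = 0.02  # the value the author's model appears to use

# Hypothetical layers: a small dense layer and a large dense layer.
small_layer = xavier_stddev(100, 1024)    # larger than 0.02
large_layer = xavier_stddev(6272, 1024)   # smaller than 0.02
```

Because the Xavier stddev depends on fan-in and fan-out, it lands above the fixed value for some layers and below it for others, which is consistent with the observation that the parameters came out too small or too large.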
Jul20
- Try CDGAN on handreg Syriac data set
- conditionalGAN\CDGAN_handreg.py: Use exactly the same parameters, and the result shows it doesn't work.
- CDGANimage\modifiedCDGAN_handreg_noauto_8900period.png: The result of the network.
- Possible Reasons:
- 1. It needs more layers because the image size is about four times that of mnist
- 2. The data is skewed because some syriac letters are rarely used.
- 3. Some Syriac letters are really similar.
- Find other examples of CDGAN
- The same article runs another experiment on faces, which uses another, more complicated network.
- README of projects
- code for face project
- Improve the CDGAN for handreg
- 1. Add 2 more CNN/DeCNN for D/G
- 2. Set fakelabel to be the same as y_label to solve the skewed data issue
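One way to read the second improvement is that the labels fed to the generator should follow the same skewed frequencies as the real labels, instead of being drawn uniformly. A minimal sketch using only the standard library; the function name, the letter names, and the 90/10 split are hypothetical and not taken from CDGAN_handreg2.py:

```python
import random
from collections import Counter

def sample_fake_labels(real_labels, batch_size, rng):
    """Draw labels with replacement from the observed real labels.

    Sampling from the label list itself reproduces the empirical
    label distribution, so rare letters stay rare in the fake batch.
    """
    return rng.choices(real_labels, k=batch_size)

# Hypothetical skewed data set: one common letter, one rare letter.
real_labels = ["Alaph"] * 90 + ["Semkath"] * 10
rng = random.Random(0)  # seeded so the draw is repeatable
fake_labels = sample_fake_labels(real_labels, 1000, rng)
fake_counts = Counter(fake_labels)
```

A stricter reading of "the same as y_label" would reuse the real batch's labels directly, which matches the real distribution by construction.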
Week 06: Jul 24 - Jul 30
Jul 24
- Finish training 6-layer CDGAN on handreg data set and save the model
- conditionalGAN\CDGAN_handreg2.py: The model has 6 layers for both the generator and the discriminator, and the fake labels are drawn from the distribution of real labels.
- CDGAN\modifiedCDGAN_handreg2_noauto_11680period.png: Examples from Model. The first line is real image and the second line is the fake image from same labels.
- conditionalGAN\pretrainedCDGANhandreg\model.ckpt: The trained model for 11680 periods.
- conditionalGAN\trainedCDGAN_handreg2.py: Restore and run the trained model.
- conditionalGAN\CDGANAuto_handreg2_transfer\model.ckpt: Resave the model for transfer learning in the Autoencoder on handreg Syriac data.
- Train Autoencoder for getting z vector from input image based on trained CDGAN
- AutoCDGAN\AutoCDGAN_handreg.py: Train an Autoencoder based on the trained CDGAN.
- AutoCDGANimage\AutoCDGAN_handreg2_1000period.png: Results from the model
- AutoCDGAN\pretrainedAutoCDGAN_handreg2\model.ckpt: Trained model after 1000 periods
Jul 25
- Get Data for denoising networks
- C:\Syriac\ContextB: Binary Image
- C:\Syriac\ContextC: Color Image
- C:\Syriac\ContextG: Grey Scale Image
- C:\Syriac\StackB: Denoised Binary Image
- Prepare Data for Training
- DataPrepare\LabelImage_GandStack.py: Load and save Gray Scale and Stack image and labels(Just intersection).
- C:\Syriac\DenoiseArray\shuffledGrayimage.npy: shuffled Gray Scale Image data
- C:\Syriac\DenoiseArray\shuflledStackimage.npy: shuffled Binary Stack Image data
- C:\Syriac\DenoiseArray\shuffledLabel.npy: shuffled label data
Jul 26
- Finish training Denoise Model for Gray to Stack Image
- Denoiseimage\Denoise_GandS.png: The Result from trained Model. The first line is gray scale image input, the second is the target stack image, and the third is the result from model.
- Denoise\pretrained_GandS\model.ckpt: Model saved.
- Read More Papers
- PCA-Initialized Deep Neural Networks
- Professor Nich's paper
- Use AutoEncoder for CDGAN to generate all z vectors of handreg Syriac Data
- OldCharArray\Label\SyriacRegZvector.npy: z vectors data in order of SyriacRegLabel.npy (60880*100)
Jul 27
- Train Denoise Model with labels on Gray to Stack
- Denoise\CDDenoise_GandS.py: New model with the label as an additional input; the threshold is 0.3
- DenoiseIimage\DenoiseCD_GandS_0.3.png: Result from Model
- Denoise\pretrainedCD_GandS\model.ckpt: Model saved.
- Install Tensorflow-gpu on new computer
- All software is installed, and it works.
- Main Instructions on Tensorflow
- Download Cuda
- Download cuDNN (5.1 rather than 6)
- testGPU.py: Test whether Tensorflow works on the GPU.
Jul 28
- Make the Syriac Number labels for z-vector data set.
- OldCharArray\Label\SyriacRegNum.txt: Number of Syriac Letter for handreg data
- Build CDGAN with pooling layers on handreg Syriac
Week 07: Jul 31 - Aug 06
Jul 30
- Prepare Data for handreg Syriac letters for dating model.
- DataPrepare\handleDate.py: Preprocess needed data for dating
- Syriac\DatingArray\: All train and test data for dating
Aug 01
- Build Dating Model for handreg Syriac Data set
- Dating\DatingHandreg.py: Model based on CNN for dating
- Run and save the model for CDGAN with pooling layers for handreg data set
- conditionalGAN\pretrainedCDGANhandregpooling\model.ckpt: Model saved.
Aug 02
- Build AutoEncoder based on CDGAN with pooling layers for handreg Syriac data
- AutoCDGAN\AutoCDGAN_handreg_pooling.py: Model for training CDGAN based on transfer learning.
- AutoCDGAN\pretrainedAutoCDGAN_handreg_pooling\model.ckpt: AutoEncoder Model saved.
- Generate Z-vector for handreg Syriac data in CDGAN with pooling layers
- AutoCDGAN\trainedAutoCDGAN_handreg_pooling.py: Generate z-vectors
Aug 03
- Run Dating Model
- Dating\DatingHandreg.py: Dating Model based on Conditional CNN. The result shows that it's overfitting.
- Write description for built CDGAN with pooling layers model.
Week 08: Aug 07 - Aug 13
- Test different dimensions for z-vectors and write up summaries.
Week 09: Aug 14 - Aug 20
Aug 14
- Prepare all needed scripts and data for presentation
- Create Diagram
Aug 18
- Use pretrained CDGAN model to train Combined CDGAN
- H:\Summer2017\CBCDGAN\: Scripts and model
Week 10: Aug 21 - Aug 27
Aug 21
- Use F-test to evaluate performance of Denoise and CDDenoise Model
- F test method
- DenoiseResult\DenoiseTestRaw.npy: 10000 test data's raw result from model
- DenoiseResult\CDDenoiseTestRaw.npy: 10000 test data's raw result from conditional model
- Compute F-table and F-score
- Denoise\evaluation.py: Generate F-table and F-score
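The evaluation above can be illustrated with a pixel-wise F-score. This is a minimal sketch that assumes the "F-table" is a table of true/false positive and negative counts and the "F-score" is the F1 measure; the function names, the toy pixel values, and the use of 0.3 as the binarization threshold are illustrative and are not taken from Denoise\evaluation.py:

```python
def confusion_counts(raw_output, target, threshold):
    """Binarize raw model outputs and count (tp, fp, fn, tn) against the target pixels."""
    tp = fp = fn = tn = 0
    for value, truth in zip(raw_output, target):
        predicted = 1 if value >= threshold else 0
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        elif predicted == 0 and truth == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def f_score(tp, fp, fn):
    """F1 measure: harmonic mean of precision and recall (0.0 if there are no true positives)."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: six pixels of raw output and the target stack image.
raw = [0.9, 0.4, 0.2, 0.1, 0.8, 0.35]
target = [1, 1, 0, 0, 1, 0]
tp, fp, fn, tn = confusion_counts(raw, target, threshold=0.3)
score = f_score(tp, fp, fn)
```

Running the same two functions over DenoiseTestRaw.npy and CDDenoiseTestRaw.npy would give one score per model for comparison.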
Aug 23
- Build Classifier based on CNN and test it on reg3 and contextB data
- Average Accuracy for reg3 Test Images: 0.9917
- Average Accuracy for contextB Test Images: 0.9348