Syriac Project

Revision as of 11:32, 14 June 2019

Week 01

May 28

  • read some related papers to get a general sense of the field:
  • T. Bluche, H. Ney, J. Louradour and C. Kermorvant, "Framewise and CTC training of Neural Networks for handwriting recognition," 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Tunis, 2015, pp. 81-85.
Shows that CTC is similar to forward-backward training of hybrid NN/HMM systems and can be extended to more standard HMM topologies;
  • P. P. Sahu et al., "Personalized Hand Writing Recognition Using Continued LSTM Training," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 218-223.
Demonstrates simple but robust continued-training techniques for adapting a pre-trained model, built on a Long Short-Term Memory (LSTM) network with a Connectionist Temporal Classification (CTC) loss function, to a specific user's writing style;
  • W. Hu et al., "Sequence Discriminative Training for Offline Handwriting Recognition by an Interpolated CTC and Lattice-Free MMI Objective Function," 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, 2017, pp. 61-66.
Finds the CTC+LFMMI approach very effective and helpful for punctuation-sensitive scenarios such as handwritten receipt recognition.
  • V. Pham, T. Bluche, C. Kermorvant and J. Louradour, "Dropout Improves Recurrent Neural Networks for Handwriting Recognition," 2014 14th International Conference on Frontiers in Handwriting Recognition, Heraklion, 2014, pp. 285-290.
Applies dropout (a recently proposed regularization method for deep architectures) to RNNs; confirms the effectiveness of dropout on deep architectures even when the network mainly consists of recurrent and shared connections;
  • Z. Xie, Z. Sun, L. Jin, H. Ni and T. Lyons, "Learning Spatial-Semantic Context with Fully Convolutional Recurrent Network for Online Handwritten Chinese Text Recognition," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 8, pp. 1903-1917, 1 Aug. 2018.
Proposes a multi-spatial-context fully convolutional recurrent network (MC-FCRN) to exploit multiple spatial contexts from the signature feature maps and generate a prediction sequence while completely avoiding the difficult segmentation problem; develops an implicit language model that makes predictions based on semantic context within the predicted feature sequence, providing a new perspective for incorporating lexicon constraints and prior knowledge about a language into the recognition procedure.

May 29 - 30

  • Installed Anaconda, OpenCV, and TensorFlow;
created a Python 3.6 environment inside the Python 3.7 installation (TensorFlow does not support Python 3.7);
read the Anaconda documentation;
got familiar with Jupyter notebooks.
  • Explored the terminal;
learnt about paths;
learnt how to switch versions of Python/TensorFlow on Mac;
learnt to use command-line arguments in a Python program (see the sketch at the end of this entry);
Command Line Arguments in Python.
  • Tried out a demo NN implementation;
Build a Handwritten Text Recognition System using TensorFlow;
fixed the 'placeholder' attribute error caused by TensorFlow 2.0; solution: reinstalled TensorFlow 1.12;
fixed an import error; solution: opened IDLE from within the virtual environment.
  • helpful tutorials:
machine learning tutorial by Hung-yi Lee;
read the first part of the lecture notes: introduction to deep learning;
Training and Testing on our Data for Deep Learning.
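
A minimal sketch of the command-line-arguments idea from the second item above (the file and argument names here are made up for illustration):

  # cli_demo.py - minimal sketch of command-line arguments in Python (hypothetical names)
  import argparse

  parser = argparse.ArgumentParser(description="demo of command-line arguments")
  parser.add_argument("folder", help="path of a folder to process")
  parser.add_argument("--verbose", action="store_true", help="print extra output")
  args = parser.parse_args()

  print("folder:", args.folder)
  if args.verbose:
      print("verbose mode on")

Run as, e.g., python cli_demo.py gwords --verbose.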

May 31

  • Set up the system in Ford 342:
installed Python and TensorFlow 1.12;
got the IAM dataset for training (opening the large tar file on Windows took a long time).

June 1

  • Installed TensorFlow GPU:
ran into an import error when testing the installation;
issue: it seems to be a version compatibility problem;
started over, following How to Install TensorFlow GPU on Windows - FULL TUTORIAL;
looked through the deep learning SDK documentation link.

June 2

  • went over the installation process once again but got the same error when importing TensorFlow :(

Week 02

June 3

  • Trained the IAM dataset on the CPU in Ford 342;
I was curious about the training process, so here are some notes:
11:30 | Epoch 01 | Character error rate: 17.056961%. Word accuracy: 61.686957%.
13:37 | Epoch 11 | Character error rate: 12.676747%. Word accuracy: 69.617391%.
15:40 | Epoch 21 | Character error rate: 11.097730%. Word accuracy: 73.060870%.
..........| Epoch 24 | Character error rate: 10.486641%. Word accuracy: 74.295652%.
.......... Character error rate improved, save model
16:29 | Epoch 25 | Character error rate: 10.531246%. Word accuracy: 74.434783%.
.......... Character error rate not improved
16:45 | Epoch 26 | Character error rate: 10.491101%. Word accuracy: 74.313043%.
.......... Character error rate not improved
  • Gained a better understanding of deep learning with neural networks:
The artificial neural network is a biologically inspired approach to machine learning, intended to mimic the brain (a biological neural network). The idea has been around since the 1940s and has had a few ups and downs, most notably relative to the Support Vector Machine (SVM). Neural networks were popular until the mid-90s, when it was shown that the SVM, using the "Kernel Trick" (a technique new to the public, though thought up long before it was put to use), could handle non-linearly separable datasets. With that, the SVM catapulted back to the front, leaving neural nets behind, and little of interest happened until about 2011, when deep neural networks began to take hold and outperform the SVM thanks to new techniques, the availability of huge datasets, and much more powerful computers. reference
  • Covered some basics of what TensorFlow is and began using it
read the article "What is TensorFlow";
went through TensorFlow Basics;
wrote a mini TensorFlow program and ran it (see the sketch below);
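
The mini program itself isn't reproduced in the notebook; the following is only a small sketch in the TensorFlow 1.x graph-and-session style this project uses (it also shows the tf.placeholder call that was later removed in TensorFlow 2.0):

  # mini_tf.py - small TensorFlow 1.x sketch (not the original mini program)
  import tensorflow as tf   # assumes TensorFlow 1.12

  x = tf.placeholder(tf.float32, shape=[None, 2])   # input fed in at run time
  w = tf.Variable([[1.0], [2.0]])                   # a trainable weight matrix
  y = tf.matmul(x, w)                               # a simple linear operation

  with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      print(sess.run(y, feed_dict={x: [[3.0, 4.0]]}))   # prints [[11.]]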

June 4

  • finished training on the IAM dataset
| Epoch 40 | Character error rate: 10.379589%. Word accuracy: 74.643478%.
Character error rate not improved
No more improvement since 5 epochs. Training stopped.
  • found out how to convert our own data so that the model can recognize the text; link
  • continued working on the deep learning with NN and TF tutorial series
watched the demos in:
Part 4: built the model for the neural network and set up the computation graph with TensorFlow;
Part 5: set up the training process, which is what will be run in the TensorFlow Session;
the basics of RNNs: link
  • read about previous work on classwiki
  • understood HTR better by reading:
U.-V. Marti and H. Bunke, "Text line segmentation and word recognition in a system for general writer independent handwriting recognition," Proceedings of the Sixth International Conference on Document Analysis and Recognition, Seattle, WA, USA, 2001, pp. 159-163. click here
  • installed Meltho Fonts

June 5

  • familiarized myself with MATLAB
rotated the 2223 grayscale images in H:\SyriacGenesis\gwords\a01\a01-000u by +10 and -10 degrees (a rough Python sketch of the idea appears at the end of this entry)
  • folder path for rotated copies:
H:\SyriacGenesis\gwords\a01\rot10_a01-000u
H:\SyriacGenesis\gwords\a01\rot(-10)_a01-000u
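
The rotation itself was done in MATLAB; a rough Python/OpenCV sketch of the same idea (the paths below mirror the ones above, and details such as border handling may differ from what MATLAB did) would be:

  # rotate_words.py - rough Python/OpenCV sketch of the +/-10 degree rotation step
  import os
  import cv2

  src_dir = "gwords/a01/a01-000u"   # original grayscale word images
  for angle, dst_dir in [(10, "gwords/a01/rot10_a01-000u"),
                         (-10, "gwords/a01/rot(-10)_a01-000u")]:
      os.makedirs(dst_dir, exist_ok=True)
      for name in os.listdir(src_dir):
          img = cv2.imread(os.path.join(src_dir, name), cv2.IMREAD_GRAYSCALE)
          if img is None:
              continue   # skip anything that isn't an image
          h, w = img.shape
          m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
          rot = cv2.warpAffine(img, m, (w, h), borderValue=255)   # white background
          cv2.imwrite(os.path.join(dst_dir, name), rot)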

June 6

  • trained model against Syriac data:
Character error rate: 7.534247%. Word accuracy: 75.000000%.
  • prepared more data for training:
learnt about batch renaming of files in a directory;
made a Python program that takes a folder name as a command-line argument and renames all of its files (see the sketch at the end of this entry);
edited word.txt using Python and made it ready for training;
successfully added the modified data (+10/-10) to the training examples and started training.
  • path for folder 'a02', 'a03', and new 'word.txt':
H:\SyriacGenesis\gwords
  • finished training with modified images at Epoch 75
Character error rate: 4.618938%. Word accuracy: 84.333333%.
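
The renaming script mentioned above isn't reproduced in the notebook; a minimal sketch of the idea (the prefix scheme below is a made-up example) could look like:

  # rename_words.py - sketch of batch-renaming every file in a folder passed on the command line
  import os
  import sys

  folder = sys.argv[1]                 # e.g. "a02", taken from the command line
  prefix = os.path.basename(os.path.normpath(folder))
  for name in sorted(os.listdir(folder)):
      new_name = prefix + "-" + name   # e.g. a02-word01.png (hypothetical scheme)
      os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
      print(name, "->", new_name)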

June 7

  • made a list of ways to improve recognition accuracy;
  • tried to install TensorFlow GPU on the GPU server:
tried CUDA versions 9.0, 9.1, and 10.0;
tried TensorFlow GPU 1.12 and 1.13;
  • prepared more data and trained the model:
a04-000u: enlarged the images to 1.5 times their size and tilted them by +5 degrees
a05-000u: enlarged the images to 1.5 times their size and tilted them by -5 degrees
  • path for folder 'a04-000u', 'a05-000u', and new 'word.txt':
H:\SyriacGenesis\gwords


Week 03

June 10

  • updated the model with:
Character error rate: 1.184433%. Word accuracy: 95.818182%.

NRH>>> Awesome! Quick question: is this training error or test set error?

The dataset is split into 95% of the samples used for training and 5% for validation. This is the validation set error rate, which provides an estimate of the test error rate.
(how experts in the field of machine learning define train, test, and validation datasets: https://machinelearningmastery.com/difference-test-validation-datasets/)
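
As a small illustration of that split (not the exact loader code), a 95%/5% division over a list of samples might look like:

  # sketch of a 95% / 5% train-validation split
  import random

  samples = [("img_%03d.png" % i, "label%d" % i) for i in range(100)]   # placeholder (path, text) pairs
  random.shuffle(samples)
  cut = int(0.95 * len(samples))
  train_samples = samples[:cut]    # 95% used for training
  val_samples = samples[cut:]      # 5% held out; its error rate estimates the test error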
  • read the CUDA toolkit documentation at https://docs.nvidia.com/cuda/archive/9.1/cuda-installation-guide-microsoft-windows/index.html
  • successfully trained the model on GPU (finally!!!)
same dataset but different validation set error rate on a different PC:
Character error rate: 0.803723%. Word accuracy: 97.454545%.
  • working on a shear transformation to enlarge the dataset
added a shear of -0.25 degrees at size 0.8 (see the sketch below)
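
The shear was applied in MATLAB; the sketch below is a rough Python/OpenCV reading of the parameters (shear factor -0.25, scale 0.8), so the exact numbers may not match what MATLAB did:

  # shear_word.py - rough sketch of scaling a word image to 0.8 and shearing it horizontally
  import cv2
  import numpy as np

  img = cv2.imread("word.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input image
  img = cv2.resize(img, None, fx=0.8, fy=0.8)          # "size 0.8"
  h, w = img.shape
  shear = -0.25
  m = np.float32([[1, shear, max(0.0, -shear) * h],    # translate so the sheared content stays in frame
                  [0, 1, 0]])
  out = cv2.warpAffine(img, m, (w + int(abs(shear) * h), h), borderValue=255)
  cv2.imwrite("word_sheared.png", out)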

June 11

  • tried to set up a second tensorflow-gpu environment:
sadly it wasn't successful :(
  • Image Data Augmentation (see the sketch at the end of this entry):
Added salt and pepper noise, with a noise density of 0.1, to the images. --> a7
Added Gaussian white noise with mean 0.1 and variance 0.01. --> a6 (replaced the shear transformation)
Character error rate: 0.759878%. Word accuracy: 97.066667%.
PS: At first, the word accuracy dropped by 10% after these 2 sets of images were added.
One possibility is that the data augmentation strategy adds some bias to the examples that doesn't match the original examples.
I also think data augmentation may be more effective when training from scratch than when fine-tuning.
So I replaced the shearing set with a Gaussian white noise set, because I had only sheared the images in one direction, which may introduce bias.
Then I deleted the files in the model/ directory and trained the model from scratch.
two sources I found helpful:
some possible reasons why training accuracy would decrease over time: https://www.quora.com/What-are-some-possible-reasons-why-training-accuracy-would-decrease-over-time
Why data augmentation leads to decreased accuracy when finetuning: https://discuss.pytorch.org/t/why-data-augmentation-leads-to-decreased-accuracy-when-finetuning/3600
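
The noise parameters above read like MATLAB imnoise arguments; a rough NumPy sketch of the two kinds of noise (an approximation, not the exact MATLAB behaviour) would be:

  # add_noise.py - rough sketch of salt & pepper and Gaussian noise for grayscale word images
  import numpy as np

  def salt_pepper(img, density=0.1):
      # flip roughly a `density` fraction of pixels to black or white
      out = img.copy()
      mask = np.random.rand(*img.shape)
      out[mask < density / 2] = 0          # pepper
      out[mask > 1 - density / 2] = 255    # salt
      return out

  def gaussian_noise(img, mean=0.1, var=0.01):
      # add Gaussian noise on a 0..1 scale, in the spirit of imnoise(I, 'gaussian', m, v)
      noisy = img.astype(np.float32) / 255.0 + np.random.normal(mean, np.sqrt(var), img.shape)
      return np.clip(noisy * 255.0, 0, 255).astype(np.uint8)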

June 12

  • After adding two sets of shear and two sets of noise:
Character error rate: 0.298851%. Word accuracy: 98.900000%.
  • ran Genesiswork.m:
found missing functions and added their folders with addpath();
read the comments, ran chunks, and tried to see how each one works;
got a general view of the process: 1) load page, 2) binarize, 3) blocks, 4) lines (a rough sketch of the binarize step appears below)
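
Genesiswork.m itself is MATLAB and isn't reproduced here; purely as an illustration of the "binarize" step in that pipeline, an Otsu threshold in Python/OpenCV looks like:

  # binarize_page.py - illustrative Otsu binarization of a page (not Genesiswork.m itself)
  import cv2

  page = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)   # hypothetical scanned page
  _, binary = cv2.threshold(page, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
  cv2.imwrite("page_bw.png", binary)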

June 13

  • selected 15 pages from Binarized Manuscripts and found their copies in Raw Manuscripts;
  • working on finding the corner coordinates of the binarized manuscripts:
tried to apply the functions in Genesiswork.m to the binarized manuscripts;
but the borders weren't cropped properly this way;
switched to using ginput to crop manually (see the sketch at the end of this entry);
decided to go over some MATLAB basics.
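
ginput here is the MATLAB function; matplotlib has an analogous ginput, so a rough Python sketch of clicking the four corners and cropping (hypothetical file names) would be:

  # pick_corners.py - rough Python analogue of cropping with ginput
  import cv2
  import matplotlib.pyplot as plt

  page = cv2.imread("binarized_page.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input
  plt.imshow(page, cmap="gray")
  corners = plt.ginput(4)          # click the four corners of the region to keep
  plt.close()

  xs = [int(x) for x, y in corners]
  ys = [int(y) for x, y in corners]
  cropped = page[min(ys):max(ys), min(xs):max(xs)]
  cv2.imwrite("cropped_page.png", cropped)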