Analysis of “Template-based Online Character Recognition”

Comments Made Elsewhere:

  1. Yuixang’s Blog

Summary:

The authors aim for writer-independent character recognition by creating a “class” of templates for each writer and then training a classifier on these classes.  Recognition is done against the templates after data reduction.  They discuss the two different classifiers they used.

Other attempts: primitive decomposition (break into dots, arcs, loops, etc. and recognize characters using dictionary lookup or an HMM), motor models (try to simulate movement of the hand?), elastic matching (“featureless” matching of points to points), stochastic models (extract features from points or a sliding window of points and use an HMM), time delay neural networks.

Preprocessing: Resample so the points of each stroke are equidistant, then apply a Gaussian filter to each coordinate.  End points and points of high curvature are preserved.  Characters are scaled to the same height while keeping their aspect ratio.
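
The preprocessing steps can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names, the kernel radius, and the default sigma are my own assumptions, and the smoothing only keeps the end points fixed (it does not try to preserve points of high curvature).

```python
import math

def resample(points, n):
    """Resample a polyline to n points equally spaced along its arc length."""
    # Cumulative arc length at each original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0 or n < 2:
        return [points[0]] * n
    out, j = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while j < len(points) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0 else (target - cum[j]) / seg
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def gaussian_smooth(points, sigma=1.0, radius=2):
    """Smooth x and y with a Gaussian kernel, leaving the end points untouched."""
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    last = len(points) - 1
    out = []
    for i, p in enumerate(points):
        if i == 0 or i == last:
            out.append(p)  # end points are preserved
            continue
        sx = sy = sw = 0.0
        for k, w in zip(offsets, weights):
            j = min(max(i + k, 0), last)  # clamp indices at the ends
            sx += w * points[j][0]
            sy += w * points[j][1]
            sw += w
        out.append((sx / sw, sy / sw))
    return out

def normalize_height(points, height=1.0):
    """Scale to a fixed height; one scale factor for x and y keeps the aspect ratio."""
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    h = max(p[1] for p in points) - min_y
    s = height / h if h > 0 else 1.0
    return [((x - min_x) * s, (y - min_y) * s) for x, y in points]
```

A typical call order would be resample, then smooth, then normalize, though the exact order and parameters in the paper may differ.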

Representation: Strokes are listed as x, y, and theta of curvature.  Each stroke is represented as a sequence of events, and the distance between two strokes is the sum of distances between their two sequences of events.  When two strokes, or sequences of events, don’t have the same number of events, penalties are added to the calculated distance.  This gives four different equations for the distance metric, and they go with the minimum of the four.
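
To make the idea of a distance between two event sequences concrete, here is a rough sketch. It does not reproduce the paper's four equations: I use plain dynamic time warping as a stand-in for the elastic alignment, and the angle weight and the length penalty are assumptions of mine.

```python
import math

def event_distance(a, b, angle_weight=1.0):
    """Distance between two events (x, y, theta): position gap plus angle gap."""
    (xa, ya, ta), (xb, yb, tb) = a, b
    d_ang = abs(ta - tb) % (2 * math.pi)
    d_ang = min(d_ang, 2 * math.pi - d_ang)  # wrap-around angle difference
    return math.hypot(xa - xb, ya - yb) + angle_weight * d_ang

def stroke_distance(s, t, length_penalty=0.0):
    """DTW-style sum of event distances between two non-empty event sequences,
    plus a hypothetical penalty for differing numbers of events."""
    inf = float("inf")
    n, m = len(s), len(t)
    # D[i][j] = cheapest alignment cost of the first i events of s and first j of t.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = event_distance(s[i - 1], t[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m] + length_penalty * abs(n - m)
```

In the paper the final value is the minimum over four penalized variants; a comparable effect here would be to call `stroke_distance` with several penalty settings and keep the smallest result.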

Data Reduction: (1) Cluster the “featureless” characters into a preset number K of clusters, each technically representing a writing style, and then take the medoid of each cluster. (2) Use a nearest neighbor calculation to select all examples on the edge of the training set.
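
Both reduction ideas can be sketched with nothing more than a pairwise distance function. This is only my reading of the summary: the k-medoids loop uses a naive "first K items" initialization, and for the second method I substitute Hart's condensed nearest neighbor rule, which tends to keep examples near class boundaries; the paper's actual selection procedure may differ.

```python
def medoid(items, dist):
    """The item with the smallest total distance to all other items."""
    return min(items, key=lambda c: sum(dist(c, o) for o in items))

def k_medoids(items, k, dist, max_iter=20):
    """Alternate between assigning items to the nearest medoid and re-picking medoids."""
    medoids = list(items[:k])  # naive initialization (an assumption)
    for _ in range(max_iter):
        clusters = [[] for _ in medoids]
        for it in items:
            idx = min(range(len(medoids)), key=lambda i: dist(it, medoids[i]))
            clusters[idx].append(it)
        new = [medoid(c, dist) if c else m for c, m in zip(clusters, medoids)]
        if new == medoids:
            break
        medoids = new
    return medoids

def condense(examples, dist):
    """Hart's condensed nearest neighbor over (item, label) pairs: keep an example
    only if the examples kept so far would misclassify it."""
    kept = [examples[0]]
    changed = True
    while changed:
        changed = False
        for item, label in examples:
            if (item, label) in kept:
                continue
            nearest = min(kept, key=lambda e: dist(item, e[0]))
            if nearest[1] != label:
                kept.append((item, label))
                changed = True
    return kept
```

With characters, `dist` would be the stroke-level distance summed over a character; any symmetric distance works for trying the functions out.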

Classification: (1) Used nearest neighbor again, or (2) constructed a decision tree based on a vector of the distances from each reference character.  Each of these “similarity features” (aka a comparison to a reference character) is then used as a node in the decision tree?  They expand this to a “difference feature” by comparing to two reference characters at each node to determine which one the input character is more like, which produces better classification.
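
As I understand it, the two classifiers differ mainly in how the distances are used. The sketch below is illustrative only: the function names are mine, and the single-node "tree" is a toy showing how a difference feature could drive a split, not the tree the authors trained.

```python
def nn_classify(x, templates, dist):
    """Nearest neighbor: return the label of the closest (item, label) template."""
    return min(templates, key=lambda t: dist(x, t[0]))[1]

def similarity_features(x, references, dist):
    """One feature per reference character: the distance from x to that reference."""
    return [dist(x, r) for r in references]

def difference_feature(x, ref_a, ref_b, dist):
    """Negative when x is closer to ref_a, positive when it is closer to ref_b."""
    return dist(x, ref_a) - dist(x, ref_b)

def split_node(x, ref_a, label_a, ref_b, label_b, dist):
    """A one-node decision 'tree' that splits on the sign of a difference feature."""
    if difference_feature(x, ref_a, ref_b, dist) <= 0:
        return label_a
    return label_b
```

A full decision tree would choose which reference pair to test at each node from the training data; the feature vector from `similarity_features` is what a standard tree learner would be trained on.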

Discussion:

I think their preprocessing steps would be quite useful in text-vs-shape distinction.

On page 13 second paragraph, when are they recognizing that multiple strokes belong to a character?  I assume this is for input strokes.

I’m not a fan of clustering algorithms that require the number of clusters a priori.  I know this is a hard problem, but I know there are other algorithms out there that take a more bottom-up approach and segment the clusters themselves.

I don’t understand why they’d want to select only the templates on the edge of the training set.  Is it because these are the most extreme cases?

It’s interesting to see another paper using a decision tree for classification of text.  Maybe the feature space really is that small.
