Analysis of “Constellation Models for Sketch Recognition”

Comments Made Elsewhere:

  1. Andrew’s Blog

Summary:

Introduction:

They use a probabilistic model (a constellation model) trained on example sketches to label input strokes.  Supposedly it can recognize strokes that vary a lot, but they still must be drawn the same way (i.e. a rectangle drawn with one stroke != another drawn with four).  Objects can have optional parts, but the mandatory parts must be present.

Other approaches used search trees, hierarchical graphs, and image-based techniques to find relationships between strokes.  In a constellation model, local features have a defined shape and global position (like an eye or a nose), and pairwise features define relations, like distance, between those features.  They adapt this to sketch recognition with a specific set of features.  Features are defined relative to the bounding box.  It only searches on the mandatory features to avoid an O(n^2) runtime, then searches for optional features among the strokes that are still unlabeled.  From the feature vectors of the training examples, they compute the mean and covariance matrices.  Labeling uses a maximum likelihood (ML) search.
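To make the feature vector and Gaussian fit concrete for myself, here is a minimal sketch in Python. It is not the authors' implementation: the choice of features (bounding-box-normalised centroids plus pairwise centroid distances), the diagonal covariance, and all function names are my own assumptions for illustration.

```python
import math

def bounding_box(strokes):
    """Bounding box (min_x, min_y, max_x, max_y) over every point in the sketch."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    return min(xs), min(ys), max(xs), max(ys)

def centroid(stroke, box):
    """Stroke centroid, normalised to [0, 1] by the sketch bounding box."""
    min_x, min_y, max_x, max_y = box
    w = (max_x - min_x) or 1.0   # guard against a degenerate box
    h = (max_y - min_y) or 1.0
    cx = sum(x for x, _ in stroke) / len(stroke)
    cy = sum(y for _, y in stroke) / len(stroke)
    return (cx - min_x) / w, (cy - min_y) / h

def feature_vector(labelled):
    """labelled: strokes in a fixed label order, e.g. [left_eye, right_eye, mouth]."""
    box = bounding_box(labelled)
    cents = [centroid(s, box) for s in labelled]
    feats = [c for pt in cents for c in pt]        # individual features
    for i in range(len(cents)):                    # pairwise features
        for j in range(i + 1, len(cents)):
            feats.append(math.dist(cents[i], cents[j]))
    return feats

def fit_gaussian(vectors, eps=1e-4):
    """Per-dimension mean and variance (diagonal covariance, an assumption)."""
    n, dims = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dims)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + eps
           for d in range(dims)]
    return mean, var

def log_likelihood(vec, mean, var):
    """Log-density of vec under the fitted diagonal Gaussian."""
    return sum(-0.5 * math.log(2 * math.pi * s) - (x - m) ** 2 / (2 * s)
               for x, m, s in zip(vec, mean, var))
```

Fit `mean, var` from the feature vectors of hand-labelled training sketches, then score a candidate labeling with `log_likelihood(feature_vector(candidate), mean, var)`. The maximum likelihood labeling is the candidate with the highest score.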

For the branch-and-bound part, run through each of the strokes with the mandatory labels, keeping track of the cost for each and using the best complete labeling found so far (once all mandatory labels are assigned) as the bound to limit searching other branches.  Use “multipass thresholding” to “cut off” branches before searching them, by using a generous threshold on the first pass and then increasing it, using a more pessimistic one, on subsequent passes.  They also used hard constraints (i.e. a nose should always be above a mouth) to limit searches.  Both of these techniques greatly reduced the amount of time required for searching.
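Here is a rough sketch of how I picture the search, again in Python and again not the paper's code. The toy face model, the squared-distance cost, the threshold values, and every name below are hypothetical. In particular, the schedule of increasing cutoffs is only my reading of “multipass thresholding”.

```python
def branch_and_bound(strokes, labels, cost, constraints=(), threshold=float("inf")):
    """Assign each mandatory label to a distinct stroke, minimising total cost.

    cost(label, stroke) must be non-negative so that a partial total is a
    valid lower bound on any completed labeling.
    """
    best = {"cost": float("inf"), "assignment": None}

    def search(i, assignment, used, total):
        if total >= best["cost"]:
            return                                  # bound: cannot beat the best
        if i == len(labels):
            best["cost"], best["assignment"] = total, dict(assignment)
            return
        label = labels[i]
        candidates = sorted((cost(label, s), k)
                            for k, s in enumerate(strokes) if k not in used)
        for c, k in candidates:
            if c > threshold:
                break                               # thresholding: cut off branch
            assignment[label] = strokes[k]
            if all(ok(assignment) for ok in constraints):   # hard constraints
                search(i + 1, assignment, used | {k}, total + c)
            del assignment[label]

    search(0, {}, frozenset(), 0.0)
    return best["assignment"], best["cost"]

def multipass_search(strokes, labels, cost, constraints=(),
                     thresholds=(1.0, 9.0, float("inf"))):
    """Retry with a larger per-label cutoff only when a pass finds no labeling."""
    for t in thresholds:
        assignment, total = branch_and_bound(strokes, labels, cost, constraints, t)
        if assignment is not None:
            return assignment, total, t
    return None, float("inf"), None

# Toy model (hypothetical): expected centroid per mandatory label, y grows upward.
MODEL = {"left_eye": (2, 8), "right_eye": (8, 8), "nose": (5, 5), "mouth": (5, 2)}

def cost(label, stroke):
    """Squared distance from the stroke centroid to the label's expected centroid."""
    mx, my = MODEL[label]
    return (stroke[0] - mx) ** 2 + (stroke[1] - my) ** 2

def nose_above_mouth(assignment):
    """Hard constraint, checked as soon as both labels are assigned."""
    if "nose" in assignment and "mouth" in assignment:
        return assignment["nose"][1] > assignment["mouth"][1]
    return True
```

Calling `multipass_search(strokes, list(MODEL), cost, [nose_above_mouth])` with strokes given as centroid tuples returns the labeling, its total cost, and the threshold of the pass that succeeded. With a finite threshold the result is only the best labeling among those that survive the cutoff, which is the trade-off that buys the speed-up.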

Discussion:

An interesting first read into doing multi-stroke recognition. Had to hit Wikipedia to learn about “branch and bound” algorithms first thing.  They did not discuss their low-level stroke recognizer, or does this guy actually just work on global position and relation to other objects? (Sure, there may be a stroke in the position where the eye is supposed to be, but did you first recognize that it looks like an eye?)  They might not care about this so as to allow for cartoon-like drawings (ex: a child draws stars for eyes).  I assume the mandatory features are also hand-labeled?  (The author did discuss having the system define these, but it wasn’t good at it).

Why did they not post any accuracy rates for their algorithm?

I get the feature vector and covariance matrix calculations, though I’d have to re-read the paper if I were to present this.
