Archive for the 'Paper Analyses' Category

Analysis of “Face Sketch Synthesis Algorithm Based on E-HMM and Selective Ensemble”

Web Link

Summary:

Has a large collection of photos and sketches of those photos.  To map the nonlinear relationship between photos and sketches, several models are generated by Embedded Hidden Markov Models (E-HMMs) which each produce a pseudo-sketch.  The author then uses a strategy he calls “selective ensemble” to produce a finer pseudo-sketch from the others.

The E-HMMs have two levels of states: super-states that represent the vertical macro-features (forehead, eyes, nose, etc.), each of which has embedded states that describe the local features.

Discussed earlier work that mapped the nonlinear relationship between photo and sketch by using patches. This involves dividing the photos and sketches into small overlapping patches and, for each patch, finding the neighbors that are similar to it, calculating a “reconstruction weight” for each neighbor, and then using these weights to sketch that patch.  A pseudo-sketch is a combination of all the patches.  However, there was a delicate balance between the size of the patches and how much they overlapped that one would have to strike to get details while avoiding artifacts.
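To make the patch idea concrete for myself, here is a minimal sketch of synthesizing one patch.  It is not the paper's method: the patches are flattened to plain lists of numbers, the training data is a toy, all the function names are mine, and I swap in simple inverse-distance weights (normalized to sum to 1) for the “reconstruction weight”, which the earlier work presumably computes in a more principled way.

```python
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(query, candidates, k):
    """Return indices of the k candidate patches closest to the query patch."""
    order = sorted(range(len(candidates)), key=lambda i: dist(query, candidates[i]))
    return order[:k]


def neighbor_weights(query, neighbors, eps=1e-9):
    """Stand-in for reconstruction weights: inverse distance, normalized to 1."""
    raw = [1.0 / (dist(query, n) + eps) for n in neighbors]
    total = sum(raw)
    return [r / total for r in raw]


def synthesize_patch(photo_patch, train_photos, train_sketches, k=2):
    """Blend the sketch patches paired with the nearest training photo patches."""
    idx = k_nearest(photo_patch, train_photos, k)
    weights = neighbor_weights(photo_patch, [train_photos[i] for i in idx])
    size = len(train_sketches[0])
    return [
        sum(w * train_sketches[i][p] for w, i in zip(weights, idx))
        for p in range(size)
    ]


# Toy training pairs (hypothetical): each photo patch has a matching sketch patch.
train_photos = [[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]]
train_sketches = [[0.0, 0.0], [2.0, 2.0], [20.0, 20.0]]

pseudo_patch = synthesize_patch([0.5, 0.5], train_photos, train_sketches, k=2)
print(pseudo_patch)
```

A full pseudo-sketch would repeat this for every overlapping patch and then blend the overlaps, which is exactly where the patch size versus overlap trade-off comes in.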

Also discussed an approach using E-HMMs and Viterbi decoding that did not work.

Discussion:

It seems like a very solid technique, but I still don’t have a corpus of images.  Not sure I could digest this paper anyway.  The patch approach made more sense to me.  He did mention a non-example-based face sketch synthesis approach that I’m looking into.

Analysis of Document on Clark’s Drawing Abilities Test from SCCGE

Web Link

This is an arbitrary Word document that I discovered on the website for the South Carolina Consortium for Gifted Education.  It is not a published work, but it has useful information on Clark’s Drawing Abilities Test (CDAT).  It mainly covers the brief history of CDAT’s development and provides a little more insight into how it works. I’ve copied a few interesting segments out of it:

“Drawing with a pencil or crayon on paper is the most frequently exercised art activity of most children and, therefore, the least intimidating art exercise for a testing situation.”

“The Scoring Criteria Scale is based upon properties of art works that teachers use for instruction and are derived from the writings of Harry Broudy (1972). These are: (1) sensory properties, (2) formal properties, (3) expressive properties, and (4) technical properties.”  - I’m particularly interested in these properties used for scoring, but it appears I need to order the test to see them in detail.

The authors had each instance of the test scored by three independent teachers, and the results had “a high degree of agreement.”

UPDATE: Found a better outline of the scoring criteria here.

Analysis of “Drawing as Visual-Perceptual and Spatial Ability Training”

Web Link

Summary:

“The suggested implication is that all students have drawing and spatial potential that may be developed through art education in general, and through drawing experiences in particular.”

Some believe that the skill of drawing is “important to the total development of a child.” (Dr. Edwards has expressed the same sentiment).

Claims that art education, in the early 20th century, was part of the “practical needs of life (craft oriented)” and helped with development of hand-eye coordination and mechanical skills.  This was replaced by a belief that art was more for enrichment, so art teachers became baby-sitters encouraging self-expression rather than instructors, which led to the misconception that artists are a select few born with some innate ability.

Apparently there are many tests for scoring children from one-to-five on their ability to draw, particularly Clark’s Drawing Abilities Test.

Another test is the Test of Visual-Perception Skills (TVPS), which is used to “evaluate mental capabilities related to spatial ability.” It is discussed, but does not appear to be revisited elsewhere in the text. It is described as “an easy-to-use assessment to determine a child’s visual perceptual strengths and weaknesses. Visual perception is an important ability that enables one to make sense out of what is seen (in contrast to visual acuity tests that determine just that something was seen by the individual).”

States how spatial ability is used in lots of tasks, fields, and professions (mathematics, technical drawing, woodwork, engineering, interior design, and of course drawing).  Claims a tie between spatial ability and intelligence.  Their question: could drawing be a way to increase intelligence?

“Since it has already been established that visual-perceptual skills can be trained and that they are synonymous with spatial skills by definition, it follows that each would benefit from the same training. Learning to draw may be one of these common areas.”

“Visual perception is learned.” (pg 6)

“Drawing, as an output of visual perception, enables the conversion of abstract visualization to concrete product.” (pg 6)

“Perceptual skill development is also recognized as a necessary skill in mental development.” (pg 6)

“Training in visual literacy/communication through visual-perception and drawing training may enhance the effectiveness of the use of [computers] whether it be developer or user.” (pg 6)

“Drawing training, because of its relationship with visual literacy and other cognitive areas, is essential to the total educational development of a child and has particular significance with respect to the ever-advancing communication technologies.” (pg 7)

Discussion:

These resources say it is possible to find out how a person is perceiving something.  They also say drawing helps improve spatial abilities, which tie heavily to many areas of work, discipline, and intelligence.

I intend to research the CDAT further for a way to computationally calculate its metrics.

While my intent for reading this paper was to find work that was counter to that of Dr. Edwards, it did lead to many great resources and provide validity to the purpose of this application.

On a reread, the summary and conclusion section is quite good.

How does this relate to my drawing application?  It affirms that the goal is helping the user “see” or visually perceive what is in front of him.

Analysis of “CRITIQUING FREEHAND SKETCHING”

Web Link (Related)

Summary:

Creates a manually-defined heuristics-based engine that checks the parameters of a user’s sketch for a floor plan.  The sketch recognition is not deep as it is mainly boxes or blobs (rooms) and lines (doors and windows). The application then runs through its heuristics to see that the labeled rooms meet the proper criteria based on where they are (e.g. “the Emergency Room should be closer to the Entrance Lobby”).

Heuristics are defined in a simple tuple form: (<requirement> <room> <room>) for example (SHOULD-BE-ADJACENT ER ICU).  These heuristics are manually input by an expert and not automatically gleaned.  This allows for text feedback when a heuristic is violated.  The system takes into account the placement of doors and arrangement of rooms.

The sketch is later converted into a simple walkthrough 3D model using VRML.
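A rough sketch of how such tuples could drive text feedback is below.  Only the tuple shape and the SHOULD-BE-ADJACENT ER ICU example come from the paper; the adjacency data, the room labels other than ER and ICU, the message wording, and every function name are my own assumptions.

```python
# Hypothetical floor plan: pairs of labeled rooms that share a wall.
ADJACENT = {frozenset(("ER", "LOBBY")), frozenset(("ICU", "WARD"))}


def is_adjacent(room_a, room_b, adjacency):
    """True if the two labeled rooms share a wall in the sketched plan."""
    return frozenset((room_a, room_b)) in adjacency


# Each requirement name maps to a check and a feedback template (both assumed).
CHECKS = {"SHOULD-BE-ADJACENT": is_adjacent}
MESSAGES = {"SHOULD-BE-ADJACENT": "{a} should be adjacent to {b}"}


def critique(rules, adjacency):
    """Return one feedback message per violated (requirement, room, room) rule."""
    feedback = []
    for requirement, room_a, room_b in rules:
        if not CHECKS[requirement](room_a, room_b, adjacency):
            feedback.append(MESSAGES[requirement].format(a=room_a, b=room_b))
    return feedback


# Expert-entered heuristics in the paper's tuple form.
rules = [
    ("SHOULD-BE-ADJACENT", "ER", "ICU"),
    ("SHOULD-BE-ADJACENT", "ER", "LOBBY"),
]

feedback = critique(rules, ADJACENT)
print(feedback)
```

Keeping the rules as plain data is what lets an expert add new ones without touching the checking code, although each new requirement type still needs its own hand-written check.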

Discussion:

This idea is similar to the domain-independent approach of LADDER, except the defined constraints are not used to recognize domain symbols but to verify the placement of those symbols.  The user is responsible for labeling the rooms.  The system only recognizes that a box or blob is a room.

How is this applicable to drawing a face? We already know certain things about the face (e.g. placement of the eyes in relation to the nose) that won’t change from face to face, but a heuristic here can be supplanted by just having the user follow a step-by-step methodology (e.g. draw the nose; now draw the eyes).  However, it would be nice to have a more generic approach so that steps do not have to be handcrafted.

Analysis of “An improved interface for tutorial dialogues: browsing a visual dialogue history”

ACM Portal Link

Summary:

“Thus, we observed that both the tutor and the student refer to prior utterances: the tutor refers to past explanations in order to point out similarities or differences to a prior problem-solving situation. The student refers to prior explanations in order to ask questions about how prior problem-solving steps relate to the current step. These observations led us to provide facilities that allow both the system and the user to make use of the dialogue history.”

Has a set of features or “facets” that it uses to determine if problems are the same or similar.  The application can then automatically refer back to them whenever the user steps through a problem to review it with the application (done after each problem is completed).

“Reminding students of all the instances of a principle and stating this principle should promote better understanding of the principle and when it can be applied than repeatedly stating the principle as each instance of it arises.”

Discussion:

It is an interesting idea to reinforce and allow review/query from the dialogue history that has been recorded between a computer and a human.  It helps to “drive the point home” and allows the interaction to feel like that of a normal tutor/student session.

How can this be applied to sketch recognition?  We are able to pull features out of strokes quite easily, but none that really deal with the user’s intention.  We can give them a task to create some primitive or collection of shapes, but when it goes into the realm of what I’m trying to do (an application that can assist you in drawing), it would appear to be tougher.  The system might have some knowledge (e.g. “You need to draw the corners on the right eye like you drew them on the left eye.”), but I’m failing to see an example where the steps would be redundant (except when drawing reference lines).

Good idea, but better suited to systems that deal with natural language processing.

Analysis of “Eye Movements in Portrait Drawing”

Review of This Article

Summary:

Used an eye tracker and brain scanning to externally and internally analyze how a single artist looked back and forth between reference and canvas when doing portrait drawing.

Defined the simple cycle that an artist repeats to create a drawing: 1. Looking at a specific detail of the model; 2. Turning towards the picture; 3. Drawing or painting the detail; 4. Looking at the picture.

“Our principal finding from the eyetracker study of Humphrey Ocean at work was that his eye movements while drawing a portrait were different from his normal eye movements.  While drawing, he made a sequence of regular single fixations on selected details of the model’s face.”

Other observations made that are interesting:

1. The capture of visual information detail by detail, rather than in a more holistic manner, is reflected in the way the drawing or painting is built up.  This proceeds systematically, by small geometric areas, gradually building up to the picture’s main elements: right eye, left eye, nose, lips, etc. Each detail and each element is of intrinsic importance.

4. Nevertheless, the eye and eye-hand skills alone cannot define the picture production process.  Other artists working from life in Humphrey’s style have similar skills and goals, yet, if asked to draw the same model, would produce entirely different portraits.  The reason for this is not how they draw, but what they draw.

5. The last observation centres on this choice: “At any given moment I will start from what I can see from where I am.  I try to achieve a likeness.  But what I want is a likeness to the reaction I have to something I can see”.

Discussion:

This is a good read to skim about how an artist works when creating a picture.  Doesn’t go much into generalizations as this is only one artist, but it’s just a good reference with a defined process of what an artist does.

Analysis of “Lines and How to Draw Them”

Summary:

This article discusses different presentations for line styles drawn on the screen.  The first style presented covers the analogy of a brush and paper, seeking to mimic the environment of a paper.  The stroke varies based on pressure and the amount of “ink” left in the brush.  A user must actually “dip” the brush to get more ink.
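Here is a toy model of that brush behavior as I understand it.  The article does not give formulas in what I noted, so the linear width rule, the drain rate, and all the names below are my own assumptions.

```python
class Brush:
    """Toy brush: width depends on pen pressure and on the ink left (assumed model)."""

    def __init__(self, max_width=10.0, capacity=1.0, drain=0.25):
        self.max_width = max_width  # width at full pressure with a full brush
        self.capacity = capacity    # ink held right after dipping
        self.drain = drain          # ink used per sample at full pressure
        self.ink = capacity

    def dip(self):
        """Refill the brush, as the user must do to get more ink."""
        self.ink = self.capacity

    def stamp(self, pressure):
        """Return the stroke width for one input sample and use up some ink."""
        pressure = min(max(pressure, 0.0), 1.0)  # clamp to [0, 1]
        width = self.max_width * pressure * (self.ink / self.capacity)
        self.ink = max(0.0, self.ink - self.drain * pressure)
        return width


brush = Brush()
widths = [brush.stamp(1.0) for _ in range(5)]
print(widths)  # the stroke thins out as the ink runs dry

brush.dip()
after_dip = brush.stamp(0.5)
print(after_dip)
```

With these made-up numbers the stroke fades to nothing after four full-pressure samples, which forces the “dip” step the article describes.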

Another style, called “skeletal strokes,” uses predefined imagery and creates strokes based on the deformation between the predefined strokes and those of the user.  I assume the user traces the example image.  It allows for creative expression, as there are different techniques for what to do with the “flesh” made by the two strokes (predefined and user).

The last style maps many parameters to the stroke information (pressure, light/dark, etc.) to also vary the thickness of strokes and shade the shape made, based on the thickness of the stroke at a region.  Interesting idea.

Discussion:

Was looking for different information, but still an interesting read…

Analysis of “HOW CHILDREN LEARN TO DRAW REALISTIC PICTURES”

Summary:

Restricted their subjects (children) to line drawings. 20 children total, ages 5-17.

Judged drawings based on two things: the type of drawing system (perspective, orthographic, oblique) used and the number of overlaps (one object covering another).

Set up a table with a few objects on it.  Created a “canonical drawing” of the scene using a “drawing machine” (clear plastic with line drawings on it).  Calculated angles and used this as the reference drawing for comparisons and for setting up the scene for new children.

Drawing systems: (1) no projection systems (object drawn but with no relationships), (2) orthographic projection (no depth to objects), (3a) angle of convergence and (3b) angle of obliquity (calculated from the angles made by drawing three straight lines on the table top)

Further divided these into: (1a) vertical oblique projection (straight back; no perspective), (1b) oblique projection (angled; no projection), (2a) naive perspective (attempts to have objects converge), (2b) perspective (actually “converges” based on an angle)

Determined number of overlaps based on the number of overlaps possible for the drawing system used.

Only 9/20 used perspective and only 3 had correct depiction of overlap.

Claim that there is no smooth transition between any of the classes, and that they’re all pretty distinct in nature.  Therefore, development takes place in “discrete stages”.

Suggests that younger children draw stereotypes, or what they think, while older children draw more what they see.

“How children learn to draw in true perspective…seems to be an open question.” (No mention of left/right brain in whole paper).

Discussion:

Uses analogy of learning to draw realistically (shapes, relationships) to learning vocabulary (words, sentences).

Statement: “Drawing can be learned just like writing, it’s just not as needed of a skill.  Just as computer aided tools exist for writing, so should others exist for drawing.”

Has an interesting way to evaluate a drawing, according to some basic unit.

Analysis of “Decoupling Strokes and High-Level Attributes for Interactive Traditional Drawing”

Summary/Discussion:

Allows a user to draw on a picture using different stroke tools and then combines that input with image analysis of the picture to produce a tonal drawing in real-time.  Most work presented is on rendering techniques and mimicking pencil lead and smudging.  More of a light and shadows thing than perspective, edges, and spaces since the use is just tracing.

Analysis of “Effect of Fidelity in Diagram Presentation”

Comment Made Elsewhere:

  1. Nabeel’s Blog

Summary:

This work seeks to contrast when in the design process and presentation of results it is best to “beautify” results, what medium to use, and what level of fidelity (exactness to the ideal in my definition) is appropriate.  Hand drawing is quicker and not restrictive, yet looks incomplete when people might want a cleaned up version for presentation.  Different levels of fidelity can also lead an audience to provide different types of feedback (font, color versus content).  Essentially hand-drawn is easy for the design stage, but people love to beautify.

Authors created four different levels of fidelity, each time refining strokes to their ideal, including group alignment and handwriting-to-font substitution. For evaluation, they created 25 different conditions based on five different web form setups for the same purpose.  Each evaluation participant was then given a low-fidelity (paper) and four higher fidelity (tablet pc) designs to make changes (marked “quality” if in line with design principles, and “expected” if looked for or “other” if not).

Found that the most changes were made to the low-fidelity designs, and the number decreased linearly as fidelity increased. Same with “quality” changes.  However, people still preferred the high-fidelity Tablet PC to the low-fidelity paper.

Discussion:

I like how for their evaluation they first made the designs compliant and then explicitly created a certain number of flaws that they could look for as a metric.

Looking at their sample conditions in Figure 1, it just appears first off that there are actually more changes to make on the low-fidelity designs, despite the medium.  They comment that alignment and spacing of elements is part of visual fidelity in the opening paragraph of section 7, and it looks like these would be at least half of the changes needed (as would be expected by prior works mentioned).  This seems too obvious for the researchers to have overlooked, though, and I must be missing something.  Were alignment and spacing part of their “quality changes”?  I would like a table listing some of these.

I also am not of the notion that it is a bad thing that one would spend 50% of their time on alignment and spacing since he knows the final version is to be a web form.  It’s obviously an important factor to the majority, so a messy design could easily lead to over-adjusting (i.e. “other changes”).  People like the high-fidelity version because most of their work was already done for them.

I do agree, though, that the computer interface is a hindrance no matter what.  The user is probably predisposed to feeling that the computer is not capable of doing anything he wanted with the design, so he’s willing to adjust to what it can do.  A good example of this escapes me though, so I guess I can’t really back this up. :)
