Pattern Recognition

Handwriting Recognition

.

Handwriting recognition (or HWR) is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper-documents, photographs, touch-screens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning (optical character recognition) or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface, a generally easier task as there are more clues available.

Off-line recognition -

Off-line handwriting recognition involves the automatic conversion of text in an image into letter codes which are usable within computer and text-processing applications. The data obtained by this form is regarded as a static representation of handwriting. Off-line handwriting recognition is comparatively difficult, as different people have different handwriting styles. And, as of today, OCR engines are primarily focused on machine printed text and ICR for hand "printed" (written in capital letters) text.

Off-line character recognition often involves scanning a form or document written sometime in the past. This means the individual characters contained in the scanned image will need to be extracted. Tools exist that are capable of performing this step. However, there are several common imperfections in this step. The most common is when characters that are connected are returned as a single sub-image containing both characters. This causes a major problem in the recognition stage. Yet many algorithms are available that reduce the risk of connected characters.

After the extraction of individual characters occurs, a recognition engine is used to identify the corresponding computer character. Several different recognition techniques are currently available.

Neural network recognizers learn from an initial image training set. The trained network then makes the character identifications. Each neural network uniquely learns the properties that differentiate training images. It then looks for similar properties in the target image to be identified. Neural networks are quick to set up; however, they can be inaccurate if they learn properties that are not important in the target data.

Feature extraction works in a similar fashion to neural network recognizers. However, programmers must manually determine the properties they feel are important.

On-line recognition -

On-line handwriting recognition involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. This kind of data is known as digital ink and can be regarded as a digital representation of handwriting. The obtained signal is converted into letter codes which are usable within computer and text-processing applications.

The elements of an on-line handwriting recognition interface typically include:
  • a pen or stylus for the user to write with.
  • a touch sensitive surface, which may be integrated with, or adjacent to, an output display.
  • a software application which interprets the movements of the stylus across the writing surface, translating the resulting strokes into digital text.
The process of on-line handwriting recognition can be broken down into a few general steps:
  • Preprocessing
  • Feature extraction
  • Classification
The purpose of preprocessing is to discard irrelevant information in the input data, that can negatively affect the recognition. This concerns speed and accuracy. Preprocessing usually consists of binarization, normalization, sampling, smoothing and denoising. The second step is feature extraction. Out of the two- or more-dimensional vector field received from the preprocessing algorithms, higher-dimensional data is extracted. The purpose of this step is to highlight important information for the recognition model. This data may include information like pen pressure, velocity or the changes of writing direction. The last big step is classification. In this step various models are used to map the extracted features to different classes and thus identifying the characters or words the features represent.
.

Connect With Us

description