dc.description.abstract |
In optical character recognition (OCR), visible characters appearing in images (e.g. of printed paper) are recognized as symbolic characters (in ASCII) and stored in the memory of a computer or a similar device such as a mobile phone. Modern high-end camera phones can capture color images of at least 1280 × 960 pixels, which is usually adequate for performing OCR. Having OCR software available on a camera phone has obvious benefits: it can be used to make quick translations of foreign words or to store notes from lecture slides or notice boards space-efficiently. As the technology improves, it can become an invaluable tool for visually challenged people. As demand for mobile phone applications grows, research in OCR, a technology well developed for scanned documents, is shifting its focus to the recognition of text embedded in digital photographs. In this project, titled “Text recognition on Android based mobile devices”, we present a solution to the printed text recognition problem, accompanied by feature-rich applications that make purposeful use of the recognized text. The text recognition phase of the project combines image processing, feature extraction and machine learning (in our case, an MLP (Multi Layer Perceptron) based approach).
First, preprocessing operations are performed on the input image, followed by segmentation of the text into entities such as textual lines, words and characters. Characters left broken after this phase are reconstructed using static and dynamic classifiers. Well-defined individual characters are then sent to our classification module, which classifies each input character and outputs it in ASCII based on its match against the training set data. In the post-processing phase, all characters converted to ASCII are grouped together to form complete words and textual lines.
The system is trained with at least 20 training samples for each character. Multiple training sessions, following an iterative testing and evaluation process, generated the final training data file, which was then used to classify input characters in our project. The project puts the recognized text to use in six applications: text to speech, translation of text, transliteration of text, SMS, email, and automatic addition of new contacts from business cards. The results of the testing and evaluation conducted on the product are extremely promising. |
en_US |