The world changes every day, and so does technology. Librarians and archivists now find themselves facing the prospect of digitization. Digitization is the process of converting analogue signals or information of any form into a digital format that can be understood by computer systems or electronic devices. It can be an important element in protecting originals from excessive handling and repeated copying. Digitized information is easier to store, access and transmit, and digitization is used by a number of consumer electronic devices. The digitization process can use any method, such as data capture, to collect information and then change it into a form that can be read and used by a computer. Data capture is the method of putting a document into an electronic format. Many organizations implement it to automatically identify and classify information and to make that information available within particular systems.
It takes document content, in any format, and converts it into something that a computer can process. Examples of automated data capture systems are OCR, OMR, and ICR. One of the functions of data capture is to make information easy for the user to find, so it does not matter how fast a document has been captured if the user cannot extract data, append metadata or integrate it with other content.
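To make the point about extracted data and metadata concrete, here is a minimal sketch of a captured document that can be found again by keyword. It is only an illustration under stated assumptions: the `CapturedDocument` class, the `build_index` and `search` helpers, and the sample records are all hypothetical and do not describe any particular capture system.

```python
from dataclasses import dataclass, field


@dataclass
class CapturedDocument:
    """A captured document: extracted text plus descriptive metadata."""
    doc_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = {}
    for doc in documents:
        # Index both the extracted text and the metadata values.
        searchable = [doc.text] + [str(value) for value in doc.metadata.values()]
        for raw_word in " ".join(searchable).lower().split():
            word = raw_word.strip(".,;:!?()\"'")
            if word:
                index.setdefault(word, set()).add(doc.doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents that contain the term."""
    return sorted(index.get(term.lower(), set()))


# Hypothetical sample records, invented for this example.
docs = [
    CapturedDocument("inv-001", "Invoice for binding supplies",
                     {"type": "invoice", "year": 1998}),
    CapturedDocument("let-002", "Letter about the binding workshop",
                     {"type": "letter", "year": 2001}),
]
index = build_index(docs)
print(search(index, "binding"))
print(search(index, "1998"))
```

A document captured as an image alone would contribute nothing to such an index, which is why speed of capture matters less than what can be extracted and attached afterwards.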
Capturing data without searchability severely limits users' ability to know what data the organization has and where to find it. Data is usually captured from paper-based forms, and there are two ways to capture it: manually or automatically. Manual data capture, or rekeying, is used when the materials cannot be captured automatically because they are in poor repair or contain handwritten notes or additions that are impossible for a machine to read. If handwriting does not OCR well, it needs to be rekeyed by hand. Rekeying is the process of taking a document and physically typing the information it contains directly into a word processor. It is extremely involved and time consuming. If a document or a document project needs to be rekeyed, extra time must be allotted for rekeying the text and for proofreading.
When rekeying text, open a word processing program and type the text exactly as it appears in the document, taking care to preserve the structure and content of the original as closely as possible. The work should be done carefully, without hurrying or feeling rushed. Rekeying is a long, slow process and should only be performed when necessary. The automated approach is called automatic identification and data capture (AIDC); its purpose is to identify, verify, record, communicate and store information on discrete, packaged or containerized items.
Technologies considered part of AIDC include bar codes, magnetic stripes, Optical Character Recognition (OCR), Intelligent Character Recognition (ICR), Optical Mark Recognition (OMR) and others. These technologies are capable of performing automatic data capture. Modern technology allows data capture to be quick, accurate and reliable. Automatic data capture is a technology-driven solution to document processing. Earlier technology was unable to process forms such as invoices accurately because of the variety of fields that such documents contain.
The data is then saved electronically for access at a later point, which makes document work more efficient and convenient. For instance, to convert a document image to electronic text, OCR software can be used to bring up the TIFF image of the scanned document, select the necessary text portion and put it into a format where the text can be edited for accuracy and usability.
For the record, OCR recognizes text and characters in scanned PDF documents (including multipage files), photographs and images captured with a digital camera. OCR takes archival TIFF images and converts them into readable, editable text. OCR does not change or alter the image files; instead, a program reads the text in an image and creates text files that are used for long-term storage and for marking up the document for online viewing.
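The idea that OCR leaves the archival image untouched and writes its output to a separate text file can be sketched as follows. This is an illustration only, not a real OCR workflow: `ocr_to_sidecar`, `file_digest` and `fake_recognize` are hypothetical names, `fake_recognize` merely stands in for an OCR engine, and the demo file holds placeholder bytes rather than a real TIFF.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def ocr_to_sidecar(image_path, recognize):
    """Run OCR on an image and store the result in a separate .txt file.

    `recognize` is a hypothetical callable standing in for a real OCR
    engine: it takes the image bytes and returns the recognized text.
    The image file is only read, never rewritten.
    """
    image_path = Path(image_path)
    before = file_digest(image_path)
    text = recognize(image_path.read_bytes())
    text_path = image_path.with_suffix(".txt")
    text_path.write_text(text, encoding="utf-8")
    # Confirm that the archival image was left untouched.
    if file_digest(image_path) != before:
        raise RuntimeError(f"{image_path} changed during OCR")
    return text_path


def fake_recognize(image_bytes):
    """Placeholder engine for the demo; a real one would analyse the pixels."""
    return "Minutes of the library committee"


with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "page_001.tif"
    image.write_bytes(b"placeholder image bytes")  # not a real TIFF
    sidecar = ocr_to_sidecar(image, fake_recognize)
    print(sidecar.name, "->", sidecar.read_text(encoding="utf-8"))
```

Keeping the text in its own file beside the image means the text can be corrected or marked up later without ever touching the archival master.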