Holley, Rose How good can it get? Analysing and improving OCR accuracy in large scale historic newspaper digitisation programs. D-Lib Magazine, 2009, vol. 15, n. 3/4. [Journal article (Unpaginated)]
English abstract
This article details the work undertaken by the National Library of Australia Newspaper Digitisation Program on identifying and testing solutions to improve OCR accuracy in large-scale newspaper digitisation programs. In 2007 and 2008 several different solutions were identified, applied and tested on digitised material now available in the Australian Newspapers Digitisation Program beta service http://ndpbeta.nla.gov.au/ndp/del/home. This article gives a state-of-the-art overview of how OCR software works on newspapers, factors that affect OCR accuracy, methods of measuring accuracy, methods of improving accuracy, and testing methods and results for specific solutions that were considered viable for large-scale text digitisation projects.
Item type: | Journal article (Unpaginated) |
---|---|
Keywords: | OCR accuracy, Optical Character Recognition, Historic Newspapers, OCR text correction |
Subjects: | L. Information technology and library technology > LZ. None of these, but in this section; J. Technical services in libraries, archives, museum. > JG. Digitization |
Depositing user: | Rose Holley |
Date deposited: | 30 Mar 2009 |
Last modified: | 02 Oct 2014 12:13 |
URI: | http://hdl.handle.net/10760/12908 |