Recognition Server Review: Unlocking a National Treasure with OCR Conversion

If there’s any downside to becoming a nationally renowned library, it’s the practical challenge of making such a volume of information quickly and easily accessible. That’s exactly what the National Assembly Library (NAL) of Korea discovered during its five-decade expansion from a legislative repository into a landmark institution.

With a rather staggering 160 million pages of digital content in image-only formats, a tremendous national resource remained almost inaccessible (particularly to blind patrons) until NAL leadership undertook an initiative to make all information searchable.

Under the guidance of a domestic consulting firm, NAL selected ABBYY Recognition Server for its outstanding OCR quality across various languages and formats and for its compelling value proposition.

Recognition Server Review: Engineering a Better Document Archive

In 2008, the Chinese government mandated that the Nanjing Hydraulic Research Institute (NHRI) digitize six decades of hydraulic testing documents. Having been allowed only two years to complete what it considered a five-year project, the NHRI embarked on an eager search that ultimately led it to Recognition Server.

Although there were only around 6,000 documents in total, they presented some unusual challenges. First and foremost, they were not mere artifacts of past work, but contained critical data that were regularly accessed to support flood prevention projects. Furthermore, each document contained several hundred pages, with a mix of digits and lines that is difficult for software to read. And needless to say, the Chinese text ruled out several OCR applications from consideration.

As is clear from the sample document reproduced below, it would take a unique application to meet NHRI’s needs.

Recognition Server Review: An Enterprise OCR Solution Takes Root

In its history of more than three centuries, the Royal Botanic Garden Edinburgh (RBGE) has established itself as a paragon of botanical research and education. The data and specimens in its herbarium cover between half and two-thirds of the world’s plant life: around three million specimens in all!

With such a volume of disparate records, manual data entry was failing to provide the accuracy and completeness that scientists and hobbyists alike demanded of this premier institution. To complicate things further, records were multilingual, often several centuries old, and composed of both typed and handwritten text. RBGE’s priorities thus became clear:

  • The greatest possible precision across languages, to preserve the scientific value of its records
  • Sufficient processing speed to handle millions of complex documents in a reasonable, useful timeframe
  • The ability to handle various barcode formats (and their absence)
  • Compatibility with an existing image repository

Looking to the experiences of other large libraries and scientific organizations, the RBGE team reviewed and eventually selected ABBYY Recognition Server as the most promising solution to its unusual OCR conversion needs.

Recognition Server Review: Great Price-to-Performance Makes Short Work of Library Digitization

The library system of the University of Southampton, an elite English institution, holds over 1.5 million books plus a far larger number of other documents. For its Library Digitisation Unit (LDU), the project of converting all this information to electronically accessible formats was daunting. Staff quickly recognized the necessity of an enterprise-scale OCR server to meet the goal of converting half a million pages per year from an extraordinarily wide variety of documents.

The LDU employed Recognition Server on everything from doctoral theses to 19th-century parliamentary proceedings to a unique collection of antique books on knitting. Using just six book scanners and one line scanner to import text, the solution produced over two million digital documents, making rare and valuable works available to researchers worldwide.

FlexiCapture Review: The “Nuclear Option” for Automated Data Capture

The China General Nuclear Power Group consists of dozens of entities whose specialties include everything from roads to power plants. It is no surprise, then, that CGNPG suffered from a large and still rapidly increasing number of paper documents. They were costly in terms of physical storage, slow and difficult to access (let alone to search), and frighteningly vulnerable to loss. Management knew something had to change, an insight which eventually led it to adopt ABBYY FlexiCapture.

Manual entry never made the shortlist of solutions, since human input was too expensive and error-prone to justify the risk. Basic OCR applications were also non-starters, as they fell short on handwriting recognition and business logic capabilities.

CGNPG then turned its attention to more sophisticated digital solutions, and spent four months evaluating every available tool against its difficult criteria:

  • Could it capture handwritten text?
  • Could it not only recognize but also extract information?
  • Could it automatically classify documents?
  • Could it serve as a single point of entry for all documents?
  • And, obviously, could it accurately do all the above for Chinese-language text?
