Coding Scanned Images - Questions about using EPPI-Reviewer



21/11/2017 22:12

Madeleine Greig

Joined: 12/09/2017

Posts: 3

Coding Scanned Images

Hi there!

Our team is now at a point where we would like to begin coding; however, we are running into some problems with some of our scanned PDFs. Because our review is not restricted by date, we have had to secure some of our older material via Interlibrary Loans (ILLs). ILLs are sent to us as scanned images. We realized that we could not code the text in a scanned image, so we used the OCR function in Adobe Acrobat DC to convert each image into an editable PDF. We then uploaded the new files into EPPI-Reviewer. However, for a number of these new files, the uploaded document was contorted and the text became illegible. So while we could (in theory) code the document, we could no longer read it.

If you're able to get into our review (MAiD), under the category 'Screen In on Full Text' -> Attitudes & Experiences -> Babcock "The Nurse and Euthanasia", you will see what I'm getting at. The first uploaded PDF is illegible, but we're able to code it (this is the one that underwent the OCR). The second uploaded PDF is legible, but we're not able to code it.

Any assistance is greatly appreciated.

Warmly,

Madeleine

22/11/2017 12:01

Zak Ghouze

Joined: 20/03/2017

Posts: 165

Re: Coding Scanned Images

Dear Madeleine,

Would it be possible to get the scans resent to you? Unless they are scanned well (i.e. not at an angle, flat, clear text, etc.), the character recognition is unlikely to work correctly. Looking at the PDFs you mention in your review, I can see they appear skewed and distorted. Did you receive the scanned files like that? Is it possible they were scanned with the original pages presented to the scanner at an angle, or partially folded open from a bound copy?

(What kind of output did you get from the OCR program? A text file or Word document or somesuch? It seems the PDFs I saw in your review were uploaded with the content as images contained within a PDF, rather than converted text in a PDF.)
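If it helps to check which kind of PDF you have before uploading, here is a minimal sketch in Python. It assumes the third-party pypdf package is installed; the function names and the 20-character threshold are illustrative choices of mine, not part of EPPI-Reviewer. It only reports which pages have little or no extractable text (i.e. are probably image-only); it says nothing about whether the page will display legibly.

```python
from typing import List


def pages_without_text(page_texts: List[str], min_chars: int = 20) -> List[int]:
    """Return 1-based page numbers whose extracted text is shorter than min_chars.

    Such pages are probably image-only scans with no OCR text layer.
    The threshold is an arbitrary illustrative value.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len((text or "").strip()) < min_chars
    ]


def extract_page_texts(pdf_path: str) -> List[str]:
    """Extract the text of each page using pypdf (assumed to be installed)."""
    from pypdf import PdfReader  # imported here so the helper above works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py path/to/file.pdf
    if len(sys.argv) > 1:
        missing = pages_without_text(extract_page_texts(sys.argv[1]))
        if missing:
            print("Pages with little or no text layer:", missing)
        else:
            print("Every page has some extractable text.")
```

A file where every page is listed is most likely still a plain scan and would need OCR before its text can be selected.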

There are other OCR programs available, so it may be worth trying one of those. Many will allow you to upload the PDFs you receive from the Inter-Library Loan service and then convert them online for you, e.g.:

http://www.free-ocr.com/
http://www.freeocr.net/
https://onlineocr.net/
https://ocr.space/
https://www.iskysoft.us/lp/pdf-editing/
http://www.paperfile.net/
https://www.wondershare.net/ad/pdf-editor/

Unfortunately, there’s not a lot else I can suggest. As mentioned, the reliability of the OCR conversion will depend mostly on the quality of your initial PDF files. (Conversion programs vary, but most will produce acceptable results given decent input. One of the suggestions above should be sufficient in that case, and they are definitely worth trying before you look at paid solutions.)

I hope you have more success with an alternative program, but let us know if the problems persist. Feel free to send us a sample of the files you receive via the ILL service if you are still having issues with conversion, and we can check how it converts for us. (Our support email address is EPPISupport@ucl.ac.uk.) It would also be useful if you could send us a sample of the OCR output.

Kind regards,

Zak Ghouze
