Version 188.8.131.52 adds support for PDF coding: it is now possible to apply codes to PDF text directly. This also required updating the reporting features, so that PDF coding can be included in reports when needed.
Previous versions of EPPI-Reviewer allowed you to apply codes to the text extracted from a wide variety of document types (inductive coding). As only the simple text was extracted, all formatting and layout information was lost in the process, making inductive coding relatively difficult. We are pleased to announce that codes can now be applied directly to PDF documents. It is still possible to assign codes to the extracted text for all document types.
Adding Codes to PDF text
From the "Document Details" window, "Citation Details" tab, click on the "View" button for the required PDF (if needed, please refer to the user manual for instructions on how to upload PDFs). This opens the "PDF" tab and loads the document in the PDF viewer.
The "Current Code:" box in the top left corner of the viewer shows the name of the currently selected code. To change the code, click on the desired code in the codes tree (left column). The name of the selected code will appear in the box, and the text already assigned to that code will be loaded automatically and highlighted in yellow.
To apply the current code to some text, select the text by dragging the mouse over the PDF document and click on the "Add code to selected text" icon (next to the "Current Code:" box). Alternatively, press the "A" key on the keyboard. The changes are saved automatically, and the selected text will be highlighted in yellow. If no text or no code is selected, the system will ignore the command.
To remove a code from already coded text, select the relevant text and click the "Remove Code from Selected Text" button or press the "D" key. The system will ignore selected text that is not currently assigned to the Current Code and remove the code only from the appropriate text.
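For readers who like to see the behaviour spelled out, the removal rule above can be sketched as a small interval subtraction. This is only an illustration: representing coded text as half-open (start, end) character offsets and the function name remove_code are assumptions made for the example, not a description of how EPPI-Reviewer stores PDF coding internally.

```python
def remove_code(coded_spans, selection):
    """Return the coded spans left after removing the selected range.

    Spans and the selection are hypothetical half-open (start, end)
    character offsets within one page.
    """
    sel_start, sel_end = selection
    remaining = []
    for start, end in coded_spans:
        # The selection does not touch this span: keep it unchanged.
        if end <= sel_start or start >= sel_end:
            remaining.append((start, end))
            continue
        # Keep the coded part that lies before the selection, if any.
        if start < sel_start:
            remaining.append((start, sel_start))
        # Keep the coded part that lies after the selection, if any.
        if end > sel_end:
            remaining.append((sel_end, end))
    return remaining
```

In this sketch, selecting a range that covers both coded and uncoded text only shortens or splits the coded spans it overlaps; uncoded text in the selection is simply ignored.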
As PDF documents may have a very complex internal representation of text and layout, it is possible to confuse the system and accidentally select unwanted and/or hidden text. This can cause a problem, as the user would be unable to select the hidden text again in order to remove the unnecessary coding. To overcome this problem, you can use the "Reset..." button:
- Make sure that the required code is selected and click "Reset..."; the "Reset Pdf Coding" window will open.
- The window shows the current code and the current page number; you can choose to remove the code from the current page or from the entire document.
- Click the "Remove Code" button to remove the code from the PDF document (this does not delete the code from the codes tree). The window will close and the PDF viewer will refresh the coding information.
PDF Document Annotations
PDF annotations have been available in EPPI-Reviewer since version 184.108.40.206. These are short comments that reviewers may add to a PDF document. They cannot, and should not, be used to hold coding information, as they can't be searched, exported or included in reports. In the version published today, we have changed how annotations are shown: instead of being "sticky notes" that can appear anywhere on the page (and usually hide significant amounts of text), they are now confined to a semi-transparent Annotations Column on the right-hand side. This is similar to how comments and tracked changes are shown in Microsoft Word.
- Users can Show or Hide the Annotations column (it is normally hidden) via the "Show/Hide Annotations" button.
- To add an annotation, double click anywhere on the Annotations Column, type your text in the "New annotation" window and click save.
- To edit or delete an existing annotation, double click the annotation and apply your changes through the "Edit Annotation" window.
- Annotations can be dragged up and down within a given page, but they can't be moved across pages.
Annotations already present in EPPI-Reviewer have been converted to the new format and will appear in the correct position (i.e. height).
PDF tab: other changes and known limitations
The system that we use to display PDFs has changed radically from the previous version. As a result, it is now possible to print the PDF directly from EPPI-Reviewer and to search text within the PDF viewer. Note that the Print function will print neither the PDF-coding highlighting nor the Annotations Column.
In order to display PDF documents within EPPI-Reviewer, the PDF is converted in real time to a format that can be used within Silverlight applications. This conversion is not 100% reliable, and errors may occur:
- If a PDF document does not follow the official PDF specifications, the conversion may fail (producing a blank document) or show some parts incorrectly.
- Certain font types may be unrecognisable, and will be shown as meaningless symbols or not shown at all.
- Some encrypted documents may not be converted. All standard encryption methods are supported. However, PDF documents can be encrypted using non-standard, custom systems. These are, by definition, unpredictable and therefore will never be supported.
- PDF documents that are made up of scanned images will display correctly, but since they contain no real text (only a visual representation of the scanned pages), both the PDF-coding and the extracted text functions will be unavailable. This was also true of all previous versions.
According to our tests, almost all documents will display correctly in the new system and there is a small improvement compared to the previous versions.
To include PDF coding in EPPI-Reviewer reports, a number of changes have been applied to the components used to generate reports. Coded text can appear in two types of reports: the main ones, configured in the "Reports" tab (main screen), and the one produced by the "Report: all text coded with this code" option that appears when right-clicking a code in the "Document Details" window.
To preserve consistency and make the two sources easy to tell apart, text coded from the extracted text (the "Text document" tab) and text coded from the PDF tab are shown in different formats in both types of reports:
- Text coded from extracted text will be preceded by the "Characters X to Y:" text, and will use the Courier New font (frequently used to represent simple text).
- Text coded within a PDF document will be preceded by a "Page: N" indication, and will appear in italics.
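As a rough illustration of the two formats described above, the sketch below renders a piece of coded text as an HTML fragment. Everything in it is an assumption made for the example: the function name render_coded_text, the use of HTML, the exact spacing after the prefix, and the choice to style only the coded text rather than the prefix. It is not the code EPPI-Reviewer uses to build reports.

```python
from html import escape


def render_coded_text(text, *, page=None, char_range=None):
    """Render coded text in one of the two report formats.

    Pass `char_range=(x, y)` for text coded from the extracted text,
    or `page=n` for text coded within a PDF document.
    """
    if (page is None) == (char_range is None):
        raise ValueError("give exactly one of page or char_range")
    body = escape(text)  # avoid breaking the markup with <, > or &
    if char_range is not None:
        start, end = char_range
        # Extracted text: "Characters X to Y:" prefix, Courier New font.
        return (
            f"Characters {start} to {end}: "
            f"<span style=\"font-family: 'Courier New'\">{body}</span>"
        )
    # PDF coding: "Page: N" prefix, italics.
    return f"Page: {page} <i>{body}</i>"
```

Keeping the prefix and the styling in one place like this means a reader of the report can always tell which tab a piece of coded text came from.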
A number of other cosmetic changes have been introduced to standard reports. In general, each type of information is now presented in its own distinctive format. This should make reports easier to read and interpret, making it clearer whether a given piece of text represents a code name, coded text or other information.
To achieve the above, some major "behind the scenes" changes were necessary. As a result, it will be possible to have more control over report styles. In particular, it will be possible to change the style of reports in real time (within some relatively tight constraints), so as to represent information in different ways and immediately evaluate the results. If you have suggestions on which formatting options you would like us to include in future versions, please post your ideas in the "New Features Request" section of this Forum: we are always happy to hear your opinions.
The User Manual will be updated later this week to reflect the changes outlined above.