Version 22.214.171.124 is numbered as a major release because it includes significant upgrades to the technologies that drive EPPI-Reviewer Web. Because most of the work went into these upgrades, not much has visibly changed. Nevertheless, this release also includes an important improvement to the deduplication algorithm, along with a number of other new features and bugfixes.
EPPI-Reviewer Web technology upgrade
The EPPI-Reviewer Web client is now written for Angular version 13 and has been adjusted to follow Angular best practices throughout, making it more maintainable and quicker to develop. At the same time, the client is now published as a separate, standalone app, decoupled from the EPPI-Reviewer API. On the server side, the API now runs as a .NET 6 app, which likewise makes it more maintainable and better aligned with the current mainstream (within the Microsoft ecosystem).
For users, the expected consequences are:
- The client is now much smaller in size and faster to load/run. Thus, opening up EPPI-Reviewer Web should be much faster, especially when doing so over a slow(ish) Internet connection and/or on slower hardware.
- The "look" of the client is also somewhat different and, in general, more "compact": more data fits in less space without, we believe, making the app harder to navigate.
- Hopefully there will also be a visible benefit in terms of the number and quality of new features that we'll be able to write for each forthcoming release, as a consequence of having made the development work a little faster and easier.
The downside of the above is that it's possible that this release does contain new bugs. The technology upgrade took several months of work, and has been subjected to additional testing routines. However, pretty much every "component" of the client was modified, which does mean that the risk of having introduced new (and hopefully small) bugs here and there is higher than usual.
Deduplication: better similarity scores
The algorithm that calculates similarity scores between pairs of (possible) duplicates has been improved in two ways. These changes apply to both versions (ER Web and ER 4), but have no impact on existing duplicate groups. First of all, there was a bug in the subroutine that calculates similarity scores between author fields: if two references had a different number of authors, comparing Ref A to Ref B could produce a different similarity score from comparing Ref B to Ref A. This bug has been corrected.
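To illustrate the kind of order dependence described above, here is a minimal sketch. It is not the EPPI-Reviewer code: the function names, the use of `difflib.SequenceMatcher` as the name comparison, and the averaging rule are all assumptions chosen for illustration. A one-way score averages over the authors of the first reference only, so it changes when the two references are swapped; averaging both directions is one simple way to make the result independent of order.

```python
from difflib import SequenceMatcher
from typing import List


def name_similarity(a: str, b: str) -> float:
    """Stand-in string comparison (assumption: the real algorithm may differ)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def one_way_score(source: List[str], target: List[str]) -> float:
    """Average, over the authors in `source`, of their best match in `target`.

    The result depends on which list is `source`, so swapping the
    arguments can change the score when the lists differ in length.
    """
    if not source or not target:
        return 0.0
    best_matches = [max(name_similarity(s, t) for t in target) for s in source]
    return sum(best_matches) / len(source)


def symmetric_score(a: List[str], b: List[str]) -> float:
    """Average of both directions: identical whichever reference comes first."""
    return (one_way_score(a, b) + one_way_score(b, a)) / 2


# Hypothetical author lists of different lengths.
ref_a = ["Smith J", "Jones A"]
ref_b = ["Smith J"]

print(one_way_score(ref_a, ref_b), one_way_score(ref_b, ref_a))      # two different values
print(symmetric_score(ref_a, ref_b), symmetric_score(ref_b, ref_a))  # the same value twice
```

With the hypothetical lists above, the single author of Ref B finds a perfect match in Ref A, while the second author of Ref A has no good match in Ref B, which is why the two one-way scores disagree.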
Correcting this bug, however, changed the distribution of scores in a non-trivial way, and we needed to evaluate and compensate for its effects. We therefore also adjusted the relative weights that the algorithm assigns to the similarity scores calculated for the separate fields.
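As a rough picture of what "relative weight" means here, the sketch below combines per-field similarity scores into one overall score using a weighted average. The field names and weight values are invented for illustration and are not the ones used by EPPI-Reviewer.

```python
from typing import Dict

# Hypothetical weights; the actual fields and values are not given in these notes.
FIELD_WEIGHTS: Dict[str, float] = {
    "title": 0.5,
    "authors": 0.3,
    "year": 0.1,
    "journal": 0.1,
}


def combined_similarity(
    field_scores: Dict[str, float],
    weights: Dict[str, float] = FIELD_WEIGHTS,
) -> float:
    """Weighted average of per-field scores, using only fields that have a weight."""
    used = [f for f in field_scores if f in weights]
    total_weight = sum(weights[f] for f in used)
    if total_weight == 0:
        return 0.0
    return sum(weights[f] * field_scores[f] for f in used) / total_weight


# A pair with identical authors but a less similar title.
scores = {"title": 0.6, "authors": 1.0}
print(combined_similarity(scores))
print(combined_similarity(scores, {"title": 0.5, "authors": 0.5}))
```

Giving the author field more weight raises the overall score of this pair, which shows why a change to how one field is scored can call for rebalancing the weights of all of them.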
The combined effect of these changes is that, according to our tests, the algorithm now produces better results across all measures: fewer false positives, fewer false positives above 0.8 similarity, fewer false negatives, and more true positives above the 0.8 threshold. The improvement, however, is "marginal" (between 5 and 8 percent, depending on the specific measure), presumably because the algorithm was already performing quite well.
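For readers who want to see how measures of this kind can be counted, here is a minimal sketch. It assumes a hand-checked "gold" set of true duplicate pairs and a set of candidate pairs with scores proposed by the algorithm; the exact definitions used in our tests are not spelled out in these notes, so the ones below (including treating "above 0.8" as strictly greater than 0.8) are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set

Pair = FrozenSet[int]  # an unordered pair of item IDs


@dataclass(frozen=True)
class Measures:
    false_positives: int        # proposed pairs that are not real duplicates
    false_positives_above: int  # ...of which scored above the threshold
    false_negatives: int        # real duplicates that were never proposed
    true_positives_above: int   # real duplicates scored above the threshold


def evaluate(candidates: Dict[Pair, float], gold: Set[Pair], threshold: float = 0.8) -> Measures:
    """Count the four measures for one run of a deduplication algorithm."""
    fp = [p for p in candidates if p not in gold]
    tp = [p for p in candidates if p in gold]
    return Measures(
        false_positives=len(fp),
        false_positives_above=sum(1 for p in fp if candidates[p] > threshold),
        false_negatives=sum(1 for p in gold if p not in candidates),
        true_positives_above=sum(1 for p in tp if candidates[p] > threshold),
    )


def pair(a: int, b: int) -> Pair:
    return frozenset((a, b))


# Hypothetical data: four proposed pairs, three real duplicate pairs.
candidates = {pair(1, 2): 0.95, pair(3, 4): 0.85, pair(5, 6): 0.60, pair(7, 8): 0.70}
gold = {pair(1, 2), pair(7, 8), pair(9, 10)}
print(evaluate(candidates, gold))
```

Running the same count on the output of the old and the new algorithm, against the same gold set, gives directly comparable numbers for each measure.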
In the real world, the effect of this upgrade is that the algorithm should stop "missing" the occasional real duplicate, which sometimes happened as a consequence of the "author similarity" bug and led to false negatives that looked inexplicable. To benefit from this improvement, running "get new duplicates" will suffice, although doing so will not recalculate the similarity scores for existing group members. In general, we do not recommend that users do a hard reset and "get new duplicates" from scratch for reviews where deduplication has already been done. Naturally, all deduplication rounds conducted from now on will benefit from the changes included in this release.
EPPI-Reviewer Web: search for items with duplicates
A new "search" is available, which finds all items that have some duplicates (that is, master items of groups that have members marked as duplicates). This search can only be conducted against the whole review and does not discriminate between item states (included, excluded, etc.).
EPPI-Reviewer Web: duplicates report 1
This report, available against "selected items" in the References tab, produces a table that includes the main fields of the selected references (ID, type, short title, etc.) and ends with a column listing the Source and Item ID of the duplicates of the selected references. Used in conjunction with the new search, it can summarise the origin of all items that are marked as duplicates.
EPPI-Reviewer Web: login screen
This page has been improved in two ways: (1) a "busy" animation now shows while the username and password are being verified; (2) when the latest published version is less than 10 days old, the "latest changes" message glows, which will hopefully help users notice it and thus learn "what's new" (the message links to these "Latest Changes" posts).
In "Item Details", "OpenAlex" tab (in both versions, for reviews where these features are enabled), the list of "matched" OpenAlex papers could show the papers in the wrong order, that is, not sorted by their "matching score". This could lead to errors when manually checking the matches and is now resolved.
In the "Update review" pages of EPPI-Reviewer Web, the "browse history" functions would sometimes fail to bring users back to previously visited pages/lists; this problem is now solved.