HelpForum

Forum (Archive)

This forum is kept largely for historic reasons and for our latest changes announcements. (It was focused around the older EPPI Reviewer version 4.)

There are many informative posts and answers to common questions, but you may find our videos and other resources more informative if you are an EPPI Reviewer WEB user.

If you do have questions or require support, please email eppisupport@ucl.ac.uk.

'Greyed out' references in duplicates list

New Post
28/01/2011 08:58
 

Another de-duplicating question, I'm afraid!

There are several references in my duplicates list that are 'greyed out' when I click on them and I'm not able to mark them as a duplicate. Any ideas as to why and how I can change this?

Thanks,

Beki

 
New Post
28/01/2011 10:46
 

Hi Beki,
Short answer is:
1) If all items of a group are greyed out, then the master item of the group is marked as a duplicate in another group.
2) If single items of a group are greyed out it is because they are already marked as duplicates somewhere else.

Case 2) is straightforward: just ignore those items. Case 1) may be tricky, as it is possible that the grey group (GG) contains items that are not present in any other group and that should be marked as duplicates. There are two ways to deal with this type of situation, and both start with finding the group where GG's master is marked as a duplicate. Finding that group can be hard, and an appropriate "search" is one of the mildly urgent additions I want to write for the de-dup functions.
Once the other relevant group (OG) has been found, you'll have two options:
Option one: take note of the Group ID of GG, select OG, go to the "Manual/Advanced" tab, click "Add group", type the Group ID of GG and click "OK". This adds all GG members to OG, including GG's master, and ignores all items that are already in OG. This operation is 100% reversible: you can manually remove any "manually added" item from OG whenever you see fit.
Option two: conceptually 'cleaner', but more time-consuming. Start by verifying that all OG members are present in GG; if they are, mark all OG members as "Not a duplicate". Then go to GG and mark items there as appropriate (this will be possible at this point). When you mark as duplicates the items that are also present in OG, their lines in OG will become greyed out.
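
To illustrate what "Add group" in option one amounts to, here is a minimal Python sketch. It is not EPPI-Reviewer's actual code: the function names and the use of plain lists of item IDs are assumptions made only for this example.

```python
def add_group(og_members, gg_members):
    """Add GG's members to OG, skipping items OG already contains.

    Returns the merged member list and the list of "manually added"
    items, so the operation can be undone later. (Hypothetical helper.)
    """
    seen = set(og_members)
    manually_added = []
    for item in gg_members:
        if item not in seen:
            seen.add(item)
            manually_added.append(item)
    return og_members + manually_added, manually_added


def undo_add_group(merged, manually_added):
    """Remove every manually added item, restoring the original OG."""
    added = set(manually_added)
    return [item for item in merged if item not in added]


og = [10, 11, 12]          # item IDs already in the other group (OG)
gg = [12, 13, 14]          # item IDs in the grey group (GG), master included
merged, added = add_group(og, gg)
restored = undo_add_group(merged, added)
```

Because the manually added items are tracked separately, removing them returns OG to its original state, which is the sense in which the operation is reversible.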


I still have to explain why this happens: it is complex! I will try with another post as soon as possible.

I hope this helps, please don't hesitate to ask for clarifications.
Best wishes,
Sergio

 
New Post
31/01/2011 11:24
 

Hi again,

As promised, here is a little more information about the inner workings of duplicate checking, hoping this will make it clear why groups or references are sometimes greyed out unexpectedly.

When “get new duplicates” is clicked, the grouping algorithm evaluates (and quantifies) the differences between all the references being checked. It then groups them together, choosing the groups in the way that minimises the differences within each group. In this phase, the “Master Item” is chosen by selecting the group member that has the minimum overall difference from all the other group members. In this way the automatically selected master is the one that has the highest chance of being a legitimate representative of the group.
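
As a rough illustration of the master selection described above, the following Python sketch picks the member with the smallest total difference from all the others. The difference measure used here (based on `difflib` title similarity) is only an assumption for the example; the real scoring is not described in this post.

```python
from difflib import SequenceMatcher


def title_difference(a, b):
    """Assumed difference measure: 0.0 for identical titles, up to 1.0."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def choose_master(members, difference):
    """Return the member with the minimum overall difference from the rest."""
    def total_difference(candidate):
        return sum(difference(candidate, other)
                   for other in members if other is not candidate)
    return min(members, key=total_difference)


# Hypothetical group of near-identical titles.
group = [
    "Effects of exercise on mood",
    "Effects of exercise on mood.",
    "The effects of exercise on mood: a review",
]
master = choose_master(group, title_difference)
```

Passing the difference function as a parameter keeps the selection rule separate from whatever scoring is actually used.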

This approach works very well initially, but it starts to show its limits when “get new duplicates” is clicked more than once, normally after importing new items into a review. At this stage, it is necessary to match the fresh “get duplicates” results with what is already present in the database (so as not to lose the marking work that has already been done). The only legitimate way to do this is to compare master items: whenever they match, the system assumes that the new and old groups are the same and adds any additional new group members to the existing group. If a group (or better, its master item) in the new results does not match any existing group, a new group is created, and all of its members are inserted in the database. This works perfectly for groups that are genuinely new, but not for a group that already exists and now has a different master.
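
The matching step can be sketched as follows. This is a simplified model, assuming each set of results is a dictionary mapping a master item ID to the list of member IDs; the names are hypothetical and the real database logic is certainly more involved.

```python
def merge_results(existing, fresh):
    """Merge fresh duplicate groups into the existing ones, matching by master.

    Both arguments map master item ID -> list of member item IDs.
    'existing' is updated in place and returned.
    """
    for master, members in fresh.items():
        if master in existing:
            # Same master: treat as the same group, add only new members.
            known = set(existing[master])
            existing[master].extend(m for m in members if m not in known)
        else:
            # Unknown master: treated as a brand new group.
            existing[master] = list(members)
    return existing


# Same master (1): the new member 5 joins the old group.
same_master = merge_results({1: [1, 2, 3]}, {1: [1, 2, 3, 5]})

# New master (4) for what is really the same group: a second,
# overlapping group is created alongside the old one.
new_master = merge_results({1: [1, 2, 3]}, {4: [4, 1, 2, 3]})
```

The second call shows the awkward case: items 1, 2 and 3 now appear in two groups, one of which has a master (1) that is an ordinary member of the other.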

The master is chosen automatically as the best representative of a group, and similarity scores are provided only between the master and the other group members. So when new references have been imported and “get new duplicates” is triggered, the resulting groups might have a new master, making the corresponding old group outdated (in conceptual terms). The problem is that the old groups might already have been evaluated (automatically or manually), so disabling (and discarding) the old group doesn’t look like a wise choice. My solution was to provide two mechanisms to manually recover from these awkward situations: (a) disabling items already “marked as duplicates” and (b) providing a way to merge two groups.

(a) has two separate parts. First, group members that are already marked as duplicates are greyed out individually. Second, if a group has a master item that is already marked as a duplicate somewhere else, then the whole group is disabled (greyed out). This second safeguard is there to prevent the creation of meaningless duplicate chains, where item A is marked as a duplicate of B and B is marked as a duplicate of C.
(b) is quite simple. As explained in the previous post, it is just a shortcut to manually add items from group X into group Y; it is entirely reversible and it will ignore items that are present in both groups.
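
The two parts of (a) can be summarised in a few lines of Python. This is only a sketch of the rule as described here; the group structure and the `duplicates_elsewhere` set are assumptions made for the example.

```python
def greyed_out_items(group, duplicates_elsewhere):
    """Return the set of item IDs in 'group' that should be greyed out.

    group: dict with a 'master' item ID and a 'members' list (master included).
    duplicates_elsewhere: IDs already marked as duplicates in other groups.
    """
    if group["master"] in duplicates_elsewhere:
        # The master is a duplicate in another group: disable the whole
        # group, which avoids chains such as A -> B -> C.
        return set(group["members"])
    # Otherwise only the individual already-marked members are disabled.
    return {m for m in group["members"] if m in duplicates_elsewhere}


group = {"master": 1, "members": [1, 2, 3]}
whole_group = greyed_out_items(group, {1})   # master marked elsewhere
single_item = greyed_out_items(group, {3})   # one member marked elsewhere
nothing = greyed_out_items(group, {99})      # no overlap
```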

Finally, I have realised that a third component is missing! What I’m describing here usually happens in large reviews, which have more than 10000 references and could easily have thousands of duplicate groups. In such situations, dealing with multiple versions of the same group is hard, because finding the related groups by hand is time-consuming. For this reason, the additional function that I’m planning to write is something like (details might change in due course):

(c) Find related groups. This would be a mechanism to list the groups that share some items (including manually added items), so that applying (b), and/or untangling overlapping groups in other ways (see previous post) will be a simple matter of point and click.
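
Since (c) is only planned, the following is just one way such a lookup could work, not a description of the eventual feature. It assumes groups are held as a dictionary of group ID to member item IDs (manually added items included); all names are hypothetical.

```python
def related_groups(groups, group_id):
    """List the IDs of groups that share at least one item with 'group_id'.

    groups: dict mapping group ID -> list of member item IDs.
    """
    target = set(groups[group_id])
    return sorted(
        gid for gid, members in groups.items()
        if gid != group_id and target & set(members)
    )


groups = {
    1: [10, 11],
    2: [11, 12],   # shares item 11 with group 1
    3: [20, 21],   # shares nothing
}
related_to_1 = related_groups(groups, 1)
related_to_3 = related_groups(groups, 3)
```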

Unfortunately, I’m still buried under another big project: I’m currently testing our online shop. I believe this is the most eagerly awaited new feature and hence has the highest priority (apart from user support, of course!).

Thanks for reading,
Sergio

 

