Managing Duplicates
New Post
08/11/2013 07:44

I have a problem managing the duplicates in my studies. After getting the duplicates and marking them automatically, I can see some unchecked groups in the left-hand sidebar that have to be marked manually as "A Duplicate". However, the check box seems to be locked, so I cannot mark the studies as duplicates. I have not encountered this problem before, and it is not limited to one particular group: it recurs across different groups.

Secondly, I have to go through around 1800 groups manually to mark them as duplicates or not duplicates. Is there any way to do this with a command instead of doing it manually?


Please reply ASAP.

New Post
08/11/2013 09:59

Your first problem is due to overlapping groups of duplicates. This is explained in detail in the "Overlapping duplicate groups" chapter of the Manual (from page 29). It happens when you import some references, run duplicate checking, import some more, run duplicate checking again, and so on.

You can handle the situation manually, or use the "reset" functions in the "Manual/advanced" tab. If you have already marked many items as duplicates, you will want to use the first "reset" option, the one that retains the work already done (it's explained on-screen). You will then re-run duplicate checking and get a clean list of groups, with no overlaps.

Re going through groups manually: you can use the Advanced Mark Automatically feature in "Manual/Advanced" to lower the similarity thresholds, and/or to allow items that have already been coded to be marked as duplicates. Normally, this second restriction is enforced to prevent losing work done elsewhere.

I hope this helps, please let us know if you need further guidance.


New Post
08/11/2013 10:52

Thanks for your reply. By lowering the threshold value, I cut around 1000 of the groups that had to be identified manually. However, it would be easier if you could explain the reset function, as I am a bit concerned about what would happen if I use it and get a "fresh start", which is what the option displays.


Another thing: I haven't imported any more references, yet I am still not able to click on the option "A Duplicate". The option appears greyed out.


New Post
08/11/2013 11:18

Both reset options will delete all existing duplicate groups. The first option, however, preserves the "marked as duplicate" flag as well as the master information. This means that after the reset, the items that are already marked as duplicates will remain in their current state, and the system will still know which item each one is a duplicate of (the master). This information remains visible through the "item details" window: if an item is the master of some duplicates, it will show there, and the same applies if an item is marked as the duplicate of something else.

After clicking "Get new duplicates" once more, new groups will be created, but items that are already marked as duplicates will not be included in any of the new groups. This will allow you to deal with the items that still need to be looked at, without having to re-do what has been done already.

In short, the first option allows you to avoid having to manually untangle overlapping groups while ensuring you won't need to re-evaluate items that you've already marked as duplicates. The drawback of this system is that if you made a mistake and marked some items as duplicates that are in fact unique, it will be difficult to spot the mistake (because the old groups have been deleted). Since your review is big, I'm guessing that you will never want to manually look at all groups, so in your case this isn't a problem at all.

The second "reset" route is simpler: it deletes all groups, and restores all duplicates to their original "not a duplicate" state. This option is recommended if and only if you think you've made some big mistake and wish to start from square one.

I hope this helps.


New Post
08/11/2013 11:40

You are talking about two reset options. By the first one, do you mean the bin option?


New Post
08/11/2013 14:49

No. If you click the "reset" button, a window will open with a long description of the two options that are available there.

Don't worry, the system is designed in such a way that it is virtually impossible to delete your work without noticing.

New Post
11/11/2013 09:25

Thank you very much for your replies. I was able to remove the duplicates. However, once I was done, to make sure, I re-ran the "get more duplicates" command, which found some more duplicates. This happened even though I had not added any new studies to my work. What might be the reason for this?

Additionally, if a study was manually marked as "Not a Duplicate", will it appear when the "Get more Duplicate" command is run?



New Post
11/11/2013 12:04

I'm not sure exactly what you did. Can you confirm the following?
1) I assume you used the first "reset" option, clicked "get new duplicates", went through the whole lot, then clicked "get new duplicates" again. This time new groups appeared. Is that the case?
2) While going through the steps above, have you by any chance undeleted a source? This would explain the new groups.

If a study is marked "not a duplicate", and you then reset the groups (with either option) and "get new", the study will most likely reappear and should be marked as "not a duplicate" again. Sorry for not mentioning this before.


New Post
11/11/2013 12:15

OK, now I get it.

Another thing: I have around 23000 studies to which I have to apply exclusion criteria. The exclusion is based on the year, the region, and the type. I have the codes in the right-hand side panel of EPPI. How do I do it? Do I have to go through each study one by one and click on exclude or include, or is there an easier and quicker way?


New Post
11/11/2013 14:14

Hello Deny,

The screening process in a systematic review is a method of determining which of your studies are relevant to your review question. If you have a complex screening tool, then looking at each study would probably be necessary. If you are just trying to eliminate items based on basic criteria such as date, you could try running a search. You can search on the date field in the search tab. Once you have identified items to exclude through searching, you can bulk exclude them. If you are trying to exclude or include items by study type or region, you might also be able to eliminate a number of items in one go through searches. Searching on particular terms might be a bit risky, though, as those terms could refer to different concepts from the ones you expect. Using this method could reduce the number of items that you need to examine individually.

A second method you might wish to consider is using the text mining functions in EPPI-Reviewer to help identify the items that are most likely to be includes. This would allow you to put your resources where they would be most effective. You will find details on the text mining functions in the user manual under 'Text mining' (page 77). This might not eliminate the need to look at each item but it will allow you to look at the most likely candidates first.

Best regards,



