Help Forum

Forum (Archive)

This forum is kept largely for historical reasons and for our latest-changes announcements. (It was focused on the older EPPI Reviewer version 4.)

There are many informative posts and answers to common questions here, but if you are an EPPI Reviewer WEB user you may find our videos and other resources more useful.

You can search the forum for existing answers. If you have questions or require support, please email eppisupport@ucl.ac.uk.


Inter-rater reliability
20/09/2017 19:51
 

Hello!

I am wondering if there is a 'gold standard' inter-rater reliability percentage that reviewers should be aiming for? I have heard 95% being tossed around in methods sections, but I can't seem to find this statistic referenced anywhere.

Furthermore, is inter-rater reliability calculated based only on the discrepancies between what is simply 'included' or 'excluded', or does it also take into account the discrepancies between 'reasons for exclusion'? I hope that makes sense!

Your support in this is greatly appreciated.

Warmly,

Madeleine

 
25/09/2017 15:49
 

Hello Madeleine,

There really isn't a universal standard that must be reached, as there are many possible reasons comparison coding might be carried out and those reasons might have different requirements. There are also many different ways comparison coding could be carried out. Some review organisations might have their own standard for the level of agreement required.

The answer to your question comes down to what level of agreement (or disagreement) you are comfortable with, based on the parameters of your inclusion/exclusion criteria. If you are double screening a random sample to check that all coders are interpreting the screening tool the same way, you might require 100% agreement before moving forward. If the screening criteria require quite a bit of interpretation, you might be happy with less agreement.

Equally, you might have lots of disagreement on the reason for exclusion but only concern yourself with disagreements on inclusion vs exclusion. In that case you might want 100% agreement, but only on the inclusion vs exclusion comparison.
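To make the difference between those two comparisons concrete, here is a minimal Python sketch (not an EPPI Reviewer feature) using invented screening codes; it shows how collapsing the exclusion reasons to a single 'exclude' code changes the agreement figure.

    # A minimal sketch, assuming two reviewers and a handful of invented
    # screening codes; it is not an EPPI Reviewer feature.
    reviewer_a = ["include", "exclude: wrong population", "exclude: wrong design",
                  "include", "exclude: wrong design"]
    reviewer_b = ["include", "exclude: wrong design", "exclude: wrong design",
                  "include", "exclude: wrong population"]

    def percent_agreement(codes_a, codes_b):
        """Proportion of items on which both reviewers gave the same code."""
        matches = sum(a == b for a, b in zip(codes_a, codes_b))
        return matches / len(codes_a)

    # Agreement on the full codes, including the reason for exclusion.
    detailed = percent_agreement(reviewer_a, reviewer_b)

    # Agreement once every exclusion reason is collapsed to a single "exclude"
    # code, so only the include vs exclude decision is compared.
    collapse = lambda code: "include" if code == "include" else "exclude"
    collapsed = percent_agreement([collapse(c) for c in reviewer_a],
                                  [collapse(c) for c in reviewer_b])

    print(f"Agreement on detailed codes:     {detailed:.0%}")   # 60%
    print(f"Agreement on include vs exclude: {collapsed:.0%}")  # 100%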

I think the important issue is that you understand the reason for any disagreements, as they might indicate confusion about how the criteria work with the studies in your review.

The kappa statistic itself is often a source of misunderstanding. A good paper to read is Fleiss J, Cohen J, Everitt B (1969) "Large Sample Standard Errors of Kappa and Weighted Kappa", Psychological Bulletin, 72(5), pp. 323-327. There are the plain kappa statistic and the weighted kappa, and which one to use depends on the relative seriousness of the possible disagreements (e.g. an include vs exclude disagreement compared with a disagreement between two different exclusion reasons).
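To show what those two statistics measure, here is a rough Python sketch with the calculations written out by hand; the ratings and the disagreement weights are invented examples, and in practice a weighted kappa would use whatever weights reflect how serious you judge each kind of disagreement to be.

    # A rough sketch of Cohen's kappa and a weighted kappa, written out by hand
    # so the calculation is visible. The ratings and weights are invented.
    from collections import Counter
    from itertools import product

    categories = ["include", "exclude: reason A", "exclude: reason B"]
    rater_1 = ["include", "exclude: reason A", "exclude: reason A",
               "include", "exclude: reason B"]
    rater_2 = ["include", "exclude: reason B", "exclude: reason A",
               "include", "exclude: reason B"]

    n = len(rater_1)
    pairs = Counter(zip(rater_1, rater_2))
    marg_1, marg_2 = Counter(rater_1), Counter(rater_2)

    # Observed and chance-expected proportions for every pair of categories.
    observed = {(a, b): pairs[(a, b)] / n for a, b in product(categories, repeat=2)}
    expected = {(a, b): (marg_1[a] / n) * (marg_2[b] / n)
                for a, b in product(categories, repeat=2)}

    # Plain kappa: every disagreement counts the same.
    p_o = sum(observed[(c, c)] for c in categories)
    p_e = sum(expected[(c, c)] for c in categories)
    kappa = (p_o - p_e) / (1 - p_e)

    # Weighted kappa: an include vs exclude disagreement (weight 1.0) is treated
    # as more serious than a disagreement between two exclusion reasons (0.5).
    def weight(a, b):
        if a == b:
            return 0.0
        return 1.0 if "include" in (a, b) else 0.5

    num = sum(weight(a, b) * observed[(a, b)] for a, b in product(categories, repeat=2))
    den = sum(weight(a, b) * expected[(a, b)] for a, b in product(categories, repeat=2))
    kappa_weighted = 1 - num / den

    print(f"kappa          = {kappa:.2f}")           # 0.71
    print(f"weighted kappa = {kappa_weighted:.2f}")  # 0.83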

If you are calculating a kappa statistic, I have found this paper useful for interpreting the value: Viera AJ, Garrett JM (2005) "Understanding Interobserver Agreement: The Kappa Statistic", Family Medicine, 37(5), pp. 360-363.
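For what it's worth, that paper reproduces the commonly cited Landis and Koch bands for putting a verbal label on a kappa value; a small helper like the one below (just an illustrative convenience, not an official scale) captures them.

    def interpret_kappa(kappa):
        """Rough verbal labels for kappa, following the bands reproduced in
        Viera & Garrett (2005); they are conventions, not hard thresholds."""
        if kappa < 0:
            return "less than chance agreement"
        if kappa <= 0.20:
            return "slight agreement"
        if kappa <= 0.40:
            return "fair agreement"
        if kappa <= 0.60:
            return "moderate agreement"
        if kappa <= 0.80:
            return "substantial agreement"
        return "almost perfect agreement"

    print(interpret_kappa(0.71))  # substantial agreement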

Best regards,

Jeff

 

