HelpForum

Forum (Archive)

This forum is kept largely for historic reasons and for our latest changes announcements. (It was focused around the older EPPI Reviewer version 4.)

There are many informative posts and answers to common questions, but you may find our videos and other resources more informative if you are an EPPI Reviewer WEB user.

If you do have questions or require support, please email eppisupport@ucl.ac.uk.

New Post
23/07/2012 08:57
 

I have started a trial account for EPPI-Reviewer 4.0. I'm especially interested in the text mining functions of this program. However, I cannot get the term management system TerMine to work properly. I have tried both with the example review that comes with the trial account and with a review that I imported myself. All the other engines (Yahoo, Zemanta, etc.) suggest a ranked list of terms, but TerMine does not. What am I doing wrong? I would appreciate any help in this matter.

 
New Post
23/07/2012 14:53
 

Hello Derya,

TerMine is a text mining web service from the National Centre for Text Mining (NaCTeM) at the University of Manchester. EPPI-Reviewer accesses this web service as part of its text mining functionality.

There was a problem with the TerMine web service a couple of weeks ago, but it looked to be back up and running properly. I have just tested it myself and it didn't return any results for me either. We will look into this to find out what is happening.

Best regards,

Jeff

 
New Post
02/08/2012 15:44
 

Jeff,

Thanks for your latest reply. I'm still having problems with the text mining function. It now seems that TerMine works and provides me with terms that make sense. However, I cannot make the screening prioritization work based on these suggested terms. Do you have the same problem, or am I doing something wrong?

Thanks again very much for your help.

Derya

 
New Post
02/08/2012 16:43
 

Dear Derya,

Yes, TerMine is now working properly. I have tested the screening prioritisation functions as described on page 71 of the user manual: it all works as expected on one of my test reviews, which makes it difficult for me to find out what doesn't work in your case. Text mining techniques can be tricky. For example, I would be surprised if they worked with the example review, as there are too few references to work with. Also, it's sometimes a good idea to "clean" the list of terms, so as to make sure that all the corresponding concepts are actually relevant.

In other words, in order to provide more specific help I will need more information:

1) How are you selecting the items that are used to create the list of terms? Have you tried with a shorter list?
2) Is the corresponding list very long? Do all the extracted terms seem relevant?
3) Assuming that you are not finding any similar references after clicking "Search on Terms", have you tried eliminating the terms with the lowest scores?
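
For readers who export their term list and want to try point 3 outside the program, here is a minimal sketch of the idea in Python. It is only an illustration: the function name `keep_top_terms`, the threshold and the sample terms are hypothetical, and it does not reflect how EPPI-Reviewer or TerMine handle terms internally.

```python
def keep_top_terms(scored_terms, min_score):
    """Keep (term, score) pairs whose score is at least min_score, highest first."""
    kept = [(term, score) for term, score in scored_terms if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical terms and scores, for illustration only.
terms = [
    ("study", 1.0),
    ("systematic review", 4.0),
    ("control group", 2.5),
]

for term, score in keep_top_terms(terms, min_score=2.0):
    print(f"{score:.1f}  {term}")
```

Raising `min_score` shortens the list, which is the same effect as manually deleting the lowest-scoring terms before clicking "Search on Terms".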

I guess the questions above already suggest what I think may be the problem: it's difficult to evaluate this technology without working on a fully featured "real world" example. Text mining can be very useful for big reviews (50,000 references or more) where the only possible way to handle the screening phase within the expected time requires some sort of prioritisation, but smaller scale tests may simply lack the numbers and data-diversity required.

Please do let us know if this helps and how your work is proceeding.

Sergio

 

 
New Post
07/08/2012 10:55
 

Dear Sergio,

When I read your post I get the feeling that text mining is most appropriate for very big reviews. This is rarely the case with our reviews... The number of references usually ranges between 3-6000. Nevertheless, I tried to follow your advice when trying to make the text mining work with my review. I'm actually using a real review that we completed last year. What I did was import the references from the initial PubMed search (approximately 2300 references) and select for the term extraction those studies that were finally included in our systematic review. We are talking about around 10-15 references. I cleaned up the terms that seemed to be irrelevant and clicked "Search on Terms". I just get an alert warning that says "data.portal update failed", together with the TerMine terms with their weighted c-values. The c-scores for the terms are low and I don't code them in any way.

I hope that the information I provided above helps you in guiding me further.

Thanks very much again for your help.

Derya

 
New Post
07/08/2012 12:25
 

Dear Derya,

I confess I'm puzzled now. I tried to re-create the issue you report and I couldn't. First, I tried on one of our test reviews, and it all worked as expected (even when working with a list of 10k+ searchable items). This made me suspect that something may be going wrong in your particular case, so I had a look at your review (ID=1306); I hope this is all right.

Within the aforementioned review, I've tried to see if I could re-create the error in the following (arbitrary) way:

1) In the "Search" tab, I searched for a specific expression, so as to retrieve a sample of references (to use as my source of data). I got 24 documents and listed them.
2) In the main document tabs, I used the "Find more items like these" dialog to get the TerMine terms.
3) I cleaned the terms list, ending up with about 12 terms that looked significant (disregarding the score). I was (arbitrarily) selecting terms that were likely to be present in other documents as well, but that seemed specific enough to provide some kind of meaningful result. I'm mentioning this to clarify that, even if arbitrary, this test is not entirely meaningless.
4) I clicked "Search on Terms" and after some 10 seconds I got a new list of 173 references.

Once again, it was all successful, so I still don't know what went wrong. If you look at the "Search" tab, both of my searches (from points 1 and 4 above) are there for you to have a look at.
Looking at your "Search" record, I can only speculate about what the error might have been: it is entirely possible that you stumbled on a "timeout" error. This happens sometimes (especially with searches) and is triggered when creating a new search takes more than 30 seconds. It may be due to bad luck (the search happened at a very busy moment) or it may be structural: if it is repeatable, please let us know! The "term search" functionality is pretty complex from the computational point of view, and this means two things. First, it is difficult to ensure that no search will ever time out, as this is influenced by the general traffic. Second, if you are able to produce a search that invariably fails, then I would really like to know: I may be able to use the failing search to analyse why it is taking too long and optimise the procedure so as to speed up every future term search.
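
To make the "bad luck or structural" distinction concrete, here is a rough Python sketch of the reasoning: run the same search a few times against a deadline and see whether it times out every time or only occasionally. Everything in it is an assumption made for illustration (the names `timed_out` and `classify`, the number of attempts, and the idea of calling the search as a plain function); it is not how EPPI-Reviewer runs or times its searches.

```python
import concurrent.futures


def timed_out(search, timeout_s):
    """Return True if search() does not finish within timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(search)
    try:
        future.result(timeout=timeout_s)
        return False
    except concurrent.futures.TimeoutError:
        return True
    finally:
        # Wait for the worker to finish so that attempts do not overlap.
        pool.shutdown(wait=True)


def classify(search, attempts=3, timeout_s=30.0):
    """Repeat the search and report whether timeouts are repeatable."""
    failures = sum(timed_out(search, timeout_s) for _ in range(attempts))
    if failures == 0:
        return "ok"
    if failures == attempts:
        return "repeatable timeout"
    return "intermittent timeout"
```

A "repeatable timeout" result corresponds to the structural case that is worth reporting together with the exported list of terms; an "intermittent timeout" is more likely to be down to server traffic.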

I'm sorry that I don't have a definite answer: I will appreciate your collaboration in pursuing this issue. Please take a close look at the error message you see and report back: I would like to know exactly how you triggered it (you may want to use the "export terms" button to save the current list of terms). Also, does the error message mention a "Timeout" somewhere?

Many thanks,
Sergio

 
New Post
07/08/2012 13:27
 

Sergio,

Again, my sincere thanks for really trying to help me out. I'm trying to follow your instructions but I keep failing to get a prioritization list. I think the easiest way to follow my "text mining track" would be for me to send you screen captures. Would that work for you?

Thanks!

Derya

 
New Post
07/08/2012 14:21
 

Hi Derya,

Yes, feel free to send screenshots to eppisupport; we'll continue our conversation via email.

Sergio

 
New Post
07/08/2012 17:05
Accepted Answer 

Dear Derya,
Thanks again for your patience and collaboration.

To All: Derya stumbled on a very awkward bug. It happens because the regional settings of her computer use the comma as the "Decimal Symbol", so that numbers are written in the "1,2" form instead of "1.2". Internally, this triggers a very peculiar cascade that generates a very odd error message.
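
For anyone curious about this class of bug, here is a minimal Python sketch of how a comma decimal symbol can break code that expects a full stop. It is a general illustration only: the function names are hypothetical and this is not EPPI-Reviewer's actual code, whose internal cascade is not described in this thread.

```python
def format_score(value, decimal_symbol="."):
    """Format a score with four decimals, using the given decimal symbol."""
    return f"{value:.4f}".replace(".", decimal_symbol)


def parse_score_strict(text):
    """Parse a score assuming the full stop is always the decimal symbol."""
    return float(text)


def parse_score_tolerant(text, decimal_symbol):
    """Parse a score after converting the decimal symbol to a full stop."""
    return float(text.replace(decimal_symbol, "."))


score_text = format_score(1.2, decimal_symbol=",")  # "1,2000"

try:
    parse_score_strict(score_text)
except ValueError as error:
    print("strict parsing failed:", error)

print(parse_score_tolerant(score_text, decimal_symbol=","))
```

The strict parser fails as soon as the number was written with a comma, while the tolerant one converts the symbol first. The sketch ignores digit grouping symbols, which is one reason the workaround in this post also suggests checking the "Digit grouping symbol".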

This is already fixed in our internal builds, but it will need a full update in order to be published for everyone. Unfortunately, I can't promise when the next update will be released, but I can explain how to work around this issue in the meantime. You may want to follow the procedure below if clicking on "(find more documents like this)\Search on Terms" displays a very long error message instead of sending you to a new list of results. As a further check: if you are affected, the term list will show scores that do not use the full stop (".") as the decimal symbol.

Temporary workaround (to be used if needed, and only for EPPI-Reviewer version 4.2.1.4; higher versions will be immune to this bug):

  • In "Control Panel", find the "Regional Options". In Windows Vista and Windows 7 these are under "Control Panel", "Clock, Language, and Region", "Regional and Language Options".
  • Under "Formats", click "Customise this format".
  • Change the "Decimal Symbol" to "." (a full stop). Optionally, also change the "Digit grouping symbol" to make sure it doesn't use the full stop as well (this would be confusing).
  • Click "OK" on all dialogs.

NOTE: the instructions above should work for Windows Vista and Windows 7. Please feel free to use this thread if you need instructions for a different operating system.

 

