Service Disruption: Machine Learning components

20/02/2020 11:04

Sergio Graziosi

Joined: 17/10/2011

Posts: 318

Service Disruption: Machine Learning components (modified by Sergio Graziosi on 21/02/2020 10:37:16)

We are currently experiencing a service disruption on our Machine Learning components. Features affected are Priority Screening and Machine Learning classifiers.

These systems use API calls to build and run Machine Learning models; the APIs run within the Microsoft Azure cloud. Currently most (but not all) API calls are failing, apparently during the low-level stages required to establish an HTTPS connection. Very rarely, a call succeeds, which suggests that the problem is caused by an unexpected fault somewhere: if a misconfiguration were the root cause, we would expect all calls to fail. We are investigating the fault along with Microsoft engineers.

Details on the current disruptions:

For Priority Screening: the vast majority of training events will fail. However, existing priority lists will keep functioning, making it possible to "get next item to screen" while the list still contains items to screen. As a consequence, for any given review, priority screening will continue working (without re-training) until the current list has been completed. Each list contains 3000 items, which means that most people should be able to continue working without experiencing any inconvenience, provided the fault is resolved promptly.
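
To make the behaviour described above concrete, here is a minimal Python sketch. It is only an illustration: the class and method names (`PriorityList`, `retrain`, `get_next_item`) are hypothetical and do not reflect the actual EPPI-Reviewer code. The only detail taken from this post is the list size of 3000 items.

```python
from collections import deque

LIST_SIZE = 3000  # list length stated in this post


class PriorityList:
    """Hypothetical stand-in for a review's current priority list."""

    def __init__(self, item_ids):
        self._queue = deque(item_ids)

    def retrain(self, api_available: bool) -> bool:
        """Pretend training event: it fails while the API is unreachable.

        A failed training event leaves the existing list untouched.
        (Re-ordering after a successful training is omitted here.)
        """
        return api_available

    def get_next_item(self):
        """Return the next item to screen, or None once the list is done."""
        return self._queue.popleft() if self._queue else None

    def remaining(self) -> int:
        return len(self._queue)


plist = PriorityList(range(LIST_SIZE))
trained = plist.retrain(api_available=False)  # training fails during the fault
next_item = plist.get_next_item()             # screening still works
print(trained, next_item, plist.remaining())
```

In this sketch, screening only stops once `get_next_item` returns `None`, that is, once the current list has been completed without a successful re-training.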

For Machine Learning Classifiers: each call to build and apply models is affected, meaning that in most cases it will fail. This will make it impossible to build or apply machine learning models until the fault is resolved.

Although some (very rare) API calls do succeed, we do not recommend retrying repeatedly. Instead, please monitor this thread or our Twitter feed for updates.

20/02/2020 12:14

Sergio Graziosi

Re: Service Disruption: Machine Learning components

Update #1:

Despite what we initially thought, it seems that EPPI-Reviewer Web is unaffected: all Machine Learning features implemented in the Web version appear to work normally.

Thus, using the Web version it is possible to:
- Use Priority Screening, provided that it is already configured.
- Apply Machine Learning models.
- Create/train new Machine Learning models.

Unfortunately, since configuring Priority Screening for the first time can only be done in version 4, it is currently not possible to start a new Priority Screening exercise.

Overall, this is good news because it also gives us new and important information to guide us in resolving the issue.

20/02/2020 14:05

Sergio Graziosi

Re: Service Disruption: Machine Learning components

Update #2 and final.

The problem is now resolved. For some reason, the APIs we use started accepting only requests made over certain underlying TLS protocol versions (TLS is used to establish HTTPS connections). The default protocol used by EPPI-Reviewer 4 was (suddenly) deemed too old, making its calls to the Machine Learning APIs fail. As the code for EPPI-Reviewer Web is newer, the default protocol used by the Web version was more recent, and thus requests sent from there kept working.
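
For readers curious about why one client failed while the other kept working, here is a simplified Python model of TLS version negotiation. The specific versions used in the example are assumptions chosen for illustration: this post does not state which versions were involved. The model also assumes a client supports every version up to its maximum.

```python
from enum import IntEnum
from typing import Optional


class Tls(IntEnum):
    V1_0 = 10
    V1_1 = 11
    V1_2 = 12
    V1_3 = 13


def negotiate(client_max: Tls, server_min: Tls,
              server_max: Tls = Tls.V1_3) -> Optional[Tls]:
    """Return the highest version both sides support, or None on failure."""
    agreed = min(client_max, server_max)
    # The server refuses the handshake if the best common version is too old.
    return agreed if agreed >= server_min else None


# Assumed values, for illustration only.
old_client = negotiate(client_max=Tls.V1_0, server_min=Tls.V1_2)
new_client = negotiate(client_max=Tls.V1_2, server_min=Tls.V1_2)
print(old_client, new_client)
```

Under these assumptions the older client gets no agreed version (the handshake fails before any request is sent), while the newer client connects normally. Raising the server's minimum is enough to produce this split without any change on the client side.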

Resolution: we have re-configured EPPI-Reviewer 4 so that it now uses more recent protocols by default.
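
As a general illustration of this kind of fix, the sketch below shows how a client can require a minimum TLS version using Python's standard `ssl` module. Python is used purely as an example language, and the choice of TLS 1.2 as the minimum is an assumption: this post does not describe the actual configuration change made to EPPI-Reviewer 4. The helper name `make_client_context` is hypothetical.

```python
import ssl


def make_client_context(
    min_version: ssl.TLSVersion = ssl.TLSVersion.TLSv1_2,  # assumed minimum
) -> ssl.SSLContext:
    """Build a client-side TLS context that refuses older protocol versions."""
    ctx = ssl.create_default_context()  # certificate and hostname checks on
    ctx.minimum_version = min_version   # never negotiate below this version
    return ctx


ctx = make_client_context()
print(ctx.minimum_version.name)
```

A context built this way can be passed to standard library clients, for example `urllib.request.urlopen(url, context=ctx)`, so that every HTTPS request made through it uses at least the configured protocol version.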
