Automation for discovery of inequalities research

Finding Accessible Inequalities Research in Public Health (the FAIR Database)

This work develops methods that apply machine learning and Natural Language Processing to the review, assessment, evaluation and summarisation of large volumes of public health research, in support of decision making. We have developed and applied automatic methods for identifying information about inequalities, study types and common themes mentioned within large volumes of public health research. The output of these techniques is available through an online tool (below) containing a continuously updated repository of public health research. Users are also able to upload their own data for processing and download results through EPPI-Reviewer.

The key research question we addressed is:

Can text mining be used to maintain a ‘living’ database of public health research, including information about topics, methods and inequalities?

We addressed this research question by developing a ‘living’ database of public health research in collaboration with public health decision-makers, researchers, and patients and the public. The database is populated by identifying public health records from the >210 million records in OpenAlex and will be a ‘living’ database, as it will be updated regularly with newly published research.

Although the initial NIHR project is now officially complete, we are continuing to develop the technology and will update the tool periodically.

Version 1 (beta) of the database is available here.

Links to source code and data:

  • The PROGRESS-Plus dataset, used to train the classifiers, can be downloaded here. The NIHR has asked that we release this dataset under the CC-BY-NC-SA license. Please get in touch if you would like to make commercial use of the dataset.
  • The source code for the PROGRESS-Plus classifier and other workflows is available here.

Project staff include:



This study/project is funded by the National Institute for Health Research (NIHR) Public Health Research Programme (NIHR133603). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

Copyright 2019 Social Science Research Unit, UCL Institute of Education