Long COVID - Automated living map of research

Introduction

About the Map

Overview
This living map contains bibliographic records of articles about long COVID, partitioned by 'Topic'. See below for further details of topic codes.

The current version of this living map - Version 107 - 30th May 2024 - contains 6,408 bibliographic records of research articles likely to be about long COVID. 

This living map was originally maintained by people, assisted by automation tools. It was originally restricted to studies concerning people who have had SARS-CoV-2 infection and who have any symptoms at >4 weeks after diagnosis. It did not originally include studies that investigate antibody response at >4 weeks if no information on symptoms is reported. Also, studies investigating the recurrence of symptoms were originally intended to be included, while those investigating reinfection or test positivity alone were not. However, this living map is now primarily maintained using automation tools, administered by people. As such, it is now likely that the overall profile of included studies reflects a broader range of definitions of long COVID in current use.

We will publish further, updated versions of this living map when ready, i.e. each time a new set of records from each sequential update of the OpenAlex dataset has been auto-coded and added to the map (see Identifying articles on long COVID, below, for further details).

Suggested Citation
Shemilt I, Khouja C, Lorenc T, Raine G, Sowden A, Sutcliffe K, Thomas J (2024). Long COVID - Automated living map of research. London: EPPI Centre, Social Science Research Unit, UCL Institute of Education, University College London.

Using the long COVID 'segment' 
There are various ways to use this living map, which is organised by 'Topic' and 'Version'. 

Quick Start:
All the bibliographic records in the map have been automatically tagged (or 'keyworded') with one 'Topic' code, listed in the first expandable list on the left. Records in the long COVID 'segment' can therefore be filtered by topic. Topic codes are: 

- Treatment Evaluation
- Transmission / Risk / Prevalence
- Diagnosis
- Health Impacts
- Vaccine Development
- Treatment Development
- Genetics / Biology
- Case Reports (Patients)
- Case Study – Organisation
- Social / Economic / Indirect Impacts
- Mental health Impacts

Topic codes are automatically assigned using a machine learning (BERT) model. See Identifying articles on long COVID, below, for further details; as well as for details of the guidance notes previously used to manually assign these topic codes to each record selected for inclusion in a broader, parent map (Lorenc 2020).

To list all records of 'Treatment evaluation' studies, open the 'Topic' heading, click on 'Treatment evaluation' and then on the button above 'List records'. Select 'Home' to return to the previous screen. To see how many records are in each category under the 'Topic' code, select 'Topic', and click on the 'Frequencies' button above it. This will open a table in the middle of the screen that tells you how many records have been categorised with each heading. You can then list the records by clicking on the corresponding number. Similarly, all records have one version code, as well as being tagged with 'All versions'. To see how many records are in each version of the long COVID 'segment', select 'Version' or 'Previous versions' and click on the 'Frequencies' button. Clicking on each number will display that set of records.

Downloading data and exploring individual records:
The search features described above (and below) will result in a list of bibliographic records being displayed. This list can be downloaded as a plain text file, in Excel format, or as a RIS file for importing into reference manager software (such as Zotero or EndNote). Clicking an individual title in the list will result in the detailed information about that record being displayed. This defaults to standard bibliographic information plus the abstract but can be expanded to include all bibliographic fields in the database. This screen also contains a 'show coding' button, which opens up the display to show the 'Topic' and 'Version' codes assigned to that specific record.
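A downloaded RIS file can also be inspected outside a reference manager. The following is a minimal sketch in Python, assuming the export follows the usual RIS layout (a two-character tag, two spaces, a hyphen, then the value, with each record closed by an ER tag). The function name parse_ris and the sample text are illustrative only and are not part of the map's tooling.

```python
import re
from typing import Dict, List

# RIS lines look like "TY  - JOUR": a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  -\s?(.*)$")


def parse_ris(text: str) -> List[Dict[str, List[str]]]:
    """Split RIS text into records; each record maps a tag to its list of values."""
    records: List[Dict[str, List[str]]] = []
    current: Dict[str, List[str]] = {}
    for line in text.splitlines():
        match = RIS_LINE.match(line.rstrip())
        if not match:
            continue  # blank or continuation lines are skipped in this simple sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end-of-record marker
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    if current:  # tolerate a final record with no ER line
        records.append(current)
    return records


# Hypothetical sample, not taken from the map.
SAMPLE = """TY  - JOUR
TI  - Example title one
PY  - 2023
ER  -
TY  - JOUR
TI  - Example title two
PY  - 2024
ER  -
"""

records = parse_ris(SAMPLE)
titles = [record["TI"][0] for record in records]
```

Each record is kept as a mapping from tag to a list of values because some RIS tags (for example authors) can legitimately repeat within one record.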

Displaying a map or cross-tabulation:
Crosstabulation operates using the grouped headings of codes. To show a table of all records with ‘Topic’ across the top and ‘Version’ down the side, click on the heading 'Topic' and, at the bottom right of the screen, click on the 'Set X axis' button; then click on the 'Version' heading (on the left of the screen; or Previous versions... below Version) and click on the 'Set Y axis' button; finally, click on the 'Get Crosstab' button (bottom right of the screen). The resulting page will display a matrix showing the intersections of the categories under these two headings. By clicking on the 'Bubble map' button, this matrix can be changed from a table to a bubble map, with bubbles indicating the relative number of records in each cell. The numbers / bubbles in the cells are clickable, and clicking on them will display a list of the records in that cell (below the table or bubble map - this may take some time to appear).
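The matrix produced by 'Get Crosstab' is, in essence, a count of records in each combination of the two selected headings. A rough sketch of that idea in Python follows, using made-up (topic, version) pairs; the function name crosstab and the sample data are assumptions for illustration, and this is not how EPPI-Vis itself is implemented.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def crosstab(pairs: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, int]]:
    """Count records per cell, with 'Version' as rows (Y axis) and 'Topic' as columns (X axis)."""
    cells = Counter(pairs)  # one count per distinct (topic, version) pair
    table: Dict[str, Dict[str, int]] = {}
    for (topic, version), count in cells.items():
        table.setdefault(version, {})[topic] = count
    return table


# Hypothetical coded records: (topic code, version code).
coded = [
    ("Diagnosis", "Version 106"),
    ("Health Impacts", "Version 106"),
    ("Health Impacts", "Version 106"),
    ("Health Impacts", "Version 107"),
]

table = crosstab(coded)
```

A bubble map shows the same counts, with the size of each bubble scaled to the number in the cell.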

Finding a specific record or set of records:
There is a free-text search at the top of the 'Home' screen. This defaults to searching the title and abstract fields, but specific fields can be selected using the drop-down menu next to it.

Identifying articles on long COVID

Current (Version 107-)
Since its transition to a stand-alone living map in 2024 (following the discontinuation of the parent living map - Lorenc 2020), this map has been updated as follows. We import the top-scoring 500 records from OpenAlex auto-update searches, which are 'seeded' by the growing corpus of records included in the living map (applying import filters designed to remove pre-prints and other always-excluded articles). We also import records from 'Quarterly scopes of the Long COVID literature' (systematic reviews on Long COVID and RCTs on Long COVID treatments) undertaken for a parallel project. Following de-duplication, records are run through a bespoke binary machine learning classifier designed to identify records of COVID-19 research (and we discard records scoring below an established threshold score).

Next, we apply a bespoke binary BERT model designed to identify records of long COVID research to retained records from the previous stage (and we discard records classified as 'not about long COVID'). Finally, we apply a second bespoke BERT model designed to assign one of 11 include topic codes, or one of 2 exclude codes, to retained records from the previous stage (and we discard records that are either not coded or do not score above specified threshold scores calibrated to achieve 90% accuracy). The 11 include topic codes are:

- Treatment Evaluation
- Transmission / Risk / Prevalence
- Diagnosis
- Health Impacts
- Vaccine Development
- Treatment Development
- Genetics / Biology
- Case Reports (Patients)
- Case Study – Organisation
- Social / Economic / Indirect Impacts
- Mental health Impacts

All records retained and topic-coded are added to the map.
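The three classification stages described above form a cascade in which each stage only sees records retained by the previous one. The sketch below illustrates that control flow in Python. It is not the authors' implementation: the models are represented by plain callables, the names (Record, run_cascade, topic_thresholds) and all threshold values are hypothetical, the import and de-duplication steps are omitted, and it assumes that records given an exclude code are discarded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Record:
    title: str
    abstract: str


# Hypothetical stand-in types for the three models described above.
CovidScorer = Callable[[Record], float]                        # stage 1: COVID-19 relevance score
LongCovidClassifier = Callable[[Record], bool]                 # stage 2: about long COVID or not
TopicCoder = Callable[[Record], Tuple[Optional[str], float]]   # stage 3: (code, score)


def run_cascade(
    records: List[Record],
    covid_score: CovidScorer,
    is_long_covid: LongCovidClassifier,
    code_topic: TopicCoder,
    covid_threshold: float,
    topic_thresholds: Dict[str, float],
) -> List[Tuple[Record, str]]:
    """Return (record, topic) pairs that survive all three stages."""
    kept: List[Tuple[Record, str]] = []
    for record in records:
        # Stage 1: discard records scoring below the COVID-19 threshold.
        if covid_score(record) < covid_threshold:
            continue
        # Stage 2: discard records classified as not about long COVID.
        if not is_long_covid(record):
            continue
        # Stage 3: discard uncoded records and records that do not score above
        # the threshold for their code. Exclude codes have no entry in
        # topic_thresholds, so they are dropped as well (an assumption).
        code, score = code_topic(record)
        if code is None or code not in topic_thresholds:
            continue
        if score <= topic_thresholds[code]:
            continue
        kept.append((record, code))
    return kept


# Toy stand-ins for the real models; every value below is made up.
demo = [
    Record("Fatigue six months after infection", "long covid cohort"),
    Record("Crop yields in 2019", "agriculture"),
]
result = run_cascade(
    demo,
    covid_score=lambda r: 0.9 if "covid" in r.abstract else 0.1,
    is_long_covid=lambda r: "long covid" in r.abstract,
    code_topic=lambda r: ("Health Impacts", 0.95),
    covid_threshold=0.5,
    topic_thresholds={"Health Impacts": 0.8},
)
```

Keeping a separate threshold per topic code mirrors the reference above to "specified threshold scores" in the plural; whether the real thresholds are set per code is not stated here.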

Previous

Retrospective
For the inaugural version of this long COVID 'segment' (up to Version 75), we also needed to identify articles about long COVID that had been published in previous versions of our living map of COVID-19 research from its inception up to Version 72. We therefore searched the map for these articles in EPPI Reviewer-Web using an earlier version of the keyword search strategy shown below (available from the authors on request). Retrieved records were then reviewed (including referral for 'second opinion' if needed) and those records about long COVID were added to the 'segment' by 'Topic' and 'Version'. In addition, we identified, coded and added (both to this 'segment' and to the broader living map) a set of study reports (articles) included in a recently published version (27th September 2021) of a living systematic review of research on long COVID [4] that were not already included in our living map of COVID-19 research.

Version 73 to Version 97

Prospective
From September 2021 (Version 73) until April 2022 (Version 97) we prospectively applied a supplementary code 'About long COVID' to records of studies manually selected for inclusion in a broader parent living map of COVID-19 research (i.e. in addition to assigning each eligible record to one of eleven (primary) ‘Topic’ codes – see below). Each record was coded by a single researcher. If a researcher was unsure about whether a study was about long COVID (or about which ‘Topic’ code to assign) they had the option of referring it for a ‘second opinion’, to be resolved by team discussion and consensus. 

The criteria (guidance notes) used to manually code records for inclusion in the parent, broader living map, under each category heading (topic code), were as follows: 

Primary empirical data, systematic review,* modelling,** full economic evaluation,*** or novel analysis on COVID-19
 

Treatment Evaluation

  • Any intervention aimed at treatment, prevention and/or rehabilitation of COVID-19 (i.e. with either a population of COVID-19 patients or COVID-19 incidence as an outcome), including vaccines
  • Prospective outcome studies with comparison between researcher-allocated groups (i.e. randomised trials, quasi-randomised trials, and non-randomised trials with researcher allocation)
  • Include systematic reviews* that aimed to include studies meeting these criteria, whether or not any were located
  • Exclude observational/retrospective studies including treatment as an exposure and uncontrolled studies (code as Treatment development)
  • Exclude case reports with some information about treatment (code as Case reports)
  • Exclude basic science with claimed relevance to interventions, but without evaluation of effectiveness in human patients (code as Treatment development or Vaccine development)

Transmission / Risk / Prevalence

  • Epidemiological modelling of spread (incl. studies which aim to model health outcomes or health system outcomes; include population mortality rates (i.e. deaths relative to total population); exclude case fatality rates (i.e. deaths relative to COVID-19 cases)); include genetic epidemiology if the main focus is on spread of disease (if the focus is on characterising strains, code as Genetics / biology)
  • Risk modelling
  • Studies of viral persistence in bodily secretions/tissues or in the environment, e.g. on surfaces (including methods to inactivate the virus in these contexts) 
  • Population prevalence studies (including seroprevalence)
  • Studies of risk factors for developing COVID-19 at individual level (not risks of developing more severe disease/complications among people infected with COVID-19; code as Health impacts) or at population level
  • Studies of the effectiveness of non-drug prevention strategies e.g. masks, contact tracing (only if data on prevalence reported). Include modelling studies of impacts of vaccines on prevalence at population level.
  • Exclude data on preventive behaviour outcomes only (code as Social / economic / indirect impacts). Exclude studies of disinfection, aerosolisation etc. if they do not include data on COVID-19 (code as Not on topic) 

Diagnosis

  • Sensitivity and specificity of tests for COVID-19 (including antibody tests)
  • Training clinicians in diagnosis
  • Include studies of clinical signs if the main focus is on their diagnostic value (if the focus is on health outcomes or prognosis, code under Health impacts)
  • Exclude diagnosis of other conditions/comorbidities in COVID-19 patients

Health Impacts

  • Any observational study with a population of COVID-19-infected patients measuring physical health outcomes (incl. case fatality rates, QALYs or DALYs) and/or somatic indicators (code studies reporting prevalence data here if they also present health impacts)
  • Include studies of prognostic factors, indicators of disease progress or severity
  • Studies of comorbidities (e.g. coinfections), if not explicitly analysed as risk factors for infection 

Vaccine Development

  • Basic science aimed at development of vaccines
  • Include animal studies testing human vaccines only
  • Studies looking at vaccines but not meeting methodological criteria for Treatment evaluation, e.g. observational/retrospective studies including vaccine receipt as an exposure (even if measuring prevalence as an outcome), studies without researcher-allocated control group, or pre-post studies of antibody response
  • Studies of vaccine safety/side-effects
  • (Studies of vaccine hesitancy/intentions to be vaccinated/attitudes towards vaccination - code under Social / economic / indirect impacts)

Treatment Development

  • Basic science aimed at development of treatment, e.g. drug discovery (including in silico molecular docking studies)
  • Include animal studies testing human treatments only
  • Studies looking at treatments but not meeting methodological criteria for Treatment evaluation, e.g. observational/retrospective studies including treatment as an exposure, studies without researcher-allocated control group, or modelling based on evaluation data (but exclude studies of outcomes which simply state that treatment was administered without relating outcomes to treatments - code under Health impacts)
  • Studies of treatment safety/side-effects
  • Training clinicians to deliver interventions
  • Include studies of drug treatments used to prevent infection
  • (Studies of relevance to both vaccine and treatment development - code under Treatment development)       

Genetics / Biology 

  • Any data on the genetic or biological characteristics of the virus, or of mechanisms or responses to infection (including antibody responses or humoral immunity, if not clearly aimed at diagnosis/vaccine development)
  • Include modelling on the basis of secondary data analysis
  • Exclude studies of biological mechanisms theoretically linked to COVID-19 infection, but without data which actually concern COVID-19
  • If explicitly aimed at treatment (resp. vaccine) development, code as Treatment (resp. Vaccine) development; if quantifying seroprevalence, code as transmission / risk / prevalence

Case Reports (Patients)

  • Medical case reports of small numbers of patients considered as individuals
  • Include any case with confirmed COVID-19 or symptoms or history suggestive of COVID-19 infection (otherwise code as Not on topic; case studies of health professionals and mental health consequences of lockdown are also Not on topic)
  • Include mental health cases tested using the "Fear of COVID" scale or equivalent
  • Include case reports of adverse effects of vaccines

Case Study – Organisation

  • Descriptive studies setting out organisational responses/strategies to COVID-19
  • Surveys of professionals/institutions on organisational responses (not broader knowledge or attitudes to COVID-19 - code these as Social / economic / indirect impacts); any studies focused on service delivery e.g. performance of clinical procedures pre- and post-COVID-19 (or delivery of treatments, if no outcome data)
  • Include any organisation (healthcare or other) and any form of response to COVID-19, whether directly concerning COVID-19 patients or not
  • Exclude guidance or recommendation papers which do not describe the recommended measures being implemented in a specific case

Social / Economic / Indirect Impacts

  • Include studies mainly focusing on behaviour, attitudes etc.
  • Studies of information (e.g. analysis of websites or social media)
  • Surveys of professionals if not mainly focused on organisational responses
  • Studies of behaviour or health outcomes of patients without diagnosed COVID-19 (including total excess mortality, unless separable data on COVID-19 mortality are available)
  • Studies of other impacts of COVID-19 or COVID-19 control measures (e.g. environmental impacts of lockdown)
  • Include indirect health impacts on healthcare workers (e.g. from PPE use)
  • Studies of vaccine hesitancy/intention to be vaccinated, vaccine uptake/coverage
  • Studies of access to services (vaccination, treatment, testing, etc.) for COVID-19
  • Indirect impacts of diagnostic procedures (e.g. radiation)

Mental Health Impacts 

  • Include both COVID-19 patients and/or indirect mental health impacts on the broader population (or healthcare workers, etc.)
  • Include mental health status (anxiety, depression, etc.) and sleep-related outcomes
  • Where studies have an equal focus on mental health impacts and health and/or indirect impacts, code as mental health impacts
  • Exclude if there are no COVID-related measures and only one time point (no before-and-during or -after comparison)

* Define systematic review as any paper reporting secondary data which reports: some search terms; clearly defined inclusion criteria; and some information on the selection process (at least the number of references located by searches and the number of studies included). Include any systematic review which aimed to include studies on COVID-19, whether or not any were located. Include updates to systematic reviews and living reviews if the report presents new data and the original review meets the criteria above.
** Include modelling studies which are at least partly based on empirical data related to COVID-19 (e.g. data used as inputs to the model, or data against which the model is being calibrated or tested); code purely theoretical modelling as not primary data.
*** Include full economic evaluations (i.e. cost-effectiveness analyses, cost-minimisation analyses, cost-utility analyses, or cost-benefit analyses - see https://yhec.co.uk/resources/glossary/). Include model-based and single-study based economic evaluations. Code topic based on the main focus/ aim of the study (e.g. code cost-effectiveness analyses of clinical treatments or management strategies for COVID-19 as Treatment evaluation).

For the inaugural version of this living map (formerly known as the long COVID 'segment') up to Version 75, records prospectively coded as 'About long COVID' were also reviewed by a second researcher before being added, by 'Topic' and 'Version'. From the first updated version (up to Version 79) until the last version that included manually coded records (Version 97), we automatically assigned records prospectively coded as 'About long COVID' to the 'segment'.

During the same period, to safeguard against missing 'new' articles about long COVID in our prospective study identification workflow, we also periodically searched in EPPI Reviewer-Web [2] for records of articles published in recent versions of our living map that lacked the supplementary code, and then reviewed them. We used the following keyword search strategy:

TITLE OR ABSTRACT=("long covid" OR "long term covid" OR "post covid syndrome" OR "post covid 19 syndrome" OR "post acute" OR "PASC" OR "chronic covid" OR "ongoing covid" OR "long haul" OR "long hauler" OR "post discharge" OR "postdischarge" OR "long term symptom" OR "persisting symptom" OR "persistent symptom" OR "prolonged symptom" OR "long term sequelae") OR TITLE=("long term effect" OR "long term impact" OR "long tail")

Once reviewed (including referral for 'second opinion' if needed) any further records about long COVID were added to the 'segment', by 'Topic' and 'Version'. 
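The keyword search strategy above can be approximated outside EPPI Reviewer-Web. The sketch below is one possible reading in Python, not a reproduction of EPPI Reviewer-Web's own search: it assumes case-insensitive matching, treats hyphens and other punctuation as spaces, and matches each phrase at the start of a word (so "persistent symptom" also matches "persistent symptoms", and "PASC" could over-match longer words that begin with those letters). The function names are hypothetical.

```python
import re
from typing import Iterable

# Phrases searched in the title or the abstract, copied from the strategy above.
TITLE_OR_ABSTRACT = [
    "long covid", "long term covid", "post covid syndrome",
    "post covid 19 syndrome", "post acute", "PASC", "chronic covid",
    "ongoing covid", "long haul", "long hauler", "post discharge",
    "postdischarge", "long term symptom", "persisting symptom",
    "persistent symptom", "prolonged symptom", "long term sequelae",
]

# Phrases searched in the title only.
TITLE_ONLY = ["long term effect", "long term impact", "long tail"]


def normalise(text: str) -> str:
    """Lower-case and collapse punctuation to spaces, so 'Long-term' becomes 'long term'."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def contains_any(text: str, phrases: Iterable[str]) -> bool:
    """True if any phrase occurs in the text, starting at a word boundary."""
    norm = normalise(text)
    return any(re.search(r"\b" + re.escape(normalise(p)), norm) for p in phrases)


def matches_strategy(title: str, abstract: str) -> bool:
    """Apply the two-part strategy: title-or-abstract phrases, plus title-only phrases."""
    return (
        contains_any(title, TITLE_OR_ABSTRACT)
        or contains_any(abstract, TITLE_OR_ABSTRACT)
        or contains_any(title, TITLE_ONLY)
    )
```

For example, matches_strategy("Long-term impact of infection", "") is true because of the title-only phrase "long term impact", whereas the same words appearing only in the abstract would not match.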

From December 2021 (Version 83) until October 2023 (Version 106), we used a Bidirectional Encoder Representations from Transformers (BERT) model [3] to automatically assign some records from each sequential update of the OpenAlex (formerly MAG – see below) dataset to either one of our eleven (primary) 'Topic' codes, or an 'exclude' code, in the broader, parent living map of COVID-19 research, with other records assigned for manual screening-coding by a researcher, as before (as described above in this section). From December 2021 (Version 83) until March 2022 (Version 97), those records automatically assigned to one of the eleven (primary) 'Topic' codes by the main BERT model – which were then published in the main, parent living map – were also sent to a supplementary, semi-automated workflow, comprising use of: (i) a bespoke BERT model (binary classifier) that automatically assigned the supplementary code 'About long COVID' to some records; (ii) a bespoke machine learning classifier designed to identify records of articles about long COVID (see our living map of COVID-19 research for further details), combined with the keyword search strategy shown above in this section; and (iii) manual screening of selected records (prioritised by ii). Further records about long COVID identified using this supplementary, semi-automated workflow were also added to the 'segment', by 'Topic' and 'Version'. After Version 97 (March 2022) we discontinued manual coding. We continue to import records when the OpenAlex dataset is updated, and upload those records coded by the BERT models.

Following its final update (on 6th December 2021), the MAG dataset and the academic.microsoft.com website were discontinued. We therefore migrated our study identification workflows for the main living map of COVID-19 research from MAG to the OpenAlex dataset (MAG’s drop-in replacement) from January 2022.

On 28th February 2022, we added 41 further study reports (articles) to the long COVID ‘segment’ that have been identified as eligible for consideration in an update of a living systematic review of research on long COVID [4], of which: 32 were already included in our broader C-19 living map but not yet in the long COVID ‘segment’; and 9 were not yet included in our broader C-19 living map (now added).

Prospective – Fully Automated
From May 2022 (Version 98) until the transition to a stand-alone map in May 2024 (Version 107), we discontinued manual coding for the main living map of COVID-19 research and suspended manual coding for this long COVID 'segment'. We continued to import records from automated searches of each sequential update of the OpenAlex dataset and we added those records automatically coded by our machine learning classification (BERT) models as ‘included’ (i.e. assigned to one of eleven 'Topic' codes and ‘On topic for long COVID’) to the main, broader living map of COVID-19 research and the predecessor of this living map (known as the long COVID ‘segment’).

N.B. The study identification workflow for our main living map of COVID-19 research also incorporates bespoke, supplementary automated searches of sequential updates of the OpenAlex dataset, which are designed to increase coverage of relevant articles in this long COVID 'segment'.

Results

Current
The 16th update (current version) of the long COVID ‘segment’ (up to Version 107, published 5th June 2024) contains 6,408 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version' - 558 records are 'new' in this version.

Previous
The inaugural version of the long COVID 'segment' (up to Version 75, published 7th October 2021) contained 261 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 1st update of the long COVID 'segment' (up to Version 79, published 4th November 2021) contained 470 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 2nd update of the long COVID ‘segment’ (up to Version 83, published 9th December 2021) contained 816 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 3rd update of the long COVID ‘segment’ (up to Version 85, published 6th January 2022) contained 1,362 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 4th update of the long COVID ‘segment’ (up to Version 89, published 3rd February 2022) contained 1,646 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 5th update of the long COVID ‘segment’ (up to Version 93, published 4th March 2022) contained 2,292 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 6th update of the long COVID ‘segment’ (up to Version 97, published 31st March 2022) contained 2,454 records of research articles about long COVID, partitioned by 'Topic' and 'Version'.

The 7th update of the long COVID ‘segment’ (up to Version 98, published 5th May 2022) contained 2,488 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 8th update of the long COVID ‘segment’ (up to Version 99, published 19th October 2022) contained 3,362 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 9th update of the long COVID ‘segment’ (up to Version 100, published 20th January 2023) contained 4,095 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 10th update of the long COVID ‘segment’ (up to Version 101, published 10th March 2023) contained 4,646 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 11th update of the long COVID ‘segment’ (up to Version 102, published 3rd May 2023) contained 4,789 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 12th update of the long COVID ‘segment’ (up to Version 103, published 18th May 2023) contained 5,022 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 13th update of the long COVID ‘segment’ (up to Version 104, published 13th June 2023) contained 5,197 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 14th update of the long COVID ‘segment’ (up to Version 105, published 20th July 2023) contained 5,477 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.

The 15th update of the long COVID ‘segment’ (up to Version 106, published 24th October 2023) contained 5,850 records of research articles automatically classified as being likely to be about long COVID, partitioned by 'Topic' and 'Version'.
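The cumulative totals reported in this Results section can be turned into per-update changes with a few lines of Python. This is only a convenience sketch: the totals are copied from the text above, and the differences are net changes, which equal the number of 'new' records only if no records were removed between versions.

```python
# Cumulative record counts as reported above, from the inaugural version
# through to the 16th update (current version).
totals = [261, 470, 816, 1362, 1646, 2292, 2454, 2488,
          3362, 4095, 4646, 4789, 5022, 5197, 5477, 5850, 6408]

# Net change between each version and the one before it.
increments = [later - earlier for earlier, later in zip(totals, totals[1:])]

# The final entry is the net growth from the 15th to the 16th update.
latest_increment = increments[-1]
```

The final difference (6,408 minus 5,850) is 558, which agrees with the number of 'new' records stated for the current version.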
 

Funding
The living map of COVID-19 research, including this long COVID 'segment', was commissioned by the National Institute for Health Research (NIHR) Policy Research Programme (PRP) for the Department of Health and Social Care (DHSC) and Public Health England (PHE). It is funded through the NIHR PRP contract with the EPPI Centre at UCL (Reviews facility to support national policy development and implementation, PR-R6-0113-11003). Any views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the NIHR or the DHSC.

Conflicts of interest
None.

Contributions
Any opinions expressed in this publication are not necessarily those of the EPPI Centre or the funders. Responsibility for any views expressed remains solely with the authors.

References
[1] Lorenc T, Khouja C, Raine G, Shemilt I, Sutcliffe K, D'Souza P, Burchett H, Hinds K, Khatwa M, Macdowall W, Melton H, Richardson M, South E, Stansfield C, Thomas S, Kwan I, Wright K, Sowden A, Thomas J (2020). COVID-19: living map of the evidence. London: EPPI-Centre, Social Science Research Unit, UCL Social Research Institute, University College London.

[2] Thomas J, Graziosi S, Brunton J, Ghouze Z, O'Driscoll P, Bond M (2022). EPPI-Reviewer: advanced software for systematic reviews, maps and other evidence synthesis [Software]. https://eppi.ioe.ac.uk/CMS/Default.aspx?alias=eppi.ioe.ac.uk/cms/er4  

[3] Devlin J, Chang M-W, Lee K, Toutanova K (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805

[4] Michelen M, Manoharan L, Elkheir N, Cheng V, Dagens A, Hastie C, O'Hara M, Suett J, Dahmash D, Bugaeva P, Rigby I, Munblit D, Harriss E, Burls A, Foote C, Scott J, Carson G, Olliaro P, Sigfrid L, Stavropoulou C. (2021). Characterising long COVID: a living systematic review. BMJ Global Health 6: e005427.

Copyright
Authors of the systematic reviews on the EPPI-Centre website [http://eppi.ioe.ac.uk] hold the copyright for the text of their reviews. The EPPI-Centre owns the copyright for all material on the website it has developed, including the contents of the databases, manuals, and keywording and data-extraction systems. The centre and authors give permission for users of the site to display and print the contents of the site for their own non-commercial use, providing that the materials are not modified, copyright and other proprietary notices contained in the materials are retained, and the source of the material is cited clearly following the citation details provided. Otherwise, users are not permitted to duplicate, reproduce, re-publish, distribute, or store material from this website without express written permission.

The NIHR Policy Research Programme Reviews Facility is a collaboration between:

UCL Social Research Institute
EPPI Centre
London School of Hygiene & Tropical Medicine
University of York Centre for Reviews and Dissemination

 

EPPI-Vis is developed and maintained by the EPPI-Centre. The data shown is retrieved in real time from the EPPI-Reviewer database.