This living map (the Map) consists of bibliographic records of research articles on COVID-19 published since 2019.a To check that you are viewing the latest version, please visit this page.
Using this map:
There are various ways to use the Map.
The size of the circles (or the number of tiles in Mosaic view) is proportional to the number of bibliographic records (research articles) in the cell. To view these records, click on the cell of interest. You can also click on a column or row header to view all the records relating to that row or column.
To view the full set of records, click on the Reader tab above the Map. Records can be filtered, and exported as a RIS file, from here. Further details about each study can be found by clicking on the URL(s) and exploring the corresponding full text, where available.
It may appear as though there are no studies in a cell; hover the cursor over the cell to see how many studies are actually in that cell. In the filters tab, change the view to Mosaic to more easily see the number of studies in each cell.
Identifying the evidence:
Prior to implementing the MAG-enabled workflow described below (see Version 35 to 44 and Version 45 to date), evidence in this Map was identified by conducting weekly searches of MEDLINE and Embase, beginning on Wednesday 4th March 2020, and updating the search strategy as necessary.
MEDLINE search strategy as at October 2020:
Database: Ovid MEDLINE(R) ALL <1946 to October 01, 2020>
1 ("20200925" or "20200926" or "20200927" or "20200928").dt. (14732)
2 preprint.pt. (1048)
3 1 not 2 (14732)
4 limit 3 to covid-19 (1119)
The Embase search strategy as at October 2020:
Database: Embase <2016 to 2020 Week 40>
1 "202040".em. (120134)
2 limit 1 to covid-19 (5705)
For further details of these MEDLINE/Embase search strategies, please see current OVID Covid-19 Expert Searches developed by Wolters Kluwer, available from: https://tools.ovid.com/ovidtools/expertsearches.html#corona - for example (October 2020):
Coronavirus (Covid-19) 2019-nCoV on MEDLINE
1. exp Coronavirus/
2. exp Coronavirus Infections/
3. (coronavirus* or corona virus* or OC43 or NL63 or 229E or HKU1 or HCoV* or ncov* or covid* or sars-cov* or sarscov* or Sars-coronavirus* or Severe Acute Respiratory Syndrome Coronavirus*).mp.
4. (or/1-3) and ((20191* or 202*).dp. or 20190101:20301231.(ep).) [this set is the sensitive/broad part of the search]
5. 4 not (SARS or SARS-CoV or MERS or MERS-CoV or Middle East respiratory syndrome or camel* or dromedar* or equine or coronary or coronal or covidence* or covidien or influenza virus or HIV or bovine or calves or TGEV or feline or porcine or BCoV or PED or PEDV or PDCoV or FIPV or FCoV or SADS-CoV or canine or CCov or zoonotic or avian influenza or H1N1 or H5N1 or H5N6 or IBV or murine corona*).mp. [line 5 removes noise in the search results]
6. ((pneumonia or covid* or coronavirus* or corona virus* or ncov* or 2019-ncov or sars*).mp. or exp pneumonia/) and Wuhan.mp.
7. (2019-ncov or ncov19 or ncov-19 or 2019-novel CoV or sars-cov2 or sars-cov-2 or sarscov2 or sarscov-2 or Sars-coronavirus2 or Sars-coronavirus-2 or SARS-like coronavirus* or coronavirus-19 or covid19 or covid-19 or covid 2019 or ((novel or new or nouveau) adj2 (CoV or nCoV or covid or coronavirus* or corona virus or Pandemi*2)) or ((covid or covid19 or covid-19) and pandemic*2) or (coronavirus* and pneumonia)).mp.
8. COVID-19.rx,px,ox. or severe acute respiratory syndrome coronavirus 2.os.
9. ("32240632" or "32236488" or "32268021" or "32267941" or "32169616" or "32267649" or "32267499" or "32267344" or "32248853" or "32246156" or "32243118" or "32240583" or "32237674" or "32234725" or "32173381" or "32227595" or "32185863" or "32221979" or "32213260" or "32205350" or "32202721" or "32197097" or "32196032" or "32188729" or "32176889" or "32088947" or "32277065" or "32273472" or "32273444" or "32145185" or "31917786" or "32267384" or "32265186" or "32253187" or "32265567" or "32231286" or "32105468" or "32179788" or "32152361" or "32152148" or "32140676" or "32053580" or "32029604" or "32127714" or "32047315" or "32020111" or "32267950" or "32249952" or "32172715").ui. [Articles not captured by this search when created in April 2020, pending further indexing by NLM]
10. or/6-9 [Lines 6 to 9 are specific to Covid-19]
11. 5 or 10
12. 11 and 20191201:20301231.(dt).
13. remove duplicates from 12
From 20th July 2020 (Search 20) until 26th October 2020 (Search 34), all unique (following de-duplication) 'new' MEDLINE/Embase records were also scored using a binary machine learning (ML) classifier described below (see Version 35 to 44). MEDLINE/Embase records scoring above an identified threshold were retained for screening, while those scoring below it were set aside.
Version 35 to 44
From 9th November 2020 (Version 35) onwards, we stopped searching MEDLINE and Embase (see Searches 1-34, above) and began to identify the evidence using automated continuous prospective surveillance of the Microsoft Academic Graph (MAG) dataset. A MAG-enabled workflow was operationalised using the new MAG Browser suite of tools in EPPI-Reviewer Web (ER-Web). The full MAG dataset currently comprises >245 million bibliographic records of research articles on all topics across science, connected in a large network graph of conceptual (e.g. related publications or fields of study), citation and author relationships.
Each time an updated copy of the MAG dataset is released by Microsoft™ (currently, every two weeks), all 'new' MAG records (i.e. records of articles not indexed in any preceding version of the MAG dataset - up to 1 million new records per update) and their associated metadata are automatically imported into MAG Browser (ER-Web) systems. New MAG records are then automatically scored by our novel Continuous Review (ContReview) machine learning (ML) model.b The ContReview model exploits both network graph features and text features of new MAG records (with reference to the same features of known 'include' MAG records identified and coded in preceding versions of this Map) to score the new records and prioritise them for potential manual screening-coding by ranking them from highest to lowest score (see Coding the evidence, below). Preprints, and articles from specific sourcesc that are invariably excluded from this Map, are automatically filtered out and discarded. We then retain the top-scoring (on ContReview) new, filtered MAG records - between 3,000 and 10,000 per MAG update, depending on the time elapsed since the preceding update and the total number of new records in that update.
Next, we use ER-Web de-duplication tools to identify and remove duplicate new MAG records. We then re-score the remaining top-scoring new MAG records (i.e. those ranked highest by our ContReview model) using a binary ML classifier that we designed to distinguish between title-abstract records included in (positive class), and those excluded from (negative class), this Map.d New MAG records scoring above an identified threshold on the binary ML classifier are retained, while those scoring below it are set aside. The retained new MAG records then proceed to potential manual screening and coding (see Coding the evidence).
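The retain/set-aside step amounts to a simple partition of scored records. The sketch below is a minimal illustration only; the record structure and threshold value are our assumptions, not ER-Web's actual interface:

```python
# Minimal sketch of the retain / set-aside step. Record fields and the
# threshold value are illustrative assumptions, not ER-Web internals.
def split_by_threshold(records, threshold):
    """Partition records into (retained, set_aside) by classifier score."""
    retained = [r for r in records if r["score"] >= threshold]
    set_aside = [r for r in records if r["score"] < threshold]
    return retained, set_aside
```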
Version 45 to date
From Version 45 onwards, we supplemented the set of top-scoring new records (on the ContReview model) with a second set of MAG records, from each update of the MAG dataset, identified using a COVID-19 'custom search' that we developed and executed using MAG Browser tools in ER-Web.
We restricted this custom search to MAG records, in each update of the MAG dataset, with publication dates on or after 6th July 2020, and (automatically) only imported those records which had not previously been either (a) imported into ER-Web from MAG or (b) matched to a corresponding MEDLINE/Embase record. These records were then processed into our Priority Screening workflow in precisely the same way as the set of top-scoring new MAG records on our ContReview model (see Version 35 to 44, above).
Coding the evidence:
Prior to using the binary ML classifier score to discard low scoring records (see Identifying the evidence, above) and conducting screening-coding using priority screening mode (see Search 30 to date, below) we screened-coded all retrieved MEDLINE/Embase records in quasi-random order (i.e. without prioritisation), with weekly screening-coding assignments of varying sizes allocated between coding team members. Each coding team member could also refer selected records for a second opinion, to be resolved by team discussion and consensus.
Search 30 to date
From 28th September 2020 (Search 30) to date, screening and coding of new retained MEDLINE/Embase records (up to Search 34) or MAG records (from Version 35 onwards) has been conducted using priority screening mode in ER-Web. In priority screening mode, retained records (i.e. top-scoring on the ContReview model and above the threshold score on the binary ML classifier - see Identifying the evidence) are screened in prioritised rank order (highest to lowest) based on scores assigned by the binary ML classifier, and the rank order of those records awaiting screening is periodically reprioritised based on all preceding coding decisions (i.e. active learning [4, 5]).
From 28th September 2020 (Search 30) to date, each team member has had a fixed weekly target of screening and coding 1,500 records using priority screening mode. All retained records not screened-coded by the team in a given week are carried forward, along with new records from the next updated version of the MAG dataset (or, for Searches 30 to 34, from the next MEDLINE/Embase searches), to the pool of records to be reprioritised for screening-coding in a subsequent week. The option of referring selected records for a second opinion remains in the MAG-enabled workflow, and second-opinion records are still resolved by team discussion and consensus.
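As a rough illustration of priority screening, the loop below works through records in descending score order, batch by batch, until a weekly target is met. The function names, batch mechanics and record fields are our assumptions; in the real workflow the remaining pool is also periodically re-scored by a model retrained on all coding decisions so far (active learning), which this sketch only notes in a comment:

```python
# Illustrative priority-screening loop. In ER-Web the pool is periodically
# re-scored by a retrained model (active learning); here the ranking is
# simply refreshed by the existing scores.
def screen_by_priority(pool, decide, batch_size, weekly_target):
    """pool: list of dicts with 'id' and 'score'; decide: coding function.
    Returns (decisions, carried_forward)."""
    decisions = []
    pool = list(pool)
    while pool and len(decisions) < weekly_target:
        pool.sort(key=lambda r: r["score"], reverse=True)  # highest rank first
        batch, pool = pool[:batch_size], pool[batch_size:]
        decisions.extend((r["id"], decide(r)) for r in batch)
        # A real implementation would retrain on `decisions` here and
        # update the scores of `pool` before the next iteration.
    return decisions, pool
```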
Current criteria for inclusion in the Map under each category heading (topic code) are as follows.
- Primary empirical data, systematic review,* modelling,** full economic evaluation,*** or novel analysis on COVID-19
- Treatment evaluation
- Any intervention aimed at treatment and/or prevention of COVID (i.e. with either a population of COVID patients or COVID incidence as an outcome), including vaccines
- Prospective outcome studies with comparison between researcher-allocated groups (i.e. randomised trials, quasi-randomised trials, and non-randomised trials with researcher allocation)
- Exclude observational / retrospective studies including treatment as an exposure and uncontrolled studies (code as treatment development)
- Exclude case reports with some information about treatment (code as case reports)
- Exclude basic science with claimed relevance to interventions, but without evaluation of effectiveness in human patients (code as treatment development or vaccine development)
- Transmission / risk / prevalence
- Epidemiological modelling of spread (incl. studies which aim to model health outcomes or health system outcomes; include population mortality rates (i.e. deaths relative to total population); exclude case fatality rates (i.e. deaths relative to COVID cases)); include genetic epidemiology if the main focus is on spread of disease (if the focus is on characterising strains, code as genetics / biology)
- Risk modelling
- Studies of viral persistence in bodily secretions / tissues or in the environment, e.g. on surfaces (including methods to inactivate the virus in these contexts)
- Population prevalence studies
- Studies of risk factors for developing COVID-19 at individual level (not risks of developing more severe disease/complications among people infected with COVID-19; code as health impacts) or at population level
- Studies of the effectiveness of prevention strategies e.g. masks, drugs, contact tracing (only if data on prevalence reported). Exclude data on preventive behaviour outcomes only (code as social / economic / indirect impacts). Exclude studies of disinfection, aerosolisation etc. if they do not include data on COVID-19 (code as not on topic)
- Sensitivity and specificity of tests for COVID
- Modelling/evaluation of screening or testing programmes, or training clinicians in diagnosis
- Include studies of clinical signs if the main focus is on their diagnostic value (if the focus is on health outcomes or prognosis, code under health impacts)
- Exclude diagnosis of other conditions / comorbidities in COVID patients
- Health impacts of COVID-19
- Any study with a population of COVID-19-infected patients measuring physical health outcomes (incl. case fatality rates) and/or somatic indicators (code studies reporting prevalence data here if they also present health impacts).
- Include studies of prognostic factors, indicators of disease progress or severity
- Include case series if they have more than a small number of patients and report descriptive statistics
- Include indirect health impacts on healthcare workers (e.g. from PPE use)
- Studies of comorbidities (e.g. coinfections), if not explicitly analysed as risk factors for infection
- Vaccine development
- Basic science aimed at development of vaccines
- Include animal studies testing human vaccines only
- Treatment development
- Basic science aimed at development of treatment, e.g. drug discovery
- Include animal studies testing human treatments only
- Studies looking at treatments but not meeting methodological criteria for treatment evaluation, e.g. observational / retrospective studies including treatment as an exposure, studies without researcher-allocated control group, or modelling based on evaluation data (but exclude studies of outcomes which simply state that treatment was administered without relating outcomes to treatments - code under health impacts)
- Studies of treatment safety / side effects
- Training clinicians to deliver interventions
- (Studies of relevance to both vaccine and treatment development - code under treatment development)
- Genetics / biology of virus
- Any data on the genetic or biological characteristics of the virus, or of mechanisms or responses to infection (including antibody responses, if not clearly aimed at diagnosis/vaccine development)
- Include modelling on the basis of secondary data analysis
- Exclude studies of biological mechanisms theoretically linked to COVID-19 infection, but without data which actually concern COVID-19
- If explicitly aimed at treatment (resp. vaccine) development, code as treatment (resp. vaccine) development
- Case reports - patients
- Medical case reports of small numbers of patients considered as individuals
- Include any case with confirmed COVID-19 or symptoms or history suggestive of COVID-19 infection (otherwise code as not on topic; case studies of health professionals and mental health consequences of lockdown are also not on topic)
- Include mental health cases tested using the "Fear of COVID" scale or equivalent
- Case study - organisation
- Descriptive studies setting out organisational responses / strategies to COVID-19
- Surveys of professionals/institutions on organisational responses (not broader knowledge or attitudes to COVID-19 - code these as social and economic impacts)
- Include any organisation (healthcare or other) and any form of response to COVID-19, whether directly concerning COVID-19 patients or not
- Exclude guidance or recommendation papers which do not describe the recommended measures being implemented in a specific case
- Social, economic and indirect impacts
- Include studies mainly focusing on behaviour, attitudes etc.
- Studies of information (e.g. analysis of websites or social media)
- Surveys of professionals if not mainly focused on organisational responses
- Studies of behaviour or health outcomes of patients without diagnosed COVID-19 (including total excess mortality, unless separable data on COVID-19 mortality are available)
- Studies of other impacts of COVID-19 or COVID-19 control measures (e.g. environmental impacts of lockdown)
- Mental health impacts
- Include both COVID-19 patients and/or indirect mental health impacts on the broader population (or healthcare workers, etc.)
- Include mental health status (anxiety, depression, etc.) and sleep-related outcomes
- Where studies have an equal focus on mental health impacts and health and/or indirect impacts, code as mental health impacts
* Define systematic review as any paper reporting secondary data which reports: some search terms; clearly defined inclusion criteria; and some information on the selection process (at least N of references located by searches and N of studies included). Include any systematic review which aimed to include studies on COVID, whether or not any were located. Include updates to systematic reviews and living reviews if the report presents new data and the original review meets the criteria above.
** Include modelling studies which are at least partly based on empirical data related to COVID (e.g. data used as inputs to the model, or data against which the model is being calibrated or tested); code purely theoretical modelling as not primary data.
*** Include full economic evaluations (i.e. cost-effectiveness analyses, cost-minimisation analyses, cost-utility analyses, or cost-benefit analyses - see https://yhec.co.uk/resources/glossary/). Include model-based and single-study based economic evaluations. Code topic based on the main focus/ aim of the study and the type(s) of outcome data utilised in the economic evaluation (e.g. code cost-effectiveness analyses of clinical treatments for covid-19 with outcome (effects) data sourced from one or more randomised controlled trials as treatment evaluation).
In general, code using the main aim (or the main focus) of the paper if it covers more than one topic. Code systematic reviews by the inclusion criteria or the focus of the included papers.
Two exclude codes were originally displayed in the Map, but since 21st April 2020 these are no longer shown; since 12th February, they have been combined into one code.
- Other viruses (SARS, MERS, etc.)
- Anything on human coronaviruses other than COVID-19; include both primary data and non-data papers
- No primary empirical data, systematic review or modelling
- Thinkpieces, non-systematic reviews, guidance, consensus statements etc.
- Protocols for studies or reviews which do not report findings data
- Systematic reviews which do not report findings data (mapping reviews; reviews which only contain guideline documents / opinion pieces)
- Methods papers (including validation of data collection methods, if usable primary data not reported)
- Corrections, errata, retractions
- Responses or replies which do not report substantive new data or analysis
The remaining excludes are all not on topic, preprints (which will be published in a journal if they pass peer review), or duplicates identified while screening and coding.
To access a RIS file for any of these codes, please email firstname.lastname@example.org
Results for searches 1 to 34 are in previous versions of the Map.
Version 35 to date:
Results for versions 35 and 36 are in previous versions of the Map.
For Version 37, published on 3rd December, 2020, we did not import any new records because the MAG dataset had not been updated. We coded 1,530 records: 476 were added to the Map, 818 were excluded, and 236 were not on topic. The remaining 1,001 unscreened records were carried forward.
For Version 38, published on 10th December, 2020, we imported 9,810 new MAG records from the 23rd November, 2020 update of the MAG dataset. Of these, 90 records were duplicates, leaving 9,720 new records. Using the ML classifier, we discarded 1,684 records, and the remaining 8,036 records were added to the pool of 1,001 records carried forward, making 9,037 records to be coded. We coded 1,588 records: 1,136 were added to the Map, 150 were excluded, and 302 were not on topic. The remaining 7,449 records were carried forward.
For Version 39, published on 17th December, 2020, we did not import any new records because the MAG dataset had not been updated. We coded 1,522 records: 836 were added to the Map, 215 were excluded, and 471 were not on topic. The remaining 5,927 unscreened records were carried forward.
For Version 40, published on 7th January, 2021, we imported 4,274 new records from the MAG update on 12th December and 4,288 new records from the MAG update on 21st December. Of these, 85 were duplicates, leaving 8,477 new records. Using the ML classifier, we discarded 1,322 records and the remaining 7,155 records were added to the pool of 5,927 records carried forward, making 13,082 records to be coded. We coded 1,575 records: 1,472 were added to the Map, 70 were excluded, and 33 were not on topic. The remaining 11,507 unscreened records were carried forward.
For Version 41, published on 14th January, 2021, we did not import any new records because the MAG dataset had not been updated. We coded 1,520 records: 803 were added to the Map, 365 were excluded, and 352 were not on topic. The remaining 9,987 unscreened records were carried forward.
For Version 42, published on 21st January, 2021, we imported 10,635 records from the MAG update on 5th January. Of these, 28 were duplicates, leaving 10,607 new records. Using the ML classifier, we discarded 1,680 records, and the remaining 8,927 records were added to the 9,987 records carried forward, making 18,914 records to be coded. We coded 1,523 records: 659 were added to the Map, 431 were excluded, and 433 were not on topic. The remaining 17,391 records were carried forward.
For Version 43, published on 29th January, 2021, we did not import any records because the MAG dataset had not been updated. We coded 1,607 records: 1,121 were added to the Map, 252 were excluded, and 234 were not on topic. The remaining 15,784 unscreened records were carried forward.
For Version 44, published on 4th February, 2021, we did not import any records because the MAG dataset had not been updated. We coded 1,514 records: 824 were added to the Map, 426 were excluded, and 264 were not on topic. The remaining 14,270 unscreened records were carried forward.
For Version 45, published on 12th February, 2021, we imported two sets of records from the MAG dataset updated on 18th January. The first set contained 9,883 records, of which 44 were duplicates and 1,500 were discarded using the ML classifier, leaving 8,339 records. The second set was from our inaugural custom search of the MAG dataset. This set contained 10,355 records, of which 87 were duplicates, and 2,811 were discarded using the ML classifier, leaving 7,457 records. In total, 15,796 records were added to the pool of 14,270 unscreened records; 1,331 records were identified as preprints and removed from the pool, leaving 28,735 records to be coded. We coded 1,548 records: 1,420 were added to the Map, 74 were excluded, and 54 were not on topic. The remaining 27,187 records were carried forward.
For Version 46, published on 18th February, 2021, we coded most records before the new import. On 18th February, we imported two sets of records from the MAG dataset updated on 1st February. The first set contained 9,898 records, of which 177 were duplicates and 1,562 were discarded using the ML classifier, leaving 8,159 records. The second set (from the custom search) contained 5,630 records, of which 155 were duplicates and 1,634 were discarded by the classifier, leaving 3,841 records. In total, 12,000 records were added to the pool of 27,187 unscreened records, making 39,187 records to be coded. We coded 1,594 records; 983 were added to the Map, 326 were excluded, and 285 were not on topic. The remaining 37,593 records were carried forward.
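The per-version arithmetic above follows a fixed pattern: imported records minus duplicates and classifier discards join the pool carried over, any preprints are removed, and coded records leave the pool. The function below (our own naming, included only as a worked check against the reported figures) captures that pattern:

```python
# Arithmetic of one update cycle, following the version notes above:
# imported records minus duplicates and classifier discards join the
# carried-over pool, preprints (if any) are removed, and coded records
# leave it; the remainder is carried forward to the next week.
def carried_forward(carried, imported, duplicates, discarded, coded, preprints=0):
    retained_new = imported - duplicates - discarded
    pool = carried + retained_new - preprints  # records awaiting coding
    return pool - coded
```

For example, the Version 38 figures (1,001 carried, 9,810 imported, 90 duplicates, 1,684 discarded, 1,588 coded) yield the reported 7,449 records carried forward.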
a Primary empirical data, systematic review, modelling, full economic evaluation, or novel analysis on COVID-19 - see Coding the evidence.
b Our ContReview model was built and tested (in collaboration with Microsoft™) using MAG records of COVID-19 research articles that we had matched to records included in the Map up to Search 19 (and which are therefore also indexed in MEDLINE and/or Embase).
c New Scientist, The Conversation, NEJM Journal Watch, Veterinary Record, Chemical & Engineering News and Physics Today.
d The binary ML classifier was built and tested in ER-Web using MAG records of COVID-19 research articles that we had matched to records included in, and excluded from, our Map up to Search 19 (and which are therefore also indexed in MEDLINE and/or Embase). It was calibrated to achieve at least 0.95 recall among MEDLINE-Embase records included in this Map, with a corollary workload reduction of ~30% (compared with screening all MEDLINE-Embase records).
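The calibration described in note d can be sketched as choosing the highest score threshold that still achieves the target recall on known includes. This is a hedged illustration under that one assumption; the actual calibration procedure used in ER-Web may differ:

```python
# Sketch of recall-targeted threshold calibration. Lowering the threshold
# can only increase recall, so the highest acceptable threshold is found
# by scanning candidate thresholds from high to low.
def calibrate_threshold(scores, is_include, min_recall=0.95):
    """scores: classifier scores; is_include: True for known includes.
    Returns the largest candidate threshold meeting the recall target."""
    n_includes = sum(is_include)
    for t in sorted(set(scores), reverse=True):
        recalled = sum(1 for s, y in zip(scores, is_include) if y and s >= t)
        if recalled / n_includes >= min_recall:
            return t
    return min(scores)  # fallback: retain everything
```

A higher threshold discards more records (greater workload reduction) at the cost of recall, which is the trade-off quantified in note d.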
Suggested citation for this Map:
Lorenc T, Khouja C, Raine G, Shemilt I, Sutcliffe K, D'Souza P, Burchett H, Hinds K, Macdowall W, Melton H, Richardson M, South E, Stansfield C, Thomas S, Kwan I, Wright K, Sowden A, Thomas J (2020) COVID-19: living map of the evidence. London: EPPI-Centre, Social Science Research Unit, UCL Social Research Institute, University College London.
This Map was commissioned by the National Institute for Health Research (NIHR) Policy Research Programme (PRP) for the Department of Health and Social Care (DHSC) and Public Health England (PHE). It was funded through the NIHR PRP contract with the EPPI Centre at UCL (Reviews facility to support national policy development and implementation, PR-R6-0113-11003). Any views expressed in this publication are those of the author(s) and not necessarily those of the NHS, the NIHR or the DHSC.
Conflicts of interest:
Any opinions expressed in this publication are not necessarily those of the EPPI-Centre or the funders. Responsibility for any views expressed remains solely with the authors.
- Sinha A, Shen Z, Song S, Ma H, Eide D, Hsu B-J, Wang K. An Overview of Microsoft Academic Service (MA) and Applications. In Proceedings of the 24th International Conference on World Wide Web (WWW '15 Companion): 243-246. ACM, New York, NY, USA. 2015. https://academic.microsoft.com/paper/1932742904
- Thomas J, Graziosi S, Brunton J, Ghouze Z, O'Driscoll P, Bond M (2020). EPPI-Reviewer: advanced software for systematic reviews, maps and other evidence synthesis [Software]. https://eppi.ioe.ac.uk/CMS/Default.aspx?alias=eppi.ioe.ac.uk/cms/er4
- Shemilt I, Thomas J. MAG-Net-ise it! How the use of Microsoft Academic Graph with machine learning classifiers can revolutionise study identification for systematic reviews. Oral paper accepted for presentation at the 26th Cochrane Colloquium, Santiago, Chile, 22-25 October 2019. https://colloquium2019.cochrane.org/abstracts/mag-net-ise-it-how-use-microsoft-academic-graph-machine-learning-classifiers-can
- Miwa M, Thomas J, O'Mara-Eves A, Ananiadou S. Reducing systematic review workload through certainty-based screening. Journal of Biomedical Informatics 2014; 51: 242-253. https://academic.microsoft.com/paper/2099883114
- O'Mara-Eves A, Thomas J, McNaught J, Miwa M, Ananiadou S. Using text mining for study identification in systematic reviews: a systematic review of current approaches. Systematic Reviews 2015; 4: 5. https://academic.microsoft.com/paper/2147469877
Authors of the systematic reviews on the EPPI-Centre website http://eppi.ioe.ac.uk hold the copyright for the text of their reviews. The EPPI-Centre owns the copyright for all material on the website it has developed, including the contents of the databases, manuals, and keywording and data-extraction systems. The centre and authors give permission for users of the site to display and print the contents of the site for their own non-commercial use, providing that the materials are not modified, copyright and other proprietary notices contained in the materials are retained, and the source of the material is cited clearly following the citation details provided. Otherwise users are not permitted to duplicate, reproduce, re-publish, distribute, or store material from this website without express written permission.
The NIHR Policy Research Programme Reviews Facility is a collaboration between the following:
UCL Social Research Institute
EPPI-Centre (Evidence for Policy and Practice Information and Co-ordinating Centre)
London School of Hygiene & Tropical Medicine
University of York Centre for Reviews and Dissemination