Version 6.16.2.0 introduces a number of different LLM models, which can be used for robot/auto coding tasks; previously, only one LLM/model was available. This release also contains a bug fix and a security fix.
Multiple LLM models
In early 2024 we introduced the robot-coding functionality, which used a GPT-4 LLM to automatically code individual items. Later that year, we introduced the possibility to run "batch robot-coding jobs", and eventually the ability for EPPI Reviewer to run multiple such jobs in parallel. Meanwhile, of course, the number of models available has increased, as existing providers expanded their offerings and new providers entered the market.
Thus, the next logical "LLM feature" to introduce in EPPI Reviewer is to allow people to decide which model to use for robot-coding jobs.
To this end, this release introduces support for a selected range of OpenAI models.
Upon clicking the robot-coding buttons, EPPI Reviewer now shows a list of models and requires users to pick one in order to run/submit any robot-coding request. Models vary in their declared capabilities, cost and speed, and usually have a "retirement date" too. Therefore, when users select a specific model, EPPI Reviewer shows the model details by default, along with a short description of its declared capabilities.
The models and their descriptions are:
- OpenAI GPT4: This is the first LLM deployed in EPPI Reviewer, in 2024. The model used is OpenAI "GPT-4o" with "model version = 2024-08-06". GPT models are optimised for "Chat", making them quicker and more "creative" than reasoning models.
- GPT-4.1-nano: The model used is OpenAI "GPT-4.1-nano" with "model version = 2025-04-14". This is a quick and cheap model.
- GPT-4.1: The latest model in the GPT series. The model used is OpenAI "GPT-4.1" with "model version = 2025-04-14".
- o4-mini: The model used is OpenAI "o4-mini" with "model version = 2025-04-16". This is a reasoning model, optimised for speed/efficiency. Reasoning models iterate on their answers with the intention of producing more accurate results (but evaluation is needed!), which also makes them slower than other models.
- o3: The model used is OpenAI "o3" with "model version = 2025-04-16". This is the latest OpenAI reasoning model we can currently make available, so it's also slow and expensive.
Please note that we are making the additional models available for the explicit purpose of evaluating their performance. While we (collectively) have collected and analysed data for the "GPT-4o" model as implemented in EPPI Reviewer, at present we have very little data on how the additional models compare. Therefore, we cannot meaningfully say that one model is better than another for a particular task at the moment.
All models vary in cost per input/output token, and also in their predisposition to produce more or fewer tokens; we therefore expect costs to vary significantly across the range of models implemented. For reference, we charge £1 per million input tokens and £1 per million output tokens for the cheapest model (GPT-4.1-nano), while for the most expensive model (o3) these figures rise to £8 and £30 respectively.
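To illustrate how these charges translate into job costs, the sketch below estimates the cost of a hypothetical batch job. The prices are the figures quoted above; the token counts, function and variable names are purely illustrative assumptions, not actual EPPI Reviewer usage data.

```python
# Illustrative sketch only: estimating the cost of a robot-coding batch job.
# Prices are the EPPI Reviewer charges quoted above (GBP per million tokens);
# the token counts below are hypothetical and exist only to show the arithmetic.

PRICES_PER_MILLION = {          # model: (input price, output price) in GBP
    "gpt-4.1-nano": (1.0, 1.0),
    "o3": (8.0, 30.0),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated charge in GBP for the given token counts."""
    in_price, out_price = PRICES_PER_MILLION[model]
    return (input_tokens / 1_000_000) * in_price + (output_tokens / 1_000_000) * out_price

# Hypothetical batch job: 500 items, ~4,000 input and ~300 output tokens per item.
items, in_per_item, out_per_item = 500, 4_000, 300
for model in PRICES_PER_MILLION:
    cost = estimate_cost(model, items * in_per_item, items * out_per_item)
    print(f"{model}: ~£{cost:.2f}")   # gpt-4.1-nano: ~£2.15, o3: ~£20.50
```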
For batch jobs, the maximum parallelism has been increased from 2 to 3. We hope to implement models provided by other companies (besides OpenAI) from the next release onwards.
Bug Fix: Activating new Cochrane Reviews
The system for automatically activating a new Cochrane review broke at some point in the past few months. This prevented Cochrane users from "activating" Cochrane reviews on their own and forced them to ask for our support instead (we could work around the problem on our end). This release fixes the problem.
Security fix: editing descriptions for Visualisations
When setting up visualisations, writing a good description is crucial. To this end, EPPI Reviewer includes a "what you see is what you get" HTML editor in the setup Visualisations page. The editor used until now was outdated and had known vulnerabilities. The present release replaces it with a different and safer HTML editor.