c. Modified Delphi survey
The statements provided by stakeholders were added, without editing, to
those generated from the umbrella review. Together they formed the long
list for the modified Delphi consensus survey among 30 stakeholders with
voting rights, administered through a web-based survey tool (www.surveymonkey.com).
A seven-point scale was provided to assess the level of agreement with
the content of each statement. The scale was anchored between “strongly
agree” and “strongly disagree”, with “agree”, “somewhat agree”,
“neither agree nor disagree”, “somewhat disagree”, and “disagree”
included as the scaled options for responses. The same scale was used in
both survey rounds, administered on 30th January and
9th February 2022. The sum of the “strongly agree”
and “agree” responses was used to compute an agreement rate for the
approval of each individual statement. The responses of the individual
stakeholders were kept anonymous throughout the whole process.
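As a minimal sketch of this computation (illustrative only, not the study’s analysis code; function names and the example votes are hypothetical), assuming responses are coded 1–7 with 6 = “agree” and 7 = “strongly agree”:

```python
from collections import Counter

def agreement_rate(responses):
    """Percent of voters choosing "agree" (6) or "strongly agree" (7)."""
    counts = Counter(responses)
    return 100 * (counts[6] + counts[7]) / len(responses)

# Hypothetical votes from 30 stakeholders on one statement
votes = [7, 7, 6, 6, 6, 5, 5, 4, 6, 7, 6, 6, 5, 6, 7,
         6, 6, 7, 5, 6, 6, 7, 6, 4, 6, 6, 7, 6, 5, 6]
print(f"Agreement rate: {agreement_rate(votes):.1f}%")  # 76.7%
```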
We used an objective method to determine the threshold or cut-off for
approval of the statements: the average percent of majority opinions
(APMO).25 For this computation, a statement was
considered agreed if the majority (>50%) of
stakeholders responded “strongly agree” or “agree” on the
seven-point scale, and considered disagreed if the
majority (>50%) responded “disagree” or
“strongly disagree”. The APMO consensus
threshold was calculated as: (number of majority-agreed statements +
number of majority-disagreed statements) / total number of responses
received × 100%. Statements with agreement rates above the APMO
threshold were considered to have reached consensus.
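A minimal Python sketch of this cut-off follows. It is illustrative rather than the study’s code: the 1–7 response coding is assumed, and “total number of responses received” is read here as the number of statements for which votes were returned (an interpretation, not stated in the text).

```python
def apmo_threshold(ratings):
    """APMO cut-off from {statement id: list of 1-7 responses}.
    Coding assumed: 1-2 = disagree side, 6-7 = agree side."""
    majority_decided = 0
    for responses in ratings.values():
        n = len(responses)
        agree = sum(r >= 6 for r in responses)     # "agree"/"strongly agree"
        disagree = sum(r <= 2 for r in responses)  # "disagree"/"strongly disagree"
        if agree > n / 2 or disagree > n / 2:      # >50% majority either way
            majority_decided += 1
    # Denominator read as statements voted on (assumption, see above)
    return 100 * majority_decided / len(ratings)
```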
For individual statements that reached consensus in each round, we
computed the strength of agreement among stakeholders using the
interquartile range (IQR).24 The IQR was the
difference between the first and third quartiles of the stakeholders’
responses on the seven-point scale. It was interpreted as follows: an IQR of 0
(>50% of stakeholders gave the same response) indicated very
good strength of agreement; an IQR of 1 (>50% of stakeholders’
responses fell within ≤2 points of the scale) indicated good strength
of agreement; and an IQR of ≥2 (>50% of stakeholders’
responses spanned >2 points of the scale) indicated poor
strength of agreement. As a sensitivity analysis, we used an arbitrary
approval threshold of 70%. Results were analysed using Stata v16
software (StataCorp. 2019. College Station, TX: StataCorp LLC).
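The IQR grading can be sketched as follows (again illustrative, not the study’s Stata code; note that the exact quartile method can shift results slightly on small ordinal samples):

```python
import numpy as np

def agreement_strength(responses):
    """Grade strength of agreement from 1-7 responses using the IQR."""
    q1, q3 = np.percentile(responses, [25, 75])
    iqr = q3 - q1
    if iqr == 0:
        return "very good"
    if iqr < 2:            # the paper's "IQR 1" band
        return "good"
    return "poor"          # "IQR >= 2"

print(agreement_strength([6, 6, 6, 7, 6, 6, 5, 6, 6, 6]))  # very good (IQR 0)
```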
Statements not having reached consensus in the first round using the
APMO threshold were merged with new statements provided by stakeholders
and subjected to the second round of the modified Delphi survey.
Statements deemed to have failed to reach consensus because of unclear
language had their wording improved, and statements containing similar
information were merged to avoid duplication.
First-round agreement rates were provided in the second survey round,
along with references to the reviews supporting the statements generated
via evidence synthesis. The approach to minor rewording, statement
merging, and statistical analysis in the second round was the same as
that used in the first round. Statements that failed to reach consensus
after the second round were taken forward for voting at the final
consensus development meeting.
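As an illustration, the carry-forward of non-consensus statements into the second round amounts to a simple filter over the round-one results, sketched below with hypothetical names (merging and rewording were editorial steps outside any code):

```python
def round_two_longlist(round_one_ratings, threshold, new_statements):
    """Statements whose round-one agreement rate did not exceed the APMO
    threshold, merged with statements newly proposed by stakeholders."""
    def rate(responses):  # percent choosing 6 ("agree") or 7 ("strongly agree")
        return 100 * sum(r >= 6 for r in responses) / len(responses)
    carried = [s for s, resp in round_one_ratings.items()
               if rate(resp) <= threshold]
    return carried + list(new_statements)
```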
To consolidate the provisional statement set, a core group of
stakeholders (AB, KSK, MNN, PC, MF) evaluated the statements that had
reached consensus for exact or inexact duplications and clarity of
meaning. Where the duplication was virtually exact, a single statement
was created, making only minor wording changes to clarify or enhance the
intended meaning. No major wording changes were introduced to any of the
statements that had met the consensus threshold. The statements without
consensus were revised in the same manner, with a view to improving the
clarity of their meaning and assisting subsequent voting. Thus, an
original statement may have been subjected to minor rewording or merger
with other statements several times through the different consensus
rounds. The list of statements resulting from the above process, both
those having reached consensus and those not having done so, was
tabulated and circulated to all the participants with the agreement
ratings and the underpinning references to reviews for the consensus
development meeting.