c. Modified Delphi survey
The statements provided by stakeholders were added, without editing, to those generated from the umbrella review. Together they formed the long list for the modified Delphi consensus survey among 30 stakeholders with voting rights, administered using a web-based survey tool (www.surveymonkey.com). A seven-point scale was provided to assess the level of agreement with the content of each statement. The scale was anchored between “strongly agree” and “strongly disagree”, with “agree”, “somewhat agree”, “neither agree nor disagree”, “somewhat disagree”, and “disagree” as the intermediate response options. The same scale was used in both survey rounds, administered on 30th January and 9th February 2022. The sum of the “strongly agree” and “agree” responses was used to compute an agreement rate for each individual statement. The responses of individual stakeholders were kept anonymous throughout the process.
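For illustration only, the sketch below renders this agreement-rate calculation in Python; the study's analyses were run in Stata v16, and the response data shown here are hypothetical.

```python
# Minimal sketch of the agreement-rate calculation for one statement
# (hypothetical data; the study's analyses were performed in Stata v16).

APPROVING = {"strongly agree", "agree"}

def agreement_rate(responses: list[str]) -> float:
    """Percent of stakeholders responding 'strongly agree' or 'agree'."""
    return 100.0 * sum(r in APPROVING for r in responses) / len(responses)

# Example: 30 anonymous stakeholder votes on one statement.
votes = (["strongly agree"] * 12 + ["agree"] * 9
         + ["somewhat agree"] * 5 + ["disagree"] * 4)
print(f"{agreement_rate(votes):.1f}%")  # -> 70.0%
```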
We used an objective method, the average percent of majority opinions (APMO), to determine the threshold or cut-off for approval of the statements.25 For this computation, a statement was considered agreed if the majority (>50%) of stakeholders responded “strongly agree” or “agree” on the seven-point scale, and disagreed if the majority (>50%) responded “disagree” or “strongly disagree”. The APMO consensus threshold was calculated as: (sum of majority agreed and majority disagreed statements ÷ total number of responses received) × 100%. Statements with agreement rates above the APMO threshold were considered to have reached consensus. For individual statements that reached consensus in each round, we computed the strength of agreement among stakeholders using the interquartile range (IQR),24 defined as the difference between the first and third quartiles of the stakeholders’ responses on the seven-point scale. It was interpreted as follows: an IQR of 0 (>50% of stakeholders gave the same response) indicated very good strength of agreement; an IQR of 1 (>50% of stakeholders’ responses fell within ≤2 points of the scale) indicated good strength of agreement; and an IQR of ≥2 (>50% of stakeholders’ responses spanned >2 points of the scale) indicated poor strength of agreement. As a sensitivity analysis, we used an arbitrary approval threshold of 70%. Results were analysed using Stata v16 (StataCorp LLC, College Station, TX, 2019).
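The sketch below illustrates, again in Python rather than the Stata actually used, how the majority-opinion classification, the APMO threshold, and the IQR grading could be computed. Note that the APMO function follows the conventional reading of the formula, in which the responses aligned with each statement's majority opinion are summed across statements and divided by the total number of responses received; this reading of the condensed formula above, like all data and names shown, is an assumption.

```python
import statistics

# Scale coded numerically: 1 = strongly agree, 2 = agree, ...,
# 6 = disagree, 7 = strongly disagree.

def majority_opinion(votes: list[int]) -> str | None:
    """'agreed' if >50% voted 1-2, 'disagreed' if >50% voted 6-7, else None."""
    n = len(votes)
    if sum(v <= 2 for v in votes) > n / 2:
        return "agreed"
    if sum(v >= 6 for v in votes) > n / 2:
        return "disagreed"
    return None

def apmo_threshold(statement_votes: list[list[int]]) -> float:
    """Assumed reading of APMO: responses aligned with each statement's
    majority opinion, summed over all statements, divided by the total
    number of responses received, x 100%."""
    aligned, total = 0, 0
    for votes in statement_votes:          # one inner list per statement
        total += len(votes)
        m = majority_opinion(votes)
        if m == "agreed":
            aligned += sum(v <= 2 for v in votes)
        elif m == "disagreed":
            aligned += sum(v >= 6 for v in votes)
    return 100.0 * aligned / total

def iqr_strength(votes: list[int]) -> str:
    """Grade strength of agreement from the interquartile range (Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(votes, n=4)  # first and third quartiles
    iqr = q3 - q1
    if iqr == 0:
        return "very good"
    if iqr <= 1:
        return "good"
    return "poor"

# Hypothetical round of three statements, each voted on by 30 stakeholders.
round_votes = [
    [1] * 14 + [2] * 10 + [3] * 4 + [6] * 2,  # majority agreed
    [6] * 12 + [7] * 9 + [4] * 9,             # majority disagreed
    [1] * 8 + [4] * 14 + [6] * 8,             # no majority opinion
]
print(f"APMO threshold: {apmo_threshold(round_votes):.1f}%")
print(f"Strength of agreement: {iqr_strength(round_votes[0])}")
```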
Statements that did not reach consensus in the first round using the APMO threshold were pooled with new statements proposed by stakeholders and put to the second round of the modified Delphi survey. Statements judged to have failed to reach consensus because of unclear language had their wording improved, and statements containing similar information were merged to avoid duplication. The first-round agreement rate for each statement was provided in the second survey round, together with the references to the reviews supporting the statements generated via evidence synthesis. The approach to minor rewording, statement merging and statistical analysis in the second round was the same as that used in the first round. Statements that failed to reach consensus in the second round were taken forward for voting at the final consensus development meeting.
To consolidate the provisional statement set, a core group of stakeholders (AB, KSK, MNN, PC, MF) evaluated the statements that had reached consensus for exact or near duplication and for clarity of meaning. Where the duplication was virtually exact, a single statement was created, with only minor wording changes to clarify or enhance the intended meaning. No major wording changes were introduced to any statement that had met the consensus threshold. The statements without consensus were revised in the same manner, with a view to improving the clarity of their meaning and assisting subsequent voting. Thus, an original statement may have undergone minor rewording or merger with other statements several times across the different consensus rounds. The list of statements resulting from this process, both those that had reached consensus and those that had not, was tabulated and circulated to all participants, together with the agreement ratings and the underpinning review references, in preparation for the consensus development meeting.