DISCUSSION

Our systematic review, based on 126 observational studies using routinely collected data, provides a comprehensive summary of the use and applications of LTOT definitions in observational research. We identified 78 distinct definitions, most of which used a minimum of 90 days of opioid therapy as the threshold for LTOT within a range of follow-up periods, commonly one year. The rationale cited for these definitions was based mostly on previous publications and clinical judgement; a minority of studies used empirical data to derive definitions or tested the impact of using multiple definitions. Moreover, we identified the need to improve reporting on methodological aspects affecting the LTOT definition, such as listing the medicines and formulations included in the analysis, describing how overlapping prescriptions and missing data were addressed, specifying the follow-up period, and explicitly stating the denominator for LTOT rates.
Whilst our systematic review was not the first to identify variation in LTOT definitions, its focus on routinely collected data and non-surgical settings adds complementary insights to prior systematic reviews.[14, 15] A key advantage of routinely collected data is that they facilitate the characterisation of multiple patterns of opioid use based on prescription/dispensing information, including frequency, dose, opioid type and strength.[14, 52] As expected, the duration of opioid use, commonly based on supply days, was the most common criterion used to define LTOT in our systematic review (67%), compared to 27%-38% in the previous ones.[14, 15] We also observed a higher number of studies using opioid dose to derive LTOT definitions (7 vs 0-1 studies)[14, 15, 134] and reporting how they accounted for overlapping prescriptions (22 vs 3 studies).[14]
Encouragingly, the commonly used threshold of 90 days of opioid therapy observed in this and prior studies [14] aligns with guideline recommendations for opioid trial duration and has been tested empirically using routinely collected data.[48, 49] The cumulative number of supply days (i.e., duration of prescriptions filled) has been identified as one of the strongest predictors of LTOT compared to other criteria, such as the number of refills or OMEs dispensed.[15, 123] However, information on supply days is not typically included in administrative databases in many countries outside the United States, such as Australia,[97] Italy,[35] and Denmark;[111] in these settings, estimates of treatment duration based on pack size, strength and quantity dispensed are hindered by the range of possible instructions for use of prescribed opioids. As an alternative, researchers can use a threshold based on the length of episodes of opioid use or the frequency of fills within a time period, with or without additional criteria.[35, 97, 108, 112]
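To make the duration-based approach concrete, the following is a minimal sketch of how a threshold of 90 supply days within a one-year follow-up might be applied to dispensing records. It is not taken from any of the reviewed studies. The function names (`covered_days`, `is_ltot`), the input layout, and the choice to count overlapping supply only once are assumptions made for illustration; studies may instead shift overlapping supply forward or sum it.

```python
from datetime import date, timedelta


def covered_days(fills, start, end):
    """Count distinct days with opioid supply in the window [start, end).

    `fills` is a list of (dispensing_date, days_supply) pairs (assumed layout).
    Days covered by more than one fill are counted once.
    """
    # Clip every supply interval to the follow-up window, then sort by start.
    intervals = sorted(
        (max(d, start), min(d + timedelta(days=s), end)) for d, s in fills
    )
    total = 0
    cur_start = cur_end = None
    for lo, hi in intervals:
        if lo >= hi:  # fill lies entirely outside the window
            continue
        if cur_end is None or lo > cur_end:
            # Gap before this interval: close the running episode.
            if cur_end is not None:
                total += (cur_end - cur_start).days
            cur_start, cur_end = lo, hi
        else:
            # Overlapping or contiguous supply: extend the running episode.
            cur_end = max(cur_end, hi)
    if cur_end is not None:
        total += (cur_end - cur_start).days
    return total


def is_ltot(fills, index_date, follow_up_days=365, threshold_days=90):
    """Apply a 'at least `threshold_days` covered days in follow-up' rule."""
    end = index_date + timedelta(days=follow_up_days)
    return covered_days(fills, index_date, end) >= threshold_days


# Hypothetical patient: three 30-day fills, the second overlapping the first.
fills = [
    (date(2020, 1, 1), 30),
    (date(2020, 1, 21), 30),
    (date(2020, 3, 1), 30),
]
print(covered_days(fills, date(2020, 1, 1), date(2020, 12, 31)))
print(is_ltot(fills, date(2020, 1, 1)))
```

In this hypothetical example the three fills sum to 90 supply days, but only 80 distinct days are covered once the overlap is removed, so the patient meets the threshold under one convention and not under the other. This is one reason the handling of overlapping prescriptions needs to be reported.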
Although duration measures suffice to determine LTOT, the nature of opioid therapy, such as the use of long- or short-acting opioids and opioid potency, is commonly reported by studies but was included as part of the LTOT definition in only seven studies. Similarly, estimating opioid dosage can be challenging, which explains in part the small number of studies using this criterion. This is despite evidence of a dose-dependent association between opioid use and harm at dosages greater than 40-50 mg OMEs, which escalates further at dosages over 90-120 mg OMEs.[6, 8-10]
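As an illustration of a dose-based criterion, the sketch below computes an average daily dose in mg OME and assigns it to a band. The function names, the input layout, and the band labels are hypothetical. The cut-offs of 50 and 90 mg are single values picked from the ranges quoted above, and the conversion factor in the example is an assumed value that should be replaced by the conversion table adopted in a given study.

```python
def average_daily_ome(dispensings, ome_factors, period_days):
    """Average daily dose in mg OME over `period_days`.

    `dispensings` holds (opioid, strength_mg, units) tuples (assumed layout).
    `ome_factors` maps each opioid name to an oral morphine conversion factor
    supplied by the caller.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    total_ome = sum(
        strength_mg * units * ome_factors[opioid]
        for opioid, strength_mg, units in dispensings
    )
    return total_ome / period_days


def dose_band(daily_ome, elevated_above=50, high_above=90):
    """Label a daily dose using two illustrative cut-offs (mg OME per day)."""
    if daily_ome > high_above:
        return "high"
    if daily_ome > elevated_above:
        return "elevated"
    return "lower"


# Hypothetical example: 120 tablets of oxycodone 10 mg dispensed over 30 days.
assumed_factors = {"oxycodone": 1.5}  # assumption, not taken from the review
daily = average_daily_ome([("oxycodone", 10, 120)], assumed_factors, 30)
print(daily, dose_band(daily))
```

Passing the conversion factors in as an argument keeps the calculation separate from the choice of conversion table, which differs between jurisdictions and guidelines.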
The differences in LTOT definitions and study design resulted in an approximately 300-fold variation in LTOT rates across studies and a 13-fold variation in estimates among studies assessing multiple definitions in the same study population.[24, 26, 52, 59, 79, 99, 123, 126, 128] This large variability can impair comparisons across jurisdictions and health conditions, and evidence from these studies probably should not be summarised in a traditional meta-analysis without careful consideration. Even studies using similar LTOT definitions may vary in terms of the study population and the denominator used in the analysis, thus restricting comparisons between studies. We recommend that future studies estimating LTOT from routinely collected data report the information presented in Box 1 to increase the reliability and comparability of findings. Whenever possible, authors should consider conducting a sensitivity analysis to assess the impact of differing LTOT definitions on their estimates.
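A sensitivity analysis of this kind can be sketched as follows, assuming each patient has already been summarised over the same follow-up period. The patient records, the three definitions, and the function names are hypothetical and serve only to show how rates under competing definitions, and the fold variation between them, might be tabulated for a single study population.

```python
def ltot_rates(patients, definitions):
    """Return {definition name: proportion of patients classified as LTOT}."""
    if not patients:
        raise ValueError("patients must not be empty")
    n = len(patients)
    return {
        name: sum(1 for p in patients if rule(p)) / n
        for name, rule in definitions.items()
    }


def fold_variation(rates):
    """Ratio of the highest to the lowest non-zero rate."""
    positive = [r for r in rates.values() if r > 0]
    if not positive:
        raise ValueError("no definition identified any patient")
    return max(positive) / min(positive)


# Hypothetical per-patient summaries over a one-year follow-up.
patients = [
    {"days_supply": 200, "fills": 12},
    {"days_supply": 95, "fills": 4},
    {"days_supply": 60, "fills": 6},
    {"days_supply": 20, "fills": 1},
]

# Hypothetical competing definitions applied to the same denominator.
definitions = {
    "days_supply >= 90": lambda p: p["days_supply"] >= 90,
    "days_supply >= 180": lambda p: p["days_supply"] >= 180,
    "fills >= 6": lambda p: p["fills"] >= 6,
}

rates = ltot_rates(patients, definitions)
for name, rate in rates.items():
    print(f"{name}: {rate:.0%}")
print("fold variation:", fold_variation(rates))
```

Because every definition is applied to the same denominator, differences in the resulting rates reflect the definitions alone rather than differences in study population.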
However, while the proportions and absolute numbers of people identified as receiving LTOT vary widely across definitions, overall trends can be similar when definitions are tested in the same study population.[52, 79, 99, 123] For instance, a study evaluating three measures of LTOT found all of them useful in identifying long-term opioid use, with 0.6%-1.1% of the study population experiencing LTOT at a given point in time, of whom 68%-84% were still using opioids two years later.[52] Similarly, a study defining LTOT as an “Episode of > 90 days supply that began within the first 30 days following opioid index date”[99] compared its primary outcome with two other common definitions: 1) > 90 days per year [135] and 2) ≥ 90 days per year with > 120 days supply dispensed or more than 10 prescriptions filled.[49] The authors identified LTOT rates of 20.4% in 2004 and 18.3% in 2011 using the primary definition; the first alternative definition yielded substantially lower rates (9.4%-8.2%), while the second yielded higher rates (26.0%-24.5%). However, trends over time remained similar, with lower LTOT rates in 2011 than in 2004. Another study, replicating definitions from Deyo et al.[122] and Shah et al.,[106] reported consistent predictors of LTOT despite high variation in the percentage of patients identified.[123] Yet, these results should be interpreted with caution since only a few studies reported data in sufficient detail to enable the comparison of LTOT rates based on different measures. In addition, a systematic review evaluating LTOT in the surgical setting applied 25 definitions to empirical data and reported a 100-fold variation in results, with low levels of agreement between measures.[15]
Undoubtedly, operational definitions should be fit for purpose to achieve study aims, since LTOT measures have different interpretations and applicability for patients, clinicians, researchers, and payers. For example, rates of LTOT measured at the prescription level are useful to inform patterns of LTOT prescribing and use but give no information on the proportion of patients receiving LTOT. Alternatively, studies reporting LTOT as the proportion of individuals with a specific health condition (the most common approach in our systematic review) provide useful information for clinicians aiming to identify patients at higher risk of harms and to inform treatment pathways and guidelines. From the payer perspective, LTOT rates estimated as the proportion of patients prescribed opioids or health enrolees allow comparisons across providers. Estimating rates in the whole population allows comparison across jurisdictions and the evaluation of trends over time and of the impact of policy interventions. Finally, the level of strictness can identify different groups of LTOT users, with stricter definitions able to identify those at higher risk of harm.[15, 52]