Item-response model and item importance
An item-response model comprising 33 graded-response logit sub-models, one per item, was successfully developed. Figure 2 shows that the pattern of the model-estimated severity data (upper right), including the progression over time, the variability among patients, and the visit-to-visit fluctuation, resembles that of the observed SoS (upper left).
The discrimination parameter and four difficulty parameters for each item are shown in Table 2. Score value 4 (severe) was never observed for five items (1, 25, 26, 31 and 32). The probability of a patient scoring this value, and consequently the corresponding difficulty parameter, could not be estimated for these items.
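To make the role of the discrimination and difficulty parameters concrete, the following is a minimal sketch of how one graded-response logit sub-model maps a latent severity value to score probabilities. It assumes the standard cumulative-logit (Samejima) form, which may differ in detail from the parameterization used for Table 2; the function name and the parameter values are hypothetical and chosen for illustration only.

```python
import math


def grm_category_probs(theta, a, b):
    """Score probabilities P(Y = 0..K) for one graded-response item.

    theta: latent severity of the patient
    a:     discrimination parameter of the item
    b:     ordered difficulty parameters (K values for scores 0..K)
    Assumes the cumulative-logit form P(Y >= k) = logistic(a * (theta - b_k)).
    """
    # Cumulative probabilities P(Y >= k) for k = 1..K
    cum = [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b]
    # Pad with P(Y >= 0) = 1 and P(Y >= K + 1) = 0
    bounds = [1.0] + cum + [0.0]
    # Each score probability is the difference of adjacent cumulative curves
    return [bounds[k] - bounds[k + 1] for k in range(len(b) + 1)]


# Illustrative values only (not taken from Table 2)
probs = grm_category_probs(theta=0.0, a=1.3, b=[-1.0, 0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

Under this form, an item for which score 4 is never observed gives the data no information about the fourth difficulty parameter, which is consistent with that parameter being inestimable for the five items listed above.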
The information content varied greatly among the items (Figure 3 and Table 2): eight items each held > 5% of the total information, together accounting for 65%; the lowest discrimination parameter among them was 1.29. All seven items for the left side of the body were among these eight most informative items.
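The link between the discrimination parameter and an item's share of the total information can be illustrated with a short sketch. It assumes that information means the Fisher information of a cumulative-logit graded-response item, weighted by a standard-normal severity distribution over a grid; the source does not state how the percentages in Figure 3 were computed, so this is one plausible reading rather than the authors' method. All names and parameter values are hypothetical.

```python
import math


def grm_item_information(theta, a, b):
    """Fisher information of one graded-response item at severity theta."""
    # Cumulative curves P(Y >= k), padded with 1 (k = 0) and 0 (k = K + 1)
    cum = [1.0] + [1.0 / (1.0 + math.exp(-a * (theta - bk))) for bk in b] + [0.0]
    # Derivative of each cumulative curve with respect to theta
    dcum = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(cum) - 1):
        p_k = cum[k] - cum[k + 1]  # probability of score k
        if p_k > 1e-12:  # skip scores with essentially zero probability
            info += (dcum[k] - dcum[k + 1]) ** 2 / p_k
    return info


def information_shares(items, grid):
    """Fraction of total information per item over a severity grid.

    items: list of (a, b) tuples; grid: severity values.
    Each grid point is weighted by an unnormalized standard-normal density
    (an assumption; the normalizing constant cancels in the shares).
    """
    totals = [
        sum(grm_item_information(t, a, b) * math.exp(-0.5 * t * t) for t in grid)
        for a, b in items
    ]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


# Two hypothetical items sharing the same difficulties but differing in discrimination
grid = [i / 10 for i in range(-40, 41)]
items = [(1.29, [-1.0, 0.0, 1.0, 2.0]), (0.46, [-1.0, 0.0, 1.0, 2.0])]
print(information_shares(items, grid))
```

Because the information grows roughly with the square of the discrimination parameter, an item with a low discrimination value contributes only a small fraction of the total in this sketch.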
Conversely, 11 items each held < 1% of the total information, with the highest discrimination parameter among them being 0.46. Nine of the ten tremor items were among these 11 least informative items. Indeed, four of the five items for which score 4 (severe) was missing were tremor tests (Table 2). Six items (18, 20, 21, 30, 31 and 32) were scored mostly 0 (normal); three of these were tremor tests (Figure 2, lower right). Several tremor items were even estimated to have a negative, near-zero discrimination parameter, with very wide-ranging difficulty parameters. These observations suggested that the parameters were poorly estimated for these items and revealed these items’ inability to differentiate patients with different levels of symptom severity. Based on these findings, the longitudinal modelling and subsequent estimation of clinical trial PoS were conducted both with and without the tremor items.