Table 3

Summary of Key Findings and GRADE Assessment

GRADE Assessment

| Outcomes | | Pooled Effect Size (95% CI) | Certainty of Evidence (GRADE) | Risk of Bias^a | Inconsistency^b | Imprecision^c/Publication Bias^d/Indirectness^e | Large Effect^f/Dose Response^g |
|---|---|---|---|---|---|---|---|
| PIP based on the Beers Criteria | | | | | | | |
| Functional decline | 4,165 | RR 1.38 (1.06-1.80) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 29.5%, P = .234) | No downgrade | No upgrade |
| Hospitalizations | 5,069 | RR 1.14 (1.01-1.29) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 37.0%, P = .204) | No downgrade | No upgrade |
| Mortality | 73,533 | RR 0.98 (0.93-1.05) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 0.0%, P = .689) | No downgrade | No upgrade |
| PIP based on the STOPP criteria | | | | | | | |
| A&E visits | 3,588 | RR 1.63 (1.32-2.00) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 0.0%, P = .452) | No downgrade | No upgrade |
| ADEs | 1,835 | RR 1.34 (1.09-1.66) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 41.3%, P = .192) | No downgrade | No upgrade |
| Functional decline | 2,684 | RR 1.53 (1.08-2.18) | ●●○○○ Low | No downgrade (NOS = 9) | No downgrade (I² = 17.6%, P = .271) | No downgrade | No upgrade |
| HRQoL | 3,588 | SMD −0.26 (−0.36 to −0.16) | ●○○○○ Very low | No downgrade (NOS = 9) | Downgrade (I² = 82.3%, P = .003) | No downgrade | No upgrade |
| Hospitalizations | 2,338 | RR 1.25 (1.09-1.44) | ●●○○○ Low | No downgrade (NOS = 8) | No downgrade (I² = 53.6%, P = .116) | No downgrade | No upgrade |
  • A&E = accident and emergency department; ADE = adverse drug event; GRADE = Grading of Recommendations, Assessment, Development and Evaluations; HRQoL = health-related quality of life; NOS = Newcastle-Ottawa Scale; PIP = potentially inappropriate prescribing; RR = relative risk; SMD = standardized mean difference; STOPP = Screening Tool of Older Persons’ Potentially Inappropriate Prescriptions.

  • a We downgraded the GRADE assessment if the NOS score was <8 in at least one of the studies, suggesting the presence of risk of bias.

  • b We downgraded the GRADE assessment if the Q test P value was <0.10 or I² was >75%, indicating substantial heterogeneity in the results.

  • c For RR, we considered a clinically meaningful threshold to be 0.90 or 1.10 and downgraded the GRADE assessment if the RR point estimate is ≥1 and the lower limit of its CI is <0.90, or if the RR point estimate is <1 and the upper limit of its CI is >1.10. For SMD, we considered a clinically meaningful threshold to be ±0.20 and downgraded the GRADE assessment if the point estimate is ≥0 and the lower limit of its CI is <–0.20, or if the point estimate is <0 and the upper limit of its CI is >0.20.

  • d We could not assess for publication bias because there were <10 studies for each of the outcomes. Therefore, we did not downgrade any of the GRADE assessments due to publication bias.

  • e We downgraded the GRADE assessment if the recruited participants were not representative of older persons in primary care settings.

  • f We upgraded the GRADE assessment if the RR is >2 or <0.5.

  • g We upgraded the GRADE assessment in the presence of a dose-response gradient, which provides stronger evidence of a cause-effect relationship.
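The numeric rules in footnotes a, b, c, and f can be sketched in Python to show how each table cell follows from the pooled statistics. This is a minimal illustration, not the authors' code: the `Outcome` class, its field names, and the helper functions are hypothetical, and the sketch assumes the NOS value shown in the table is the lowest score among the contributing studies. It does not compute the overall certainty rating, and it omits the publication bias, indirectness, and dose-response judgments (footnotes d, e, and g), which are not reducible to a threshold.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Pooled result for one outcome (hypothetical container for illustration)."""
    name: str
    measure: str      # "RR" or "SMD"
    estimate: float   # pooled point estimate
    ci_low: float     # lower limit of the 95% CI
    ci_high: float    # upper limit of the 95% CI
    min_nos: int      # assumed: lowest NOS score among contributing studies
    i2: float         # I² statistic, in percent
    q_p: float        # P value of the Q test


def downgrade_risk_of_bias(o: Outcome) -> bool:
    # Footnote a: downgrade if NOS < 8 in at least one study.
    return o.min_nos < 8


def downgrade_inconsistency(o: Outcome) -> bool:
    # Footnote b: downgrade if Q test P < 0.10 or I² > 75%.
    return o.q_p < 0.10 or o.i2 > 75.0


def downgrade_imprecision(o: Outcome) -> bool:
    # Footnote c: thresholds are 0.90/1.10 for RR and -0.20/+0.20 for SMD.
    if o.measure == "RR":
        null, low_thr, high_thr = 1.0, 0.90, 1.10
    elif o.measure == "SMD":
        null, low_thr, high_thr = 0.0, -0.20, 0.20
    else:
        raise ValueError(f"unsupported measure: {o.measure}")
    if o.estimate >= null:
        return o.ci_low < low_thr
    return o.ci_high > high_thr


def upgrade_large_effect(o: Outcome) -> bool:
    # Footnote f: upgrade if RR > 2 or RR < 0.5 (stated for RR only).
    return o.measure == "RR" and (o.estimate > 2.0 or o.estimate < 0.5)


def assess(o: Outcome) -> dict:
    """Apply the four threshold rules and return one flag per rule."""
    return {
        "risk_of_bias_downgrade": downgrade_risk_of_bias(o),
        "inconsistency_downgrade": downgrade_inconsistency(o),
        "imprecision_downgrade": downgrade_imprecision(o),
        "large_effect_upgrade": upgrade_large_effect(o),
    }


if __name__ == "__main__":
    # Two rows taken from the table above (STOPP criteria).
    rows = [
        Outcome("A&E visits", "RR", 1.63, 1.32, 2.00, 9, 0.0, 0.452),
        Outcome("HRQoL", "SMD", -0.26, -0.36, -0.16, 9, 82.3, 0.003),
    ]
    for row in rows:
        print(row.name, assess(row))
```

Applied to the table rows, the sketch flags only HRQoL for an inconsistency downgrade, which matches the single "Downgrade" entry in the table.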