Peterson BS, Trampush J, Maglione M, et al. ADHD Diagnosis and Treatment in Children and Adolescents [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2024 Mar. (Comparative Effectiveness Review, No. 267.)


2. Methods

2.1. Review Approach

The methods for this evidence review follow the Methods Guide for Evidence-based Practice Center (EPC) Program. Appendixes provide supplementary information: Appendix A contains the methods, Appendix B lists the excluded studies as well as the background studies, Appendix C contains the evidence tables for the included studies, Appendix D has the critical appraisal and applicability tables for each included study, and Appendix E lists the included studies.

The topic of this report was developed by the Patient-Centered Outcomes Research Institute (PCORI) in consultation with the Agency for Healthcare Research and Quality (AHRQ). Key Questions (KQs) were posted on AHRQ’s Effective Health Care (EHC) website for public comment for 3 weeks in August 2021, and PCORI conducted an online townhall meeting to discuss the comments in November 2021 (Appendix F). The protocol was refined following input from the public posting of the KQs, the townhall meeting, Key Informants, and a Technical Expert Panel, whose members provided high-level content and methodological expertise throughout development of the review protocol. The final protocol is posted on the EHC website, and the review is registered in PROSPERO (CRD42022312656). Appendix G includes the PCORI checklist.

2.1.1. Key Questions

The KQs proposed for the systematic review, addressing diagnosis (KQ1), treatment (KQ2), and monitoring (KQ3) of ADHD, were refined following input from Key Informants, input through public posting, and a townhall organized by PCORI.

We obtained input from eight Key Informants: a parent of an underserved, ethnic minority (Hispanic) youth with attention deficit hyperactivity disorder (ADHD); an advocate from the national advocacy group CHADD (Children and Adults with Attention-Deficit/Hyperactivity Disorder); an expert in medical safety; an expert in testing and assessment; a representative of the Association for Child and Adolescent Counseling (ACAC); a family medicine representative; and members of the guideline group who will use the review to update the guidelines. The Key Informants expressed strong support for the importance and relevance of the KQs, suggested relevant references, and provided important input on terminology relevant to the literature searches. Discussions covered developments since the last report and where the field now stands from the perspective of each participant.

Additional input on the project was received through public posting of the review questions on the AHRQ website. The posting aimed to ensure that the review addressed the right questions and that all aspects had been considered. A submission from the American Psychological Association (APA) and a submission from a researcher at Immaculata University addressed all review questions. For KQ1, input stressed the importance of minimizing false positive diagnoses arising from co-occurring conditions; the costs and reliability of electroencephalogram (EEG) diagnostic information; the need to adopt a developmental lens (e.g., does a child’s relative age and developmental maturity compared with classmates influence the odds of receiving a diagnosis of ADHD?); the roles of sleep, trauma, and language development; and the importance of annual reassessments of behaviors and impairment. For KQ2, input addressed the importance of reviewing the effects of medications and the risk of diversion of pharmacological treatment; treatment fidelity; adherence to and persistence with medication use; behavioral treatment, including different modalities (in person, video, online); and the Multimodal Treatment of ADHD study specifically. For KQ3, the input stressed that routine assessments, including reports from parents, teachers, and the children/adolescents themselves, should be accessible to all parties, and that routine monitoring should be part of the child/adolescent’s record.68

Finally, at the online townhall meeting hosted by PCORI in November 2021, there were passionate discussions and advocacy for changes in ADHD policy and research. Some participants felt strongly that both important policies and data were lacking across the board. Specific concerns identified by this group included the lumping of the ADHD-Inattentive presentation with the Combined presentation, the lack of empirical data on executive function training and executive function coaches, the general lack of specific and feasible non-pharmacological interventions that parents can easily access and use, and the lack of parent training programs offered before initiating stimulant medication.

Following Key Informant and public input, the KQs are as follows:

KQ1.

For the diagnosis of ADHD:

  1a. What is the comparative diagnostic accuracy of approaches that can be used in the primary care practice setting or by specialists to diagnose ADHD among individuals younger than 7 years of age?
  1b. What is the comparative diagnostic accuracy of EEG, imaging, or approaches assessing executive function that can be used in the primary care practice setting or by specialists to diagnose ADHD among individuals aged 7 through 17?
  1c. For both populations, how does the comparative diagnostic accuracy of these approaches vary by clinical setting, including primary care or specialty clinic, or patient subgroup, including age, sex, or other risk factors associated with ADHD?
  1d. What are the adverse effects associated with being labeled correctly or incorrectly as having ADHD?

KQ2.

What are the comparative safety and effectiveness of pharmacologic and/or nonpharmacologic treatments of ADHD in improving outcomes associated with ADHD?

  2a. How do these outcomes vary by presentation (inattentive, hyperactive/impulsive, and combined) or other co-occurring conditions?
  2b. What is the risk of diversion of pharmacologic treatment?

KQ3.

What are the comparative safety and effectiveness of different empirical monitoring strategies to evaluate the effectiveness of treatment in improving ADHD symptoms or other long-term outcomes?

While the diagnosis and treatment KQs are unchanged from the 2018 AHRQ EPC report on the topic, the KQ regarding monitoring ADHD over time was rephrased for clarity. The restricted age range for sub-question 1b reflects the recognition that most of these specialized technologies require the child to remain very still, which is difficult for children younger than 7 years. Neuropsychological tests as well as genetic markers are included in 1a and 1b. In question 1d, we assess whether the literature suggests that these adverse effects differ for youth who are on the threshold between clinical and subclinical diagnoses. Co-morbidities may include co-occurring conditions such as conduct disorder, mood disorders, autism spectrum disorders, Williams syndrome, Down syndrome, learning and language disabilities, and developmental coordination disorder. Questions 2 and 3 address effectiveness as well as adverse outcomes.

2.1.2. Analytic Framework

The analytic framework (Figure 1) depicts the KQs and outcomes to evaluate the diagnosis, treatment, and monitoring strategies for ADHD.

Figure 1. Analytic framework. Notes: ADHD = attention deficit hyperactivity disorder; KQ = Key Question.

2.2. Study Selection

The eligibility criteria are organized in a PICOTSO (population, intervention, comparator, outcome, timing, setting, study design, and other limiters) framework. The report includes studies published from 1980 to June 2023.

2.2.1. Search Strategy

For primary research studies, we searched the databases PubMed® (biomedical literature), Embase® (pharmacology emphasis), PsycINFO (psychological research), and ERIC (education research). We also searched the U.S. trial registry ClinicalTrials.gov to capture relevant data regardless of publication status. Trial registries increasingly include results data and complete records of adverse events, making them an important evidence review tool for identifying all relevant data and reducing publication bias.

We used existing reviews for reference-mining; these were identified through the same databases used for primary research, plus searches of the Cochrane Database of Systematic Reviews, Campbell Collaboration, What Works in Education, and PROSPERO. Scoping searches identified several published reviews, which often address medication treatment with an increased focus on safety.80-84 Given that many practice guidelines are now based on systematic reviews, we also searched the ECRI Guidelines Trust, G-I-N, and ClinicalKey. Using external systematic reviews in addition to building on prior AHRQ reports increases the certainty that all relevant studies have been captured.

The literature searches for this project built on prior ADHD reports published by AHRQ. KQ1 searches covered 1980 to 2011 and 2016 to the present; because research published between 2011 and 2016 was thoroughly screened for the 2018 review, we used the studies listed in the 2018 AHRQ report to cover 2011 to 2016. KQ2 searches covered 1980 to 2011 and 2016 to date, omitting search terms covered in the 2011 AHRQ report and adding the adolescent population, which was not previously fully covered; we used the studies identified in the AHRQ report and reference-mining of pertinent reviews to identify relevant studies. KQ3 searches were not limited by date. We simplified the search strategies and removed filters for specific interventions in key databases to ensure that no existing test or intervention evaluation would be missed. Searches were designed, executed, and documented by the evidence review center librarian, and the search strategy underwent peer review to ensure high-quality searches. The search strategies for the databases are shown in the methods appendix (Appendix A). Furthermore, we used information provided by content experts,85 and the Technical Expert Panel reviewed the list of included studies to ensure that all relevant literature had been captured.

We used detailed pre-established criteria to determine eligibility for inclusion and exclusion of publications in accordance with the AHRQ Methods Guide for Effectiveness and Comparative Effectiveness Reviews. To reduce reviewer errors and bias, all citations were reviewed by a human reviewer and screened by a machine learning algorithm. Citations deemed potentially relevant were obtained as full text. Each full-text article was reviewed for eligibility by two literature reviewers, including any articles suggested by peer reviewers, arising from the public posting process, submitted through the SEADS (Supplemental Evidence And Data for Systematic reviews) portal, or received in response to the Federal Register notice. Any disagreements were resolved by consensus. We maintained a record of studies excluded at the full-text level, with reasons for exclusion (see Appendix B).

The SEADS portal was open from July 1 through August 15, 2022. We received two submissions, including one from the American Academy of Child and Adolescent Psychiatry. The submissions commented on the need for an evidence review of ADHD research and on the usefulness of the review as outlined in the posted protocol; in total, four published studies were submitted for consideration in the systematic review.

While the draft report was under peer review and open for public comment, we updated the search and included any eligible studies identified either during that search or through peer or public review suggestions in the final report.

2.2.2. Eligibility Criteria

The detailed inclusion and exclusion criteria are listed in Table 1.

Table 1. Eligibility criteria.

Compared with the prior 2018 report on ADHD, the eligibility criteria were simplified and now include all tests used to diagnose ADHD and all treatments for ADHD. In addition, randomized controlled trials (RCTs) are no longer limited by sample size, given that RCTs allow strong evidence statements; however, treatment studies with fewer than 100 participants had to report a power calculation indicating sufficient power for at least one patient outcome, to ensure that the studies were designed to detect a difference between the intervention and comparison groups. Because the intervention, comparator, and reported outcome combinations are often unique to a study, not all studies can be combined in meta-analyses to aggregate data; hence we required individual studies to show sufficient power to detect effects. We specified that intervention studies had to have a treatment duration of at least four weeks; we excluded experiments of shorter duration (e.g., proof-of-concept studies) to keep the focus on treatment for ADHD. Finally, monitoring studies no longer require a comparator and are not restricted by publication date, given the small evidence base (the 2018 report found no relevant study).
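
As an illustration only, the following minimal sketch shows the kind of a priori power calculation such smaller trials would need to report, assuming a two-arm comparison of a continuous outcome; the effect size, alpha, and power values are assumptions for the example, not values from any included study.

```python
# Illustrative power calculation for a two-arm trial with a continuous
# outcome; all inputs (effect size, alpha, power) are assumed values.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
# Sample size per arm needed to detect a standardized effect of 0.6
# at alpha = 0.05 with 80 percent power (two-sided test).
n_per_arm = analysis.solve_power(effect_size=0.6, alpha=0.05, power=0.80,
                                 ratio=1.0, alternative='two-sided')
print(f"Required participants per arm: {n_per_arm:.1f}")  # ~45 per arm
```

Under these assumed inputs, a trial enrolling roughly 90 participants would be adequately powered, which is consistent with the threshold rationale above.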

Relevant systematic reviews and meta-analyses were retained as background or for reference-mining but were not included as evidence. Publications reporting on the same participants were consolidated into one study record. Studies published exclusively in non-English languages remain excluded, given the high volume of literature, the review’s focus on U.S. populations, the scope of the KQs, and the aim of supporting a U.S. clinical practice guideline.

2.3. Data Extraction

We abstracted detailed information regarding study characteristics, participants, methods, and results. The review team created data abstraction forms for the KQs in DistillerSR, an online program for systematic reviews. Forms included extensive guidance to support reviewers, to aid reproducibility, and to standardize data collection. One literature reviewer abstracted the data, and a second reviewer checked for accuracy and completeness. Further data checks were conducted while synthesizing results across studies. Disagreements were resolved by consensus.

We designed the data abstraction forms to collect the data required to evaluate the study, as well as demographic and other data needed for determining outcomes, informed by existing research.86-89 We paid particular attention to describing the details of the treatment (e.g., pharmacotherapy dosing, methods of behavioral interventions), patient characteristics (e.g., ADHD presentation, co-occurring disorders, age), and study design (e.g., RCT versus observational), which may influence the reported outcome results. We also carefully described comparators, as treatment standards may have changed over the period covered by the review, and abstracted the data necessary for assessing quality and applicability, as described in the EPC Methods Guide. Forms were pilot-tested with a sample of included articles to ensure that all relevant data elements were captured and that ambiguity was avoided.

The abstracted information was used for analyses and to populate the evidence tables in Appendix C, which show the characteristics of each included study. Final abstracted data will be uploaded to the Systematic Review Data Repository (SRDR) per EPC requirements and will be publicly available.

2.4. Risk of Bias Assessment

The critical appraisal for individual studies applied criteria consistent with QUADAS-2 for diagnostic studies and the RoB 2 guidance for common sources of bias in intervention studies adapted for the eligible study designs.90, 91

QUADAS-2 evaluates four domains: patient selection, index test characteristics, reference standard quality, and flow and timing:91

  • Patient selection: This domain addresses whether the selection of patients could have introduced bias, taking into account whether the study enrolled a consecutive or random sample, whether a retrospective case-control design was avoided, and whether the study avoided inappropriate or problematic exclusions from the patient pool.
  • Index test: This domain evaluates whether the conduct or interpretation of the index test could have introduced bias, taking into account whether the results of the test were interpreted without knowledge of the results of the reference standard and whether any thresholds or cut-offs were pre-specified (rather than determined during the study to maximize diagnostic performance).
  • Reference standard: This domain evaluates whether the reference standard, its conduct, or its interpretation may have introduced bias, taking into account the quality of the reference standard in correctly classifying the condition and whether the reference standard results were interpreted without knowledge of the results of the index test.
  • Flow and timing: The last domain evaluates whether the conduct of the study may have introduced bias, taking into account whether the interval between the index test and the reference standard was appropriate, whether all patients received the reference standard and whether they received the same reference standard, and whether all patients were included in the analysis.

For each domain, we assessed the potential risk of bias in order to identify high risk of bias and low risk of bias studies. For each study and appraisal domain, we also evaluated whether there are concerns regarding the applicability of the study results to the review question (Appendix D). This encompassed whether the patients included in the studies match the review question; whether the test, its conduct, or its interpretation differs from the review question; and whether the target condition as defined by the reference standard fully matches the review question.
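
To make the domain structure concrete, the following is a minimal sketch (not the review’s actual software) of how per-study QUADAS-2 judgments might be recorded; the study name, ratings, and the overall-risk convention are hypothetical.

```python
# Hypothetical record of QUADAS-2 domain judgments for one diagnostic study.
from dataclasses import dataclass

@dataclass
class Quadas2Assessment:
    study: str
    patient_selection: str   # "low", "high", or "unclear" risk of bias
    index_test: str          # blinded interpretation, pre-specified cut-offs
    reference_standard: str  # quality and blinded interpretation
    flow_and_timing: str     # interval, same reference standard, complete analysis

    def overall_risk(self) -> str:
        """One common convention: any high-risk domain makes the study high risk."""
        domains = (self.patient_selection, self.index_test,
                   self.reference_standard, self.flow_and_timing)
        if "high" in domains:
            return "high"
        return "low" if all(d == "low" for d in domains) else "unclear"

example = Quadas2Assessment("Hypothetical study A", "low", "unclear", "low", "low")
print(example.overall_risk())  # -> "unclear"
```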

For treatment and monitoring studies, we assessed six domains: selection, performance, attrition, detection, reporting, and study-specific sources of bias:

  • Selection bias: For selection bias, we assessed the randomization sequence and allocation concealment in RCTs, as well as baseline differences and potential confounders in all studies.
  • Performance bias: Performance bias evaluated whether patient or caregiver knowledge of the intervention allocation, or circumstances such as the trial context, may have affected the outcome, and whether any deviations from intended interventions were balanced between groups.
  • Attrition bias: Attrition bias considered the number of dropouts, any imbalances across study arms, and whether missing values may have affected the reported outcomes.
  • Detection bias: Detection bias assessed whether outcome assessors were aware of the intervention allocation, whether this knowledge could have influenced the outcome measurement, and whether outcome ascertainment could differ between arms.
  • Reporting bias: Reporting bias assessment included an evaluation of whether a pre-specified analysis plan exists (e.g., a published protocol), whether the numerical results are likely to have been selected on the basis of the findings, and whether key outcomes were unreported (e.g., an obvious effectiveness indicator is missing) or inadequately reported (e.g., anecdotal adverse event reporting).
  • Study-specific sources of bias: In addition to the types of bias listed above, we assessed other potential sources of bias, such as inadequate reporting of intervention details.

Each study was initially appraised by its data abstractor. In a second step, we reviewed risk of bias results across studies to ensure consistency of ratings. Risk of bias results informed the study limitations domain of the strength of evidence assessment across studies. Appendix D contains the critical appraisal and applicability tables.

2.5. Data Synthesis and Analysis

We summarized key features of the included studies, including study design; participant characteristics; diagnostic, treatment, and monitoring strategies; and frequent outcomes in a narrative overview. We answered each KQ with the available evidence using quantitative syntheses across studies where possible to increase statistical power, to increase precision, and to objectively summarize results across all available evidence. We ordered our findings by diagnostic, treatment, and monitoring strategy, i.e., the KQs.

We broadly characterized tests (KQ1), interventions (KQ2), and monitoring strategies (KQ3). For diagnostic studies, we reported the range of diagnostic performance estimates. For KQ2, we differentiated effectiveness and comparative effectiveness results (i.e., comparisons against a passive comparator in the form of a control group, or against an active comparator in the form of an alternative intervention). We documented results by the pre-specified key outcomes and consistently abstracted the longest follow-up for each study. We converted reported standard errors and confidence intervals to standard deviations to compute effect sizes, and reversed the direction of originally reported outcomes where necessary to facilitate comparisons across studies.
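
As a sketch of these conversions, using the standard formulas (SD from the standard error of a mean, and from a 95 percent confidence interval under a normal approximation); the numeric inputs are illustrative.

```python
import math

def se_to_sd(se: float, n: int) -> float:
    """Standard deviation recovered from the standard error of a mean."""
    return se * math.sqrt(n)

def ci95_to_sd(lower: float, upper: float, n: int) -> float:
    """SD from a 95% confidence interval around a mean (normal approximation)."""
    se = (upper - lower) / (2 * 1.96)
    return se_to_sd(se, n)

def reverse_direction(effect: float) -> float:
    """Flip the sign so that effects point the same way across scales."""
    return -effect

print(ci95_to_sd(lower=1.2, upper=4.8, n=50))  # ~6.5
```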

For statistical pooling, we used random-effects models, corrected for small numbers of studies where necessary, to synthesize the available evidence quantitatively.92 We computed standardized mean differences (SMD) for continuous outcomes and relative risks (RR) for categorical outcomes to document results across studies, and we present 95 percent confidence intervals for all summary estimates. Where more than one study could be combined in an analysis, we showed the results in a forest plot. The forest plots document the results for each study reporting on the outcome, including the size and direction of the effect, the confidence interval surrounding the point estimate, the proximity to the point of no effect (RR = 1, SMD = 0), and the results in relation to other studies. Forest plots visually document the consistency of effects across studies, and they can show outliers clearly.
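
The cited reference describes the specific small-sample correction used; as a generic illustration only, the sketch below pools hypothetical SMDs with a DerSimonian-Laird random-effects model and a Knapp-Hartung-style adjustment, one example of the kind of correction mentioned above.

```python
import numpy as np
from scipy import stats

def pool_random_effects(y, v):
    """y: per-study SMDs; v: their variances.
    Returns the pooled estimate, a 95% CI, and tau^2."""
    y, v = np.asarray(y, float), np.asarray(v, float)
    k = len(y)
    w = 1.0 / v                                  # fixed-effect weights
    y_fe = np.sum(w * y) / np.sum(w)
    Q = np.sum(w * (y - y_fe) ** 2)              # Cochran's Q
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (k - 1)) / c)           # DL between-study variance
    w_re = 1.0 / (v + tau2)                      # random-effects weights
    y_re = np.sum(w_re * y) / np.sum(w_re)
    # Knapp-Hartung-style adjustment: t-based CI with a rescaled variance
    q = np.sum(w_re * (y - y_re) ** 2) / (k - 1)
    se = np.sqrt(q / np.sum(w_re))
    t_crit = stats.t.ppf(0.975, k - 1)
    return y_re, (y_re - t_crit * se, y_re + t_crit * se), tau2

smd = [-0.62, -0.41, -0.80, -0.35]   # hypothetical per-study SMDs
var = [0.040, 0.055, 0.070, 0.030]   # hypothetical sampling variances
est, ci, tau2 = pool_random_effects(smd, var)
print(f"Pooled SMD {est:.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}, tau^2 = {tau2:.3f}")
```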

We determined whether the pooled effect showed a statistically significant difference between the intervention and comparison groups and documented the identified systematic effects. We also documented results that were not statistically significant; in these cases, we stated that we did not detect a systematic effect – while we cannot rule out that the intervention may work for some children, across participants and studies the effect was indistinguishable from chance. For all interventions and outcomes reporting both a continuous and a categorical effect estimate, we reviewed both estimates for each key outcome.

We assessed heterogeneity using graphical displays and the I² statistic, which ranges from zero to 100 percent; we noted in particular results where heterogeneity exceeded 70 percent. We anticipated that intervention effects may be heterogeneous across studies and explored potential sources of heterogeneity, while recognizing that the ability of statistical methods to detect individual sources may be limited in the presence of multiple sources of heterogeneity.93 We hypothesized that the methodological rigor of individual studies and patients’ underlying clinical presentations are potentially associated with the intervention effects; we performed meta-regression analyses to examine these hypotheses and reported sensitivity analyses where necessary. We assessed the potential for publication bias for all key outcomes using the Begg and Egger tests.94, 95 The trim and fill method provided alternative estimates where evidence of publication bias was detected.96
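
A sketch of these checks: I² computed from Cochran’s Q, and an Egger-style regression of standardized effects on precision. The per-study values are the same hypothetical numbers as in the pooling sketch, and scipy >= 1.7 is assumed for the intercept standard error.

```python
import numpy as np
from scipy import stats

smd = np.array([-0.62, -0.41, -0.80, -0.35])  # hypothetical SMDs
var = np.array([0.040, 0.055, 0.070, 0.030])  # hypothetical variances

# I^2 from Cochran's Q: the share of total variability beyond chance.
w = 1.0 / var
y_fe = np.sum(w * smd) / np.sum(w)
Q = np.sum(w * (smd - y_fe) ** 2)
df = len(smd) - 1
i2 = max(0.0, 100.0 * (Q - df) / Q) if Q > 0 else 0.0
print(f"I^2 = {i2:.0f}%")  # values above 70% were flagged

# Egger-style test: regress standardized effects on precision; an intercept
# far from zero suggests small-study effects / possible publication bias.
se = np.sqrt(var)
res = stats.linregress(1.0 / se, smd / se)
t_stat = res.intercept / res.intercept_stderr
p_val = 2 * stats.t.sf(abs(t_stat), df=len(smd) - 2)
print(f"Egger intercept = {res.intercept:.2f}, p = {p_val:.2f}")
```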

Pre-defined subgroups for KQ1 included children younger than 7 years of age and children and adolescents aged 7 through 17. We assessed whether diagnostic performance is associated with the age of participants using reported sensitivity and specificity estimates in a regression analysis across studies. In addition, in meta-regressions across studies and KQs, we assessed the effect of treatment and diagnosis in participants with concomitant morbidities; the racial and ethnic composition of study samples; and the potential effect of the diagnostic, treatment, and monitoring setting. We differentiated primary care, specialty care, school settings, other settings (e.g., participants were part of a larger research study), mixed settings (e.g., participants recruited through primary care and schools), and settings not reported.
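
As an illustration of a cross-study regression of this kind, the sketch below regresses reported sensitivities (on the logit scale) on each study’s mean participant age, weighting by study size as a rough proxy for precision; all data points are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

sens = np.array([0.78, 0.85, 0.72, 0.90, 0.81])   # reported sensitivities
n = np.array([60, 120, 45, 200, 90])              # study sample sizes
mean_age = np.array([6.1, 9.5, 7.2, 12.8, 10.4])  # mean participant ages

# Weighted least squares on the logit scale.
logit_sens = np.log(sens / (1 - sens))
X = sm.add_constant(mean_age)
fit = sm.WLS(logit_sens, X, weights=n).fit()
print(fit.params)    # intercept and per-year age coefficient (logit scale)
print(fit.pvalues)
```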

For KQ3, we documented outcomes as reported by the original authors.

2.5.1. Applicability Assessment

Applicability was assessed in accordance with the AHRQ Methods Guide. Factors that may affect applicability, identified a priori, relate to patients, interventions, comparisons, outcomes, and settings. For each study, we assessed the population to identify studies with narrow eligibility criteria, studies that excluded participants with comorbidities, studies with more complex participants than typically seen in the community, and studies with run-in periods in which adherence was tested and participants were excluded for non-adherence. Regarding interventions, we assessed whether studies described tests or treatments not used as recommended or as commonly used in practice, medication dosing not reflective of current practice, co-interventions likely to modify effectiveness, and highly trained testers or treatment teams. Regarding comparisons, we flagged diagnostic studies that used tools differently than recommended, treatment studies that used an inadequate intervention or substandard care as the comparator, and studies where the comparator was unclear. Regarding outcomes, we assessed whether studies used outcome assessors not qualified for the assessment, surrogate or composite outcomes with limited applicability, or follow-up periods too short for effects to manifest. Regarding the setting, we assessed whether studies were conducted in a setting with a level of care different from that in the community. Literature reviewers could also flag additional applicability concerns.

We used this information to assess the situations in which the evidence is most relevant and to evaluate applicability to real-world clinical practice in typical U.S. settings, summarizing applicability assessments qualitatively. The information is reflected in the discussion of the review findings.

2.6. Grading the Body of Evidence

The strength of evidence assessment documents uncertainty, outlines the reasons for insufficient evidence where appropriate, and communicates our confidence in the findings.

The strength of evidence for each body of evidence (defined by the KQ, diagnostic or treatment approach, comparator, and outcome) was initially assessed for each primary clinical outcome by one researcher experienced in determining strength of evidence, following the principles for adapting GRADE (Grading of Recommendations Assessment, Development and Evaluation) outlined in the AHRQ Methods Guide.97 The initial assessment was then discussed in the team.

2.6.1. Key Outcomes

We prioritized outcomes with the help of the Technical Expert Panel, in combination with team expertise. The panelists reviewed a large number of possible outcomes, and we selected those most clinically relevant and most important to patients and clinicians for guiding clinical practice. The following outcomes were selected for the strength of evidence assessment:

  • Key Question 1:
    • Sensitivity
    • Specificity
    • Costs
    • Rater agreement
    • Internal consistency
    • Test-retest reliability
    • Misdiagnosis impact
  • Key Question 2:
    • Behavior changes
    • Broadband scale scores
    • Standardized symptom scores
    • Functional impairment
    • Acceptability of treatment
    • Academic rating scale scores
    • Appetite changes and growth suppression
    • Number of participants with adverse events
  • Key Question 3:
    • Functional impairment
    • Broadband scale scores
    • Standardized symptom scores
    • Progress toward patient-identified goals
    • Acceptability of treatment
    • Academic rating scale scores
    • Any long-term effects
    • Growth suppression
    • Quality of peer relationships

For diagnostic studies in KQ1, we abstracted the numbers of true positives and true negatives in order to compute diagnostic performance measures, and we also abstracted all values as reported by the authors. We added information on the specific cut-off and model used to achieve the reported diagnostic performance, where available. The impact of misdiagnosis included the risk of missed conditions that can present as ADHD, as well as of being incorrectly labeled as having or not having ADHD.
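
Together with the corresponding false positive and false negative counts, these cells of the 2x2 table determine the standard performance measures; a minimal sketch with hypothetical counts:

```python
# Hypothetical 2x2 counts for one study: tp/fn among children with ADHD per
# the reference standard, tn/fp among those without.
def diagnostic_performance(tp: int, fp: int, fn: int, tn: int) -> dict:
    return {
        "sensitivity": tp / (tp + fn),  # correctly identified ADHD cases
        "specificity": tn / (tn + fp),  # correctly identified non-cases
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

print(diagnostic_performance(tp=42, fp=8, fn=6, tn=44))
# {'sensitivity': 0.875, 'specificity': 0.846..., 'ppv': 0.84, 'npv': 0.88}
```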

For treatment studies in KQ2, we abstracted numerical values for all key outcomes to facilitate meta-analysis. We also abstracted a brief narrative for the evidence table for each outcome, focusing on the comparison to a control or comparator group (rather than pre-post data). In addition, we summarized study-specific health outcomes and reported adverse events to complete the evidence table for all included studies. For the behavior change domain, we abstracted individual behaviors such as aggression or conduct problems, either from direct observations or behavior ratings, where studies reported these in addition to global impression or symptom scales. As broadband scale scores, we used global psychological, mental health, and child development assessments that go beyond individual ADHD symptoms, such as the CGI (Clinical Global Impression)98 and total scores of the Conners rating scales. For standardized symptom scores, we included summary measures for ADHD symptoms, such as the ADHD-RS-IV (ADHD Rating Scale Version IV),99, 100 or, when unavailable, subscales of individual ADHD symptoms, such as inattention. For functional impairment, we abstracted functional measures such as the Weiss Functional Impairment Rating Scale.101, 102 For acceptability of treatment, we abstracted child, parent, or teacher satisfaction with the intervention, depending on what was reported. We abstracted academic rating scale scores where reported; in the absence of these, we used broad academic performance measures such as GPA (grade point average). Other, narrower performance measures, such as specific cognitive skills, were summarized in the free-text field of the evidence table. For appetite changes and growth suppression, we abstracted indicators such as decreased appetite or growth during the study period. The number of participants with adverse events was restricted to the total number of patients reporting at least one adverse event in each study arm; other adverse event measures (such as the total number of adverse events or the number of serious adverse events) were summarized in the free adverse event text field of the evidence table.

For monitoring studies eligible for KQ3, we abstracted all information provided by the authors on the suitability of the applied monitoring strategy, in addition to all pre-specified outcomes.

The synthesis documented the presence and the absence of evidence for the key outcomes for all included diagnostic tests, treatment interventions, and monitoring strategies in the respective sections.

2.6.2. Strength of Evidence Assessments

In determining the quality of the body of evidence, the following domains were evaluated:

  • Study limitations: The extent to which studies reporting on a particular outcome are likely to be protected from bias. The aggregate risk of bias across individual studies reporting an outcome is considered; graded as low, medium, or high level of study limitations.
  • Inconsistency: The extent to which studies report the same direction and/or magnitude of effect or show statistical heterogeneity for a particular outcome; graded as consistent, inconsistent, or unknown (in the case of a single study or the absence of studies).
  • Indirectness: Describes whether the intervention (test, treatment, or strategy) and the comparator were directly compared (i.e., in head-to-head trials) or indirectly (e.g., through meta-regressions across studies). In addition, indirectness can reflect whether the outcome is directly or indirectly related to health outcomes of interest. The domain is graded as direct or indirect.
  • Imprecision: Describes the level of certainty of the estimate of effect for a particular outcome, where a precise estimate is one that allows a clinically useful conclusion. When quantitative synthesis is not possible, sample size and assessment of variance within individual studies are considered. Graded as precise or imprecise.
  • Reporting bias: Occurs when publication or reporting of findings is based on their direction or magnitude of effect. Publication bias, selective outcome reporting, and selective analysis reporting are types of reporting bias. Reporting bias is difficult to assess as systematic identification of unpublished evidence is challenging. When possible, we reviewed Begg and Egger test results and used trim and fill methods to assess the robustness of effect estimates.

Bodies of evidence consisting of RCTs were initially considered high strength, while bodies of comparative observational studies began as low-strength evidence. The strength of the evidence could be downgraded based on the domains described above. There are also situations where evidence may be upgraded (e.g., a large magnitude of effect, the presence of a dose-response relationship, or plausible unmeasured confounders that could potentially increase the magnitude of effect), as described in the AHRQ Methods Guide.97 A final strength of evidence grade for each evidence statement was assigned by evaluating and weighing the combined results of the above domains. We assigned an overall grade of high, moderate, low, or insufficient according to the four-level scale outlined in Table 2.

Table 2. Definitions of the grades of overall strength of evidence.

Summary tables include reasons for downgrading or upgrading the strength of evidence.
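
As a schematic of the grading flow described above only (the actual judgments are qualitative and weighed holistically), the following sketch assumes a simple one-level-per-domain counting heuristic that is not prescribed by the Methods Guide.

```python
# Schematic sketch: start at "high" for RCT bodies and "low" for observational
# bodies, step down per concerning domain, step up for special situations.
# The one-level-per-domain heuristic is an assumption for illustration.
GRADES = ["insufficient", "low", "moderate", "high"]

def grade_body(is_rct: bool, downgrades: int, upgrades: int = 0) -> str:
    start = GRADES.index("high") if is_rct else GRADES.index("low")
    level = max(0, min(len(GRADES) - 1, start - downgrades + upgrades))
    return GRADES[level]

# RCT evidence with serious study limitations and imprecision -> "low"
print(grade_body(is_rct=True, downgrades=2))
# Observational evidence upgraded for a large magnitude of effect -> "moderate"
print(grade_body(is_rct=False, downgrades=0, upgrades=1))
```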

2.7. Peer Review and Public Commentary

The draft report underwent peer review and was posted for public comment for 45 days; the report was then updated in response to the comments received. The disposition of comments document will be posted about three months after the final report is posted.
