Small sample sizes in rapid-cycle quality improvement projects.

Small sample sizes can be statistically valid.

The model for improvement and its Plan–Do–Study–Act (PDSA) cycles typically require frequent data collection to test ideas and refine the planned change strategy. The perception that data collection must involve many patients can lead to insufficiently frequent PDSA cycles and act as a barrier to initiating local improvement activities.

Small samples for demonstrating local gaps in care

Table 1 shows the sample size requirements for local quality audits. It can be used in two ways:

  • First, on completing an audit, the table can quickly indicate whether your result is statistically significant. For example, if your audit showed an observed system performance of 50% when the desired system performance is 80%, then the result is statistically significant with a sample size of 12 or more.
  • Second, you can use the table to plan a sample size for an audit or PDSA cycle. For example, if your "hunch" is that the observed system performance will be 50%, and you have a desired system performance of 90%, then a sample size as low as 6 will likely suffice. There is no harm in planning a few additional observations to ensure that your sample represents your system's usual performance (external validity).
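The first use of the table, checking a completed audit, can be approximated with an exact binomial calculation. The sketch below is an illustration under an assumption: it uses a one-sided exact binomial test, which may not be the method used to build table 1, so its cut-offs need not match the table exactly. The function name `lower_tail_p` is hypothetical.

```python
from math import comb


def lower_tail_p(k: int, n: int, target: float) -> float:
    """One-sided exact binomial P value: P(X <= k) for X ~ Binomial(n, target).

    k      -- number of successes observed in the audit
    n      -- audit sample size
    target -- desired system performance, as a proportion (assumed null value)
    """
    return sum(comb(n, i) * target**i * (1 - target) ** (n - i) for i in range(k + 1))


# Audit of 12 charts with 6 successes (50% observed); desired performance 80%.
p_value = lower_tail_p(6, 12, 0.80)
print(f"P = {p_value:.4f}")
```

Under this assumed test the P value is roughly 0.02, below the usual 0.05 threshold, which agrees with the table's entry of 12 for 50% observed versus 80% desired.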

How can small samples be statistically valid?

How is it possible that such small samples permit rejecting the null hypothesis, while properly designed controlled clinical trials need to enrol hundreds or thousands of patients?

  • One reason is that we are looking at very large differences (eg, 50% vs 80%), whereas clinical trials typically look for much smaller differences. As shown in table 1, as the observed performance comes closer to the desired target, larger sample sizes are required to show significant differences. For example, you would need an audit sample size of 280 to show that 75% observed performance differed significantly from a desired performance of 80%.
  • A second reason for the surprisingly small sample sizes shown in table 1 is that clinical researchers want a precise estimate of treatment effect, whereas in local audits the precision of the estimate of system performance is less important. In an example audit, 10/20 (50%) of charts had successful medication reconciliation. Statisticians use 95% CIs to describe the precision of study results; our audit has a 95% CI that extends from a low of 28% to a high of 72%. But this result suffices to conclude that our local system performance falls short of 80%. We are less concerned about whether the actual performance is 28% or 72%, because both are unacceptable.
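The 28% to 72% interval quoted for the 10/20 audit can be reproduced with the simple normal-approximation (Wald) interval. The source does not say which method was used, so treating it as a Wald interval is an assumption; the function name `wald_interval` is hypothetical.

```python
from math import sqrt


def wald_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation (Wald) CI for a proportion, clipped to [0, 1].

    z = 1.96 corresponds to a 95% interval.
    """
    p_hat = successes / n
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Example audit: 10 of 20 charts had successful medication reconciliation.
low, high = wald_interval(10, 20)
print(f"95% CI: {low:.0%} to {high:.0%}")  # prints "95% CI: 28% to 72%"
```

The upper limit stays below the 80% target, which is all the audit needs to establish.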

 

Table 1. Minimum sample sizes required for improvement projects, based on observed and desired system performance.

Observed system      Desired system performance
performance (%)      80%        90%
95                   26         140
90                   70         n/a
85                   260        180
80                   n/a        50
75                   280        28
70                   80         20
66                   45         15
60                   25         10
50                   12         6
40                   10         5
20                   5          5

Handle small samples with care

You must have an extremely high level of confidence in the data integrity of your small sample. With small samples, a 'few specific patients' can amount to a large proportion of the sample: one patient represents a substantial contribution to a sample of eight patients. So the 'catch' to using small samples is the need to follow very clear steps for collecting the data. Apply five steps, illustrated here for a small sample medication reconciliation audit:

1. Define the eligible sample
We identified consecutive patients admitted to our inpatient medical service at General Hospital.
2. Establish exclusion criteria
We excluded patients who were admitted for <12h.
3. State the study period
The audit occurred from Saturday 7 November 2015 at 08:00h to Sunday 8 November 2015 at 16:00h.
4. Keep a reject log
We identified 23 consecutive admitted patients during the audit period. We excluded two patients who were discharged within 12h, leaving 21 patients for the audit.
5. Make data collection complete
We completed data collection for 20 of the 21 patients. One chart could not be located.
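The bookkeeping behind the reject log and the completeness check can be kept in a few lines of code. This is a minimal sketch using the counts from the medication reconciliation example; the `AuditLog` class and its field names are hypothetical and not part of any audit tool.

```python
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Tracks how an audit sample shrinks from identified to analysed."""

    identified: int                                           # consecutive patients found
    exclusions: dict[str, int] = field(default_factory=dict)  # reason -> count (reject log)
    missing: dict[str, int] = field(default_factory=dict)     # reason -> count (incomplete data)

    @property
    def eligible(self) -> int:
        return self.identified - sum(self.exclusions.values())

    @property
    def analysed(self) -> int:
        return self.eligible - sum(self.missing.values())


log = AuditLog(
    identified=23,
    exclusions={"discharged within 12h": 2},
    missing={"chart could not be located": 1},
)
print(f"identified={log.identified}, eligible={log.eligible}, analysed={log.analysed}")
```

Recording a reason for every patient who drops out makes it easy to confirm that the numbers reconcile: 23 identified, 21 eligible, 20 analysed.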

Using small samples in a PDSA cycle

Suppose that the medication reconciliation audit shows a need for local improvement, and the first change concept consists of a new medication reconciliation form that must be completed by the ordering provider. For your first PDSA cycle, you plan to obtain feedback from users about the form's usability. Your main study measure is whether the clinicians can complete the form without your help. How many clinicians should you study in this cycle?

You can use table 1 to plan your first PDSA. At this early stage you will likely be recruiting friendly, highly motivated clinicians (a 'convenience sample') to try out your form. You should aim for at least a 90% success rate for completing the form without any difficulty. You do not want to implement a form that requires training and personalised support for highly motivated users. Therefore, you will use the third column of table 1, with a desired system performance of 90%. Next, you need a hunch about how good you can really expect your form to be in this first go-around. You should be humble, because at early stages nothing works out as intended. Let's estimate that 60% of clinicians will be able to complete the form without personalised help or difficulty. Therefore, a sample size of 10 should be sufficient. In other words, if, as you suspect, only 60% of your convenience sample will complete the form without help, you will only need 10 observations to show that you are not yet at your target of 90% success.

For this first (convenience) sample of 10 volunteer users, 5/10 (50%) completed the form without any input or instructions. The other five became frustrated and gave up. Table 1 tells you that, with an observed success rate of 50% and a desired target of 90%, any audit with a sample of six or more allows you to confidently reject the null hypothesis that your form is working at a 90% success rate. In other words, your form needs work!

The quantitative element of the first PDSA cycle is already finished. You should obtain qualitative feedback from your 10 participants (especially the five motivated users who could not complete the form) and make the necessary changes. Then you can start a second PDSA cycle next week.

Example Practice

For the example above of designing a form for medication reconciliation, use the online calculator (reference #3) to calculate an exact P value for the probability that you would observe a performance of only 50% (5/10) if the required performance were 90%. Also calculate the 95% CI for your result.

  1. Choose "Probabilities > Binomial Probabilities"
    Enter n = 10, k = 5, p = 0.9
    Hit the Calculate button
    Answer: Method 1. exact binomial calculation →
    P = 0.0016349374 (0.002)
    Interpretation: P is far below the usual threshold of 0.05, so the difference is statistically significant and we reject the null hypothesis that the form performs at the desired 90% level. In other words, the project did not achieve the desired result, and the shortfall was statistically significant.
  2. Choose "Proportions > The Confidence Interval of a Proportion"
    Enter k = 5, n = 10
    Hit the Calculate button
    Answer: "95% confidence interval: including continuity correction" →
    Lower limit = 0.2014
    Upper limit = 0.7986
    Answer: CI: 20%–80% (0.2014–0.7986).
    Interpretation: Although the interval is very wide (CI: 20%–80%), this matters little in a quality improvement project because both the lower and upper limits are unacceptable. The CI does not reach, or include, the required level of 90%.
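Both calculator results can be checked offline. The sketch below assumes that the P value is the cumulative probability of 5 or fewer successes, and that the "including continuity correction" interval is the Wilson score interval with continuity correction; both assumptions reproduce the figures quoted above, but the calculator's internals are not documented here. The function names are hypothetical.

```python
from math import comb, sqrt


def binomial_lower_tail(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def wilson_cc_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval with continuity correction, clipped to [0, 1]."""
    p_hat = k / n
    q_hat = 1 - p_hat
    denom = 2 * (n + z**2)
    lower = (2 * n * p_hat + z**2 - 1
             - z * sqrt(z**2 - 2 - 1 / n + 4 * p_hat * (n * q_hat + 1))) / denom
    upper = (2 * n * p_hat + z**2 + 1
             + z * sqrt(z**2 + 2 - 1 / n + 4 * p_hat * (n * q_hat - 1))) / denom
    return max(0.0, lower), min(1.0, upper)


# 5 of 10 clinicians completed the form; required performance is 90%.
p_value = binomial_lower_tail(5, 10, 0.9)
low, high = wilson_cc_interval(5, 10)
print(f"P = {p_value:.10f}")               # prints "P = 0.0016349374"
print(f"95% CI: {low:.4f} to {high:.4f}")  # prints "95% CI: 0.2014 to 0.7986"
```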

References

  1. Etchells E, Ho M, Shojania KG. Value of small sample sizes in rapid-cycle quality improvement projects. BMJ Qual Saf 2016; 25(3): 202-206.
  2. Etchells E, Woodcock T. Value of small sample sizes in rapid-cycle quality improvement projects 2: assessing fidelity of implementation for improvement interventions. BMJ Qual Saf 2018; 27(1): 61-65.
  3. Lowry R. VassarStats: Website for Statistical Computation. vassarstats.net
  4. Perla RJ, Provost LP, Murray SK. Sampling considerations for health care improvement. Qual Manag Health Care 2014; 23(4): 268-279.
  5. Perla RJ, Provost LP. Judgment sampling: a health care improvement perspective. Qual Manag Health Care 2012; 21(3): 170-176.




