Run charts Anhøj rules|Tao's Tips|Medical Quality Improvement

Definitions

What is a run chart?

A run chart is a line graph that plots data points in the order in which they were collected, with a horizontal reference line drawn at the median. Run charts are used in health care quality improvement to show how systematically delivered practices change when administrators and providers modify health care goals and practices. They help discriminate between random and non-random variation in data over time. Important uses of the run chart include:

Displaying data to make process performance visible
Determining whether a change resulted in improvement
Determining whether the improvement is being sustained

What is random variation?

Random variation is present in all natural processes. In a random process we cannot know the exact value of the next outcome, but from studying previous data we may predict the probability of future outcomes. So, a random process is predictable.

Non-random variation appears when something new, sometimes unexpected, starts to influence the process. This may be the result of intended changes made to improve the process or unintended process deterioration. The ability to tell random from non-random is crucial in quality improvement. One way of achieving this is runs analysis.

What is a run?

A run is a sequence of one or more consecutive data points that all lie on the same side of the median line. Runs analysis is based on knowledge of the natural distributions and limits of run lengths and the number of runs in random processes.

What is a crossing?

A crossing occurs when the line graph crosses the median line. If the process is random, the chance of crossing or not crossing the median line between two adjacent data points is fifty-fifty.

What is oscillation?

Oscillation occurs when the data fluctuate up and down, above and below the midline.

Unequal denominators?

When using rate or percent data on a run chart it is important that each data point has a roughly equal denominator (± 25% of the average denominator size). Unequal denominators may cause spurious non-random signals; in that case use a Shewhart control chart instead. ^{[1] p93-94}
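
A quick check of the ± 25% guideline might look like the following Python sketch. The function name and the example denominators are hypothetical, and the check assumes the guideline means that every denominator lies within 25% of the mean denominator.

```python
from statistics import mean


def denominators_roughly_equal(denominators, tolerance=0.25):
    """Return True if every denominator is within +/- tolerance of the average."""
    avg = mean(denominators)
    lower, upper = avg * (1 - tolerance), avg * (1 + tolerance)
    return all(lower <= d <= upper for d in denominators)


# Hypothetical monthly denominators (e.g. number of patients), for illustration only.
steady = [100, 110, 95, 105, 90]
uneven = [100, 110, 95, 105, 40]

print(denominators_roughly_equal(steady))  # True
print(denominators_roughly_equal(uneven))  # False
```

If the check fails, the text above suggests a Shewhart control chart rather than a run chart.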

Run Chart Example

Figure 1. A run chart showing red midline (median = 10) and two runs, indicated by dotted red lines A and B. Big yellow dots indicate data points exactly on the midline.

Data = [12, 10, 16, 7, 20, 16, 17, 16.5, 17, 18, 10, 10.5, 11, 10, 12, 9, 6, 8, 8.5, 4, 7, 8, 4, 5]
Number of data points = 24
Median(Data[]) = 10.

The basic line chart links data points consecutively from left (earliest) to right (most recent) with a dark blue line. Data points are shown in light blue (before the annotations described below are added).
The median is shown as a straight orange line at the level of y=10.
There are three data points exactly equal to the median (#2, #11, #14). They are indicated by big yellow dots in the diagram. Points on the midline do not affect counting when deciding if the line chart crosses the midline.
Two runs are highlighted, circled in red.
- Run A is above the midline, and contains nine useful (not on the midline) data points on the same side (above) of the midline;
- Run B is below the midline, and contains nine data points on the same side (below) of the midline.
There are 3 crossings, where the line chart crosses the midline: {#3(16)→#4(7)} {#4(7)→#5(20)} {#15(12)→#16(9)}
Note that #2, #11 and #14 on the midline do not affect the judgment of whether the midline was crossed!
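
The counts for Figure 1 can be reproduced with a short script. The following is a minimal Python sketch, not part of the original page; the function name runs_analysis is hypothetical. It skips points that lie exactly on the median, then counts the useful observations, the longest run and the number of crossings.

```python
from statistics import median


def runs_analysis(data):
    """Return (n_useful, longest_run, n_crossings) relative to the median."""
    centre = median(data)
    # Points exactly on the median are not useful and are skipped.
    sides = [x > centre for x in data if x != centre]
    if not sides:
        return 0, 0, 0
    longest = current = 1
    crossings = 0
    for previous, this in zip(sides, sides[1:]):
        if this == previous:
            current += 1      # same side: the current run grows
        else:
            crossings += 1    # side changed: the line crossed the median
            current = 1
        longest = max(longest, current)
    return len(sides), longest, crossings


data = [12, 10, 16, 7, 20, 16, 17, 16.5, 17, 18, 10, 10.5,
        11, 10, 12, 9, 6, 8, 8.5, 4, 7, 8, 4, 5]

n_useful, longest_run, n_crossings = runs_analysis(data)
print(n_useful, longest_run, n_crossings)  # 21 9 3
```

For the Figure 1 data this gives 21 useful observations, a longest run of 9 and 3 crossings.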

Run Chart Anhøj Rules

The prediction limits for longest run and number of crossings both depend on the number of useful observations (data points not on the median). Both rules signal departures from a random pattern, and each is based on a less than 5% chance (p < 0.05) of a false signal when the process is in fact random. ^[6]

Shift rule: A shift is present if any run of consecutive data points on the same side of the median is longer than its upper 95% prediction limit, calculated as log2(n) + 3 rounded to the nearest integer, where n is the number of useful observations.
In Figure 1, 21 of the 24 data points are useful (three lie on the median). Using Excel to calculate log2() and round to the nearest integer, the upper limit is:
ROUND(LOG(21, 2) + 3, 0) = 7.
A run of more than 7 would indicate a shift. Both of the runs (A = 9 useful data points; B = 9 data points) in Figure 1 exceed the upper 95% prediction limit (7 points).
Crossings rule: If the process is random, the chance of crossing or not crossing the median line between two adjacent data points is fifty-fifty. Thus, the total number of crossings has a binomial distribution, b(n−1, 0.5), where n is the number of useful observations and 0.5 is the success probability. ^[6] The number of runs equals the number of crossings plus 1. For example, Figure 1 has 3 crossings and hence 4 runs (runs A and B plus two short runs at the start of the series).
A crossings signal is present if the number of times the graph crosses the median is smaller than its lower 5% prediction limit.
For example, for a run chart with 21 useful observations, take the lower fifth percentile of the cumulative binomial distribution of 20 trials with a success probability of 0.5 as the critical value for the lower limit of crossings. This is easily calculated using the Excel function
BINOM.INV(n-1, 0.5, 0.05) = BINOM.INV(20, 0.5, 0.05) = 6
i.e. fewer than 6 crossings would be unusual and suggest that the process is shifting.
In Figure 1, non-random variation in the form of a shift is identified by the fact that the chart has only 3 crossings when at least 6 would be expected from 21 random useful observations.
Oscillations: An unusually high number of crossings (oscillation) is also a sign of non-random variation, which will appear if data are negatively autocorrelated. However, oscillation is not an effect of the process shifting location; it is most likely a result of a poorly designed measure or sampling issues.
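
The two prediction limits can also be computed outside Excel. The sketch below is an illustration in Python with hypothetical function names (longest_run_limit, crossings_limit). It assumes n is the number of useful observations, as stated at the start of this section, and implements the binomial percentile directly so that it mirrors BINOM.INV. It is evaluated for n = 21 (the useful observations in Figure 1) and n = 24 (all data points).

```python
from math import comb, log2


def longest_run_limit(n):
    """Upper prediction limit for the longest run: round(log2(n) + 3)."""
    return round(log2(n) + 3)


def crossings_limit(n, p=0.05):
    """Lower prediction limit for the number of crossings.

    Smallest k with P(X <= k) >= p for X ~ Binomial(n - 1, 0.5),
    i.e. the same quantity as Excel's BINOM.INV(n - 1, 0.5, p).
    """
    trials = n - 1
    cumulative = 0.0
    for k in range(trials + 1):
        cumulative += comb(trials, k) / 2 ** trials
        if cumulative >= p:
            return k
    return trials


for n in (21, 24):
    print(n, longest_run_limit(n), crossings_limit(n))
# 21 7 6
# 24 8 8

# Figure 1: longest run = 9, crossings = 3, useful observations = 21.
shift_signal = 9 > longest_run_limit(21)
crossings_signal = 3 < crossings_limit(21)
print(shift_signal, crossings_signal)  # True True
```

Both rules signal for the Figure 1 data whichever of the two values of n is used.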

The two rules are closely related: when runs get longer, crossings become fewer, and vice versa. While they often signal together, either of them alone is diagnostic of non-random variation. The Anhøj rules adapt dynamically to the number of available data points and can be applied to charts with as few as 10 data points, with no upper limit, without losing sensitivity and specificity. ^[6]

Run Charts or Control Charts?

Several sets of run chart rules with different diagnostic properties are currently available [2,4,5]. In general, the Anhøj rules are more conservative (less sensitive, more specific) than the other rules. They have better diagnostic properties: they reliably tell random from non-random variation and balance the risk of false positive and false negative signals.

It is a common misunderstanding that run charts are inferior to control charts. Runs analysis is more sensitive to minor (≅ 1SD) persistent shifts in data than are the control charts that only react to larger shifts (≥ 2SD) in data.

As the first step in monitoring a quality improvement project, use visual inspection of the run chart. Collect at least 12, preferably 20–30 data points. Test for non-random variation using the Anhøj rules with the median as reference. If the Anhøj rules find non-random variation, seek to identify its cause(s).
- If the process is moving in the undesired direction, eliminate the cause.
- Otherwise seek to stabilise the process at the desired level.
The cumulative sum (CUSUM) statistic used with the run chart can yield information about changes in the process. The analysis of a CUSUM run chart is purely visual. Neither the median nor the probability-based rules are used. ^{[1] p101-105}
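
As an illustration only, the CUSUM series can be computed as the running total of deviations from a target value. In this Python sketch the function name is hypothetical, and using the mean of the data as the default target is an assumption; a project would normally choose its own target. The data are those of Figure 1.

```python
from itertools import accumulate
from statistics import mean


def cusum(data, target=None):
    """Cumulative sum of deviations from a target (assumed default: the data mean)."""
    if target is None:
        target = mean(data)
    return list(accumulate(x - target for x in data))


data = [12, 10, 16, 7, 20, 16, 17, 16.5, 17, 18, 10, 10.5,
        11, 10, 12, 9, 6, 8, 8.5, 4, 7, 8, 4, 5]

series = cusum(data)
peak_position = series.index(max(series)) + 1  # 1-based data point number
print(peak_position)  # 10
```

Plotted against time, the series rises while values are above the target and falls while they are below it; for the Figure 1 data it peaks at data point #10 and declines afterwards.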
Monitoring a quality improvement project should consider using a family of measures. A set of run charts is displayed on the same page to form a monitoring dashboard for the project. The charts may show the same indicator at different locations, or different measures (process, outcome, balancing) of the same project … ^{[1] p73-75}
When the process has been stabilised at a satisfactory level, a control chart using the mean as centre line together with 3-sigma limits is useful to quickly identify sudden larger shifts in data. It also establishes the natural process limits to be expected in the future.
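
For individual measurements, one common choice is the individuals (XmR) chart. The Python sketch below assumes that chart type, which the text above does not specify, and estimates sigma as the average moving range divided by 1.128. The function name and the data are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(data):
    """Return (lower limit, centre line, upper limit) for an individuals chart."""
    centre = mean(data)
    moving_ranges = [abs(b - a) for a, b in zip(data, data[1:])]
    # Sigma is estimated as mean moving range / 1.128 (d2 for subgroups of 2).
    sigma = mean(moving_ranges) / 1.128
    return centre - 3 * sigma, centre, centre + 3 * sigma


# Hypothetical measurements from a stabilised process, for illustration only.
stable = [8, 9, 7, 8, 10, 9, 8, 7, 9, 8]

lcl, centre, ucl = individuals_chart_limits(stable)
print(f"LCL={lcl:.2f}  centre={centre:.2f}  UCL={ucl:.2f}")

# Points outside the limits would flag a sudden larger shift.
outside = [x for x in stable if x < lcl or x > ucl]
print(outside)  # []
```

Other data types (counts, proportions) call for other Shewhart charts, so the choice of chart is part of the assumption here.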

References

1. Provost LP, Murray SK. The health care data guide: learning from data for improvement. San Francisco: John Wiley & Sons; 2011.
2. Perla RJ, Provost LP, Murray SK. The run chart: a simple analytical tool for learning from variation in healthcare processes. BMJ Qual Saf 2011; 20: 46–51.
3. Hart MK, Hart RF. Statistical process control for health care. 2000.
   Translated and published in Taiwan as: 「健康照護的統計流程管制」 (Statistical Process Control for Health Care), reviewed by 鐘國彪, translated by 陳宗泰. 金名圖書有限公司.
4. Carey RG. How do you know that your care is improving? Part 1: Basic concepts in statistical thinking. J Ambulatory Care Manage 2002; 25(1): 80–7.
5. Anhøj J, Olesen AV. Run charts revisited: a simulation study of run chart rules for detection of non-random variation in health care processes. PLoS ONE 2014; 9(11): e113825. https://doi.org/10.1371/journal.pone.0113825
6. Anhøj J. A run chart is not a run chart is not a run chart. Understanding variation using runs analysis. 2014. https://nhsrcommunity.com/blog/
7. Anhøj J, Wentzel-Larsen T. Sense and sensibility: on the diagnostic value of control chart rules for detection of shifts in time series data. BMC Medical Research Methodology 2018-10-03.
8. Swed FS, Eisenhart C. Tables for testing randomness of grouping in a sequence of alternatives. The Annals of Mathematical Statistics 1943; 14: 66–87.

