Understanding Test Accuracy Ratio and Test Uncertainty Ratio for Practical Application of Total Uncertainty

Paul Daniel, Vaisala
Senior GxP Regulatory Compliance Expert
In this week's blog, we answer a question on uncertainty in probes used in stability chambers. I'll offer two good references to help with calculating uncertainty and some practical advice on uncertainty in RH instruments.

Dear Paul,
I recently attended your seminar and I appreciated all the great information! I was wondering whether you might be able to provide me with more specific advice regarding the concerns we have been having with humidity sensors monitoring our stability cabinets. The issue revolves around the uncertainty of measurement on the calibration of the probes we use to profile and monitor the cabinets. As you will no doubt be aware, the ICH guidelines state that the conditions must be maintained within ±5%RH of the set point.

If we take the example of a cabinet at 25°C/60%RH this provides us with an allowed range of 55-65%RH. Our procedures require us to map the unit, place the monitoring probe at the mid-range location and set its alarm limits to take into account the variance from the high and low extreme points. For example, if we assume a scenario where we map a cabinet and the probe reading the highest gives a mean value of 61.0%RH (after adjusting for its offset), while the probe reading the lowest gives a value of 59.0%RH (after adjusting for its offset) we would place the monitoring probe in a location that gave a mean value of 60%RH and initially determine the OOS alarm limits for the monitoring probe at 56.0 and 64.0%RH.
Our probes are calibrated on site with a calculated uncertainty of ±2.9 (±1LSD) %RH. Since this means that the probes used to perform the mapping could have been reading up to 2.9% higher or lower than the "true" values at their locations, we would narrow the alarm limits to account for this, leaving us with alarm limits of 58.9 and 61.1%RH.

This would then be adjusted for any offset/error on the monitoring probe based on its most recent calibration. For this exercise I am going to assume there is no offset. If we then take into account the uncertainty of the monitoring probe ( ±2.9%RH) we are left with alarm limits of:
58.9 + 2.9 = 61.8%RH (low limit)
61.1 - 2.9 = 58.2%RH (high limit)
Clearly these are not feasible limits to set and I can find no way to make this work without ignoring the uncertainties on the probe calibrations. Even if we could get the probes calibrated with uncertainties below ±2%RH we would still end up with very narrow alarm limits. Any help you could provide would be greatly appreciated!

Many thanks,


Dear I,

Thanks for the compliments on our seminar! We really appreciate you taking the time to attend and learn with us.  You've done a great job of describing the problem.  And it looks like there is no way out of it, doesn't it?  But, there are a few ways to handle this. 
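To make the dead end concrete, here is a minimal Python sketch of the worst-case stacking described in the question. The function name and parameters are my own labels for illustration; the numbers are the ones from the letter.

```python
def stacked_alarm_limits(setpoint, tolerance, map_low, map_high,
                         mapping_uncertainty, monitor_uncertainty):
    """Worst-case (linear) stacking of every allowance, as in the question."""
    # Start from the ICH window and pull each limit in by the mapped spread.
    low = (setpoint - tolerance) + (setpoint - map_low)
    high = (setpoint + tolerance) - (map_high - setpoint)
    # Pull in again for the calibration uncertainty of the mapping probes.
    low += mapping_uncertainty
    high -= mapping_uncertainty
    # And once more for the calibration uncertainty of the monitoring probe.
    low += monitor_uncertainty
    high -= monitor_uncertainty
    return low, high


# Values from the letter: 25 C / 60 %RH cabinet, mapped 59.0 to 61.0 %RH,
# both uncertainties quoted as 2.9 %RH.
low, high = stacked_alarm_limits(60.0, 5.0, 59.0, 61.0, 2.9, 2.9)
print(f"low alarm limit:  {low:.1f} %RH")
print(f"high alarm limit: {high:.1f} %RH")
print("usable window" if low < high else "limits have crossed: no usable window")
```

Every allowance is subtracted in full, so the low limit ends up above the high limit.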

The easy solution is to just ignore it. While I am not recommending the practice, I will admit (without naming names) that there are plenty of folks who do just ignore this.  Maybe a better way to say it is that they do nothing about it because they realize there is little that they can do.  They are already using a world-class instrument, right?  You can't really be expected to do better than that.  Sure, you could get closer to ±2%RH with a well-executed calibration, but the same essential logic problem still remains.  What is the FDA/MHRA/EMA going to do when they audit you?  Will they tell you that you need a better RH sensor when you are already using the industry standard?  Again, this is not my recommendation, but it is a common solution.

I know I'm being a little cheeky, but to be realistic, it helps to consider what the ICH intended when they released those guidelines.  Let's look at the dependence of RH on temperature to explore this further.  For many common setpoints, if you have a full ±2°C variance in your chamber and the absolute humidity is constant across the chamber, you will AUTOMATICALLY have an RH variance of greater than ±5%RH.  This certainly holds true for standard stability testing points defined by the ICH guidelines, such as 75%RH at 40°C or 60%RH at 30°C.
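If you want to check the temperature effect yourself, here is a rough Python sketch. It is my own illustration, not a Vaisala calculation. It assumes the widely used Magnus approximation for saturation vapor pressure over water (coefficients 6.112 hPa, 17.62 and 243.12°C), and it treats "constant absolute humidity" as constant water vapor pressure, which is a close approximation over a span of a few degrees. The function names are made up for this example.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Magnus approximation over water (assumed coefficients, result in hPa)."""
    return 6.112 * math.exp(17.62 * temp_c / (243.12 + temp_c))


def rh_at_shifted_temperature(setpoint_c, setpoint_rh, actual_c):
    """RH at actual_c if the vapor pressure stays at its setpoint value."""
    vapor_pressure = setpoint_rh / 100.0 * saturation_vapor_pressure_hpa(setpoint_c)
    return 100.0 * vapor_pressure / saturation_vapor_pressure_hpa(actual_c)


# The two ICH conditions mentioned above, each with a +/-2 C excursion.
for setpoint_c, setpoint_rh in [(40.0, 75.0), (30.0, 60.0)]:
    warm = rh_at_shifted_temperature(setpoint_c, setpoint_rh, setpoint_c + 2.0)
    cool = rh_at_shifted_temperature(setpoint_c, setpoint_rh, setpoint_c - 2.0)
    print(f"{setpoint_c:.0f} C / {setpoint_rh:.0f} %RH: "
          f"{warm:.1f} %RH at +2 C, {cool:.1f} %RH at -2 C")
```

Under these assumptions, both setpoints swing by more than 5%RH in each direction, from temperature alone.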

What does this tell us?  One interpretation would be that ICH didn't understand the relationship between temperature and humidity when they wrote the guidelines.  However, I think that is unlikely.  A more reasonable interpretation would be that the ICH guidelines are exactly that – guidelines.  They are meant to provide us with a goal, so that we use the ICH set-points as a target, and use the best reasonable technology available to achieve that target.  I do not think the ICH intentionally provided us with standards that could not be met.

This brings us to the more difficult solution…  It is more difficult because it involves a LOT of complex math. So here, I'm calling on my admittedly limited knowledge of both the math and the history of Test Accuracy Ratio (TAR). I refer you to a white paper (below) that details the strategy summarized here in simple math.

For this mathematical approach, you are basically combining guard-banding and statistical analysis.  Here is the easy way to apply the formulas: we take your ICH limit (±5%RH), square it, subtract the square of your measurement uncertainty, then take the square root to get the actual alarm limits.


Here goes:

  • ICH Alarm Limits Squared: 5 x 5 = 25
  • Measurement Uncertainty Squared: 2.9 x 2.9 = 8.41
  • 25 - 8.41 = 16.59
  • √ 16.59  = 4.07 (we will round this to 4 even)

This tells us that with the current ±2.9%RH uncertainty of your RH measurements, you can achieve your ICH requirements of ±5%RH if you set your alarm limits at ±4%RH.  (Note that if you had a measurement uncertainty of ±2%RH, you would get alarm limits of about ±4.5%RH; the square root of 21 is 4.58.)  Basically, this will give you a 2% False-Accept Risk, so your alarms will keep you within your control limits 98% of the time. Not bad.
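The same arithmetic in a few lines of Python, in case you want to try other uncertainties. This is only a sketch of the simplified root-difference-of-squares calculation shown above; the function name is mine, and it does not compute the False-Accept Risk itself.

```python
import math


def guard_banded_limit(tolerance, uncertainty):
    """Alarm limit = sqrt(tolerance^2 - uncertainty^2), both as +/- %RH."""
    if uncertainty >= tolerance:
        raise ValueError("uncertainty must be smaller than the tolerance")
    return math.sqrt(tolerance ** 2 - uncertainty ** 2)


# The case from the letter, plus the improved-calibration case.
for uncertainty in (2.9, 2.0):
    limit = guard_banded_limit(5.0, uncertainty)
    print(f"uncertainty +/-{uncertainty} %RH -> alarm limit +/-{limit:.2f} %RH")
```

Rounding the result down rather than to the nearest value keeps the alarm limit on the conservative side.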

Seems like magic, eh?  Read the paper "A Guard-Band Strategy for Managing False-Accept Risk" from Keysight Technologies for a more complete understanding.  The math backing this up is a little challenging, and you should understand it before you put it before an auditor.

Uncertainty Formula

The basic idea here is that simply ADDING up our uncertainties is not the correct math, because it is unlikely that, at any given time, both the instrument and the process will be operating at the extremes of their uncertainty.  This turns it into a question of probability (which is what uncertainty really is), and that puts it in the realm of statistics.
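A quick simulation shows why straight addition overstates things. This is a hypothetical sketch: it assumes two independent, normally distributed error sources, and the 1.45%RH standard deviation is a made-up value for illustration, not a figure from the letter.

```python
import math
import random


def simulate_combined_sigma(sigma_a, sigma_b, samples=100_000, seed=1):
    """Standard deviation of the sum of two independent normal errors."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    totals = [rng.gauss(0.0, sigma_a) + rng.gauss(0.0, sigma_b)
              for _ in range(samples)]
    mean = sum(totals) / samples
    variance = sum((t - mean) ** 2 for t in totals) / samples
    return math.sqrt(variance)


sigma = 1.45  # hypothetical standard deviation of each error source, in %RH
simulated = simulate_combined_sigma(sigma, sigma)
root_sum_square = math.sqrt(sigma ** 2 + sigma ** 2)
linear_sum = sigma + sigma

print(f"simulated spread: {simulated:.2f} %RH")
print(f"root-sum-square:  {root_sum_square:.2f} %RH")
print(f"linear sum:       {linear_sum:.2f} %RH")
```

The simulated spread should track the root-sum-square value, not the linear sum, because the two errors rarely peak in the same direction at the same time.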

Here is the official guide for calculating uncertainty: "Evaluation of Measurement Data - Guide to the Expression of Uncertainty in Measurement" (GUM).

The original rule for calibrations was that we should have a 10:1 test accuracy ratio (TAR), where the reference used for a calibration comparison had to be 10 times more "accurate" than the item being calibrated.  Eventually, the TAR was reduced to 4:1 and, to add a bit more confusion to the topic, the concept of TUR, or Test Uncertainty Ratio, was introduced.  I believe that 10:1 TAR was originally from US Military Specification MIL-STD-45662, but I'm not 100% certain so don't quote me... ☺

In the case of your stability chamber, we aren't calibrating it, so this doesn't exactly apply.  And it would be VERY hard to apply it.  You would need a ±1.25%RH sensor for your alarm monitoring, and that is not remotely practical or even possible. The best laboratory references for RH are barely ±0.5%RH, which means the best we can do in a field instrument (with the 4:1 TUR) is likely ±2%RH.

I bring up the TUR here for two reasons:

  1. In this comparison, we have a TUR of 1.72 (±5%RH tolerance divided by ±2.9%RH uncertainty).  The statistical analysis described above is only valid for a TUR above 1.6, so we are okay.
  2. The mathematical approach I proposed provides almost the same False-Accept limit as a 4:1 TUR, but it is entirely dependent on a high probability (95%) that your RH monitor probe is within tolerance.  If you have evidence that your monitor probe is highly likely to be out of tolerance, your False-Accept limit goes up to about 5% (equivalent to a TUR of 1.7).  That is still not bad!
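The ratios quoted in this section are simple divisions, sketched here in Python so you can plug in your own numbers. The function names are mine, and TUR is taken here as tolerance divided by measurement uncertainty, which is how the 1.72 figure is obtained.

```python
def tur(tolerance, uncertainty):
    """Test Uncertainty Ratio: tolerance divided by measurement uncertainty."""
    return tolerance / uncertainty


def required_uncertainty(tolerance, ratio=4.0):
    """Uncertainty needed to reach a given ratio against the tolerance."""
    return tolerance / ratio


# +/-5 %RH ICH tolerance against the +/-2.9 %RH probes from the letter.
print(f"TUR with 2.9 %RH probes: {tur(5.0, 2.9):.2f}")
# Sensor uncertainty a strict 4:1 ratio would demand for alarm monitoring.
print(f"uncertainty needed for 4:1: +/-{required_uncertainty(5.0):.2f} %RH")
# Best field instrument if a +/-0.5 %RH lab reference is used at 4:1.
print(f"field instrument from a 0.5 %RH reference: +/-{0.5 * 4.0:.1f} %RH")
```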

Other caveats:

  1. This statistical guard-band approach is pretty well accepted in North America.  I do not know how they feel about it in the UK, or Europe.
  2. Qualifier: I am no mathematical genius, so my summary here is based on the advice of our calibration experts who are WAY better at math.  So, there may be an error in there somewhere.  This just means that you should do the math yourself, and have a good understanding of the guard-band concept and process before you are in the position of needing to explain it to an auditor.

Thanks for the opportunity to answer your question.  And if I failed to do so, email us and we'll give it another go!

Best regards,
Paul Daniel

