Thermal Validation Detective Work: The case of the mysterious hot spot

Paul Daniel, Vaisala
Senior GxP Regulatory Compliance Expert
Life Science

Thanks to our blog readers, we get a lot of great questions from professionals in life science environments all over the world! This week we share some back-and-forth email troubleshooting from Paul Daniel and a Validation colleague. 

 

Hi Paul,

We have an issue I’m hoping you can help with. We recently mapped a warehouse and our study indicated good control at all points except one – in the graph you can see it’s at position 4, level 3. The warehouse is a uniform rectangle, surrounded by temperature-controlled warehouse space to the east, south, and west, and outside to the north. Position 4 is in the SW corner, and level 3 is ~20 ft. from the ground. It looks like there is an additional heat source affecting that location, followed by some event at around 11 AM Monday that brings it into control, but the warehouse owners have been unable to suggest a cause. Could you suggest any other avenues to investigate? Your input is most welcome!
Best regards, S

Dear S,

Let’s start with the graph you sent. Superficially, the graph looks nice and repeatable for the first 3.5 days. Temperatures start to rise every day at 7 AM, peak at 3 PM, cool down quickly, and then level out through the night. All the points mapped are nicely stratified through the whole duration, and every point in the warehouse is changing by about the same amount as every other point. That is, except for position 4, level 3, which, as you say, is all over the map (so to speak!).

Then something happens at 11:06 AM on 4/01/13. Point 4-3 dives. At 11:36 AM, every other point in the whole warehouse dives as well. The slope of this decline is steep. Clearly, some event occurs, and it affects every point in the warehouse.

You said that the warehouse owners have been unable to suggest a cause, but this was more than an event that brought your hot-spot into control.  This event was global and affected every point in the warehouse.  And rather than bringing a point into control, I think it exposed a larger issue with either the HVAC controls or the function of your mapping tools.

I have a couple questions about your graph…

1) Why does your initial hot spot (at position 4, level 3) show so much variability when almost every other point (according to the graph) seems to be fairly stable?  I would analyze this perhaps by determining the standard deviation by day.  That spot was so variable before the event that it requires an explanation.  If it was a cold spot with this much variability, I would expect that the sensor was placed in the path of an HVAC vent.  As a hot spot, it looks like it is near an open window that is letting in hot air, or something that could provide the variability and the daily downward trend.  It is hard to see from the graph what happens to this point after the event.

2) Why does the “new” cold spot (at position 28, level 3) become so variable after the event? Even without analyzing the raw data, you can see on the graph that 28-3 was stable before, and is unstable after. And after the event, the instability is periodic, becoming stable again from 1 AM to 7 AM on the first two days following the event.
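
As a rough illustration of the per-day standard deviation check suggested in question 1, here is a minimal Python sketch. It assumes each logger's export can be read into a list of (timestamp, temperature) pairs. The function name `daily_std` and the sample readings are made up for illustration and are not taken from the study.

```python
from collections import defaultdict
from datetime import datetime
from statistics import stdev


def daily_std(readings):
    """Group one sensor's (timestamp, temperature) pairs by calendar day and
    return {date: sample standard deviation of that day's temperatures}."""
    by_day = defaultdict(list)
    for timestamp, temperature in readings:
        by_day[timestamp.date()].append(temperature)
    # stdev() needs at least two values, so days with a single reading are skipped.
    return {day: stdev(temps) for day, temps in sorted(by_day.items()) if len(temps) > 1}


# Hypothetical readings (not the study data): a variable day, then a flat day.
sample = [
    (datetime(2013, 3, 30, 7, 0), 70.0),
    (datetime(2013, 3, 30, 11, 0), 72.0),
    (datetime(2013, 3, 30, 15, 0), 74.0),
    (datetime(2013, 3, 31, 7, 0), 71.0),
    (datetime(2013, 3, 31, 15, 0), 71.0),
]
for day, spread in daily_std(sample).items():
    print(day, round(spread, 2))
```

Running the same function over every sensor and comparing the daily values side by side should make a point like 4-3 stand out numerically, not just visually.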

 

I can only think of a few things that might begin to explain a graph like this…

 
  1. A change in set-point, or something directly affecting the operation of the HVAC system.
  2. A change in doors or windows that changes how the HVAC is cooling the space.
  3. A change in the sensor locations.
  4. Faulty monitoring equipment.
 
Without a deeper analysis, my instinct says #4.  The steps you would take to check this will depend on what kind of monitoring equipment you use.  In fact, the type of system you use can be a diagnostic tool in itself.  If you were using individual loggers, I would suspect #’s 1, 2, or 3 from my list above.  But if you are using a single processor and data collection system, say with many thermocouples, then #4 is more likely. If this were my warehouse I’d be doing some deeper statistical analysis of the data.
Best regards,
Paul

S wrote back:

Hi Paul,
Thank you for your initial response! Now I can add to the discussion: the devices are all individual loggers, and we’re performing a diagnostic check on position 4, level 3 now. I expect it will be alright since it was recording in-control readings after the event. I went and looked around yesterday, and confirmed that 4-3 is in fact near an AHU vent! We also got anecdotal evidence that this particular AHU may have been malfunctioning in the past few weeks, and it may have even been blowing hot air. Still no input on what may have happened at 11 am on 04/01, however.
Sincerely,
S


Paul responded:
Dear S,

Ah ha! What a great case study! Now, at least for point 4-3, you know how you can use the mapping data forensically to better understand the area being studied. Okay, so now we’ve potentially explained why 4-3 might have been so variable and hot. But we still don’t know two key factors, right?
 

1) What happened at 11 am on 4/01?
(Wait a minute… This isn’t an April Fool’s Day joke, is it? This would be right during the lunch hour for a 7AM-3PM shift… Maybe it was a practical joke affecting sensor placement or HVAC function or doors open?)
2) Why did position 28, level 3 become variable after the event?

Exploring Question #1 -
Since you are using individual data loggers, we can probably rule out the idea that the global event was internal to the monitoring equipment. That means the environment actually became cooler. The cooling at 11 AM is faster than anywhere else in the graph. And, if I read the graph right, sensor 20-3 has the greatest change, dropping 7 degrees in 30 minutes. Given the apparent localization of the cooling event, I would look near 20-3 to find the source of the cool air. That might lead you to the “event”.
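
One way to check which sensor fell fastest, rather than reading it off the graph, is to compute each sensor's largest drop within a 30-minute window and rank the sensors. The sketch below is only an illustration: it assumes time-sorted (timestamp, temperature) pairs per logger, and the function names and readings are hypothetical, not the study data.

```python
from datetime import datetime, timedelta


def largest_drop(readings, window=timedelta(minutes=30)):
    """Return (drop, start_time) for the largest temperature fall between any
    two readings at most `window` apart. `readings` must be sorted by time."""
    best_drop, best_start = 0.0, None
    for i, (start_time, start_temp) in enumerate(readings):
        for later_time, later_temp in readings[i + 1:]:
            if later_time - start_time > window:
                break  # every later reading is outside the window too
            drop = start_temp - later_temp
            if drop > best_drop:
                best_drop, best_start = drop, start_time
    return best_drop, best_start


def rank_by_drop(sensors, window=timedelta(minutes=30)):
    """Sort sensor names by their largest in-window drop, biggest first."""
    drops = {name: largest_drop(r, window)[0] for name, r in sensors.items()}
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


# Hypothetical readings (not the study data), logged every 15 minutes.
t = datetime(2013, 4, 1, 11, 0)
step = timedelta(minutes=15)
sensors = {
    "29-3": [(t + i * step, v) for i, v in enumerate([74.0, 73.0, 72.0, 71.5])],
    "20-3": [(t + i * step, v) for i, v in enumerate([75.0, 72.0, 68.0, 67.0])],
}
print(rank_by_drop(sensors))
```

The sensors at the top of the ranking are the places to start looking for the source of the cool air.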

 

Exploring Question #2 -  
This one concerns me. If you look at the data, 28-3 does not become highly variable until 1:42 PM, which is approximately 2 hours after the “event”, when the temperatures of every other sensor are increasing! This may be a different event, perhaps either a sensor failure or a sensor that has been moved. Before the onset of variability, the standard deviation of the data from 28-3 is about 0.96. After onset, it increases to approx. 1.29. Something definitely changed. Furthermore, sensor 29-3, which is only 8 or 9 feet away (if I read the map right), is showing no variability at all, not even a slight cooling. As long as you are checking Sensor 4-3 for issues, you should check 28-3 as well.
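
The before-and-after comparison for 28-3 can be sketched in a few lines of Python. This is a minimal example under assumptions: the readings are (timestamp, temperature) pairs, the onset time is chosen by eye from the graph, and the function name and sample values are hypothetical and will not reproduce the 0.96 and 1.29 figures above.

```python
from datetime import datetime
from statistics import stdev


def std_before_after(readings, onset):
    """Split (timestamp, temperature) pairs at `onset` and return the sample
    standard deviation of each side as (before, after)."""
    before = [temp for ts, temp in readings if ts < onset]
    after = [temp for ts, temp in readings if ts >= onset]
    return stdev(before), stdev(after)


# Hypothetical readings (not the study data) around a 1:42 PM onset.
onset = datetime(2013, 4, 1, 13, 42)
readings = [
    (datetime(2013, 4, 1, 12, 42), 70.0),
    (datetime(2013, 4, 1, 12, 57), 71.0),
    (datetime(2013, 4, 1, 13, 12), 70.0),
    (datetime(2013, 4, 1, 13, 27), 71.0),
    (datetime(2013, 4, 1, 13, 42), 68.0),
    (datetime(2013, 4, 1, 13, 57), 72.0),
    (datetime(2013, 4, 1, 14, 12), 68.0),
    (datetime(2013, 4, 1, 14, 27), 72.0),
]
before, after = std_before_after(readings, onset)
print(f"before: {before:.2f}  after: {after:.2f}  ratio: {after / before:.2f}")
```

Because warehouse temperatures follow a daily cycle, comparing windows that cover the same hours of the day is probably a fairer test than comparing arbitrary stretches. Running the same split on a neighbor such as 29-3 gives a baseline for what "no change" looks like.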

Please let me know how this resolves!
Best regards,
Paul
 

S responded:
Hi Paul,
We just received word that the position 4, level 3 sensor passed post-run diagnostics, so the data we have are accurate. Unfortunately, we don’t have much more information from the warehouse manager, although I did visually confirm 4-3 and 28-3 are indeed in proximity to an AHU vent.
I also heard anecdotally that AHU 22 near 4-3 may have been malfunctioning and maybe even blowing hot air in the last month or so. So, that’s likely been the offending variable in the study…Thanks so much for your input on this!
Regards,
S
