Overview
IT infrastructure is often designed around nominal electrical capacity assumptions: breaker ratings, UPS VA limits, and manufacturer power figures. However, real-world system behaviour is not static.
Server rooms and comms environments frequently operate correctly for extended periods, only to develop intermittent or permanent overload symptoms under seasonal or environmental change.
This is not typically a single-point failure — it is the result of margin erosion combined with thermal drift across multiple systems.
The Core Issue: Design Margin vs Operating Reality
Electrical infrastructure is normally designed with an assumed diversity factor:
- Not all devices draw maximum load simultaneously
- PSU ratings exceed typical consumption
- Breakers carry sustained current below their rating, not merely below the instantaneous trip threshold
- UPS systems are sized for average rather than peak recharge conditions
In practice, these assumptions degrade over time due to:
- Equipment growth (additional servers, switches, PoE devices)
- Increased PoE demand (APs, CCTV, access control)
- UPS battery ageing and charging inefficiency
- Cooling degradation and airflow restriction
- Firmware and workload changes increasing baseline utilisation
As a result, systems gradually move from comfortable operating margin into marginal operation.
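The erosion described above can be made concrete with a rough load tally. The following is a minimal sketch, not a sizing method: the device names, wattages, the 16 A breaker and the 230 V supply are all hypothetical, and unity power factor is assumed for simplicity.

```python
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    typical_w: float  # assumed steady-state draw, not the nameplate maximum

def circuit_current_a(devices, voltage_v=230.0):
    """Total steady-state current in amps, assuming unity power factor."""
    return sum(d.typical_w for d in devices) / voltage_v

def margin_fraction(current_a, breaker_rating_a):
    """Remaining headroom as a fraction of the breaker rating."""
    return 1.0 - current_a / breaker_rating_a

# Hypothetical rack at install time.
original = [
    Device("server-1", 450), Device("server-2", 450),
    Device("server-3", 450), Device("server-4", 450),
    Device("core-switch", 200), Device("storage", 400),
]
# Hypothetical additions over the following years.
added = [Device("poe-switch", 450), Device("server-5", 450), Device("nvr", 150)]

install_a = circuit_current_a(original)
today_a = circuit_current_a(original + added)
print(f"install: {install_a:.2f} A, margin {margin_fraction(install_a, 16.0):.1%}")
print(f"today:   {today_a:.2f} A, margin {margin_fraction(today_a, 16.0):.1%}")
```

With these illustrative figures the circuit moves from roughly a third of its rating in reserve to only a few percent, without any single addition looking significant.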
Seasonal Temperature Effects
A critical but often overlooked factor is ambient temperature variation.
In winter conditions:
- Lower ambient temperature reduces cooling demand
- Server and switch fans operate at reduced speed
- PSU efficiency improves under lower thermal stress
- UPS charging cycles are less thermally constrained
In summer conditions:
- Elevated ambient temperature increases system cooling demand
- Fan speeds increase, raising electrical consumption
- PSU efficiency decreases under thermal load
- UPS systems may draw more power for their own cooling, and battery charging becomes more thermally constrained
This results in a system-wide increase in steady-state electrical load without any change in IT workload.
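A rough model shows how the seasonal shift can add up. This is a sketch under stated assumptions: the server count, per-server load, fan power and PSU efficiency figures are illustrative placeholders, not measurements, and fans are modelled as extra load behind the PSU.

```python
def wall_power_w(dc_load_w, psu_efficiency, fan_w):
    """AC input power for one server; fans are treated as extra load behind the PSU."""
    return (dc_load_w + fan_w) / psu_efficiency

SERVERS = 8          # hypothetical rack
DC_LOAD_W = 350.0    # assumed constant IT workload per server
VOLTAGE_V = 230.0

# Assumed seasonal values: slower fans and slightly better PSU efficiency in winter.
winter_w = SERVERS * wall_power_w(DC_LOAD_W, psu_efficiency=0.94, fan_w=15.0)
summer_w = SERVERS * wall_power_w(DC_LOAD_W, psu_efficiency=0.92, fan_w=45.0)

delta_w = summer_w - winter_w
delta_a = delta_w / VOLTAGE_V
print(f"winter {winter_w:.0f} W, summer {summer_w:.0f} W")
print(f"seasonal increase: {delta_w:.0f} W ({delta_a:.2f} A) with identical workload")
```

Under these assumptions the same workload draws a few hundred watts more in summer, which is well over an amp at 230 V.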
Thermal Load and Protective Device Behaviour
Circuit protection devices do not respond purely to instantaneous current.
Miniature circuit breakers (MCBs) operate using:
- A thermal element (long-duration overload response)
- A magnetic element (instantaneous high-current trip)
When a circuit's sustained load sits close to the breaker's thermal trip threshold:
- Small increases in sustained load can trigger delayed tripping
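- Elevated ambient temperatures reduce thermal headroom, because the thermal element is calibrated at a reference ambient temperature and trips at a lower current when hotter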
- Enclosure conditions (poor ventilation, rack density) exacerbate heating effects
This results in behaviour where systems appear stable for long periods, then begin tripping intermittently as environmental conditions change.
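The sketch below classifies a sustained load against a breaker's thermal behaviour. The 1.13 and 1.45 multipliers are the conventional non-tripping and tripping currents given for MCBs in IEC 60898-1, and are worth checking against the breaker's datasheet. The derating table is a hypothetical placeholder, since real factors come from the manufacturer. Scaling the rated current by a derating factor is also a simplification.

```python
NON_TRIP_MULTIPLE = 1.13  # conventional non-tripping current, as a multiple of rating
TRIP_MULTIPLE = 1.45      # conventional tripping current, as a multiple of rating

# Hypothetical derating table: enclosure temperature (deg C) -> factor on rated current.
DERATING = {30: 1.00, 40: 0.94, 50: 0.88}

def thermal_zone(load_a, rated_a, enclosure_temp_c):
    """Place a sustained load in a coarse zone of the thermal trip characteristic."""
    effective_a = rated_a * DERATING[enclosure_temp_c]
    ratio = load_a / effective_a
    if ratio <= 1.0:
        return "within rating"
    if ratio < NON_TRIP_MULTIPLE:
        return "above rating, no trip expected within the conventional time"
    if ratio < TRIP_MULTIPLE:
        return "uncertain band, may trip after a long delay"
    return "trip expected within the conventional time"

# The same 16 A load on a 16 A breaker, at three assumed enclosure temperatures.
for temp_c in (30, 40, 50):
    print(temp_c, "C:", thermal_zone(16.0, 16.0, temp_c))
```

With these placeholder factors, an unchanged load moves from "within rating" into the delayed-trip band purely because the enclosure is hotter. That matches the intermittent, seasonal tripping pattern described above.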
UPS Systems as a Hidden Load Multiplier
UPS systems introduce additional complexity:
- Battery recharge following discharge increases input current draw
- Charging current may coincide with peak IT load conditions
- Ageing batteries increase internal resistance and charging inefficiency
- High load + recharge conditions can exceed upstream circuit design assumptions
In marginal installations, UPS behaviour can be the triggering factor for breaker overload conditions.
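The recharge effect can be approximated as an extra term on the UPS input. This is a minimal sketch: the 3000 W load, 94% efficiency and charger powers are assumed values for illustration, and a real UPS may limit or stage its charge current differently.

```python
def ups_input_current_a(it_load_w, ups_efficiency, charger_w, voltage_v=230.0):
    """Approximate UPS input current: load through conversion losses plus charger draw."""
    return (it_load_w / ups_efficiency + charger_w) / voltage_v

IT_LOAD_W = 3000.0   # hypothetical protected load
EFFICIENCY = 0.94    # assumed online efficiency

float_a = ups_input_current_a(IT_LOAD_W, EFFICIENCY, charger_w=20.0)      # batteries full
recharge_a = ups_input_current_a(IT_LOAD_W, EFFICIENCY, charger_w=450.0)  # after an outage

print(f"float charge: {float_a:.2f} A")
print(f"recharging:   {recharge_a:.2f} A (+{recharge_a - float_a:.2f} A)")
```

On a 16 A circuit, these figures leave about 2 A of headroom at float charge but almost none during recharge, and the recharge lasts as long as the batteries need.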
The “100W Problem”
In electrically marginal systems, small increases in load can have disproportionate effects.
For example:
- 100 W at 230 V ≈ 0.43 A of additional load (assuming unity power factor)
- This may appear negligible in isolation
- However, if the circuit is already operating close to the breaker's thermal trip curve, this is enough to push it into sustained overload
The issue is not magnitude — it is position on the trip curve over time.
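A short calculation shows why position matters more than magnitude. It expresses the same 100 W as a share of the headroom that remains. The 16 A rating and the three baseline currents are hypothetical.

```python
VOLTAGE_V = 230.0
RATED_A = 16.0  # hypothetical breaker rating

def added_current_a(extra_w, voltage_v=VOLTAGE_V):
    """Current added by an extra load, assuming unity power factor."""
    return extra_w / voltage_v

def share_of_headroom(baseline_a, extra_w, rated_a=RATED_A):
    """Extra current as a fraction of remaining headroom (above 1.0 means over the rating)."""
    return added_current_a(extra_w) / (rated_a - baseline_a)

for baseline_a in (8.0, 15.0, 15.8):
    share = share_of_headroom(baseline_a, 100.0)
    print(f"baseline {baseline_a:4.1f} A: 100 W consumes {share:.0%} of remaining headroom")
```

At an 8 A baseline the extra load uses a few percent of the headroom. At 15.8 A it uses more than all of it.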
Real-World Failure Mode
Typical sequence of failure:
- System installed with acceptable headroom under cool conditions
- Additional devices and services gradually increase load
- Cooling efficiency degrades over time
- Ambient temperature rises seasonally
- UPS charging or reboot events increase demand
- System crosses thermal operating threshold
- Breaker trips under sustained overload conditions
This often appears as a “sudden” failure, despite being a long-term margin erosion problem.
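The sequence can be replayed as a series of stages, each with a sustained load and an ambient derating factor. All stage values below are hypothetical and are chosen only to show the loading ratio climbing until one ordinary event crosses the limit.

```python
RATED_A = 16.0  # hypothetical breaker rating

# (stage, sustained load in amps, ambient derating factor); all values are illustrative.
STAGES = [
    ("installed, cool conditions", 10.4, 1.00),
    ("devices and services added", 13.0, 1.00),
    ("cooling degraded", 13.6, 0.97),
    ("summer ambient", 14.5, 0.92),
    ("UPS recharge after outage", 16.4, 0.92),
]

def loading_ratio(load_a, derate, rated_a=RATED_A):
    """Sustained load as a fraction of the temperature-derated rating."""
    return load_a / (rated_a * derate)

def first_overload(stages):
    """Name of the first stage whose load exceeds the derated rating, or None."""
    for name, load_a, derate in stages:
        if loading_ratio(load_a, derate) > 1.0:
            return name
    return None

for name, load_a, derate in STAGES:
    print(f"{name:30s} {loading_ratio(load_a, derate):.0%} of derated rating")
print("first sustained overload:", first_overload(STAGES))
```

In this replay no single stage looks alarming by itself. The final event is simply the first one with no margin left to absorb it.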
Key Engineering Principle
Infrastructure reliability is not determined by nameplate capacity.
It is determined by:
Sustained operation under worst-case combined conditions: temperature, load diversity, recharge behaviour, and system ageing.
Failure typically occurs when multiple near-limit conditions align, rather than from a single overload event.
Design Implications
Robust installations should account for:
- Seasonal ambient temperature variation
- UPS recharge current under full system load
- Long-term fan and PSU efficiency drift
- Minimum electrical headroom (not just theoretical capacity)
- Adequate ventilation and thermal design of rack environments
- Conservative circuit loading, kept well below the point where the breaker's thermal element begins to respond
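These considerations can be combined into one worst-case check. The sketch below is an illustration only: the 80% loading target is a common rule of thumb, not a figure from this article, and the summer uplift, UPS efficiency, recharge power and derating factor are assumed values that would need to be measured or taken from datasheets.

```python
def worst_case_current_a(it_load_w, summer_uplift, ups_efficiency, recharge_w,
                         voltage_v=230.0):
    """Input current with summer uplift, UPS losses and battery recharge all applied."""
    return (it_load_w * (1.0 + summer_uplift) / ups_efficiency + recharge_w) / voltage_v

def passes_headroom(current_a, rated_a, ambient_derate, max_loading=0.8):
    """True if the current stays within the target fraction of the derated rating."""
    return current_a <= rated_a * ambient_derate * max_loading

# Hypothetical installation: 2400 W winter baseline, 10% summer uplift,
# 94% UPS efficiency, 300 W recharge, derating factor 0.94 in a warm enclosure.
current_a = worst_case_current_a(2400.0, 0.10, 0.94, 300.0)
print(f"worst-case current: {current_a:.2f} A")
for rated_a in (16.0, 20.0):
    verdict = "ok" if passes_headroom(current_a, rated_a, 0.94) else "insufficient headroom"
    print(f"{rated_a:.0f} A circuit: {verdict}")
```

The 2400 W baseline is about 10.4 A, which looks comfortable on a 16 A circuit. Once the worst-case conditions are stacked, the same installation fails the check on that circuit and passes on a 20 A one.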
Conclusion
Many infrastructure failures are not caused by incorrect design ratings. They are caused by systems whose initially acceptable margins erode over time, until a shift in environmental conditions pushes them into failure territory.
Electrical resilience in IT environments is therefore not a static calculation — it is a dynamic operating condition.