
The Sump Pump Incident: When Coke Machines Outrank Network Infrastructure

The story of a recurring network outage that turned out to be caused by someone who valued cold drinks more than site-wide communications. A lesson in priorities, signage, and the chaos of early IT support.

By John Croucher
War Stories · Network Administration · IT Support · Early Career

Some IT problems are complex distributed systems failures. Some are obscure race conditions. And some… some are just a guy who really didn’t want anyone to miss out on a cold Coke.

This is the story of the latter.

The Ghost in the Network

Every couple of months, we’d get the same problem: complete network outage. Total loss of connectivity across the entire site.

But here’s the thing — it was never consistent:

  • Different days of the week
  • Always around the same time of day (give or take)
  • No schedule we could predict or plan around

I’d get the call, start the long walk from my office to the server room in another building… and by the time I got there?

Everything would be fine. Like nothing had happened.

It was gaslighting, but with Ethernet.

The Pattern That Wasn’t

After the third or fourth incident, I started trying to find patterns:

  • Weather-related? No correlation with temperature or storms
  • Time-based? Roughly the same time, but not exactly
  • Load-related? No unusual traffic or usage patterns
  • Hardware failure? Everything checked out fine every single time

The logs showed nothing. The hardware was solid. The cabling tested fine.

It was like the network just… decided to take a coffee break sometimes.

The Day I Caught It

I’d given up on finding the root cause through analysis. This was going to require catching it in the act.

So when the next outage hit, I was ready.

The call comes in. Network down. And this time, I’m not walking — I’m sprinting.

I jump in my car and absolutely yeet it across the site. Pretty sure I set a new land-speed record for “Sysadmin in Full Panic Mode.”

I burst into the server room like I’m expecting to find someone stealing copper wire.

Everything looks fine. All the lights are green. Nothing unusual.

But the network is definitely still down.

Around the Wall

The fibre link between buildings ran behind the server room wall. I go around to check the patch panel on the other side.

And there he is.

A maintenance guy. Just standing there. Completely calm. Not a care in the world.

He’s unplugged the core switch — you know, the one with the bright yellow label that says “DO NOT TURN OFF” — so he can plug in his sump pump.

Let me repeat that: He took down the entire site network to power his sump pump.

The Conversation

Me: “You’ve taken down the entire site network.”

Him: (shrugs) “Was I supposed to know that?”

Me: “The sign literally says ‘DO NOT TURN OFF.’”

Him: “Yeah but… I didn’t want to unplug the Coke machine. People might want a drink.”

Let That Sink In

This guy had a choice:

  1. Unplug the Coke machine temporarily
  2. Take down every computer system on site

And he chose option 2.

Because hydration > communication.

His logic was actually kind of sound from a certain perspective — the Coke machine was visible and people use it. The switch was hidden behind a wall with a label he could safely ignore.

Out of sight, out of mind. The network is invisible when it works. The Coke machine is a tangible benefit.

The Root Cause

Once I understood what was happening, the pattern made sense:

The maintenance guy had a sump pump that he needed to run periodically. Not on a fixed schedule — just whenever water accumulated. Which happened every few weeks, usually after cleaning or when weather patterns caused groundwater issues.

He’d done this several times before. And every time, the network was only down for 10-15 minutes (however long the pump took).

By the time I walked over from my office, he was done, the switch was plugged back in, and everything looked normal.

The intermittent timing? Just whenever the sump needed draining.

The Fix

The technical fix was simple:

  1. Installed a dedicated outlet for maintenance equipment
  2. Added even more signage (though let’s be honest, that was clearly not the issue)
  3. Put a lock on the switch power supply
  4. Documented the incident in excruciating detail

But the real fix was understanding human behavior:

People will solve their immediate problem with the most convenient solution available, regardless of downstream consequences they can’t see.

The maintenance guy wasn’t being malicious or stupid. He had a problem (water), he needed power, and there was a convenient outlet right there. The consequences were invisible to him.

Lessons from the Sump Pump

1. The Most Obvious Solution Isn’t Always Obvious

To me: “Don’t unplug the thing labeled DO NOT TURN OFF” is blindingly obvious.

To him: “Don’t interrupt the Coke machine that everyone uses” was more obvious.

We see the world through our own expertise. Labels and warnings that are clear to us might be meaningless noise to others.

2. Visibility Matters

The Coke machine had immediate, visible consequences. The network switch had abstract, invisible consequences.

This is why monitoring, alerts, and visible status indicators matter. If there had been a giant red light showing “NETWORK DOWN” visible from the hallway, maybe a different choice would have been made.

3. Physical Security Matters

All the network security in the world doesn’t matter if someone can just… unplug the thing.

Physical access is root access. Always.

4. Documentation Tells Stories

Years later, this incident was still in the documentation as a reminder. Not because we thought it would happen again, but because it illustrated a principle:

Expect the unexpected. Especially the really, really unexpected.

5. Systems Need Defenses Against Users AND Non-Users

We spend a lot of time thinking about what users might do wrong. We often forget about what non-users (maintenance, cleaning crew, contractors, visitors) might do.

The most robust systems account for both.

The Broader Pattern

This wasn’t an isolated incident in my career. In 20+ years in IT, I’ve seen:

  • Cleaning crew unplugging servers to plug in vacuums
  • Construction workers cutting through “unused” cables
  • Office managers rearranging server rooms for “better feng shui”
  • Security guards turning off “noisy equipment” during night shifts
  • Well-meaning employees “organizing” cable messes by unplugging everything

The common thread? People outside the IT team who encounter IT equipment and make decisions based on their own context and priorities.

What Would I Do Differently Today?

Looking back, here’s what might have prevented this:

1. Better Physical Design

  • Critical infrastructure should be in locked rooms, not accessible areas
  • Power supplies should be isolated from general-use outlets
  • Redundancy so one unplugged switch doesn’t kill everything

2. Better Communication

  • Regular briefings to facilities/maintenance teams about critical infrastructure
  • Visual indicators that show when equipment is active/critical
  • Clear escalation paths (“If you need power, call X first”)

3. Better Monitoring

  • Real-time alerts that would have notified me instantly (a rough sketch follows this list)
  • Visible status boards showing network health
  • Cameras in areas with critical infrastructure (though that might be overkill)
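To make the “real-time alerts” point concrete, here’s a minimal watchdog sketch in Python, the sort of thing that would have caught the outage while it was still happening instead of after a sprint across the site. Everything in it is hypothetical: the switch’s management IP, the polling interval, and the alert hook are placeholders, not anything we actually ran back then (we ran nothing, which was rather the problem).

```python
#!/usr/bin/env python3
"""Minimal link-watchdog sketch: ping the core switch and shout when it vanishes.

Illustrative only. The address, interval, and alert mechanism are hypothetical,
and the ping flags assume a Linux-style ping (-c count, -W timeout in seconds).
"""
import subprocess
import time
from datetime import datetime

CORE_SWITCH = "10.0.0.2"      # hypothetical management IP of the core switch
CHECK_INTERVAL = 30           # seconds between probes
FAILURES_BEFORE_ALERT = 3     # don't page anyone over a single dropped ping


def is_reachable(host: str) -> bool:
    """Return True if a single ICMP echo to `host` succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def alert(message: str) -> None:
    """Stand-in for a real pager, email, or hallway status-board hook."""
    print(f"[{datetime.now().isoformat(timespec='seconds')}] ALERT: {message}")


def main() -> None:
    consecutive_failures = 0
    while True:
        if is_reachable(CORE_SWITCH):
            # Only announce recovery if we previously announced an outage.
            if consecutive_failures >= FAILURES_BEFORE_ALERT:
                alert(f"core switch {CORE_SWITCH} is reachable again")
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures == FAILURES_BEFORE_ALERT:
                alert(f"core switch {CORE_SWITCH} unreachable; go check the wall socket")
        time.sleep(CHECK_INTERVAL)


if __name__ == "__main__":
    main()
```

Even something this crude, wired to a pager or a status board visible from the hallway, turns an invisible failure into a visible one, which is the whole lesson of this story.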

4. Better Redundancy

  • Multiple fibre paths between buildings
  • Failover switches
  • Backup power that can’t be accidentally unplugged

But honestly? Sometimes you just need to experience the sump pump incident to learn these lessons.

The Human Element of IT

Early in my career, I thought IT problems were technical problems. Logic, troubleshooting, systematic analysis.

The sump pump incident taught me that IT problems are often human problems:

  • Communication failures
  • Different priorities
  • Invisible consequences
  • Convenient solutions

The best technical infrastructure in the world still has to interact with humans who have their own goals, priorities, and understanding of how things work.

And sometimes, those humans really care about people getting cold drinks.

The Recurring Theme

This story echoes a pattern I’ve seen throughout my career: the most critical infrastructure is often the most invisible.

  • Networks work perfectly until they don’t
  • Backups are useless until you need them
  • Security doesn’t matter until you’re breached
  • Monitoring is overhead until there’s an incident

The challenge in IT is making the invisible visible, the abstract concrete, and the distant immediate.

Because if you don’t, someone will unplug your core switch to run a sump pump.

In Defense of the Maintenance Guy

Here’s the thing: the maintenance guy wasn’t wrong to prioritize what he could see and understand.

From his perspective:

  • Sump pump = immediate visible problem (water damage)
  • Coke machine = immediate visible benefit (thirsty people)
  • Mystery switch with label = abstract concept

He made a rational decision based on the information and priorities available to him.

The failure was systemic:

  • We didn’t communicate the criticality of the equipment
  • We didn’t provide alternatives (dedicated maintenance power)
  • We didn’t make the consequences visible
  • We didn’t build in redundancy

Blaming individuals for systemic failures is easy. Fixing the system is harder but more effective.

The Unexpected Legacy

This incident became legendary in our IT department. For years, whenever someone would suggest a fix or change, someone else would inevitably ask:

“But what about the sump pump?”

It became shorthand for: “Have we considered how someone with completely different priorities and context might interact with this?”

That question has saved more headaches than any technical solution.

Conclusion

Twenty years later, I still think about the sump pump incident when designing systems:

  • How might someone misunderstand this?
  • What invisible consequences exist?
  • Who might need to interact with this who isn’t an expert?
  • What’s the most convenient but wrong solution someone might choose?

The most important lesson wasn’t about network redundancy or physical security (though those matter too).

It was about understanding that everyone operates with their own context, priorities, and understanding of the world.

And sometimes, someone genuinely believes that protecting cold beverage access is more important than your carefully designed network infrastructure.

And you know what? In that moment, from their perspective, they weren’t entirely wrong.


Have an IT war story to share? I’ve got 20+ years of them. Let’s swap stories.


This incident occurred in the mid-2000s. The maintenance guy was not reprimanded — we recognized it as a systems failure, not a personal failure. We bought him lunch and explained why the switch mattered. He never unplugged it again.