Question: “We need a bulletproof Parallels RAS for 1,200 users. It absolutely cannot go down. And if Toronto fails, Calgary must take over instantly.”
That was the CTO of one of Canada’s largest logistics companies laying out the challenge. In his world, even ten minutes of downtime can mean hundreds of delayed shipments, trucks sitting idle, and warehouses scrambling.
We had two sites to work with: Toronto as the primary, Calgary as the DR site. They already had F5 BIG-IP appliances at both locations and Parallels RAS with its built-in HALB (High Availability Load Balancing) feature. The early idea floated in the room was to stack another ADC layer, maybe NetScaler, on top of the F5 and HALB for “extra” resilience.
Here’s the thing about “extra”: it can cost you. In this case, it would’ve added close to CAD $180K a year in licensing alone — plus the operational overhead of managing a third tier of load balancers. For a lean IT team, that wasn’t going to fly.
So, I recommended a different plan: keep it simple, but tune it like a race car. F5 would handle the heavy lifting — global load balancing between sites, local distribution across gateways — and HALB would focus on broker awareness and session host health. The magic was in getting the F5 health monitors tuned so they wouldn’t hammer the brokers with constant probes. I’ve seen that mistake cause false failovers before, and we weren’t going to let that happen here.
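To make the false-failover point concrete, here is a minimal sketch of the idea behind a well-tuned monitor: one missed probe should not mark a broker down, only a run of them. This is a generic illustration, not F5 configuration syntax, and the `fall` and `rise` thresholds are assumed values, not the ones we used.

```python
from dataclasses import dataclass


@dataclass
class MonitorState:
    """Tracks one monitored node with simple hysteresis (hypothetical helper)."""
    fall: int = 3    # assumed: consecutive failed probes before marking down
    rise: int = 2    # assumed: consecutive good probes before marking up again
    up: bool = True
    _fails: int = 0
    _oks: int = 0

    def observe(self, probe_ok: bool) -> bool:
        """Record one probe result and return the current up/down state."""
        if probe_ok:
            self._oks += 1
            self._fails = 0
            if not self.up and self._oks >= self.rise:
                self.up = True
        else:
            self._fails += 1
            self._oks = 0
            if self.up and self._fails >= self.fall:
                self.up = False
        return self.up


if __name__ == "__main__":
    monitor = MonitorState()
    probes = [True, False, True, False, False, False, True, True]
    print([monitor.observe(p) for p in probes])
```

A single blip in the probe sequence leaves the node up; only three failures in a row take it out, and it needs two clean probes to come back. Pairing thresholds like these with a longer probe interval is what keeps the monitors from hammering the brokers.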
In normal operation, Calgary only needed to support about 300 concurrent users. But in a failover, it had to instantly handle all 1,200. We wrote automation so that when F5 detected trouble in Toronto, it would flip Calgary into “all hands on deck” mode: spinning up more brokers, lighting up additional gateways, and adding 24 extra session hosts in just minutes.
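The sizing arithmetic behind that scale-out can be sketched in a few lines. The function names are hypothetical, and the density of 38 users per session host is an assumption picked only because it happens to reproduce the 24 extra hosts mentioned above; the real per-host density depends on the workload.

```python
import math


def hosts_needed(users: int, users_per_host: int) -> int:
    """Session hosts required to carry a given number of concurrent users."""
    return math.ceil(users / users_per_host)


def extra_hosts_for_failover(normal_users: int, failover_users: int,
                             users_per_host: int) -> int:
    """Additional hosts the DR site must add when it takes the full load."""
    baseline = hosts_needed(normal_users, users_per_host)
    full_load = hosts_needed(failover_users, users_per_host)
    return max(0, full_load - baseline)


if __name__ == "__main__":
    # 300 users in normal operation, 1,200 in failover (from the scenario above).
    # 38 users per host is an assumed density, not a measured one.
    print(extra_hosts_for_failover(300, 1200, users_per_host=38))
```

Keeping the density as a parameter makes it easy to rerun the numbers when the user count or the host size changes.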
And then the storm came.
It was mid-February, a brutal Ontario ice storm. At 2:41 PM, Toronto’s primary fiber line went dark. F5 saw the gateway health drop immediately. By 2:43 PM, it had shifted new sessions to Calgary. At 2:44 PM, our automation kicked in, scaling the DR site to full capacity. By 2:48 PM, Calgary was serving everyone. Only 172 people even noticed. The rest kept working, blissfully unaware that their sessions had moved 3,000 kilometers west.
Over the next six months, the system proved itself again and again. We hit a 99.7% connection success rate, sessions started in just over 8 seconds on average, and downtime dropped well below the SLA target. The business avoided roughly CAD $420K in shipment delay costs, cut help desk calls nearly in half, and saved a small fortune by not buying another ADC stack.
The lesson? High availability isn’t about piling on more gear — it’s about knowing when you’ve already got the right tools and making them work smarter. For this logistics company, F5 plus HALB — tuned for their reality — kept freight moving even when Mother Nature threw her worst at us.