Previous incidents

April 2025
Apr 15, 2025
1 incident

triplepat.com is down

Downtime

Resolved Apr 15 at 06:37am HDT

triplepat.com recovered.


Apr 14, 2025
1 incident

triplepat.com is down

Downtime

Resolved Apr 14 at 05:10am HDT

The root cause is a failed nginx deployment, apparently caused by a race condition and/or an overly picky health check. We are auditing the health checks.

Redeploying the exact same config worked, so it's clear that this failure has something to do with either races or ephemeral machine state.
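
For illustration only, here is a minimal sketch of a health check that tolerates a short startup race by retrying with exponential backoff before declaring a deployment failed. This is not the actual deploy tooling: the function name `wait_until_healthy`, its parameters, and the backoff values are all hypothetical.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    initial_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as check() passes, retrying with exponential backoff.

    `check` is any zero-argument probe (hypothetical here), for example a
    function that requests a health endpoint and returns True on success.
    """
    delay = initial_delay
    for attempt in range(attempts):
        if check():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < attempts - 1:
            sleep(delay)
            delay *= 2
    return False
```

A probe that fails only during the first moments after startup would pass on a later attempt, while a deployment that is actually broken still fails once the attempts run out.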


March 2025
Mar 29, 2025
1 incident

b.triplepat.com is down

Downtime

Resolved Mar 31 at 01:12am HDT

The machine b.triplepat.com became unavailable and eventually rebooted due to a "power event" and corresponding Google Cloud outage in its datacenter. When it came back up, not all of the containers started successfully at boot time.

It looks like the internal DNS for the docker-compose network didn't come up successfully, which meant that the nginx config could not resolve the internal names for redirects, which meant that ng...


Mar 19, 2025
1 incident

d.triplepat.com is down

Downtime

Resolved Mar 31 at 01:14am HDT

Tilaa has network issues more frequently than we (or they!) would like. This was one of them. However, our every-node-is-a-master-node redundancy system meant that everything still worked fine.
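
The availability rule behind that redundancy can be sketched in a few lines: the service counts as up as long as at least one node is reachable. This is an illustration under that assumption, not the real monitoring code; the function name `service_available` and the status mapping are hypothetical.

```python
from typing import Mapping


def service_available(server_up: Mapping[str, bool]) -> bool:
    """The service is available if any single node is still up."""
    return any(server_up.values())


# Hypothetical snapshot for an incident like this one: only d is unreachable.
status = {"a": True, "b": True, "c": True, "d": False}
print(service_available(status))  # True
```

Only when every entry is `False` does the function report the service as unavailable.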


February 2025
Feb 27, 2025
1 incident

a.triplepat.com was down after an OS upgrade + reboot

Downtime

Resolved Feb 28 at 04:23am HST

We tried to upgrade the OS and reboot the a server yesterday to see if that was a safe and quick operation. It was not. We delayed bringing it back online because we haven't launched yet and were in the middle of a major code change.


Feb 17, 2025
1 incident

d server was unreachable repeatedly

Resolved Feb 17 at 02:03am HST

Tilaa cloud (where the d server is hosted) had some network issues that made d.triplepat.com inaccessible. It was scheduled maintenance that ended up taking more things down than they expected.

https://status.tilaa.com/incidents/6hfj589ys0w1 and https://status.tilaa.com/incidents/4b2nb4tdw7mc are the issues on Tilaa's incident report page.

The root cause here is unclear, as our incident started between the two. But I suspect it ha...

Feb 11, 2025
1 incident

a and b were briefly unavailable (multiple times!)

Resolved Feb 11 at 01:56am HST

a and b servers were briefly unavailable, twice! Once on 7 Feb 2025 and a second time on 11 Feb 2025. The a server even had a third outage on 10 Feb 2025! This did not affect any users for two reasons:

  1. every server needs to go down before the check-in service becomes unavailable, and
  2. we're not launched.

Throughout these incidents, the servers triplepat.com, c, and [d](https://d...