Previous incidents

April 2025
15 Apr 2025
1 incident

triplepat.com is down

Downtime

Resolved Apr 15 at 06:37 HDT

triplepat.com recovered.

1 previous update

14 Apr 2025
1 incident

triplepat.com is down

Downtime

Resolved Apr 14 at 05:10 HDT

The root cause is a failed nginx deployment due to what looks like a race condition, an overly picky health check, or both. We are auditing the health checks.

Redeploying the exact same config worked, so it's clear that this failure has something to do with either races or ephemeral machine state.
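
One way to make a deployment health check less sensitive to transient start-up races is to retry the probe a few times before declaring failure. The following is a minimal sketch under that assumption, not the check used in this deployment; `wait_until_healthy`, its parameters, and the probe callable are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Return True as soon as `check` passes, False if every attempt fails."""
    for attempt in range(attempts):
        try:
            if check():
                return True
        except Exception:
            # Treat an exception (e.g. connection refused) as a failed probe.
            pass
        # Wait before retrying, but not after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

A deploy script could pass any probe as `check`, for example a function that requests a status URL and returns whether the response was successful. The `sleep` parameter exists only so the delay can be replaced in tests.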

2 previous updates

March 2025
29 Mar 2025
1 incident

b.triplepat.com is down

Downtime

Resolved Mar 31 at 01:12 HDT

The machine b.triplepat.com became unavailable and eventually rebooted due to a "power event" and corresponding Google Cloud outage in its datacenter. When it came back up, the containers did not all start successfully at boot time.

It looks like the internal DNS for the docker-compose network didn't come up successfully, which meant that the nginx config could not resolve the internal names for redirects, which meant that ng...
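
A boot-time guard against this kind of failure is to wait until the internal names resolve before starting anything that depends on them. The sketch below assumes that approach and is not the fix applied here; `resolves`, `wait_for_dns`, and the service names in the usage note are hypothetical.

```python
import socket
import time
from typing import Callable, Iterable


def resolves(name: str) -> bool:
    """Return True if `name` currently resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(name, None)) > 0
    except socket.gaierror:
        return False


def wait_for_dns(
    names: Iterable[str],
    resolver: Callable[[str], bool] = resolves,
    attempts: int = 30,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Return the names that still fail to resolve after all attempts."""
    pending = list(names)
    for attempt in range(attempts):
        # Keep only the names that still do not resolve.
        pending = [name for name in pending if not resolver(name)]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return pending
```

A start-up script might call `wait_for_dns(["web", "api"])` (placeholder names) and refuse to launch the proxy if the returned list is non-empty, which turns a silent boot-time failure into an explicit one.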

2 previous updates

19 Mar 2025
1 incident

d.triplepat.com is down

Downtime

Resolved Mar 31 at 01:14 HDT

Tilaa has network issues more frequently than we (or they!) would like. This was one of them. Our every-node-is-a-master-node redundancy system means that everything still managed to work fine, however.

2 previous updates

February 2025
27 Feb 2025
1 incident

a.triplepat.com was down after an OS upgrade and reboot

Downtime

Resolved Feb 28 at 04:23 HST

We upgraded the OS on the a server and rebooted it yesterday to see whether that was a safe and quick operation. It was not. We delayed bringing it back online because we haven't launched yet and were in the middle of a major code change.

2 previous updates

17 Feb 2025
1 incident

d server was unreachable repeatedly

Resolved Feb 17 at 02:03 HST

Tilaa cloud (where the d server is hosted) had some network issues that made d.triplepat.com inaccessible. It was scheduled maintenance that ended up taking more things down than they expected.

https://status.tilaa.com/incidents/6hfj589ys0w1 and https://status.tilaa.com/incidents/4b2nb4tdw7mc are the issues on Tilaa's incident report page.

It is unclear what the root cause was here, as our incident started between the two. But I suspect it ha...

11 Feb 2025
1 incident

a and b were briefly unavailable (multiple times!)

Resolved Feb 11 at 01:56 HST

a and b servers were briefly unavailable, twice! Once on 7 Feb 2025 and a second time on 11 Feb 2025. The a server even had a third outage on 10 Feb 2025! This did not affect any users for two reasons:

  1. every server needs to go down before the check-in service becomes unavailable, and
  2. we're not launched.
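
The first reason can be illustrated with a small failover sketch: a client that tries each server in turn only fails when every server is down. This is an illustration under assumed behavior, not the actual client logic; `check_in` and the `send` callable are hypothetical names.

```python
from typing import Callable, Optional, Sequence


def check_in(
    servers: Sequence[str],
    send: Callable[[str], bool],
) -> Optional[str]:
    """Try each server in order; return the first that accepts, or None."""
    for server in servers:
        try:
            if send(server):
                return server
        except OSError:
            # Network-level failure: treat the server as down and move on.
            continue
    return None
```

With servers a and b down but c still reachable, `check_in(["a", "b", "c", "d"], send)` returns `"c"`; it returns `None` only when no server in the list accepts the check-in.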

Throughout these incidents, the servers triplepat.com, c, and [d](https://d...