Hi,
I know this is quite impossible to diagnose from afar, but I came across the posting from lemmy.world admins talking about the attacks they are facing where the database will get overwhelmed and the server doesn’t respond anymore. And something similar seemed to have happened to my own servers.
Now, I’m running my own self-hosted Lemmy and Mastodon instances (on 2 seperate VPS) and had them become completely unresponsive yesterday. Mastodon and Lemmy both showed the “there is an internal/database error” message and my other services (Nextcloud and Synapse) didn’t load or respond.
Login into my VPS console showed me that both servers ran at 100% CPU load since a couple of hours. I can’t currently SSH into these servers, as I’m away for a couple of days and forgot to bring my private SSH key on my Laptop. So, for now I just switched the servers off.
Anyway, the main question is: what should I look at in troubleshooting when I’m back home? I’m a beginner in selfhosting and I run these instances just for myself and don’t mind if I’d have to roll them back a couple days (I have backups). But I would like to learn from this and get better at running my own services.
For reference: I run everything in docker containers behind Nginx Proxy Manager as my reverse proxy. I have only ports 80, 443 and 22 open to the outside. I have fail2ban set up. The Mastodon and Lemmy instances are not open for registration and just have 2 users each (admin + my account).
If you do not neglect updates, then by all mean, changing ports does not hurt. :) Sorry if I may have strong reaction on that, but I’ve seen way too many people in the past couple decades counting on such anecdotal measures and not doing the obvious. I’ve seen companies doing that. I’ve seen one changing ports, forcing us to use the company certificate to log in, and then not update their servers in 6 months. I’ve seen sysadmins who considered that rotating servers every year made it useless to update them, but employees should all use Jumpcloud “for security reasons”! Beware, though, mentioning port changing without saying it’s anecdotal and the most important thing is updates, because it will encourage such behaviors. I think the reason is because changing ports sounds cool and smart, while updates just sound boring.
That being said, port scanning is not just about targeted pentesting. You can’t just run nmap on a host anymore, because IDS (intrusion detection systems) will detect it, but nowadays automated pentesting tools do distributed port scanning to bypass them : instead of flooding a host to test all their ports, they test a range of hosts for the same port, then start over with a new port. It’s half-way classic port scanning and the “let’s just test the whole IP range for a single vulnerability” that we more commonly see nowadays. But they are way harder to detect, as they scan smaller sets of hosts, and there can be hours before the same host is tested twice.