Mailcow runs into rate limit for every mail #5168
Comments
Ok. I've now manually commented out everything in the rspamd rate limit configuration, and I'm getting mails again at least. Good enough for some "emergency recovery". Trying to find out more. |
Same problem here but only for specific addresses/domains. |
We have found an issue related to this as well. |
@cajus what have you commented out exactly, can you give us a diff? There are 176 occurrences:
~/mailcow# grep -r -I ratelimit . | wc -l
176 |
@immanuelfodor I entered the rspamd container from my mailcow directory using
and commented everything in
using nano. After a restart it worked again (without ratelimits, of course). But as there's other work to do, I haven't searched for the cause yet. Edit: this is the wrong way to do it; it came from the "gnaaaaaargh how to fix this quickly" effect while searching for the cause inside the container. #5168 (comment) is the real way to do it, as long as there's no upstream fix for it. |
Thanks! I commented out everything in just the following file, and after restarting rspamd, SOGo can now send emails again. Here is the diff:
~/mailcow# git diff
diff --git a/data/conf/rspamd/override.d/ratelimit.conf b/data/conf/rspamd/override.d/ratelimit.conf
index aec1c788..2dd733ef 100644
--- a/data/conf/rspamd/override.d/ratelimit.conf
+++ b/data/conf/rspamd/override.d/ratelimit.conf
@@ -1,12 +1,12 @@
-rates {
- # Format: "1 / 1h" or "20 / 1m" etc. - global ratelimits are disabled by default
- to = "100 / 1s";
- to_ip = "100 / 1s";
- to_ip_from = "100 / 1s";
- bounce_to = "100 / 1h";
- bounce_to_ip = "7 / 1m";
-}
-whitelisted_rcpts = "postmaster,mailer-daemon";
-max_rcpt = 25;
-custom_keywords = "/etc/rspamd/lua/ratelimit.lua";
-info_symbol = "RATELIMITED";
+#rates {
+# # Format: "1 / 1h" or "20 / 1m" etc. - global ratelimits are disabled by default
+# to = "100 / 1s";
+# to_ip = "100 / 1s";
+# to_ip_from = "100 / 1s";
+# bounce_to = "100 / 1h";
+# bounce_to_ip = "7 / 1m";
+#}
+#whitelisted_rcpts = "postmaster,mailer-daemon";
+#max_rcpt = 25;
+#custom_keywords = "/etc/rspamd/lua/ratelimit.lua";
+#info_symbol = "RATELIMITED";
~/mailcow# docker-compose restart rspamd-mailcow |
I have reviewed our commits but could not identify any changes that could have caused this issue. However, the problem may be related to the recent upgrade of the rspamd container to version 3.5, which modified the ratelimit feature. As a possible hotfix, we could roll back the rspamd image to a previous version. Can someone with this problem test this fix?
to
|
@FreddleSpl0it is rspamd "connected" to fail2ban? In this case the PR #5127 could be related to this. Of course the rspamd ratelimit feature modifications are a viable explanation. |
mhm, cannot really see how this could be connected to the issue. Maybe I'm missing something.
|
Reverted the change in the ratelimit file, restarted rspamd as before, and it still lets SOGo send emails to both internal and external addresses; receiving from both internal and external also works 🤷‍♂️ What's happening here? |
Did a |
Maybe you have to run into a ratelimit once, and then it doesn't recover? |
It's still weird:
If it's just a coincidence, it's definitely a rare one 😀 |
I suspect that this might be the case as well. |
Same here. We had used our server without ever realizing that such a limit existed; on Monday we upgraded our instance, and now this. It started working again after commenting out the ratelimit part in the conf. |
Same problem here. No more incoming emails. |
I've repushed the image. Can someone try it again by using |
Just did it, and it goes instantly into the "ratelimit" state. No mail sending possible. Commenting out everything in the ratelimit configuration makes it work again. So, whatever it is, it's not yet resolved. @FreddleSpl0it reverting rspamd to 1.92 seems to work at first glance. |
Thank you for the feedback, @cajus. Currently, @DerLinkman has republished the old image, so no one should experience any issues when updating. We'll investigate further. |
Hi, |
@erichk4 did you update today? In my test environment, it only seems to affect mailboxes with a ratelimit set. Once the ratelimit has been triggered, it doesn't reset. In Redis, the expiration of the hash you see in 'System -> Information -> Logs -> Ratelimits' was set to 1 day and 22 hours for a ratelimit of '1/1m'. |
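To make the mismatch above concrete, here is a minimal Python sketch that parses a rate string in the format used in ratelimit.conf ("1 / 1h", "20 / 1m") and compares its window with an observed Redis expiration. The helper names `parse_rate` and `ttl_ratio` are made up for illustration and are not part of mailcow or rspamd; only the unit suffixes that appear in this thread (s, m, h, d) are handled.

```python
import re

# Seconds per unit suffix; only suffixes seen in this thread are supported.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_rate(rate: str) -> tuple[int, int]:
    """Parse 'N / <number><unit>' into (messages, window_seconds)."""
    match = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*([smhd])\s*", rate)
    if match is None:
        raise ValueError(f"unrecognised rate: {rate!r}")
    count, amount, unit = match.groups()
    return int(count), int(amount) * UNIT_SECONDS[unit]


def ttl_ratio(rate: str, observed_ttl_seconds: int) -> float:
    """How many times longer the observed expiration is than the rate window."""
    _, window = parse_rate(rate)
    return observed_ttl_seconds / window


# Expiration reported above: 1 day and 22 hours, for a limit of '1/1m'.
observed_ttl = 1 * 86400 + 22 * 3600
ratio = ttl_ratio("1/1m", observed_ttl)
```

Under these assumptions the reported expiration is several thousand times longer than the one-minute window, which fits the observation that a triggered limit does not reset.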
Could someone tell me how the ratelimit was set for the problematic mailboxes and how many recipients were attempted to be sent to? |
The server we have that was affected by this had a domain rate limited to 1 message per second. After it was triggered no email could be sent from that domain. |
Thanks, now I know what the issue is. It was introduced with the latest Rspamd update and has already been fixed in the master branch. We will wait for the release. In the meantime, we have republished the old image. A possible workaround would be to avoid using '1' as the rate limit value, such as '1/1m' or '1/10d'. |
Out of interest could you link to the upstream issue that caused this? |
If I understand everything correctly, then this should be the fix |
Hmm. I've still hundreds of rate-limit admin mails, and some mails are not delivered. Even with the updated rspamd image. Deactivating it again. |
This hit me pretty hard today. I had no rate limits set on any domains or mailboxes anywhere, but it suddenly enabled send and receive limits on 24 of my accounts for no reason, giving bogus messages about how the rate limit was set to "to" on the boxes. Other accounts were unaffected. Manually setting the system back to rspamd 1.92 and doing a down, pull, and up seems to have fixed it (hopefully). It even recognized its own bogus limit hashes and removed them. A little terrifying, because I didn't install the April update until Monday evening, so it seems the bad version was still being pushed out as of this week. I don't think the double counting bug explains this, since I didn't have rate limits turned on. |
@evultrole it seems that the affected version of Rspamd, version 3.5, is still being shipped. I just cloned a fresh mailcow and logged into Rspamd, and it showed version 3.5. Did you happen to look at the symbols added to the rate-limited emails in the Rspamd history? |
I'm not certain I'm looking at the right thing, but the log for the events is still there, so I can check whatever you want if it will be helpful. Is this what you're looking for? Symbols for a rate limit on send: RATELIMITED (0) [to(RLtqzparnjyoujkrdy1ggen5re)] Symbols for a rate limit on receive: ASN (0) [asn:22606, ipnet:13.111.0.0/16, country:US] That's all there is on the listings, which is quite short compared to stuff that goes through. |
No DYN_RL symbol was added, which indicates that you have run into a global ratelimit. |
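The rule of thumb in the comment above can be written down as a small check. This is only a sketch of that reasoning, assuming the symbols are available as plain strings in the form shown in the Rspamd history; `classify_ratelimit` is a hypothetical helper, not an rspamd or mailcow function.

```python
def classify_ratelimit(symbols: list[str]) -> str:
    """Guess which kind of ratelimit fired from a message's symbol list."""
    # Keep only the symbol name, dropping score and options:
    # "RATELIMITED (0) [to(...)]" becomes "RATELIMITED".
    names = {symbol.split("(")[0].strip() for symbol in symbols}
    if "RATELIMITED" not in names:
        return "no RATELIMITED symbol"
    if "DYN_RL" in names:
        # Assumption from the comment above: DYN_RL marks a limit
        # configured per mailbox or domain.
        return "dynamic ratelimit"
    return "global ratelimit"


# Symbols quoted earlier in this thread for the rate-limited outgoing mail.
send_symbols = ["RATELIMITED (0) [to(RLtqzparnjyoujkrdy1ggen5re)]"]
kind = classify_ratelimit(send_symbols)
```

For the outgoing message quoted earlier this yields "global ratelimit", matching the conclusion drawn above.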
Same situation here. Going back to: solved it for me. Maybe only until the next ratelimit occurs. |
Got the same problem with ratelimits in the last few days running mailcow 2023-04a. Tried the provided solutions without success:
Only the switch back to 1.92 in docker-compose.yml solved the issue for me. |
@FreddleSpl0it My ratelimit.conf is 100% stock and has not been touched. For more information: This is a very light use server, it sends less than 100 messages a day, mostly from copy machine scan-to-email functions, with no automated mailers interacting with it. It also only receives about 600 messages a day, including those rejected by rspamd. I can't imagine how any of these global limits could have possibly been triggered, even with each message being double counted.
|
I just noticed that |
@evultrole it's not that the messages get double counted. The recipients get double counted. If you send to 50 recipients, the ratelimit I'm not entirely sure about this comment: |
This E-Mail triggered the rate limit for me this morning. I updated to the latest version of mailcow yesterday (the last update before that was 14 days earlier), and I've never had issues with rate limiting before. I notice the "MISSING_TO"; unfortunately I can't see how many recipients the mail had. Since it's some kind of spam, I guess it's possible that there's a long list in CC. This is the rspamd info about the email at the time it was greylisted. On the next try it was rate limited, and after that the customer no longer received mail; it was all rate limited:
|
@Lennix Thanks for the info. Then maybe we should lower the burst. At the moment, there is no burst limit specified. For example, the rate limit to = "100 / 1s"; has a leak rate of 100 emails per second, but there is no explicit burst limit set. Without a burst limit, a single sender could potentially send up to 100 emails at once, filling the rate limit. If we set the burst limit to 50, then a single sender can only send up to 50 emails at once.
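The interplay of leak rate and burst described above can be illustrated with a generic leaky bucket. This is a minimal Python sketch of the general technique, not rspamd's actual implementation; the class name, the choice to count each recipient as one unit, and the choice not to count rejected messages are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    rate: float          # units drained from the bucket per second
    burst: float         # bucket capacity; anything above it is rejected
    level: float = 0.0   # current fill level
    last: float = 0.0    # timestamp of the previous check, in seconds

    def allow(self, now: float, recipients: int = 1) -> bool:
        # Drain the bucket for the time elapsed since the last check.
        elapsed = max(0.0, now - self.last)
        self.level = max(0.0, self.level - elapsed * self.rate)
        self.last = now
        # Reject if this message would overflow the burst capacity.
        if self.level + recipients > self.burst:
            return False
        self.level += recipients
        return True


# Leak rate of 100 per second with a burst of 50, as in the example above.
bucket = LeakyBucket(rate=100.0, burst=50.0)
first_burst = [bucket.allow(now=0.0) for _ in range(51)]
after_drain = bucket.allow(now=1.0)
```

In this model the first 50 messages at the same instant pass and the 51st is rejected; one second later the bucket has drained and sending works again. A message with many recipients fills the bucket faster, which is why counting recipients twice would matter.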
I also just encountered this (or something similar?). |
Where do I execute git diff? In the container? |
Git diff is just showing the changes. Comment out everything in your |
Ran into a similar problem today. Out of nowhere, we started getting some amount of spam on our alias
Mailcow version |
Can you try to update to 2023-04b? The rspamd tag got reassigned to 1.92 |
Is it safe to roll back to 3.4?
I have updated to 2023-04b. |
Will do at midnight, but the problem appeared about 9 days after the installation and switch to mailcow. After the update I will revert to the default ratelimits and see whether the problem occurs again within two weeks. |
Thanks! Seems that fixed it.
It was downgraded from 1.93: 5c025bf |
Hi, I am remote, so I am not able to update mailcow; I am on 2023-04a. I will try 2023-04b when I return. |
Is it somehow possible to flush the rate limit for a certain email / domain? Or to reduce the notification mails from one every other minute to, say, one every hour? |
Description
Logs:
Steps to reproduce:
Which branch are you using?
master
Operating System:
Fedora 37
Server/VM specifications:
8G, 4 CPUs
Is Apparmor, SELinux or similar active?
yes
Virtualization technology:
none
Docker version:
23.0.2
docker-compose version or docker compose version:
v2.9
mailcow version:
2023-04a
Reverse proxy:
nginx