This repository has been archived by the owner on Mar 9, 2018. It is now read-only.

Rapns Worker stalls and not setting notifications as delivered #179

Open
navied opened this issue Nov 19, 2013 · 8 comments

Comments

@navied

navied commented Nov 19, 2013

I am having an issue with the rapns worker lately. It seems to suddenly stop working at random and needs to be restarted. Worse, the notifications sent after restarting the rapns worker are not being marked as delivered and have to be marked manually; otherwise, the next time the worker is restarted it resends them all again.

@ileitch
Owner

ileitch commented Nov 20, 2013

Not much detail for me to work with here. Do you see any SQL activity in the log? Does sending a USR2 signal print any debug info to the log?

@navied
Author

navied commented Nov 20, 2013

Yeah, sorry about that. I looked at the SQL logs around the time the last notification was sent, and it doesn't look like anything unusual happened.

The problems started when I sent a global notification to all our users, creating thousands of notifications at once. It stalled while sending them, and I noticed that notifications which had already been sent were not marked as sent in the database. I had to manually make sure all of them were marked as delivered, because otherwise the next time rapns restarted it would go through all those notifications again.

I will send a USR2 signal the next time the problem pops up to see if anything useful shows up.



@ileitch
Owner

ileitch commented Nov 21, 2013

How long after creating the notifications did you then check the log?

I wonder if the batch size is too large. The default is to load 5000 notifications at a time, and combined with the error detection option check_for_errors, which implies a sleep of 0.2s per notification, the best possible processing time for the batch over a single connection to APNs is 16.6 minutes. In reality it could be at least 50% higher.
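That estimate is easy to verify with back-of-the-envelope arithmetic, using the 5000-notification batch size and 0.2s sleep quoted above:

```ruby
# Back-of-the-envelope check of the batch processing time estimate.
batch_size      = 5000  # default rapns batch size
sleep_per_notif = 0.2   # seconds implied by check_for_errors

total_seconds = batch_size * sleep_per_notif
minutes = total_seconds / 60.0
puts format("best case: %.1f minutes per batch", minutes)
```

At 5000 notifications that is 1000 seconds, i.e. roughly 16.6 minutes before any network latency is counted.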

Try setting config.batch_size to a lower value, say 100. Also, I'd recommend turning off config.check_for_errors, it's an imperfect check anyway (I'm planning an improvement).
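Put together, the suggested settings would look something like the initializer below. This is a sketch based on the option names used in this thread; verify them against the rapns version you have installed.

```ruby
# config/initializers/rapns.rb -- sketch of the settings suggested above.
# Option names are taken from this discussion; check your rapns version.
Rapns.configure do |config|
  config.batch_size       = 100    # down from the default of 5000
  config.check_for_errors = false  # imperfect check; avoids the 0.2s sleep per notification
end
```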

@navied
Author

navied commented Nov 25, 2013

It has been behaving properly so far, though I am not sure why. It could be fewer notifications being delivered, or possibly because I enabled config.batch_storage_updates = true, but that could have always been true by default.

What are the downsides to turning off check_for_errors? Would that affect feedback and the disabling of devices that no longer have the app installed?

@j-mcnally

Is it possible that you are running into locks because you are using MyISAM instead of InnoDB tables?

@navied
Author

navied commented Nov 28, 2013

Using PostgreSQL, which, correct me if I am wrong, uses neither but its own storage engine.

@j-mcnally

yeah, durr just throwing that out there.

@navied
Author

navied commented Dec 23, 2013

So the rapns process did this again after behaving for a while. I tried sending a USR2 signal with "kill -USR2 pid", but nothing was printed to the logs; I might be sending the signal wrong. I ended up lowering the rapns batch size and turning off error checking to see how that affects things.
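For what it's worth, signal delivery itself can be sanity-checked with a standalone Ruby snippet. This is not rapns code, just a demonstration that `kill -USR2 <pid>` reaches a process that traps the signal (rapns installs a similar trap to print debug info to its log):

```ruby
# Standalone demo: trap USR2 and deliver it to our own process.
received = false
Signal.trap("USR2") { received = true }

Process.kill("USR2", Process.pid)  # equivalent to: kill -USR2 <pid>
sleep 0.1                          # give the handler a moment to run
puts "USR2 received? #{received}"
```

If this prints true but rapns still logs nothing, the problem is the pid being signalled (e.g. a wrapper or monitor process) rather than the signal itself.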
