Guidance on implementing async timeout #64
I am also experiencing I was able to get stacktrace for current threads via
src: https://github.com/mperham/sidekiq/wiki/Signals Stacktrace:
It points to: |
Can you please check if this clarifies it for you? |
Hey @ostinelli TL;DR; LONG STORY: We have 1 worker for 1 "batch notification". Code looks like:
# connection pool - declared as CONSTANT inside class (not method) in AbstractWorker
APNOTIC_POOL = ConnectionPool.new(size: 5) do
  connection = Apnotic::Connection.new(
    cert_path: Rails.root.join(ENV.fetch('IOS_PUSH_APN_PEM_CERT'))
  )
  connection.on(:error) do |exception|
    Sidekiq.logger.error "Exception has been raised on APNS socket: #{exception.inspect}"
  end
  connection
end
# taken from SendMarketRateNotificationWorker#perform
ActiveRecord::Base.uncached do
  user_scope(type, rate.currency_name).find_in_batches.with_index do |group, index|
    device_tokens = group.pluck(:device_token)
    device_tokens.each do |device_token|
      send_push_iphone(device_token, notification_message, type)
    end
  end
end
# [...]
def send_push_iphone(device_token, notification_message, type)
  APNOTIC_POOL.with do |connection|
    notification = Apnotic::Notification.new(device_token)
    notification.alert = notification_message
    notification.sound = 'default'
    notification.topic = # ...
    notification.custom_payload = {
      others: {
        device_token: device_token,
        push_type: type
      }
    }
    push = connection.prepare_push(notification)
    connection.push_async(push)
  end
end
{ We don't aim for 100% accuracy; I am happy if 99.9% of notifications are sent, so there is no retry for the rare failed notifications. } In my experience, if connection is
This connection is still inside { "6 godzin temu" means "6 hours ago" -> this job is stuck; the iPhone worker normally takes 2-5 minutes } This results in the worker waiting forever (
|
I don't understand how you can get this error being raised if you define a connection callback:
This should take care of not raising an error. Any ideas why you are seeing this?
On a side note, instead of your current custom way of declaring a connection you can now use:
APNOTIC_POOL = Apnotic::ConnectionPool.new({
  cert_path: Rails.root.join("config", "certs", "apns_certificate.pem"),
  cert_pass: "mypass"
}, size: 5) do |connection|
  connection.on(:error) { |exception| puts "Exception has been raised: #{exception}" }
end
|
Hey @ostinelli
Right now I switched from
It seems to me like after connection is broken |
#73 has nothing to do with this. What do you mean "I pasted old stacktrace"? |
That means this stacktrace:
was from when I had:
class AbstractSendMobileNotificationWorker
  include Sidekiq::Worker
  sidekiq_options retry: 0
  APNOTIC_POOL = ConnectionPool.new(size: 5) do
    connection = Apnotic::Connection.new(
      cert_path: Rails.root.join(ENV.fetch('IOS_PUSH_APN_PEM_CERT'))
    )
    connection
  end
  # [...]
After that (found info here) I added a callback:
class AbstractSendMobileNotificationWorker
  include Sidekiq::Worker
  sidekiq_options retry: 0
  # https://github.com/ostinelli/apnotic/issues/48#issuecomment-399776823
  APNOTIC_POOL = ConnectionPool.new(size: 5) do
    connection = Apnotic::Connection.new(
      cert_path: Rails.root.join(ENV.fetch('IOS_PUSH_APN_PEM_CERT'))
    )
    connection.on(:error) do |exception|
      Sidekiq.logger.error "Exception has been raised on APNS socket: #{exception.inspect}"
    end
    connection
  end
  # [...]
Sidekiq stopped crashing when the connection was closed, but I experienced issues with hanging Sidekiq jobs.
In my experience (YMMV) if you got 2 |
Please see if latest |
I'm posting this here because it seems the same as the original poster's issue; however, I am using a single connection in a rake task. It happens inconsistently. This is the stack that is dumped when I interrupt a stuck process using SIGTERM: versions: (ruby 2.3.8)
trace reported when interrupted:
it's a basic async_push:
In an attempt to troubleshoot, I went from sending a batch of around 30k pushes on one connection to sending batches of 200 and rebuilding the connection for each batch. However, I still run into this "sleep forever" behavior on .join in about 1/4 of the runs of the job. |
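For what it's worth, one way to avoid depending on .join at all in the batch approach above is to count responses yourself and stop waiting at a deadline. This is only a sketch under assumptions: FakeAsyncConnection and send_batch are hypothetical names, and the fake's push_async takes a block, whereas with Apnotic (as far as I understand the README) you would register push.on(:response) on the object returned by prepare_push before calling push_async.

```ruby
# Hypothetical stand-in for an async connection: the callback fires on a
# background thread, except for tokens in `lost`, whose response never arrives.
class FakeAsyncConnection
  def initialize(lost: [])
    @lost = lost
  end

  def push_async(token, &on_response)
    return if @lost.include?(token)
    Thread.new { on_response.call(token) }
  end

  def close; end
end

# Sends one batch, then waits until every response has arrived or the
# deadline has passed. Returns the tokens that never got a response.
def send_batch(connection, tokens, deadline_seconds)
  responses = Queue.new
  tokens.each { |token| connection.push_async(token) { |t| responses << t } }

  pending = tokens.dup
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + deadline_seconds
  until pending.empty?
    break if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    begin
      # Non-blocking pop raises ThreadError when the queue is empty.
      pending.delete(responses.pop(true))
    rescue ThreadError
      sleep 0.01
    end
  end
  pending
end

tokens = (1..450).map { |i| "token-#{i}" }
unanswered = tokens.each_slice(200).flat_map do |batch|
  # Fresh connection per batch, always closed afterwards.
  connection = FakeAsyncConnection.new(lost: ['token-7', 'token-300'])
  begin
    send_batch(connection, batch, 0.5)
  ensure
    connection.close
  end
end
```

The tokens left in unanswered could be logged or re-queued; the point is that the job finishes within a bounded time even if some responses never arrive.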
I'm having a similar hang on sends (not async, and using individual connections, not connection pool, although the problem happens with the pool also). It just seems to get stuck on the send forever. So far, adding an explicit timeout to the send call seems to be working, but it remains to be seen if this is a long term fix.
Does the internal HTTP2 call without a specified timeout just wait indefinitely and that's what's causing these hangs?
apnotic/lib/apnotic/connection.rb Lines 45 to 49 in b71af58
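A rough sketch of the "explicit timeout on the send call" approach, in case it helps others. My understanding of the Apnotic README is that the blocking push accepts a timeout option and returns nil when the timeout occurs, but please verify that against the version you run. FakeBlockingConnection and push_with_retry below are hypothetical names made up for this example so that it runs without the gem.

```ruby
# Hypothetical stand-in for a blocking connection: push returns nil when the
# simulated latency exceeds the timeout, otherwise a response-like string.
class FakeBlockingConnection
  def initialize(latency_seconds)
    @latency_seconds = latency_seconds
  end

  def push(notification, timeout: 30)
    return nil if @latency_seconds > timeout
    "sent #{notification}"
  end

  def close; end
end

# Tries the push on a freshly built connection, up to `attempts` times.
# A nil response is treated as a timeout; the connection is always closed.
def push_with_retry(build_connection, notification, timeout:, attempts: 2)
  attempts.times do
    connection = build_connection.call
    begin
      response = connection.push(notification, timeout: timeout)
      return response if response
    ensure
      connection.close
    end
  end
  nil
end

# First connection is too slow (times out), second one answers in time.
latencies = [10, 0.1]
build = -> { FakeBlockingConnection.new(latencies.shift) }
result = push_with_retry(build, 'token-1', timeout: 1)

# Every attempt times out, so the caller gets nil and can decide what to do.
gave_up = push_with_retry(-> { FakeBlockingConnection.new(10) }, 'token-2', timeout: 1)
```

Rebuilding the connection on each attempt is a design choice of this sketch, on the assumption that a connection that just timed out may be broken.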
|
I would like to include handling for timeout situations when using .push_async and would appreciate some guidance on how best to accomplish this.
To provide some context, we first discovered timeout-related issues in our own project upon using Apnotic's connection.join. The join was never returning and there were also notifications that had been queued via .push_async that had not run their on(:response) yet. Our knee-jerk solution is to wrap .join in a timeout block and run .close shortly after, whether a timeout occurs or not.
This got me to thinking... if two notifications are sent with the async method, are they always delivered in sequence? If there is some sort of delay with sending the first one, does the second one stay stuck? My guess is that it depends upon how the http2 streams multiplex the notifications. Is this accurate?
Ultimately, I'm hoping to find a robust way to time out these problems and retry but would greatly benefit from some insights on the best way to approach it with this library.
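To make the "wrap .join in a timeout block and run .close whether a timeout occurs or not" idea concrete, here is a minimal sketch using Ruby's standard Timeout module. FakeConnection and join_with_timeout are hypothetical names; the fake only simulates a join that may block, and I have not verified how a real Apnotic connection behaves when its join is interrupted this way.

```ruby
require 'timeout'

# Hypothetical stand-in for a connection, so the sketch runs on its own.
class FakeConnection
  attr_reader :closed

  def initialize(join_seconds)
    @join_seconds = join_seconds
    @closed = false
  end

  # Simulates a join that may block longer than we are willing to wait.
  def join
    sleep @join_seconds
  end

  def close
    @closed = true
  end
end

# Waits for outstanding pushes for at most `seconds`, then always closes.
# Returns true if join finished in time, false if it timed out.
def join_with_timeout(connection, seconds)
  Timeout.timeout(seconds) { connection.join }
  true
rescue Timeout::Error
  false
ensure
  connection.close
end

fast = FakeConnection.new(0.01)
slow = FakeConnection.new(5)
fast_ok = join_with_timeout(fast, 1)
slow_ok = join_with_timeout(slow, 0.1)
```

A false return value tells the caller that some responses may still be missing, which is where a retry decision would go.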