Requests getting lost when handle_request receives non-selectively #42
One question I meant to ask: if we go with my solution for the whole thing, do we accept duplicating the 'Req' value with all of its contents? I know misultin already does sandboxing that way -- we'd end up with 3 copies of the socket by acting that way, but I see no other good fix that doesn't involve doing things like that. Is it worth punishing everyone for the corner cases where people do it my way, or do we punish the few people who hit that issue by requiring them to look at the documentation (which we will need to write at some point)?
@ferd, could you copy-paste your solution?
Here you go, @sinnus. The code is a bit customized for a closed-source app I had, and you can disregard some options such as …

You can see the trick starting at …
@ferd, I don't understand: how can messages get lost?
New socket calls are received as messages by the process controlling the TCP socket. Misultin dispatches a process to do that, and socket.io's code sits in that process to redirect the messages to the right spot. So if your handle_request callback does a non-selective receive in there, it can consume messages that were likely meant for the socketio process in charge of handling queries, and your handler then discards them. That means that on the other end of the request (the browser), the query will never be replied to and it will time out. The same can be true of messages sent to the socketio process in charge of handling queries that are unrelated to sockets (internal communications). Because of this, I had to split my queries into two processes. On one hand, the handle_request function is guaranteed not to consume messages it wasn't meant to; on the other, by spawning a new process to handle the actual request, I allow myself to do and receive whatever I want in that spawned process.
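To make the failure mode concrete for external readers, here is a hypothetical callback in the style of the repro further down this thread (handle_later is a made-up helper, and the shape of the consumed message is an assumption):

```erlang
%% Hypothetical sketch of the failure mode. This callback runs inside the
%% socketio_http process, so its mailbox is shared with socket.io's own
%% control messages.
handle_request('GET', ["poll"], Req) ->
    receive
        Anything ->
            %% If a second client request arrives right now, the
            %% {request, 'GET', ...} message meant for socketio_http is
            %% consumed here instead, and that browser's query times out.
            handle_later(Anything)
    end,
    Req:ok("done").
```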
@ferd, is it a Misultin issue that messages get bypassed when a function takes a long time?
I don't think we can pinpoint the problem to a single place. It comes from the interaction of many logical actors (Misultin's side, socket.io-erlang's side, the callback's side) running with the same mailbox. You can fix the issue on any side, but I don't know of a way to do it that won't require adding processes.
@ferd, thank you for your answers! As I understand it, the problem is that the mailbox is "shared" between the different sides.
Exactly. The ownership of messages for each query or worker becomes extremely vague. Misultin doesn't read messages itself, but you can still have conflicts between your callbacks and socket.io-erlang. Using processes the way I did solves the problem.
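For readers less familiar with Erlang mailboxes, a generic sketch (not code from this repository) of the difference between the two kinds of receive:

```erlang
%% Selective receive: only messages tagged with our reference are consumed;
%% a {request, ...} message sitting in the mailbox is left untouched.
wait_for_reply(Ref) ->
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        timeout
    end.

%% Non-selective receive: the first message of any shape is consumed,
%% including ones owned by socketio_http or socket.io-erlang internals.
wait_for_anything() ->
    receive
        Msg -> Msg
    end.
```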
Hello, @ferd! I tested a "long-running" callback for the long-polling and websocket transports using timer:sleep:

```erlang
-module(socketio_transport_websocket).
...
%% Websockets
handle_call({websocket, Data, _Ws}, _From, #state{ heartbeat_interval = Interval,
                                                   event_manager = EventManager } = State) ->
    error_logger:info_msg("Before timer~n", []),
    timer:sleep(1000),
    error_logger:info_msg("After timer~n", []),
    ...
```

```erlang
-module(socketio_transport_polling).
...
%% Incoming data
handle_call({_TransportType, data, Req}, From, #state{ server_module = ServerModule,
                                                       event_manager = EventManager,
                                                       polling_duration = Interval,
                                                       sup = Sup } = State) ->
    error_logger:info_msg("Before timer~n", []),
    timer:sleep(1000),
    error_logger:info_msg("After timer~n", []),
    ...
```

I used the flash socket.io client to send data to the server, and all messages were processed by the server side. I only hit the long-polling timeout when the timer:sleep duration exceeded the long-polling timeout.
I'm not sure I understand. Are you trying to reproduce the issue I had?
Yes, I'm trying to reproduce the issue without "reloading many tabs of the same browser really fast".
@ferd, I have reproduced your issue. My test code above was in the wrong place because of a misunderstanding, sorry. My steps to reproduce:

```erlang
handle_request('GET', [], Req) ->
    error_logger:info_msg("Before timer", []),
    timer:sleep(10000),
    error_logger:info_msg("after timer", []),
    Req:file(filename:join([filename:dirname(code:which(?MODULE)), "index.html"]));
```
As we can see, handle_request('GET', [], Req) blocks the socket.io server: all clients end up waiting on the handle_request function.
TL;DR: when we're using handle_request/5 to deal with a request, control of the socketio_http process is handed over to the user, and queries can get lost if the callback does a non-selective receive during quick successions of client requests.
I used the handle_request/5 call and had it communicate with non-socket.io processes. As part of its task, the callback had to do a non-selective receive and digest everything it received for a while, to handle the messages later (much like a gen_server would).
However, socketio_http runs the handle_request calls within its own process. So whenever socketio_http_misultin or some other instance calls the socketio_http server for the current connection more than once (this can be simulated by reloading many tabs of the same browser really fast), the handle_request currently running is the one that receives the {request, 'GET', ...} message, bypassing the socketio_http process altogether. The process model used here should be modified to avoid such issues.
In the meantime (and for external readers), I solved the issue by spawning a new process in handle_request, monitoring it, and waiting only for that process's monitor to fire. During that time the spawned process can send data to the socket as much as it wants, but we're forcing keep-alive/pipelined calls to be serialized when they belong to the same session. This exact model should probably be moved from my own application into socket.io-erlang itself (to be discussed).
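A minimal sketch of that workaround, reconstructed from the description above (do_handle_request is a hypothetical helper; the actual closed-source code differs):

```erlang
%% The worker owns its own mailbox and may receive freely; the calling
%% process does a selective receive that can only match the worker's
%% 'DOWN' message, so it never consumes messages owned by socketio_http.
handle_request(Method, Path, Req) ->
    {Pid, Ref} = spawn_monitor(fun() ->
        do_handle_request(Method, Path, Req)
    end),
    receive
        {'DOWN', Ref, process, Pid, _Reason} -> ok
    end.
```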