[QFJ-804] Race condition between sessionClosed and processing logout message received Created: 31/Jul/14  Updated: 27/Oct/14  Resolved: 27/Oct/14

Status: Closed
Project: QuickFIX/J
Component/s: Build
Affects Version/s: 1.5.3
Fix Version/s: 1.6.0

Type: Bug Priority: Default
Reporter: Glyn Walters Assignee: Christoph John
Resolution: Duplicate Votes: 0
Labels: QuickfixJ

Issue Links:
Duplicate
duplicates QFJ-790 Logout message not propagated to the ... Closed
Relates
relates to QFJ-810 Messages get lost during logon proces... Open

 Description   

sessionClosed() called by AbstractIoHandler can call disconnect() on Session before a logout message has been processed by the EventHandlingSession. (This was witnessed with a SingleThreadedEventHandlingStrategy).

The disconnect() call on Session results in logonSent being set to false in SessionState. If this happens before SingleThreadedEventHandlingStrategy calls next(Message) on the Session object, the logout message will subsequently throw a SessionException in validLogonState() (as logonSent is now false). And the message will not reach the application.

This is problematic in the scenario where an ECN is informing us in a logout message of an issue with sequence numbers. The message is not received and the appropriate action can't take place,

This is an example of the occurrence from our logs with Reuters:

20140731-10:44:42.662 INFO [SocketConnectorIoProcessor-1.0] quickfix.Session - [XXX/YYY->TR MATCHING/FXM] Disconnecting: IO Session closed
20140731-10:44:42.663 ERROR [QFJ Message Processor] quickfixj.errorEvent - XXX/YYY->TR MATCHING/FXM: quickfix.SessionException Logon state is not valid for message (MsgType=5)

One possible suggestion would be to check the EventHandlingStrategy queue size and delay the sessionClosed call if there are outstanding events.

E.g. in InitiatorIoHandler

@Override
public void sessionClosed(IoSession ioSession) throws Exception {
while (eventHandlingStrategy.getQueueSize() > 0) {
log.info("Delaying sessionClosed while " + eventHandlingStrategy.getQueueSize() + " event(s) "
+ "are processed");
try

{ Thread.sleep(500); }

catch (InterruptedException e)

{ log.warn("sessionClosed dealy was interrupted"); }

}
super.sessionClosed(ioSession);
}

Perhaps there are better approaches. Is this a known race condition?



 Comments   
Comment by Christoph John [ 31/Jul/14 ]

Hi, thanks for the report. This sounds similar to QFJ-790.

Comment by Glyn Walters [ 31/Jul/14 ]

Hi, yes indeed, we're describing the same issue I believe. His proposal of having the event handler informed of the end of stream, or the approach of having the mina thread wait, to give the event handling queue to empty both would seem to be possible approaches.

Comment by Glyn Walters [ 31/Jul/14 ]

Try to phrase that a bit better: -edit "to give the event handling queue a chance to empty, would both seem possible approaches"

Comment by Christoph John [ 27/Oct/14 ]

I closed this as duplicate. Moreover, the approach from QFJ-790 will be followed. During tests it showed that waiting in the sessionClosed callback made the engine lock in some cases which got resolved after the Logout timeout. This is due to the fact that two threads want to access the ResponderSync object.

Generated at Sat Nov 23 01:22:15 UTC 2024 using JIRA 7.5.2#75007-sha1:9f5725bb824792b3230a5d8716f0c13e296a3cae.