[QFJ-215] QFJ deadlocks in Session.disconnect() code when a Windows client disconnects from Linux or Mac server Created: 26/Jul/07 Updated: 11/Feb/09 Resolved: 27/Jul/07 |
|
Status: | Closed |
Project: | QuickFIX/J |
Component/s: | Engine |
Affects Version/s: | None |
Fix Version/s: | 1.3.0 |
Type: | Bug | Priority: | Default |
Reporter: | Toli Kuznets | Assignee: | Steve Bate |
Resolution: | Fixed | Votes: | 0 |
Labels: | None |
Environment: | rev 709 |
Issue Links: |
|
Description |
I'm encountering a situation (in the HEAD code) where I get a deadlock when I disconnect from a QFJ acceptor running on Linux or Mac OS X from a Windows machine.

Repro: The first time, the connection goes through successfully. Now press Ctrl-C to disconnect the Banzai process; the "acceptor" side registers the disconnect messages. Now try connecting Banzai again. On the Banzai side, you see the logon initiation messages. On the second try, when you see Banzai trying to log on, no messages show up on the Executor side. Doing a Ctrl-\ to get a stack trace yields the following:

"SocketAcceptorIoProcessor-0.0" prio=10 tid=0x905e1800 nid=0x2e7f in Object.wait() [0x8ffad000..0x8ffadf30]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xb1de2338> (a org.apache.mina.common.support.DefaultCloseFuture)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.mina.common.support.DefaultIoFuture.join(DefaultIoFuture.java:86)
    - locked <0xb1de2338> (a org.apache.mina.common.support.DefaultCloseFuture)
    at quickfix.mina.IoSessionResponder.disconnect(IoSessionResponder.java:44)
    at quickfix.Session.disconnect(Session.java:1369)
    at quickfix.mina.AbstractIoHandler.exceptionCaught(AbstractIoHandler.java:82)
    at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.exceptionCaught(AbstractIoFilterChain.java:695)
    at org.apache.mina.common.support.AbstractIoFilterChain.callNextExceptionCaught(AbstractIoFilterChain.java:423)
    at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:54)
    at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.exceptionCaught(AbstractIoFilterChain.java:794)
    at org.apache.mina.common.IoFilterAdapter.exceptionCaught(IoFilterAdapter.java:78)
    at org.apache.mina.common.support.AbstractIoFilterChain.callNextExceptionCaught(AbstractIoFilterChain.java:423)
    at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:54)
    at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.exceptionCaught(AbstractIoFilterChain.java:794)
    at org.apache.mina.common.support.AbstractIoFilterChain$HeadFilter.exceptionCaught(AbstractIoFilterChain.java:611)
    at org.apache.mina.common.support.AbstractIoFilterChain.callNextExceptionCaught(AbstractIoFilterChain.java:423)
    at org.apache.mina.common.support.AbstractIoFilterChain.fireExceptionCaught(AbstractIoFilterChain.java:407)
    at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:293)
    at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:241)
    at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$500(SocketIoProcessor.java:44)
    at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:563)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:43)
    at java.lang.Thread.run(Thread.java:619)

When you run this on a vanilla QuickFIX/J 1.2.1 distribution, everything works. However, it starts breaking with the HEAD revision (as of 709). I'm inclined to think this is a QFJ issue and not a MINA issue, since it works fine with 1.2.1 but not with the SVN revision. I've looked at the code, and it seems that there are two different code paths that happen. I saw something related in https://issues.apache.org/jira/browse/DIRMINA-261, but since we are not specifying SO_LINGER (or SocketLinger), I don't think it applies. I tried setting SocketLinger to 0, but that didn't change the behaviour.

I can reproduce this consistently, so I would like any advice on how to go about further debugging or fixing this. |
Comments |
Comment by Steve Bate [ 27/Jul/07 ] |
I've removed the CloseFuture.close() call. It shouldn't have been there: with the threading model we are using, MINA cannot process the close completion event (and so cannot complete the join()). |
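The pattern described in this comment can be illustrated with a minimal sketch that uses only the JDK and no QFJ or MINA code. It assumes a single-thread executor as a stand-in for the SocketAcceptorIoProcessor thread from the stack trace, and a CountDownLatch as a stand-in for the close future. The class and method names (SelfJoinSketch, joinOnIoThread, closeWithoutJoin) are hypothetical. The wait uses a timeout only so that the demo terminates; the join() in the trace has no timeout and would wait forever.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not the QFJ or MINA implementation.
class SelfJoinSketch {

    // The "disconnect" runs on the I/O thread, queues the close completion on
    // that same thread, and then waits for it. The completion cannot run while
    // the only worker thread is blocked in the wait, so the wait times out.
    static boolean joinOnIoThread(long timeoutMillis) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        CountDownLatch closed = new CountDownLatch(1);
        try {
            Future<Boolean> joined = io.submit(() -> {
                io.execute(closed::countDown); // close completion event
                return closed.await(timeoutMillis, TimeUnit.MILLISECONDS);
            });
            return joined.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            io.shutdownNow();
        }
    }

    // Same setup, but the "disconnect" returns without waiting. The I/O thread
    // is then free to run the close completion, which the caller observes.
    static boolean closeWithoutJoin(long timeoutMillis) {
        ExecutorService io = Executors.newSingleThreadExecutor();
        CountDownLatch closed = new CountDownLatch(1);
        try {
            io.execute(() -> io.execute(closed::countDown));
            return closed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            io.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("join on I/O thread completed: " + joinOnIoThread(200));
        System.out.println("close without join completed: " + closeWithoutJoin(200));
    }
}
```

In this sketch the first call reports that the close never completed while the I/O thread was waiting on it, and the second call completes once the wait is removed, which mirrors the fix described above.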