[QFJ-433] Session can deadlock on responderSync and reduced concurrency Created: 16/Apr/09  Updated: 15/Nov/12  Resolved: 16/Apr/09

Status: Closed
Project: QuickFIX/J
Component/s: Engine
Affects Version/s: 1.4.0
Fix Version/s: None

Type: Bug Priority: Default
Reporter: Parwinder Sekhon Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Environment:

Windows Java 1.6


Issue Links:
Duplicate
duplicates QFJ-421 Session responder lock is too restric... Closed

 Description   

Because Session is using a String literal as a lock, the scope of this lock is not per Session instance

private String responderSync = "SessionResponderSync";
// @GuardedBy(responderSync)
private Responder responder;

As well as reducing concurrency it caused my app to deadlock, because the vm will intern "SessionResponderSync". I suggest changing this to

private String responderSync = new String("SessionResponderSync");

My deadlock was probably caused because I was trying a combination of

1) synchronous writes on both the acceptor and initiator
2) both the acceptor and initiator in the same vm process
3) switching out the LinkedBlockingQueue in the SingleThreadedEventHandlingStrategy with a bounded ArrayBlockingQueue

Here is my stack trace if it helps:

SocketConnectorIoProcessor-0.0@1168 daemon, priority=5, in group 'main', status: 'WAIT'
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1,925)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:252)
at quickfix.mina.SingleThreadedEventHandlingStrategy.onMessage(SingleThreadedEventHandlingStrategy.java:47)
at quickfix.mina.initiator.InitiatorIoHandler.processMessage(InitiatorIoHandler.java:54)
at quickfix.mina.AbstractIoHandler.messageReceived(AbstractIoHandler.java:113)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:180)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.support.AbstractIoFilterChain$HeadFilter.messageReceived(AbstractIoFilterChain.java:499)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.fireMessageReceived(AbstractIoFilterChain.java:293)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:228)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:619)

SocketAcceptorIoProcessor-0.0@1170, priority=5, in group 'main', status: 'WAIT'
at sun.misc.Unsafe.park(Unsafe.java:-1)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1,925)
at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:252)
at quickfix.mina.SingleThreadedEventHandlingStrategy.onMessage(SingleThreadedEventHandlingStrategy.java:47)
at quickfix.mina.acceptor.AcceptorIoHandler.processMessage(AcceptorIoHandler.java:97)
at quickfix.mina.AbstractIoHandler.messageReceived(AbstractIoHandler.java:113)
at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:180)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)
at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)
at org.apache.mina.common.support.AbstractIoFilterChain$HeadFilter.messageReceived(AbstractIoFilterChain.java:499)
at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)
at org.apache.mina.common.support.AbstractIoFilterChain.fireMessageReceived(AbstractIoFilterChain.java:293)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.read(SocketIoProcessor.java:228)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.process(SocketIoProcessor.java:198)
at org.apache.mina.transport.socket.nio.SocketIoProcessor.access$400(SocketIoProcessor.java:45)
at org.apache.mina.transport.socket.nio.SocketIoProcessor$Worker.run(SocketIoProcessor.java:485)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:619)

SocketConnector-0@1154 daemon, priority=5, in group 'main', status: 'RUNNING'
at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(WindowsSelectorImpl.java:-1)
at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:274)
at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:256)
at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:137)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.mina.transport.socket.nio.SocketConnector$Worker.run(SocketConnector.java:378)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:619)

QFJ Message Processor@1284 daemon, priority=5, in group 'main', status: 'MONITOR'
waiting for main@1
at quickfix.Session.getResponder(Session.java:382)
at quickfix.Session.hasResponder(Session.java:393)
at quickfix.Session.next(Session.java:1,417)
at quickfix.Session.next(Session.java:853)
at quickfix.mina.SingleThreadedEventHandlingStrategy$SessionMessageEvent.processMessage(SingleThreadedEventHandlingStrategy.java:107)
at quickfix.mina.SingleThreadedEventHandlingStrategy.block(SingleThreadedEventHandlingStrategy.java:70)
at quickfix.mina.SingleThreadedEventHandlingStrategy$1.run(SingleThreadedEventHandlingStrategy.java:87)
at java.lang.Thread.run(Thread.java:619)

QFJ Message Processor@1236 daemon, priority=5, in group 'main', status: 'MONITOR'
waiting for main@1
at quickfix.Session.getResponder(Session.java:382)
at quickfix.Session.hasResponder(Session.java:393)
at quickfix.Session.next(Session.java:1,417)
at quickfix.Session.next(Session.java:853)
at quickfix.mina.SingleThreadedEventHandlingStrategy$SessionMessageEvent.processMessage(SingleThreadedEventHandlingStrategy.java:107)
at quickfix.mina.SingleThreadedEventHandlingStrategy.block(SingleThreadedEventHandlingStrategy.java:70)
at quickfix.mina.SingleThreadedEventHandlingStrategy$1.run(SingleThreadedEventHandlingStrategy.java:87)
at java.lang.Thread.run(Thread.java:619)

SocketAcceptor-0@1105, priority=5, in group 'main', status: 'RUNNING'
at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll0(WindowsSelectorImpl.java:-1)
at sun.nio.ch.WindowsSelectorImpl$SubSelector.poll(WindowsSelectorImpl.java:274)
at sun.nio.ch.WindowsSelectorImpl$SubSelector.access$400(WindowsSelectorImpl.java:256)
at sun.nio.ch.WindowsSelectorImpl.doSelect(WindowsSelectorImpl.java:137)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
at org.apache.mina.transport.socket.nio.SocketAcceptor$Worker.run(SocketAcceptor.java:220)
at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)
at java.lang.Thread.run(Thread.java:619)

QFJ Timer@1141 daemon, priority=5, in group 'main', status: 'MONITOR'
waiting for main@1
at quickfix.Session.getResponder(Session.java:382)
at quickfix.Session.hasResponder(Session.java:393)
at quickfix.Session.next(Session.java:1,417)
at quickfix.mina.SessionConnector$SessionTimerTask.run(SessionConnector.java:248)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:181)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:205)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

OrderBook-testExchange-EURUSD-work-1@733, priority=5, in group 'main', status: 'MONITOR'
waiting for main@1
at quickfix.Session.send(Session.java:1,856)
at quickfix.Session.sendRaw(Session.java:1,803)
at quickfix.Session.send(Session.java:1,850)
at quickfix.Session.sendToTarget(Session.java:504)
... my stack

main@1, priority=5, in group 'main', status: 'WAIT'
blocks QFJ Message Processor@1284
blocks QFJ Message Processor@1236
blocks QFJ Timer@1141
blocks OrderBook-testExchange-EURUSD-work-1@733
at java.lang.Object.wait(Object.java:-1)
at org.apache.mina.common.support.DefaultIoFuture.await0(DefaultIoFuture.java:121)
at org.apache.mina.common.support.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:100)
at org.apache.mina.common.support.DefaultIoFuture.join(DefaultIoFuture.java:79)
at quickfix.mina.IoSessionResponder.send(IoSessionResponder.java:50)
at quickfix.Session.send(Session.java:1,861)
at quickfix.Session.sendRaw(Session.java:1,803)
at quickfix.Session.send(Session.java:1,850)
at quickfix.Session.sendToTarget(Session.java:504)
... my stack



 Comments   
Comment by Parwinder Sekhon [ 16/Apr/09 ]

Well that didnt quite fix it for me. fyi, I am testing a client and server under extreme load using synchronous writes. And the system seems to get very slow, or queues build up inside mina hogging memory.

I have managed to get my stress test working well by making the following modification to Session.send

private boolean send(String messageString) {
getLog().onOutgoing(messageString);
Responder responder;
synchronized (responderSync)

{ responder = this.responder; }

if (responder == null)

{ getLog().onEvent("Attempt to send while not connected (message stored until connected)."); return false; }

return responder.send(messageString);
}

As you can see, I have reduced the scope of locking, so that other bits of code can call Session.getResponder without blocking for ages (with synchronous writes, responder.send may take a while).

This with a combination of switching out the LinkedBlockingQueue in the SingleThreadedEventHandlingStrategy with a bounded ArrayBlockingQueue (sized to 64 elements) is working well now.

Comment by Steve Bate [ 16/Apr/09 ]

Already resolved. See linked issue.

Comment by Parwinder Sekhon [ 16/Apr/09 ]

Any comments/thoughts on changing Session.send to

private boolean send(String messageString) {
getLog().onOutgoing(messageString);
Responder responder;
synchronized (responderSync)

{ responder = this.responder; }

if (responder == null)

{ getLog().onEvent("Attempt to send while not connected (message stored until connected)."); return false; }

return responder.send(messageString);
}

Given responder.send can block for longer than expected if synchronous writes are enabled. Without this my stress tests (firing thousands of orders at a fix engine as fast as possible) seem to lock up and concurrency is drastically reduced as responderSync is also used to get hold of the responder from other threads.

Generated at Sat Nov 23 14:11:23 UTC 2024 using JIRA 7.5.2#75007-sha1:9f5725bb824792b3230a5d8716f0c13e296a3cae.