[QFJ-959] Fix session could stuck forever in acceptor mode. Created: 07/Nov/18 Updated: 16/Nov/18 |
|
Status: | Open |
Project: | QuickFIX/J |
Component/s: | Engine |
Affects Version/s: | 2.1.0 |
Fix Version/s: | None |
Type: | Bug | Priority: | Major |
Reporter: | Dmitry | Assignee: | Unassigned |
Resolution: | Unresolved | Votes: | 0 |
Labels: | None |
Description |
Repetitive error: Multiple logons/connections for this session are not allowed Looks like session was not cleared correctly after exception. Nov 6 06:47:32 aemulator: java.io.IOException: Broken pipe Nov 6 06:47:32 aemulator: at sun.nio.ch.FileDispatcherImpl.write0(Native Method) Nov 6 06:47:32 aemulator: at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) Nov 6 06:47:32 aemulator: at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) Nov 6 06:47:32 aemulator: at sun.nio.ch.IOUtil.write(IOUtil.java:65) Nov 6 06:47:32 aemulator: at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) Nov 6 06:47:32 aemulator: at org.apache.mina.transport.socket.nio.NioProcessor.write(NioProcessor.java:384) Nov 6 06:47:32 aemulator: at org.apache.mina.transport.socket.nio.NioProcessor.write(NioProcessor.java:47) Nov 6 06:47:32 aemulator: at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.writeBuffer(AbstractPollingIoProcessor.java:1107) Nov 6 06:47:32 aemulator: at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.flushNow(AbstractPollingIoProcessor.java:994) Nov 6 06:47:32 aemulator: at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.flush(AbstractPollingIoProcessor.java:921) Nov 6 06:47:32 aemulator: at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:688) Nov 6 06:47:32 aemulator: at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64) Nov 6 06:47:32 aemulator: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) Nov 6 06:47:32 aemulator: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) Nov 6 06:47:32 aemulator: at java.lang.Thread.run(Thread.java:748) Server logs. 2018-11-07 13:06:56.731 +0000 ERROR [NioProcessor-471] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:06:57.736 +0000 ERROR [NioProcessor-472] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:06:58.740 +0000 ERROR [NioProcessor-473] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:06:59.780 +0000 ERROR [NioProcessor-474] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:07:01.446 +0000 ERROR [NioProcessor-475] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:07:03.446 +0000 ERROR [NioProcessor-476] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:07:05.446 +0000 ERROR [NioProcessor-477] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed 2018-11-07 13:07:07.446 +0000 ERROR [NioProcessor-478] quickfixj.errorEventFIX.4.4:emulator->TRADE: - Multiple logons/connections for this session are not allowed |
Comments |
Comment by Christoph John [ 07/Nov/18 ] |
Hi, is the pasted stack trace complete? I can not gather much from it. There should also be log entries from QFJ along with the IOException. Can you post them? I have never seen this behaviour before. Is it reproducible? Did it stop at some time or did it keep running like that? Thanks, |
Comment by Christoph John [ 07/Nov/18 ] |
And of course as always: would be better to first ask on the mailing list. We can clarify there if it really is a bug. |
Comment by Dmitry [ 08/Nov/18 ] |
Hi,
We do not see another exceptions in log.
I believe we found the way how to achieve this behavior. It is definitely not an expected working behavior, but as a result, quickfix could be stuck in some wrong state forever. From time to time we have an issue with our HyperV cloud. Virtual machine is frozen for ~30-40 seconds. (it is testing environment) Probably qfSession.hasResponder() keeps wrong state forever due to exception in Session.disconnect public void disconnect(String reason, boolean logError) throws IOException { try { final boolean logonReceived = state.isLogonReceived(); final boolean logonSent = state.isLogonSent(); synchronized (responderLock) { if (!hasResponder()) { if (!ENCOUNTERED_END_OF_STREAM.equals(reason)) { getLog().onEvent("Already disconnected: " + reason); } return; } final String msg = "Disconnecting: " + reason; if (logError) { getLog().onErrorEvent(msg); } else { getLog().onEvent(msg); } responder.disconnect(); // Maybe exception is here setResponder(null); } |
Comment by Christoph John [ 09/Nov/18 ] |
Hi, OK, if the whole VM froze for 30 seconds all sort of things can happen. You are right: given the log message "Multiple logons/connections for this session are not allowed" it looks like the responder was not set to null. When you say that the Exception might occur in or after responder.disconnect() is called then you should have something in your log file like "Disconnecting: " because that happens some lines above. Do you see something like that in your log file? Thanks, |
Comment by Dmitry [ 16/Nov/18 ] |
Hi, I am sorry for late response. Dmitry. |
Comment by Christoph John [ 16/Nov/18 ] |
OK, feel free to update the issue when you encounter problems again. |