[QFJ-357] Logout message is sent before Logon on each Session start Created: 23/Oct/08 Updated: 15/Nov/12 Resolved: 09/Dec/11 |
|
Status: | Closed |
Project: | QuickFIX/J |
Component/s: | Networking |
Affects Version/s: | 1.3.3 |
Fix Version/s: | 1.5.2 |
Type: | Bug | Priority: | Major |
Reporter: | Phil Hong | Assignee: | Christoph John |
Resolution: | Fixed | Votes: | 10 |
Labels: | None | ||
Environment: |
Windows |
Attachments: | excess_logout.patch | ||||||||||||
Issue Links: |
|
Description |
On the StartTime of session new Session is created and suppried to Session as Responder. This is caused by the management of session state in Session class. It should update session state (session creation time) when new Session is suppried. /**
We could solve this problem by this but, I'm not sure what' there side effect... |
Comments |
Comment by Phil Hong [ 07/Nov/08 ] |
We've found there are a serious problem with above code. |
Comment by Phil Hong [ 10/Nov/08 ] |
There are two session time check function in Session class. |
Comment by zuolin [ 22/Jan/10 ] |
public void reset() throws IOException { if (hasResponder()) { } Phil Hong ,Whether this can be solved |
Comment by zuolin [ 22/Jan/10 ] |
if (hasResponder()) { } |
Comment by qiuyanming [ 22/Jan/10 ] |
public void reset() throws IOException { if (hasResponder()) { } |
Comment by Beat Glattfelder [ 17/Nov/10 ] |
As requested by Grant Birchmeier |
Comment by Grant Birchmeier [ 17/Nov/10 ] |
Not that I'm any authority here. He posted the patch to the mailing list, and I suggested that here is a better place. Here's Beat's email list message that describes his fix: To further complicate the issue, such as the check whether the session schedule has cycled (isSameSession) that is checking the session start time. Since SessionState.reset() does not update the starttime if the sequence numbers are both set to 1, the session starttime would never be updated. The way the logon eventually succeeded before, is that disconnecting following the first logout created a new session object, which then was within the schedule. I have removed that check as well, and checked the test suite is still running successfully and have attached the patch. Session.reset(): |
Comment by Grant Birchmeier [ 17/Nov/10 ] |
And here's a direct link to Beat's mail list message (click on the HTML attachment): |
Comment by Christoph John [ 10/May/11 ] |
What's really troubling me is that the Logout message is sent but the sequence number is not incremented. This causes a problem with some of our counterparties since they expect message sequence number n+1 but we still send n, although the initial Logout already had seqnum n. A client of us has a rather old FIX-Engine (non-QuickFIX) and due to this bug the sequence numbers do not synchronize correctly until we manually restart the session. I hope that this fix will be incorporated into QuickFIX/J soon. |
Comment by Jörg Thönnes [ 20/May/11 ] |
Are the changes done for bug
related to this one? |
Comment by Eric Deshayes [ 23/May/11 ] |
Joerg, |
Comment by Jörg Thönnes [ 31/May/11 ] |
We need a solution for QF/J 1.5.1. |
Comment by Jörg Thönnes [ 07/Aug/11 ] |
Moved to a later version since we ran out of time. |
Comment by Christian Plätzinger [ 16/Aug/11 ] |
Jörg is currently on holiday until September and I am afraid he will have to time to solve it once he is back. Can you please assign the issue to someone else? |
Comment by Paul McCabe [ 10/Oct/11 ] |
Just an interesting note, Beat's Previously, if a logon was received outside session times, then a logout response would be generated. With the patch, the logon outside of session times is ignored. This can cause issues when connecting to another FIX engine that has no timeout and never sends another logon request (I know, it sounds ridiculous but they are out there). I think this should be taken into account when this issue is picked up again, as I'm sure I'm not the only person who relies on this behaviour being maintained. |
Comment by RK [ 17/Oct/11 ] |
Has someone found a workaround for this issue? (the patch above doesn't seem to work) |
Comment by Christoph John [ 17/Oct/11 ] |
Hi, I'm not sure (and currently don't have time to test ), but IIRC the bug is also triggered by the session time. Could you test to supply "00:00:00" as both StartTime and EndTime. Does this help? Cheers, |
Comment by RK [ 17/Oct/11 ] |
Exactly, we experience the bug only on session start time when the session is created. ("quickfix.mina.initiator.InitiatorIoHandler sessionCreated") If I set both the StartTime and EndTime to "00:00:00" does the session reset (sequence numbers to 1:1) happen at all? We do need a reset once in 24hrs. |
Comment by Christoph John [ 18/Oct/11 ] |
IMHO the reset will be done (as long as there is no connection up at that time). Otherwise just stop/start the process at 00:00. But of course you should first test if setting "00:00:00" as both StartTime and EndTime helps. |
Comment by Christoph John [ 27/Oct/11 ] |
Guys, I have found an ugly workaround. When I was at a customer site, I wondered why they don't get hit by this bug. It seemed it was because they already load messages into the store right after gateway start (current order status information for the connecting client to be published at Logon). After some tests I could verify that it works in most cases. But unfortunately it does not work if the FIX process is started outside the StartTime/EndTime range. QuickFIX will then reset the state again on first logon (leading to Logout to be sent as first message). I have not tested the case where the process is never stopped since our processes are restarted each night for EOD jobs. So, all I did was to put some code into the onCreate() callback to increment the seqnum by 1 if it is still 1. Surprisingly, QuickFIX/J still sends out the first Logon attempt with seqNum = 1, although I incremented the seqNum when the quickfix.Session gets created. However, I have not the time to follow this up right now. Good luck, |
Comment by Mate Varga [ 01/Nov/11 ] |
This issue is of paramount importance I think – I recommend raising it's priority to major. |
Comment by AE [ 14/Nov/11 ] |
This bug is listed as since 1.3.3. Can anyone verify this is the case? We upgraded from 1.3.3 to 1.5 and started seeing this bug, so went back to 1.3.3 and stopped seeing it. Is it possible this bug is more prevalent in 1.5? Thanks |
Comment by Christoph John [ 25/Nov/11 ] |
IIRC it showed in 1.3.3 only if you used a custom StartTime and initiated the connection. |
Comment by Christoph John [ 08/Dec/11 ] |
Hi, I picked up Beat's changes on the SessionState.reset() method. But this did not help in the case where a Session is created and immediately (inside one second) initiates or accepts a connection (this is happening when the Session is started within the session time). So I made sure that the first check in checkSessionTime() (which is called from the Session constructor) does not update the lastSessionTimeCheck. Hence on the next call (from Session.next()) the lastSessionTimeResult is evaluated again, returning true and in turn leading to the acceptance of the FIX connection. I also made sure that outside session time a Logout is sent (see Paul's comment). The behaviour seemed to have changed in the meantime. When I tried to connect outside session time, I received a Logon immediately followed by Logout. I have changed Session.nextLogon() to reject the Logon outside of the session time. Will also provide UnitTests. UnitTests and AcceptanceTests were successful. Cheers, |