Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Won't Fix
- Affects Version/s: 1.3.1
- Fix Version/s: None
- Component/s: Engine
- Labels: None
- Environment: QuickFIX 1.3.1, Java 1.5.0_12, Linux RHEL 4.4.2
Description
We are testing a case where a counterparty disconnects from our acceptor while we are filling their execution requests. When they reconnect, we have a large number of messages waiting for them (i.e., our sender sequence number is larger than they expect), and they issue a resend request, as expected.
What surprised us is that if we are still sending new messages, those messages are interleaved with the resends. This causes our FIX client to drop the connection because it receives an out-of-sequence message during the replay.
From our log files (we are using a custom MessageStore implementation):
11:58:41,363 INFO Store - Get request from 20315 to 26359
11:58:41,501 INFO Store - set: sequence 26383 :: 8=FIX.4.1 [rest removed]
From the event log:
Thu Aug 14 11:58:11 JST 2008 Disconnecting
Thu Aug 14 11:58:40 JST 2008 Accepting session FIX.4.1:JPN10000XXXX->JPN123456789 from /192.168.1.2:33299
Thu Aug 14 11:58:40 JST 2008 Acceptor heartbeat set to 30 seconds
Thu Aug 14 11:58:40 JST 2008 Refreshing message/state store at logon
Thu Aug 14 11:58:40 JST 2008 Received logon request
Thu Aug 14 11:58:40 JST 2008 Responding to logon request
Thu Aug 14 11:58:41 JST 2008 Received ResendRequest FROM: 20315 TO: 999999
Thu Aug 14 11:58:41 JST 2008 Resending Message: 20315
Thu Aug 14 11:58:41 JST 2008 Resending Message: 20316
Thu Aug 14 11:58:41 JST 2008 Resending Message: 20317
Thu Aug 14 11:58:41 JST 2008 Resending Message: 20318
Thu Aug 14 11:58:41 JST 2008 Resending Message: 20319
However, the sequence of messages the counterparty received was: 20315, 26383, 20316. The counterparty disconnected immediately upon receiving 26383 because it was out of sequence; they were expecting 20316.
Doesn't this seem strange? I know that in the inbound direction (i.e., if we, the acceptor, were the one making the resend request), QuickFIX would queue 26383 and deliver it after the resend is complete. Or is this simply something that all FIX clients have to cope with: receiving new messages interleaved with a resend, and queuing them until they are ready to process them?
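To make the second alternative concrete, here is a minimal sketch of the receiver-side queuing we have in mind. It is not QuickFIX code: the class name InboundSequencer and its methods are hypothetical, and it assumes messages below the expected sequence number can simply be ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Hypothetical receiver-side sequencer; illustrative only, not part of the QuickFIX API. */
class InboundSequencer {
    private int nextExpected;
    // Messages that arrived ahead of sequence, keyed by sequence number.
    private final TreeMap<Integer, String> pending = new TreeMap<Integer, String>();
    // Messages handed to the application, in sequence order.
    private final List<String> delivered = new ArrayList<String>();

    InboundSequencer(int nextExpected) {
        this.nextExpected = nextExpected;
    }

    /** Returns false if seqNum is below the expected number (ignored in this sketch). */
    boolean onMessage(int seqNum, String body) {
        if (seqNum < nextExpected) {
            return false;
        }
        pending.put(seqNum, body);
        // Deliver every message that is now contiguous with the expected number.
        while (!pending.isEmpty() && pending.firstKey() == nextExpected) {
            delivered.add(pending.pollFirstEntry().getValue());
            nextExpected++;
        }
        return true;
    }

    List<String> delivered() { return delivered; }
    int nextExpected() { return nextExpected; }
    int pendingCount() { return pending.size(); }
}
```

Fed the arrival order from this report (20315, 26383, 20316), such a receiver would deliver 20315 and 20316 immediately and hold 26383 until the gap up to it has been replayed, rather than disconnecting.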