[QFJ-334] New messages are interleaved with old ones during a counterparty replay request Created: 14/Aug/08 Updated: 15/Nov/12 Resolved: 09/Sep/08 |
Status: | Closed |
Project: | QuickFIX/J |
Component/s: | Engine |
Affects Version/s: | 1.3.1 |
Fix Version/s: | None |
Type: | Bug | Priority: | Major |
Reporter: | JamesM | Assignee: | Unassigned |
Resolution: | Won't Fix | Votes: | 0 |
Labels: | None |
Environment: |
QuickFIX 1.3.1, Java 1.5.0_12, Linux RHEL 4.4.2 |
Description |
We are testing a case where a counterparty disconnects from our acceptor while we are filling their execution requests. When they reconnect, we have a large number of messages for them (i.e., our sender sequence number is larger than what they expected), and they initiate a resend request, as expected. What surprised us was that if we are still sending new messages, those messages are interleaved with the resends. This causes our FIX client to drop the connection because it receives an out-of-sequence message during the replay.

From our log files (we are using a custom MessageStore implementation):

11:58:41,363 INFO Store - Get request from 20315 to 26359

From the event log:

Thu Aug 14 11:58:11 JST 2008 Disconnecting

However, the sequence of messages the counterparty received was: 21035, 26383, 21036. The counterparty disconnected immediately upon receiving 26383 because it was out of sequence; they were expecting 20316.

Doesn't this seem strange? I know that in the inbound direction (i.e., if we, the acceptor, were the one making the resend request) QuickFIX would queue 26383 and we would receive it after the resend is complete. Or is this just something that all FIX clients have to cope with: receiving new messages interleaved with a resend, and queuing them until they are ready to process them? |
Comments |
Comment by JamesM [ 15/Aug/08 ] |
I was able to prevent this problem by adding an additional event to SessionStateListener so that my application is notified when a resend starts and when it finishes. Using this, I was able to lock out the 'send' method of our application so that it doesn't attempt to send messages while a resend is in progress. There appears to be a race condition on the socket: new messages can be interleaved between the socket write() calls made by the resend. The window for this race is very small, but it can be hit if the message rate is high enough while the counterparty's resend request is being processed. |
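The lock-out described in this comment could look roughly like the sketch below. It is not the reporter's code and not part of the QuickFIX/J API: `ResendGate`, `onResendStarted`, `onResendFinished`, `send` and `trySend` are hypothetical names, and the sketch assumes the engine (or a patched listener, as described above) calls the two `onResend...` hooks around the replay.

```java
/**
 * Hypothetical gate that keeps application sends out of the way while a
 * counterparty's resend request is being served. Not a QuickFIX/J class.
 */
final class ResendGate {
    // Guarded by this object's monitor.
    private boolean resendInProgress = false;

    /** Assumed to be called by a custom listener when the replay begins. */
    synchronized void onResendStarted() {
        resendInProgress = true;
    }

    /** Assumed to be called by a custom listener when the replay ends. */
    synchronized void onResendFinished() {
        resendInProgress = false;
        notifyAll(); // wake any senders blocked in send()
    }

    /** Blocks until no resend is in progress, then runs the send action. */
    synchronized void send(Runnable sendAction) {
        while (resendInProgress) {
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for resend", e);
            }
        }
        // The monitor is still held here, so a resend cannot start mid-send.
        sendAction.run();
    }

    /** Non-blocking variant: returns false if a resend is in progress. */
    synchronized boolean trySend(Runnable sendAction) {
        if (resendInProgress) {
            return false;
        }
        sendAction.run();
        return true;
    }
}
```

The application would wrap its real send call in `gate.send(...)`. Holding the monitor for the duration of the send serializes application sends, which is a deliberate trade-off in this sketch: it favors ordering over throughput.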
Comment by Steve Bate [ 15/Aug/08 ] |
That's an interesting solution. Like you said, this is something the receiving FIX engine would usually handle. A sequence number that's too high should not cause a disconnect. If that high sequence number arrives during resend processing, then it would usually be queued until the gap is filled. A FIX engine doesn't necessarily require that the messages are received in order, but it will guarantee that it won't deliver the messages to the application out of order. |
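The receiver-side behavior described in this comment (hold a too-high sequence number until the gap is filled, then deliver in order) can be illustrated with a minimal sketch. This is not QuickFIX/J's implementation; `InOrderDelivery` and `onMessage` are hypothetical names, payloads are plain strings, and issuing the resend request itself is left out. Sequence numbers below the expected one are simply ignored here, which a real engine would not do without further checks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/**
 * Hypothetical receiver-side buffer: messages may arrive out of order,
 * but they are handed to the application strictly in sequence.
 */
final class InOrderDelivery {
    // Messages received ahead of the gap, keyed by sequence number.
    private final TreeMap<Integer, String> pending = new TreeMap<>();
    private int nextExpected;

    InOrderDelivery(int firstExpected) {
        this.nextExpected = firstExpected;
    }

    /** Returns the messages that became deliverable, in order (possibly none). */
    List<String> onMessage(int seqNum, String payload) {
        List<String> deliverable = new ArrayList<>();
        if (seqNum < nextExpected) {
            return deliverable; // already delivered; ignored in this sketch
        }
        pending.put(seqNum, payload);
        // Drain the queue for as long as the next expected number is present.
        while (!pending.isEmpty() && pending.firstKey() == nextExpected) {
            deliverable.add(pending.pollFirstEntry().getValue());
            nextExpected++;
        }
        return deliverable;
    }
}
```

With `new InOrderDelivery(5)`, a message numbered 9 that arrives first is held back; once 5 through 8 have arrived, the call that supplies 8 returns both 8 and the queued 9.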