Details
Description
While performing a large session-layer recovery operation (around 1 million messages, requested in chunks of 100), the QuickFIX/J engine got stuck because it ignored a valid GapFill message:
18-07-13 08:56:50.673|QFJ Message Processor|INFO|quickfix: Received SequenceReset FROM: 1000102 TO: 1000103
18-07-13 08:56:50.673|QFJ Message Processor|ERROR|quickfix: MsgSeqNum too high, expecting 1000102 but received 1000103: 8=FIX.4.29=45135=834=100010343=Y49=ABFX52=20130718-08:56:50.66356=TEST_CLIENT122=20130718-00:09:001=B2BUSER60.STP6=0.816711=B2BCLIENT60-1374106069-1000102:014=4000015=EUR17=1374106069-100010220=021=131=0.816732=4000037=B2BCLIENT60-1374106069-1000102:038=4000039=240=D54=155=EUR/GBP60=20130718-00:08:54.57564=20140221109=B2BCLIENT60150=2151=0167=FOR194=0.82195=-0.003285544=GBP5549=AcmecoFXTest6054=32668.00999999=137413781066310=200
Analysis of the Session.nextSequenceReset() method shows that an incoming SequenceReset-GapFill message is handled differently when a non-zero ResendRequestChunkSize is used. In this particular scenario the target sequence number was not advanced from 1000102 to newSequence (1000103) in Session.nextSequenceReset(), because:
1) range[2] was > 0 (which means using non-zero ResendRequestChunkSize)
AND
2) newSequence (1000103) was NOT >= range[1] (1084290)
AND
3) newSequence (1000103) was NOT >= range[2] (1000200)
[All values in the log below.]
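The three conditions above can be checked with the reported numbers. The following is only a minimal sketch, not QuickFIX/J code: the class and method names are hypothetical, and the meaning of range[1] (end of the whole resend range) and range[2] (end of the current chunk) is an assumption inferred from how the values are used.

```java
// Hypothetical stand-alone check; not part of QuickFIX/J.
class GapFillConditions {

    // Returns true if any branch of the reviewed logic would touch the
    // target sequence number for the given values.
    // Assumption: rangeEnd corresponds to range[1], chunkEnd to range[2].
    static boolean targetUpdated(int newSequence, int rangeEnd, int chunkEnd) {
        if (chunkEnd > 0) {
            // Chunked resend in progress: only these two comparisons exist.
            return newSequence >= rangeEnd || newSequence >= chunkEnd;
        }
        // No chunking: the target is always set to newSequence.
        return true;
    }

    public static void main(String[] args) {
        // Values from the report: newSequence, range[1], range[2].
        System.out.println(targetUpdated(1000103, 1084290, 1000200));
    }
}
```

With the reported values this prints false, i.e. neither comparison holds and the target sequence number is left at 1000102.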
It looks like the following logic needs to be reviewed because in the discussed scenario the newSequence value was completely ignored:
if (range[2] > 0) {
    if (newSequence >= range[1])
    else if (newSequence >= range[2]) {
        state.setNextTargetMsgSeqNum(newSequence + 1);
        final String beginString = sequenceReset.getHeader().getString(BeginString.FIELD);
        sendResendRequest(beginString, range[1] + 1, newSequence + 1, range[1]);
    }
} else {
    state.setNextTargetMsgSeqNum(newSequence);
}

The log is attached.
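To make the stuck state easier to reproduce outside the engine, here is a rough stand-alone model of the branch structure quoted above. It is a sketch under several assumptions, not the actual Session code: the class, its field, and the withFallback flag are hypothetical, the body of the first branch is not shown in this report and is assumed to set the target to newSequence, and the resend request is only counted rather than sent. The extra fallback branch is just one possible direction for the review, not a confirmed fix.

```java
// Hypothetical model of the reviewed logic; not part of QuickFIX/J.
class SequenceResetSketch {
    int nextTargetMsgSeqNum;
    int resendRequestsSent = 0;

    SequenceResetSketch(int nextTargetMsgSeqNum) {
        this.nextTargetMsgSeqNum = nextTargetMsgSeqNum;
    }

    // range mirrors the int[] used in the report: range[1] and range[2]
    // are compared against newSequence; range[0] is unused here.
    void applySequenceReset(int newSequence, int[] range, boolean withFallback) {
        if (range[2] > 0) {
            if (newSequence >= range[1]) {
                // Assumption: this body is missing from the quoted snippet.
                nextTargetMsgSeqNum = newSequence;
            } else if (newSequence >= range[2]) {
                nextTargetMsgSeqNum = newSequence + 1;
                resendRequestsSent++; // stands in for sendResendRequest(...)
            } else if (withFallback) {
                // Hypothetical addition: a gap fill that ends inside the
                // current chunk still advances the target.
                nextTargetMsgSeqNum = newSequence;
            }
        } else {
            nextTargetMsgSeqNum = newSequence;
        }
    }

    public static void main(String[] args) {
        int[] range = {0, 1084290, 1000200}; // range[0] is a placeholder

        SequenceResetSketch current = new SequenceResetSketch(1000102);
        current.applySequenceReset(1000103, range, false);
        System.out.println("as quoted:     " + current.nextTargetMsgSeqNum);

        SequenceResetSketch patched = new SequenceResetSketch(1000102);
        patched.applySequenceReset(1000103, range, true);
        System.out.println("with fallback: " + patched.nextTargetMsgSeqNum);
    }
}
```

In this model the quoted logic leaves the target at 1000102, which matches the "MsgSeqNum too high" error in the log, while the hypothetical fallback advances it to 1000103. Whether a plain fallback is safe for all chunked-resend cases would still need to be checked against the real Session implementation.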