[QFJ-271] StackOverflowError trying to process queued SequenceReset Created: 11/Dec/07  Updated: 23/Dec/13  Resolved: 23/Dec/13

Status: Closed
Project: QuickFIX/J
Component/s: Engine
Affects Version/s: 1.2.1
Fix Version/s: 1.3.1

Type: Bug Priority: Default
Reporter: Rhys Yarranton Assignee: Steve Bate
Resolution: Fixed Votes: 2
Labels: None

Attachments: Text File Session.java    
Issue Links:
Duplicate
is duplicated by QFJ-409 Stack overflow when processing a larg... Closed
Relates
is related to QFJ-309 There is no need to store the message... Open

 Description   

While testing out of order messages, we ran into the stack overflow at bottom. We are using 1.2.1. I believe this also affects 1.3.0.

Scenario: Counterparty sends messages through 41 OK, then sends 50 and 51 out of sequence. QF queues these and issues a resend request. Messages 42 through 49 arrive, QF processes 50 from the queue OK then cacks on 51. Message 51 is:
16:13:27,627 INFO [GATEWAYS.FWST.Incoming] FIX.4.2:BIDS->FWST: 8=FIX.4.2^A9=60^A35=4^A49=FWST^A56=BIDS^A34=51^A52=20071210-21:13:26^A123=Y^A36=51^A10=215^A

The problem appears to be that message 51 is asking QF to set the next sequence number to 51. QF obediently leaves the sequence number at 51, resulting in an infinite loop, and due to the use of recursion, a stack overflow.

The FIX 4.2 spec didn't seem too clear to me whether one should issue a reject or ignore the message. Whatever the case, I guess we have to make sure the nasty message gets removed from the buffer.

Don't know if there are other types of pathological messages. This buffer stuff is kind of tricky, and coupled with the recursion makes me nervous. e.g., you can get an overflow simply because someone sends you a massive plug of out-of-order data.

java.lang.StackOverflowError
at java.lang.Exception.<init>(Exception.java:77)
at java.lang.reflect.InvocationTargetException.<init>(InvocationTargetException.java:54)
at sun.reflect.GeneratedMethodAccessor63.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:917)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1339)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1290)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1079)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1375)
at java.io.ObjectOutputStream.defaultWriteObject(ObjectOutputStream.java:391)
at org.apache.log4j.spi.LoggingEvent.writeObject(LoggingEvent.java:407)
at sun.reflect.GeneratedMethodAccessor100.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:917)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1339)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1290)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1079)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:302)
at org.apache.log4j.net.SocketAppender.append(SocketAppender.java:232)
at com.amsl.common.util.QuietSocketAppender.append(QuietSocketAppender.java:79)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
at org.apache.log4j.Category.callAppenders(Category.java:203)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.warn(Category.java:1008)
at com.amsl.bids3.fixgw.session.FIXGateway.validateMessageForLatency(FIXGateway.java:777)
at com.amsl.bids3.fixgw.session.FIXGateway.fromAdmin(FIXGateway.java:671)
at quickfix.Session.fromCallback(Session.java:1227)
at quickfix.Session.verify(Session.java:1190)
at quickfix.Session.nextSequenceReset(Session.java:939)
at quickfix.Session.next(Session.java:688)
at quickfix.Session.next(Session.java:1487)
at quickfix.Session.nextQueued(Session.java:1477)
at quickfix.Session.nextQueued(Session.java:1461)
at quickfix.Session.next(Session.java:750)
at quickfix.Session.next(Session.java:1487)



 Comments   
Comment by Rhys Yarranton [ 12/Dec/07 ]

On further digging, it looks like there could also be a problem when NewSeqNo < MsgSeqNum on a queued message. QF issues a reject, but leaves the message in the queue and does not increment the sequence number.

So, perhaps an underlying problem is that SessionState.dequeue() is using Map.get() instead of Map.remove(). The queue will never have anything removed until disconnect when SessionState.clearQueue() is called, which is itself a memory leak. If remove() were used, the case NewSeqNo == MsgSeqNum would result in essentially a no-op.

Comment by Biswaranjan Das [ 18/Jan/13 ]

Looks this requires some refactoring. This stack overflow is a genuine problem. Following refactoring should fix it without any business logic impact. I am referring it based on 1.4 code base.


    /**
     * (Internal use only)
     */
    public void next(Message message) throws FieldNotFound, RejectLogon, IncorrectDataFormat,
            IncorrectTagValue, UnsupportedMessageType, IOException, InvalidMessage {
    	
    	if(!processMessage(message)){
    		return;
    	}

        nextQueued();
        if (isLoggedOn()) {
            next();
        }
    }
    
    private boolean processMessage(Message message) throws FieldNotFound, RejectLogon, IncorrectDataFormat,
    IncorrectTagValue, UnsupportedMessageType, IOException, InvalidMessage {

        if (!checkSessionTime()) {
            reset();
            return false;
        }

        Header header = message.getHeader();
        String msgType = header.getString(MsgType.FIELD);

        try {

            String beginString = header.getString(BeginString.FIELD);

            if (!beginString.equals(sessionID.getBeginString())) {
                throw new UnsupportedVersion();
            }

            if (msgType.equals(MsgType.LOGON)) {
                if (sessionID.isFIXT()) {
                    targetDefaultApplVerID.set(new ApplVerID(message.getString(DefaultApplVerID.FIELD)));

                } else {
                    targetDefaultApplVerID.set(MessageUtils.toApplVerID(beginString));
                }
            }

            if (dataDictionaryProvider != null) {
                DataDictionary sessionDataDictionary = dataDictionaryProvider
                        .getSessionDataDictionary(beginString);
    
                String customApplVerID = header.isSetField(CstmApplVerID.FIELD) ? header
                        .getString(CstmApplVerID.FIELD) : null;
    
                ApplVerID applVerID = header.isSetField(ApplVerID.FIELD) ? new ApplVerID(header
                        .getString(ApplVerID.FIELD)) : targetDefaultApplVerID.get();
    
                DataDictionary applicationDataDictionary = isAdminMessage(msgType)
                        ? dataDictionaryProvider.getSessionDataDictionary(beginString)
                        : dataDictionaryProvider.getApplicationDataDictionary(applVerID,
                                customApplVerID);
                
                DataDictionary.validate(message, sessionDataDictionary, applicationDataDictionary);
            }
            
            if (msgType.equals(MsgType.LOGON)) {
                nextLogon(message);
            } else if (msgType.equals(MsgType.HEARTBEAT)) {
                nextHeartBeat(message);
            } else if (msgType.equals(MsgType.TEST_REQUEST)) {
                nextTestRequest(message);
            } else if (msgType.equals(MsgType.SEQUENCE_RESET)) {
                nextSequenceReset(message);
            } else if (msgType.equals(MsgType.LOGOUT)) {
                nextLogout(message);
            } else if (msgType.equals(MsgType.RESEND_REQUEST)) {
                nextResendRequest(message);
            } else if (msgType.equals(MsgType.REJECT)) {
                nextReject(message);
            } else {
                if (!verify(message)) {
                    return false;
                }
                state.incrNextTargetMsgSeqNum();
            }
        } catch (FieldException e) {
            generateReject(message, e.getSessionRejectReason(), e.getField());
        } catch (FieldNotFound e) {
            if (sessionID.getBeginString().compareTo(FixVersions.BEGINSTRING_FIX42) >= 0
                    && message.isApp()) {
                generateBusinessReject(message,
                        BusinessRejectReason.CONDITIONALLY_REQUIRED_FIELD_MISSING, e.field);
            } else {
                if (msgType.equals(MsgType.LOGON)) {
                    getLog().onEvent("Required field missing from logon");
                    disconnect();
                } else {
                    generateReject(message, SessionRejectReason.REQUIRED_TAG_MISSING, e.field);
                }
            }
        } catch (IncorrectDataFormat e) {
            generateReject(message, SessionRejectReason.INCORRECT_DATA_FORMAT_FOR_VALUE, e.field);
        } catch (IncorrectTagValue e) {
            generateReject(message, SessionRejectReason.VALUE_IS_INCORRECT, e.field);
        } catch (InvalidMessage e) {
            getLog().onEvent("Skipping invalid message: " + e.getMessage());
        } catch (RejectLogon e) {
            String rejectMessage = e.getMessage() != null ? (": " + e.getMessage()) : "";
            getLog().onEvent("Logon rejected" + rejectMessage);
            if (e.isLogoutBeforeDisconnect()) {
                generateLogout(e.getMessage());
            }
            state.incrNextTargetMsgSeqNum();
            disconnect();
        } catch (UnsupportedMessageType e) {
            if (sessionID.getBeginString().compareTo(FixVersions.BEGINSTRING_FIX42) >= 0) {
                generateBusinessReject(message, BusinessRejectReason.UNSUPPORTED_MESSAGE_TYPE, 0);
            } else {
                generateReject(message, "Unsupported message type");
            }
        } catch (UnsupportedVersion e) {
            if (msgType.equals(MsgType.LOGOUT)) {
                nextLogout(message);
            } else {
                generateLogout("Incorrect BeginString");
                state.incrNextTargetMsgSeqNum();
                // 1d_InvalidLogonWrongBeginString.def appears to require
                // a disconnect although the C++ didn't appear to be doing it.
                // ???
                disconnect();
            }
        } catch (IOException e) {
            LogUtil.logThrowable(sessionID, "error processing message", e);
        }
        return true;
    }

    private void next(String msg) throws InvalidMessage, FieldNotFound, RejectLogon,
            IncorrectDataFormat, IncorrectTagValue, UnsupportedMessageType, IOException {
        try {
            processMessage(parseMessage(msg));
        } catch (InvalidMessage e) {
            String message = e.getMessage();
            getLog().onEvent(message);
            if (MsgType.LOGON.equals(MessageUtils.getMessageType(msg))) {
                getLog().onEvent("Logon message is not valid");
                disconnect();
            }
            throw e;
        }
    }
Comment by Biswaranjan Das [ 18/Jan/13 ]

Patch on 1.4 Code Base.

Comment by Jörg Thönnes [ 15/Mar/13 ]

The current version is QF/J 1.5.3 so why providing patches based on 1.4?

If the issue still applies to the current version, please comment accordingly.

Otherwise, I would like to close this as "Cannot Reproduce".

Comment by Jörg Thönnes [ 15/Mar/13 ]

Sorry, closed this by accident. But actually I want to take your findings honestly and wait for a test with the current version.

Comment by Christoph John [ 19/Dec/13 ]

Will investigate if this still is a problem. I am pretty sure that map.remove is used in current versions of QFJ.

Comment by Christoph John [ 23/Dec/13 ]

Turned out that the map.remove is present in the code since QFJ 1.3.1 and QFJ-271 should hence be no problem for current versions. Added unit test now to ensure expected behaviour.

http://sourceforge.net/p/quickfixj/code/1130/

Generated at Sat Nov 23 10:45:47 UTC 2024 using JIRA 7.5.2#75007-sha1:9f5725bb824792b3230a5d8716f0c13e296a3cae.