Description
This has been mentioned on the mailing list with no feedback so ...
Ref: FIX Session protocol Version (FIXT) 1.1. Errata
In a FIX logon, tags 1137, 1147 and 1408 can set defaults for use in future FIX messages, e.g. DefaultApplVerID. The behaviour I was seeing was that
preprocessing of subsequent messages could occur before the logon was fully processed (a race condition), and occasionally I would
get this error message even though I was replaying identical messages (timing/thread delay being the only variable):
"12-Apr-2011 15:06:51 quickfix.mina.AbstractIoHandler messageReceived
SEVERE: Invalid message: Can't determine ApplVerID for message"
In the debugger it was obvious that the thread setting ApplVerID was free to "race" the thread preprocessing the subsequent message, i.e.
doing a get on ApplVerID. I have only reproduced this in test scenarios (though I can do it every time), as I think the usual delays in comms etc.
would be enough for the setter to always win the race. Even if this does not happen in the real world, it is still a problem in debugging and unit testing; I was
actually trying to track down another of my bugs, but this one made that very difficult. I believe it is only safe to test the application version once any previous logon has completed.
The ApplVerID is set in next(Message message) in Session.java:
    if (msgType.equals(MsgType.LOGON)) {
        if (sessionID.isFIXT()) {
            // (the statement for the FIXT case is missing from this report)
        } else {
            targetDefaultApplVerID.set(MessageUtils.toApplVerID(beginString));
        }
    }
If you put a breakpoint on the above, you will see that the next incoming message (on a different thread) can still call
ApplVerID getApplVerID(Session session, String messageString) in MessageUtils, which logs that error (the set should have occurred before the get). In my case the missing message (it failed validation) killed the session.
Just running the client in debug in Eclipse against a free-running server was enough to cause a 50% failure rate; with the breakpoint it was 100%, obviously.
getApplVerID fails in my case because my workflow requires the default ApplVerID to be known (it would be set by the logon).
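The losing interleaving can be imitated without QuickFIX/J at all. The sketch below is not library code: the class ApplVerIdRaceSketch and its two methods are hypothetical stand-ins for the logon setter and the preprocessing getter, and "7" is just a sample ApplVerID value. It replays the "get before set" order on a single thread so the failure is deterministic.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the session state discussed above; not QuickFIX/J code.
public class ApplVerIdRaceSketch {
    // Plays the role of the session's targetDefaultApplVerID holder.
    private final AtomicReference<String> targetDefaultApplVerID = new AtomicReference<>();

    /** What the logon-processing thread does once it reaches the setter. */
    public void processLogon(String defaultApplVerID) {
        targetDefaultApplVerID.set(defaultApplVerID);
    }

    /** What the preprocessing thread does for the next incoming message. */
    public String preprocessNextMessage() {
        String applVerID = targetDefaultApplVerID.get();
        if (applVerID == null) {
            // Same wording as the logged error, raised here as an exception for the demo.
            throw new IllegalStateException("Can't determine ApplVerID for message");
        }
        return applVerID;
    }

    public static void main(String[] args) {
        ApplVerIdRaceSketch session = new ApplVerIdRaceSketch();
        try {
            // The get runs before the set: the interleaving the breakpoint forces.
            session.preprocessNextMessage();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        session.processLogon("7"); // sample value only
        System.out.println("accepted with ApplVerID " + session.preprocessNextMessage());
    }
}
```

With real threads the same two calls simply happen in whichever order the scheduler picks, which is why a breakpoint on the setter makes the failure certain.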
The two FIX messages are obviously not being executed in parallel; it is just the up-front initial verification of the second message that
causes the issue, i.e. we are validating too early.
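One possible way to avoid validating too early is to make the early check wait until the logon has been fully processed. The following is only a sketch of that idea under my own assumptions, not a proposed patch: LogonGateSketch, completeLogon and awaitApplVerID are hypothetical names, and a real fix would also have to reset the gate on logout or reconnect, which a one-shot CountDownLatch cannot do.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical gate that defers the ApplVerID lookup until logon processing is done.
public class LogonGateSketch {
    private final AtomicReference<String> targetDefaultApplVerID = new AtomicReference<>();
    private final CountDownLatch logonProcessed = new CountDownLatch(1);

    /** Called by the thread that processes the logon, after it has set the default. */
    public void completeLogon(String defaultApplVerID) {
        targetDefaultApplVerID.set(defaultApplVerID);
        logonProcessed.countDown(); // releases any waiting preprocessing thread
    }

    /** Called before validating the next message; returns null on timeout or interrupt. */
    public String awaitApplVerID(long timeoutMillis) {
        try {
            if (!logonProcessed.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return null; // logon still not processed
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return targetDefaultApplVerID.get();
    }

    public static void main(String[] args) throws InterruptedException {
        LogonGateSketch gate = new LogonGateSketch();
        Thread logonThread = new Thread(() -> {
            try {
                Thread.sleep(100); // simulate slow logon processing (e.g. a breakpoint)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            gate.completeLogon("7"); // sample value only
        });
        logonThread.start();
        // The "second message" now waits instead of failing validation.
        System.out.println("ApplVerID = " + gate.awaitApplVerID(5000));
        logonThread.join();
    }
}
```

Blocking the reader thread is the simplest thing to show here; queueing the second message until the logon completes would be an alternative that avoids tying up an I/O thread.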
Issue Links
- relates to QFJ-721 non-FIXT sessions: NPE accessing ApplVerID if previous Logon was not completely processed (Resolved)