QuickFIX/J User Manual

Simple Failover for Socket-based Acceptors.

When using a MessageStore that supports shared data (FileStore, JdbcStore and SleepycatStore), the Session can be configured to refresh the store information upon logon by setting RefreshOnLogon=Y. You would typically run two acceptor processes using a shared message store. For any specific session, one process would be the active acceptor and the other would be the standby. If one acceptor process dies, the client (assuming it has been configured with failover addresses) will attempt to log on to the other acceptor. When it does, the message store for that session will be refreshed and the session should continue normally. I've tried this with Banzai and two Order Executors and it appears to work well.

Example acceptor configuration with failover support:

[default]
FileStorePath=target/data/server
DataDictionary=etc/FIX42.xml
BeginString=FIX.4.2
ConnectionType=acceptor
StartTime=00:00:00
EndTime=00:00:00
HeartBtInt=30
SocketAcceptPort=9877
RefreshOnLogon=Y

[session]
SenderCompID=EXEC
TargetCompID=BANZAI

Note that this approach doesn't require that each acceptor is globally in an active or standby role. Since that role is session-specific, both acceptors could be actively supporting sessions, and when one node dies all the sessions on that node will automatically be switched to the other node. This is implemented simply by specifying the order of the failover addresses in each client's QuickFIX/J settings file so that some clients initially connect to one node and others connect to the other node. One weakness with this scenario is that there's no way to redistribute the sessions across the two nodes once the dead node is restarted. That can be added as a future capability.
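
The client side of this arrangement might look like the following sketch. It assumes the two acceptor nodes are reachable at the hypothetical host names node-a.example.com and node-b.example.com, both listening on port 9877 as in the acceptor example above. The client's FileStorePath and the ReconnectInterval value are also illustrative assumptions. The primary address is given with SocketConnectHost and SocketConnectPort, and the failover address with the numbered SocketConnectHost1 and SocketConnectPort1 settings.

```ini
# Initiator (client) settings: a sketch, not a tested configuration.
# Host names and the client's FileStorePath are hypothetical.
[default]
FileStorePath=target/data/client
DataDictionary=etc/FIX42.xml
BeginString=FIX.4.2
ConnectionType=initiator
StartTime=00:00:00
EndTime=00:00:00
HeartBtInt=30
ReconnectInterval=5

[session]
SenderCompID=BANZAI
TargetCompID=EXEC
# Primary address: this client connects to node A first.
SocketConnectHost=node-a.example.com
SocketConnectPort=9877
# Failover address: tried when the connection to node A is lost.
SocketConnectHost1=node-b.example.com
SocketConnectPort1=9877
```

A second client that should start on the other node would list the same two addresses in the opposite order, with node-b.example.com as SocketConnectHost and node-a.example.com as SocketConnectHost1. That client would need its own SenderCompID and a matching [session] entry in the acceptor settings.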