MsgSeqNum too high problem

classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

MsgSeqNum too high problem

Alexey Zubko
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Hi,

I'd like to ask if QF initiator acts correctly in the following scenario.
The initiator detects that seqnum of a server is too high and sends a
resend message.
The server doesn't resend requested messages, but there are heartbeat
messages.
The initiator disconnects because of timeout on heartbeat. :-)
Below is the initiator's log.

20050627-17:52:50 : Created session
20050627-17:52:51 : Connecting to eng-server on port 5725
20050627-17:52:51 : Connection succeeded
20050627-17:52:52 : Initiated logon request
20050627-17:53:15 : Received logon response
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 15
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 16
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 17
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
20050627-17:53:22 : Sent test request TEST
20050627-17:53:22 : MsgSeqNum too high, expecting  13 but received 18
20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17
20050627-17:53:37 : MsgSeqNum too high, expecting  13 but received 19
20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18
20050627-17:53:40 : Timed out waiting for heartbeat
20050627-17:53:40 : Disconnecting



--

Regards,
                Alexey Zubko

Infinium Capital Corporation
(416) 360-7000 ext. 305



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Oren Miller
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Seems to me this shouldn't be happening.  I'm guessing that the engine isn't
processing the message because the sequence number is too high, hence no
heartbeat is processed.  I believe that out of sequence messages should
probably count as a keep-alive even if their contents arn't processed.

--oren

----- Original Message -----
From: "Alexey Zubko" <[hidden email]>
To: <[hidden email]>
Sent: Monday, June 27, 2005 1:44 PM
Subject: [Quickfix-developers] MsgSeqNum too high problem


> QuickFIX Documentation:
> http://www.quickfixengine.org/quickfix/doc/html/index.html
> QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
> QuickFIX Support: http://www.quickfixengine.org/services.html
>
> Hi,
>
> I'd like to ask if QF initiator acts correctly in the following scenario.
> The initiator detects that seqnum of a server is too high and sends a
> resend message.
> The server doesn't resend requested messages, but there are heartbeat
> messages.
> The initiator disconnects because of timeout on heartbeat. :-)
> Below is the initiator's log.
>
> 20050627-17:52:50 : Created session
> 20050627-17:52:51 : Connecting to eng-server on port 5725
> 20050627-17:52:51 : Connection succeeded
> 20050627-17:52:52 : Initiated logon request
> 20050627-17:53:15 : Received logon response
> 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 15
> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
> 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 16
> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
> 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received 17
> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
> 20050627-17:53:22 : Sent test request TEST
> 20050627-17:53:22 : MsgSeqNum too high, expecting  13 but received 18
> 20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17
> 20050627-17:53:37 : MsgSeqNum too high, expecting  13 but received 19
> 20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18
> 20050627-17:53:40 : Timed out waiting for heartbeat
> 20050627-17:53:40 : Disconnecting
>
>
>
> --
>
> Regards,
> Alexey Zubko
>
> Infinium Capital Corporation
> (416) 360-7000 ext. 305
>
>
>
> -------------------------------------------------------
> SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
> from IBM. Find simple to follow Roadmaps, straightforward articles,
> informative Webcasts and more! Get everything you need to get up to
> speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
> _______________________________________________
> Quickfix-developers mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/quickfix-developers
>



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Brian Erst
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Oren -

I believe the following change to Session.cpp will fix the timeout
problem when receiving out-of-sequence messages while awaiting a
sequence reset/gap-fill:

Move the following lines in Session::verify(...)

    UtcTimeStamp now;
    m_state.lastReceivedTime( now );

from their current position up to a spot just before the preceding
if/else clause:

-->>Place the code here<<--
if ( checkTooHigh && isTargetTooHigh( msgSeqNum ) )

This will increment the "lastReceivedTime" in the SessionState object
even when the sequence number is wrong. This appears to solve a whole
bunch of interrelated timeouts that could occur in this scenario (test
request, heartbeat, logon response, etc.) in one quick hit. Sequence
too low is largely unaffected by this change, as it will cause a
disconnect when hit, so that part of the code shouldn't need to be
rearranged.

It was somewhat unclear to me whether the following line:

    m_state.testRequest( 0 );

Also needed to be moved, but it seemed like it did not.

- Brian Erst
Thynk Software, Inc.
--- Oren Miller <[hidden email]> wrote:

> QuickFIX Documentation:
> http://www.quickfixengine.org/quickfix/doc/html/index.html
> QuickFIX FAQ:
> http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
> QuickFIX Support: http://www.quickfixengine.org/services.html
>
> Seems to me this shouldn't be happening.  I'm guessing that the
> engine isn't
> processing the message because the sequence number is too high, hence
> no
> heartbeat is processed.  I believe that out of sequence messages
> should
> probably count as a keep-alive even if their contents arn't
> processed.
>
> --oren
>
> ----- Original Message -----
> From: "Alexey Zubko" <[hidden email]>
> To: <[hidden email]>
> Sent: Monday, June 27, 2005 1:44 PM
> Subject: [Quickfix-developers] MsgSeqNum too high problem
>
>
> > QuickFIX Documentation:
> > http://www.quickfixengine.org/quickfix/doc/html/index.html
> > QuickFIX FAQ:
> http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
> > QuickFIX Support: http://www.quickfixengine.org/services.html
> >
> > Hi,
> >
> > I'd like to ask if QF initiator acts correctly in the following
> scenario.
> > The initiator detects that seqnum of a server is too high and sends
> a
> > resend message.
> > The server doesn't resend requested messages, but there are
> heartbeat
> > messages.
> > The initiator disconnects because of timeout on heartbeat. :-)
> > Below is the initiator's log.
> >
> > 20050627-17:52:50 : Created session
> > 20050627-17:52:51 : Connecting to eng-server on port 5725
> > 20050627-17:52:51 : Connection succeeded
> > 20050627-17:52:52 : Initiated logon request
> > 20050627-17:53:15 : Received logon response
> > 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> 15
> > 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
> > 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> 16
> > 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
> > 20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> 17
> > 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
> > 20050627-17:53:22 : Sent test request TEST
> > 20050627-17:53:22 : MsgSeqNum too high, expecting  13 but received
> 18
> > 20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17
> > 20050627-17:53:37 : MsgSeqNum too high, expecting  13 but received
> 19
> > 20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18
> > 20050627-17:53:40 : Timed out waiting for heartbeat
> > 20050627-17:53:40 : Disconnecting
> >
> >
> >
> > --
> >
> > Regards,
> > Alexey Zubko
> >
> > Infinium Capital Corporation
> > (416) 360-7000 ext. 305
> >
> >
> >
> > -------------------------------------------------------
> > SF.Net email is sponsored by: Discover Easy Linux Migration
> Strategies
> > from IBM. Find simple to follow Roadmaps, straightforward articles,
> > informative Webcasts and more! Get everything you need to get up to
> > speed, fast.
> http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
> > _______________________________________________
> > Quickfix-developers mailing list
> > [hidden email]
> > https://lists.sourceforge.net/lists/listinfo/quickfix-developers
> >
>
>
>
> -------------------------------------------------------
> SF.Net email is sponsored by: Discover Easy Linux Migration
> Strategies
> from IBM. Find simple to follow Roadmaps, straightforward articles,
> informative Webcasts and more! Get everything you need to get up to
> speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
> _______________________________________________
> Quickfix-developers mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/quickfix-developers
>



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Alexey Zubko
Hello Brian,

Your fix prevents the disconnecting, but I still have a sequencing problem.

Actually I thing there are two problems -
1. Acceptor sends nothing back.
2. Initiator ignores incoming application messages.

I've got the problem during debugging - terminating initiator's process, but it's easy to reproduce:
If to increase NextSenderMsgSeqNum number in acceptor's file, after restart the initiator sends resend requests, the acceptor sends nothing back and ignores all incoming application messages (there is no fromApp() call).
Both acceptor and initiator are QF (CVS version + your patch).


Initiator:

20050628-19:38:55 : Created session
20050628-19:38:55 : Connecting to eng-server on port 57258
20050628-19:38:55 : Connection succeeded
20050628-19:38:56 : Initiated logon request
20050628-19:38:59 : Received logon response
20050628-19:40:50 : Created session
20050628-19:40:50 : Connecting to eng-server on port 57258
20050628-19:40:50 : Connection succeeded
20050628-19:40:51 : Initiated logon request
20050628-19:40:51 : Received logon response
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
.........

Acceptor:

20050628-19:38:48 : Created session
20050628-19:38:56 : Received logon request
20050628-19:38:56 : Responding to logon request
20050628-19:39:41 : Socket Error
20050628-19:39:41 : Disconnecting
20050628-19:40:42 : Created session
20050628-19:40:51 : Received logon request
20050628-19:40:51 : Responding to logon request
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:51 : Socket Error
20050628-19:41:51 : Disconnecting


Regards,
		Alexey Zubko

Infinium Capital Corporation
(416) 360-7000 ext. 305


Brian Erst wrote:
Oren -

I believe the following change to Session.cpp will fix the timeout
problem when receiving out-of-sequence messages while awaiting a
sequence reset/gap-fill:

Move the following lines in Session::verify(...)

    UtcTimeStamp now;
    m_state.lastReceivedTime( now );

from their current position up to a spot just before the preceding
if/else clause:

-->>Place the code here<<--
if ( checkTooHigh && isTargetTooHigh( msgSeqNum ) )

This will increment the "lastReceivedTime" in the SessionState object
even when the sequence number is wrong. This appears to solve a whole
bunch of interrelated timeouts that could occur in this scenario (test
request, heartbeat, logon response, etc.) in one quick hit. Sequence
too low is largely unaffected by this change, as it will cause a
disconnect when hit, so that part of the code shouldn't need to be
rearranged.

It was somewhat unclear to me whether the following line:

    m_state.testRequest( 0 );

Also needed to be moved, but it seemed like it did not.

- Brian Erst
Thynk Software, Inc.
--- Oren Miller [hidden email] wrote:

  
QuickFIX Documentation:
http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ:
http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Seems to me this shouldn't be happening.  I'm guessing that the
engine isn't 
processing the message because the sequence number is too high, hence
no 
heartbeat is processed.  I believe that out of sequence messages
should 
probably count as a keep-alive even if their contents arn't
processed.

--oren

----- Original Message ----- 
From: "Alexey Zubko" [hidden email]
To: [hidden email]
Sent: Monday, June 27, 2005 1:44 PM
Subject: [Quickfix-developers] MsgSeqNum too high problem


    
QuickFIX Documentation: 
http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ:
      
http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
    
QuickFIX Support: http://www.quickfixengine.org/services.html

Hi,

I'd like to ask if QF initiator acts correctly in the following
      
scenario.
    
The initiator detects that seqnum of a server is too high and sends
      
a 
    
resend message.
The server doesn't resend requested messages, but there are
      
heartbeat 
    
messages.
The initiator disconnects because of timeout on heartbeat. :-)
Below is the initiator's log.

20050627-17:52:50 : Created session
20050627-17:52:51 : Connecting to eng-server on port 5725
20050627-17:52:51 : Connection succeeded
20050627-17:52:52 : Initiated logon request
20050627-17:53:15 : Received logon response
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
      
15
    
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
      
16
    
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
      
17
    
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
20050627-17:53:22 : Sent test request TEST
20050627-17:53:22 : MsgSeqNum too high, expecting  13 but received
      
18
    
20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17
20050627-17:53:37 : MsgSeqNum too high, expecting  13 but received
      
19
    
20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18
20050627-17:53:40 : Timed out waiting for heartbeat
20050627-17:53:40 : Disconnecting



-- 

Regards,
Alexey Zubko

Infinium Capital Corporation
(416) 360-7000 ext. 305



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration
      
Strategies
    
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast.
      
http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
    
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers

      

-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration
Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers

    


  


	
	
	
	
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Oren Miller
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Any idea on why the acceptor is not responding to the ResendRequest?  
Those acceptance tests are passing.  Do you have the incoming/
outgoing log files for both processes?

--oren

On Jun 28, 2005, at 2:49 PM, Alexey Zubko wrote:

> Hello Brian,
>
> Your fix prevents the disconnecting, but I still have a sequencing  
> problem.
>
> Actually I thing there are two problems -
> 1. Acceptor sends nothing back.
> 2. Initiator ignores incoming application messages.
>
> I've got the problem during debugging - terminating initiator's  
> process, but it's easy to reproduce:
> If to increase NextSenderMsgSeqNum number in acceptor's file, after  
> restart the initiator sends resend requests, the acceptor sends  
> nothing back and ignores all incoming application messages (there  
> is no fromApp() call).
> Both acceptor and initiator are QF (CVS version + your patch).
>
>
> Initiator:
>
> 20050628-19:38:55 : Created session
> 20050628-19:38:55 : Connecting to eng-server on port 57258
> 20050628-19:38:55 : Connection succeeded
> 20050628-19:38:56 : Initiated logon request
> 20050628-19:38:59 : Received logon response
> 20050628-19:40:50 : Created session
> 20050628-19:40:50 : Connecting to eng-server on port 57258
> 20050628-19:40:50 : Connection succeeded
> 20050628-19:40:51 : Initiated logon request
> 20050628-19:40:51 : Received logon response
> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
> 20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
> 20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
> ........
> .
> Acceptor:
>
> 20050628-19:38:48 : Created session
> 20050628-19:38:56 : Received logon request
> 20050628-19:38:56 : Responding to logon request
> 20050628-19:39:41 : Socket Error
> 20050628-19:39:41 : Disconnecting
> 20050628-19:40:42 : Created session
> 20050628-19:40:51 : Received logon request
> 20050628-19:40:51 : Responding to logon request
> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:51 : Socket Error
> 20050628-19:41:51 : Disconnecting
>
>
> Regards, Alexey Zubko Infinium Capital Corporation (416) 360-7000  
> ext. 305
>
> Brian Erst wrote:
>> Oren - I believe the following change to Session.cpp will fix the  
>> timeout problem when receiving out-of-sequence messages while  
>> awaiting a sequence reset/gap-fill: Move the following lines in  
>> Session::verify(...) UtcTimeStamp now; m_state.lastReceivedTime
>> ( now ); from their current position up to a spot just before the  
>> preceding if/else clause: -->>Place the code here<<-- if  
>> ( checkTooHigh && isTargetTooHigh( msgSeqNum ) )  This will  
>> increment the "lastReceivedTime" in the SessionState object even  
>> when the sequence number is wrong. This appears to solve a whole  
>> bunch of interrelated timeouts that could occur in this scenario  
>> (test request, heartbeat, logon response, etc.) in one quick hit.  
>> Sequence too low is largely unaffected by this change, as it will  
>> cause a disconnect when hit, so that part of the code shouldn't  
>> need to be rearranged. It was somewhat unclear to me whether the  
>> following line: m_state.testRequest( 0 ); Also needed to be moved,  
>> but it seemed like it did not. - Brian Erst Thynk Software, Inc.  
>> --- Oren Miller <[hidden email]> wrote:
>>> QuickFIX Documentation: http://www.quickfixengine.org/quickfix/ 
>>> doc/html/index.html QuickFIX FAQ: http://www.quickfixengine.org/ 
>>> wikifix/index.php?QuickFixFAQ QuickFIX Support: http://
>>> www.quickfixengine.org/services.html Seems to me this shouldn't  
>>> be happening. I'm guessing that the engine isn't processing the  
>>> message because the sequence number is too high, hence no  
>>> heartbeat is processed. I believe that out of sequence messages  
>>> should probably count as a keep-alive even if their contents  
>>> arn't processed. --oren ----- Original Message ----- From:  
>>> "Alexey Zubko" <[hidden email]> To: <quickfix-
>>> [hidden email]> Sent: Monday, June 27, 2005  
>>> 1:44 PM Subject: [Quickfix-developers] MsgSeqNum too high problem
>>>> QuickFIX Documentation: http://www.quickfixengine.org/quickfix/ 
>>>> doc/html/index.html QuickFIX FAQ:
>>> http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
>>>> QuickFIX Support: http://www.quickfixengine.org/services.html 
>>>> Hi, I'd like to ask if QF initiator acts correctly in the following
>>> scenario.
>>>> The initiator detects that seqnum of a server is too high and sends
>>> a
>>>> resend message. The server doesn't resend requested messages,  
>>>> but there are
>>> heartbeat
>>>> messages. The initiator disconnects because of timeout on  
>>>> heartbeat. :-) Below is the initiator's log. 20050627-17:52:50 :  
>>>> Created session 20050627-17:52:51 : Connecting to eng-server on  
>>>> port 5725 20050627-17:52:51 : Connection succeeded  
>>>> 20050627-17:52:52 : Initiated logon request 20050627-17:53:15 :  
>>>> Received logon response 20050627-17:53:15 : MsgSeqNum too high,  
>>>> expecting 13 but received
>>> 15
>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14  
>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
>>> 16
>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15  
>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
>>> 17
>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16  
>>>> 20050627-17:53:22 : Sent test request TEST 20050627-17:53:22 :  
>>>> MsgSeqNum too high, expecting 13 but received
>>> 18
>>>> 20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17  
>>>> 20050627-17:53:37 : MsgSeqNum too high, expecting 13 but received
>>> 19
>>>> 20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18  
>>>> 20050627-17:53:40 : Timed out waiting for heartbeat  
>>>> 20050627-17:53:40 : Disconnecting -- Regards, Alexey Zubko  
>>>> Infinium Capital Corporation (416) 360-7000 ext. 305  
>>>> ------------------------------------------------------- SF.Net  
>>>> email is sponsored by: Discover Easy Linux Migration
>>> Strategies
>>>> from IBM. Find simple to follow Roadmaps, straightforward  
>>>> articles, informative Webcasts and more! Get everything you need  
>>>> to get up to speed, fast.
>>> http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
>>>> _______________________________________________ Quickfix-
>>>> developers mailing list Quickfix-
>>>> [hidden email] https://lists.sourceforge.net/ 
>>>> lists/listinfo/quickfix-developers
>>> ------------------------------------------------------- SF.Net  
>>> email is sponsored by: Discover Easy Linux Migration Strategies  
>>> from IBM. Find simple to follow Roadmaps, straightforward  
>>> articles, informative Webcasts and more! Get everything you need  
>>> to get up to speed, fast. http://ads.osdn.com/?
>>> ad_id=7477&alloc_id=16492&op=click  
>>> _______________________________________________ Quickfix-
>>> developers mailing list [hidden email]  
>>> https://lists.sourceforge.net/lists/listinfo/quickfix-developers



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Brian Erst
In reply to this post by Alexey Zubko
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Alexey -

I'm not a QuickFIX developer, but I've rooted around the code a little
bit and about the only explanation I can think of that might explain
your problem is if the sequence number being used by the acceptor is
corrupt or the message offset table is broken.

Looking at nextResendRequest(...) in Session.cpp, it looks like it may
be possible to fall into a trap when an attempt to retrieve the
"missing" messages fails. This may be because the message store is bad
in some non-fatal way (read fails but doesn't throw an exception -
looks like FileStore may do this if message number 'n' simply doesn't
exist in the offsets table) or because the "beginning" sequence number
is greater than the "end" sequence number that actually got stored
(corrupt sequence number). This could happen if, in your scenario,
message number "5" for some reason did not exist in the message store -
the FileStore get() will stop on the first non-existent message number.
Even if 6-8 exist (in the form of heartbeats), the get will return an
empty set.

In this case, because the main for loop is skipped (no messages to
iterate thru), no messages are sent and, as a side-effect of this, no
sequence number reset gets sent either (the code that calls a
generateSequenceReset() call is within an if clause that won't be true
- the "begin" variable never gets set [fails check #1] and endSeqNo and
msgSeqNum are both 0 [fails check #2]).

The acceptor is happy because it increments the initiator's "expected"
sequence number regardless of whether any messages are sent in response
to the resend request, so it will continue to send heartbeats and will
not error out on additional sequence resets.

This is all EXTREMELY seat-of-the-pants analysis - there may well be
checks gonig on under the covers that I'm not aware of. It's an odd
sort of scenario.

- Brian Erst
Thynk Software, Inc.

--- Alexey Zubko <[hidden email]> wrote:

> Hello Brian,
>
> Your fix prevents the disconnecting, but I still have a sequencing
> problem.
>
> Actually I thing there are two problems -
> 1. Acceptor sends nothing back.
> 2. Initiator ignores incoming application messages.
>
> I've got the problem during debugging - terminating initiator's
> process,
> but it's easy to reproduce:
> If to increase NextSenderMsgSeqNum number in acceptor's file, after
> restart the initiator sends resend requests, the acceptor sends
> nothing
> back and ignores all incoming application messages (there is no
> fromApp() call).
> Both acceptor and initiator are QF (CVS version + your patch).
>
>
> Initiator:
>
> 20050628-19:38:55 : Created session
> 20050628-19:38:55 : Connecting to eng-server on port 57258
> 20050628-19:38:55 : Connection succeeded
> 20050628-19:38:56 : Initiated logon request
> 20050628-19:38:59 : Received logon response
> 20050628-19:40:50 : Created session
> 20050628-19:40:50 : Connecting to eng-server on port 57258
> 20050628-19:40:50 : Connection succeeded
> 20050628-19:40:51 : Initiated logon request
> 20050628-19:40:51 : Received logon response
> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
> 20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
> 20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
> .........
>
> Acceptor:
>
> 20050628-19:38:48 : Created session
> 20050628-19:38:56 : Received logon request
> 20050628-19:38:56 : Responding to logon request
> 20050628-19:39:41 : Socket Error
> 20050628-19:39:41 : Disconnecting
> 20050628-19:40:42 : Created session
> 20050628-19:40:51 : Received logon request
> 20050628-19:40:51 : Responding to logon request
> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
> 20050628-19:41:51 : Socket Error
> 20050628-19:41:51 : Disconnecting
>
>
> Regards,
> Alexey Zubko
>
> Infinium Capital Corporation
> (416) 360-7000 ext. 305
>
>
>
> Brian Erst wrote:
>
> >Oren -
> >
> >I believe the following change to Session.cpp will fix the timeout
> >problem when receiving out-of-sequence messages while awaiting a
> >sequence reset/gap-fill:
> >
> >Move the following lines in Session::verify(...)
> >
> >    UtcTimeStamp now;
> >    m_state.lastReceivedTime( now );
> >
> >from their current position up to a spot just before the preceding
> >if/else clause:
> >
> >-->>Place the code here<<--
> >if ( checkTooHigh && isTargetTooHigh( msgSeqNum ) )
> >
> >This will increment the "lastReceivedTime" in the SessionState
> object
> >even when the sequence number is wrong. This appears to solve a
> whole
> >bunch of interrelated timeouts that could occur in this scenario
> (test
> >request, heartbeat, logon response, etc.) in one quick hit. Sequence
> >too low is largely unaffected by this change, as it will cause a
> >disconnect when hit, so that part of the code shouldn't need to be
> >rearranged.
> >
> >It was somewhat unclear to me whether the following line:
> >
> >    m_state.testRequest( 0 );
> >
> >Also needed to be moved, but it seemed like it did not.
> >
> >- Brian Erst
> >Thynk Software, Inc.
> >--- Oren Miller <[hidden email]> wrote:
> >
> >  
> >
> >>QuickFIX Documentation:
> >>http://www.quickfixengine.org/quickfix/doc/html/index.html
> >>QuickFIX FAQ:
> >>http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
> >>QuickFIX Support: http://www.quickfixengine.org/services.html
> >>
> >>Seems to me this shouldn't be happening.  I'm guessing that the
> >>engine isn't
> >>processing the message because the sequence number is too high,
> hence
> >>no
> >>heartbeat is processed.  I believe that out of sequence messages
> >>should
> >>probably count as a keep-alive even if their contents arn't
> >>processed.
> >>
> >>--oren
> >>
> >>----- Original Message -----
> >>From: "Alexey Zubko" <[hidden email]>
> >>To: <[hidden email]>
> >>Sent: Monday, June 27, 2005 1:44 PM
> >>Subject: [Quickfix-developers] MsgSeqNum too high problem
> >>
> >>
> >>    
> >>
> >>>QuickFIX Documentation:
> >>>http://www.quickfixengine.org/quickfix/doc/html/index.html
> >>>QuickFIX FAQ:
> >>>      
> >>>
> >>http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
> >>    
> >>
> >>>QuickFIX Support: http://www.quickfixengine.org/services.html
> >>>
> >>>Hi,
> >>>
> >>>I'd like to ask if QF initiator acts correctly in the following
> >>>      
> >>>
> >>scenario.
> >>    
> >>
> >>>The initiator detects that seqnum of a server is too high and
> sends
> >>>      
> >>>
> >>a
> >>    
> >>
> >>>resend message.
> >>>The server doesn't resend requested messages, but there are
> >>>      
> >>>
> >>heartbeat
> >>    
> >>
> >>>messages.
> >>>The initiator disconnects because of timeout on heartbeat. :-)
> >>>Below is the initiator's log.
> >>>
> >>>20050627-17:52:50 : Created session
> >>>20050627-17:52:51 : Connecting to eng-server on port 5725
> >>>20050627-17:52:51 : Connection succeeded
> >>>20050627-17:52:52 : Initiated logon request
> >>>20050627-17:53:15 : Received logon response
> >>>20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> >>>      
> >>>
> >>15
> >>    
> >>
> >>>20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14
> >>>20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> >>>      
> >>>
> >>16
> >>    
> >>
> >>>20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15
> >>>20050627-17:53:15 : MsgSeqNum too high, expecting  13 but received
> >>>      
> >>>
> >>17
> >>    
> >>
> >>>20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16
> >>>20050627-17:53:22 : Sent test request TEST
> >>>20050627-17:53:22 : MsgSeqNum too high, expecting  13 but received
> >>>      
>
=== message truncated ===



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Alexey Zubko
In reply to this post by Oren Miller
HI,

I found the problem.
Acceptor didn't resend messages because my endSeqNo field is 0 but my beginString is FIX4.0.
So, in the
nextResendRequest function QF tried to get messages from say 15 to 0.
It's my mistake that I put 0 in my initiator instead of 999999, of course,
but may be we need to change the code anyway?


void Session::nextResendRequest( const Message& resendRequest )
........
  std::string beginString = m_sessionID.getBeginString();
  if ( beginString >= FIX::BeginString_FIX42 && endSeqNo == 0 ||
       beginString <= FIX::BeginString_FIX42 && endSeqNo == 999999 ||
       endSeqNo >= getExpectedSenderNum() )
  { endSeqNo = getExpectedSenderNum() - 1; }

  std::vector < std::string > messages;
  m_state.get( beginSeqNo, endSeqNo, messages );
..........



Regards,
		Alexey Zubko

Infinium Capital Corporation
(416) 360-7000 ext. 305


Oren Miller wrote:
Any idea on why the acceptor is not responding to the ResendRequest?   Those acceptance tests are passing.  Do you have the incoming/ outgoing log files for both processes?

--oren

On Jun 28, 2005, at 2:49 PM, Alexey Zubko wrote:

Hello Brian,

Your fix prevents the disconnecting, but I still have a sequencing  problem.

Actually I thing there are two problems -
1. Acceptor sends nothing back.
2. Initiator ignores incoming application messages.

I've got the problem during debugging - terminating initiator's  process, but it's easy to reproduce:
If to increase NextSenderMsgSeqNum number in acceptor's file, after  restart the initiator sends resend requests, the acceptor sends  nothing back and ignores all incoming application messages (there  is no fromApp() call).
Both acceptor and initiator are QF (CVS version + your patch).


Initiator:

20050628-19:38:55 : Created session
20050628-19:38:55 : Connecting to eng-server on port 57258
20050628-19:38:55 : Connection succeeded
20050628-19:38:56 : Initiated logon request
20050628-19:38:59 : Received logon response
20050628-19:40:50 : Created session
20050628-19:40:50 : Connecting to eng-server on port 57258
20050628-19:40:50 : Connection succeeded
20050628-19:40:51 : Initiated logon request
20050628-19:40:51 : Received logon response
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
........
.
Acceptor:

20050628-19:38:48 : Created session
20050628-19:38:56 : Received logon request
20050628-19:38:56 : Responding to logon request
20050628-19:39:41 : Socket Error
20050628-19:39:41 : Disconnecting
20050628-19:40:42 : Created session
20050628-19:40:51 : Received logon request
20050628-19:40:51 : Responding to logon request
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:51 : Socket Error
20050628-19:41:51 : Disconnecting


Regards, Alexey Zubko Infinium Capital Corporation (416) 360-7000  ext. 305

Brian Erst wrote:
Oren - I believe the following change to Session.cpp will fix the  timeout problem when receiving out-of-sequence messages while  awaiting a sequence reset/gap-fill: Move the following lines in  Session::verify(...) UtcTimeStamp now; m_state.lastReceivedTime ( now ); from their current position up to a spot just before the  preceding if/else clause: -->>Place the code here<<-- if  ( checkTooHigh && isTargetTooHigh( msgSeqNum ) )  This will  increment the "lastReceivedTime" in the SessionState object even  when the sequence number is wrong. This appears to solve a whole  bunch of interrelated timeouts that could occur in this scenario  (test request, heartbeat, logon response, etc.) in one quick hit.  Sequence too low is largely unaffected by this change, as it will  cause a disconnect when hit, so that part of the code shouldn't  need to be rearranged. It was somewhat unclear to me whether the  following line: m_state.testRequest( 0 ); Also needed to be moved,  but it seemed like it did not. - Brian Erst Thynk Software, Inc.  --- Oren Miller [hidden email] wrote:
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/ doc/html/index.html QuickFIX FAQ: http://www.quickfixengine.org/ wikifix/index.php?QuickFixFAQ QuickFIX Support: <a class="moz-txt-link-freetext" href="http://">http:// www.quickfixengine.org/services.html Seems to me this shouldn't  be happening. I'm guessing that the engine isn't processing the  message because the sequence number is too high, hence no  heartbeat is processed. I believe that out of sequence messages  should probably count as a keep-alive even if their contents  arn't processed. --oren ----- Original Message ----- From:  "Alexey Zubko" [hidden email] To: <quickfix- [hidden email]> Sent: Monday, June 27, 2005  1:44 PM Subject: [Quickfix-developers] MsgSeqNum too high problem
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/ doc/html/index.html QuickFIX FAQ:
http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html  Hi, I'd like to ask if QF initiator acts correctly in the following
scenario.
The initiator detects that seqnum of a server is too high and sends
a
resend message. The server doesn't resend requested messages,  but there are
heartbeat
messages. The initiator disconnects because of timeout on  heartbeat. :-) Below is the initiator's log. 20050627-17:52:50 :  Created session 20050627-17:52:51 : Connecting to eng-server on  port 5725 20050627-17:52:51 : Connection succeeded  20050627-17:52:52 : Initiated logon request 20050627-17:53:15 :  Received logon response 20050627-17:53:15 : MsgSeqNum too high,  expecting 13 but received
15
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14  20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
16
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15  20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
17
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16  20050627-17:53:22 : Sent test request TEST 20050627-17:53:22 :  MsgSeqNum too high, expecting 13 but received
18
20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17  20050627-17:53:37 : MsgSeqNum too high, expecting 13 but received
19
20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18  20050627-17:53:40 : Timed out waiting for heartbeat  20050627-17:53:40 : Disconnecting -- Regards, Alexey Zubko  Infinium Capital Corporation (416) 360-7000 ext. 305  ------------------------------------------------------- SF.Net  email is sponsored by: Discover Easy Linux Migration
Strategies
from IBM. Find simple to follow Roadmaps, straightforward  articles, informative Webcasts and more! Get everything you need  to get up to speed, fast.
http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________ Quickfix- developers mailing list Quickfix- [hidden email] https://lists.sourceforge.net/ lists/listinfo/quickfix-developers
------------------------------------------------------- SF.Net  email is sponsored by: Discover Easy Linux Migration Strategies  from IBM. Find simple to follow Roadmaps, straightforward  articles, informative Webcasts and more! Get everything you need  to get up to speed, fast. http://ads.osdn.com/? ad_id=7477&alloc_id=16492&op=click  _______________________________________________ Quickfix- developers mailing list [hidden email]  https://lists.sourceforge.net/lists/listinfo/quickfix-developers




	
	
	
	
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Oren Miller
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/doc/html/index.html
QuickFIX FAQ: http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html

Were you manually sending ResendRequest messages?

--oren

On Jun 28, 2005, at 4:15 PM, Alexey Zubko wrote:

> HI,
>
> I found the problem.
> Acceptor didn't resend messages because my endSeqNo field is 0 but  
> my beginString is FIX4.0.
> So, in the nextResendRequest function QF tried to get messages from  
> say 15 to 0.
> It's my mistake that I put 0 in my initiator instead of 999999, of  
> course,
> but may be we need to change the code anyway?
>
>
> void Session::nextResendRequest( const Message& resendRequest )
> .......
> .   std::string beginString = m_sessionID.getBeginString();
>   if ( beginString >= FIX::BeginString_FIX42 && endSeqNo == 0 ||
>        beginString <= FIX::BeginString_FIX42 && endSeqNo == 999999 ||
>        endSeqNo >= getExpectedSenderNum() )
>   { endSeqNo = getExpectedSenderNum() - 1; }
>
>   std::vector < std::string > messages;
>   m_state.get( beginSeqNo, endSeqNo, messages );
> .........
> .
>
>
> Regards, Alexey Zubko Infinium Capital Corporation (416) 360-7000  
> ext. 305
>
> Oren Miller wrote:
>> Any idea on why the acceptor is not responding to the  
>> ResendRequest?   Those acceptance tests are passing.  Do you have  
>> the incoming/ outgoing log files for both processes?
>>
>> --oren
>>
>> On Jun 28, 2005, at 2:49 PM, Alexey Zubko wrote:
>>
>>> Hello Brian,
>>>
>>> Your fix prevents the disconnecting, but I still have a  
>>> sequencing  problem.
>>>
>>> Actually I thing there are two problems -
>>> 1. Acceptor sends nothing back.
>>> 2. Initiator ignores incoming application messages.
>>>
>>> I've got the problem during debugging - terminating initiator's  
>>> process, but it's easy to reproduce:
>>> If to increase NextSenderMsgSeqNum number in acceptor's file,  
>>> after  restart the initiator sends resend requests, the acceptor  
>>> sends  nothing back and ignores all incoming application messages  
>>> (there  is no fromApp() call).
>>> Both acceptor and initiator are QF (CVS version + your patch).
>>>
>>>
>>> Initiator:
>>>
>>> 20050628-19:38:55 : Created session
>>> 20050628-19:38:55 : Connecting to eng-server on port 57258
>>> 20050628-19:38:55 : Connection succeeded
>>> 20050628-19:38:56 : Initiated logon request
>>> 20050628-19:38:59 : Received logon response
>>> 20050628-19:40:50 : Created session
>>> 20050628-19:40:50 : Connecting to eng-server on port 57258
>>> 20050628-19:40:50 : Connection succeeded
>>> 20050628-19:40:51 : Initiated logon request
>>> 20050628-19:40:51 : Received logon response
>>> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
>>> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
>>> 20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
>>> 20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
>>> 20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
>>> 20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
>>> ........
>>> .
>>> Acceptor:
>>>
>>> 20050628-19:38:48 : Created session
>>> 20050628-19:38:56 : Received logon request
>>> 20050628-19:38:56 : Responding to logon request
>>> 20050628-19:39:41 : Socket Error
>>> 20050628-19:39:41 : Disconnecting
>>> 20050628-19:40:42 : Created session
>>> 20050628-19:40:51 : Received logon request
>>> 20050628-19:40:51 : Responding to logon request
>>> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
>>> 20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
>>> 20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
>>> 20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
>>> 20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
>>> 20050628-19:41:51 : Socket Error
>>> 20050628-19:41:51 : Disconnecting
>>>
>>>
>>> Regards, Alexey Zubko Infinium Capital Corporation (416)  
>>> 360-7000  ext. 305
>>>
>>> Brian Erst wrote:
>>>> Oren - I believe the following change to Session.cpp will fix  
>>>> the  timeout problem when receiving out-of-sequence messages  
>>>> while  awaiting a sequence reset/gap-fill: Move the following  
>>>> lines in  Session::verify(...) UtcTimeStamp now;  
>>>> m_state.lastReceivedTime ( now ); from their current position up  
>>>> to a spot just before the  preceding if/else clause: -->>Place  
>>>> the code here<<-- if  ( checkTooHigh && isTargetTooHigh
>>>> ( msgSeqNum ) )  This will  increment the "lastReceivedTime" in  
>>>> the SessionState object even  when the sequence number is wrong.  
>>>> This appears to solve a whole  bunch of interrelated timeouts  
>>>> that could occur in this scenario  (test request, heartbeat,  
>>>> logon response, etc.) in one quick hit.  Sequence too low is  
>>>> largely unaffected by this change, as it will  cause a  
>>>> disconnect when hit, so that part of the code shouldn't  need to  
>>>> be rearranged. It was somewhat unclear to me whether the  
>>>> following line: m_state.testRequest( 0 ); Also needed to be  
>>>> moved,  but it seemed like it did not. - Brian Erst Thynk  
>>>> Software, Inc.  --- Oren Miller <[hidden email]> wrote:
>>>>> QuickFIX Documentation: http://www.quickfixengine.org/quickfix/ 
>>>>> doc/html/index.html QuickFIX FAQ: http://
>>>>> www.quickfixengine.org/ wikifix/index.php?QuickFixFAQ QuickFIX  
>>>>> Support: http:// www.quickfixengine.org/services.html Seems to  
>>>>> me this shouldn't  be happening. I'm guessing that the engine  
>>>>> isn't processing the  message because the sequence number is  
>>>>> too high, hence no  heartbeat is processed. I believe that out  
>>>>> of sequence messages  should probably count as a keep-alive  
>>>>> even if their contents  arn't processed. --oren ----- Original  
>>>>> Message ----- From:  "Alexey Zubko"  
>>>>> <[hidden email]> To: <quickfix-  
>>>>> [hidden email]> Sent: Monday, June 27, 2005  
>>>>> 1:44 PM Subject: [Quickfix-developers] MsgSeqNum too high problem
>>>>>> QuickFIX Documentation: http://www.quickfixengine.org/ 
>>>>>> quickfix/ doc/html/index.html QuickFIX FAQ:
>>>>> http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
>>>>>> QuickFIX Support: http://www.quickfixengine.org/services.html   
>>>>>> Hi, I'd like to ask if QF initiator acts correctly in the  
>>>>>> following
>>>>> scenario.
>>>>>> The initiator detects that seqnum of a server is too high and  
>>>>>> sends
>>>>> a
>>>>>> resend message. The server doesn't resend requested messages,  
>>>>>> but there are
>>>>> heartbeat
>>>>>> messages. The initiator disconnects because of timeout on  
>>>>>> heartbeat. :-) Below is the initiator's log.  
>>>>>> 20050627-17:52:50 :  Created session 20050627-17:52:51 :  
>>>>>> Connecting to eng-server on  port 5725 20050627-17:52:51 :  
>>>>>> Connection succeeded  20050627-17:52:52 : Initiated logon  
>>>>>> request 20050627-17:53:15 :  Received logon response  
>>>>>> 20050627-17:53:15 : MsgSeqNum too high,  expecting 13 but  
>>>>>> received
>>>>> 15
>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14  
>>>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
>>>>> 16
>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15  
>>>>>> 20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
>>>>> 17
>>>>>> 20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16  
>>>>>> 20050627-17:53:22 : Sent test request TEST  
>>>>>> 20050627-17:53:22 :  MsgSeqNum too high, expecting 13 but  
>>>>>> received
>>>>> 18
>>>>>> 20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17  
>>>>>> 20050627-17:53:37 : MsgSeqNum too high, expecting 13 but received
>>>>> 19
>>>>>> 20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18  
>>>>>> 20050627-17:53:40 : Timed out waiting for heartbeat  
>>>>>> 20050627-17:53:40 : Disconnecting -- Regards, Alexey Zubko  
>>>>>> Infinium Capital Corporation (416) 360-7000 ext. 305  
>>>>>> -------------------------------------------------------  
>>>>>> SF.Net  email is sponsored by: Discover Easy Linux Migration
>>>>> Strategies
>>>>>> from IBM. Find simple to follow Roadmaps, straightforward  
>>>>>> articles, informative Webcasts and more! Get everything you  
>>>>>> need  to get up to speed, fast.
>>>>> http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
>>>>>> _______________________________________________ Quickfix-  
>>>>>> developers mailing list Quickfix-  
>>>>>> [hidden email] https://
>>>>>> lists.sourceforge.net/ lists/listinfo/quickfix-developers
>>>>> ------------------------------------------------------- SF.Net  
>>>>> email is sponsored by: Discover Easy Linux Migration  
>>>>> Strategies  from IBM. Find simple to follow Roadmaps,  
>>>>> straightforward  articles, informative Webcasts and more! Get  
>>>>> everything you need  to get up to speed, fast. http://
>>>>> ads.osdn.com/? ad_id=7477&alloc_id=16492&op=click  
>>>>> _______________________________________________ Quickfix-  
>>>>> developers mailing list Quickfix-
>>>>> [hidden email]  https://lists.sourceforge.net/ 
>>>>> lists/listinfo/quickfix-developers
>>
>>



-------------------------------------------------------
SF.Net email is sponsored by: Discover Easy Linux Migration Strategies
from IBM. Find simple to follow Roadmaps, straightforward articles,
informative Webcasts and more! Get everything you need to get up to
speed, fast. http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________
Quickfix-developers mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/quickfix-developers
Reply | Threaded
Open this post in threaded view
|

Re: MsgSeqNum too high problem

Alexey Zubko
Oren,

I didn't send resend messages manually.

One of a Globex (CME) certification step is not to send resend request for more than 2500 messages.
QF sets endSeqNo field to 0 (or 99999) in the generateResendRequest and the only way I found is
to comment out those code and set the field to 0 in my toAdmin callback.
I set it to 0 because the initiator was written for FIX4.2 and I didn't expected to use it for the other versions.

The problem occurred because I used the initiator for version FIX4.0 and absolutely forgot about this patch.
Now I set the field to 99999 for version <FIX4.2 and it solved the problem.

It may say that the resend messages were corrupted, but QF's reaction wasn't correct anyway.
I think versions checks need to be eliminated int the
nextResendRequest function:

  if (endSeqNo == 0 ||
       endSeqNo == 999999 ||
       endSeqNo >= getExpectedSenderNum() )
  { endSeqNo = getExpectedSenderNum() - 1; }


Regards,
		Alexey Zubko

Infinium Capital Corporation
(416) 360-7000 ext. 305


Oren Miller wrote:
Were you manually sending ResendRequest messages?

--oren

On Jun 28, 2005, at 4:15 PM, Alexey Zubko wrote:

HI,

I found the problem.
Acceptor didn't resend messages because my endSeqNo field is 0 but  my beginString is FIX4.0.
So, in the nextResendRequest function QF tried to get messages from  say 15 to 0.
It's my mistake that I put 0 in my initiator instead of 999999, of  course,
but may be we need to change the code anyway?


void Session::nextResendRequest( const Message& resendRequest )
.......
.   std::string beginString = m_sessionID.getBeginString();
  if ( beginString >= FIX::BeginString_FIX42 && endSeqNo == 0 ||
       beginString <= FIX::BeginString_FIX42 && endSeqNo == 999999 ||
       endSeqNo >= getExpectedSenderNum() )
  { endSeqNo = getExpectedSenderNum() - 1; }

  std::vector < std::string > messages;
  m_state.get( beginSeqNo, endSeqNo, messages );
.........
.


Regards, Alexey Zubko Infinium Capital Corporation (416) 360-7000  ext. 305

Oren Miller wrote:
Any idea on why the acceptor is not responding to the  ResendRequest?   Those acceptance tests are passing.  Do you have  the incoming/ outgoing log files for both processes?

--oren

On Jun 28, 2005, at 2:49 PM, Alexey Zubko wrote:

Hello Brian,

Your fix prevents the disconnecting, but I still have a  sequencing  problem.

Actually I thing there are two problems -
1. Acceptor sends nothing back.
2. Initiator ignores incoming application messages.

I've got the problem during debugging - terminating initiator's   process, but it's easy to reproduce:
If to increase NextSenderMsgSeqNum number in acceptor's file,  after  restart the initiator sends resend requests, the acceptor  sends  nothing back and ignores all incoming application messages  (there  is no fromApp() call).
Both acceptor and initiator are QF (CVS version + your patch).


Initiator:

20050628-19:38:55 : Created session
20050628-19:38:55 : Connecting to eng-server on port 57258
20050628-19:38:55 : Connection succeeded
20050628-19:38:56 : Initiated logon request
20050628-19:38:59 : Received logon response
20050628-19:40:50 : Created session
20050628-19:40:50 : Connecting to eng-server on port 57258
20050628-19:40:50 : Connection succeeded
20050628-19:40:51 : Initiated logon request
20050628-19:40:51 : Received logon response
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 7
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 6
20050628-19:40:51 : MsgSeqNum too high, expecting  5 but received 8
20050628-19:40:51 : Sent ResendRequest FROM: 5 TO: 7
20050628-19:41:06 : MsgSeqNum too high, expecting  5 but received 9
20050628-19:41:06 : Sent ResendRequest FROM: 5 TO: 8
........
.
Acceptor:

20050628-19:38:48 : Created session
20050628-19:38:56 : Received logon request
20050628-19:38:56 : Responding to logon request
20050628-19:39:41 : Socket Error
20050628-19:39:41 : Disconnecting
20050628-19:40:42 : Created session
20050628-19:40:51 : Received logon request
20050628-19:40:51 : Responding to logon request
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:40:51 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:06 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:21 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:36 : Received ResendRequest FROM: 5 TO: 0
20050628-19:41:51 : Socket Error
20050628-19:41:51 : Disconnecting


Regards, Alexey Zubko Infinium Capital Corporation (416)  360-7000  ext. 305

Brian Erst wrote:
Oren - I believe the following change to Session.cpp will fix  the  timeout problem when receiving out-of-sequence messages  while  awaiting a sequence reset/gap-fill: Move the following  lines in  Session::verify(...) UtcTimeStamp now;  m_state.lastReceivedTime ( now ); from their current position up  to a spot just before the  preceding if/else clause: -->>Place  the code here<<-- if  ( checkTooHigh && isTargetTooHigh ( msgSeqNum ) )  This will  increment the "lastReceivedTime" in  the SessionState object even  when the sequence number is wrong.  This appears to solve a whole  bunch of interrelated timeouts  that could occur in this scenario  (test request, heartbeat,  logon response, etc.) in one quick hit.  Sequence too low is  largely unaffected by this change, as it will  cause a  disconnect when hit, so that part of the code shouldn't  need to  be rearranged. It was somewhat unclear to me whether the   following line: m_state.testRequest( 0 ); Also needed to be  moved,  but it seemed like it did not. - Brian Erst Thynk  Software, Inc.  --- Oren Miller [hidden email] wrote:
QuickFIX Documentation: http://www.quickfixengine.org/quickfix/  doc/html/index.html QuickFIX FAQ: <a class="moz-txt-link-freetext" href="http://">http:// www.quickfixengine.org/ wikifix/index.php?QuickFixFAQ QuickFIX  Support: <a class="moz-txt-link-freetext" href="http://">http:// www.quickfixengine.org/services.html Seems to  me this shouldn't  be happening. I'm guessing that the engine  isn't processing the  message because the sequence number is  too high, hence no  heartbeat is processed. I believe that out  of sequence messages  should probably count as a keep-alive  even if their contents  arn't processed. --oren ----- Original  Message ----- From:  "Alexey Zubko"  [hidden email] To: <quickfix-  [hidden email]> Sent: Monday, June 27, 2005   1:44 PM Subject: [Quickfix-developers] MsgSeqNum too high problem
QuickFIX Documentation: http://www.quickfixengine.org/ quickfix/ doc/html/index.html QuickFIX FAQ:
http://www.quickfixengine.org/wikifix/index.php?QuickFixFAQ
QuickFIX Support: http://www.quickfixengine.org/services.html   Hi, I'd like to ask if QF initiator acts correctly in the  following
scenario.
The initiator detects that seqnum of a server is too high and  sends
a
resend message. The server doesn't resend requested messages,   but there are
heartbeat
messages. The initiator disconnects because of timeout on   heartbeat. :-) Below is the initiator's log.  20050627-17:52:50 :  Created session 20050627-17:52:51 :  Connecting to eng-server on  port 5725 20050627-17:52:51 :  Connection succeeded  20050627-17:52:52 : Initiated logon  request 20050627-17:53:15 :  Received logon response  20050627-17:53:15 : MsgSeqNum too high,  expecting 13 but  received
15
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 14   20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
16
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 15   20050627-17:53:15 : MsgSeqNum too high, expecting 13 but received
17
20050627-17:53:15 : Sent ResendRequest FROM: 13 TO: 16   20050627-17:53:22 : Sent test request TEST  20050627-17:53:22 :  MsgSeqNum too high, expecting 13 but  received
18
20050627-17:53:22 : Sent ResendRequest FROM: 13 TO: 17   20050627-17:53:37 : MsgSeqNum too high, expecting 13 but received
19
20050627-17:53:37 : Sent ResendRequest FROM: 13 TO: 18   20050627-17:53:40 : Timed out waiting for heartbeat   20050627-17:53:40 : Disconnecting -- Regards, Alexey Zubko   Infinium Capital Corporation (416) 360-7000 ext. 305   -------------------------------------------------------  SF.Net  email is sponsored by: Discover Easy Linux Migration
Strategies
from IBM. Find simple to follow Roadmaps, straightforward   articles, informative Webcasts and more! Get everything you  need  to get up to speed, fast.
http://ads.osdn.com/?ad_id=7477&alloc_id=16492&op=click
_______________________________________________ Quickfix-  developers mailing list Quickfix-  [hidden email] https:// lists.sourceforge.net/ lists/listinfo/quickfix-developers
------------------------------------------------------- SF.Net   email is sponsored by: Discover Easy Linux Migration  Strategies  from IBM. Find simple to follow Roadmaps,  straightforward  articles, informative Webcasts and more! Get  everything you need  to get up to speed, fast. <a class="moz-txt-link-freetext" href="http://">http:// ads.osdn.com/? ad_id=7477&alloc_id=16492&op=click   _______________________________________________ Quickfix-  developers mailing list Quickfix- [hidden email]  https://lists.sourceforge.net/ lists/listinfo/quickfix-developers