But, do not record first packet time on an out-of-order packet.
It so happens that packets get out-of-order a lot more across relay.
And it turns out with some H.264 stream, the first few packets of a key
frame are very small (may be SPS/PPS, haven't checked), they get
out-of-oder quite a lot, so much so a down track never starts even it
has 20 - 25 key frames have passed through.
* Negotiate downttrack for subscriber before receiver is ready
This change will save 1 round sdp negotiation time for
subscribing to simulcast-codec or remote node track
* solve comment
* Fix simulcast-codec case
There are cases where the very first packet on resume is an out-of-order
packet. In that case, the gap in both sequence number and time stamp is
a small(ish) negative number. With a high threshold to declare very old
packet, the condition does not trip and the packet gets through and
treated as a packet that has rolled over.
It should be fine to have smaller threshold (in fact, it is probably
okay to have something a little over 1.0 too) as the expected jump is
calculated based on elapsed time since last packet receive and new
packets should be coming in with a diff close to that. So, a factor of
just over 1.0 to prevent false triggers should be fine. Using 1.5 for
now.
* Handle another old packet condition.
With this detection, the sequence number can be rolled over even when TS
rollover is not possible. For example, a track at 300 pps can rollver
the sequence number space in minutes compared to 13h+ for video time
stamp to roll over.
* fix typo
* Refactor propagation delay estimator.
NOTE: It is not possible to calculate OWD (one-way-delay) in a passive
fashion. So, this should not be used for anything requiring high
precision.
But, mainly factoring it out as a separate object just in case it can be
re-used.
TODO:
- probably has some edge case that is not handled well
- maybe path change detection can be improved
- will write UT later. This is just purely splitting it out from what
was embedded in RTPStatsReceiver.
* fix labels
* precision -> accuracy
* Reset DD tracker layers when muted.
@cnderrauber, I think this is okay to do, but please let me know if
there are gotchas in there.
* copy
* more compact form
It is possible that old packets arrive on receiver. If subscriber starts
on that, the first packet time would be incorrect. Do not start
forwarding on out-of-order packets.
There are cases of small negative sequence number jump and small
positive time stamp jump. Those should not force rollover. Maybe, they
should be dropped, but just logging for now till we learn more.
* Fix forced rollover of RTP time stamp.
Was erroneously forcing a rollover when the timestamp jump actually has
room to accommodate large jumps. For example, before pause ts = 10, then
eight hour pause, restart ts = 10 + (8 * 00 * 60 * 90000) = 2592000010
(at 90000 clock rate for video). In normal processing, it will look like
out-of-order as the difference 2592000000 is more than half the 32-bit
range. But, forcing a roll over is incorrect.
Fix by calculating excess over the full range and then account for wrap
around.
* log potential ts rollover
* clamp at min 0
* Ignore really old packets.
There are cases where really old packets (time stamp is way back, but
sequence number looks like it is moving forward) which cause the
sequence number to update incorrectly. Drop those packets are they are
very old.
* test
* Rollover sequence number when time stamp is moving forward.
Seeing large gaps in sequence number due to potential network issues.
In that gap, the sequence number could roll over.
Using packet time jumps to figure out if a roll over could have happened
and force roll over the sequence number to ensure that it does not flow
backwards.
* fix test
Seeing cases of huge jumps in sender erport rtp time stamp
(of the order of minutes) a few hundred ms after start of track.
Only less than 20 packets have been published at that time as seen by
server. Adding these to sender report to check if client thinks it has
sent much more.
There are cases of subscriber of a FF publisher not able to lock onto
layer 0 and kept rolling back till it switched to a higher layer and
then it could switch to layer 0. Must have been due to not having a
sender report. Can't think of a reason why that would be missing.
Logging more to debug this further.
Also, using a wrapping logger because of this bug: https://github.com/uber-go/zap/issues/836
Tried using `json:"" yaml:""` tag for `refInfos` field and all fields
inside `refInfo` struct. But, they were still logging nils. So, using
the wrapper logger.
Enabled by default.
Also, tweak the long term propagation delay a bit. The first propagation
delay itself was too high and the long term initialized with a high
value. Prevent that and also ensure large negtaives do not have an
effect by using a lower bound of 0. Lower bound of 0 is okay as the main
purpose is to track sustained high positive values.
Seeing cases (mostly across relay) of large first packet time adjustment
getting ignored. From data, it looks like the first packet is extremely
delayed (some times of the order of minutes) which does not make sense.
Adding some checks against media path, i. e. compare RTP timestamp from
sender report against expected RTP timestamp based on media path
arrivals and log deviations more than 5 seconds.
Another puzzling case. Trying to understand more.
Also, refactoring SetRtcpSenderReportData() function as it was getting
unwieldy.
* Handle cases of long mute/rollover of time stamp.
There are cases where the track is muted for long enough for timestamp
roll over to happen. There are no packets in that window (typically
there should be black frames (for video) or silence (for audio)). But,
maybe the pause based implementation of mute is causing this.
Anyhow, use time since last packet to gauge how much roll over should
have happened and use that to update time stamp. There will be really
edge cases where this could also fail (for e. g. packet time is affected
by propagation delay, so it could theoretically happen that mute/unmute
+ packet reception could happen exactly around that rollover point and
miscalculate, but should be rare).
As this happen per packet on receive side, changing time to `UnixNano()`
to make it more efficient to check this.
* spelling
* tests
* test util
* tests
With RTX, some clients use very old packets for probing. Check for
duplicate before logging warning about old packet/negative sequence
number jump.
Also, double the history so that duplicate tracking is better. Adds
about 1/2 KB per RTP stream.