With Read and ReadExtended waiting (they are two different goroutines),
always use Broadcast. In theory, both should never be waiting at the
same time, but Broadcast is the safe choice.
* Do not use LastTS for dummy offset.
LastTS could be random when using a dummy start, so it should not be
used in calculating offsets.
Also, do not push padding into the sequence before init. That could have
happened with a dummy start.
* apply dummy offset before comparing to last
* refresh ref TS
* initialize codec munger on catch up forwarding
* Simplify time stamp calculation on switches.
Trying to simplify time stamp calculation on restarts.
The additional checks rarely take effect and are not worth the extra
complication.
Also, maintain the reference time stamp in the extended range.
The challenge with that is when the publisher migrates, the extended
timestamp could change post migration (i.e. the post-migration node
would not know about rollovers). To address that, maintain an offset
that is updated on resync.
* WIP
* Revert to resume threshold
* typo
* clean up
* Handle large jumps in RTCP sender report timestamp.
Seeing cases of RTCP Sender Reports spaced apart by more than half the
RTP timestamp range, maybe a case of a laptop going to sleep and waking
up. Handle it using the time diff from the last report to calculate the
expected timestamp.
* try go 1.22
* Connection quality LOST only if RTCP is also not available.
It is possible that a sender stops all layers of video due to some
constraint (CPU or bandwidth). Packet reception going dry because of
that should not trigger `LOST` quality.
Also track the time of the last received RTCP to distinguish real
`LOST` from the sender stopping traffic.
Some bits to watch for
- With audio, RTCP reports could be more than 5 seconds apart (5 seconds
is the default interval for connection quality scorer), but audio
senders usually send silence packets even when there is no input.
So audio completely stopping can be considered `LOST`.
- With video, have to observe if all clients continue to send RTCP even
if all layers are stopped.
- RTCP bandwidth is not supposed to exceed the primary stream bandwidth.
libwebrtc calculates that and spaces out RTCP reports accordingly.
That is the reason why audio reports are that far apart. If a video
stream is encoded at a very low bit rate, it could also be sending
RTCP rarely. So, there is the case of LOST being indistinguishable
from sender stopping all layers. But, this should be a rare case.
* typo
* Do codec munging when munging RTP header.
It was possible for probe packets to get in between RTP munging and
codec munging and throw off sequence numbers when dropping packets.
This affected only VP8 as it does codec munging.
* do not pass in buffer as it is created anyway
* flip fields
* flip order
* fix test
* call translate for all tracks
* simplify
If the first packet of a keyframe, which carries the template structure,
is lost, then subsequent packets that rely on it will report an invalid
template error, which is expected.
* Add support for "abs-capture-time" extension.
Currently, it is just passed through from publisher -> subscriber side.
TODO: Need to store in sequencer and restore for retransmission.
* abs-capture-time in retransmissions
* clean up
* fix test
* more test fixes
* more test fixes
* more test fixes
* log only when size is non-zero
* log on both sides for debugging
* add marshal/unmarshal
* normalize abs capture time to SFU clock
* comment out adding abs-capture-time from registered extensions
* Disable audio loss proxying.
Added a config which is off by default.
With audio NACKs, that is the preferred repair mechanism.
With RED, repair is built in via packet redundancy to recover from
isolated losses.
So, proxying is not required. But, leaving it in there with a config
that is disabled by default.
* fix test
- Had arguments reversed.
- Also, cannot take away the reference layer from state, as a new layer
  used as reference could have a timestamp that is wildly different from
  the expected one. So, put that back.
* Move caching of publisher sender report to subscriber side.
Please see inline for descriptive comments on why. Basically,
pause/unpause using replaceTrack(null)/replaceTrack(actualTrack) can
make the timestamp in sender reports sent to subscribers jump ahead.
This prevents that.
With the caching on subscriber side, cleaning up the caching on
publisher side.
* fix compile, test still failing, need to debug
* skip reference TS for testing
* Prevent large spikes in propagation delay
A few tweaks
- Large spike in propagation delay due to congested channel results in
long term estimate getting high value. Ignore outliers in long term
estimate.
- Introduce a new field for adjusted arrival time as adjusting the
arrival time in place meant it got applied again across the relay and
that caused different propagation delay on remote nodes.
- Reset path change counters as long as there is any sample that does
  not exceed the multiple of the long-term estimate. There was a case of:
  o A sample with a high value that triggered the start of a path change.
  o Then some samples with a high enough delta that still did not meet
    the criteria for increasing the counter further.
  o Some time later, another sample met the threshold and that triggered
    a path change re-init.
* do not adapt to large delta
* Tweak adaptation to increase in propagation delay.
A couple of issues
- RTCP Sender Report rate will vary based on the underlying track
  bitrate (at least in theory; not all entities will do it, for example
  the SFU sends at a standard rate of one report per three seconds
  irrespective of track bitrate). So, adapt the long-term estimate of
  propagation delay delta based on the spacing of reports.
- Re-init of propagation delay to adapt to path change was taking the
last value before the switch. But, that one value could have been an
outlier and accepting it is not great. So, adapt spike time
propagation delay in a smoother fashion to ensure that all values
during spike contribute to the final value.
* clean up
On migration, when subscription moved from remote -> local,
transceiver caching was racing. Although a very small possibility,
it could happen like so
1. down track close
2. down track close callback fires go routine to close subscribed track
3. subscribed track close handler in subscription manager tries to
reconcile
4. reconcile adds subscribed track again
5. cannot find the cached transceiver as caching happens after down track
   close finishes in step 1 above. Although there are a couple of
   goroutine jumps (step 2 fires a goroutine to close the subscribed
   track and step 4 will reconcile in a goroutine too), it is
   theoretically possible that step 1 has not finished and hence the
   transceiver is not cached.
Fix is to move caching to before closing subscribed track.
- When audio is muted, server injects silence frames which moves the
time stamp forward and adjusts offset. That cannot be used against
publisher side sender report. Use a pinned version.
- Ignore small changes to propagation delay even while checking for a
  sharp increase. That check was spamming a lot for small changes, e.g.
  the existing delta is 100 microseconds or so and the new one is 300
  microseconds. Also rename `smoothed` to `longTerm` as it is a slowly
  varying long-term estimate of the propagation delay delta. And slow
  down that adaptation more.
* Forward publisher sender report.
The publisher side RTCP sender report is rebased to the SFU time base
and used to send sender reports to subscribers.
Will wait to merge till previous versions are out as this will require a
bunch of testing.
* - Add rebased report drift
- update protocol dep
- fix path change check, it has to check against delta of propagation
delay and not propagation delay as the two side clocks could be way
off.