With Read and ReadExtended waiting (they are two different goroutines),
use Broadcast always. In theory, they both should not be waiting at the
same time, but just being safe.
* Simplify time stamp calculation on switches.
Trying to simplify time stamp calculation on restarts.
The additional checks take effect rarely and it not worth the extra
complication.
Also, doing the reference time stamp in extended range.
The challenge with that is when publisher migrates the extended
timestamp could change post migration (i. e. post migration would not
know about rollovers). To address that, maintain an offset that is
updated on resync.
* WIP
* Revert to resume threshold
* typo
* clean up
* Handle large jumps in RTCP sender report timestamp.
Seeing cases of RTCP Sender Report spaced apart by more than half the
RTP Timestamp range. Maybe a case of laptop going to sleep and waking
up. Handle it using time diff from last report and calculating expected
timestamp.
* try go 1.22
* Connection quality LOST only if RTCP is also not available.
It is possible that sender stops all layers of video due to some
constraint (CPU or bandwidth). Packet reception going dry due to
that should not trigger `LOST` quality.
Add last received RTCP time also to distinguish the case
of real `LOST` and sender stopping traffic.
Some bits to watch for
- With audio, RTCP reports could be more than 5 seconds apart (5 seconds
is the default interval for connection quality scorer), but audio
senders usually send silence packets even when there is no input.
So audio completely stopping can be considered `LOST`.
- With video, have to observe if all clients continue to send RTCP even
if all layers are stopped.
- RTCP bandwidth is not supposed to exceed the primary stream bandwidth.
libwebrtc calculates that and spaces out RTCP reports accordingly.
That is the reason why audio reports are that far apart. If a video
stream is encoded at a very low bit rate, it could also be sending
RTCP rarely. So, there is the case of LOST being indistinguishable
from sender stopping all layers. But, this should be a rare case.
* typo
If the first packet of keyframe has template structure is lost then
subsequent packets rely on it will report invalid tempalte error which
is expected.
* Add support for "abs-capture-time" extension.
Currently, it is just passed through from publisher -> subscriber side.
TODO: Need to store in sequencer and restore for retransmission.
* abs-capture-time in retransmissions
* clean up
* fix test
* more test fixes
* more test fixes
* more test fixes
* log only when size is non-zero
* log on both sides for debugging
* add marshal/unmarshal
* normalize abs capture time to SFU clock
* comment out adding abs-capture-time from registered extensions
* Disable audio loss proxying.
Added a config which is off by default.
With audio NACKs, that is the preferred repair mechanism.
With RED, repair is built in via packet redundancy to recover from
isolated losses.
So, proxying is not required. But, leaving it in there with a config
that is disabled by default.
* fix test
* Prevent large spikes in propagation delay
A few tweaks
- Large spike in propagation delay due to congested channel results in
long term estimate getting high value. Ignore outliers in long term
estimate.
- Introduce a new field for adjusted arrival time as adjusting the
arrival time in place meant it got applied again across the relay and
that caused different propagation delay on remote nodes.
- Reset path change counters as long as there is any sample that is not
higher than the multiple of long term. There was a case of
o Sample with high value that triggered path change start.
o Then some samples with high enough delta, but did not meet the
criteria for increasing counter further.
o Some time later, another sample met the threshold and that triggered
a path change re-init.
* do not adapt to large delta
* Tweak adaptation to increase in propagation delay.
A couple of issues
- RTCP Sender Reports rate will vary based on underying track bitrate.
(at least in theory, not all entities will do it though, for example
SFU does standard rate of one per three seconds irrespective of track
bit rate). So, adapt the long term estimate of propagation delay delta
based on spacing of reports.
- Re-init of propagation delay to adapt to path change was taking the
last value before the switch. But, that one value could have been an
outlier and accepting it is not great. So, adapt spike time
propagation delay in a smoother fashion to ensure that all values
during spike contribute to the final value.
* clean up
- When audio is muted, server injects silence frames which moves the
time stamp forward and adjusts offset. That cannot be used against
publisher side sender report. Use a pinned version.
- Ignore small changes to propagation delay even while checking for
sharp increase. That is spamming a lot for small changes, i.e.
existing delta is 100 micro seconds or so and the new one is 300 micro
seconds. Also rename to `longTerm` from `smoothed` as it is a slow
varying long term estimate of propagation delay delta. And slow down
that adaptation more.
* Forward publisher sender report.
Publisher side RTCP sernfer report is rebased to SFU time base
and used to send sender rerport to subscriber.
Will wait to merge till previous versions are out as this will require a
bunch of testing.
* - Add rebased report drift
- update protocol dep
- fix path change check, it has to check against delta of propagation
delay and not propagation delay as the two side clocks could be way
off.
* Use start time stamp to calculate down stream sender report.
With first packet time adjustment, using the first time stamp is more
accurate.
This still suffers if the up stream clock rate changes (happens in cases
like noise suppression which is not well understood). Will be looking at
pass through of sender report from publisher to subscriber.
* similar log strings
* avoid early sender reports
* log messages
* Reduce first packet adjustment threshold to 15 seconds
* Buffer size config for video and audio.
There was only one buffer size in config.
In upstream, config value was used for video.
Audio used a hard coded value of 200 packets.
But, in the down stream sequencer, the config value was used for both
video and audio. So, if video was set up for high bit rate (deep
buffers), audio sequencer ended up using a lot of memory too in
sequencer.
Split config to be able to control that and also not hard code audio.
Another optimisation here would be to not instantiate sequencer unkess
NACK is negotiated.
* deprecate packet_buffer_size
Firefox on Windows 10 seems to be producing simulcast tracks with
duplicate RID. That causes a leak as only one buffer is processed.
Ignore duplicate rid.
NOTE: This is not perfect as the actual layer -> rid is indeterminable
at addition time. It would require looking at packets to determine the
video dimensions and match to rid/layer to figure out which one is
correct and which one is duplicate.
To simplify though, taking the first one and dropping later ones.
This could mean the correct resolution is not streamed, but that should
be okay. The leak is far more destructive.
Was at 20 when LOST was introduced, but was going to 20 even when under
not LOST conditions. When there are packets, want the min to be at 30.
Going down to 20 resulted in reporting LOST quality even when packets
were flowing (although they were experiencing heavy loss and quality
would have been very bad, yet they are not lost).
Also, sample warning about adding packet to bucket even more.
* Add debug to understand VP9 freezes.
Have reports of VP9 freezing in some rooms.
Some data indicates that NACKs are received by SFU, but cannot get RTP
packet when that happens. It is possible that the NACKs are all from
dropped packets. Adding some debug to understand drops/NACKs better.
* enable DD debug
* comment out DD debug
* markers
* add back log about diff length mismatch
* add back key frame mismatch logging
* log skipped drops also