data.
Without the check, it was getting tripped when the publisher was not
publishing any data. Both conditions returned nil, but in one case the
receiver report should have been received and there was simply no movement
in the number of packets.
* Run quality scorer when there are no streams.
In the down stream direction, receiver report is used for scoring.
If there are no receiver reports, it should go to `dry` state and report
poor quality.
Update the scorer on dry condition only when a score update has not
happened for longer than some multiple of the update interval. It cannot
update on every interval when there are no streams, as a receiver report
might just have been missed. Waiting longer ensures that the report was
definitely not received.
* update last stats time
When current became unavailable, it was possible for
target to be set to opportunistic. Because of that,
the downgrade did not happen and PLI layer lock was
requested continuously.
* A couple of stream allocator tweaks
- Do not overshoot on catch up. It so happens that during probe
the next higher layer is at some bit rate which is much lower
than normal bit rate for that layer. But, by the time the probe
ends, publisher has climbed up to normal bit rate.
So, the probe goal although achieved is not enough.
Allowing overshoot latches on the next layer which might be more
than the channel capacity.
- Use a collapse window to record values in case of only one
or two changes in an evaluation window. Sometimes it happens
that the estimate falls once or twice and stays there. With repeated
values collapsed, it could be a long time before that fall in estimate
is processed. Introduce a collapse window and record a duplicate value
if no value was recorded for the collapse window duration. This allows
delayed processing of those isolated falls in estimate.
* minor clean up
* add a probe max rate
* fix max
* use max of committed, expected for max limiting
* have to probe at goal
When detecting congestion based on loss, it is possible that
the loss based signal triggers earlier and the estimate based
signal is lagging. In those cases, check against last received
estimate and if that is lower than loss based throttling, use that.
Without this, it was possible for the current usage to stay high.
Loss based throttling may not dial things back far enough to pause
the stream. Ideally, congestion should hit again and it should be dialled
down further and eventually pause, but there are situations it never
dials back far enough to pause.
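A sketch of that check, with hypothetical names. Treating a zero estimate
as "no estimate received yet" is an assumption made for this example.

```go
package main

// throttledCapacity (hypothetical) picks the capacity to use when loss
// based congestion detection fires. If the last received estimate is
// lower than the loss based value, the estimate wins.
// Assumption: lastEstimate == 0 means no estimate has been received.
func throttledCapacity(lossBased, lastEstimate int64) int64 {
	if lastEstimate > 0 && lastEstimate < lossBased {
		return lastEstimate
	}
	return lossBased
}
```

For example, loss based throttling to 1 Mbps with a last received estimate
of 600 kbps yields 600 kbps, which gives the allocator a better chance of
dialling back far enough to pause.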
* Do not let request layer overshoot available.
After a layer stopped on the publisher side, the optimal allocation was
initially adjusted to not request the stopped layer, but a subsequent
allocation went back to the higher layer although it was stopped.
Prevent that.
* simplify
* Support simulating subscriber bandwidth.
When non-zero, a full allocation is triggered.
Also, probes are stopped.
When set to zero, normal probing mechanism should catch up.
Adding `allowPause` override which can be a connection option.
* fix log
* allowPause in participant params
* Decode chains
* clean up
* clean up
* decode targets only on publisher side
* comment out supported codecs
* fix test compile
* fix another test compile
* Adding TODO notes
* chainID -> chainIdx
* do not need to check for switch up point when using chains, as long as chain integrity is good, can switch
* more comments
* address comments
* Use bandwidth requested from last allocation.
With overshoot/opportunistic forwarding, it is possible that
bitrate at target layers is 0. So, use bandwidth requested
from last allocation which should have a correct value.
Still need to think about using the latest bit rates to get
the requested bandwidth. It is possible that bitrates have
changed since last allocation. That was the idea behind using
the latest bitrates, but it could return 0. Accounting for it
runs into a few scenarios. The bandwidth requested in the last
allocation is a good indicator of the need.
* race
Hopefully temporary while we can find a better solution.
Adds 36 KB per SSRC. So, if a node can handle 10K SSRCs (roughly 10K
tracks), that will be 360 MB of extra memory.
* Update go deps
Generated by renovateBot
* use generics with Deque
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
This fixes the case of screen share forwarding. We should probably also
look at proper AddTrack. The problem was that
- AddTrack used two layers for screen share from JS sample app
- Track was published with rid = f. Given that and the track info,
consistent layer mapping set the layer as 1.
- `getBufferLocked` always uses the highest layer for SVC
- Between the two, when down track was requesting PLI, there was
no buffer at the requested layer and hence no PLI went out.
A few other notes
- Tried locking SVC to layer 0 (instead of layer 2), but that resulted
in PLI layer lock spamming. It did not happen in v1.3.0 of the server
though. Not sure what causes that. Need to investigate later.
But, that does not happen when using layer 2 buffer as SVC buffer.
- When using layer 2 for SVC, the PLI throttle config will be using that
of layer 2. Is that okay?
- `buffer` structure should maintain more stats about spatial layers for
SVC case so that layer stats can be reported to analytics/scoring etc.
- In general, `buffer` may need some more hooks to make it SVC aware so
that it can handle various spatial layer aware/specific bits.
* Ensure sequence number continuity
When using Go SDK (livekit-cli or egress) as a client,
SFU sends blank frames when audio track is muted to ensure that
Pion OnTrack fires on GoSDK side. That resulted in a huge sequence
number/time stamp jump when the real stream started.
Ensure continuity by creating random sequence number/time stamp when
starting with blank frames. And when sequence number/time stamp is
initialized using SetLastSnTs, continue sequence if it was already
initialized.
* remove debug
A few more candidates to think about demoting
- Publisher mute changes
- Forwarder -> layer lock/upgrade/downgrade/overshoot adjusting
- StreamAllocator
* Discount upstream + processing jitter from down stream jitter.
Jitter in RTCP Receiver Report from down stream tracks includes
jitter from up stream tracks and any processing in forwarding path.
As packets are forwarded without any buffering (i.e. no de-jittering)
in the SFU, any up stream jitter will carry forward.
While taking delta stats (which are used for connection quality and
reporting to analytics), discount the up stream + processing jitter so
that connection quality score of down stream track is not penalized
due to up stream + processing jitter.
NOTE: Not discounting it in RTP stats ToString/ToProto methods as
that information is useful to have for analysis/debugging.
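A rough sketch of the discounting. The function name is hypothetical, and
a plain subtraction clamped at zero is an assumption; the actual delta
stats code may combine the values differently.

```go
package main

// discountJitter (hypothetical) removes up stream + processing jitter from
// the jitter reported by a down stream receiver. Both values must be in
// the same unit. The result is clamped at zero so that a noisy up stream
// measurement cannot produce a negative down stream jitter.
func discountJitter(downstream, upstream float64) float64 {
	if upstream >= downstream {
		return 0
	}
	return downstream - upstream
}
```

With 30 units of reported down stream jitter and 12 units attributed to up
stream + processing, the scorer would see 18.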
* fix typo
A few things
1. Have to use expected layer in upstream distance to desired. Using
min(published, expected) means if expected is higher than published, it was not caught as a missed layer.
2. Forgot to remove layer transition update in one place. It was still constrained to screen share.
This caused quality to not pick up after constraint is released.
3. Switching to max layer cannot be marked on max published. Same as point #1 above. Otherwise,
dynacast would kick in and turn off highest layer.
There are cases where the layer bit rate configuration is such that
the expected bitrate difference is very high. For example,
setting up layer 2 (f) for 1.7 Mbps and layer 1 (h) for 180 kbps.
With bitrate based quality, a layer drop results in going to `POOR`
quality rating. With layer based, it will drop one level only.
Also, cleaning up the distance to desired calculation a bit.
With push model (i.e. connection quality evaluation triggered
by reception of RTCP receiver report), it is possible that a report
is received quickly after a track is started (especially with video).
Those should not trigger a quality evaluation.
Set `lastStatsAt` in `Start` routine and ensure that start has been
called and enough time has passed since last stats time to avoid
small windows.