* Participant traffic load.
Capturing information about participant traffic
- Upstream/Downstream
- Audio/Video/Data
- Packets/Bytes
This captures a notion of how much traffic load a participant is
generating.
Can be used to make allocation decisions.
* Clean up
* SIP patches
* reporter goroutine
* unlock
* move traffic stats from protocol
* check type
* Use 32-bit time stamp to get reference time stamp on a switch.
With relay and dyncast and migration, it is possible that different
layers of a simulcast get out of sync in terms of extended type,
i. e. layer 0 could keep running and its timestamp could have
wrapped around and bumped the extended timestamp. But, another layer
could start and stop.
One possible solution is sending the extended timestamp across relay.
But, that breaks down during migration if publisher has started afresh.
Subscriber could still be using extended range.
So, use 32-bit timestamp to infer reference timestamp and patch it with
expected extended time stamp to derive the extended reference.
* use calculated value
* make it test friendly
* WIP commit
* WIP commit
* WIP commit
* Some clean up
- Removed a chatty debug log
- some spelling, punctuation correction in comments
- missed an `Abs` in check, add it.
* Delete down track from receiver in close always.
I think with the parallel close in goroutines, it so happens that
peer connection can get closed first and unbind the track.
The delete down track and RTCP reader close was inside if `bound` block.
So, they were not running leaving a dangling down track in the receiver.
* fix tests
* fix test
It is possible that publisher paces the media.
So, RTCP sender report from publisher could be ahead of
what is being fowarded by a good amount (have seen up to 2 seconds
ahead). Using the forwarded time stamp for RTCP sender report
in the down stream leads to jumps back and forth in the down track
RTCP sender report.
So, look at the publisher's RTCP sender report to check for it being
ahead and use the publisher rate as a guide.
* Keep track of expected RTP time stamp and control drift.
- Use monotonic clock in RTCP Sender Report and packet times
- Keep the time stamp close to expected time stamp on layer/SSRC
switches
* clean up
* fix test compile
* more test compile failures
* anticipatory clean up
* further clean up
* add received sender report logging
* Make connection quality not too optimistic.
With score normalization, the quality indicator showed good
under conditions which should have normally showed some badness.
So, a few things in this PR
- Do not normalize scores
- Pick the weakest link as the representative score (moving away from
averaging)
- For down track direction, when reporting delta stats, take the number
of packets sent actually. If there are holes in the feed (upstream
packet loss), down tracks should not be penalised for that loss.
State of things in connection quality feature
- Audio uses rtcscore-go (with a change to accommodate RED codec). This
follows the E-model.
- Camera uses rtcscore-go. No change here. NOTE: THe rtscore here is
purely based on bits per pixel per frame (bpf). This has the following
existing issues (no change, these were already there)
o Does not take packet loss, jitter, rtt into account
o Expected frame rate is not available. So, measured frame rate is
used as expected frame rate also. If expected frame rate were available,
the score could be reduced for lower frame rates.
- Screen share tracks: No change. This uses the very old simple loss
based thresholding for scoring. As the bit rate varies a lot based on
content and rtcscore video algorithm used for camera relies on
bits per pixel per frame, this could produce a very low value
(large width/height encoded in a small number of bits because of static content)
and hence a low score. So, the old loss based thresholding is used.
* clean up
* update rtcscore pointer
* fix tests
* log lines reformat
* WIP commit
* WIP commit
* update mute of receiver
* WIP commit
* WIP commit
* start adding tests
* take min score if quality matches
* start adding bytes based scoring
* clean up
* more clean up
* Use Fuse
* log quality drop
* clean up debug log
* - Use number of windows for wait to make things simpler
- track no layer expected case
- always update transition
- always call updateScore
When the publisher stops publishing, the individual receivers would close
attached DownTracks first before notifying MediaTrackReceiver callbacks.
This means #1454 does not fix the issue entirely since there is still a window
when we can subscribe to a closing track.
Addressing edge case where a layer stopped before bitrate could be
measured. Purely bit rate based change deduction missed this as
the before and after did not have bit rates.
Use available layers to look for changes, especially currently
forwarding layer going away.
Also, simplifying bits. Only in the optimal allocation path,
these things are required. When congested, bitrate is always needed.
So, for optimal path, just look at available layer changes and adjust.
Don't need to look for bitrate based layer changes. Clean up that code.
Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are:
improve subscription resilience by separating desired state and current state
reduce complexity of synchronous processing
better detect failures with the ability to trigger full reconnect
* Use local time base for NTP in RTCP Sender Report for downtracks.
More details in comments in code.
* Remove debug
* RTCPSenderReportInfo -> RTCPSenderReportDataExt
* Get rid of sender report data pointer checks
* WIP commit
* comment
* clean up
* remove unused stuff
* cleaner comment
* remove unused stuff
* remove unused stuff
* more comments
* TrackSender method to handle RTCP sender report data
* fix test
* push rtcp sender report data to down tracks
* Need payload type for codec id mapping in relay protocol
* rename variable a bit
* Promoting a few logs to Info
Also, adding a couple of more info logs which I will remove later
after some debugging.
* mime type
* Protect pause/max layer
* notify even if not bound
* WIP commit
* Connection quality changes
- Fix Firefox showing poor quality
o The issue was that we were using max available layer and
calculating quality. The rationale being that even if
server sends dynacast messages, client may not implement
dynacast and still stream all layers. But, with Firefox
(maybe a Firefox bug), it sends some small amount of
data on layer 2 even when that layer is disabled.
Guessing it is probing (or actually we might be using
some small value for high layers as Firefox cannot turn off
layers). That higher layer gets used in quality calculation.
As the bit rate on that layer is extremely low, it yields low
score.
Fixed by considering the max expected layer. That is of most
interest. Yes, clients may ignore dynacast and stream all layers,
but, max expected is the one of interest. So, look for
quality in the max expected layer and not max available layer.
- Lots of clean up around connection quality stuff
o Use a dynamic scaling thing to ensure that we do not get bitten
by absolute values. Calculate best possible scenario score and
map that to maximum MOS score. This will ensure that different
codecs, different settings do not mess up the scoring. For example,
a client might use 1 Mbps for 720p, but a different client could
use 2 Mbps for 720p. As an SFU/infrastructure middlebox, we do
not have control over quality at those rates. We can only ensure
that streaming happens smoothly at those rates. So, in that
example, for client 1, 1 Mbps will map to MOS 5.0 and for client 2,
2 Mbps will map to MOS 5.0. Any impairments after that will
reflect in the score.
o Penalise for missing target layer by one level for one layer missed.
o Move tests to connection quality directory. The participant test
was not super useful.
* Add missed file
* Remove debug code
* use more constants and initialise normalisation factor
* rtcscore pointer
* Set DtxDisabled from TrackInfo in score calculation.
Also, fix sending connection quality upate on a new subscription.
* comments tweaks
* Move TrackInfo into StreamTrackerManager as this is used by cloud as well
* Separate from ion-sfu
changes:
1. extract pkg/buffer, twcc, sfu, relay, stats, logger
2. to solve cycle import, move ion-sfu/pkg/logger to pkg/sfu/logger
3. replace pion/ion-sfu => ./
reason: will change import pion/ion-sfu/pkg/* to livekit-server/pkg/*
after this pr merged. Just not change any code in this pr, because it
will confused with the separate code from ion-sfu in review.
* Move code from ion-sfu to pkg/sfu
* fix build error for resovle conflict
Co-authored-by: cnderrauber <zengjie9004@gmail.com>