data.
Without the check, it was getting tripped by the publisher not publishing
any data. Both conditions returned nil, but in one case a receiver
report had been received while the number of packets had not moved.
* Decode chains
* clean up
* clean up
* decode targets only on publisher side
* comment out supported codecs
* fix test compile
* fix another test compile
* Adding TODO notes
* chainID -> chainIdx
* no need to check for a switch-up point when using chains; as long as chain integrity is good, we can switch
* more comments
* address comments
Hopefully temporary until we find a better solution.
Adds 36 KB per SSRC. So, if a node can handle 10K SSRCs (roughly 10K
tracks), that will be 360 MB of extra memory.
* Update go deps
Generated by renovateBot
* use generics with Deque
---------
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
A few more candidates to think about demoting
- Publisher mute changes
- Forwarder -> layer lock/upgrade/downgrade/overshoot adjusting
- StreamAllocator
* Discount upstream + processing jitter from downstream jitter.
Jitter in the RTCP Receiver Report from downstream tracks includes
jitter from upstream tracks and any processing in the forwarding path.
As packets are forwarded without any buffering (i.e. no de-jittering)
in the SFU, any upstream jitter will carry forward.
While taking delta stats (which are used for connection quality and
reporting to analytics), discount the upstream + processing jitter so
that the connection quality score of the downstream track is not penalized
due to upstream + processing jitter.
NOTE: Not discounting it in RTP stats ToString/ToProto methods as
that information is useful to have for analysis/debugging.
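The discount described above can be sketched as a simple subtraction floored at zero. This is an illustrative helper, not the actual livekit implementation; the function name and units are assumptions.

```go
package main

import "fmt"

// discountJitter illustrates the idea: jitter observed on a downstream
// track includes jitter inherited from the upstream track plus
// forwarding-path processing, so the downstream-only component is the
// difference, floored at zero. Name and float units are hypothetical.
func discountJitter(downstreamJitter, upstreamJitter float64) float64 {
	adjusted := downstreamJitter - upstreamJitter
	if adjusted < 0 {
		return 0
	}
	return adjusted
}

func main() {
	// e.g. 12 ms observed downstream, 9 ms inherited from upstream
	fmt.Println(discountJitter(12.0, 9.0)) // 3
	// upstream jitter can exceed the downstream measurement in a window
	fmt.Println(discountJitter(5.0, 9.0)) // 0
}
```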
* fix typo
There are cases where the layer bit rate configuration is such that
the expected bitrate difference between layers is very high. For example,
setting layer 2 (f) to 1.7 Mbps and layer 1 (h) to 180 kbps.
With bitrate-based quality, a layer drop results in going to a `POOR`
quality rating. With layer-based, it will drop one level only.
Also, cleaning up the distance to desired calculation a bit.
* Push/pull for connection stats/quality scoring.
Was not happy with the pure pull method missing a window because
RTCP RR timing is slightly off for audio, which meant a much
larger window of data was used in the next update.
That also resulted in RTP stats getting some bits of code.
As that is per-packet processing, it was not a good idea.
Switching to a push-pull method.
For the up track, it is pull, i.e. the connection stats worker will pull stats.
For the down track, there is a new notification about receiver report
reception. Using this to check whether it is time to run stats. And adding a bit
of tolerance to the processing window (currently set so that processing runs as long
as the elapsed time is > 95% of the usual processing interval). This allows two things:
- for video, RTCP RRs are more frequent, but we will still not process
till enough time has passed
- for audio, the RTCP RR could be once in 5 seconds or so. Can process when
it is available rather than miss a window and use a much larger window
later.
* uber atomic
* Connection quality misc changes
1. Call scorer.Update() with nil stat when no data available so that
scorer can synthesise window with proper window time.
2. Subtract out loss in interval to account for packets not sent at
all.
3. Fix `packetsNotFound` variable in `getIntervalStats`. I remember this
working at some point. Not sure if I fat-fingered it in another PR and
deleted the increment line.
4. Log a bit more when no packets are expected. Those logs can get noisy,
especially when the track is muted. But, seeing some unexplained
instances of no packets leading to quality drops. So, temporary logging
to get a bit more information.
* correct spelling
* Limit packet score minimum to 0.0
* Make connection quality not too optimistic.
With score normalization, the quality indicator showed good
under conditions which should normally have shown some degradation.
So, a few things in this PR:
- Do not normalize scores
- Pick the weakest link as the representative score (moving away from
averaging)
- For the down track direction, when reporting delta stats, take the number
of packets actually sent. If there are holes in the feed (upstream
packet loss), down tracks should not be penalised for that loss.
State of things in connection quality feature
- Audio uses rtcscore-go (with a change to accommodate RED codec). This
follows the E-model.
- Camera uses rtcscore-go. No change here. NOTE: The rtcscore here is
purely based on bits per pixel per frame (bpf). This has the following
existing issues (no change, these were already there):
o Does not take packet loss, jitter, RTT into account
o Expected frame rate is not available. So, measured frame rate is
used as expected frame rate also. If expected frame rate were available,
the score could be reduced for lower frame rates.
- Screen share tracks: No change. This uses the very old simple loss
based thresholding for scoring. As the bit rate varies a lot based on
content, and the rtcscore video algorithm used for camera relies on
bits per pixel per frame, this could produce a very low value
(large width/height encoded in a small number of bits because of static content)
and hence a low score. So, the old loss based thresholding is used.
* clean up
* update rtcscore pointer
* fix tests
* log lines reformat
* WIP commit
* WIP commit
* update mute of receiver
* WIP commit
* WIP commit
* start adding tests
* take min score if quality matches
* start adding bytes based scoring
* clean up
* more clean up
* Use Fuse
* log quality drop
* clean up debug log
* - Use number of windows for wait to make things simpler
- track no layer expected case
- always update transition
- always call updateScore
* Change lock scope of access to RTCP sender report data.
Forwarder calls back to get the timestamp offset.
Holding the buffer lock there is a much bigger lock scope.
Reduce the lock scope and cache the latest sender report under its own lock.
And use that cache when calculating the timestamp offset.
* move sr cache to stream tracker manager for re-use in relay
* cache before spread
* Use purely RR-based RTT.
With normalization of the NTP timestamp to local time, we
don't need to keep track of the publisher's NTP time + the local time
when a report is sent. RTT calculation can happen with the RR only.
Also, do not log errors when RTT cannot be calculated due to
no last SR. This can happen if the receiver sends an RR before
it receives an SR. As the SFU is sending SRs once in 5 seconds, it is
possible some RRs happen before the first SR.
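The RR-only calculation follows the standard RFC 3550 formula: RTT = arrival time - LSR - DLSR, all in the middle-32-bit NTP format (upper 16 bits whole seconds, lower 16 bits fraction, i.e. units of 1/65536 s). A minimal sketch, with hypothetical names, including the "no last SR yet" case that should not be logged as an error:

```go
package main

import "fmt"

// rttFromRR computes RTT from a receiver report alone, per RFC 3550:
// RTT = arrival - lastSR - delaySinceLastSR, in 1/65536 s units.
// Returns ok=false when the remote end has not seen an SR yet, which
// is expected (not an error) when RRs precede the first SR.
func rttFromRR(arrivalNTP32, lastSR, dlsr uint32) (rttMs uint32, ok bool) {
	if lastSR == 0 {
		return 0, false
	}
	diff := arrivalNTP32 - lastSR - dlsr
	// convert 1/65536 s units to milliseconds
	return uint32(uint64(diff) * 1000 / 65536), true
}

func main() {
	// 1 s between SR send and RR arrival, receiver held the report 0.5 s
	rtt, ok := rttFromRR(165536, 100000, 32768)
	fmt.Println(rtt, ok) // 500 true

	// RR arrived before the first SR: no RTT, and nothing to log
	_, ok = rttFromRR(165536, 0, 0)
	fmt.Println(ok) // false
}
```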
* use error type
* correct error name
* Use local time base for NTP in RTCP Sender Report for downtracks.
More details in comments in code.
* Remove debug
* RTCPSenderReportInfo -> RTCPSenderReportDataExt
* Get rid of sender report data pointer checks
* some additional logging
* Do not use local timestamp when sending RTCP Sender Report
As local time does not take into account the transmission delay
of the publisher-side sender report, using local time to calculate
the offset is not accurate.
Calculate the NTP timestamp based on the difference in RTP time.
Notes in code about some shortcomings of this, but it should
give better RTT numbers. I think RTT numbers were bloated because of
using the local timestamp.
* WIP commit
* comment
* clean up
* remove unused stuff
* cleaner comment
* remove unused stuff
* remove unused stuff
* more comments
* TrackSender method to handle RTCP sender report data
* fix test
* push rtcp sender report data to down tracks
* Need payload type for codec id mapping in relay protocol
* rename variable a bit
* Split stream tracker impl from base
* slight re-arrangement of code
* fps based stream tracker
* MinFPS config
* switch back to packet based tracker
* use video config by default to handle sources without type
When switching from local -> remote or remote -> local,
the forwarder state is cached and restored after the switch
to ensure continuity in sequence number/timestamp.
But, if the forwarder had not started before the switch,
the sequence number always starts at 1 because of seeding.
So, do not seed unless the forwarder was started before the switch.
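The guard above can be sketched as follows. The struct and function names are illustrative, not the actual livekit types:

```go
package main

import "fmt"

// forwarderState is a hypothetical stand-in for the cached state.
type forwarderState struct {
	started bool
	lastSN  uint16
	lastTS  uint32
}

// maybeSeed restores cached state across a local<->remote switch only
// if the pre-switch forwarder had actually started; otherwise seeding
// would pin the sequence number stream to start at 1.
func maybeSeed(f *forwarderState, cached forwarderState) {
	if !cached.started {
		return // nothing meaningful to restore
	}
	*f = cached
}

func main() {
	var f forwarderState
	maybeSeed(&f, forwarderState{}) // forwarder never started: no-op
	fmt.Println(f.started, f.lastSN) // false 0

	maybeSeed(&f, forwarderState{started: true, lastSN: 4321, lastTS: 99})
	fmt.Println(f.lastSN) // 4321
}
```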
* Prevent RTX buffer and forwarding path colliding
Received packets are put into the RTX buffer, which is
a circular buffer, and the packet (sequence number) is
queued for forwarding. If the RTX buffer fills up
and cycles before forwarding happens, forwarding
would pick the wrong packet (as it is holding a
reference to a byte slice in the RTX buffer) to forward.
Prevent it by moving the read from the RTX buffer to just
before forwarding. This adds an extra copy from the RTX buffer
-> temp buffer for forwarding, but ensures that the forwarding
buffer is not used by another goroutine.
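A minimal sketch of the collision and the fix, using a toy ring buffer (not the actual RTX buffer implementation): the reader copies the packet into its own buffer at forward time, and a write counter detects when the ring has cycled past the packet so it can be dropped instead of forwarding the wrong bytes.

```go
package main

import (
	"errors"
	"fmt"
)

// ringBuffer is a toy stand-in for the RTX circular buffer.
type ringBuffer struct {
	slots   [][]byte
	written uint64 // total packets ever written
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{slots: make([][]byte, size)}
}

// write stores a copy of pkt and returns its index.
func (r *ringBuffer) write(pkt []byte) uint64 {
	idx := r.written
	r.slots[idx%uint64(len(r.slots))] = append([]byte(nil), pkt...)
	r.written++
	return idx
}

// readInto copies packet idx into the caller-owned buf just before
// forwarding; it fails if the ring has already cycled past idx.
func (r *ringBuffer) readInto(idx uint64, buf []byte) (int, error) {
	if r.written > idx+uint64(len(r.slots)) {
		return 0, errors.New("packet overwritten, drop it")
	}
	return copy(buf, r.slots[idx%uint64(len(r.slots))]), nil
}

func main() {
	rb := newRingBuffer(2)
	idx := rb.write([]byte{1, 2, 3})

	buf := make([]byte, 1500) // forwarding path owns this buffer
	n, err := rb.readInto(idx, buf)
	fmt.Println(n, err) // 3 <nil>

	rb.write([]byte{4})
	rb.write([]byte{5}) // ring cycles; packet idx is overwritten
	_, err = rb.readInto(idx, buf)
	fmt.Println(err != nil) // true
}
```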
* Revert some changes from previous commit
Details:
- Do all forward processing as before.
- One difference: do not load the raw packet into ExtPacket.
- Load the raw packet into the provided buffer when the module that reads
using ReadExtended calls that function. If the packet is
not there in the retransmission buffer, that packet will be
dropped. This is the case we are trying to fix, i.e. the RTX
buffer has cycled before ReadExtended could pull the packet.
This makes a copy into the provided buffer so that the data
does not change underneath.
* Remove debug comment
* Oops missed a function call
* Seed snapshots
- For one cycle after seeding, the delta snapshot can get a huge gap
because of the snapshot initializing from the start if not present. Not
a huge deal as it should not affect functionality, but saving/restoring
the snapshot (at least with the down track) is a big deal. So just do it.
- Have been seeing a bunch of cases of delta stats getting a lot of
packets due to a (what seems like) out-of-order receiver report. So,
save the receiver report and log it when out-of-order is detected
to understand if they are closely spaced or something else could be
happening.
* Remove comment that does not apply anymore
* log current time and RR
Have been seeing a few instances of "too many packets expected in delta"
when trying to generate an RTCP SR on the down track. Actual sequence numbers
indicate that the start is after the end.
As down track RTPStats are driven by the receiver report, wondering if we
are getting RTCP_RR out-of-order somehow, causing this to happen.
Cannot find any other reason for this.
So, accepting an RTCP_RR-based update only if the sequence number is higher
than the existing one, and also logging a warning with sequence numbers if they
look out-of-order.
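The acceptance guard described above can be sketched like this. Field and function names are illustrative; the real code would also log the old/new sequence numbers on rejection:

```go
package main

import "fmt"

// acceptRR applies a receiver-report-driven update only when its
// extended highest sequence number advances past the last accepted
// one; otherwise the report is treated as out-of-order (or duplicate)
// and skipped so it can be logged instead.
func acceptRR(lastExtHighestSN *uint32, reportExtHighestSN uint32) bool {
	if reportExtHighestSN <= *lastExtHighestSN {
		return false // out-of-order or duplicate receiver report
	}
	*lastExtHighestSN = reportExtHighestSN
	return true
}

func main() {
	var last uint32 = 1000
	fmt.Println(acceptRR(&last, 1010)) // true: sequence number advanced
	fmt.Println(acceptRR(&last, 1005)) // false: arrived out-of-order
}
```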