* keep track of RTX bytes separately
* packet group
* Packet group of 50ms
* Minor refactoring
* rate calculator
* send bit rate
* WIP
* comment
* reduce packet infos size
* extended twcc seq num
* fix packet info
* WIP
* queuing delay
* refactor
* config
* callbacks
* fixes
* clean up
* remove debug file, fix rate calculation
* fmt
* fix probes
* format
* notes
* check loss
* tweak detection settings
* 24-bit wrap
* clean up a bit
* limit symbol list to number of packets
* fmt
* clean up
* lost
* fixes
* fmt
* rename
* fixes
* fmt
* use min/max
* hold on early warning of congestion
* make note about need for all optimal allocation on hold release
* estimate trend in congested state
* tweaks
* quantized
* fmt
* TrendDetector generics
* CTR trend
* tweaks
* config
* config
* comments
* clean up
* consistent naming
* participant level setting
* log usage mode
* probing hacks
* WIP
* no lock
* packet group config
* ctr trend refactor
* cleanup and fixes
* format
* debug
* format
* move prober to ccutils
* clean up
* clean up
On a resume, the signal stats will call `ParticipantLeft`. Although it
explicitly says not to send events, it could still close the stats
worker.
To handle that, this PR (https://github.com/livekit/livekit/pull/2982)
created a stats worker if needed in the `ParticipantResume`
notification, but that is not enough, as that event could happen before
the previous signal connection closes the stats worker.
A new stats worker does get created when `ParticipantJoined` is called
by the new signal connection, but it does not transfer connected state.
So, when the client leaves, `ParticipantLeft` is not sent.
I do not see why we should not always transfer connected state, given
that it is the same participant SID/session. But I have a feeling that
I am missing some corner case. Please let me know if I am missing
something here.
* Fix header size calculation in stats.
With the pacer inserting some extensions, the header size used in stats
(and, more importantly, when probing for bandwidth estimation and
metering the bytes to control the probes) was incorrect. The size
was effectively that of the incoming extensions. It would have been
close enough, though.
Anyhow, a bit of history:
- initially, the plan was to package all the necessary fields into the
pacer packet and have the pacer call back after sending, but that was
not great for a couple of reasons
- had to send a bunch of data that is useless (as far as the pacer is
concerned) into the pacer.
- callback on every packet (this is not bad, just a function call which
happens in the forward path too, but had to lug around the above
data).
- in the forward path, there is a rare edge case issue when calling the
stats update after pacer.Enqueue() - details in
https://github.com/livekit/livekit/pull/2085, but that is a rare case.
Because of those reasons, the update was placed in the forward path
before enqueue, but the header size issue was not noticed till now.
As a compromise, `pacer.Enqueue` returns the headerSize and payloadSize.
It uses a dummy header to calculate the size. The real extension will be
added just before sending the packet on the wire. pion/rtp replaces an
extension if one is already present, so the dummy would be replaced by
the real one before sending on the wire.
a21194ecfb/packet.go (L398)
This does reintroduce the rare edge case mentioned above, but it is very
rare and, even if it happens, not catastrophic.
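As a rough illustration of the dummy-extension idea (not the code in this PR; the helper name, extension ID, and length below are assumptions), the size could be computed with pion/rtp along these lines:

```go
package main

import (
	"fmt"

	"github.com/pion/rtp"
)

// headerAndPayloadSize is a hypothetical helper sketching how sizes could be
// reported: it sets a dummy (zero-filled) extension of the expected length so
// that the marshalled header size matches what will actually go on the wire.
// The real extension payload is written just before sending; pion/rtp's
// SetExtension replaces an existing extension with the same ID, so the dummy
// is overwritten rather than duplicated.
func headerAndPayloadSize(pkt *rtp.Packet, extID uint8, extLen int) (int, int, error) {
	if err := pkt.Header.SetExtension(extID, make([]byte, extLen)); err != nil {
		return 0, 0, err
	}
	return pkt.Header.MarshalSize(), len(pkt.Payload), nil
}

func main() {
	pkt := &rtp.Packet{
		Header:  rtp.Header{Version: 2, SequenceNumber: 1234, Timestamp: 5678, SSRC: 42},
		Payload: make([]byte, 1200),
	}
	// Assume a 2-byte transport-wide sequence number extension at ID 3 (illustrative).
	headerSize, payloadSize, err := headerAndPayloadSize(pkt, 3, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println("headerSize:", headerSize, "payloadSize:", payloadSize)
}
```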
* cleanup
* add extensions and dummy as well in downtrack to make pacer cleaner
* Use weighted loss to detect loss based congestion signal.
- Increase JQR min loss to 0.25.
- Use a weighted loss ratio so that a higher packet rate gets higher
weight. At the default config, 10 packets in 1/2 second will form a
valid packet group for loss based congestion signal consideration. Two
packets lost in that group may not be bad, so the JQR min loss was
bumped up to 0.25. However, 20% loss (or even much lower loss)
could be problematic if the packet rate is higher (potentially
multiple streams affected and there could be a lot of NACKs as a result).
So, weight it by packet rate so that higher packet rates enter JQR
at lower losses.
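As a hedged illustration only (not the actual formula or constants in this change; the function name, reference rate, and cap are assumptions), a packet-rate-weighted loss ratio could look like:

```go
package congestion

import (
	"math"
	"time"
)

// weightedLossRatio is a hypothetical sketch: it amplifies the raw loss ratio
// as the packet rate grows, so that a busier packet group crosses the JQR loss
// threshold at a lower raw loss. referencePacketRate and maxWeight are
// illustrative values, not the ones used in this change.
func weightedLossRatio(packetsLost, packetsTotal int, groupDuration time.Duration) float64 {
	if packetsTotal == 0 || groupDuration <= 0 {
		return 0
	}
	lossRatio := float64(packetsLost) / float64(packetsTotal)

	const (
		referencePacketRate = 20.0 // pps at which weighted and raw loss coincide (assumed)
		maxWeight           = 4.0  // cap on the amplification (assumed)
	)
	packetRate := float64(packetsTotal) / groupDuration.Seconds()
	weight := math.Max(1.0, math.Min(maxWeight, packetRate/referencePacketRate))

	return math.Min(1.0, lossRatio*weight)
}
```

With these assumed constants, 2 lost out of 10 packets over half a second (20 pps) stays at 0.20 and does not reach a 0.25 threshold, while the same 20% loss at 80 pps is weighted up to 0.80 and does.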
* WIP
* use aggregated loss
* Loss based congestion signal detector.
It uses the same approach of thresholding + duration to detect the
region of operation and further derive early warning/congested states.
A gutter is used for the indeterminate region, just like in the queuing
delay based case.
The two approaches (queuing delay and loss) are treated independently,
i.e. packet groups have to satisfy the same type of condition (queuing
delay OR loss) to build up congestion.
The aggregate congestion signal is triggered if either one triggers.
Maybe there is a way to accept hybrid signalling (i.e. each group
satisfying either threshold adds up to congestion signal detection), but
that needs more experimentation. For now, keeping them separate.
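To make the thresholding + duration + gutter idea concrete, here is a minimal sketch (type and field names, thresholds, and durations are assumptions, not the real implementation or config):

```go
package congestion

import "time"

type congestionState int

const (
	stateNone congestionState = iota
	stateEarlyWarning
	stateCongested
)

// lossSignalDetector is a hypothetical sketch: packet groups whose weighted
// loss is at or above jqrMinLoss extend the congestion run, groups at or below
// dqrMaxLoss reset it, and groups in the gutter between the two are treated as
// indeterminate. A state is declared only after the run has held long enough.
type lossSignalDetector struct {
	jqrMinLoss           float64       // e.g. 0.25
	dqrMaxLoss           float64       // e.g. 0.05
	earlyWarningDuration time.Duration // run length before early warning
	congestedDuration    time.Duration // run length before congested

	aboveSince time.Time // start of the current run of above-threshold groups
}

func (d *lossSignalDetector) update(weightedLoss float64, now time.Time) congestionState {
	switch {
	case weightedLoss >= d.jqrMinLoss:
		if d.aboveSince.IsZero() {
			d.aboveSince = now
		}
	case weightedLoss <= d.dqrMaxLoss:
		d.aboveSince = time.Time{} // clearly below the gutter: reset the run
	default:
		// gutter: indeterminate, neither extends nor resets the run
	}

	if d.aboveSince.IsZero() {
		return stateNone
	}
	switch held := now.Sub(d.aboveSince); {
	case held >= d.congestedDuration:
		return stateCongested
	case held >= d.earlyWarningDuration:
		return stateEarlyWarning
	default:
		return stateNone
	}
}
```

The aggregate signal would then be the OR of this detector and the queuing delay detector, as described above.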
* apply max threshold
* clean up
* spelling
* file output
* wake under lock
* feedback
* De-centralize some configs to where they are used.
And make the default values variables.
Renaming a bit, but these are all internal configs and have not been
added to the documented config.
* Keep documented config as is.
* test
* typo
* Some misc clean up.
- Have been seeing counterfeiter warnings about efficiency for a while,
due to the go:generate declaration appearing multiple times in the same
package.
Address that: https://github.com/maxbrunsfeld/counterfeiter?tab=readme-ov-file#step-2b---add-counterfeitergenerate-directives
- A bit more readability on parameters passed to `sendLeave`
* spacing
* revert some deletes as the complaint was in analytics service only
* Declare in package only once.
Although the warning is about having go:generate multiple times when
directly giving the interface to generate, having `go:generate` multiple
times in a package even with `-generate` ends up generating once per
invocation.
Declaring it once per package is enough to run the generation just once.
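For reference, the pattern from the linked counterfeiter README is a single `-generate` directive per package plus one `counterfeiter:generate` line per interface (the package and interface names below are illustrative):

```go
package mypackage

// One go:generate invocation per package is enough; it picks up every
// counterfeiter:generate directive in the package.
//go:generate go run github.com/maxbrunsfeld/counterfeiter/v6 -generate

// One directive per interface to fake; AnalyticsService is an illustrative name.
//counterfeiter:generate . AnalyticsService
type AnalyticsService interface {
	SendEvent(name string) error
}
```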
* Clean up drop ICE candidates.
With pion/ice v2.3.37, ICE Lite will accept use-candidate from peer.
So, there is no need to drop candidates.
Still leaving the FF change to not use Lite, which was added initially as
part of this effort due to how FF does nominations. Updated the comment
to explain why.
* clean up test
No functional effect, but it was logging more than the expected drift in
the downstream direction. The reason is that, when passing through, we
could be using an older report, but the adjustment was applied to the
monotonic clock and not to the RTP timestamp. So it looked like more
wall-clock time had elapsed for the same elapsed RTP clock, and a higher
than expected drift was logged. Correcting it so that the log is not
misleading/confusing.
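A small worked example of why adjusting only the monotonic clock inflates the reported drift (illustrative numbers only; names and values are assumptions, not the stats code):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const clockRate = 90000 // typical video RTP clock rate

	wallElapsed := 10 * time.Second
	rtpElapsedTicks := uint64(10 * clockRate) // the same 10s on the RTP clock

	adjustment := 500 * time.Millisecond // pass-through adjustment from an older report

	// Consistent: apply the adjustment to both sides (or neither); drift is 0.
	rtpElapsed := time.Duration(rtpElapsedTicks) * time.Second / clockRate
	fmt.Println("drift (consistent):", wallElapsed-rtpElapsed)

	// Inconsistent: adjust only the monotonic clock; drift appears 500ms larger.
	fmt.Println("drift (wall-only adjustment):", (wallElapsed+adjustment)-rtpElapsed)
}
```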
* Handle room configuration that's set in the grant itself
* ensure refresh token contains updates
* deps
* dep
---------
Co-authored-by: Paul Wells <paulwe@gmail.com>