Commit Graph

2365 Commits

Author SHA1 Message Date
Raja Subramanian
aa2ce22655 Stringer interface (#3181) 2024-11-16 10:14:37 +05:30
Raja Subramanian
6509cdb5ea StreamAllocator (congestion controller) refactor (#3180)
* refactor WIP

* WIP

* compiling

* runlock

* fixes

* fmt

* stringer and unlikely logger

* clean up
2024-11-16 03:06:37 +05:30
Raja Subramanian
eceada8b31 use spatialLayer var (#3178)
* use spatialLayer var

* lower end check
2024-11-15 03:13:53 +05:30
Raja Subramanian
11deab22d4 Clean up forwardRTP function a bit. (#3177)
- Pass in the buffer, don't read it every time through the loop
- cache stream trackers and avoid getting from stream tracker manager
  every time.
2024-11-15 02:49:43 +05:30
Raja Subramanian
adaf56a30d Move Prober to ccutils. (#3175)
* keep track of RTX bytes separately

* packet group

* Packet group of 50ms

* Minor refactoring

* rate calculator

* send bit rate

* WIP

* comment

* reduce packet infos size

* extended twcc seq num

* fix packet info

* WIP

* queuing delay

* refactor

* config

* callbacks

* fixes

* clean up

* remove debug file, fix rate calculation

* fmt

* fix probes

* format

* notes

* check loss

* tweak detection settings

* 24-bit wrap

* clean up a bit

* limit symbol list to number of packets

* fmt

* clean up

* lost

* fixes

* fmt

* rename

* fixes

* fmt

* use min/max

* hold on early warning of congestion

* make note about need for all optimal allocation on hold release

* estimate trend in congested state

* tweaks

* quantized

* fmt

* TrendDetector generics

* CTR trend

* tweaks

* config

* config

* comments

* clean up

* consistent naming

* participant level setting

* log usage mode

* probing hacks

* WIP

* no lock

* packet group config

* ctr trend refactor

* cleanup and fixes

* format

* debug

* format

* move prober to ccutils

* clean up

* clean up
2024-11-15 00:05:59 +05:30
Paul Wells
9e4dae7107 add per message deflate to signal ws (#3174) 2024-11-14 00:35:30 -08:00
Raja Subramanian
cc22306047 Attempt to fix missing participant left webhook. (#3173)
On a resume, the signal stats will call `ParticipantLeft`. Although it
explicitly says not to send events, it could still close the stats
worker.

To handle that, we created a stats worker if needed in
`ParticipantResume` notification in this PR
(https://github.com/livekit/livekit/pull/2982), but that is not enough
as that event could happen before previous signal connection closes the
stats worker.

A new stats worker does get created when `ParticipantJoined` is called
by the new signal connection, but it does not transfer connected state.
So, when the client leaves, `ParticipantLeft` is not sent.

I am not seeing why we should not transfer connected state always given
that it is the same participant SID/session. But, I have a feeling that
I am missing some corner case. Please let me know if I am missing
something here.
2024-11-14 10:59:15 +05:30
David Zhao
84cb14695f Fix incorrect computation of SecondsSinceNodeStatsUpdate (#3172)
Stats.UpdatedAt is in seconds, but we were loading as nanosecs
2024-11-12 01:13:08 -06:00
Raja Subramanian
41fbcec2cd Fix header size calculation in stats. (#3171)
* Fix header size calculation in stats.

With pacer inserting some extensions, the header size used in stats
(and more importantly when probing for bandwidth estimation and
metering the bytes to control the probes) was incorrect. The size
was effectively that of the incoming extensions. It would have been
close enough though.

Anyhow, a bit of history
- initially was planning on packaging all the necessary fields into
  pacer packet and pacer would callback after sending, but that was not
  great for a couple of reasons
  - had to send in a bunch of useless data (as far as pacer is
    concerned) into pacer.
  - callback every packet (this is not bad, just a function call which
    happens in the forward path too, but had to lug around the above
    data).
- in the forward path, there is an edge case issue when calling stats update
  after pacer.Enqueue() - details in https://github.com/livekit/livekit/pull/2085,
  but that is a rare case.

Because of those reasons, the update was placed in the forward path
before enqueue, but did not notice the header size issue till now.

As a compromise, `pacer.Enqueue` returns the headerSize and payloadSize.
It uses a dummy header to calculate size. Real extension will be added
just before sending packet on the wire. pion/rtp replaces extension if
one is already present. So, the dummy would be replaced by the real one
before sending on the wire.
a21194ecfb/packet.go (L398)

This does introduce back the second rare edge case, but that is very
rare and even if it happens, not catastrophic.

* cleanup

* add extensions and dummy as well in downtrack to make pacer cleaner
2024-11-12 10:53:57 +05:30
Raja Subramanian
a825661aff Use weighted loss to detect loss based congestion signal. (#3169)
* Use weighted loss to detect loss based congestion signal.

- Increase JQR min loss to 0.25.
- Use weighted loss ratio so that higher packet rates carry more
  weight. At default config, 10 packets in 1/2 second will form a
  valid packet group for loss based congestion signal consideration. Two
  packets lost in that group may not be bad. So, bumped up the
  JQR min loss to 0.25. However, 20% loss (or even much lower loss)
  could be problematic if the packet rate is higher (potentially
  multiple streams affected and there could be a lot of NACKs as a result).
  So, weight it by packet rate so that higher packet rates enter JQR
  at lower losses.

* WIP

* use aggregated loss
2024-11-12 09:21:28 +05:30
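
The idea of weighting loss by packet rate in the commit above can be sketched as follows. The commit text does not give the actual weighting function, so the linear, clamped weight below is an assumption, as are all identifiers and the `maxWeight` value. Only the 0.25 minimum loss and the "10 packets in 1/2 second" reference point come from the message.

```go
package main

import "fmt"

const (
	// 10 packets in half a second, the valid group size mentioned in the commit.
	referencePacketRate = 20.0
	// JQR min loss from the commit text.
	minLoss = 0.25
	// Hypothetical cap on the weight, chosen only for this illustration.
	maxWeight = 4.0
)

// weightedLoss scales the raw loss ratio by how far the packet rate exceeds
// the reference rate. The linear weighting is an assumption, not the
// formula used in the actual change.
func weightedLoss(lost, total int, seconds float64) float64 {
	if total == 0 || seconds <= 0 {
		return 0
	}
	lossRatio := float64(lost) / float64(total)
	weight := (float64(total) / seconds) / referencePacketRate
	if weight < 1 {
		weight = 1
	}
	if weight > maxWeight {
		weight = maxWeight
	}
	return lossRatio * weight
}

func main() {
	// 2 of 10 packets lost in 0.5 s: 20% loss at the reference rate.
	fmt.Println(weightedLoss(2, 10, 0.5) >= minLoss) // false
	// 10 of 50 packets lost in 0.5 s: the same 20% loss at 5x the rate.
	fmt.Println(weightedLoss(10, 50, 0.5) >= minLoss) // true
}
```

Under these assumptions the same 20% raw loss stays below the threshold at a low packet rate and crosses it at a high one, which is the behavior the message describes.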
Raja Subramanian
57b3dfdcf4 Loss based congestion signal detector. (#3168)
* Loss based congestion signal detector.

It uses the same approach of thresholding + duration to detect
region of operation and further derive early warning/congested states.
A gutter is used for indeterminate region just like the queuing delay
based case.

The two approaches (queuing delay and loss) are treated independently,
i.e. packet groups have to satisfy the same type of condition (queuing
delay OR loss) to build up congestion.

The aggregate congestion signal is triggered if either one triggers.

Maybe, there is a way to accept hybrid signalling (i.e. each group
satisfying either threshold adds up to congestion signal detection), but
needs more experimentation. For now, keeping them separate.

* apply max threshold

* clean up

* spelling
2024-11-11 13:27:49 +05:30
Raja Subramanian
ceb8a70696 Use same components when logger is updated (#3166)
Logger in buffer can get updated when the layer is known. Use the same
components used in destructor.
2024-11-11 11:38:48 +05:30
Raja Subramanian
5109551262 Reduce lock scope. (#3167)
Also, do not close channel. stop fuse break will close
the worker and GC cleans up.
2024-11-11 11:38:32 +05:30
Raja Subramanian
a3f2ca56f9 TWCC based congestion control - v0 (#3165)
* file output

* wake under lock

* keep track of RTX bytes separately

* packet group

* Packet group of 50ms

* Minor refactoring

* rate calculator

* send bit rate

* WIP

* comment

* reduce packet infos size

* extended twcc seq num

* fix packet info

* WIP

* queuing delay

* refactor

* config

* callbacks

* fixes

* clean up

* remove debug file, fix rate calculation

* fmt

* fix probes

* format

* notes

* check loss

* tweak detection settings

* 24-bit wrap

* clean up a bit

* limit symbol list to number of packets

* fmt

* clean up

* lost

* fixes

* fmt

* rename

* fixes

* fmt

* use min/max

* hold on early warning of congestion

* make note about need for all optimal allocation on hold release

* estimate trend in congested state

* tweaks

* quantized

* fmt

* TrendDetector generics

* CTR trend

* tweaks

* config

* config

* comments

* clean up

* consistent naming

* participant level setting

* log usage mode

* feedback
2024-11-11 10:24:47 +05:30
Raja Subramanian
653857e42b Split out audio level config. (#3163)
* Split out audio level config.

Inline it in yaml as it is exposed/documented config.

* test

* default congestion control enable
2024-11-08 21:36:38 +05:30
Raja Subramanian
86383b2271 De-centralize some configs to where they are used. (#3162)
* De-centralize some configs to where they are used.

And make default variables.

Renaming a bit, but these are all internal config and have not been
added to documented config.

* Keep documented config as is.

* test

* typo
2024-11-08 12:47:30 +05:30
Denys Smirnov
55d084fd18 Annotate SIP errors with Twirp codes. (#3161) 2024-11-07 17:00:57 +02:00
Raja Subramanian
f3a13569ee Use int64 nanoseconds and reduce conversion in a few places (#3159) 2024-11-06 12:28:30 +05:30
Paul Wells
09f140afa8 auto create rooms during create agent dispatch api request (#3158) 2024-11-05 16:15:05 -08:00
Raja Subramanian
365e63230d Some misc clean up. (#3156)
* Some misc clean up.

- Have been seeing counterfeiter warnings about efficiency for a while
  with go:generate declaration multiple times in the same package.
  Address that: https://github.com/maxbrunsfeld/counterfeiter?tab=readme-ov-file#step-2b---add-counterfeitergenerate-directives
- A bit more readability on parameters passed to `sendLeave`

* spacing

* revert some deletes as the complaint was in analytics service only

* Declare in package only once.

Although the warning is about go:generate multiple times when directly
giving the interface to generate, having `go:generate` multiple times in a
package even with `-generate` ends up generating once per invocation.
Once per package is enough to run the generation just once.
2024-11-04 11:26:41 +05:30
Raja Subramanian
35bef35d66 Clean up drop ICE candidates. (#3153)
* Clean up drop ICE candidates.

With pion/ice v2.3.37, ICE Lite will accept use-candidate from peer.
So, there is no need to drop candidates.

Still leaving the FF change to not use Lite which was added as part of
this effort initially due to how FF does nominations. Updated comment to
explain why.

* clean up test
2024-11-02 10:50:55 +05:30
Raja Subramanian
1c80ce8308 Only drop srflx if configured. (#3149) 2024-10-30 21:20:34 +05:30
Raja Subramanian
da9bd7f426 make a util of IP address truncation for logging. (#3148)
* make a util of IP address truncation for logging.

* exported method
2024-10-30 19:44:41 +05:30
Paul Wells
24f3c93204 ignore unexported fields in yaml lint (#3145) 2024-10-29 12:16:21 -07:00
cnderrauber
526985f109 don't return video/rtx to client (#3142) 2024-10-26 22:29:04 +08:00
Raja Subramanian
49b75e94a6 Consolidate operations on LocalNode. (#3140) 2024-10-25 18:57:23 +05:30
Raja Subramanian
d341ee1ce8 Maintain RTT marker for calculations. (#3139)
* Maintain RTT marker for calculations.

Restore the drift logging change.

* remove unnecessary cast
2024-10-25 11:50:59 +05:30
Raja Subramanian
542620b486 Revert "Adjust drift calculations for pass through. (#3129)" (#3138)
This reverts commit 7ab6e5df09.
2024-10-25 11:11:21 +05:30
Raja Subramanian
024a75d27c display related only when address is valid (#3137) 2024-10-24 18:26:56 +05:30
Raja Subramanian
fbdc2491d9 Log truncated (#3136)
* Log truncated

* add related address
2024-10-24 16:24:54 +05:30
Raja Subramanian
b8c6b1f1ec Log ICE connection info on failure. (#3134)
- Truncate public remote IP
- Log only on short connection to avoid logging too much
2024-10-24 14:30:04 +05:30
cnderrauber
ca77df8212 warn for multiple dd ext (#3135)
* warn for multiple dd ext

* unused
2024-10-24 16:59:24 +08:00
Raja Subramanian
de102f32db Display both pairs on selected candidate pair change (#3133)
* Display both pairs on selected candidate pair change

* disable ICE lite for Firefox
2024-10-23 21:30:52 +05:30
Raja Subramanian
7ab6e5df09 Adjust drift calculations for pass through. (#3129)
No functional effect, but was logging more than expected drift in the
downstream direction. Reason is that when passing through, we could be
using an older report. But, the adjustment was applied to the monotonic
clock and not the RTP timestamp. So, it looked like more time had
elapsed for the same RTP clock elapsed, and the log showed higher than
expected drift. Correcting it so that the log is not misleading/confusing.
2024-10-23 11:03:43 +05:30
Raja Subramanian
487a3fc3fb ICE candidate marking (#3128)
- Update filtered if dropping a pending candidate later.
- Ordinal for selected pair so it is easy to see which got selected
  later.
2024-10-22 20:23:55 +05:30
cnderrauber
b30cc9066a Drop remote candidates based on lite option (#3127)
Only drop remote candidates if remote peer is
not lite and local peer is lite.
2024-10-22 17:53:40 +08:00
Paul Wells
b0d3d65f18 update events package (#3126)
* update events package

* deps
2024-10-21 23:44:00 -07:00
David Zhao
dd7cd7eafc Handle room configuration that's set in the grant itself (#3120)
* Handle room configuration that's set in the grant itself

* ensure refresh token contains updates

* deps

* dep

---------

Co-authored-by: Paul Wells <paulwe@gmail.com>
2024-10-21 23:31:12 -07:00
Benjamin Pracht
d751f209d5 Allow requesting a dialtone during call transfer (#3122) 2024-10-21 21:05:31 -07:00
David Zhao
3e7185f264 chore: add check to skip launching TrackEgress for Egress participants (#3125)
Egress participants don't publish, so there is no functionality change
2024-10-21 20:47:41 -07:00
Raja Subramanian
d4e3c63406 Seed duplicate packets and bytes. (#3124)
Had missed this before. This could have caused retransmit packets/bytes
to be high.
2024-10-21 23:58:41 +05:30
Raja Subramanian
45b2804df8 Skip divide-by-0. (#3119)
Does not crash, but produces a NaN. Avoid that.
2024-10-19 16:21:23 +05:30
Raja Subramanian
a564f7fbe6 Add option to drop remote ICE candidates. (#3118)
Defaults to OFF.
2024-10-19 10:30:22 +05:30
Raja Subramanian
182a073353 Log ICE reconnected when selected pair changes. (#3117)
Logging selected pair when ICE connection state changed could have
picked up previous selected pair.

Also, log shortened remote IP and remote port.
2024-10-18 23:32:18 +05:30
Raja Subramanian
44a74fc06a Clean up sending raw mime as well. (#3113) 2024-10-18 00:34:29 +05:30
Raja Subramanian
40b10af960 Use monotonic time util. (#3112)
Thank you @paulwe for doing this. I was promising to do this for a
while, but just like other times, empty promises :-(
2024-10-17 10:49:24 +05:30
Ben Cherry
19c5ed6343 Parse python, cpp, unity-web, node sdks in clientinfo (#3110) 2024-10-16 20:18:44 -07:00
Denys Smirnov
50b4d6605e Type safe IP checks for SIP Trunks. (#3108) 2024-10-16 17:48:55 +03:00
Raja Subramanian
8221471b67 Protocol update to get more precise protoproxy timing (#3107)
* Protocol update to get more precise protoproxy timing

* really update protocol
2024-10-16 18:43:09 +05:30
Raja Subramanian
792964ad1c Always add upper case mime for video to work around a prefix trim issue (#3106) 2024-10-16 15:32:37 +05:30