2405 Commits

Author SHA1 Message Date
cnderrauber
5dd6858acf Don't wait rtp packet to fire track (#3246)
* Don't wait rtp packet to fire track

Create track from sdp instead of first rtp packet,
it is consistent with the browser behavior and
will accelerate the track publication.

* fix test
2024-12-13 15:06:14 +08:00
Raja Subramanian
789d0484e2 Add RTX to downstream (#3247)
* Add RTX to downstream

* test
2024-12-13 09:57:03 +05:30
Raja Subramanian
79eda6b72b Send side BWE: tighter contributing groups (#3245)
* WIP

* clean up

* debug

* epm log

* debug

* fmt

* clean up

* default no SSBWE

* clean up
2024-12-12 14:22:31 +05:30
Raja Subramanian
4b16017d09 Send side BWE - fixes (#3244)
* WIP

* no worker

* fixes

* use congested packet groups

* oldest group

* markers

* WIP

* WIP

* WIP

* WIP

* WIP

* clean up

* fmt

* consolidate

* store last packet only for bwe extension cases
2024-12-11 21:31:26 +05:30
Raja Subramanian
d0f7eaeadb Use send side bwe config directly. (#3241) 2024-12-10 10:01:44 +05:30
Denys Smirnov
dc6fe3aae5 Support SIP list filters. (#3240) 2024-12-09 22:57:47 +02:00
Raja Subramanian
c172ba13e6 Cleaning up unused stream allocator experiments. (#3237)
Not sure if we will ever use it. Can bring it back if needed.
2024-12-08 13:00:58 +05:30
Raja Subramanian
7c5a558a48 Try up-allocation on neutral trend. (#3235)
* Try up-allocation on neutral trend.

Some probes end up with a neutral trend because they get many estimates
of the same value. It is okay to try up-allocating in those cases.
Otherwise, the stream allocator sometimes gets stuck and does not
up-allocate at all as all probes end up neutral.

Changing the name of the signal to `NotCongesting` to signify it is
either neutral or clearing.

* wait 5 RTT for probe to finalize

* trend detector object encoder
2024-12-06 10:52:24 +05:30
Raja Subramanian
94488d434d TWCC probing (#3234)
* WIP

* WIP

* WIP

* make it compile

* typo

* clean up

* fmt

* fixes
2024-12-06 00:13:36 +05:30
Raja Subramanian
d862917249 Record probe information in send side BWE module. (#3231)
Still not doing anything with it, but just making a small PR to record
that information for future use.
2024-12-04 14:31:00 +05:30
Raja Subramanian
f9ee48f24b Tri-state probe signal. (#3229)
Need tri-state to indicate inconclusive, congesting and clearing.
Currently, no special treatment for inconclusive, but for future use.
2024-12-03 10:52:43 +05:30
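A minimal Go sketch of what a tri-state probe signal like the one described in f9ee48f24b could look like. The type name, constant names, and string values are illustrative assumptions and are not taken from the repository.

```go
package main

import "fmt"

// ProbeSignal is a hypothetical tri-state outcome of a bandwidth probe.
type ProbeSignal int

const (
	ProbeSignalInconclusive ProbeSignal = iota // not enough evidence either way
	ProbeSignalCongesting                      // probe pushed the channel toward congestion
	ProbeSignalClearing                        // probe went through without congestion
)

// String implements fmt.Stringer so the signal reads well in logs.
func (p ProbeSignal) String() string {
	switch p {
	case ProbeSignalInconclusive:
		return "INCONCLUSIVE"
	case ProbeSignalCongesting:
		return "CONGESTING"
	case ProbeSignalClearing:
		return "CLEARING"
	default:
		return fmt.Sprintf("UNKNOWN(%d)", int(p))
	}
}

func main() {
	for _, s := range []ProbeSignal{ProbeSignalInconclusive, ProbeSignalCongesting, ProbeSignalClearing} {
		fmt.Println(s)
	}
}
```

Making the inconclusive case the zero value means an unset signal never reads as congesting or clearing by accident.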
Raja Subramanian
2dcb5c928a Freeze update on congested probe. (#3228)
Reverting back to pre-refactor behaviour. Was trying to avoid doing
special treatment when in probe, but REMB values are hard to predict
and the NACKs as well.

So, freeze updates when congesting in probe till the probe is done.
Otherwise, further changes while the probe is finalising sometimes cause
an invalid signal and tracks are not up allocated.
2024-12-02 23:06:06 +05:30
Raja Subramanian
12b3da0a40 Bit more clean up around probe controller refactor (#3227)
* Bit more clean up around probe controller refactor

* consistent order
2024-12-02 13:36:27 +05:30
Raja Subramanian
ceefa8d150 Reset next probe time. (#3226) 2024-12-02 11:37:02 +05:30
Raja Subramanian
156114fcaf Clean up remote BWE a bit. (#3225)
* Clean up remote BWE a bit.

- Had forgotten to start worker, fix that
- ensure correct type of channel observer (probe OR non-probe) based on
  probe state.
- introduce congested hangover state to see better state transitions.
  Does not really affect operation, but state transitions are clearer.

* prevent 0 ticker
2024-12-02 11:09:21 +05:30
Raja Subramanian
3c42ccbb64 Keep congestion state only in BWE. (#3224) 2024-12-02 09:42:51 +05:30
Raja Subramanian
7f0c14306f One shot signalling mode fixes (#3223)
* set desired on synchronous track

* debug

* debug

* direction

* reuse

* clean up
2024-11-30 14:55:36 -08:00
Raja Subramanian
8bb29c3a7b Fixes from probe controller refactor (#3222)
* Fixes from probe controller refactor

* fmt

* static check
2024-11-30 13:34:01 +05:30
cnderrauber
c76fb0bcf4 Disable close by dtls to fix migration (#3220)
Pion v4 adds a new feature that closes the
peerconnection on dtls.close to detect a
closed remote peer quickly; it breaks
session migration.
2024-11-30 09:20:45 +08:00
Raja Subramanian
44d26f0cb4 Probe controller refactor (#3221)
* WIP

* WIP

* WIP
2024-11-30 01:38:25 +05:30
Raja Subramanian
0a3ba87183 Simplify probe sleep calculations. (#3218)
* Simplify probe sleep calculations.

Splitting into buckets made it problematic around the boundaries and it
was ugly code too. Simplify and set up probes with sleep after each
probe to get the desired interval/rate.

* continue after pop
2024-11-29 13:10:49 +05:30
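The "sleep after each probe" idea in 0a3ba87183 can be sketched as follows. This is a hedged illustration only: `sleepAfterProbe` and its parameters are hypothetical names, and the actual prober may derive the interval differently.

```go
package main

import (
	"fmt"
	"time"
)

// sleepAfterProbe returns how long to pause after sending probeBytes so that
// the average send rate works out to desiredBps (bits per second).
// Hypothetical helper, not the repository's implementation.
func sleepAfterProbe(probeBytes int, desiredBps int64) time.Duration {
	if probeBytes <= 0 || desiredBps <= 0 {
		return 0
	}
	bits := int64(probeBytes) * 8
	return time.Duration(bits * int64(time.Second) / desiredBps)
}

func main() {
	// A 1000-byte probe at a target of 800 kbps.
	fmt.Println(sleepAfterProbe(1000, 800_000))
}
```

Computing one sleep per probe avoids any bucket boundaries: the interval is a continuous function of probe size and target rate.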
Raja Subramanian
427ed23478 Move probe observer to pacer (#3214)
* Probe ID pass

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* clean up

* typo

* populate desired bytes

* correct num probes calculation

* debug log

* remove unused constant

* log channel observer

* debug

* clear isInProbe flag on end

* clear probe flag on reset

* re-arrange
2024-11-29 09:19:48 +05:30
lukasIO
1c940af8c0 Add datastream packet type handling (#3210)
* Add datastream packet type handling

* point to main in protocol

* Revert "point to main in protocol"

This reverts commit 2cc6ed6520.

* Update protocol
2024-11-28 12:23:40 +01:00
cnderrauber
54f9f7de51 upgrade to pion/webrtc v4 (#3213) 2024-11-28 16:05:38 +08:00
Raja Subramanian
baf47db834 Publish data and signal bytes once every 30 seconds. (#3212)
For applications with heavy data usage, accumulating data bytes over 5
minutes and then calculating rate using a much shorter window (like 2 -
5 seconds) makes it look like there is a massive rate spike.

While this change is not a fix, this should soften the impact.

Need a better way to handle different parts of the system operating at
different frequencies. Can use rate in the reporting window, but that
will miss the spikes. Maybe that is okay. For example, if the reporting
window is 5 minutes and there was a 100 Mbps spike for about 10 seconds
of it, it would get smoothed out.
2024-11-28 09:21:44 +05:30
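The arithmetic behind baf47db834 can be worked through with a small Go sketch. The steady 1 Mbps data rate and the 5-second rate window are assumed example numbers; the 5-minute and 30-second publish intervals and the 100 Mbps / 10 second spike come from the commit message.

```go
package main

import "fmt"

// mbps returns the rate in Mbps when `bytes` are attributed to a window of windowSec seconds.
func mbps(bytes, windowSec float64) float64 {
	return bytes * 8 / windowSec / 1e6
}

// bytesAt returns the bytes accumulated at a steady rate (bits per second) over durationSec.
func bytesAt(bps, durationSec float64) float64 {
	return bps / 8 * durationSec
}

func main() {
	const steadyBps = 1e6 // assumed: 1 Mbps of data traffic

	// Bytes accumulated for 5 minutes, then attributed to a 5 second window.
	fmt.Printf("5 min batch in 5 s window:  %.2f Mbps\n", mbps(bytesAt(steadyBps, 300), 5))

	// Same traffic published every 30 seconds instead.
	fmt.Printf("30 s batch in 5 s window:   %.2f Mbps\n", mbps(bytesAt(steadyBps, 30), 5))

	// A 100 Mbps spike lasting 10 seconds, averaged over a 5 minute reporting window
	// (assuming no other traffic in that window).
	fmt.Printf("spike averaged over 5 min:  %.2f Mbps\n", mbps(bytesAt(100e6, 10), 300))
}
```

Under these assumptions the apparent spike drops by a factor of ten with the shorter publish interval, while averaging over the whole reporting window hides a real spike almost entirely.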
Raja Subramanian
d599911405 Fix prober listener. (#3207)
This was stopping active probe and taking longer to recover.
Missed in the refactor.
2024-11-27 16:18:08 +05:30
Raja Subramanian
c0d20885db Log last switch time stamp (#3205) 2024-11-27 11:26:13 +05:30
Raja Subramanian
c328b767c9 Do not treat data publisher as publisher. (#3204) 2024-11-26 20:44:37 +05:30
Raja Subramanian
a28764479b Give rtp stats context to forwarder. (#3202) 2024-11-26 12:57:02 +05:30
Raja Subramanian
23285744ba Server side metrics (#3198)
* mbb WIP

* deps

* WIP

* WIP

* remove unused file

* Switch to enabled

* misc

* deps

* mediatransportutil update

* Typo

* Set ParticipantIdentity in metrics data packets

* use uint32 as JSON decoder does not unmarshal time.Duration
2024-11-25 13:10:48 +05:30
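One possible reading of the last bullet in 23285744ba: Go's `encoding/json` has no built-in parsing for duration strings such as "30s", so a `time.Duration` field only accepts a bare integer (interpreted as nanoseconds). The struct and field names below are hypothetical and only illustrate that behaviour.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// withDuration uses time.Duration directly (hypothetical config shape).
type withDuration struct {
	Interval time.Duration `json:"interval"`
}

// withSeconds stores the interval as a plain integer number of seconds.
type withSeconds struct {
	IntervalSeconds uint32 `json:"interval_seconds"`
}

// decode tries both shapes and reports what happened.
func decode() (durationErr, secondsErr error, seconds uint32) {
	var d withDuration
	durationErr = json.Unmarshal([]byte(`{"interval":"30s"}`), &d)

	var s withSeconds
	secondsErr = json.Unmarshal([]byte(`{"interval_seconds":30}`), &s)
	return durationErr, secondsErr, s.IntervalSeconds
}

func main() {
	dErr, sErr, secs := decode()
	fmt.Println("duration string error:", dErr)
	fmt.Println("seconds error:", sErr)
	fmt.Println("interval:", time.Duration(secs)*time.Second)
}
```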
Raja Subramanian
d07d84f99f Sender side snap shot clean up and logging. (#3196)
* Sender side snap shot clean up and logging.

Seeing cases where sender snap shot packet loss is sometimes much higher
than the actual packets. Tracking a bit more to understand that better.
- Rename variables to indicate what is coming from feed side clearly
- Fixed an issue with wrong init of feed side loss in snapshot
- Just use the loss from receiver report as it can go back (receiver
  would subtract on receiving out-of-order packet).
- keep track of reports in a snapshot (this is temporary for
  debugging/understanding it better and will be removed later)

* remove check
2024-11-23 10:40:10 +05:30
Paul Wells
29c7906250 skip http request logging when the client aborts the request (#3195)
* skip http request logging when the client aborts the request

* cleanup
2024-11-22 00:42:49 -08:00
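A hedged sketch of how request logging might be skipped when the client aborts, as 29c7906250 describes. It relies on the documented `net/http` behaviour that an incoming request's context is canceled when the client connection closes; the middleware and helper names are assumptions, not the repository's code.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log"
	"net/http"
)

// clientAborted reports whether the request's context has been canceled,
// which net/http does for server requests when the client goes away.
func clientAborted(r *http.Request) bool {
	return errors.Is(r.Context().Err(), context.Canceled)
}

// withRequestLog is a hypothetical middleware that logs completed requests
// and stays silent for requests the client abandoned.
func withRequestLog(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		next.ServeHTTP(w, r)
		if clientAborted(r) {
			return
		}
		log.Printf("%s %s", r.Method, r.URL.Path)
	})
}

// demo builds one live and one canceled request and reports whether each would be logged.
func demo() (logActive, logAborted bool) {
	active, _ := http.NewRequest(http.MethodGet, "/rtc", nil)
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	aborted := active.WithContext(ctx)
	return !clientAborted(active), !clientAborted(aborted)
}

func main() {
	logActive, logAborted := demo()
	fmt.Println("log active request:", logActive)
	fmt.Println("log aborted request:", logAborted)
	_ = withRequestLog // would wrap the real handler in a server
}
```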
Raja Subramanian
a83a7abcf4 Start up subscriber RTCP worker in one-shot-signalling mode. (#3194) 2024-11-22 11:04:05 +05:30
Raja Subramanian
3498e53650 Participant method to check a track by name is subscribed. (#3192)
* Set down track connected flag in one-shot-signalling mode.

Also, added maintaining ICE candidates for info purposes.
And doing analytics events (have to maintain the subscription inside
subscriptionmanager to get list of subscribed tracks, so added enough
bits from the async path into sync path to get the analytics bits also)

* comment typo

* method to check if a track name is subscribed
2024-11-22 07:43:38 +05:30
Raja Subramanian
31d6dd7107 Set down track connected flag in one-shot-signalling mode. (#3191)
* Set down track connected flag in one-shot-signalling mode.

Also, added maintaining ICE candidates for info purposes.
And doing analytics events (have to maintain the subscription inside
subscriptionmanager to get list of subscribed tracks, so added enough
bits from the async path into sync path to get the analytics bits also)

* comment typo
2024-11-21 18:41:33 +05:30
Raja Subramanian
d5cc567140 Log more details of RTP stats snap shots. (#3190)
* Log more details of RTP stats snap shots.

Seeing cases of loss more than 100%. Logging snap shots to understand it
better.

* log message

* use delta to update packets lost from RR

* remove cast
2024-11-21 16:41:03 +05:30
Raja Subramanian
9f25603213 One shot signalling mode (#3188)
* WIP

* comment

* Verify method on LocalParticipant

* cleanup

* clean up

* pass in one-shot-mode to StartSession

* null message source and sink

* feedback and also remove check in ParticipantImpl for one-shot-mode-filtering as a null sink can be used for that
2024-11-21 09:33:28 +05:30
Paul Wells
73fbc6b8bb convert psrpc error to http code in rtc service failure response (#3187) 2024-11-19 19:45:00 -08:00
Raja Subramanian
d0343808f2 Add ResyncDownTracks API that can be used to resync all down tracks on (#3185)
* Add ResyncDownTracks API that can be used to resync all down tracks on
these receivers.

* actually call the function
2024-11-18 20:01:14 +05:30
Raja Subramanian
cd718c84f6 Misc/minor clean up. (#3183)
Cosmetic. While thinking through how to structure probing better,
noticing small things here and there. Cleaning up and making some small
PRs along the way.
2024-11-17 12:14:46 +05:30
Raja Subramanian
aa2ce22655 Stringer interface (#3181) 2024-11-16 10:14:37 +05:30
Raja Subramanian
6509cdb5ea StreamAllocator (congestion controller) refactor (#3180)
* refactor WIP

* WIP

* compiling

* runlock

* fixes

* fmt

* stringer and unlikely logger

* clean up
2024-11-16 03:06:37 +05:30
Raja Subramanian
eceada8b31 use spatialLayer var (#3178)
* use spatialLayer var

* lower end check
2024-11-15 03:13:53 +05:30
Raja Subramanian
11deab22d4 Clean up forwardRTP function a bit. (#3177)
- Pass in the buffer, don't read it every time through the loop
- cache stream trackers and avoid getting from stream tracker manager
  every time.
2024-11-15 02:49:43 +05:30
Raja Subramanian
adaf56a30d Move Prober to ccutils. (#3175)
* keep track of RTX bytes separately

* packet group

* Packet group of 50ms

* Minor refactoring

* rate calculator

* send bit rate

* WIP

* comment

* reduce packet infos size

* extended twcc seq num

* fix packet info

* WIP

* queuing delay

* refactor

* config

* callbacks

* fixes

* clean up

* remove debug file, fix rate calculation

* fmt

* fix probes

* format

* notes

* check loss

* tweak detection settings

* 24-bit wrap

* clean up a bit

* limit symbol list to number of packets

* fmt

* clean up

* lost

* fixes

* fmt

* rename

* fixes

* fmt

* use min/max

* hold on early warning of congestion

* make note about need for all optimal allocation on hold release

* estimate trend in congested state

* tweaks

* quantized

* fmt

* TrendDetector generics

* CTR trend

* tweaks

* config

* config

* comments

* clean up

* consistent naming

* participant level setting

* log usage mode

* probing hacks

* WIP

* no lock

* packet group config

* ctr trend refactor

* cleanup and fixes

* format

* debug

* format

* move prober to ccutils

* clean up

* clean up
2024-11-15 00:05:59 +05:30
Paul Wells
9e4dae7107 add per message deflate to signal ws (#3174) 2024-11-14 00:35:30 -08:00
Raja Subramanian
cc22306047 Attempt to fix missing participant left webhook. (#3173)
On a resume, the signal stats will call `ParticipantLeft`. Although it
explicitly says not to send events, it could still close the stats
worker.

To handle that, we created a stats worker if needed in
`ParticipantResume` notification in this PR
(https://github.com/livekit/livekit/pull/2982), but that is not enough
as that event could happen before previous signal connection closes the
stats worker.

A new stats worker does get created when `ParticipantJoined` is called
by the new signal connection, but it does not transfer connected state.
So, when the client leaves, `ParticipantLeft` is not sent.

I am not seeing why we should not transfer connected state always given
that it is the same participant SID/session. But, I have a feeling that
I am missing some corner case. Please let me know if I am missing
something here.
2024-11-14 10:59:15 +05:30
David Zhao
84cb14695f Fix incorrect computation of SecondsSinceNodeStatsUpdate (#3172)
Stats.UpdatedAt is in seconds, but we were loading as nanosecs
2024-11-12 01:13:08 -06:00
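The unit mix-up fixed in 84cb14695f can be reproduced with the standard library alone. This is an illustrative sketch: the function names are hypothetical and the timestamps are made-up example values.

```go
package main

import (
	"fmt"
	"time"
)

// secondsSince treats updatedAt as Unix seconds (the intended interpretation).
func secondsSince(updatedAt int64, now time.Time) float64 {
	return now.Sub(time.Unix(updatedAt, 0)).Seconds()
}

// secondsSinceBuggy treats the same value as Unix nanoseconds, which places
// the timestamp within a couple of seconds of the 1970 epoch.
func secondsSinceBuggy(updatedAt int64, now time.Time) float64 {
	return now.Sub(time.Unix(0, updatedAt)).Seconds()
}

// demo evaluates both versions for a stat updated 5 seconds before "now".
func demo() (correct, buggy float64) {
	now := time.Unix(1731395588, 0)
	updatedAt := int64(1731395583)
	return secondsSince(updatedAt, now), secondsSinceBuggy(updatedAt, now)
}

func main() {
	correct, buggy := demo()
	fmt.Printf("correct: %.0f s, buggy: %.0f s\n", correct, buggy)
}
```

With the nanosecond interpretation every node appears to have last updated decades ago, so any staleness check based on the value would misfire.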
Raja Subramanian
41fbcec2cd Fix header size calculation in stats. (#3171)
* Fix header size calculation in stats.

With pacer inserting some extensions, the header size used in stats
(and more importantly when probing for bandwidth estimation and
metering the bytes to control the probes) was incorrect. The size
was effectively that of incoming extensions. It would have been
close enough though.

Anyhow, a bit of history
- initially was planning on packaging all the necessary fields into
  pacer packet and pacer would callback after sending, but that was not
  great for a couple of reasons
  - had to send in a bunch of useless data (as far as pacer is
    concerned) into pacer.
  - callback every packet (this is not bad, just a function call which
    happens in the forward path too, but had to lug around the above
    data).
- in the forward path, there is an edge case issue when calling stats update
  after pacer.Enqueue() - details in https://github.com/livekit/livekit/pull/2085,
  but that is a rare case.

Because of those reasons, the update was placed in the forward path
before enqueue, but did not notice the header size issue till now.

As a compromise, `pacer.Enqueue` returns the headerSize and payloadSize.
It uses a dummy header to calculate size. Real extension will be added
just before sending packet on the wire. pion/rtp replaces extension if
one is already present. So, the dummy would be replaced by the real one
before sending on the wire.
a21194ecfb/packet.go (L398)

This does introduce back the second rare edge case, but that is very
rare and even if it happens, not catastrophic.

* cleanup

* add extensions and dummy as well in downtrack to make pacer cleaner
2024-11-12 10:53:57 +05:30
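The dummy-header approach in 41fbcec2cd works because an RTP header's size depends on the lengths of its extensions, not on their values. The sketch below computes that size by hand for the one-byte extension format of RFC 8285 rather than using pion/rtp; the function names and the example payload lengths (3 and 2 bytes) are assumptions for illustration.

```go
package main

import "fmt"

// rtpFixedHeaderSize is the fixed part of an RTP header (RFC 3550).
const rtpFixedHeaderSize = 12

// oneByteExtensionSize returns the bytes used by an RFC 8285 one-byte header
// extension block carrying payloads of the given lengths.
func oneByteExtensionSize(payloadLens []int) int {
	if len(payloadLens) == 0 {
		return 0
	}
	size := 0
	for _, n := range payloadLens {
		size += 1 + n // one ID/length byte plus the payload
	}
	size = (size + 3) / 4 * 4 // pad to a 32-bit boundary
	return 4 + size           // 4-byte extension block header
}

// headerSize returns the full RTP header size for the given CSRC count and extensions.
func headerSize(csrcCount int, extPayloadLens []int) int {
	return rtpFixedHeaderSize + 4*csrcCount + oneByteExtensionSize(extPayloadLens)
}

func main() {
	fmt.Println("no extensions:", headerSize(0, nil))
	// Dummy extensions of 3 and 2 bytes occupy the same space as the real
	// values written just before the packet goes on the wire.
	fmt.Println("with extensions:", headerSize(0, []int{3, 2}))
}
```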
Raja Subramanian
a825661aff Use weighted loss to detect loss based congestion signal. (#3169)
* Use weighted loss to detect loss based congestion signal.

- Increase JQR min loss to 0.25.
- Use weighted loss ratio so that higher packet rates get more
  weight. At default config, 10 packets in 1/2 second will form a
  valid packet group for loss based congestion signal consideration. Two
  packets lost in that group may not be bad. So, bumped up the
  JQR min loss to 0.25. However, 20% loss (or even much lower loss)
  could be problematic if the packet rate is higher (potentially
  multiple streams affected and there could be a lot of NACKs as a result).
  So, weight it by packet rate so that higher packet rates enter JQR
  at lower losses.

* WIP

* use aggregated loss
2024-11-12 09:21:28 +05:30
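A rough Go sketch of rate-weighted loss in the spirit of a825661aff. The 0.25 threshold and the "10 packets in 1/2 second" group come from the commit message; the logarithmic weight, the reference rate, and all names are illustrative assumptions and are not the formula used in the PR.

```go
package main

import (
	"fmt"
	"math"
)

const (
	// jqrMinLoss is the minimum loss threshold named in the commit message.
	jqrMinLoss = 0.25
	// referencePPS is an assumed rate at which the weight equals 1:
	// 10 packets in half a second.
	referencePPS = 20.0
)

// weightedLoss scales the raw loss ratio by a weight that grows with packet rate.
// The log-based weight is an assumption chosen only to show the idea.
func weightedLoss(lost, total int, windowSec float64) float64 {
	if total <= 0 || windowSec <= 0 {
		return 0
	}
	pps := float64(total) / windowSec
	if pps <= 1 {
		return 0
	}
	lossRatio := float64(lost) / float64(total)
	return lossRatio * math.Log10(pps) / math.Log10(referencePPS)
}

// isLossCongested reports whether the weighted loss crosses the threshold.
func isLossCongested(lost, total int, windowSec float64) bool {
	return weightedLoss(lost, total, windowSec) >= jqrMinLoss
}

func main() {
	// Same 20% raw loss at two different packet rates.
	fmt.Println("2 of 10 lost in 0.5 s:  ", isLossCongested(2, 10, 0.5))
	fmt.Println("20 of 100 lost in 0.5 s:", isLossCongested(20, 100, 0.5))
}
```

With this weighting, 20% loss in a sparse group stays below the threshold while the same ratio at ten times the packet rate crosses it, which matches the behaviour the commit message asks for.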