Commit Graph

2499 Commits

Author SHA1 Message Date
Raja Subramanian
ddece1fbb0 Use aarival time in cached packets. (#2633) 2024-04-08 11:29:55 +05:30
David Zhao
0712719c2f Fix deadlock in IncrementalDispatcher (#2632) 2024-04-06 10:36:02 -07:00
Raja Subramanian
8334149034 update mediatransportutil (#2630) 2024-04-06 12:13:21 +05:30
Raja Subramanian
8852d71a8a Disable audio loss proxying. (#2629)
* Disable audio loss proxying.

Added a config which is off by default.
With audio NACKs, that is the preferred repair mechanism.
With RED, repair is built in via packet redundancy to recover from
isolated losses.
So, proxying is not required. But, leaving it in there with a config
that is disabled by default.

* fix test
2024-04-06 11:28:04 +05:30
David Zhao
f4314686d1 Improve Agent logging (#2628) 2024-04-05 20:32:29 -07:00
David Colburn
4603b5c053 make IsDependent backwards compatible (#2627) 2024-04-05 14:24:34 -07:00
David Colburn
fff937a89c participant kinds (#2626) 2024-04-05 12:59:06 -07:00
Raja Subramanian
e93611eafa Log sender reports. (#2625) 2024-04-05 18:21:38 +05:30
Raja Subramanian
d3f0436d25 Fix reference time stamp layer. (#2623)
- Had arguments reversed.
- Also, cannot take away reference layer from state as a new layer
  as reference could have a time stamp that is widely different from
  expected. So, put that back.
2024-04-04 21:09:11 +05:30
Raja Subramanian
5cfcbc0ca6 Move caching of publisher sender report to subscriber side. (#2622)
* Move caching of publisher sender report to subscriber side.

Please see inline for descriptive comments on why. Basically,
pause/unpause using replaceTrack(null)/replaceTrack(actualTrack) can
cause time stamp in sender report sent to subscribers jump ahead.
This prevents that.

With the caching on subscriber side, cleaning up the caching on
publisher side.

* fix compile, test still failing, need to debug

* skip reference TS for testing
2024-04-04 18:23:30 +05:30
David Zhao
41b70ef555 Update Pion & SCTP to handle ZeroChecksum interop (#2619)
Details: https://github.com/pion/sctp/pull/327
2024-04-03 17:02:55 -07:00
Théo Monnom
dc67f505a5 agent service: new protocol & namespaces (#2545)
* initial worker impl

* fix test

* fix build

* TestAgentNamespaces

* log err

* nit cmt

* TestAgentMultiNode

* Update pkg/agent/worker.go

Co-authored-by: David Zhao <dz@livekit.io>

* retry on worker selection & fix review comments

* Update roommanager.go

* license

* use testutils.WIthTimeout

* abstract namespace/enabled logic into agent.Client, incrementally dispatch

* typos and dates

* lock

* timeout is now optional

* pass in topics instead of fixed

* handler handles connections

* onIdle, numConnections

* fix WithGrants

* update protocol

* check agent client

* broadcast after unlock

* fix data race

* remove ReadChan, fix dispatcher

---------

Co-authored-by: David Zhao <dz@livekit.io>
Co-authored-by: David Colburn <xero73@gmail.com>
2024-04-03 15:25:42 -07:00
Raja Subramanian
63b1fba082 Add start/end time to AnalyticsStream. (#2618)
* Add start/end time to AnalyticsStream.

* fix test
2024-04-03 12:23:18 +05:30
Raja Subramanian
1caa6ff6d0 Revert "Update pion deps (#2587)" (#2616)
This reverts commit f4ead06601.
2024-04-02 22:29:04 +05:30
Raja Subramanian
860702e9dc Prevent large spikes in propagation delay (#2615)
* Prevent large spikes in propagation delay

A few tweaks
- Large spike in propagation delay due to congested channel results in
  long term estimate getting high value. Ignore outliers in long term
  estimate.
- Introduce a new field for adjusted arrival time as adjusting the
  arrival time in place meant it got applied again across the relay and
  that caused different propagation delay on remote nodes.
- Reset path change counters as long as there is any sample that is not
  higher than the multiple of long term. There was a case of
  o Sample with high value that triggered path change start.
  o Then some samples with high enough delta, but did not meet the
    criteria for increasing counter further.
  o Some time later, another sample met the threshold and that triggered
    a path change re-init.

* do not adapt to large delta
2024-04-02 14:21:20 +05:30
Raja Subramanian
3c8980b443 Fix media track close check. (#2614)
With the change in https://github.com/livekit/livekit/pull/2611,
the dummy receiver was replaced with real receiver. But, the close check
was using the dummy receiver.

Doing two things
- Use the dummy receiver post upgrade also
  (NOTE: this is not needed, but just keeping old behaviour)
- Fix the close check to count number of open receivers.
2024-04-02 11:03:50 +05:30
renovate[bot]
f4ead06601 Update pion deps (#2587)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-04-01 22:17:20 -07:00
Denys Smirnov
19326a7162 Pass ringtone flag for SIP outbound. (#2613) 2024-04-01 19:20:51 +03:00
Raja Subramanian
3ed0527ef5 Reduce chances of missing layer post migration. (#2612)
When migrating out, it is possible that a published layer is not
notified to the migrating in node. Reduce chances of that layer not
getting published to the new node by curbing RTCP during migration.
It could still happen if stars line up, but this should reduce the
window to a much smaller one.
2024-04-01 18:53:54 +05:30
Raja Subramanian
b1a4d00fa9 Replace receiver when there is an existing one. (#2611)
The receiver should not change, but code wise, the option of replacing
receiver object makes more sense, i.e. otherwise, it could look like we
are leaving the stale object in there without replacing with new
receiver of same type.
2024-04-01 16:14:30 +05:30
Raja Subramanian
278ae72f70 Avoid duplicate receivers on migration. (#2608)
* Avoid duplicate receivers on migration.

When migrating, post migration call to set up could add duplicate
receivers.

* don't need to check upgraded
2024-03-31 11:15:50 +05:30
Raja Subramanian
4c9e59dc25 Small tweaks to propagation delay adaptation. (#2607) 2024-03-30 21:53:18 +05:30
Raja Subramanian
b5de646073 Remove redundant check. (#2605)
* Remove redundant check.

That check is already at the ouside check.

* print string

* space
2024-03-30 00:31:26 +05:30
Paul Wells
f1c991c547 skip logging retry message when ws disconnections before signal finishes (#2604) 2024-03-29 06:30:12 -07:00
cnderrauber
0a35e59ebd Replace sleep with sync.Cond to reduce jitter (#2603) 2024-03-29 17:24:31 +08:00
cnderrauber
95df9737a6 Fix twcc has chance to miss for firefox simulcast rtx (#2601) 2024-03-29 09:05:53 +08:00
cnderrauber
bc5fc17bdc Log high jitter case (#2602) 2024-03-28 15:59:03 +08:00
Raja Subramanian
45581433cc Add option to enable bitrate based scoring (#2600) 2024-03-27 18:45:53 +05:30
Raja Subramanian
0480f99a83 Tweak adaptation to increase in propagation delay. (#2598)
* Tweak adaptation to increase in propagation delay.

A couple of issues
- RTCP Sender Reports rate will vary based on underying track bitrate.
  (at least in theory, not all entities will do it though, for example
  SFU does standard rate of one per three seconds irrespective of track
  bit rate). So, adapt the long term estimate of propagation delay delta
  based on spacing of reports.
- Re-init of propagation delay to adapt to path change was taking the
  last value before the switch. But, that one value could have been an
  outlier and accepting it is not great. So, adapt spike time
  propagation delay in a smoother fashion to ensure that all values
  during spike contribute to the final value.

* clean up
2024-03-26 17:33:24 +05:30
renovate[bot]
d8226abf00 Update golang.org/x/exp digest to a685a6e (#2597)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-03-25 14:48:50 -07:00
Raja Subramanian
2dba3b2d2e Protect duplicate subscription (another try). (#2596)
Another case of duplciate tracks in SDP.
During migration (if both publisher and subscriber migrate), subscriber
could attach the remote track of the publisher. But, while that is
happening, publisher could migrate into the node and close the remote
media track. This was causing subscriber to switch from attaching to
remote media track -> attaching to local media track.

But, as remote media track was closed while add subscription was
happening, the subscriber is removed without subscription manager being
aware of it.

So, the subscription manager's reconcile and the remove subscriber is
racing and when subscription manager re-subscribes, caching has not run
yet and that creates a duplicate.

Delay removing subscribed track till after caching is done. That means,
even if the reconciler runs, it will get an `errAlreadySubscribed` error
and it will force it to reconcile again. By the time the subscribed
track is deleted from the subscriptions map, caching is done.
2024-03-25 15:07:29 +05:30
Raja Subramanian
95f5c94b4d Notify initial permissions (#2595)
* Notify initial permissions

NOTE: This does add an initial subscription permission notification
which should be fine, but something to watch for.

A stress test combining
- mute/unmute on publisher side.
- allowing/revoking permission for subscriber from publisher side.
- subscribing/unsubscribing from subscriber side.
results in a scenario where a subscription permission update of
`not_allowed` being sent and on a re-subscribe, an `allowed` update does
not happen.

It happens like so
- Subscription revoke cloes the down track of subscriber.
- The subscription is still desired.
- So, a subscription reconcile runs and sees `permission: false`. This
  sends subscription permission of `not_allowed`.
- Unsubscribe request comes in and sets `desired: false`.
- Reconsiler runs again and sees `desired: false` and `subscribedTrack:
  nil`. This cleans up the subscription.
- Publisher grants permission for the subscriber.
- Subscriber subscribes to the track again. A new subscription is
  created.
- Reconciler runs and sees `permission: true`, but there is no
  permission change as it is a new subscription object. So, `allowed`
  subscription permission update is not sent and the client is stuck at
  `not_allowed`.

Fix, maintain if permission has been initialized. Has the effect of
sending an initial update which should be fine.

* clean up comment

* no default
2024-03-22 23:22:20 +05:30
Raja Subramanian
ffb831aa8c Cache transceiver before closing subscribed track. (#2594)
On migration, when subscription moved from remote -> local,
transceiver caching was racing. Although a very small possibility,
it could happen like so

1. down track close
2. down track close callback fires go routine to close subscribed track
3. subscribed track close handler in subscription manager tries to
   reconcile
4. reconcile adds subscribed track again
5. cannot find cached transceiver as caching happens after down track
   close finishes in stap 1 above. Although there are a couple of
   gortouine jumps (step 2 fires a goroutine to close subscribed track
   and step 4 will reconcile in a goroutine too), it is theoretically
   possible that the step 1 has not finished and hence transceiver is
   not cached.

Fix is to move caching to before closing subscribed track.
2024-03-22 11:56:50 +05:30
Paul Wells
0f597e7e46 update protocol (#2593)
* update protocol

* cleanup

* deps
2024-03-21 02:48:26 -07:00
Raja Subramanian
7945c01dbe Reset sharp increase if received delta is small. (#2592) 2024-03-21 10:25:45 +05:30
Denys Smirnov
8564329579 Pass DTMF when creating SIP participants. (#2590) 2024-03-20 18:59:25 +02:00
Raja Subramanian
03ada9ba76 Proper RTCP report past mute. (#2588)
- When audio is muted, server injects silence frames which moves the
  time stamp forward and adjusts offset. That cannot be used against
  publisher side sender report. Use a pinned version.
- Ignore small changes to propagation delay even while checking for
  sharp increase. That is spamming a lot for small changes, i.e.
  existing delta is 100 micro seconds or so and the new one is 300 micro
  seconds. Also rename to `longTerm` from `smoothed` as it is a slow
  varying long term estimate of propagation delay delta. And slow down
  that adaptation more.
2024-03-19 11:59:24 +05:30
Denys Smirnov
c30c1be689 Push Docker images to the master tag. (#2579) 2024-03-18 20:48:26 +02:00
renovate[bot]
bc1711c476 Update golang.org/x/exp digest to a85f2c6 (#2551)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-03-18 11:00:28 -07:00
Raja Subramanian
e85922857a Use cloned grants. (#2586)
To ensure no pointer (i. e. mutable object) is held in returned proto.
2024-03-18 14:55:05 +05:30
Raja Subramanian
9e5f434cef Do not block as notification could make a network request. (#2581) 2024-03-15 10:59:37 +05:30
Denys Smirnov
1d920ae488 Support SIP DTMF data messages. (#2559) 2024-03-14 17:23:43 +02:00
Raja Subramanian
14321f21bf Make OpsQueueParams to make it easier to understand args. (#2578) 2024-03-14 10:27:24 +05:30
Raja Subramanian
e376625a13 Do not need to flush stream allocator events. (#2577) 2024-03-14 04:36:19 +05:30
Raja Subramanian
4e96ad2e5b Missed clean up in the last PR (#2576)
* Missed clean up in the last PR

* Infow -> Debugw
2024-03-13 23:01:19 +05:30
Raja Subramanian
3e43f75143 Forward publisher sender report. (#2572)
* Forward publisher sender report.

Publisher side RTCP sernfer report is rebased to SFU time base
and used to send sender rerport to subscriber.

Will wait to merge till previous versions are out as this will require a
bunch of testing.

* - Add rebased report drift
- update protocol dep
- fix path change check, it has to check against delta of propagation
  delay and not propagation delay as the two side clocks could be way
  off.
2024-03-13 14:31:39 +05:30
David Colburn
1cf04e3511 add egress proxy proto (#2570) 2024-03-11 21:53:12 -04:00
Raja Subramanian
610d68a409 Clean up using publisher side clock rate. (#2568)
It is not used any more.
2024-03-11 12:25:07 +05:30
Raja Subramanian
50c48ff29d Ignore out-of-order receiver side sender reports. (#2567) 2024-03-11 11:30:01 +05:30
Raja Subramanian
93c7d1f4fb Adjust first packet time on down track resume. (#2566)
Allows subscriber sender report to line up better quicker.
2024-03-11 00:40:16 +05:30