Commit Graph

2351 Commits

Author SHA1 Message Date
renovate[bot]
f26fbceaba Update pion deps (#2438)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-02-03 02:15:21 -08:00
Raja Subramanian
d0128b19cd Reset sender reports before measuring clock skew. (#2437) 2024-02-02 21:52:43 +05:30
Raja Subramanian
174e69c81d Restore min score to 30. (#2435)
Was at 20 when LOST was introduced, but the score was dropping to 20 even
under non-LOST conditions. When there are packets, the min should be 30.
Going down to 20 resulted in reporting LOST quality even when packets
were flowing (they were experiencing heavy loss and quality
would have been very bad, but they were not lost).

Also, sample the warning about adding a packet to the bucket even more.
2024-02-02 08:52:52 +05:30
Raja Subramanian
ff69c2aa11 Add debug to understand VP9 freezes. (#2434)
* Add debug to understand VP9 freezes.

Have reports of VP9 freezing in some rooms.
Some data indicates that NACKs are received by SFU, but cannot get RTP
packet when that happens. It is possible that the NACKs are all from
dropped packets. Adding some debug to understand drops/NACKs better.

* enable DD debug

* comment out DD debug

* markers

* add back log about diff length mismatch

* add back key frame mismatch logging

* log skipped drops also
2024-01-31 15:33:39 +05:30
Raja Subramanian
c8b7d486b9 Do not synthesise DISCONNECT on session change. (#2412)
* Do not synthesise DISCONNECT on session change.

v12 clients can handle session change based on identity.

* change for test

* Squelch participant update if close reason is DUPLICATE_IDENTITY.

* fix test

* comment

* Clean up participant close reason a bit

* fix test

* test
2024-01-31 11:36:50 +05:30
David Colburn
f960a4f9fb update egress client (#2431) 2024-01-29 16:57:50 -08:00
Raja Subramanian
a68500d4a1 Selective send of LeaveRequest. (#2429)
Cannot send old style leave request during migration and other scenarios
when the client is expected to resume. The old style can only do a full
reconnect or disconnect. If `CanReconnect: false`, which will be the case
for resume, the client will disconnect.

Add a parameter to selectively send leave request to older clients.
2024-01-29 14:49:53 +05:30
Raja Subramanian
d53f167b31 LeaveRequest changes. (#2426)
Reworking this a bit
1. Send leave whenever the signal channel is closed to induce a resume.
2. Use a getter to get regions rather than setting.
2024-01-29 13:04:18 +05:30
Raja Subramanian
2a3de84351 Reverting participant worker. (#2428)
* Reverting participant worker.

Reverts https://github.com/livekit/livekit/pull/2420 partially.

This did not revert clean. So, reverting manually. Also, keeping the
drive-by clean up bits.

* fix test
2024-01-29 13:03:32 +05:30
Raja Subramanian
ad072f0836 Revert "Plug worker leaks" (#2427) 2024-01-29 12:40:55 +05:30
Raja Subramanian
846121e781 Revert "Cache data synchronously for processing in worker." (#2425) 2024-01-29 12:36:27 +05:30
Paul Wells
0be241eed8 refactor transport callbacks as interface (#2423)
* refactor transport callbacks as interface

* test
2024-01-28 21:35:25 -08:00
Raja Subramanian
efbc985c82 Cache data synchronously for processing in worker. (#2424)
It is possible that state of underlying object has changed between
event posting and event processing. So, cache data synchronously
and use it during event processing.

This is still not perfect as things like `hidden` and `IsClosed` are
accessed in the worker. Ideally, a snapshot of the current state of
all required values would be posted to the worker, and the worker would
operate on that data alone.
2024-01-29 10:57:41 +05:30
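The snapshot idea described in efbc985c82 could be sketched roughly as below. This is an illustration only, not the code from the PR: the `participant`, `stateSnapshot`, and `postEvent` names are hypothetical, and the worker is reduced to a closure that runs later.

```go
package main

import (
	"fmt"
	"sync"
)

// participant is a hypothetical stand-in for an object whose state can
// change between event posting and event processing.
type participant struct {
	mu       sync.Mutex
	identity string
	hidden   bool
}

func (p *participant) setHidden(h bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.hidden = h
}

// stateSnapshot is an immutable copy of the values the worker needs.
type stateSnapshot struct {
	identity string
	hidden   bool
}

func (p *participant) snapshot() stateSnapshot {
	p.mu.Lock()
	defer p.mu.Unlock()
	return stateSnapshot{identity: p.identity, hidden: p.hidden}
}

// postEvent captures the snapshot synchronously, at posting time, and
// returns the closure a worker would run later. The closure never
// touches the live participant.
func postEvent(p *participant) func() string {
	snap := p.snapshot()
	return func() string {
		return fmt.Sprintf("%s hidden=%t", snap.identity, snap.hidden)
	}
}

func main() {
	p := &participant{identity: "alice"}
	event := postEvent(p)
	p.setHidden(true) // state changes after the event was posted
	fmt.Println(event())
}
```

Because the closure only reads `snap`, the worker sees the state as it was when the event was posted, whatever happens to the participant afterwards.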
renovate[bot]
134b6f05b4 Update module github.com/pion/dtls/v2 to v2.2.9 (#2355)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-28 15:29:07 -08:00
renovate[bot]
5f3bd7cf59 Update actions/upload-artifact action to v4 (#2317)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-28 15:28:46 -08:00
Raja Subramanian
ea2fa30cf8 Plug worker leaks (#2422)
Thank you @paulwe
2024-01-28 23:12:33 +05:30
Raja Subramanian
bcf9fe3f0f Use a participant worker queue in room. (#2420)
* Use a participant worker queue in room.

Removes selectively needing to call things in goroutine from
participant.

Also, a bit of drive-by clean up.

* spelling

* prevent race

* don't need to remove in goroutine as it is already running in the worker

* worker will get cleaned up in state change callback

* create participant worker only if not created already

* ref count participant worker

* maintain participant list

* clean up oldState
2024-01-28 22:10:35 +05:30
Raja Subramanian
38352b6125 Change transport queue. (#2419)
From a channel to OpsQueue. Have seen extreme cases (with a ton of
candidates) overflowing the channel.
2024-01-28 14:28:29 +05:30
Raja Subramanian
b71d373f4a Use Deque in ops queue. (#2418)
* Use Deque in ops queue.

Standardizing some uses
- Change OpsQueue to use Deque so that it can grow/shrink as necessary and
  need not worry about channel getting full and dropping events.
- Change StreamAllocator and TelemetryService to use OpsQueue so that
  they also need not worry about channel size and overflows.

* Address feedback

* delete obvious comment

* clean up
2024-01-28 13:48:30 +05:30
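The motivation in b71d373f4a (a queue that grows as needed instead of a fixed-size channel that can fill up and drop events) can be illustrated with the sketch below. It is not the repository's OpsQueue: the `opsQueue` type is hypothetical, and a plain slice stands in for the Deque to keep the example free of external dependencies.

```go
package main

import (
	"fmt"
	"sync"
)

// opsQueue is a hypothetical unbounded operation queue. Enqueue never
// blocks and never drops; one worker goroutine runs ops in FIFO order.
type opsQueue struct {
	mu      sync.Mutex
	cond    *sync.Cond
	ops     []func() // a slice standing in for a deque
	stopped bool
	done    chan struct{}
}

func newOpsQueue() *opsQueue {
	q := &opsQueue{done: make(chan struct{})}
	q.cond = sync.NewCond(&q.mu)
	go q.run()
	return q
}

// Enqueue appends an op; the backing storage grows as required.
func (q *opsQueue) Enqueue(op func()) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.stopped {
		return
	}
	q.ops = append(q.ops, op)
	q.cond.Signal()
}

// Stop lets the worker drain pending ops and waits for it to exit.
func (q *opsQueue) Stop() {
	q.mu.Lock()
	q.stopped = true
	q.cond.Signal()
	q.mu.Unlock()
	<-q.done
}

func (q *opsQueue) run() {
	defer close(q.done)
	for {
		q.mu.Lock()
		for len(q.ops) == 0 && !q.stopped {
			q.cond.Wait()
		}
		if len(q.ops) == 0 { // stopped and fully drained
			q.mu.Unlock()
			return
		}
		op := q.ops[0]
		q.ops[0] = nil // release the reference
		q.ops = q.ops[1:]
		q.mu.Unlock()
		op() // run outside the lock so ops can enqueue more work
	}
}

// runOps enqueues n ops in a burst and returns the order they ran in.
func runOps(n int) []int {
	q := newOpsQueue()
	var got []int // touched only by the worker until Stop returns
	for i := 0; i < n; i++ {
		i := i
		q.Enqueue(func() { got = append(got, i) })
	}
	q.Stop()
	return got
}

func main() {
	fmt.Println(len(runOps(10000)))
}
```

A burst of enqueues, such as the large number of candidates mentioned in 38352b6125, only makes the queue grow temporarily; nothing is dropped because there is no fixed capacity to exceed.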
Benjamin Pracht
c2549081c8 Allow creating SRT URL pull ingress (#2416) 2024-01-26 14:03:46 -08:00
Paul Wells
654b05638f update psrpc (#2414) 2024-01-26 10:39:08 -08:00
Paul Wells
9eca035738 revert signal retry (#2413) 2024-01-26 08:14:49 -08:00
cnderrauber
9b4ba2d41d use default max playout delay as chrome (#2411) 2024-01-26 13:32:54 +08:00
cnderrauber
995fddbaf9 Add dynamic playout delay if PlayoutDelay enabled in the room (#2403)
* Add dynamic playout delay

* type for state
2024-01-26 09:33:35 +08:00
aoife cassidy
0ebb861bdf Replace /bin/bash with env call (#2409)
Fixes build on FreeBSD, which uses /usr/local/bin/bash,
and also is just a more idiomatic way to handle shells.
2024-01-25 15:49:36 -08:00
Paul Wells
025eb1164c retry signal stream start (#2410) 2024-01-25 15:48:12 -08:00
renovate[bot]
d5b3bbac61 Update module github.com/livekit/protocol to v1.9.7 (#2337)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-25 15:47:15 -08:00
Raja Subramanian
d3da94c45e Augment LeaveRequest with alternate regions to connect. (#2408)
* Augment LeaveRequest with alternate regions to connect.

* update protocol and issue resume action on close if expected to resume

* use current protocol in tests

* address feedback
2024-01-25 22:22:46 +05:30
Raja Subramanian
43a40eb52d Using minimal TrackInfo when reporting to telemetry. (#2407)
Used the full TrackInfo in my previous PR, but telemetry might be
relying on top level Width/Height. So, make a pared down TrackInfo to
report to telemetry.

Also, correct some spelling/comments.
2024-01-25 10:27:55 +05:30
Raja Subramanian
79cdc2df2e Unify muted and unmuted migration paths. (#2406)
* Unify muted and unmuted migration paths.

If dynacast had disabled all layers, after a migration, the client did
not restart publish (it is akin to muted track). That failed migration
because migration state machine waits for unmuted tracks to be published
(i. e. server has to receive packets).

If a migrating track is in muted state, server does not wait for
packets. It synthesises the published event and catches up later when
packets actually come in.

Just treating all migrations as the erstwhile muted case. Synthesise
publish whether track is muted or not. In the unmuted case, packets
might arrive soon after whereas in muted case, it will depend on when
unmute happens.

This is tricky stuff. So, will need good testing.

* use muted from track info
2024-01-25 01:24:09 +05:30
Denys Smirnov
89c7cec2ad SIP: New protocol for creating participants. (#2404) 2024-01-24 20:01:22 +02:00
Pablo Fuente Pérez
f6608977f0 Fix race condition on Participant.updateState (#2401)
The comparison between the last and current ParticipantInfo_State wasn't atomic. This sometimes resulted in two calls to the onStateChange method for the same participant state. In the end this was reflected in two ACTIVE events being generated for the same participant at exactly the same moment. The fix uses the atomic method Swap to properly protect the "compare and set" operation and avoid any race condition.
2024-01-22 17:11:34 -08:00
Paul Wells
867325d120 restore legacy room delete behavior (#2400) 2024-01-22 05:18:12 -08:00
Paul Wells
cb42c6152c add psrpc redis keepalive (#2398)
* add psrpc redis keepalive

* deps
2024-01-21 06:16:40 -08:00
Raja Subramanian
8c932da678 Add ControllerNodeId and SelectionReason to StartSession. (#2396)
* Add ControllerNodeId and SelectionReason to StartSession.

Media node has that information and can log it in context.

* Update deps

* clean up and mage generate

* clean up and fix test

* clean up

* clean up
2024-01-19 17:06:09 +05:30
Jonas Schell
e255b8a51d update readme (#2392) 2024-01-18 15:48:23 -08:00
Paul Wells
fbd488adc3 remove participant key helpers (#2385)
* remove participant key helpers

* deps
2024-01-18 06:46:34 -08:00
Raja Subramanian
899067ba0f Simulation scenarios to disable signal channel on resume (#2389)
* Add a simulation scenario to disconnect signal channel on resume

- Requesting that scenario adds that participant to a map with a timeout
  of 5 seconds.
- If a resume (reconnect = 1) happens before the timeout, the signalling
  channel is closed immediately on resume.
- There is a clean up worker which will remove entries from the map when
  they time out.
- The participant is also removed from the map once the disconnect on
  resume has been invoked.

* simulate disconnect signal on resume no messages

* comment

* comment

* Close all retries

* update deps

* abort resume only if simulation applied

* Revert SIP change
2024-01-17 20:44:05 +05:30
Sean DuBois
750d2b5765 Update livekit/protocol (#2390)
Fix API breakage with SIP
2024-01-17 10:03:47 -05:00
Raja Subramanian
f29a28611b Prevent writable race. (#2388)
It is possible that onBindAndConnectedChanged gets executed in such a
way that `writable` does not have the correct value in some very rare
timing cases (i.e. two executions of the function race: one atomic is
read in the first execution, the second execution runs and sets
`writable`, and then the first execution completes and sets `writable`
to an incorrect value based on its stale read).

Prevent it by executing under bind lock.
2024-01-16 12:15:13 +05:30
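The stale-read race described in f29a28611b, and the effect of running the derivation under a lock, can be illustrated with the sketch below. The `track` type, its fields, and the helper `raceOnce` are hypothetical; only the callback name `onBindAndConnectedChanged` and the `writable` flag come from the commit message.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// track is a hypothetical type with two independent inputs and one
// derived flag.
type track struct {
	bound     atomic.Bool
	connected atomic.Bool
	writable  atomic.Bool

	bindLock sync.Mutex
}

// onBindAndConnectedChanged recomputes writable from both inputs.
// Holding bindLock for the read-and-store sequence means a slow
// execution cannot overwrite the result of a newer one with a value
// based on stale reads.
func (t *track) onBindAndConnectedChanged() {
	t.bindLock.Lock()
	defer t.bindLock.Unlock()
	t.writable.Store(t.bound.Load() && t.connected.Load())
}

// raceOnce flips both inputs from separate goroutines and returns the
// final value of writable.
func raceOnce() bool {
	t := &track{}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		t.bound.Store(true)
		t.onBindAndConnectedChanged()
	}()
	go func() {
		defer wg.Done()
		t.connected.Store(true)
		t.onBindAndConnectedChanged()
	}()
	wg.Wait()
	return t.writable.Load()
}

func main() {
	fmt.Println(raceOnce())
}
```

Whichever execution takes the lock last started after both inputs were stored, so the final value of `writable` reflects both. Without the lock, an execution that read `connected` before it was set could store `false` after the other execution had already stored `true`.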
cnderrauber
5b4848e772 remove dd debug logs (#2387) 2024-01-16 12:07:29 +08:00
Paul Wells
3f2f850bdb clean up legacy rpc (#2384)
* clean up legacy rpc

* cleanup

* cleanup

* cleanup

* tidy

* cleanup

* cleanup
2024-01-14 01:49:26 -08:00
renovate[bot]
1cb4b3e585 Update go deps (#2382)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-13 17:42:34 -08:00
renovate[bot]
2ba4e5c070 Update go deps (#2366)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2024-01-12 10:30:22 -08:00
Paul Wells
78bf642d67 record session once (#2381) 2024-01-12 05:55:56 -08:00
Paul Wells
c726cbf2ba increase max session start time bin size (#2380) 2024-01-12 03:49:23 -08:00
Raja Subramanian
bf0e88dea4 Squelch only the log, not the error return. (#2379) 2024-01-12 16:58:23 +05:30
Paul Wells
2fe2a9c9f2 add session start time metric (#2377) 2024-01-11 23:23:51 -08:00
Raja Subramanian
3687396d84 Squelch error logs while waiting for track resolve. (#2376) 2024-01-12 12:16:19 +05:30
Raja Subramanian
dc1b09c757 Update pion/ice to pick some more trace logging (#2374) 2024-01-11 13:57:29 +05:30