Commit Graph

1923 Commits

Author SHA1 Message Date
Raja Subramanian d216f94ac1 Remove some logs. (#2484)
* Remove some logs.

Also, changing Errorw -> Warnw in a bunch of places.
Going to move towards using `Errorw` for cases where a functionally
unexpected condition happens, i.e., a condition that by design should
not happen, yet was triggered.

* log error
2024-02-15 18:05:50 +05:30
Raja Subramanian e4c112929c Declare migration complete on publisher PC connected. (#2481)
Unless there are no published tracks, declare connected on primary PC
connected.

Streamlining this a bit. A bit of history
- With original migration, migration complete was declared on all tracks
  published.
- When muted tracks had to be migrated, a publish was synthesised for
  muted tracks, but migration complete did not wait till publisher peer
  connection connected.
- A few weeks back, those paths were merged and all cases were changed
  to use synthesised publish.
- Previously the completion point was different between muted and
  unmuted tracks. And with the change to treat everything like a muted
  track, completion point changed.

Change it so that if publisher PC is expected to be active, wait for it
to be connected before declaring migration complete.
2024-02-13 22:56:28 +05:30
Raja Subramanian f7b6e915cb Fix return on dropping a padding packet. (#2479)
Had deleted an extra line while cleaning up.
2024-02-13 14:24:31 +05:30
Raja Subramanian 0bcd9a2f8b Remove some noisy logs (#2477) 2024-02-13 12:01:20 +05:30
Raja Subramanian 07f64251b2 Delete spammy log (#2476)
* Move spammy log to Debugw

* Actually, delete as log is not useful
2024-02-13 00:06:57 +05:30
Denys Smirnov 3674217d64 Support new SIP protocol. (#2474) 2024-02-12 18:52:21 +02:00
Raja Subramanian 49fd332e91 Store first SR also as it can get reset (#2472) 2024-02-12 12:14:25 +05:30
Raja Subramanian 89a312d259 Ignore duplicate RID. (#2471)
Firefox on Windows 10 seems to be producing simulcast tracks with
duplicate RID. That causes a leak as only one buffer is processed.

Ignore duplicate rid.

NOTE: This is not perfect as the actual layer -> rid is indeterminable
at addition time. It would require looking at packets to determine the
video dimensions and match to rid/layer to figure out which one is
correct and which one is duplicate.

To simplify though, taking the first one and dropping later ones.
This could mean the correct resolution is not streamed, but that should
be okay. The leak is far more destructive.
2024-02-12 11:49:14 +05:30
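The "take the first one and drop later ones" rule described above could be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `layer` type, its fields, and `dedupeByRID` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// layer is a hypothetical stand-in for one simulcast layer announced by a client.
type layer struct {
	RID  string
	SSRC uint32
}

// dedupeByRID keeps the first layer seen for each RID and drops any later
// layer that reuses the same RID, preserving the original order.
func dedupeByRID(layers []layer) []layer {
	seen := make(map[string]struct{}, len(layers))
	kept := make([]layer, 0, len(layers))
	for _, l := range layers {
		if _, dup := seen[l.RID]; dup {
			continue // duplicate RID: ignore rather than create a second buffer
		}
		seen[l.RID] = struct{}{}
		kept = append(kept, l)
	}
	return kept
}

func main() {
	in := []layer{{"q", 1}, {"h", 2}, {"q", 3}, {"f", 4}}
	for _, l := range dedupeByRID(in) {
		fmt.Println(l.RID, l.SSRC)
	}
}
```

As the commit message notes, the first occurrence is not guaranteed to be the correct layer; the sketch only shows the simplification of keeping one entry per RID.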
Théo Monnom 927d8fc0ef UserPacket sid should be empty for hidden participants (#2469) 2024-02-11 03:30:51 +01:00
Mathew Kamkar 7508560fde larger buckets for jitter prometheus histogram (#2468) 2024-02-09 12:09:51 -08:00
Paul Wells 213b46dca9 skip confirming room persistence (#2466) 2024-02-08 16:54:14 -08:00
Benjamin Pracht b659fef8ed Add support for ingress ParticipantMetadata (#2461) 2024-02-08 13:59:26 -08:00
Raja Subramanian d20811d1c2 Ignore disabled when adaptive stream is enabled. (#2463)
* Ignore `disabled` when adaptive stream is enabled.

Due to interplay of adaptive stream/visibility/dynacast, when adaptive
stream is enabled, subscribed track forces visibility and starts
streaming at low quality. This would trigger a render on client and
trigger a visibility update.

So, even if a migration disables a track, upon migration complete and
subscription bind, ignore disable and stream.

* don't hold lock during callback

* don't need to store pubMuted

* don't need to hold settings lock for pub muted
2024-02-08 18:58:48 +05:30
Paul Wells fd35c9edfc add exponential backoff to room service check retries (#2462) 2024-02-07 19:35:58 -08:00
Raja Subramanian f95194c833 Fixes to sync state disabled tracks. (#2459)
* Fixes to sync state disabled tracks.

* test
2024-02-07 13:52:57 +05:30
Raja Subramanian e1fb69b634 Synthesize a track setting on sync state. (#2455)
* Synthesize a track setting on sync state.

* Add setting before subscription

* clean up

* Skip tracks that are not subscribed

* protocol deps
2024-02-07 09:32:56 +05:30
Raja Subramanian 5a310f961c Log receiver close. (#2456)
* Log receiver close.

This is going to increase log volume, but want to check if peer
connection close trickles back into receiver close.

* log final close
2024-02-06 23:33:58 +05:30
cnderrauber af0a8fbbbc add log for extpacket accumulated (#2454) 2024-02-06 21:38:36 +08:00
Raja Subramanian 2f9ec2117f Do not enqueue after stop. (#2457)
Else, that could leak as the process routine may have exited and the
entries are not processed anywhere.
2024-02-06 19:01:24 +05:30
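One way to picture the guard described above is a queue that refuses new entries once it has been stopped. The sketch below is an assumption-laden illustration: `stoppableQueue` and its methods are hypothetical names, not the types touched by the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// stoppableQueue is a hypothetical queue that rejects entries after Stop.
type stoppableQueue struct {
	mu      sync.Mutex
	stopped bool
	items   []string
}

// Enqueue adds an item and reports whether it was accepted. After Stop it
// returns false, so nothing accumulates once the consumer may have exited.
func (q *stoppableQueue) Enqueue(item string) bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.stopped {
		return false
	}
	q.items = append(q.items, item)
	return true
}

// Stop marks the queue as stopped; later Enqueue calls are rejected.
func (q *stoppableQueue) Stop() {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.stopped = true
}

// Len returns the number of items currently held.
func (q *stoppableQueue) Len() int {
	q.mu.Lock()
	defer q.mu.Unlock()
	return len(q.items)
}

func main() {
	q := &stoppableQueue{}
	fmt.Println(q.Enqueue("before stop"))
	q.Stop()
	fmt.Println(q.Enqueue("after stop"))
	fmt.Println(q.Len())
}
```

Checking the flag under the same lock that protects the slice keeps the stop check and the append from interleaving.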
cnderrauber be87a1b6f0 Support rtx for publisher (#2452)
* Support rtx for publisher

* remove log

* solve comment
2024-02-06 21:30:37 +08:00
Raja Subramanian 716844c383 Log unpublish for debug. (#2451)
Also, call unpublish callback irrespective of state and let the callback
do the needed checks.
2024-02-06 10:39:57 +05:30
Denys Smirnov e9cff525f4 Add method for creating SIP participants with a custom token. (#2448) 2024-02-06 00:13:44 +02:00
Raja Subramanian b7147efb87 Close published tracks on participant close (#2446) 2024-02-05 13:41:41 +05:30
Raja Subramanian 7c16ca6a0c Log feed Sender Report to better understand forwarded sender report (#2443)
anomalies.
2024-02-04 11:12:22 +05:30
Paul Wells 4bce0e7ed4 fix startup with -dev and -config (#2442) 2024-02-03 14:57:07 -08:00
Raja Subramanian d0128b19cd Reset sender reports before measuring clock skew. (#2437) 2024-02-02 21:52:43 +05:30
Raja Subramanian 174e69c81d Restore min score to 30. (#2435)
Was at 20 when LOST was introduced, but was going to 20 even under
non-LOST conditions. When there are packets, want the min to be at 30.
Going down to 20 resulted in reporting LOST quality even when packets
were flowing (although they were experiencing heavy loss and quality
would have been very bad, yet they are not lost).

Also, sample warning about adding packet to bucket even more.
2024-02-02 08:52:52 +05:30
Raja Subramanian ff69c2aa11 Add debug to understand VP9 freezes. (#2434)
* Add debug to understand VP9 freezes.

Have reports of VP9 freezing in some rooms.
Some data indicates that NACKs are received by SFU, but cannot get RTP
packet when that happens. It is possible that the NACKs are all from
dropped packets. Adding some debug to understand drops/NACKs better.

* enable DD debug

* comment out DD debug

* markers

* add back log about diff length mismatch

* add back key frame mismatch logging

* log skipped drops also
2024-01-31 15:33:39 +05:30
Raja Subramanian c8b7d486b9 Do not synthesise DISCONNECT on session change. (#2412)
* Do not synthesise DISCONNECT on session change.

v12 clients can handle session change based on identity.

* change for testf

* Squelch participant update if close reason is DUPLICATE_IDENTITY.

* fix test

* comment

* Clean up participant close reason a bit

* fix test

* test
2024-01-31 11:36:50 +05:30
Raja Subramanian a68500d4a1 Selective send of LeaveRequest. (#2429)
Cannot send old style leave request during migration and other scenarios
when client is expected to resume. The old style can only do a full
reconnect or disconnect. If `CanReconnect: false` which will be the case
for resume, client will disconnect.

Add a parameter to selectively send leave request to older clients.
2024-01-29 14:49:53 +05:30
Raja Subramanian d53f167b31 LeaveRequest changes. (#2426)
Reworking this a bit
1. Send leave whenever the signal channel is closed to induce a resume.
2. Use a getter to get regions rather than setting.
2024-01-29 13:04:18 +05:30
Raja Subramanian 2a3de84351 Reverting participant worker. (#2428)
* Reverting participant worker.

Reverts https://github.com/livekit/livekit/pull/2420 partially.

This did not revert clean. So, reverting manually. Also, keeping the
drive-by clean up bits.

* fix test
2024-01-29 13:03:32 +05:30
Raja Subramanian ad072f0836 Revert "Plug worker leaks" (#2427) 2024-01-29 12:40:55 +05:30
Raja Subramanian 846121e781 Revert "Cache data synchronously for processing in worker." (#2425) 2024-01-29 12:36:27 +05:30
Paul Wells 0be241eed8 refactor transport callbacks as interface (#2423)
* refactor transport callbacks as interface

* test
2024-01-28 21:35:25 -08:00
Raja Subramanian efbc985c82 Cache data synchronously for processing in worker. (#2424)
It is possible that state of underlying object has changed between
event posting and event processing. So, cache data synchronously
and use it during event processing.

This is still not perfect as things like `hidden` and `IsClosed` are
accessed in worker. Ideally, it can be a snapshot of current state of
all required values that can be posted to the worker and the worker just
operates with data.
2024-01-29 10:57:41 +05:30
Raja Subramanian ea2fa30cf8 Plug worker leaks (#2422)
Thank you @paulwe
2024-01-28 23:12:33 +05:30
Raja Subramanian bcf9fe3f0f Use a participant worker queue in room. (#2420)
* Use a participant worker queue in room.

Removes selectively needing to call things in goroutine from
participant.

Also, a bit of drive-by clean up.

* spelling

* prevent race

* don't need to remove in goroutine as it is already running in the worker

* worker will get cleaned up in state change callback

* create participant worker only if not created already

* ref count participant worker

* maintain participant list

* clean up oldState
2024-01-28 22:10:35 +05:30
Raja Subramanian 38352b6125 Change transport queue. (#2419)
From a channel to OpsQueue. Have seen extreme cases (with a ton of
candidates) overflowing the channel.
2024-01-28 14:28:29 +05:30
Raja Subramanian b71d373f4a Use Deque in ops queue. (#2418)
* Use Deque in ops queue.

Standardizing some uses
- Change OpsQueue to use Deque so that it can grow/shrink as necessary and
  need not worry about channel getting full and dropping events.
- Change StreamAllocator and TelemetryService to use OpsQueue so that
  they also need not worry about channel size and overflows.

* Address feedback

* delete obvious comment

* clean up
2024-01-28 13:48:30 +05:30
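The motivation above (a queue that grows as needed instead of a fixed-size channel that can fill up and drop events) can be illustrated with a small sketch. It makes several assumptions: `opsQueue`, `newOpsQueue`, `Enqueue`, and `Close` are hypothetical names for this example, and a plain slice stands in for the Deque mentioned in the commit.

```go
package main

import (
	"fmt"
	"sync"
)

// opsQueue is a hypothetical unbounded queue of operations run by one worker.
// A slice stands in for a real deque; it grows as needed, so Enqueue never
// blocks and never drops an operation.
type opsQueue struct {
	mu   sync.Mutex
	cond *sync.Cond
	ops  []func()
	done bool
	wg   sync.WaitGroup
}

func newOpsQueue() *opsQueue {
	q := &opsQueue{}
	q.cond = sync.NewCond(&q.mu)
	q.wg.Add(1)
	go q.run()
	return q
}

// Enqueue appends an operation and wakes the worker.
func (q *opsQueue) Enqueue(op func()) {
	q.mu.Lock()
	q.ops = append(q.ops, op)
	q.mu.Unlock()
	q.cond.Signal()
}

// Close lets the worker drain what is already queued, then waits for it to exit.
func (q *opsQueue) Close() {
	q.mu.Lock()
	q.done = true
	q.mu.Unlock()
	q.cond.Signal()
	q.wg.Wait()
}

// run executes operations in FIFO order until the queue is closed and empty.
func (q *opsQueue) run() {
	defer q.wg.Done()
	for {
		q.mu.Lock()
		for len(q.ops) == 0 && !q.done {
			q.cond.Wait()
		}
		if len(q.ops) == 0 {
			q.mu.Unlock()
			return
		}
		op := q.ops[0]
		q.ops[0] = nil // release the reference so it can be collected
		q.ops = q.ops[1:]
		q.mu.Unlock()
		op() // run outside the lock so operations may enqueue more work
	}
}

func main() {
	q := newOpsQueue()
	count := 0
	for i := 0; i < 1000; i++ {
		q.Enqueue(func() { count++ }) // only the worker touches count
	}
	q.Close()
	fmt.Println(count)
}
```

A burst of 1000 operations is absorbed without any sizing decision, which is the property a buffered channel cannot offer.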
Benjamin Pracht c2549081c8 Allow creating SRT URL pull ingress (#2416) 2024-01-26 14:03:46 -08:00
Paul Wells 9eca035738 revert signal retry (#2413) 2024-01-26 08:14:49 -08:00
cnderrauber 9b4ba2d41d use default max playout delay as chrome (#2411) 2024-01-26 13:32:54 +08:00
cnderrauber 995fddbaf9 Add dynamic playout delay if PlayoutDelay enabled in the room (#2403)
* Add dynamic playout delay

* type for state
2024-01-26 09:33:35 +08:00
Paul Wells 025eb1164c retry signal stream start (#2410) 2024-01-25 15:48:12 -08:00
Raja Subramanian d3da94c45e Augment LeaveRequest with alternate regions to connect. (#2408)
* Augment LeaveRequest with alternate regions to connect.

* update protocol and issue resume action on close if expected to resume

* use current protocol in tests

* address feedback
2024-01-25 22:22:46 +05:30
Raja Subramanian 43a40eb52d Using minimal TrackInfo when reporting to telemetry. (#2407)
Used the full TrackInfo in my previous PR, but telemetry might be
relying on top level Width/Height. So, make a pared down TrackInfo to
report to telemetry.

Also, correct some spelling/comments.
2024-01-25 10:27:55 +05:30
Raja Subramanian 79cdc2df2e Unify muted and unmuted migration paths. (#2406)
* Unify muted and unmuted migration paths.

If dynacast had disabled all layers, after a migration, the client did
not restart publish (it is akin to muted track). That failed migration
because migration state machine waits for unmuted tracks to be published
(i.e., server has to receive packets).

If a migrating track is in muted state, server does not wait for
packets. It synthesises the published event and catches up later when
packets actually come in.

Just treating all migrations as the erstwhile muted case. Synthesise
publish whether track is muted or not. In the unmuted case, packets
might arrive soon after whereas in muted case, it will depend on when
unmute happens.

This is tricky stuff. So, will need good testing.

* use muted from track info
2024-01-25 01:24:09 +05:30
Denys Smirnov 89c7cec2ad SIP: New protocol for creating participants. (#2404) 2024-01-24 20:01:22 +02:00
Pablo Fuente Pérez f6608977f0 Fix race condition on Participant.updateState (#2401)
The comparison between the last and current ParticipantInfo_State wasn't atomic. This sometimes resulted in two calls to the onStateChange method for the same participant state, which in turn showed up as two ACTIVE events being generated for the same participant at exactly the same moment. The fix uses the atomic method Swap to properly protect the "compare and set" operation and avoid the race condition.
2024-01-22 17:11:34 -08:00
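The pattern described in the fix above, swapping the state atomically and comparing against the value that was replaced, might look roughly like this. It is a sketch under assumptions: the `participant` type, the integer state values, and the `changes` counter are hypothetical, and `atomic.Int32` requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// participant is a hypothetical type holding a state and a count of
// state-change notifications.
type participant struct {
	state   atomic.Int32
	changes atomic.Int32
}

// updateState stores the new state and notifies only if it differs from the
// previous one. Swap returns the old value in the same atomic step, so two
// concurrent callers setting the same state cannot both see a change.
func (p *participant) updateState(next int32) {
	if prev := p.state.Swap(next); prev == next {
		return
	}
	p.changes.Add(1) // stands in for the state-change callback
}

func main() {
	const active = 2 // hypothetical value for an ACTIVE state
	p := &participant{}
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			p.updateState(active)
		}()
	}
	wg.Wait()
	fmt.Println(p.changes.Load()) // prints 1
}
```

With a separate load followed by a store, two goroutines could both read the old state before either writes, and both would fire the notification.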