When a fallback is not applied, it is due to signal interruption.
ICE connection failures do happen, and every time there is an error, it
is due to "no selected pair".
Move all of it to `Debugw`. `setting ICE config` is the definitive log
which says if a different ICE config was applied.
* Maintain subscription count.
Does not affect function as it is not decremented only if limits are
configured. But, good to maintain proper count anyway.
* wire
Simplify a bit. Pending migration tracks need not be maintained as when
a migrated track is added, it is added to up track manager and treated
as a published track. When up track manager closes, published tracks
will close. So, no need to maintain a separate list.
* Buffer size config for video and audio.
There was only one buffer size in config.
In upstream, config value was used for video.
Audio used a hard coded value of 200 packets.
But, in the down stream sequencer, the config value was used for both
video and audio. So, if video was set up for high bit rate (deep
buffers), the audio sequencer ended up using a lot of memory too.
Split config to be able to control that and also not hard code audio.
Another optimisation here would be to not instantiate sequencer unless
NACK is negotiated.
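A minimal sketch of how the split could be resolved, for illustration only. The struct, the field names, the video default of 500 and the fallback to the deprecated single value are all assumptions; only the 200 packet audio value comes from the description above.

```go
package main

// Assumed defaults. 200 is the previously hard coded audio value
// mentioned above; 500 for video is purely illustrative.
const (
	defaultVideoBufferPackets = 500
	defaultAudioBufferPackets = 200
)

// bufferConfig is a hypothetical config shape, not the actual one.
type bufferConfig struct {
	PacketBufferSize      int // deprecated single value
	PacketBufferSizeVideo int
	PacketBufferSizeAudio int
}

// resolve returns the buffer sizes (in packets) to use for video and audio.
// Video falls back to the deprecated value and then to the default.
// Audio never inherits the deprecated value, so deep video buffers do not
// inflate audio memory use.
func (c bufferConfig) resolve() (video, audio int) {
	video = c.PacketBufferSizeVideo
	if video <= 0 {
		video = c.PacketBufferSize
	}
	if video <= 0 {
		video = defaultVideoBufferPackets
	}

	audio = c.PacketBufferSizeAudio
	if audio <= 0 {
		audio = defaultAudioBufferPackets
	}
	return video, audio
}
```

With only the deprecated value set to 2000, this resolves to 2000 packets for video and 200 for audio.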
* deprecate packet_buffer_size
It is possible that participant state and migration state updates are
racing. And a participant update could end up with no tracks when
migration is being processed.
Moving handling of migrated tracks to when the migration state moves to
completed. Pending data channels were already handled only on complete.
Move tracks also to that point.
Handling it earlier meant that track published callback happened and
ownership of track moved to new node before the new node could finish
peer connection. So, in cases where migration did not go through, this
caused confusion of track ownership.
* Remove some logs.
Also, changing Errorw -> Warnw in a bunch of places.
Going to move towards using `Errorw` for cases where a functionally
unexpected condition happens, i.e. a condition that by design should not
happen, yet was triggered.
* log error
Unless there are no published tracks, declare connected on primary PC
connected.
Streamlining this a bit. A bit of history:
- With original migration, migration complete was declared on all tracks
published.
- When muted tracks had to be migrated, a publish was synthesised for
muted tracks, but migration complete did not wait till publisher peer
connection connected.
- A few weeks back, those paths were merged and all cases were changed
to use synthesised publish.
- Previously the completion point was different between muted and
unmuted tracks. And with the change to treat everything like a muted
track, completion point changed.
Change it so that if publisher PC is expected to be active, wait for it
to be connected before declaring migration complete.
Firefox on Windows 10 seems to be producing simulcast tracks with
duplicate RID. That causes a leak as only one buffer is processed.
Ignore duplicate rid.
NOTE: This is not perfect as the actual layer -> rid is indeterminable
at addition time. It would require looking at packets to determine the
video dimensions and match to rid/layer to figure out which one is
correct and which one is duplicate.
To simplify though, taking the first one and dropping later ones.
This could mean the correct resolution is not streamed, but that should
be okay. The leak is far more destructive.
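The "first one wins" rule above can be sketched as follows. `layerInfo` and `dedupeByRID` are hypothetical names used only for illustration and are not taken from the code base.

```go
package main

// layerInfo is a hypothetical stand-in for a simulcast layer announced
// by the client.
type layerInfo struct {
	RID  string
	SSRC uint32
}

// dedupeByRID keeps the first layer seen for each RID and drops any later
// layer that repeats an already seen RID. Order of the kept layers is
// preserved.
func dedupeByRID(layers []layerInfo) []layerInfo {
	seen := make(map[string]struct{}, len(layers))
	kept := make([]layerInfo, 0, len(layers))
	for _, l := range layers {
		if _, dup := seen[l.RID]; dup {
			// Duplicate RID: ignore it so that no unprocessed buffer is created.
			continue
		}
		seen[l.RID] = struct{}{}
		kept = append(kept, l)
	}
	return kept
}
```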
* Ignore `disabled` when adaptive stream is enabled.
Due to interplay of adaptive stream/visibility/dynacast, when adaptive
stream is enabled, subscribed track forces visibility and starts
streaming at low quality. This would trigger a render on client and
trigger a visibility update.
So, even if a migration disables a track, upon migration complete and
subscription bind, ignore disable and stream.
* don't hold lock during callback
* don't need to store pubMuted
* don't need to hold settings lock for pub muted
* Log receiver close.
This is going to increase log volume, but want to check if peer
connection close trickles back into receiver close.
* log final close
* Do not synthesise DISCONNECT on session change.
v12 clients can handle session change based on identity.
* change for testf
* Squelch participant update if close reason is DUPLICATE_IDENTITY.
* fix test
* comment
* Clean up participant close reason a bit
* fix test
* test
Cannot send old style leave request during migration and other scenarios
when client is expected to resume. The old style can only do a full
reconnect or disconnect. If `CanReconnect: false` which will be the case
for resume, client will disconnect.
Add a parameter to selectively send leave request to older clients.
* Reverting participant worker.
Reverts https://github.com/livekit/livekit/pull/2420 partially.
This did not revert clean. So, reverting manually. Also, keeping the
drive-by clean up bits.
* fix test
It is possible that state of underlying object has changed between
event posting and event processing. So, cache data synchronously
and use it during event processing.
This is still not perfect as things like `hidden` and `IsClosed` are
still accessed in the worker. Ideally, a snapshot of the current state of
all required values would be posted to the worker and the worker would
operate only on that data.
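A rough sketch of the snapshot idea, assuming hypothetical types (`liveParticipant`, `participantSnapshot`) and a plain channel of closures standing in for the worker queue:

```go
package main

import "sync"

// participantSnapshot is a hypothetical immutable copy of the values the
// worker needs.
type participantSnapshot struct {
	Identity string
	Hidden   bool
	Closed   bool
}

// liveParticipant is a hypothetical mutable object whose state may change
// between event posting and event processing.
type liveParticipant struct {
	mu       sync.Mutex
	identity string
	hidden   bool
	closed   bool
}

func (p *liveParticipant) setHidden(hidden bool) {
	p.mu.Lock()
	p.hidden = hidden
	p.mu.Unlock()
}

// snapshot copies the required values under the lock.
func (p *liveParticipant) snapshot() participantSnapshot {
	p.mu.Lock()
	defer p.mu.Unlock()
	return participantSnapshot{Identity: p.identity, Hidden: p.hidden, Closed: p.closed}
}

// postUpdate captures the snapshot synchronously and queues a closure that
// only uses the captured copy, never the live object.
func postUpdate(queue chan<- func(), p *liveParticipant, handle func(participantSnapshot)) {
	snap := p.snapshot()
	queue <- func() { handle(snap) }
}
```

A later `setHidden(true)` does not affect an event already posted, because the closure holds its own copy.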
* Use a participant worker queue in room.
Removes selectively needing to call things in goroutine from
participant.
Also, a bit of drive-by clean up.
* spelling
* prevent race
* don't need to remove in goroutine as it is already running in the worker
* worker will get cleaned up in state change callback
* create participant worker only if not created already
* ref count participant worker
* maintain participant list
* clean up oldState
* Use Deque in ops queue.
Standardizing some uses
- Change OpsQueue to use Deque so that it can grow/shrink as necessary and
need not worry about channel getting full and dropping events.
- Change StreamAllocator and TelemetryService to use OpsQueue so that
they also need not worry about channel size and overflows.
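To illustrate the motivation, here is a minimal sketch of an ops queue whose backlog can grow instead of dropping events when a fixed size channel fills up. It uses a plain slice where the actual change uses a Deque, and all names (`opsQueue`, `Enqueue`, `Stop`) are assumptions for illustration.

```go
package main

import "sync"

// opsQueue runs queued operations in order on a single goroutine.
type opsQueue struct {
	mu     sync.Mutex
	ops    []func()      // growable backlog; a deque would also shrink
	wake   chan struct{} // capacity 1: signals "work may be available"
	done   chan struct{} // closed when the worker exits
	closed bool
}

func newOpsQueue() *opsQueue {
	q := &opsQueue{wake: make(chan struct{}, 1), done: make(chan struct{})}
	go q.run()
	return q
}

// Enqueue never blocks and never drops an op while the queue is open.
func (q *opsQueue) Enqueue(op func()) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.closed {
		return
	}
	q.ops = append(q.ops, op)
	select {
	case q.wake <- struct{}{}:
	default: // a wake-up is already pending
	}
}

// Stop rejects new ops, lets the worker drain the backlog, and waits for it.
func (q *opsQueue) Stop() {
	q.mu.Lock()
	if q.closed {
		q.mu.Unlock()
		return
	}
	q.closed = true
	close(q.wake) // closed under the lock, so Enqueue never sends after close
	q.mu.Unlock()
	<-q.done
}

func (q *opsQueue) run() {
	defer close(q.done)
	for {
		_, open := <-q.wake
		for {
			q.mu.Lock()
			if len(q.ops) == 0 {
				q.mu.Unlock()
				break
			}
			op := q.ops[0]
			q.ops[0] = nil // release the reference to the processed op
			q.ops = q.ops[1:]
			q.mu.Unlock()
			op() // run outside the lock so ops may enqueue more work
		}
		if !open {
			return
		}
	}
}
```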
* Address feedback
* delete obvious comment
* clean up
* Augment LeaveRequest with alternate regions to connect.
* update protocol and issue resume action on close if expected to resume
* use current protocol in tests
* address feedback
Used the full TrackInfo in my previous PR, but telemetry might be
relying on top level Width/Height. So, make a pared down TrackInfo to
report to telemetry.
Also, correct some spelling/comments.
* Unify muted and unmuted migration paths.
If dynacast had disabled all layers, after a migration, the client did
not restart publish (it is akin to muted track). That failed migration
because migration state machine waits for unmuted tracks to be published
(i.e. server has to receive packets).
If a migrating track is in muted state, server does not wait for
packets. It synthesises the published event and catches up later when
packets actually come in.
Just treating all migrations as the erstwhile muted case. Synthesise
publish whether track is muted or not. In the unmuted case, packets
might arrive soon after whereas in muted case, it will depend on when
unmute happens.
This is tricky stuff. So, will need good testing.
* use muted from track info
The comparison between the last and current ParticipantInfo_State wasn't
atomic. This sometimes resulted in two calls to the onStateChange method
for the same participant state, which in turn produced two ACTIVE events
for the same participant at exactly the same moment. The fix uses the
atomic Swap method to properly protect the "compare and set" operation
and avoid the race condition.
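A minimal sketch of the Swap based approach, using simplified, hypothetical types rather than the actual participant implementation. It relies on `atomic.Int32` from the standard library (Go 1.19+).

```go
package main

import "sync/atomic"

// participantState is a simplified stand-in for ParticipantInfo_State.
type participantState int32

const (
	stateJoining participantState = iota
	stateJoined
	stateActive
	stateDisconnected
)

// participant is a hypothetical, pared down participant.
type participant struct {
	state         atomic.Int32
	onStateChange func(old, cur participantState)
}

// updateState stores the new state and reads the previous one in a single
// atomic step. Of any number of concurrent callers moving to the same
// state, only the one that observes a different previous state fires the
// callback.
func (p *participant) updateState(next participantState) bool {
	old := participantState(p.state.Swap(int32(next)))
	if old == next {
		return false // another caller already made this transition
	}
	if p.onStateChange != nil {
		p.onStateChange(old, next)
	}
	return true
}
```

A separate load followed by a store would let two goroutines both read the old state before either writes, which is the double ACTIVE event described above.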
* Add a simulation scenario to disconnect signal channel on resume
- Requesting that scenario adds the participant to a map with a timeout
of 5 seconds.
- If a resume (reconnect = 1) happens before the timeout, the signalling
channel is closed immediately on resume.
- There is a clean up worker which will remove entries from the map when
they time out.
- The participant is also removed from the map if the disconnect on
resume is invoked once.
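The bookkeeping described in the list above could look roughly like this. All names here are hypothetical, and the clean up worker is reduced to a `Sweep` method that a ticker driven goroutine would call periodically.

```go
package main

import (
	"sync"
	"time"
)

// disconnectOnResumeTTL is the 5 second timeout mentioned above.
const disconnectOnResumeTTL = 5 * time.Second

// resumeDisconnectSim is a hypothetical tracker for participants that
// requested the "disconnect signal on resume" scenario.
type resumeDisconnectSim struct {
	mu      sync.Mutex
	ttl     time.Duration
	expires map[string]time.Time // participant identity -> expiry
}

func newResumeDisconnectSim(ttl time.Duration) *resumeDisconnectSim {
	return &resumeDisconnectSim{ttl: ttl, expires: make(map[string]time.Time)}
}

// Request records that the participant asked for the scenario.
func (s *resumeDisconnectSim) Request(identity string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.expires[identity] = time.Now().Add(s.ttl)
}

// ShouldDisconnect is called on resume. It reports whether the signalling
// channel should be closed and removes the entry, so the simulation applies
// at most once per request.
func (s *resumeDisconnectSim) ShouldDisconnect(identity string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	exp, ok := s.expires[identity]
	if !ok {
		return false
	}
	delete(s.expires, identity)
	return time.Now().Before(exp)
}

// Sweep removes timed out entries and returns how many were removed.
func (s *resumeDisconnectSim) Sweep() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := time.Now()
	removed := 0
	for identity, exp := range s.expires {
		if !now.Before(exp) {
			delete(s.expires, identity)
			removed++
		}
	}
	return removed
}
```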
* simulate disconnect signal on resume no messages
* comment
* comment
* Close all retries
* update deps
* abort resume only if simulation applied
* Revert SIP change