Sometimes the initially selected node can fail. In that case, we now retry a few times to locate a media node for the session instead of failing it after the first try.
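A minimal sketch of the retry idea, not the actual selector code. The `selectNode` type, `selectWithRetry`, and `errNodeUnavailable` are hypothetical names used only for illustration:

```go
package main

import (
	"errors"
	"fmt"
)

// errNodeUnavailable is a hypothetical error for a node that cannot take the session.
var errNodeUnavailable = errors.New("node unavailable")

// selectNode is a hypothetical selector: it returns the ID of a media node,
// or an error when the chosen node turns out to be unusable.
type selectNode func(attempt int) (string, error)

// selectWithRetry calls sel up to maxAttempts times and returns the first
// node that is selected successfully, or the last error seen.
func selectWithRetry(sel selectNode, maxAttempts int) (string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		node, err := sel(attempt)
		if err == nil {
			return node, nil
		}
		lastErr = err
	}
	return "", fmt.Errorf("no media node after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	// Fails on the first two attempts, then succeeds.
	flaky := func(attempt int) (string, error) {
		if attempt < 3 {
			return "", errNodeUnavailable
		}
		return "node-3", nil
	}
	node, err := selectWithRetry(flaky, 3)
	fmt.Println(node, err)
}
```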
Added a new manager to handle all subscription needs, implemented using the reconciler pattern. The goals are:
- improve subscription resilience by separating desired state and current state
- reduce complexity of synchronous processing
- better detect failures with the ability to trigger full reconnect
Related to livekit/protocol#273
This PR adds:
- ParticipantResumed - when an ICE restart or migration has occurred
- TrackPublishRequested - when we initiate a publication
- TrackSubscribeRequested - when we initiate a subscription
- TrackMuted - publisher muted track
- TrackUnmuted - publisher unmuted track
- TrackPublish/TrackSubscribe events will indicate when those actions have succeeded, to differentiate them from the requests.
* Fix handling of non-monotonic timestamps
Timed version is inspired by Hybrid Clock. We used to have mixed behavior
from using time.Time:
* during local comparisons, it increments monotonically
* when deserializing remote timestamps, we lose that attribute
So it's possible for two requests to be sent in the same microsecond, and
for the latter one to be dropped.
To fix this, I'm switching to storing plain timestamps so the behavior is
consistent, and accepting multiple updates in the same ms by incrementing ticks.
Also using @paulwe's idea of a version generator.
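A rough sketch of the timestamp-plus-ticks idea, assuming a clock that returns an integer timestamp in some fixed unit. The `timedVersion` and `generator` types here are hypothetical stand-ins, not the real implementation, and this version is not safe for concurrent use:

```go
package main

import "fmt"

// timedVersion pairs a wall-clock timestamp with a tick counter that
// orders versions created within the same timestamp.
type timedVersion struct {
	ts    int64
	ticks int32
}

// after reports whether v is strictly newer than other.
func (v timedVersion) after(other timedVersion) bool {
	if v.ts == other.ts {
		return v.ticks > other.ticks
	}
	return v.ts > other.ts
}

// generator hands out strictly increasing versions, even when the clock
// does not advance (or moves backwards) between calls.
type generator struct {
	now  func() int64 // clock source; injected so it can be faked
	last timedVersion
}

func (g *generator) next() timedVersion {
	ts := g.now()
	if ts <= g.last.ts {
		// Same (or earlier) timestamp: keep the timestamp, bump ticks.
		g.last.ticks++
	} else {
		g.last = timedVersion{ts: ts}
	}
	return g.last
}

func main() {
	// A frozen clock simulates two updates in the same timestamp.
	g := &generator{now: func() int64 { return 100 }}
	a := g.next()
	b := g.next()
	fmt.Println(a, b, b.after(a))
}
```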
* initial commit
* add correct label
* clean up
* more cleanup on adding stats
* cleanup
* move things to pub and sub monitors, ensure stats are correctly updated
* fix merge conflict
* Fix panic on macOS (#1296)
* fixing last feedback
Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
In certain scenarios such as migration, we do not want a duplicate event
to be sent when the participant is reconnecting. The Prometheus metric
should still be updated though.
* Add option to issue full reconnect on a publication error.
Leaving the publication error timeout at 30 seconds because some
publications take a long time to establish. There are also cases
where the peer connection fails after 30 seconds, i.e. the peer
connection failure happens after the publication error is detected.
Still, 30 seconds is a good amount of time for a publication to establish.
* prevent recursive lock
* Fixed single-node routing breakage.
Due to a regression from a previous change, Redis was always enabled even
when no configuration was provided.
* updated go modules
* API execution timeout is now configurable
In certain environments, it can take longer than the default 2s to
fully execute API requests, so the execution timeout is now a configurable option.
* do not expose api to YAML. internal for now.
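One way a configurable execution timeout with a 2s default might look, as a sketch only. The `apiConfig` struct, its `ExecutionTimeout` field, and the `execute` and `simulate` helpers are assumptions made for illustration, not the project's actual config or API:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// defaultExecutionTimeout mirrors the 2s default mentioned above.
const defaultExecutionTimeout = 2 * time.Second

// apiConfig is a hypothetical internal config holder.
type apiConfig struct {
	ExecutionTimeout time.Duration
}

// timeout falls back to the default when nothing is configured.
func (c apiConfig) timeout() time.Duration {
	if c.ExecutionTimeout <= 0 {
		return defaultExecutionTimeout
	}
	return c.ExecutionTimeout
}

// execute runs fn and gives up once the configured timeout elapses.
func execute(ctx context.Context, c apiConfig, fn func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(ctx, c.timeout())
	defer cancel()
	done := make(chan error, 1) // buffered so the goroutine never blocks
	go func() { done <- fn(ctx) }()
	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		return ctx.Err()
	}
}

// simulate executes a fake request that takes `work` to complete.
func simulate(c apiConfig, work time.Duration) error {
	return execute(context.Background(), c, func(ctx context.Context) error {
		select {
		case <-time.After(work):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	})
}

func main() {
	err := simulate(apiConfig{ExecutionTimeout: 20 * time.Millisecond}, 200*time.Millisecond)
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```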
Previously, CreateRoom only created the room in storage, but did not
hydrate it on an RTC node. This has caused strange behaviors such as
emptyTimeout not working correctly (#1109).
Also reduced room reap worker to consistently reap rooms. Fixes #241
- always call ticker.Stop
- clean up timer funcs (if any were added) on participant close
- sequencer test enhancement to add a real packet after a padding packet
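The first two cleanup items follow a common Go pattern, sketched here under assumptions. The `participant` type, `afterFunc`, `close`, `worker`, and `demoDelay` are hypothetical names, not the actual code touched by this change:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// demoDelay is long enough that demo timers never fire on their own.
const demoDelay = time.Hour

// participant is a hypothetical stand-in that tracks its pending timers.
type participant struct {
	mu     sync.Mutex
	timers []*time.Timer
	closed bool
}

// afterFunc schedules fn and remembers the timer so close can stop it.
func (p *participant) afterFunc(d time.Duration, fn func()) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.closed {
		return
	}
	p.timers = append(p.timers, time.AfterFunc(d, fn))
}

// close stops every pending timer and returns how many were stopped
// before they had a chance to fire.
func (p *participant) close() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.closed = true
	stopped := 0
	for _, t := range p.timers {
		if t.Stop() {
			stopped++
		}
	}
	p.timers = nil
	return stopped
}

// worker ticks until done is closed. The deferred Stop releases the
// ticker on every return path.
func worker(interval time.Duration, done <-chan struct{}, onTick func()) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			onTick()
		case <-done:
			return
		}
	}
}

func main() {
	p := &participant{}
	p.afterFunc(demoDelay, func() {})
	fmt.Println("timers stopped on close:", p.close())

	done := make(chan struct{})
	close(done)
	worker(demoDelay, done, func() {})
}
```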
The mapping is now only set when use_external_ip is enabled or node_ip is
explicitly set. If multiple local addresses resolve to the same external
IP, only the first one is mapped to the external IP, to avoid candidate
conflicts between different clients.