In certain scenarios such as migration, we do not want a duplicate event
to be sent when the participant is reconnecting. The Prometheus metric
should still be updated though.
* Add option to issue full reconnect on a publication error.
The publication error timeout stays at 30 seconds because some
publications take a long time to establish. There are also cases
where the peer connection fails after 30 seconds, i.e. after the
publication error has already been detected. Still, 30 seconds is
a reasonable amount of time for a publication to establish.
* prevent recursive lock
* Fixed single-node routing breakage.
Due to a regression introduced by a previous change, Redis was always
enabled even when no configuration was provided.
* updated go modules
* API execution timeout is now configurable
In certain environments, it can take longer than the default 2s to
fully execute API requests. This makes the execution timeout a
configurable option.
* do not expose api to YAML. internal for now.
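A minimal sketch of how a configurable execution timeout with a 2s default could be wired up. The `APIConfig` struct, its `ExecutionTimeout` field, and the `Execute` helper are hypothetical names used for illustration, not the project's actual API.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// defaultExecutionTimeout is the 2s default mentioned in the description.
const defaultExecutionTimeout = 2 * time.Second

// APIConfig is a hypothetical config struct; the field name is an assumption.
type APIConfig struct {
	ExecutionTimeout time.Duration
}

// Timeout returns the configured timeout, falling back to the default when unset.
func (c APIConfig) Timeout() time.Duration {
	if c.ExecutionTimeout <= 0 {
		return defaultExecutionTimeout
	}
	return c.ExecutionTimeout
}

// Execute runs fn with a deadline derived from the config.
func Execute(conf APIConfig, fn func(ctx context.Context) error) error {
	ctx, cancel := context.WithTimeout(context.Background(), conf.Timeout())
	defer cancel()
	return fn(ctx)
}

// slowCall simulates a request that takes d to finish unless its context expires first.
func slowCall(d time.Duration) func(context.Context) error {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	// Unset timeout: the 2s default applies and the short call succeeds.
	fmt.Println(Execute(APIConfig{}, slowCall(10*time.Millisecond)))

	// A tight timeout cuts off a call that would take one second.
	tight := APIConfig{ExecutionTimeout: 5 * time.Millisecond}
	fmt.Println(Execute(tight, slowCall(time.Second)))
}
```

Treating a zero value as "use the default" keeps existing deployments unchanged while letting slower environments raise the limit.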
Previously, CreateRoom only created the room in storage, but did not
hydrate it on an RTC node. This has caused strange behaviors such as
emptyTimeout not working correctly (#1109).
Also reduced room reap worker to consistently reap rooms. Fixes #241
- ticker.Stop always
- clean up timer func (if they are added) on participant close
- sequencer test enhancement to add a real packet after a padding packet
The mapping is now set only when use_external_ip is enabled or node_ip
is explicitly set. If multiple local addresses resolve to the same
external IP, only the first one is mapped to it, which avoids candidate
conflicts between different clients.
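The rule above can be sketched as follows. This is an illustration only: the `resolvedAddr` type, the `shouldMap` and `buildMappings` helpers, and the "external/local" string format are assumptions, not the project's actual code.

```go
package main

import "fmt"

// resolvedAddr pairs a local interface address with the external IP it resolved to.
// Hypothetical type for illustration.
type resolvedAddr struct {
	Local    string
	External string
}

// shouldMap reports whether any mapping should be set at all: only when the
// external-IP option is enabled or a node IP was explicitly configured.
func shouldMap(externalIPEnabled bool, nodeIP string) bool {
	return externalIPEnabled || nodeIP != ""
}

// buildMappings keeps only the first local address for each external IP,
// so two local addresses never advertise the same external candidate.
func buildMappings(addrs []resolvedAddr) []string {
	seen := make(map[string]bool)
	var mappings []string
	for _, a := range addrs {
		if seen[a.External] {
			continue // a previous local address already claimed this external IP
		}
		seen[a.External] = true
		mappings = append(mappings, a.External+"/"+a.Local)
	}
	return mappings
}

func main() {
	addrs := []resolvedAddr{
		{Local: "10.0.0.2", External: "203.0.113.7"},
		{Local: "10.0.1.2", External: "203.0.113.7"}, // duplicate external IP, skipped
		{Local: "192.168.1.5", External: "198.51.100.4"},
	}
	if shouldMap(true, "") {
		fmt.Println(buildMappings(addrs))
	}
}
```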
This allows deleting and updating an ingress even if the ingress server that was handling it died. It does, however, mean that if the ingress responds again later, its state will be inconsistent. To make this less likely, also keep trying to contact the ingress for 1 min in the background.
Also fixes a race where an active deleted Ingress would get recreated on delete because of the update triggered by the ingress session shutdown.
In extreme cases, media nodes could take more than 5s to spin up the
session. Increasing this timeout to 10s reduces the number of disconnections
due to edge cases.
* Remove VP9 from media engine setup.
* Remove VP9 from config sample
* Supervisor beginnings
Eventual goal is to have a reconciler which moves state from
actual -> desired. First step along the way is to observe/monitor.
The first step within that is an initial implementation to get
feedback on the direction.
This PR is a start in that direction:
- Concept of a supervisor at local participant level
- This supervisor will be responsible for periodically monitoring
actual vs desired state (this is the one which will eventually trigger
other things to reconcile, but for now it just logs on error)
- A new interface `OperationMonitor` which requires two methods
o Check() returns an error based on actual vs desired state.
o IsIdle() returns bool. Returns true if the monitor is idle.
- The supervisor maintains a list of monitors and checks them periodically.
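A rough sketch of the shape described in the list above. Only the `OperationMonitor` interface with `Check()` and `IsIdle()` comes from the description; the `Supervisor` struct, its methods, the choice to skip idle monitors, and the `staticMonitor` test double are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// OperationMonitor has the two methods named in the description.
type OperationMonitor interface {
	Check() error // error when actual state does not match desired state
	IsIdle() bool // true when the monitor has nothing pending
}

// Supervisor is a hypothetical holder of monitors.
type Supervisor struct {
	mu       sync.Mutex
	monitors []OperationMonitor
}

// Add registers a monitor with the supervisor.
func (s *Supervisor) Add(m OperationMonitor) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.monitors = append(s.monitors, m)
}

// CheckAll runs one pass over the monitors and collects their errors.
// Skipping idle monitors is an assumption of this sketch.
func (s *Supervisor) CheckAll() []error {
	s.mu.Lock()
	defer s.mu.Unlock()
	var errs []error
	for _, m := range s.monitors {
		if m.IsIdle() {
			continue
		}
		if err := m.Check(); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// Run checks periodically until stop is closed. For now it only logs errors.
func (s *Supervisor) Run(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			for _, err := range s.CheckAll() {
				fmt.Println("supervisor:", err)
			}
		}
	}
}

// staticMonitor is a test double with a fixed result.
type staticMonitor struct {
	err  error
	idle bool
}

func (m staticMonitor) Check() error { return m.err }
func (m staticMonitor) IsIdle() bool { return m.idle }

// newStaticMonitor builds a monitor that fails with msg (or passes when msg is empty).
func newStaticMonitor(msg string, idle bool) staticMonitor {
	var err error
	if msg != "" {
		err = errors.New(msg)
	}
	return staticMonitor{err: err, idle: idle}
}

func main() {
	s := &Supervisor{}
	s.Add(newStaticMonitor("subscription not satisfied", false))
	s.Add(newStaticMonitor("", false))

	stop := make(chan struct{})
	go s.Run(10*time.Millisecond, stop)
	time.Sleep(35 * time.Millisecond)
	close(stop)
}
```

Keeping the interface this small means a new kind of monitor only has to answer two questions, so the supervisor loop does not change as monitors are added.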
Within this framework, the first monitors cover
subscriptions/unsubscriptions. A new module,
`SubscriptionMonitor`, checks subscription transitions.
A subscription transition is queued on subscribe/unsubscribe.
The transition can be satisfied when a subscribedTrack is added OR
removed. The error condition is a transition that is not satisfied for
10 seconds. Idle is when the transition queue is empty and
subscribedTrack is nil, i.e. the last transition would have been
unsubscribe and the subscribed track was removed (unsubscribe satisfied).
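The transition rules above could look roughly like this. It is a simplified, single-goroutine sketch: the method names, the injected clock, and the boolean stand-in for `subscribedTrack` are assumptions, and a real implementation would need locking.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// transitionTimeout is the 10 second limit from the description.
const transitionTimeout = 10 * time.Second

type transition struct {
	isSubscribe bool
	queuedAt    time.Time
}

// SubscriptionMonitor tracks desired transitions against the actual state.
type SubscriptionMonitor struct {
	now        func() time.Time // injected clock (assumption, eases testing)
	pending    []transition
	subscribed bool // stand-in for "subscribedTrack != nil"
}

func NewSubscriptionMonitor(now func() time.Time) *SubscriptionMonitor {
	return &SubscriptionMonitor{now: now}
}

// QueueTransition records a desired subscribe (true) or unsubscribe (false).
func (m *SubscriptionMonitor) QueueTransition(isSubscribe bool) {
	m.pending = append(m.pending, transition{isSubscribe: isSubscribe, queuedAt: m.now()})
	m.settle()
}

// SetSubscribedTrack records that the subscribed track was added or removed.
func (m *SubscriptionMonitor) SetSubscribedTrack(present bool) {
	m.subscribed = present
	m.settle()
}

// settle pops transitions from the front while the actual state satisfies them.
func (m *SubscriptionMonitor) settle() {
	for len(m.pending) > 0 && m.pending[0].isSubscribe == m.subscribed {
		m.pending = m.pending[1:]
	}
}

// Check returns an error when the oldest transition has waited too long.
func (m *SubscriptionMonitor) Check() error {
	if len(m.pending) == 0 {
		return nil
	}
	if m.now().Sub(m.pending[0].queuedAt) > transitionTimeout {
		return errors.New("subscription transition timed out")
	}
	return nil
}

// IsIdle is true when nothing is queued and no track is subscribed.
func (m *SubscriptionMonitor) IsIdle() bool {
	return len(m.pending) == 0 && !m.subscribed
}

// fakeClock is a manually advanced clock for the demo.
type fakeClock struct{ t time.Time }

func (c *fakeClock) Now() time.Time       { return c.t }
func (c *fakeClock) AdvanceSeconds(n int) { c.t = c.t.Add(time.Duration(n) * time.Second) }

func main() {
	clock := &fakeClock{}
	m := NewSubscriptionMonitor(clock.Now)

	m.QueueTransition(true) // subscribe requested
	clock.AdvanceSeconds(11)
	fmt.Println("stale subscribe:", m.Check())

	m.SetSubscribedTrack(true) // subscribe satisfied
	fmt.Println("after track added:", m.Check(), "idle:", m.IsIdle())

	m.QueueTransition(false)    // unsubscribe requested
	m.SetSubscribedTrack(false) // unsubscribe satisfied
	fmt.Println("after track removed:", m.Check(), "idle:", m.IsIdle())
}
```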
The idea is that individual monitors can check on different things.
Some more things that I am thinking about are:
- PublishedTrackMonitor - started when an add track happens,
satisfied when OnTrack happens, error if `OnTrack` does not
fire for a while and track is not muted, idle when there is
nothing pending.
- PublishedTrackStreamingMonitor - to ensure that a published track
is receiving media at the server (accounting for dynacast, mute, etc)
- SubscribedTrackStreamingMonitor - to ensure down track is sending
data unless muted.
* Remove debug
* Protect against early casting errors
* Adding PublicationMonitor