Seeing some supervisor error logs under two conditions:
- Issuing a full reconnect - the client should close this session and
form a new one. So, supervisor errors on the to-be-closed session
are not useful.
- Sometimes it takes a long time for the publisher PC to establish.
If the publish monitor timer starts when a pending track is added,
the timeout fires before ICE/DTLS is established. So, include
a condition to start the timer on the publication monitor only after
the peer connection is connected.
This allows deleting and updating an ingress even if the ingress server that was handling it died. It does however mean that if the ingress responds again later, its state will be inconsistent. To make this somewhat less likely, also keep trying to contact the ingress for 1 min in the background.
Also fixing a race where an active Ingress being deleted would get recreated on delete because of the update triggered by the ingress session shutdown.
With migration in, once the local track is published, the
remote track should be closed. Add a flag to `RemovePublishedTrack`
to control the close behaviour. Invoke `Close` if specified.
Without it, the remote track is not closed if it is waiting to resolve,
i.e. not yet attached. That remote track is left hanging.
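A minimal sketch of the flag follows. Only the name `RemovePublishedTrack` and the call to `Close` come from the description above; the `mediaTrack` and `participant` types, the `shouldClose` parameter name, and the method signature are assumptions.

```go
package main

import "fmt"

// mediaTrack and participant are hypothetical stand-ins for the real types.
type mediaTrack struct {
	id     string
	closed bool
}

// Close marks the track as closed; the real Close would release resources.
func (t *mediaTrack) Close() { t.closed = true }

type participant struct {
	published map[string]*mediaTrack
}

// RemovePublishedTrack drops the track from the published set and, when
// shouldClose is true, also closes it so it is not left hanging.
func (p *participant) RemovePublishedTrack(t *mediaTrack, shouldClose bool) {
	delete(p.published, t.id)
	if shouldClose {
		t.Close()
	}
}

func main() {
	remote := &mediaTrack{id: "TR_remote"}
	p := &participant{published: map[string]*mediaTrack{remote.id: remote}}
	// After migration the local track takes over, so the remote one is closed.
	p.RemovePublishedTrack(remote, true)
	fmt.Println(len(p.published), remote.closed)
}
```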
It's my understanding that the nodeIP config can be set to ensure that a
specific IP is provided for the host candidate. The code being changed
here was added as a convenience so that:
| By giving it STUN servers, it should be
| connectable even without passing in --node-ip explicitly
We'd prefer to be able to specify a nodeIP and then as a side effect
have a STUN server added.
In extreme cases, media nodes could take more than 5s to spin up the
session. Increasing this timeout to 10s reduces the number of disconnections
due to edge cases.
With removal of a subscription, the unsubscribe could happen twice:
once when the subscriber actually unsubscribes and once when the
publisher removes all subscribers. The second unsubscribe was never
removed from the event list and eventually timed out.
So, process events as long as the condition is satisfied.
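The fix can be pictured as a loop instead of a single pop. The sketch below is an assumption about the shape of the event list; `transition` and `drainSatisfied` are hypothetical names.

```go
package main

import "fmt"

// transition is a hypothetical queued subscription change.
type transition struct {
	isSubscribe bool
}

// drainSatisfied removes every leading transition that the current state
// satisfies: a subscribe is satisfied when a subscribed track is present,
// an unsubscribe when it is absent.
func drainSatisfied(queue []transition, hasSubscribedTrack bool) []transition {
	for len(queue) > 0 && queue[0].isSubscribe == hasSubscribedTrack {
		queue = queue[1:]
	}
	return queue
}

func main() {
	// Two unsubscribes were queued for the same track.
	queue := []transition{{isSubscribe: false}, {isSubscribe: false}}
	// Once the subscribed track is gone, both are satisfied and removed.
	queue = drainSatisfied(queue, false)
	fmt.Println(len(queue))
}
```

Popping only one event per state change would leave the second unsubscribe in the queue until it timed out.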
* Remove VP9 from media engine set up.
* Remove VP9 from config sample
* Supervisor beginnings
Eventual goal is to have a reconciler which moves state from
actual -> desired. First step along the way is to observe/monitor.
The first step even in that is an initial implementation to get
feedback on the direction.
This PR is a start in that direction:
- Concept of a supervisor at local participant level
- This supervisor will be responsible for periodically monitoring
actual vs desired (this is the one which will eventually trigger
other things to reconcile, but for now it just logs on error)
- A new interface `OperationMonitor` which requires two methods
o Check() returns an error based on actual vs desired state.
o IsIdle() returns bool. Returns true if the monitor is idle.
- The supervisor maintains a list of monitors and runs a periodic check.
In the above framework, starting with the list of
subscriptions/unsubscriptions. There is a new module
`SubscriptionMonitor` which checks subscription transitions.
A subscription transition is queued on subscribe/unsubscribe.
The transition can be satisfied when a subscribedTrack is added OR
removed. The error condition is a transition that is not satisfied for
10 seconds. Idle is when the transition queue is empty and
subscribedTrack is nil, i.e. the last transition would have been
unsubscribe and the subscribed track was removed (unsubscribe satisfied).
The idea is that individual monitors can check on different things.
Some more things that I am thinking about are:
- PublishedTrackMonitor - started when an add track happens,
satisfied when OnTrack happens, error if `OnTrack` does not
fire for a while and track is not muted, idle when there is
nothing pending.
- PublishedTrackStreamingMonitor - to ensure that a published track
is receiving media at the server (accounting for dynacast, mute, etc)
- SubscribedTrackStreamingMonitor - to ensure down track is sending
data unless muted.
* Remove debug
* Protect against early casting errors
* Adding PublicationMonitor
* Send connection type to telemetry
When connected, determine how the participant's primary connection is
connected and report it in ParticipantActive event.
* address feedback
* fixed case where prflx is reported instead of relay
* incorporate comments