Commit Graph

1019 Commits

Author SHA1 Message Date
Raja Subramanian
b3e148771a Tweaks to reduce supervisor error logs (#1039)
Seeing some supervisor error logs under two conditions:
- Issuing a full reconnect - the client should close this session and
form a new one. So, supervisor errors on the to-be-closed session
are not useful.
- Sometimes it takes a long time for the publisher PC to establish.
If the publish monitor timer starts when a pending track is added,
the timeout fires before ICE/DTLS is established. So, include
a condition to start the timer on the publication monitor only after
the peer connection is connected.
2022-09-27 08:20:06 +05:30
David Zhao
3da908302a Do not warn when notifier isn't configured (#1043)
By default there are no webhook URLs to notify, so a notifier isn't created.
2022-09-26 13:30:27 -07:00
Benjamin Pracht
932af81f34 Update stored version of an active ingress if no ingress server responds (#1031)
This allows deleting and updating an ingress even if the ingress server that was handling it died. It does, however, mean that if the ingress responds again later, its state will be inconsistent. To make this somewhat less likely, also keep trying to contact the ingress for 1 min in the background.

Also fixes a race where an active deleted Ingress would get recreated on delete because of the update triggered by the ingress session shutdown.
2022-09-26 11:16:27 -07:00
Raja Subramanian
dfc71d5bf8 Add a flag to signal need to close underlying media track. (#1038)
With migration in, once the local track is published, the
remote track should be closed. Add a flag to `RemovePublishedTrack`
to control the close behaviour. Invoke `Close` if specified.

Without it, the remote track is not closed if it is waiting to resolve,
i.e. not yet attached. That remote track is left hanging.
2022-09-26 15:32:22 +05:30
David Zhao
5e1912e44c Enable TCP/TURN fallback by default (#1033) 2022-09-22 23:58:07 -07:00
David Colburn
b97d59b8db consolidate room internal (#1030)
* consolidate room internal

* create room internal map

* pipelined room read

* check error

* fix pipelined reads

* clean up after test
2022-09-22 15:59:27 -07:00
David Colburn
20c565ca02 log when auto egress fails (#1029)
* log when auto egress fails

* use room logger
2022-09-22 12:23:25 -07:00
David Colburn
803046b882 Auto egress (#1011)
* auto egress

* fix room service test

* reuse StartTrackEgress

* add timestamp

* update prefixed filename explicitly

* update protocol

* clean up telemetry

* fix telemetry tests

* separate room internal storage

* auto participant egress

* remove custom template url

* fix internal key

* use map for stats workers

* remove sync.Map

* remove participant composite
2022-09-21 12:04:19 -07:00
cnderrauber
48588d7c3d code clean & fix h264 test fail (#1028) 2022-09-21 16:59:18 +08:00
Guru Govindan
fc9d76c7a7 we need to consider SPS presence as keyframe indicator for simple NALU (#1016) 2022-09-21 16:03:36 +08:00
David Zhao
feb47812e7 Allow CORS responses to be cached to allow faster initial connection (#1027) 2022-09-20 23:56:24 -07:00
maxb
eabecb99ac Don't automatically add STUN servers if nodeIP set (#1023)
It's my understanding that the nodeIP config can be set to ensure that a
specific IP is provided for the host candidate. The code being changed
here was added as a convenience so that:
| By giving it STUN servers, it should be
| connectable even without passing in --node-ip explicitly

We'd prefer to be able to specify a nodeIP without having a STUN
server added as a side effect.
2022-09-20 23:43:07 -07:00
David Zhao
99cb021e85 Increase max wait for media node to respond. (#1026)
In extreme cases, media nodes could take more than 5s to spin up the
session. Increasing this timeout to 10s reduces the number of disconnections
due to edge cases.
2022-09-20 23:38:58 -07:00
Raja Subramanian
a523b929a9 Adding UT to cover case found with padding only packets (#1024)
Confirmed that the test fails before the fix and passes with it.
2022-09-21 09:40:26 +05:30
Raja Subramanian
1e20786521 Store pure padding packets also in SnOffsets cache. (#1020) 2022-09-20 19:39:36 +05:30
Raja Subramanian
26e6024137 Log NACK information in stream allocator. (#1018) 2022-09-19 14:33:16 +05:30
Raja Subramanian
924be2fbb7 Supervisor tweaks (#1017) 2022-09-19 08:27:51 +05:30
cnderrauber
7ad123e4a6 fix ssrc messed up when generating trackinfo (#1014) 2022-09-16 15:59:44 +05:30
cnderrauber
a421ab8d2b enable stereo opus (#1013) 2022-09-16 12:36:02 +08:00
Raja Subramanian
6234e4e725 Process as many events as possible in update. (#1010)
With removal of subscription, unsubscribe could happen twice:
once when the subscriber actually unsubscribes and once when the
publisher removes all subscribers. The second unsubscribe was never
removed from the event list and eventually timed out.
So, process events as long as the condition is satisfied.
2022-09-15 22:53:44 +05:30
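A minimal Go sketch of the behaviour described in 6234e4e725, under stated assumptions: the `event` type, the `satisfied` rule, and both function names are hypothetical and are not taken from the LiveKit code. It only contrasts consuming one event per update with consuming events for as long as the head of the list is satisfied.

```go
package main

import "fmt"

// event is a hypothetical pending operation waiting to be satisfied.
type event struct{ kind string }

// satisfied reports whether the current state fulfils the event.
// The rule used here is an assumption made for illustration only.
func satisfied(e event, subscribed bool) bool {
	switch e.kind {
	case "subscribe":
		return subscribed
	case "unsubscribe":
		return !subscribed
	}
	return false
}

// processOne consumes at most one satisfied event per update, so a
// duplicate unsubscribe stays in the list until it times out.
func processOne(queue []event, subscribed bool) []event {
	if len(queue) > 0 && satisfied(queue[0], subscribed) {
		return queue[1:]
	}
	return queue
}

// processAll keeps consuming events while the head of the list is satisfied.
func processAll(queue []event, subscribed bool) []event {
	for len(queue) > 0 && satisfied(queue[0], subscribed) {
		queue = queue[1:]
	}
	return queue
}

func main() {
	// Two unsubscribes: one from the subscriber, one from the publisher
	// removing all subscribers.
	queue := []event{{kind: "unsubscribe"}, {kind: "unsubscribe"}}
	fmt.Println("left after processOne:", len(processOne(queue, false)))
	fmt.Println("left after processAll:", len(processAll(queue, false)))
}
```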
Raja Subramanian
60297b24fd Use UnixMilli (#1009)
* Use UnixMilli

* enhance comment
2022-09-15 19:17:24 +05:30
Raja Subramanian
c03003becf Logging some connection quality stuff to get some data. (#1008)
* Logging some connection quality stuff to get some data.

Setting it at 4.5 as normalised scores are higher.

* log average score
2022-09-15 17:16:59 +05:30
Raja Subramanian
33f782a99b Use PostEvent to avoid casting to concrete type (#1006) 2022-09-15 12:22:13 +05:30
Raja Subramanian
07c43e0972 Supervisor beginnings (#1005)
* Remove VP9 from media engine set up.

* Remove vp9 from config sample

* Supervisor beginnings

Eventual goal is to have a reconciler which moves state from
actual -> desired. First step along the way is to observe/monitor.
The first step even in that is an initial implementation to get
feedback on the direction.

This PR is a start in that direction
- Concept of a supervisor at local participant level
- This supervisor will be responsible for periodically monitoring
  actual vs desired (this is the one which will eventually trigger
  other things to reconcile, but for now it just logs on error)
- A new interface `OperationMonitor` which requires two methods
  o Check() returns an error based on actual vs desired state.
  o IsIdle() returns bool. Returns true if the monitor is idle.
- The supervisor maintains a list of monitors and does periodic check.

In the above framework, starting with list of
subscriptions/unsubscriptions. There is a new module
`SubscriptionMonitor` which checks subscription transitions.
A subscription transition is queued on subscribe/unsubscribe.
The transition can be satisfied when a subscribedTrack is added OR
removed. Error condition is when a transition is not satisfied for
10 seconds. Idle is when the transition queue is empty and
subscribedTrack is nil, i.e. the last transition would have been
unsubscribe and subscribed track removed (unsubscribe satisfied).

The idea is individual monitors can check on different things.
Some more things that I am thinking about are
- PublishedTrackMonitor - started when an add track happens,
  satisfied when OnTrack happens, error if `OnTrack` does not
  fire for a while and track is not muted, idle when there is
  nothing pending.
- PublishedTrackStreamingMonitor - to ensure that a published track
  is receiving media at the server (accounting for dynacast, mute, etc)
- SubscribedTrackStreamingMonitor - to ensure down track is sending
  data unless muted.

* Remove debug

* Protect against early casting errors

* Adding PublicationMonitor
2022-09-15 11:16:37 +05:30
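A rough Go sketch of the supervisor idea from 07c43e0972. Only the `OperationMonitor` interface with `Check()` and `IsIdle()`, the 10 second limit, and the idle rule come from the commit message. The `subscriptionMonitor`, `supervisor`, and `runDemo` names, the injected clock, and the choice to drop idle monitors are assumptions made for illustration, not LiveKit's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// OperationMonitor has the two methods named in the commit message.
type OperationMonitor interface {
	Check() error // error based on actual vs desired state
	IsIdle() bool // true if the monitor has nothing left to watch
}

// transition is a hypothetical record of one requested subscribe/unsubscribe.
type transition struct {
	subscribe bool
	queuedAt  time.Time
}

// subscriptionMonitor is an illustrative stand-in for SubscriptionMonitor.
type subscriptionMonitor struct {
	timeout    time.Duration
	now        func() time.Time // injected clock keeps the example deterministic
	pending    []transition
	subscribed bool // stands in for "subscribedTrack is not nil"
}

func (m *subscriptionMonitor) queue(subscribe bool) {
	m.pending = append(m.pending, transition{subscribe: subscribe, queuedAt: m.now()})
	m.reconcile()
}

func (m *subscriptionMonitor) setSubscribed(subscribed bool) {
	m.subscribed = subscribed
	m.reconcile()
}

// reconcile drops the oldest transition once the current state satisfies it.
func (m *subscriptionMonitor) reconcile() {
	if len(m.pending) > 0 && m.pending[0].subscribe == m.subscribed {
		m.pending = m.pending[1:]
	}
}

// Check fails when the oldest transition has waited longer than the timeout.
func (m *subscriptionMonitor) Check() error {
	if len(m.pending) == 0 {
		return nil
	}
	if waited := m.now().Sub(m.pending[0].queuedAt); waited > m.timeout {
		return fmt.Errorf("transition (subscribe=%t) not satisfied after %s",
			m.pending[0].subscribe, waited)
	}
	return nil
}

// IsIdle is true when the queue is empty and nothing is subscribed.
func (m *subscriptionMonitor) IsIdle() bool {
	return len(m.pending) == 0 && !m.subscribed
}

// supervisor keeps a list of monitors and checks them periodically.
type supervisor struct{ monitors []OperationMonitor }

// checkAll runs one round of checks. Dropping idle monitors is an assumption.
func (s *supervisor) checkAll() []error {
	var errs []error
	active := s.monitors[:0]
	for _, mon := range s.monitors {
		if err := mon.Check(); err != nil {
			errs = append(errs, err)
		}
		if !mon.IsIdle() {
			active = append(active, mon)
		}
	}
	s.monitors = active
	return errs
}

// runDemo returns the error counts before and after the track is added.
func runDemo() (before, after int) {
	clock := time.Unix(0, 0)
	mon := &subscriptionMonitor{timeout: 10 * time.Second, now: func() time.Time { return clock }}
	sup := &supervisor{monitors: []OperationMonitor{mon}}

	mon.queue(true)                     // subscribe requested
	clock = clock.Add(11 * time.Second) // no subscribedTrack for 11s
	before = len(sup.checkAll())

	mon.setSubscribed(true) // subscribedTrack added, transition satisfied
	after = len(sup.checkAll())
	return before, after
}

func main() {
	before, after := runDemo()
	fmt.Println("errors before track added:", before)
	fmt.Println("errors after track added:", after)
}
```

In a real server the round of checks would presumably be driven by a ticker and the errors logged rather than returned; that wiring is left out here.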
Raja Subramanian
e7cc6bd4a1 Remove VP9 from media engine set up. (#1004)
* Remove VP9 from media engine set up.

* Remove vp9 from config sample
2022-09-14 16:16:14 +05:30
Benjamin Pracht
b0eead22b5 Allow specifying a different RTMP url domain for each ingress (#994) 2022-09-12 14:03:15 -07:00
David Zhao
7e3155dcd6 ForceTCP only for supported clients (#997)
* ForceTCP only for supported clients

Revert back to standard if forceRelay with TLS fails
Don't force TLS unless it's configured

* fix lint
2022-09-09 18:14:36 -07:00
cnderrauber
f1915feb1a keep mid unchanged after migration for subscribed track (#995) 2022-09-09 17:39:09 +08:00
Raja Subramanian
93da599059 Return previous max layer. (#993)
Relay up track holds this value too. This is to prevent duplication.
2022-09-08 11:04:26 +05:30
David Zhao
5cfd21c1ef Fix inaccurate participant count due to storing old data (#992) 2022-09-07 18:33:28 -07:00
cnderrauber
441053b7fa add participant id when client reconnects (#988) 2022-09-07 15:56:56 +08:00
Raja Subramanian
d76f7811e9 An attempt to use consistent layer mapping (#986)
* WIP commit

* Consistent layers.

* slight re-arrangement of code

* log mime

* fix tests

* map -> array
2022-09-07 09:57:31 +05:30
Benjamin Pracht
aaeba74402 Import ErrIngressOutOfDate from protocol (#987) 2022-09-06 16:05:17 -07:00
Raja Subramanian
68fcd377fa Add exempt layer when not found AND exempt. (#984) 2022-09-05 22:22:41 +05:30
Raja Subramanian
d8ae453fb9 Handle exempted layers (#983)
* WIP commit

* WIP commit

* Add tests

* remove debug
2022-09-05 18:42:59 +05:30
Raja Subramanian
021ec596b5 Fix check, thank you @cnderrauber (#982) 2022-09-05 10:16:57 +05:30
Raja Subramanian
20bd99903e Close down track before closing subscriber peer connection (#981)
* Close down track before closing subscriber peer connection

* plural
2022-09-03 12:16:40 +05:30
Raja Subramanian
d13c4be923 Close subscriber PC after a wait to aid in migration. (#979)
* Close subscriber PC after a wait to aid in migration.

* mage generate
2022-09-03 01:16:51 +05:30
Raja Subramanian
c75f38bce6 Protect against looking up dimensions for invalid spatial layer (#977)
Also use loss based scoring when track dimensions are not available.
2022-09-03 00:59:47 +05:30
David Zhao
da2525e973 for some reason this wasn't generated before being committed. (#974) 2022-08-31 21:35:51 -07:00
Benjamin Pracht
d8edb9b2e7 Adopt Ingress RPC interface changes (#972) 2022-08-31 14:14:40 -07:00
cnderrauber
c401ca58af turn packet and bytes stats used for telemetry and load control (#969)
* stats for turn

* add connections stats

* stats for standalone turn server only

* wire update
2022-08-31 11:00:27 +08:00
Raja Subramanian
df189984f3 Add resync on next packet to buffer.Bucket (#968) 2022-08-30 12:58:10 +05:30
David Zhao
69bf31944e Send connection type to telemetry (#964)
* Send connection type to telemetry

When connected, determine how the participant's primary connection is
connected and report it in ParticipantActive event.

* address feedback

* fixed case where prflx is reported instead of relay

* incorporate comments
2022-08-29 23:17:13 -07:00
Raja Subramanian
032c3a1603 Fix track info available for down tracks. (#967)
Should have been here in the first place. Brain damage :-(
2022-08-30 09:35:05 +05:30
Raja Subramanian
9b0539eb43 Need this for clean up during migration (#965) 2022-08-29 13:19:58 +05:30
Raja Subramanian
4217f198d6 Have to go through the full ICE restart checks after gathering finishes. (#963) 2022-08-29 09:45:38 +05:30
David Zhao
aa4f713d1e Document tcp fallback (#961)
* Updated docs around TCP fallback

* changed allowFallback to a pointer
2022-08-27 14:59:01 -07:00
Mathew Kamkar
767d660809 Use LocalNode ID in Prometheus metrics (#959) 2022-08-25 22:16:20 -07:00
David Zhao
747089a005 Additional closure reasons (#958) 2022-08-25 19:36:47 -07:00