Commit Graph

614 Commits

Author SHA1 Message Date
David Colburn b97d59b8db consolidate room internal (#1030)
* consolidate room internal

* create room internal map

* pipelined room read

* check error

* fix pipelined reads

* clean up after test
2022-09-22 15:59:27 -07:00
David Colburn 20c565ca02 log when auto egress fails (#1029)
* log when auto egress fails

* use room logger
2022-09-22 12:23:25 -07:00
David Colburn 803046b882 Auto egress (#1011)
* auto egress

* fix room service test

* reuse StartTrackEgress

* add timestamp

* update prefixed filename explicitly

* update protocol

* clean up telemetry

* fix telemetry tests

* separate room internal storage

* auto participant egress

* remove custom template url

* fix internal key

* use map for stats workers

* remove sync.Map

* remove participant composite
2022-09-21 12:04:19 -07:00
maxb eabecb99ac Don't automatically add STUN servers if nodeIP set (#1023)
It's my understanding that the nodeIP config can be set to ensure that a
specific IP is provided for the host candidate. The code being changed
here was added as a convenience so that:
| By giving it STUN servers, it should be
| connectable even without passing in --node-ip explicitly

We'd prefer to be able to specify a nodeIP without having a STUN server
added as a side effect.
2022-09-20 23:43:07 -07:00
Raja Subramanian 924be2fbb7 Supervisor tweaks (#1017) 2022-09-19 08:27:51 +05:30
cnderrauber 7ad123e4a6 fix ssrc messed up when generating trackinfo (#1014) 2022-09-16 15:59:44 +05:30
cnderrauber a421ab8d2b enable stereo opus (#1013) 2022-09-16 12:36:02 +08:00
Raja Subramanian 6234e4e725 Process as many events as possible in update. (#1010)
With removal of a subscription, unsubscribe could happen twice:
once when the subscriber actually unsubscribes and again when the
publisher removes all subscribers. The second unsubscribe was never
removed from the event list and eventually timed out.
So, process events as long as the condition is satisfied.
2022-09-15 22:53:44 +05:30
Raja Subramanian c03003becf Logging some connection quality stuff to get some data. (#1008)
* Logging some connection quality stuff to get some data.

Setting it at 4.5 as normalised scores are higher.

* log average score
2022-09-15 17:16:59 +05:30
Raja Subramanian 33f782a99b Use PostEvent to avoid casting to concrete type (#1006) 2022-09-15 12:22:13 +05:30
Raja Subramanian 07c43e0972 Supervisor beginnings (#1005)
* Remove VP9 from media engine set up.

* Remove vp9 from config sample

* Supervisor beginnings

Eventual goal is to have a reconciler which moves state from
actual -> desired. First step along the way is to observe/monitor.
The first step even in that is an initial implementation to get
feedback on the direction.

This PR is a start in that direction
- Concept of a supervisor at local participant level
- This supervisor will be responsible for periodically monitoring
  actual vs desired (this is the one which will eventually trigger
  other things to reconcile, but for now it just logs on error)
- A new interface `OperationMonitor` which requires two methods
  o Check() returns an error based on actual vs desired state.
  o IsIdle() returns bool. Returns true if the monitor is idle.
- The supervisor maintains a list of monitors and does periodic check.
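The shape described in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from the PR: only the `OperationMonitor` interface with `Check()` and `IsIdle()` comes from the commit message. `Supervisor`, `Add`, `CheckOnce`, `staticMonitor`, and `errNotSatisfied` are hypothetical names, and dropping idle monitors after a pass is an assumption about what `IsIdle()` is for.

```go
package main

import (
	"errors"
	"sync"
)

// errNotSatisfied is a hypothetical error used only for illustration.
var errNotSatisfied = errors.New("desired state not reached")

// OperationMonitor has the two methods named in the commit message.
type OperationMonitor interface {
	Check() error // non-nil when actual state diverges from desired state
	IsIdle() bool // true when the monitor has nothing pending
}

// Supervisor is a hypothetical holder for a list of monitors.
type Supervisor struct {
	mu       sync.Mutex
	monitors []OperationMonitor
}

// Add registers a monitor with the supervisor.
func (s *Supervisor) Add(m OperationMonitor) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.monitors = append(s.monitors, m)
}

// CheckOnce runs one pass over all monitors and returns the errors found.
// Assumption: idle monitors are dropped after the pass.
func (s *Supervisor) CheckOnce() []error {
	s.mu.Lock()
	defer s.mu.Unlock()
	var errs []error
	kept := s.monitors[:0]
	for _, m := range s.monitors {
		if err := m.Check(); err != nil {
			errs = append(errs, err)
		}
		if !m.IsIdle() {
			kept = append(kept, m)
		}
	}
	s.monitors = kept
	return errs
}

// staticMonitor is a stand-in monitor with fixed answers.
type staticMonitor struct {
	err  error
	idle bool
}

func (m staticMonitor) Check() error { return m.err }
func (m staticMonitor) IsIdle() bool { return m.idle }
```

A periodic check would call `CheckOnce` from a ticker loop and, per the description above, just log any returned errors for now.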

In the above framework, we start with the list of
subscriptions/unsubscriptions. There is a new module
`SubscriptionMonitor` which checks subscription transitions.
A subscription transition is queued on subscribe/unsubscribe.
The transition can be satisfied when a subscribedTrack is added OR
removed. Error condition is when a transition is not satisfied for
10 seconds. Idle is when the transition queue is empty and
subscribedTrack is nil, i. e. the last transition would have been
unsubscribe and subscribed track removed (unsubscribe satisfied).

The idea is individual monitors can check on different things.
Some more things that I am thinking about are
- PublishedTrackMonitor - started when an add track happens,
  satisfied when OnTrack happens, error if `OnTrack` does not
  fire for a while and track is not muted, idle when there is
  nothing pending.
- PublishedTrackStreamingMonitor - to ensure that a published track
  is receiving media at the server (accounting for dynacast, mute, etc)
- SubscribedTrackStreamingMonitor - to ensure down track is sending
  data unless muted.

* Remove debug

* Protect against early casting errors

* Adding PublicationMonitor
2022-09-15 11:16:37 +05:30
Raja Subramanian e7cc6bd4a1 Remove VP9 from media engine set up. (#1004)
* Remove VP9 from media engine set up.

* Remove vp9 from config sample
2022-09-14 16:16:14 +05:30
David Zhao 7e3155dcd6 ForceTCP only for supported clients (#997)
* ForceTCP only for supported clients

Revert back to standard if forceRelay with TLS fails
Don't force TLS unless it's configured

* fix lint
2022-09-09 18:14:36 -07:00
cnderrauber f1915feb1a keep mid unchanged after migration for subscribed track (#995) 2022-09-09 17:39:09 +08:00
Raja Subramanian d76f7811e9 An attempt to use consistent layer mapping (#986)
* WIP commit

* Consistent layers.

* slight re-arrangement of code

* log mime

* fix tests

* map -> array
2022-09-07 09:57:31 +05:30
Raja Subramanian 021ec596b5 Fix check, thank you @cnderrauber (#982) 2022-09-05 10:16:57 +05:30
Raja Subramanian 20bd99903e Close down track before closing subscriber peer connection (#981)
* Close down track before closing subscriber peer connection

* plural
2022-09-03 12:16:40 +05:30
Raja Subramanian d13c4be923 Close subscriber PC after a wait to aid in migration. (#979)
* Close subscriber PC after a wait to aid in migration.

* mage generate
2022-09-03 01:16:51 +05:30
David Zhao 69bf31944e Send connection type to telemetry (#964)
* Send connection type to telemetry

When connected, determine how the participant's primary connection is
connected and report it in ParticipantActive event.

* address feedback

* fixed case where prflx is reported instead of relay

* incorporate comments
2022-08-29 23:17:13 -07:00
Raja Subramanian 032c3a1603 Fix track info available for down tracks. (#967)
Should have been here in the first place. Brain damage :-(
2022-08-30 09:35:05 +05:30
Raja Subramanian 9b0539eb43 Need this for clean up during migration (#965) 2022-08-29 13:19:58 +05:30
Raja Subramanian 4217f198d6 Have to go through the full ICE restart checks after gathering finishes. (#963) 2022-08-29 09:45:38 +05:30
Mathew Kamkar 767d660809 Use LocalNode ID in Prometheus metrics (#959) 2022-08-25 22:16:20 -07:00
David Zhao 747089a005 Additional closure reasons (#958) 2022-08-25 19:36:47 -07:00
Raja Subramanian 7ad8f87a52 Wait for answer (#957)
Maybe this is what is causing test flakiness in GH CI. Let's see.
2022-08-25 13:41:25 +05:30
Raja Subramanian 781bd74098 443 for TLS (#956)
* Use 443 for TURN TLS

* Explicit disable when TLS is not set
2022-08-25 09:05:44 +05:30
Raja Subramanian 06a46d5de0 Replace Target with params to indicate direction (#955)
* Replace Target with params to indicate direction

* Add missed send answer call
2022-08-25 08:33:06 +05:30
Raja Subramanian 5223c8292e Attempt to fix CI UT (#954)
I don't like using `Target` as direction.
There is one place in code that depends on it.
I am thinking we should add a params `IsOfferer` or something to make it
explicit.
2022-08-24 22:31:18 +05:30
Raja Subramanian 34bab018dc Do not initialize subscription version until explicitly set. (#951)
Initializing with current time means some updates are ignored.
Do not initialize until explicitly set.
2022-08-24 20:50:23 +05:30
cnderrauber 1350400c3a fallback to turn over tls when tcp short connection happens (#950)
* fallback to tls when tcp failed

* go mod

* magefile
2022-08-24 20:42:56 +08:00
Raja Subramanian aaa3a5b46e Transport restructure (#944)
* WIP commit

* WIP commit

* fix copy pasta

* setting PC with previous answer has to happen synchronously

* static check

* WIP commit

* WIP commit

* fixing transport tests

* fix tests and clean up

* minor renaming

* Fix test race

* log event when channel is full
2022-08-24 14:31:45 +05:30
cnderrauber c20a91d2b2 enable red by default (#940)
* enable red by default

* fix test case
2022-08-22 17:40:12 +08:00
cnderrauber a118d21af0 add red codec for opus (#938)
* opus/red codec

* panic

* forward red track to nonred subscriber

* config

* clean code

* solve comments
2022-08-22 12:32:27 +08:00
Raja Subramanian a600dfc9e3 Log and return error on no response sink. (#937)
* Log and return error on no response sink.

* Clean up
2022-08-21 22:36:59 +05:30
Raja Subramanian 70422c0267 Export CloseSignalConnection (#936)
* Export CloseSignalConnection

There are a few places where that close pattern is repeated.
Export it and use that function in other places directly.

* fix test
2022-08-21 11:33:35 +05:30
Raja Subramanian fae3857800 Log errors on sending offer/answer (#933)
* Log errors on sending offer/answer

* minor clean up

* remove unneeded logs

* fix test
2022-08-19 17:54:27 +05:30
Raja Subramanian e4e2e4189b Clear disconnect timer on ICERestart (#932)
* Clear disconnect timer on ICERestart

Disconnect timer is set up when a transport fails.
But, it is possible that the connection is resumed.
So, clear disconnect timer on resume.

* clean up
2022-08-19 16:24:41 +05:30
Raja Subramanian 0cd9c87dc9 Misc clean up (#931)
* Start RTCP workers after peer connection connects

* Move more things into transport module

* Start RTCP workers only on connected

* Test needs PeerConnection() method

* adjust comment
2022-08-19 11:49:12 +05:30
cnderrauber 770076febf fix resume/restart with single node mode (#930)
* fix resume/restart with single node mode

* clean comment
2022-08-18 12:46:18 +08:00
Raja Subramanian 05fcca9a04 Need to support receiver re-add/setup. (#929) 2022-08-18 08:23:49 +05:30
cnderrauber f819dcb63d use protocol/sdp for sdp process (#926) 2022-08-17 16:12:33 +08:00
Raja Subramanian f5627c3859 Prevent track subscriptions/adding receivers after close (#924)
* Prevent track subscriptions/adding receivers after close

With subscribe/unsubscribe queuing, a subscribe may be
attempted after a call to `RemoveAllSubscribers`.
So, renaming `RemoveAllSubscribers` to `InitiateClose`
and maintaining state that track is in the process of closing.
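The idea of keeping a "closing" state can be sketched as below. Only the name `InitiateClose` comes from the commit message; `mediaTrack`, `AddSubscriber`, `errTrackClosing`, and the map of subscribers are hypothetical simplifications.

```go
package main

import (
	"errors"
	"sync"
)

// errTrackClosing is a hypothetical error returned for late subscribes.
var errTrackClosing = errors.New("track is closing")

// mediaTrack is a hypothetical, simplified track.
type mediaTrack struct {
	mu          sync.Mutex
	closing     bool
	subscribers map[string]struct{}
}

func newMediaTrack() *mediaTrack {
	return &mediaTrack{subscribers: make(map[string]struct{})}
}

// AddSubscriber rejects subscriptions that arrive after close has started,
// for example a queued subscribe that runs late.
func (t *mediaTrack) AddSubscriber(id string) error {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.closing {
		return errTrackClosing
	}
	t.subscribers[id] = struct{}{}
	return nil
}

// InitiateClose marks the track as closing and removes all subscribers.
func (t *mediaTrack) InitiateClose() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.closing = true
	for id := range t.subscribers {
		delete(t.subscribers, id)
	}
}
```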

* Mime specific remove

* Remove unused error

* do not add receiver when closing
2022-08-17 13:07:59 +05:30
Raja Subramanian 3f53dea223 Log ICE candidates on peer connection established to get remote also (#922) 2022-08-16 15:31:16 +05:30
Raja Subramanian d9fdcf8c2b Promoting a few logs to Info (#921)
* Promoting a few logs to Info

Also, adding a couple more info logs which I will remove later
after some debugging.

* mime type

* Protect pause/max layer

* notify even if not bound
2022-08-16 13:03:14 +05:30
David Zhao 1d199d1efa Populate network field when set by clients (#919) 2022-08-15 23:28:15 -07:00
Raja Subramanian 4f19866578 TrackInfo may not be available in Bind. (#918) 2022-08-15 21:18:22 +05:30
Raja Subramanian eaaec0aae1 Only change committed quality in update. (#917) 2022-08-15 17:29:52 +05:30
cnderrauber c38d4df52f server side codec preference for publish (#916) 2022-08-15 18:46:24 +08:00
Raja Subramanian 9d22225e92 A few misc changes (#915)
- Do not update jitter on padding only packet.
Padding only packet may not have proper timestamp.
If it does, it probably has the time stamp of the
last packet with payload. That will also affect
jitter calculation, i. e. wall clock time is moving,
but RTP time is the same.
- Do not send `onMaxLayer` changed on bind.
It was probably racing with update when max layer
is updated when adaptive stream is off. There is
no need to send that update as the default would
be OFF. It will be enabled when adaptive stream
subscription turns it on or when max layer is
set when down track bind happens and adaptive stream
is off.
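To illustrate the first point, here is a sketch of the RFC 3550 style interarrival jitter estimate with padding-only packets skipped. It is not the code from this change: `jitterEstimator` and `update` are hypothetical names, arrival time is assumed to be already converted to RTP clock units, and a payload length of zero is assumed to mean padding only.

```go
package main

// jitterEstimator is a hypothetical holder for the running jitter estimate.
type jitterEstimator struct {
	hasPrev     bool
	prevTransit uint32
	jitter      float64 // in RTP clock units
}

// update feeds one packet into the estimate.
// arrival is the arrival time in RTP clock units, rtpTS the RTP timestamp.
func (j *jitterEstimator) update(arrival, rtpTS uint32, payloadLen int) {
	if payloadLen == 0 {
		// Padding-only packet: its timestamp may repeat the last payload
		// packet's timestamp while wall clock moves on, so skip it.
		return
	}
	transit := arrival - rtpTS // wraps like the RFC 3550 sample code
	if j.hasPrev {
		d := int32(transit - j.prevTransit)
		if d < 0 {
			d = -d
		}
		// J += (|D| - J) / 16, as in RFC 3550.
		j.jitter += (float64(d) - j.jitter) / 16
	}
	j.prevTransit = transit
	j.hasPrev = true
}
```

With this shape, a padding packet that reuses the previous RTP timestamp but arrives later does not inflate the estimate.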
2022-08-15 15:57:19 +05:30
Raja Subramanian b5c023f986 Connection quality changes (#913)
* WIP commit

* Connection quality changes

- Fix Firefox showing poor quality
  o The issue was that we were using max available layer and
    calculating quality. The rationale being that even if
    server sends dynacast messages, client may not implement
    dynacast and still stream all layers. But, with Firefox
    (maybe a Firefox bug), it sends some small amount of
    data on layer 2 even when that layer is disabled.
    Guessing it is probing (or actually we might be using
    some small value for high layers as Firefox cannot turn off
    layers). That higher layer gets used in quality calculation.
    As the bit rate on that layer is extremely low, it yields low
    score.

    Fixed by considering the max expected layer. That is of most
    interest. Yes, clients may ignore dynacast and stream all layers,
    but, max expected is the one of interest. So, look for
    quality in the max expected layer and not max available layer.
- Lots of clean up around connection quality stuff
  o Use a dynamic scaling thing to ensure that we do not get bitten
    by absolute values. Calculate best possible scenario score and
    map that to maximum MOS score. This will ensure that different
    codecs, different settings do not mess up the scoring. For example,
    a client might use 1 Mbps for 720p, but a different client could
    use 2 Mbps for 720p. As an SFU/infrastructure middlebox, we do
    not have control over quality at those rates. We can only ensure
    that streaming happens smoothly at those rates. So, in that
    example, for client 1, 1 Mbps will map to MOS 5.0 and for client 2,
    2 Mbps will map to MOS 5.0. Any impairments after that will
    reflect in the score.
  o Penalise for missing target layer by one level for one layer missed.
  o Move tests to connection quality directory. The participant test
    was not super useful.
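The dynamic scaling idea described above can be sketched as a simple normalisation. This is an assumption about the general shape only: `normaliser`, `newNormaliser`, and `normalise` are hypothetical names, the raw score is assumed to come from some separate scoring function, and clamping to the maximum is an added safeguard not stated in the commit message.

```go
package main

// maxMOS is the top of the MOS scale that the best case is mapped to.
const maxMOS = 5.0

// normaliser is a hypothetical holder for the per-stream scaling factor.
type normaliser struct {
	factor float64
}

// newNormaliser derives the factor from the raw score of the best possible
// scenario for this stream (its own bitrate, codec, and settings).
func newNormaliser(bestCaseRaw float64) normaliser {
	if bestCaseRaw <= 0 {
		return normaliser{factor: 1}
	}
	return normaliser{factor: maxMOS / bestCaseRaw}
}

// normalise maps a raw score onto the scale where best case equals maxMOS.
func (n normaliser) normalise(raw float64) float64 {
	s := raw * n.factor
	if s > maxMOS {
		s = maxMOS
	}
	return s
}
```

Under this sketch, two clients whose best-case raw scores differ (for example one sending 720p at 1 Mbps and another at 2 Mbps) both start at 5.0, and only impairments relative to their own best case lower the score.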

* Add missed file

* Remove debug code

* use more constants and initialise normalisation factor

* rtcscore pointer
2022-08-15 13:21:07 +05:30