Commit Graph

126 Commits

Author SHA1 Message Date
Raja Subramanian 56dd399684 Use a worker to report signal/data stats. (#2260)
* Use a worker to report signal/data stats.

Was checking if reporting is needed on every update.
The check is wasted work if volume of signal/data messages is high
as reporting happens only once in 10 seconds.

Changing to a worker based on a timer. And also aligning with
telemetry reporting interval which defaults to 30 seconds.

* Remove unused constant
2023-11-22 11:47:15 +05:30
David Colburn 60374c6402 Agents (#2203)
* agents

* add test

* undo name changes

* remove debug logs

* fixes

* fix data race in test
2023-11-03 11:43:35 -07:00
David Colburn b8ac836b9b Only launch room egress once (#2175)
* only launch room egress once

* regenerate fakes
2023-10-24 13:05:23 -07:00
Raja Subramanian 281d2b674b Changing some log levels (#2094)
Logging expected WS close at Infow to understand reasons for closure.
Moving "read from ws" to Debugw as it happens when signalling closes.

Also filter out a data channel abort chunk log as it shows a bunch of
errors, but those are expected though.
2023-09-20 14:13:48 +05:30
David Zhao d5808f96df Disconnect participant when signal proxy is closed (#2024)
When signal proxy is closed, we'd want to stop writing to it and sever
WS connection with the client. Client would go into a reconnect sequence
2023-08-31 09:39:45 -07:00
David Zhao eed8e85008 Demote more logs to debug (#1998) 2023-08-27 19:17:38 -07:00
Raja Subramanian db3fbb57ae Do not log as error on connection reset by peer (#1923) 2023-08-01 19:04:54 +05:30
David Zhao 981fb7cac7 Adding license notices (#1913)
* Adding license notices

* remove from config
2023-07-27 16:43:19 -07:00
Paul Wells c38791ff0a stop retrying signal connection if the request context is closed (#1820) 2023-06-22 07:09:34 -07:00
Raja Subramanian 2438058474 Drop error logs due to pipe close (#1813) 2023-06-21 14:11:17 +05:30
paulwe 0dab55556d add drain function to rtc service 2023-06-15 11:56:41 -07:00
Raja Subramanian 12db469297 Better tracking of signalling connection. (#1794)
* Better tracking of signalling connection.

- Reason for closing signaling channel.
- ConnectionID attached to request source/response sink

* Tests
2023-06-15 12:53:34 +05:30
Paul Wells 70041f004f create signalStats from out of order join (#1640) 2023-04-20 03:27:41 -07:00
Paul Wells 96f3aaa587 free signal join response to gc after forwarding (#1619) 2023-04-16 17:38:09 -07:00
Raja Subramanian ac266fbcd6 Support subscriber_allow_pause connect option (#1612)
* Support subscriber_allow_pause connect option

* optional subscriber_allow_pause field
2023-04-13 17:00:32 +05:30
David Zhao 6abe3b1aee Adding logs when clients reconnect (#1598) 2023-04-10 21:26:16 -07:00
David Zhao e03f75d6a1 Implements source-specific permissions and client-driven metadata updates (#1590)
Closes #1565
2023-04-07 23:47:49 -07:00
David Zhao fc6a306031 Create a helper for retrieving a user's actual IP (#1579) 2023-04-04 19:32:49 -07:00
davidliu f05a3a047a add handling for react native and rust sdk client infos (#1544) 2023-03-25 01:27:37 +09:00
Raja Subramanian 2006359a97 move SDP to Debugw (#1413) 2023-02-12 22:56:43 +05:30
David Zhao 3e08ff1043 version 1.3.4 (#1411) 2023-02-09 23:31:42 -08:00
David Zhao 9a7ea7a2fa Close previous request channels when during initial retry (#1409)
So we don't leave abandoned requests hanging on the media instance
2023-02-09 17:27:33 -08:00
Dan McFaul ad7e075c18 exit after panic (#1392)
* let panics crash

* Revert "let panics crash"

This reverts commit 8027cccadd.

* catch and log panics then os.Exit

* Recover only recovers, caller can exit

* only exit on pacic, still need Recover calls in goroutines
2023-02-09 16:33:22 -07:00
cnderrauber 8b6dab780c Add reconnect reason and signal rtt calculation (#1381)
* Add connect reason and signal rtt calculate

* Update protocol

* solve comment
2023-02-06 11:12:25 +08:00
David Zhao be4764b93b Improve panic recovery to use participant logger. (#1375)
Also made IssueFullReconnect public
2023-02-02 14:55:50 -08:00
cnderrauber 7e5ba6a3b0 Improve connectivity check (#1366)
* Add Timer to detect dtls failure quickly

* Fix pc state check in timeout after ice

* More strict conditions to switch candidate type

* log for signal interuppt

* typo
2023-02-01 20:00:34 +08:00
Raja Subramanian 71eac631a1 Log offer/answer close to WebSocket connection ingress/egress. (#1359) 2023-01-31 17:50:56 +05:30
David Zhao 9a1f4ab18b Allow /rtc/validate to return room not found message (#1344)
It's not always possible for WebSocket clients to obtain status code or
error messages returned during WS Upgrade. Moving autocreation validation
to an explicit interface in the validation step so the /rtc/validate
would be able to return an appropriate message.
2023-01-29 21:41:44 -08:00
David Zhao 2fa46e2df4 Retry initial connection attempt should it fail (#1335)
Sometimes the initial selected node could fail. In that case, we'll give it a few more attempts to locate a media node for the session instead of failing it after the first try.
2023-01-25 22:59:57 -08:00
Dan McFaul 4d6f0cd0f7 Stats collect v2 (#1291)
* initial commit

* add correct label

* clean up

* more cleanup on adding stats

* cleanup

* move things to pub and sub monitors, ensure stats are correctly updated

* fix merge conflict

* Fix panic on MacOS (#1296)

* fixing last feedback

Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
2023-01-11 14:49:50 -07:00
David Zhao 7a1273151f Update to new logging library, using sampling participant logger (#1219) 2022-12-09 00:09:03 -08:00
cnderrauber 3c907ed460 Add stats for data channel and signal (#1198)
* Add stats for data channel and signal

* Solve comment
2022-11-30 14:53:19 +08:00
David Zhao 1019faa0e6 Cleanup pass through logging (#1073)
* added filtering for noisy pion logs
* demoted some logs to debug
* using consistent trackID / participant / publisher / subscriber terminology
* removed ice candidate log lines, deferring to combined log
2022-10-06 23:48:37 -07:00
David Colburn b97d59b8db consolidate room internal (#1030)
* consolidate room internal

* create room internal map

* pipelined room read

* check error

* fix pipelined reads

* clean up after test
2022-09-22 15:59:27 -07:00
David Zhao 99cb021e85 Increase max wait for media node to respond. (#1026)
In extreme cases, media nodes could take more than 5s to spin up the
session. Increasing this timeout to 10s reduces the number of disconnections
due to edge cases.
2022-09-20 23:38:58 -07:00
Raja Subramanian 60297b24fd Use UnixMilli (#1009)
* Use UnixMilli

* enhance comment
2022-09-15 19:17:24 +05:30
Raja Subramanian 07c43e0972 Supervisor beginnings (#1005)
* Remove VP9 from media engine set up.

* Remove vp9 from config sample

* Supervisor beginnings

Eventual goal is to have a reconciler which moves state from
actual -> desired. First step along the way is to observe/monitor.
The first step even in that is an initial implementation to get
feedback on the direction.

This PR is a start in that direction
- Concept of a supervisor at local participant level
- This supervisor will be responsible for periodically monitor
  actual vs desired (this is the one which will eventually trigger
  other things to reconcile, but for now it just logs on error)
- A new interface `OperationMonitor` which requires two methods
  o Check() returns an error based on actual vs desired state.
  o IsIdle() returns bool. Returns true if the monitor is idle.
- The supervisor maintains a list of monitors and does periodic check.

In the above framework, starting with list of
subscriptions/unsubscriptions. There is a new module
`SubscriptionMonitor` which checks subscription transitions.
A subscription transition is queued on subscribe/unsubscribe.
The transition can be satisfied when a subscribedTrack is added OR
removed. Error condition is when a transition is not satisfied for
10 seconds. Idle is when the transition queue is empty and
subscribedTrack is nil, i. e. the last transition would have been
unsubscribe and subscribed track removed (unsubscribe satisfied).

The idea is individual monitors can check on different things.
Some more things that I am thinking about are
- PublishedTrackMonitor - started when an add track happens,
  satisfied when OnTrack happens, error if `OnTrack` does not
  fire for a while and track is not muted, idle when there is
  nothing pending.
- PublishedTrackStreamingMonitor - to ensure that a published track
  is receiving media at the server (accounting for dynacast, mute, etc)
- SubscribedTrackStreamingMonitor - to ensure down track is sending
  data unless muted.

* Remove debug

* Protect against early casting errors

* Adding PublicationMonitor
2022-09-15 11:16:37 +05:30
cnderrauber 441053b7fa add participant id when client reconnect (#988) 2022-09-07 15:56:56 +08:00
David Zhao d9059f4f3b Do not accept websocket connection if response not received (#923)
When the instance handling the signal request did not respond to the
initial connection, we will fail the connection attempt instead of
having it hang forever.
2022-08-16 19:57:41 -07:00
David Zhao 1d199d1efa Populate network field when set by clients (#919) 2022-08-15 23:28:15 -07:00
David Zhao ab1ccae0c7 Respond to signal ping / pong (#871)
* Respond to signal ping / pong

* Pass back 1 for pong for now, we don't need the timestamp

* update protocol
2022-08-05 09:24:47 -07:00
David Zhao 53f51c8cb0 Logging cleanup (#843)
* Logging cleanup

Changes log levels to better match significance

* fix lock
2022-07-21 00:39:49 -07:00
Raja Subramanian 52aed86080 Add remote participant context to logger (#815) 2022-07-07 10:44:10 +05:30
David Zhao b7d22c4f34 Fix MessageChannel leaks (#646) 2022-04-22 10:53:20 -07:00
cnderrauber 3f9d6c11bc add log info for client closed websocket (#640) 2022-04-21 12:44:43 +08:00
David Zhao b821a0997d Use common logging init functions (#633)
* Use common logging init functions

* update protocol commit

* fix tests
2022-04-20 00:15:11 -07:00
Raja Subramanian cf627d8bbe Send adaptive stream param in join (#626) 2022-04-19 16:45:35 +05:30
Raja Subramanian 4696503790 Include region in ParticipantInfo (#585) 2022-03-31 14:57:55 +05:30
David Colburn 0b8a180554 Code inspection (#581)
* Code inspection

* fix [4]int64 conversiong
2022-03-30 13:49:53 -07:00
David Colburn 26f7bb498a Identity cannot be empty (#580) 2022-03-30 12:53:32 -07:00