Commit Graph

771 Commits

Author SHA1 Message Date
David Zhao
5ff72a99b9 Report publish & subscribe RTPStats as Telemetry events (#1506) 2023-03-10 10:28:54 -08:00
imcdd
1f4fd6aafe 1. Fix wrong atomic pkg from go1.19 std sync/atomic to go.uber.org/atomic (#1479)
2. Fix CI buildtest config '>=1.18' to '1.18',ensure compatibility with go1.18
2023-03-07 23:27:26 -08:00
cnderrauber
11ae7fdbb6 Don't switch candidate if signal closed when pc failed (#1498)
* Don't switch candidate if signal closed when pc failed

* change comment

* test case
2023-03-08 15:16:40 +08:00
cnderrauber
38deab8991 Send room update while client reconnecting (#1499) 2023-03-07 15:59:40 +08:00
cnderrauber
48cf30ba23 Send disconnected participant update for reconnecting user (#1495)
* Send disconnected participant update for reconnecting user

* clean code
2023-03-07 09:13:15 +08:00
Raja Subramanian
c3b9849328 Return high quality when there are no tracks. (#1493) 2023-03-06 09:08:02 +05:30
Raja Subramanian
9e327b1f3c Connection quality (#1490)
* Make connection quality not too optimistic.

With score normalization, the quality indicator showed good
under conditions which should have normally showed some badness.

So, a few things in this PR
- Do not normalize scores
- Pick the weakest link as the representative score (moving away from
  averaging)
- For down track direction, when reporting delta stats, take the number
  of packets sent actually. If there are holes in the feed (upstream
  packet loss), down tracks should not be penalised for that loss.

State of things in connection quality feature
- Audio uses rtcscore-go (with a change to accommodate RED codec). This
  follows the E-model.
- Camera uses rtcscore-go. No change here. NOTE: THe rtscore here is
  purely based on bits per pixel per frame (bpf). This has the following
  existing issues (no change, these were already there)
  o Does not take packet loss, jitter, rtt into account
  o Expected frame rate is not available. So, measured frame rate is
    used as expected frame rate also. If expected frame rate were available,
    the score could be reduced for lower frame rates.
- Screen share tracks: No change. This uses the very old simple loss
  based thresholding for scoring. As the bit rate varies a lot based on
  content and rtcscore video algorithm used for camera relies on
  bits per pixel per frame, this could produce a very low value
  (large width/height encoded in a small number of bits because of static content)
  and hence a low score. So, the old loss based thresholding is used.

* clean up

* update rtcscore pointer

* fix tests

* log lines reformat

* WIP commit

* WIP commit

* update mute of receiver

* WIP commit

* WIP commit

* start adding tests

* take min score if quality matches

* start adding bytes based scoring

* clean up

* more clean up

* Use Fuse

* log quality drop

* clean up debug log

* - Use number of windows for wait to make things simpler
- track no layer expected case
- always update transition
- always call updateScore
2023-03-05 12:55:04 +05:30
cnderrauber
4277699600 Add option to enable skip tcp ice if tcp rtt is high (#1484)
* Add option to switch tcp ice only if tcp works well

* solve comment

* rename and remove config change
2023-03-01 16:45:39 +08:00
cnderrauber
29fa61068e enable nack if red encoding disabled (#1477) 2023-02-28 12:44:25 +08:00
cnderrauber
c367c36d8f Add config for active red encoding (#1476) 2023-02-28 10:44:47 +08:00
David Zhao
8c43b7b48f Fix unsubscribed speakers stuck as speaking to clients (#1475)
When we unsubscribe from a speaker, SendSpeakerUpdates will drop updates
from that speaker. This has the side effect of dropping the "clearing"
message that we are sending as well.
2023-02-26 23:56:09 -08:00
David Zhao
ade26f7c9e Keep track of pending reconciles to avoid duplicate queueReconcile (#1474) 2023-02-26 23:45:32 -08:00
Raja Subramanian
73399dd565 Encapsulate better. (#1466)
Only `logger` and `trackID` of `trackSubscription` is directly accessed
from outside. They are set at construction time. So, should be fine.
2023-02-24 09:34:03 +05:30
David Zhao
34fcf9e496 Additional case of subscribing to a closed track (#1465)
When the publisher stops publishing, the individual receivers would close
attached DownTracks first before notifying MediaTrackReceiver callbacks.

This means #1454 does not fix the issue entirely since there is still a window
when we can subscribe to a closing track.
2023-02-23 17:07:16 -08:00
cnderrauber
dbb2cdf2b6 switch to tls if tcp ice dose not work well (#1458)
* switch to tls if tcp ice don't work well

* ignore tcp quality too bad
2023-02-23 14:07:50 +08:00
Raja Subramanian
cd0359c898 Reset subscription start timer on permission grant. (#1457)
If not, bind timeout could be reported on permission grant
as it could be using some old timer.
2023-02-23 09:39:33 +05:30
David Zhao
e855620379 Prevent subscribing to track that's closing (#1454)
Due to the order of events in MediaTrackReceiver and friends, SubscribedTrack
will be closed before the track is removed from RoomTrackManager.

Because of this, when a track is unpublished, it's possible to be subscribed
to the track as it's closing.

By introducing a closing state, we'd prevent accidental subscription to
closing tracks.
2023-02-22 01:14:49 -08:00
cnderrauber
1b3d6fad54 update to utils parallel execute (#1450)
* log change

* remove identity field

* update to protocol parallel execute

* update protocol
2023-02-22 11:09:32 +08:00
Raja Subramanian
6b55742564 Use available layers in optimal allocation. (#1445)
Addressing edge case where a layer stopped before bitrate could be
measured. Purely bit rate based change deduction missed this as
the before and after did not have bit rates.

Use available layers to look for changes, especially currently
forwarding layer going away.

Also, simplifying bits. Only in the optimal allocation path,
these things are required. When congested, bitrate is always needed.
So, for optimal path, just look at available layer changes and adjust.

Don't need to look for bitrate based layer changes. Clean up that code.
2023-02-19 11:41:40 +05:30
Paul Wells
b35d64ae86 finish timed version migration (#1443)
* finish timed version migration

* update protocol dep
2023-02-18 12:08:08 -08:00
Raja Subramanian
bdc515774e Declare migration complete only after publish callback finishes. (#1442)
The following sequence caused early migration complete declaration
1. Audio track received
2. Audio track published callback in progress
3. Video track received, this clears the pending track
4. Audio track published callback finishes. This checks for pending
   tracks. As nothing is pending migration complete declared.
5. Due to the above, the remote video track is closed as not resuming.
   That causes an unsubscription.

Fix
- Wait till publish callback to finish to remove a track from pending
  fully.
- Introducing a new map as pending tracks is used in OnClose too. So,
  did not want to delay removing from it as a close could happen while
  publish callback is happening.

Also, moving the publish callback to a go routine (just like the recent
change for running those in a go routine for migrated muted tracks)
2023-02-18 12:08:43 +05:30
David Zhao
7a2d9b3d61 Ensure subscription logging is clear & without sampling (#1440) 2023-02-17 22:15:19 -08:00
Raja Subramanian
9f33ce0ecd Declare migration complete after track publish callback. (#1436) 2023-02-17 19:34:31 +05:30
Raja Subramanian
9f94fc8347 Callback support for migrate state change. (#1435)
This can be used to detect changes in migrate state and signal
migration completion to remote nodes.
2023-02-17 13:13:01 +05:30
cnderrauber
6d16f061de Suppress negotiation timout log if signal disconnect (#1433)
* Suppress negotiation timout log if signal disconnect

* solve comment
2023-02-17 15:40:35 +08:00
David Zhao
c16eb66925 Fix race condition with unsubscribing from a republished track (#1429) 2023-02-16 15:11:23 -08:00
Raja Subramanian
6cb46107c8 Delete signal de-duper. (#1427)
Not a good design. There is not an easy way to filter messages
before it hits media node. Without that, there is not a lot
of advantage.

And there are sequences that are not handled correctly in this
deleted implementation.

So, deleting code to prevent use.
2023-02-16 09:32:48 +05:30
David Colburn
10c53e0ebb Move psrpc to protocol (#1426)
* move psrpc to protocol

* update checks

* update protocol

* update protocol ref

* blank line
2023-02-15 16:47:38 -08:00
Dan McFaul
1848a21eda add configurable environment value (#1421)
* add configurable prometheus env label

* Update pkg/config/config.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update cmd/server/main.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update config-sample.yaml

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* set config.Environment value to dev when in dev mode

* be more precise for config-sample

---------

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>
2023-02-15 14:41:44 -07:00
cnderrauber
4367e93855 parallel writing for data packet broadcast (#1425) 2023-02-15 17:18:43 +08:00
David Zhao
4f6fda586c Do not unsubscribe from track if it's been republished (#1424)
thank you @boks1971 for the debugging
2023-02-14 23:12:08 -08:00
Raja Subramanian
3309c6fff0 Log feed SSRC (#1419) 2023-02-14 18:05:20 +05:30
Raja Subramanian
7857af4a66 Send unpublished callback unconditionally. (#1418)
Even if an add track has been queued and can be used immediately
when the previous incarnation unpublishes, send the unpublished
callback as the track was technically unpublished and republished.
The re-publish will pick the same track SID when the pending track
is queued as it will get the SID from an existing published track.
2023-02-14 13:17:54 +05:30
cnderrauber
e6eabb3a03 Prevent force tcp when client left between ice and dtls (#1414) 2023-02-13 13:57:50 +08:00
Raja Subramanian
481b4abbe5 Adding a debug log for track unpublish (#1412) 2023-02-11 09:38:58 +05:30
Trey Hakanson
ce07914e44 Allow for strict ACKs to be disabled or subscriber peer connections (#1410) 2023-02-10 22:51:03 +05:30
Dan McFaul
ad7e075c18 exit after panic (#1392)
* let panics crash

* Revert "let panics crash"

This reverts commit 8027cccadd.

* catch and log panics then os.Exit

* Recover only recovers, caller can exit

* only exit on pacic, still need Recover calls in goroutines
2023-02-09 16:33:22 -07:00
cnderrauber
a04e7ce647 Fix panic by CreateSenderReport before bind completed (#1397) 2023-02-08 17:02:19 +08:00
Raja Subramanian
4d8def2eb7 Ignore errors on closed connection. (#1396) 2023-02-08 13:41:04 +05:30
cnderrauber
5161dba873 Filter mdns candidate if not mdns not enabled (#1393)
* Filter mdns candidate if not mdns not enabled

* log target
2023-02-08 10:27:24 +08:00
Raja Subramanian
9d32c6065d Include track priority in de-duper checks (#1388) 2023-02-07 11:57:29 +05:30
David Zhao
2851a8ac98 Improved robustness of subscription stack (#1382)
UpdateSubscription had a shortcoming where when it couldn't find the
participant, it ignored the request.

This PR further removes the reliance of current publisher state from
subscribers.
- SubscribeToTrack only takes in a trackID
- Introduced RoomTrackManager to maintain all published tracks to a room
- Added TrackUnpublished event to clearly indicate when a track has been removed
- SubscribeRequested event no longer include information about the publisher
2023-02-06 18:08:26 -08:00
Raja Subramanian
aaba402ada Log messages processed by signal de-duper. (#1384)
This will get noisy. Will revert once we have some data to analyse.
2023-02-06 10:55:42 +05:30
cnderrauber
8b6dab780c Add reconnect reason and signal rtt calculation (#1381)
* Add connect reason and signal rtt calculate

* Update protocol

* solve comment
2023-02-06 11:12:25 +08:00
Raja Subramanian
d67cdb6141 Return early if already subscribed. (#1377)
* Return early if already subscribed.

When already subscribed, returned `subTrack` is nil.
Return early, but do not return an error.

* check for nil subTrack

* check for nil as well
2023-02-04 13:47:35 +05:30
David Zhao
add9962655 Avoid triggering subscription failed handler unnecessarily. (#1379)
Certain errors are not at fault of the subscriber. For these errors
the reconciler should keep trying instead of giving up.
2023-02-03 01:06:04 -08:00
David Zhao
be4764b93b Improve panic recovery to use participant logger. (#1375)
Also made IssueFullReconnect public
2023-02-02 14:55:50 -08:00
Raja Subramanian
501fb0860e correct test name, only latest version is used (#1373) 2023-02-03 03:50:45 +05:30
Raja Subramanian
037ae572d9 Ensure older participant session update does not go out after a newer (#1372)
* Ensure older participant session update does not go out after a newer
session has joined.

* fix tests

* change comment

* do not send older version
2023-02-03 00:30:11 +05:30
David Zhao
8c513a1fc5 Improve SubscriptionManagerTest resilience (#1370)
This check fails on GH Actions occasionally due to timing differences.
2023-02-01 22:55:57 -08:00