Commit Graph

235 Commits

Author SHA1 Message Date
Raja Subramanian
c8b7d486b9 Do not synthesise DISCONNECT on session change. (#2412)
* Do not synthesise DISCONNECT on session change.

v12 clients can handle session change based on identity.

* change for testf

* Squelch participant update if close reason is DUPLICATE_IDENTITY.

* fix test

* comment

* Clean up participant close reason a bit

* fix test

* test
2024-01-31 11:36:50 +05:30
Raja Subramanian
d53f167b31 LeaveRequest changes. (#2426)
Reworking this a bit
1. Send leave whenever the signal channel is closed to induce a resume.
2. Use a getter to get regions rather than setting.
2024-01-29 13:04:18 +05:30
Raja Subramanian
bcf9fe3f0f Use a participant worker queue in room. (#2420)
* Use a participant worker queue in room.

Removes selectively needing to call things in goroutine from
participant.

Also, a bit of drive-by clean up.

* spelling

* prevent race

* don't need to remove in goroutine as it is already running in the worker

* worker will get cleaned up in state change callback

* create participant worker only if not created already

* ref count participant worker

* maintain participant list

* clean up oldState
2024-01-28 22:10:35 +05:30
Raja Subramanian
d3da94c45e Augment LeaveRequest with alternate regions to connect. (#2408)
* Augment LeaveRequest with alternate regions to connect.

* update protocol and issue resume action on close if expected to resume

* use current protocol in tests

* address feedback
2024-01-25 22:22:46 +05:30
Raja Subramanian
899067ba0f Simulation scenarios to disable signal channel on resume (#2389)
* Add a simulation scenario to disconnect signal channel on resume

- Requesting that scenario add that participant to a map with a timeout
  of 5 seconds.
- If a resume (reconnect = 1) happens before the timeout, the signalling
  channel is closed immediately on resume.
- There is a clean up worker which will remove entries from the map when
  they timout.
- The participant is also removed from the map if the disconnect on
  resume is invoked once.

* simulate disconnect signal on resume no messages

* comment

* comment

* Close all retries

* update deps

* abort resume only if simulation applied

* Revert SIP change
2024-01-17 20:44:05 +05:30
Paul Wells
a10830a995 add protocol version 12 helper (#2358) 2024-01-03 00:06:43 -08:00
Raja Subramanian
2f1a2ff39d Protect against stats getting reset. (#2351)
A reset would make `after` look like it is `before` and the diff will be
large unsigned numbers.
2023-12-28 01:16:59 +05:30
Raja Subramanian
5eea679589 Include packets out-of-order in TrafficStats. (#2350)
PacketsLost may not provide useful if repairs are discounting the loss.
So, out-of-order packets are an indication of loss and maybe subsequent
repair. Note that out-of-order could be just out-of-order by a short
amount of time, but a lot of that happening is not good either.
So, out-of-order could provide a decent view of link quality.
2023-12-27 23:15:24 +05:30
Raja Subramanian
15500d8a18 Add padding and lost packets to traffic stats (#2349)
* Add padding and lost packets to traffic stats

* aggregate padding packets
2023-12-27 17:40:59 +05:30
Raja Subramanian
faff67162b Consolidate TrackInfo. (#2331)
* Consolidate TrackInfo.

TrackInfo was spread across a bit. Consolidating it.

* TODO comments

* test

* update TrackInfo on SSRC change

* further consolidation

* log mimes only

* update receivers on SSRC set

* clone proto on return

* feedback: break loop on mime match

* prevent data race
2023-12-21 09:56:54 +05:30
Raja Subramanian
d0c36aa6cc Make UpdateTrackInfo an interface. (#2327) 2023-12-20 09:58:08 +05:30
Raja Subramanian
37539fdf76 Add Version to TrackInfo. (#2324)
* Add Version to TrackInfo.

Set when a track is published.

* update protocol
2023-12-19 11:50:48 +05:30
Raja Subramanian
0478af449f Do not error on end-of-candidates candidate (#2314) 2023-12-14 15:22:57 +05:30
Raja Subramanian
83efa9258e Bump up protocol for connection quality LOST. (#2297)
Also log trackID/trackInfo in layer mapping.
2023-12-06 16:59:05 +05:30
David Zhao
02c28a5946 Fix Selected attribute not being copied (#2289) 2023-12-03 22:05:23 -08:00
David Zhao
37e1864df8 Expose detailed connection info with ICEConnectionDetails (#2287)
* Expose detailed connection info with ICEConnectionDetails

* clone to avoid data race

* lower transport

* simplify

* address feedback
2023-12-03 10:03:41 -08:00
Raja Subramanian
53542b09a0 Participant traffic load. (#2262)
* Participant traffic load.

Capturing information about participant traffic
- Upstream/Downstream
- Audio/Video/Data
- Packets/Bytes

This captures a notion of how much traffic load a participant is
generating.

Can be used to make allocation decisions.

* Clean up

* SIP patches

* reporter goroutine

* unlock

* move traffic stats from protocol

* check type
2023-11-26 23:05:00 +05:30
David Colburn
57643a42ed Agents enabled check (#2227)
* agents enabled check

* participant -> publisher

* nil check client

* add NumConnections

* add lock around agent check

* do not launch agents against other agents

* regen

* don't need atomic anymore

* update protocol
2023-11-07 19:19:07 -08:00
cnderrauber
302facc60d Reject migration if codec mismatch with published tracks (#2225)
* Reject migrated/published track mismatch codec with track info

* Check potential codecs

* Issue full connect if mismatch

* fix codec finding
2023-11-06 21:11:39 +08:00
Paul Wells
325e5ca753 add psrpc room service (#2171)
* add psrpc room service

* update deps

* disable by default

* feedback

* config

* test
2023-10-22 22:49:38 -07:00
cnderrauber
92a355e1f3 Add SyncStreams flag to Room (#2110)
* Add SyncStreams flag to Room

* Increase protocol version

* Revert version change

* Move flags to internal & solve comment
2023-09-28 15:41:44 +08:00
Raja Subramanian
019ad88b08 Do not force reconnect on resume if there is a pending track (#2081)
* Do not force reconnect on resume if there is a pending track

* move GetPendingTrack -> LocalParticipant
2023-09-17 14:00:09 +05:30
Raja Subramanian
6509bb0325 Add option to issue full reconnect on data channel error. (#2026)
* Add option to issue full reconnect on data channel error.

There are situations where send data packet fails because of "stream
closed". It is unclear when that happens. Seems to be after an
ICERestart after ICE failed and connection type switching to TURN
from ICE.

Once the failure happens, it is not recoverable. Potentially, it is
recoverable, but unclear where the problem lies. Attempts to reproduce
looking at the pattern of failures has been unsuccesful.

In the mean time, adding an option to issue full reconnect
when send data packet fails.

* typo
2023-09-01 17:59:25 +05:30
Raja Subramanian
8c99a9e307 Move GetAudioLevel interface. (#1992)
To allow use with RemoteParticipant/RemoteMediaTrack too.
2023-08-24 13:25:49 +05:30
cnderrauber
f7a1776f4c Add control of playout delay (#1838)
* Add control of playout delay

Add config to enable playout delay. The delay will be limited by
[min,max] in the config option and calculated by upstream & downstream
RTT.

* check protocol version to enable playout delay

* Move config to room, limit playout-delay update interval, solve comments

* Remove adaptive playout-delay

* Remove unused config
2023-08-02 16:12:23 +08:00
David Zhao
981fb7cac7 Adding license notices (#1913)
* Adding license notices

* remove from config
2023-07-27 16:43:19 -07:00
Raja Subramanian
fc7d4bd01e E2EE trailer for server injected packets. (#1908)
* Ability to use trailer with server injected frames

A 32-byte trailer generated per room.
Trailer appended when track encryption is enabled.

* E2EE trailer for server injected packets.

- Generate a 32-byte per room trailer. Too reasons for longer length
  o Laziness: utils generates a 32 byte string.
  o Longer length random string reduces chances of colliding with real data.
- Trailer sent in JoinResponse
- Trailer added to server injected frames (not to padding only packets)

* generate

* add a length check

* pass trailer in as an argument
2023-07-27 16:50:18 +05:30
Paul Wells
3980d049c9 close disconnected participants when signal channel fails (#1895)
* close disconnected participants when signal channel fails

* fix typefake

* update reason
2023-07-20 19:23:35 -07:00
Raja Subramanian
eaf70d5549 Pacer in down stream path. (#1835)
* Pacer interface to send packets

* notify outside lock

* use select

* use pass through pacer

* add error to OnSent

* Remove log which could get noisy

* Starting TWCC work (#1727)

* add packet time

* WIP commit

* WIP commit

* WIP commit

* minor comments

* Some measurements (#1736)

* WIP commit

* some notes

* WIP commit

* variable name change and do not post to closed channel

* unlock

* clean up

* comment

* Hooking up some more bits for TWCC (#1752)

* wake under lock

* Pacer in down stream path.

Splitting out only the pacer from a feature branch to
introduce the concept of pacer.

Currently, there should be no difference in functionality
as a pass through pacer is used.

Another implementation exists which is just put it in a queue and send
it from one goroutine.

A potential implementation to try would be data paced by bandwidth
estimate. That could include priority queues and such.

But, the main goal here is to introduce notion of pacer in the down
stream path and prepare for more congestion control possibilities down
the line.

* Don't need peak detector

* remove throttling of write IO errors
2023-06-28 13:22:44 +05:30
Raja Subramanian
352bb1d204 Add GetClientInfo interface, to be used to decide migration vs full-reconenct (#1827) 2023-06-26 23:15:53 +05:30
Raja Subramanian
00558dee5c Close participant on full reconnect. (#1818)
* Close participant on full reconnect.

A full reconnect == irrecoverable error. Participant cannot continue.
So, close the participant when issuing a full reconnect.
That should prevent subscription manager reconcile till the participant
is finally closed down when participant is stale.

* format
2023-06-22 10:09:10 +05:30
Raja Subramanian
12db469297 Better tracking of signalling connection. (#1794)
* Better tracking of signalling connection.

- Reason for closing signaling channel.
- ConnectionID attached to request source/response sink

* Tests
2023-06-15 12:53:34 +05:30
Raja Subramanian
7ed3af193a No proof that this helps (#1772) 2023-06-06 11:28:13 +05:30
cnderrauber
c1842cb54f Avoid reconnect loop for unsupported downtrack (#1754)
* Avoid reconnect loop for unsupported downtrack

If the client subscribes to a track which codec is unsupported by the
client, sfu will trigger negotiation failed and issue a full reconnect
after received client answer. If the client try to subscribe that track
then it will got full reconnect again. That will cause a infinite
reconnect loop until the client don't subscribe that track. This PR
will unsubscribe the error track for the client and send a
SubscriptionResponse that contain the reason to indicates the track's
codec is not supported to avoid the reconnect loop.
2023-05-31 11:41:22 +08:00
Raja Subramanian
3fb93135f5 Experimental flag to try time stamp adjustment to control drift. (#1687)
* Experimental flag to try time stamp adjustment to control drift.

There is a config to enable this.

Using a PID controller to try and keep the sample rate at expected
value. Need to be seen if this works well. Adjustment are limited
to 25 ms max at a time to ensure there are no large jumps.
And it is applied when doing RTCP sender report which happens
once in 5 seconds currently for both audio and video tracks.

A nice introduction to PID controllers - https://alphaville.github.io/qub/pid-101/#/
Implementation borrowed from - https://github.com/pms67/PID

A few things TODO
1. PID controller tuning is a process. Have picked values from test from
   that implementation above. May not be the best. Need to try.
2. Can potentially run this more often. Rather than running it only when
   running RTCP sender report (which is once in 5 seconds now), can
   potentially run it every second and limit the amount of change to
   something like 10 ms max.

* remove unused variable

* debug log a bit more
2023-05-06 11:52:57 +05:30
David Zhao
5fcd682fb0 Refactor participant metadata updates to avoid duplication (#1679)
* Refactor participant metadata updates to avoid duplication

* generated fakes
2023-05-03 13:50:45 -07:00
Raja Subramanian
35b8319b08 Remove disallowed subscriptions on close. (#1668)
With subscription manager, there is no need to tell a publisher
about a subscriber going away. Before subscription manager,
the up track manager of a participant (i. e. the publisher side)
was holding a list of pending subscriptions for its published tracks
and that had to be cleaned up if one of the subscriber goes away.
That is not the case any more.

Also set publisherID early so that subscription permission update has
the right publisherID. In fact, saw an empty ID in the logs and saw
that we still have the disallowed subscription handling which is not
necessary any more.
2023-04-29 09:18:07 +05:30
Paul Wells
11eedf4514 update participant to support signal broadcast skipping (#1657)
* update participant to support signal broadcast skipping

* cleanup

* lock

* feedback

* order

* update requireBroadcast in SetPermissions
2023-04-26 17:11:33 -07:00
David Zhao
3f64828a77 Send Room updates when participant counts change (#1647)
Reduces the number of unneeded generation with ProtoProxy
2023-04-22 21:08:59 -07:00
Paul Wells
745410bd69 only increment participant version after updates (#1646)
* only increment participant version after updates

* fix test util

* cleanup

* test uptrackmanager permission update version check
2023-04-22 17:48:10 -07:00
Raja Subramanian
d2bf8f0ba1 Support simulating subscriber bandwidth. (#1609)
* Support simualting subscriber bandwidth.

When non-zero, a full allocation is triggered.
Also, probes are stopped.

When set to zero, normal probing mechanism should catch up.

Adding `allowPause` override which can be a connection option.

* fix log

* allowPause in participant params
2023-04-13 13:59:24 +05:30
cnderrauber
c70a5c831f Refine transport fallback for client resuming (#1597)
* reset fallback after ice restart

* Configure ice for reconnect before send response
2023-04-10 15:12:05 +08:00
David Zhao
e03f75d6a1 Implements source-specific permissions and client-driven metadata updates (#1590)
Closes #1565
2023-04-07 23:47:49 -07:00
cnderrauber
11ae7fdbb6 Don't switch candidate if signal closed when pc failed (#1498)
* Don't switch candidate if signal closed when pc failed

* change comment

* test case
2023-03-08 15:16:40 +08:00
cnderrauber
48cf30ba23 Send disconnected participant update for reconnecting user (#1495)
* Send disconnected participant update for reconnecting user

* clean code
2023-03-07 09:13:15 +08:00
Raja Subramanian
9e327b1f3c Connection quality (#1490)
* Make connection quality not too optimistic.

With score normalization, the quality indicator showed good
under conditions which should have normally showed some badness.

So, a few things in this PR
- Do not normalize scores
- Pick the weakest link as the representative score (moving away from
  averaging)
- For down track direction, when reporting delta stats, take the number
  of packets sent actually. If there are holes in the feed (upstream
  packet loss), down tracks should not be penalised for that loss.

State of things in connection quality feature
- Audio uses rtcscore-go (with a change to accommodate RED codec). This
  follows the E-model.
- Camera uses rtcscore-go. No change here. NOTE: THe rtscore here is
  purely based on bits per pixel per frame (bpf). This has the following
  existing issues (no change, these were already there)
  o Does not take packet loss, jitter, rtt into account
  o Expected frame rate is not available. So, measured frame rate is
    used as expected frame rate also. If expected frame rate were available,
    the score could be reduced for lower frame rates.
- Screen share tracks: No change. This uses the very old simple loss
  based thresholding for scoring. As the bit rate varies a lot based on
  content and rtcscore video algorithm used for camera relies on
  bits per pixel per frame, this could produce a very low value
  (large width/height encoded in a small number of bits because of static content)
  and hence a low score. So, the old loss based thresholding is used.

* clean up

* update rtcscore pointer

* fix tests

* log lines reformat

* WIP commit

* WIP commit

* update mute of receiver

* WIP commit

* WIP commit

* start adding tests

* take min score if quality matches

* start adding bytes based scoring

* clean up

* more clean up

* Use Fuse

* log quality drop

* clean up debug log

* - Use number of windows for wait to make things simpler
- track no layer expected case
- always update transition
- always call updateScore
2023-03-05 12:55:04 +05:30
David Zhao
8c43b7b48f Fix unsubscribed speakers stuck as speaking to clients (#1475)
When we unsubscribe from a speaker, SendSpeakerUpdates will drop updates
from that speaker. This has the side effect of dropping the "clearing"
message that we are sending as well.
2023-02-26 23:56:09 -08:00
David Zhao
e855620379 Prevent subscribing to track that's closing (#1454)
Due to the order of events in MediaTrackReceiver and friends, SubscribedTrack
will be closed before the track is removed from RoomTrackManager.

Because of this, when a track is unpublished, it's possible to be subscribed
to the track as it's closing.

By introducing a closing state, we'd prevent accidental subscription to
closing tracks.
2023-02-22 01:14:49 -08:00
Raja Subramanian
9f94fc8347 Callback support for migrate state change. (#1435)
This can be used to detect changes in migrate state and signal
migration completion to remote nodes.
2023-02-17 13:13:01 +05:30
Raja Subramanian
6cb46107c8 Delete signal de-duper. (#1427)
Not a good design. There is not an easy way to filter messages
before it hits media node. Without that, there is not a lot
of advantage.

And there are sequences that are not handled correctly in this
deleted implementation.

So, deleting code to prevent use.
2023-02-16 09:32:48 +05:30