Commit Graph

235 Commits

Author SHA1 Message Date
Raja Subramanian
fa061b47fc Logging adjustnments (#2273) 2023-11-29 15:40:01 +05:30
David Colburn
57643a42ed Agents enabled check (#2227)
* agents enabled check

* participant -> publisher

* nil check client

* add NumConnections

* add lock around agent check

* do not launch agents against other agents

* regen

* don't need atomic anymore

* update protocol
2023-11-07 19:19:07 -08:00
David Colburn
60374c6402 Agents (#2203)
* agents

* add test

* undo name changes

* remove debug logs

* fixes

* fix data race in test
2023-11-03 11:43:35 -07:00
cnderrauber
eca32792b8 Add configuration to limit MaxBufferedAmount for data channel (#2170)
* Add configuration to limit MaxBufferedAmount for data channel

* comment

* Fix generate flags

* fix test

* Don't disconnect slow subscriber
2023-10-23 15:03:58 +08:00
David Zhao
65934e6486 Fix ICE connection fallback (#2144)
* Fix ICE connection fallback

Short connection detection relied on iceFailedTimeout, which previously
had been misinterpreted. Since we've reduced iceFailedTimeout, it is
creating false negatives.

We'll instead use PingTimeout since clients are expected to keep the
signal connection active.

* reduce ping interval to align with total ice failure timeout
2023-10-15 14:36:12 -07:00
cnderrauber
1200a960a2 Use generic type cast for IDs (#2095) 2023-09-20 17:09:24 +08:00
Raja Subramanian
281d2b674b Changing some log levels (#2094)
Logging expected WS close at Infow to understand reasons for closure.
Moving "read from ws" to Debugw as it happens when signalling closes.

Also filter out a data channel abort chunk log as it shows a bunch of
errors, but those are expected though.
2023-09-20 14:13:48 +05:30
David Colburn
8a0d417a8c Participant egress (#2070)
* participant egress

* remove hasPublished when participant leaves room
2023-09-18 12:47:57 -07:00
Raja Subramanian
019ad88b08 Do not force reconnect on resume if there is a pending track (#2081)
* Do not force reconnect on resume if there is a pending track

* move GetPendingTrack -> LocalParticipant
2023-09-17 14:00:09 +05:30
David Zhao
2113557842 Skip SendDataPacket logging on transport failure (#2074)
That's a sign of peer connection failure, we do not need to log these
2023-09-15 12:55:54 -07:00
cnderrauber
696798279e Check destination identites of data message if sid is empty (#2058) 2023-09-11 14:42:48 +08:00
Raja Subramanian
c7683fd383 Check for sctp.ErrStreamClosed (#2023) 2023-08-31 21:44:19 +05:30
David Zhao
64bcef28aa Address comments from #1998 (#2006) 2023-08-27 22:50:36 -07:00
David Zhao
eed8e85008 Demote more logs to debug (#1998) 2023-08-27 19:17:38 -07:00
David Zhao
75f5387ccd Allow data packet to be sent to participants by identity (#1982)
* Allow data packet to be sent to participants by identity

* update gomodules
2023-08-19 23:03:09 -07:00
Raja Subramanian
1a32439d7e Ensure older session does not clobeer newer session. (#1974) 2023-08-18 02:00:43 +05:30
David Zhao
13b1b4808f Fix race condition causing new participants to have stale room metadata (#1969)
If room metadata is changed in between when a participant is joining and
when they've became active, that participant will not have the latest
room metadata.
2023-08-15 17:30:26 -07:00
David Zhao
debd75fa15 Integrate logger components (#1933)
* Integrate logger components

Dividing into the following components
* pub - publisher
* pub.sfu
* sub - subscriber
* transport
* transport.pion
* transport.cc
* api
* webhook

* update go modules
2023-08-03 13:31:17 -07:00
Raja Subramanian
f3a0e3e71c skip logging when stream closed (#1928) 2023-08-02 14:00:58 +05:30
David Zhao
981fb7cac7 Adding license notices (#1913)
* Adding license notices

* remove from config
2023-07-27 16:43:19 -07:00
Raja Subramanian
fc7d4bd01e E2EE trailer for server injected packets. (#1908)
* Ability to use trailer with server injected frames

A 32-byte trailer generated per room.
Trailer appended when track encryption is enabled.

* E2EE trailer for server injected packets.

- Generate a 32-byte per room trailer. Too reasons for longer length
  o Laziness: utils generates a 32 byte string.
  o Longer length random string reduces chances of colliding with real data.
- Trailer sent in JoinResponse
- Trailer added to server injected frames (not to padding only packets)

* generate

* add a length check

* pass trailer in as an argument
2023-07-27 16:50:18 +05:30
Raja Subramanian
469f1cd073 Minor changes to publisher bool. (#1880)
* Minor changes to publisher bool.

* address feedback
2023-07-15 12:43:05 +05:30
David Zhao
557fe7c9d3 Mark room as dirty after track published changes (#1878)
Ensure that we are recomputing NumPublished when needed
2023-07-13 16:33:04 -07:00
Raja Subramanian
81f41aca20 Full reconnect on publication mismatch on resume. (#1823)
* Full reconnect on publication mismatch on resume.

It is possible that publications mismatch on resume. An example sequence
- Client sends `AddTrack` for `trackA`
- Server never receives it due to signalling connection breakage.
- Client could do a resume (reconnect=1) noticing signalling connection
  breakage.
- Client's view thinks that `trackA` is known to server, but server does
  not know about it.
- A subsequence offer containing `trackA` triggers `trackInfo not
  available before track publish` and the track does not get published.

Detect the case of missing track and issue a full reconnect.

* UpdateSubscriptions from sync state a la cloud

* add missing shouldReconnect
2023-06-24 19:18:05 +05:30
Raja Subramanian
00558dee5c Close participant on full reconnect. (#1818)
* Close participant on full reconnect.

A full reconnect == irrecoverable error. Participant cannot continue.
So, close the participant when issuing a full reconnect.
That should prevent subscription manager reconcile till the participant
is finally closed down when participant is stale.

* format
2023-06-22 10:09:10 +05:30
David Zhao
f71544e27a Do not send ParticipantJoined webhook if connection was resumed (#1795)
* Do not send ParticipantJoined webhook if connection was resumed

* isResume -> isMigration
2023-06-15 15:39:04 -07:00
Raja Subramanian
12db469297 Better tracking of signalling connection. (#1794)
* Better tracking of signalling connection.

- Reason for closing signaling channel.
- ConnectionID attached to request source/response sink

* Tests
2023-06-15 12:53:34 +05:30
David Zhao
0586009e0d Do not send hidden participants after resume (#1689) 2023-05-05 22:38:17 -07:00
David Zhao
5fcd682fb0 Refactor participant metadata updates to avoid duplication (#1679)
* Refactor participant metadata updates to avoid duplication

* generated fakes
2023-05-03 13:50:45 -07:00
Raja Subramanian
35b8319b08 Remove disallowed subscriptions on close. (#1668)
With subscription manager, there is no need to tell a publisher
about a subscriber going away. Before subscription manager,
the up track manager of a participant (i. e. the publisher side)
was holding a list of pending subscriptions for its published tracks
and that had to be cleaned up if one of the subscriber goes away.
That is not the case any more.

Also set publisherID early so that subscription permission update has
the right publisherID. In fact, saw an empty ID in the logs and saw
that we still have the disallowed subscription handling which is not
necessary any more.
2023-04-29 09:18:07 +05:30
David Zhao
b4ea4de5c0 Skip room updates to participants unless they are active 2023-04-24 15:24:50 -07:00
David Zhao
279b3604c3 Add back ServerRegion and ServerVersion (#1650)
clients are still dependent on them
2023-04-23 23:10:00 -07:00
David Zhao
3f64828a77 Send Room updates when participant counts change (#1647)
Reduces the number of unneeded generation with ProtoProxy
2023-04-22 21:08:59 -07:00
Paul Wells
745410bd69 only increment participant version after updates (#1646)
* only increment participant version after updates

* fix test util

* cleanup

* test uptrackmanager permission update version check
2023-04-22 17:48:10 -07:00
Raja Subramanian
d2bf8f0ba1 Support simulating subscriber bandwidth. (#1609)
* Support simualting subscriber bandwidth.

When non-zero, a full allocation is triggered.
Also, probes are stopped.

When set to zero, normal probing mechanism should catch up.

Adding `allowPause` override which can be a connection option.

* fix log

* allowPause in participant params
2023-04-13 13:59:24 +05:30
cnderrauber
c70a5c831f Refine transport fallback for client resuming (#1597)
* reset fallback after ice restart

* Configure ice for reconnect before send response
2023-04-10 15:12:05 +08:00
David Zhao
d8356e012e Give proper grace period when recorder is still in the room (#1547)
When a recorder is in the room, we would skip grace period due
to a bug with how LastLeftAt was set. This would cause RoomComposite
templates to exit immediately if the last participant in the room reconnects.
2023-03-23 23:57:11 -07:00
David Colburn
191a9e8014 update core to 0.0.5 (#1540)
* update core

* sort imports

* fix typos

* redundant types
2023-03-22 16:53:23 -07:00
cnderrauber
11ae7fdbb6 Don't switch candidate if signal closed when pc failed (#1498)
* Don't switch candidate if signal closed when pc failed

* change comment

* test case
2023-03-08 15:16:40 +08:00
cnderrauber
38deab8991 Send room update while client reconnecting (#1499) 2023-03-07 15:59:40 +08:00
Raja Subramanian
9e327b1f3c Connection quality (#1490)
* Make connection quality not too optimistic.

With score normalization, the quality indicator showed good
under conditions which should have normally showed some badness.

So, a few things in this PR
- Do not normalize scores
- Pick the weakest link as the representative score (moving away from
  averaging)
- For down track direction, when reporting delta stats, take the number
  of packets sent actually. If there are holes in the feed (upstream
  packet loss), down tracks should not be penalised for that loss.

State of things in connection quality feature
- Audio uses rtcscore-go (with a change to accommodate RED codec). This
  follows the E-model.
- Camera uses rtcscore-go. No change here. NOTE: THe rtscore here is
  purely based on bits per pixel per frame (bpf). This has the following
  existing issues (no change, these were already there)
  o Does not take packet loss, jitter, rtt into account
  o Expected frame rate is not available. So, measured frame rate is
    used as expected frame rate also. If expected frame rate were available,
    the score could be reduced for lower frame rates.
- Screen share tracks: No change. This uses the very old simple loss
  based thresholding for scoring. As the bit rate varies a lot based on
  content and rtcscore video algorithm used for camera relies on
  bits per pixel per frame, this could produce a very low value
  (large width/height encoded in a small number of bits because of static content)
  and hence a low score. So, the old loss based thresholding is used.

* clean up

* update rtcscore pointer

* fix tests

* log lines reformat

* WIP commit

* WIP commit

* update mute of receiver

* WIP commit

* WIP commit

* start adding tests

* take min score if quality matches

* start adding bytes based scoring

* clean up

* more clean up

* Use Fuse

* log quality drop

* clean up debug log

* - Use number of windows for wait to make things simpler
- track no layer expected case
- always update transition
- always call updateScore
2023-03-05 12:55:04 +05:30
David Zhao
8c43b7b48f Fix unsubscribed speakers stuck as speaking to clients (#1475)
When we unsubscribe from a speaker, SendSpeakerUpdates will drop updates
from that speaker. This has the side effect of dropping the "clearing"
message that we are sending as well.
2023-02-26 23:56:09 -08:00
cnderrauber
1b3d6fad54 update to utils parallel execute (#1450)
* log change

* remove identity field

* update to protocol parallel execute

* update protocol
2023-02-22 11:09:32 +08:00
cnderrauber
4367e93855 parallel writing for data packet broadcast (#1425) 2023-02-15 17:18:43 +08:00
David Zhao
2851a8ac98 Improved robustness of subscription stack (#1382)
UpdateSubscription had a shortcoming where when it couldn't find the
participant, it ignored the request.

This PR further removes the reliance of current publisher state from
subscribers.
- SubscribeToTrack only takes in a trackID
- Introduced RoomTrackManager to maintain all published tracks to a room
- Added TrackUnpublished event to clearly indicate when a track has been removed
- SubscribeRequested event no longer include information about the publisher
2023-02-06 18:08:26 -08:00
cnderrauber
8b6dab780c Add reconnect reason and signal rtt calculation (#1381)
* Add connect reason and signal rtt calculate

* Update protocol

* solve comment
2023-02-06 11:12:25 +08:00
Raja Subramanian
037ae572d9 Ensure older participant session update does not go out after a newer (#1372)
* Ensure older participant session update does not go out after a newer
session has joined.

* fix tests

* change comment

* do not send older version
2023-02-03 00:30:11 +05:30
David Zhao
cd6b8b80b9 feat: SubscriptionManager to consolidate subscription handling (#1317)
Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are:

improve subscription resilience by separating desired state and current state
reduce complexity of synchronous processing
better detect failures with the ability to trigger full reconnect
2023-01-24 23:06:16 -08:00
Dan McFaul
4d6f0cd0f7 Stats collect v2 (#1291)
* initial commit

* add correct label

* clean up

* more cleanup on adding stats

* cleanup

* move things to pub and sub monitors, ensure stats are correctly updated

* fix merge conflict

* Fix panic on MacOS (#1296)

* fixing last feedback

Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
2023-01-11 14:49:50 -07:00
cnderrauber
25debc6d35 add reconnect response to update configuration while reconnecting (#1300)
* add reconnect response to update configuration while reconnecting

* fix test
2023-01-11 17:40:12 +08:00