* Introduce `DISCONNECTED` connection quality.
Currently, this state is reported when any upstream track does not
send any packets in an analysis window in which it is expected to send
packets.
This can be used by participants to know the quality of a potentially
disconnected participant. Previously, it took 20-30 seconds for
the stale timeout to kick in and disconnect the limbo participant, which
triggered a participant update through which other participants knew
about it.
Previously, `POOR` quality was also overloaded to denote that the
upstream is not sending any packets. With this change, that is a
separate indicator, i.e. `DISCONNECTED`.
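A minimal sketch of the detection idea, with illustrative names (`windowStats`, `quality`); the actual server types differ:

```go
package main

import "fmt"

type ConnectionQuality int

const (
	QualityPoor ConnectionQuality = iota
	QualityGood
	QualityExcellent
	QualityDisconnected // new: nothing received in a window where packets were expected
)

// windowStats summarizes one analysis window of an upstream track.
type windowStats struct {
	packetsExpected uint32
	packetsReceived uint32
}

// quality maps a window to a connection quality. Previously a silent
// window was folded into POOR; now it is reported separately.
func quality(w windowStats) ConnectionQuality {
	if w.packetsExpected > 0 && w.packetsReceived == 0 {
		return QualityDisconnected
	}
	// ... loss/jitter based scoring for the other levels ...
	return QualityGood
}

func main() {
	fmt.Println(quality(windowStats{packetsExpected: 500})) // 3 (disconnected)
}
```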
* clean up
* Update deps
* spelling
* Participant traffic load.
Capturing information about participant traffic:
- Upstream/Downstream
- Audio/Video/Data
- Packets/Bytes
This captures a notion of how much traffic load a participant is
generating.
Can be used to make allocation decisions.
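Roughly the shape of the captured data; the field names are illustrative, not the actual protocol types:

```go
package main

import "fmt"

// trafficTotals counts one (direction, kind) bucket.
type trafficTotals struct {
	Packets uint64
	Bytes   uint64
}

// trafficLoad captures per-participant traffic split by
// upstream/downstream and audio/video/data.
type trafficLoad struct {
	UpAudio, UpVideo, UpData       trafficTotals
	DownAudio, DownVideo, DownData trafficTotals
}

// totalBytes is the kind of aggregate an allocator might look at.
func (t *trafficLoad) totalBytes() uint64 {
	return t.UpAudio.Bytes + t.UpVideo.Bytes + t.UpData.Bytes +
		t.DownAudio.Bytes + t.DownVideo.Bytes + t.DownData.Bytes
}

func main() {
	var tl trafficLoad
	tl.UpVideo = trafficTotals{Packets: 1000, Bytes: 1_200_000}
	fmt.Println(tl.totalBytes())
}
```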
* Clean up
* SIP patches
* reporter goroutine
* unlock
* move traffic stats from protocol
* check type
* Use a worker to report signal/data stats.
Was checking whether reporting is needed on every update.
That check is wasted work when the volume of signal/data messages is
high, as reporting happens only once every 10 seconds.
Changing to a timer-based worker, and also aligning with the
telemetry reporting interval, which defaults to 30 seconds.
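A minimal sketch of the timer-based worker, assuming a plain counter and a `report` callback (both illustrative):

```go
package main

import (
	"sync/atomic"
	"time"
)

// statsWorker accumulates counts on the hot path and reports on a fixed
// interval, instead of checking "is it time to report?" on every message.
type statsWorker struct {
	messages atomic.Uint64
	done     chan struct{}
}

// Add is the hot path: just count, no time checks.
func (w *statsWorker) Add(n uint64) { w.messages.Add(n) }

// run drains and reports the counter on every tick.
func (w *statsWorker) run(interval time.Duration, report func(uint64)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			report(w.messages.Swap(0))
		case <-w.done:
			return
		}
	}
}

func main() {
	w := &statsWorker{done: make(chan struct{})}
	go w.run(30*time.Second, func(n uint64) { _ = n /* push to telemetry */ })
	w.Add(1)
	close(w.done)
}
```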
* Remove unused constant
* Disable H.264 for Android Firefox
* Fix syntax error for rule
* lower case
* Remove disabled codec from AddTrackRequest
* Consistent handling of enabled codecs
Mainly cleaning up where we do codec filtering.
There is also a behavior change in how we handle codec compatibility: if the
client's desired codec is not enabled, we'll pick a backup automatically
instead of rejecting the client's request.
Requires an update to multi-codec simulcast handling.
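A sketch of the fallback behavior described above; `selectCodec` is a hypothetical helper, not the actual filtering code:

```go
package main

import "fmt"

// selectCodec returns the desired codec if it is enabled, otherwise the
// first enabled codec as a backup (previously the request was rejected
// whenever the desired codec was unavailable).
func selectCodec(desired string, enabled []string) (string, error) {
	for _, e := range enabled {
		if e == desired {
			return desired, nil
		}
	}
	if len(enabled) > 0 {
		return enabled[0], nil // pick a backup automatically
	}
	return "", fmt.Errorf("no enabled codec available")
}

func main() {
	codec, _ := selectCodec("video/H264", []string{"video/VP8", "video/VP9"})
	fmt.Println(codec) // video/VP8
}
```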
* fix alternative codec selection
---------
Co-authored-by: David Zhao <dz@livekit.io>
* Fix ICE connection fallback
Short connection detection relied on iceFailedTimeout, which previously
had been misinterpreted. Since we've reduced iceFailedTimeout, it is
creating false negatives.
We'll instead use PingTimeout since clients are expected to keep the
signal connection active.
* reduce ping interval to align with total ice failure timeout
Logging expected WS closes at Infow to understand reasons for closure.
Moving "read from ws" to Debugw as it happens whenever signalling closes.
Also filtering out a data channel abort chunk log, as it shows a bunch of
errors that are expected.
It's been reported that "ghost" participants, those that did not terminate
cleanly, hang around the room for too long after they disappear.
Evaluating our timeouts, it seems that we are quite conservative
in waiting for participants to disconnect. This PR cuts the disconnect
timeout from 50s to 20s, a 30s reduction.
* Fix timestamp adjustment when starting with dummy packets.
- Populated extended values in ExtPacket on dummy packet.
- Have to pass the reference timestamp offset to the first packet time
adjustment.
* display participant version info
* Add option to issue full reconnect on data channel error.
There are situations where sending a data packet fails with "stream
closed". It is unclear when that happens. It seems to be after an
ICE restart following ICE failure, with the connection type switching
from ICE to TURN.
Once the failure happens, it is not recoverable. Potentially, it is
recoverable, but it is unclear where the problem lies. Attempts to
reproduce by looking at the pattern of failures have been unsuccessful.
In the meantime, adding an option to issue a full reconnect
when sending a data packet fails.
* typo
* Integrate logger components
Dividing into the following components:
* pub - publisher
* pub.sfu
* sub - subscriber
* transport
* transport.pion
* transport.cc
* api
* webhook
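Illustrative only: `WithComponent` below is a hypothetical stand-in for however the logger actually attaches a component name:

```go
package main

import "log/slog"

// WithComponent tags a logger with one of the components above
// (pub, pub.sfu, sub, transport, transport.pion, transport.cc,
// api, webhook) so log lines can be filtered per component.
func WithComponent(l *slog.Logger, component string) *slog.Logger {
	return l.With("component", component)
}

func main() {
	pubLogger := WithComponent(slog.Default(), "pub.sfu")
	pubLogger.Info("forwarding track")
}
```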
* update go modules
* Add control of playout delay
Add config to enable playout delay. The delay is limited to
[min, max] from the config option and calculated from upstream &
downstream RTT.
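A sketch of the clamping; deriving the delay from the sum of the two RTTs is an assumption here, as is every name:

```go
package main

import "fmt"

// PlayoutDelayConfig is an illustrative shape for the option:
// playout delay is enabled per room and clamped to [MinMs, MaxMs].
type PlayoutDelayConfig struct {
	Enabled bool
	MinMs   uint32
	MaxMs   uint32
}

// playoutDelay derives a delay from upstream + downstream RTT and
// clamps it to the configured bounds.
func playoutDelay(cfg PlayoutDelayConfig, upRTTMs, downRTTMs uint32) uint32 {
	if !cfg.Enabled {
		return 0
	}
	d := upRTTMs + downRTTMs
	if d < cfg.MinMs {
		d = cfg.MinMs
	}
	if d > cfg.MaxMs {
		d = cfg.MaxMs
	}
	return d
}

func main() {
	cfg := PlayoutDelayConfig{Enabled: true, MinMs: 100, MaxMs: 500}
	fmt.Println(playoutDelay(cfg, 80, 150)) // 230
}
```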
* check protocol version to enable playout delay
* Move config to room, limit playout-delay update interval, address review comments
* Remove adaptive playout-delay
* Remove unused config
Server could have closed the subscriber PC to aid migration.
But, if a resume lands back on that node, a resume of
the participant session is not possible as the subscriber PC is already
closed. While it is theoretically possible to form a new subscriber
peer connection, reducing complexity and issuing a full reconnect
as this should be a rare case.
* Ability to use trailer with server injected frames
A 32-byte trailer generated per room.
Trailer appended when track encryption is enabled.
* E2EE trailer for server injected packets.
- Generate a 32-byte per-room trailer. Two reasons for the longer length:
  o Laziness: utils generates a 32-byte string.
  o A longer random string reduces the chance of colliding with real data.
- Trailer sent in JoinResponse
- Trailer added to server injected frames (not to padding-only packets)
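A minimal sketch of the generation and appending; helper names are illustrative:

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newRoomTrailer generates the per-room 32-byte trailer; crypto/rand
// stands in for "a long random string" here.
func newRoomTrailer() ([]byte, error) {
	trailer := make([]byte, 32)
	if _, err := rand.Read(trailer); err != nil {
		return nil, err
	}
	return trailer, nil
}

// appendTrailer marks a server injected frame: when track encryption is
// enabled, the trailer is appended so clients can tell the frame apart
// from E2EE media (padding-only packets are left untouched).
func appendTrailer(payload, trailer []byte, encrypted bool) []byte {
	if !encrypted {
		return payload
	}
	return append(payload, trailer...)
}

func main() {
	trailer, _ := newRoomTrailer()
	frame := appendTrailer([]byte{0x01, 0x02}, trailer, true)
	fmt.Println(len(frame)) // 34
}
```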
* generate
* add a length check
* pass trailer in as an argument
1. When re-allocating for a track in DEFICIENT state, try to use
available headroom to accommodate the change before trying to steal
bits from other tracks.
2. If the changing track gives back bits (because of muting or
moving to a lower layer subscription), use the returned bits
to try to boost deficient track(s). (See the sketch below.)
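A sketch of that order, headroom first and stealing elided; all names are illustrative:

```go
package main

import "fmt"

// allocate grants a deficient track's extra need from unused headroom
// first; only the shortfall would be stolen from other tracks.
func allocate(neededBps, headroomBps int64) (granted, shortfall int64) {
	if neededBps <= headroomBps {
		return neededBps, 0
	}
	// ... steal (neededBps - headroomBps) from non-deficient tracks ...
	return headroomBps, neededBps - headroomBps
}

// onBitsReturned hands bits freed by a mute or lower-layer subscription
// to deficient tracks before they go back into headroom.
func onBitsReturned(returnedBps int64, deficits []int64) (leftoverBps int64) {
	for i := range deficits {
		give := returnedBps
		if deficits[i] < give {
			give = deficits[i]
		}
		deficits[i] -= give
		returnedBps -= give
	}
	return returnedBps
}

func main() {
	fmt.Println(allocate(300_000, 500_000)) // 300000 0
	deficits := []int64{150_000, 100_000}
	fmt.Println(onBitsReturned(200_000, deficits), deficits) // 0 [0 50000]
}
```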
* Pacer interface to send packets
* notify outside lock
* use select
* use pass through pacer
* add error to OnSent
* Remove log which could get noisy
* Starting TWCC work (#1727)
* add packet time
* WIP commit
* WIP commit
* WIP commit
* minor comments
* Some measurements (#1736)
* WIP commit
* some notes
* WIP commit
* variable name change and do not post to closed channel
* unlock
* clean up
* comment
* Hooking up some more bits for TWCC (#1752)
* wake under lock
* Pacer in downstream path.
Splitting out only the pacer from a feature branch to
introduce the concept of a pacer.
Currently, there should be no difference in functionality,
as a pass-through pacer is used.
Another implementation exists which just puts packets in a queue and
sends them from one goroutine.
A potential implementation to try would be data paced by bandwidth
estimate. That could include priority queues and such.
But the main goal here is to introduce the notion of a pacer in the
downstream path and prepare for more congestion control possibilities
down the line.
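A sketch of the concept: the interface plus the pass-through and single-goroutine queue variants mentioned above (names are illustrative, not the actual SFU types):

```go
package main

import "time"

// Packet is a minimal stand-in for what the pacer hands downstream.
type Packet struct {
	Payload []byte
}

// Pacer abstracts packet sending in the downstream path so the send
// policy can change without touching the write sites.
type Pacer interface {
	Enqueue(p Packet)
	Stop()
}

// passThroughPacer sends synchronously; with it in place, behavior is
// unchanged from before the pacer existed.
type passThroughPacer struct{ send func(Packet) }

func (p *passThroughPacer) Enqueue(pkt Packet) { p.send(pkt) }
func (p *passThroughPacer) Stop()              {}

// queuePacer drains a queue from one goroutine; a future variant could
// pace by bandwidth estimate and add priority queues.
type queuePacer struct {
	ch   chan Packet
	send func(Packet)
}

func newQueuePacer(send func(Packet)) *queuePacer {
	q := &queuePacer{ch: make(chan Packet, 1024), send: send}
	go func() {
		for pkt := range q.ch {
			q.send(pkt)
		}
	}()
	return q
}

func (q *queuePacer) Enqueue(pkt Packet) {
	select {
	case q.ch <- pkt:
	default: // queue full: drop rather than block the media path
	}
}

func (q *queuePacer) Stop() { close(q.ch) }

func main() {
	var p Pacer = newQueuePacer(func(Packet) {})
	p.Enqueue(Packet{Payload: []byte{0}})
	time.Sleep(10 * time.Millisecond)
	p.Stop()
}
```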
* Don't need peak detector
* remove throttling of write IO errors
When migrating a muted track, we need to set potential codecs.
For audio, there may not be `simulcast_codecs` in `AddTrack`.
Hence, when migrating a muted track, the potential codecs are not set.
That results in no receivers in the relay up track (because all of this
could happen before the audio track is unmuted).
So, look at the MimeType in TrackInfo (this will be set in OnTrack) and
use that as the potential codec.
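A sketch of the fallback; `potentialCodecs` is a hypothetical helper:

```go
package main

import "fmt"

// potentialCodecs prefers the simulcast codecs from AddTrack, and falls
// back to TrackInfo's MimeType (set in OnTrack) so a migrated muted
// audio track still ends up with a potential codec.
func potentialCodecs(simulcastCodecs []string, trackInfoMime string) []string {
	if len(simulcastCodecs) > 0 {
		return simulcastCodecs
	}
	if trackInfoMime != "" {
		return []string{trackInfoMime}
	}
	return nil
}

func main() {
	fmt.Println(potentialCodecs(nil, "audio/opus")) // [audio/opus]
}
```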
* Do not process events after participant close.
Avoid processing transport events after participant/transport close.
Doing so causes error logs which are not real errors, just distracting noise.
* correct comment
* Close participant on full reconnect.
A full reconnect == irrecoverable error. The participant cannot continue.
So, close the participant when issuing a full reconnect.
That should prevent the subscription manager from reconciling until the
participant is finally closed down as stale.
* format
Till now, only video used simulated publish when migrating on mute.
But, with `pauseUpstream() + replaceTrack(null)`, it is possible that the
client does not send any data when muted.
I do not think there is a problem doing this (even when the client is
actually using mute, which sends silence frames).