* Log cleanup pass
Demoted a bunch of logs to DEBUG, consolidated logs.
* use context logger and fix context var usage
* moved common error types, fixed tests
* Restrict scope of negotiation time out error logs
1. Log "negotiation failed" only if signal channel was active
within half window of negotiation timeout. Negotiation timeout currently
is at 15 seconds. Signal pings are every 10 seconds.
2. In transport.go, do not report negotiation timed out and do not
callback negotiation failure if the peer connection state is not
connected. Goal of negotiation failure tracker is to take remedial
action when an in-session negotiation fails. Seeing a bunch of cases
of the case hitting even without ICE connection forming. Negotiation
timer is not intended for those cases.
* fix test
* Add optional supervisor disable.
Used `DisableSupervisor` so that default can be enabled and
it can be disabled explicity. But, open to defaulting to disable
(i. e. change param to `EnableSupervisor`).
* Move nil check to call site
* Introduce `DISCONNECTED` connection quality.
Currently, this state happens when any up stream track does not
send any packets in an analysis window when it is expected to send
packets.
This can be used by participants to know the quality of a potentially
disconnected participant. Previously, it took 20 - 30 seconds for
the stale timeout to kick in and disconnect the limbo participant which
triggered a participant update through which other participants knew
about it.
Previously, `POOR` quality was also overloaded to denote that the
up stream is not sending any packets. With this change, that is a
separate indicator, i. e. `DISCONNECTED`.
* clean up
* Update deps
* spelling
* Participant traffic load.
Capturing information about participant traffic
- Upstream/Downstream
- Audio/Video/Data
- Packets/Bytes
This captures a notion of how much traffic load a participant is
generating.
Can be used to make allocation decisions.
* Clean up
* SIP patches
* reporter goroutine
* unlock
* move traffic stats from protocol
* check type
* Use a worker to report signal/data stats.
Was checking if reporting is needed on every update.
The check is wasted work if volume of signal/data messages is high
as reporting happens only once in 10 seconds.
Changing to a worker based on a timer. And also aligning with
telemetry reporting interval which defaults to 30 seconds.
* Remove unused constant
* Reduce logging
1. Do not print rtp stats if nil. Means that some subscribed tracks may
not have any logs (very short subscriptions which end before any
packet is sent).
2. Log ICE candidates only at the end, not when ICE connects. That logs
the selected ICE candidate pair.
3. Log ICE candidates only if not empty.
* Update some deps
* Do not block on down track close with flush.
When publisher removes all subscribers, publisher side should
not be blocked for long. With close with flush, it could happen
if there a lot of bunch of subscribers.
So, when is expected, run it in a goroutine like it is done in
subscription manager.
Not moving the entire `RemoveSubscriber` bit to subscription manager as
there are two bits which are not tracked now
- mime type
- willBeResumed
Those two would have to be tracked in track manager and notified to
subscription manager so that it can act for that mine and if the track
will be resumed or not. As that touch more parts and could get
complicated, doing the simpler thing of cloning behaviour from
subscription manager for now.
* clean up
* code readability
* Disable H.264 for android firefox
* Fix syntax error for rule
* lower case
* Remove disabled codec from AddTrackRequest
* Consistent handling of enabled codecs
Mainly cleaning up where we are doing codec filtering.
There's also behavior change of how we handle codec compatibility. If a client doesn't support the client's desired codec, we'll pick a backup automatically
instead of rejecting the client's request.
Requires an update on multi-codec simulcast handling.
* fix alternative codec selection
---------
Co-authored-by: David Zhao <dz@livekit.io>
* Remove un-preferred codecs for android firefox
Android firefox don't comply with the codec order in answer sdp and
has problem to publish h.264, remove other codecs to fix this.
* false(false) is true
* Use 32-bit time stamp to get reference time stamp on a switch.
With relay and dyncast and migration, it is possible that different
layers of a simulcast get out of sync in terms of extended type,
i. e. layer 0 could keep running and its timestamp could have
wrapped around and bumped the extended timestamp. But, another layer
could start and stop.
One possible solution is sending the extended timestamp across relay.
But, that breaks down during migration if publisher has started afresh.
Subscriber could still be using extended range.
So, use 32-bit timestamp to infer reference timestamp and patch it with
expected extended time stamp to derive the extended reference.
* use calculated value
* make it test friendly
* Fix ICE connection fallback
Short connection detection relied on iceFailedTimeout, which previously
had been misinterpreted. Since we've reduced iceFailedTimeout, it is
creating false negatives.
We'll instead use PingTimeout since clients are expected to keep the
signal connection active.
* reduce ping interval to align with total ice failure timeout
There are cases that experience the signalling channel timeout
and disconnect and there are no logs of what the state of ICE at all.
Log ICE candidates when closing transport so that there is some
visibility in those cases.
n
Logging expected WS close at Infow to understand reasons for closure.
Moving "read from ws" to Debugw as it happens when signalling closes.
Also filter out a data channel abort chunk log as it shows a bunch of
errors, but those are expected though.
It's been reported that "ghost" participants, those that did not terminate
cleanly, hang around the room for too long after they disappear.
Evaluating our timeouts a bit, it seems that we are really conservative
in waiting for participants to disconnect. This PR cuts down the disconnect
timeout from 50s to 20s, a 30s reduction.
* Split RTPStats into receiver and sender.
For receiver, short types are input and need to calculate extended type.
For sender (subscriber), it can operate only in extended type.
This makes the subscriber side a little simpler and should make it more
efficient as it can do simple comparisons in extended type space.
There was also an issue with subscriber using shorter type and
calculating extended type. When subscriber starts after the publisher
has already rolled over in sequence number OR timestamp, when
subsequent publisher side sender reports are used to adjust subscriber
time stamps, they were out of whack. Using extended type on subscriber
does not face that.
* fix test
* extended types from sequencer
* log
* Fix time stamp adjustment when starting with dmummy packets.
- Populated extended values in ExtPacket on dummy packet.
- Have to pass reference time stamp offset to first packet time
adjustment.
* display participant version info