Commit Graph

1848 Commits

Author SHA1 Message Date
Raja Subramanian 72ed5b19f7 Use receiver report stats for loss/rtt/jitter. (#1781)
* Use receiver report stats for loss/rtt/jitter.

Reversing a bit of https://github.com/livekit/livekit/pull/1664.
That PR did two snapshots (one based on what SFU is sending
and one based on combination of what SFU is sending reconciled with
stats reported from client via RTCP Receiver Report). That PR
reported SFU only view to analytics. But, that view does not have
information about loss seen by client in the downstream.
Also, that view does not have RTT/jitter information. The rationale behind
using the SFU-only view is that the SFU should report what it sends irrespective
of whether the client is receiving it or not. But, that view did not have proper
loss/RTT/jitter.

So, switch back to reporting SFU + receiver report reconciled view.
The downside is that when receiver reports are not received,
packets sent/bytes sent will not be reported to analytics.

An option is to report SFU only view if there are no receiver reports.
But, it becomes complex because of the offset. Receiver report would
acknowledge certain range whereas SFU only view could be different
because of propagation delay. To simplify, just use the reconciled
view to report to analytics. Using the available view will require
a bunch more work to produce accurate data.
(NOTE: all this started due to a bug where RTCP was not restarted on
a track resume which killed receiver reports and we went on this path
to distinguish between publisher stopping vs RTCP receiver report not
happening)

One optimisation to make here concerns the check to see if the publisher is sending data.
Using a full DeltaInfo for that is overkill. A lighter-weight check can be
added later.

* return available streams

* fix test
2023-06-09 23:31:25 +05:30
Raja Subramanian f518f5d743 Log head SN when packet cannot be fetched (#1780) 2023-06-09 12:13:06 +05:30
David Colburn 8235310a92 don't save info after UpdateStream (#1779) 2023-06-07 16:27:37 -07:00
Raja Subramanian 22813cd2be Recreate channel observer irrespective of probe success/fail. (#1778) 2023-06-08 01:40:07 +05:30
renovate[bot] bc11419755 Update module github.com/hashicorp/golang-lru/v2 to v2.0.3 (#1774)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-06-06 23:32:28 -07:00
Raja Subramanian b591140d66 Ignore receiver report till initialized (#1773) 2023-06-06 21:43:49 +05:30
Raja Subramanian 7ed3af193a No proof that this helps (#1772) 2023-06-06 11:28:13 +05:30
Raja Subramanian 076d8cad73 Promote switch log to Infow (#1771) 2023-06-06 11:20:57 +05:30
Paul Wells 6e063896d0 update psrpc (#1770)
* update psrpc

* update protocol
2023-06-05 18:42:02 -07:00
David Zhao 7e5a7ae79f Fixed windows build (#1768) v1.4.3 2023-06-04 00:17:25 -07:00
David Zhao 109620dfb6 Version 1.4.3 (#1767) 2023-06-03 23:51:29 -07:00
Raja Subramanian f5c5d4e079 Wait for a more stable measurement of sample rate. (#1764) 2023-06-03 14:26:26 +05:30
Benjamin Pracht e7879a46fc Add ingress telemetry support (#1763) 2023-06-02 17:38:19 -07:00
Raja Subramanian c2ae34151c Enable some debug logs to debug freeze (#1761)
* Enable some debug logs to debug freeze

* log receiver sender report also
2023-06-02 16:31:19 +05:30
David Zhao b5c8fe5294 Perform unsubscribe in parallel to avoid blocking (#1760)
* Perform unsubscribe in parallel to avoid blocking

When unsubscribing from tracks, we flush a blank frame in order to prepare
the transceivers for re-use. This process blocks for ~200ms. If
the unsubscribes are performed serially, it would prevent other subscribe
operations from continuing.

This PR parallelizes that operation, and ensures subsequent subscribe
operations could reuse the existing transceivers.

* also perform in parallel when uptrack close

* fix a few log fields
2023-06-02 00:13:18 -07:00
David Colburn 9a698736d1 include await_start_signal (#1759) 2023-06-01 16:56:12 -07:00
renovate[bot] 5a8305f09b Update module github.com/stretchr/testify to v1.8.4 (#1756)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-05-31 22:18:26 -07:00
cnderrauber c1842cb54f Avoid reconnect loop for unsupported downtrack (#1754)
* Avoid reconnect loop for unsupported downtrack

If the client subscribes to a track whose codec the client does not
support, the SFU triggers a negotiation failure and issues a full reconnect
after receiving the client's answer. If the client then tries to subscribe to
that track again, it gets another full reconnect. That causes an infinite
reconnect loop until the client stops subscribing to that track. This PR
unsubscribes the client from the failing track and sends a
SubscriptionResponse containing a reason that indicates the track's
codec is not supported, which avoids the reconnect loop.
2023-05-31 11:41:22 +08:00
Raja Subramanian 13d599d2d9 Comment out noisy log. (#1757) 2023-05-31 06:35:25 +05:30
Benjamin Pracht d598e06d9f Add support for bypass_transcoding field in ingress (#1741) 2023-05-30 13:41:12 -07:00
renovate[bot] ca3e9ab524 Update go deps (#1750)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-05-29 22:50:24 -07:00
David Zhao 956735ae05 Fix node stats updates on Windows (#1748)
Because we aren't able to get CPU count/load info on Windows, they are
stubbed out to return placeholders. This restores compatibility to run
on Windows.
2023-05-29 10:53:08 -07:00
Raja Subramanian fdfd830394 Split probe controller from StreamAllocator. (#1751)
* Split probe controller from StreamAllocator.

With TWCC, there is a need to check for probe status
in a separate goroutine. So, probe-specific state needs
locking. Split out the probe controller to make that cleaner.

* remove defer
2023-05-29 14:41:44 +05:30
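
A rough sketch of why splitting probe state into its own type makes the locking cleaner. The `ProbeController` below is a hypothetical, heavily simplified type and does not mirror the fields or methods of the real one; it only shows one mutex guarding all probe state so another goroutine can query it safely.

```go
package main

import (
	"fmt"
	"sync"
)

// ProbeController is a hypothetical holder of probe state. Every field is
// guarded by mu so that it can be read from a separate goroutine.
type ProbeController struct {
	mu      sync.Mutex
	probeID int  // ID of the most recently started probe
	active  bool // whether that probe is still running
}

// Start begins a new probe and returns its ID.
func (p *ProbeController) Start() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.probeID++
	p.active = true
	return p.probeID
}

// Finish ends the probe with the given ID. It reports false if that probe
// is not the currently active one.
func (p *ProbeController) Finish(id int) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	if !p.active || id != p.probeID {
		return false
	}
	p.active = false
	return true
}

// IsActive reports whether a probe is currently running.
func (p *ProbeController) IsActive() bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.active
}

func main() {
	pc := &ProbeController{}
	id := pc.Start()

	// A separate goroutine checks probe status, as a feedback handler might.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		fmt.Println("probe active:", pc.IsActive())
	}()
	wg.Wait()

	fmt.Println("finished:", pc.Finish(id))
}
```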
Paul Wells 2edd257705 update psrpc (#1749) 2023-05-28 12:54:23 -07:00
Raja Subramanian ea57e4f2c1 Ignore receiver reports that have a sequence number before first packet. (#1745) 2023-05-28 10:05:35 +05:30
renovate[bot] e99aabd908 Update go deps (#1697)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-05-27 21:34:15 -07:00
renovate[bot] aefbdde3b8 Update pion deps (#1706)
* Update pion deps

Generated by renovateBot

* remove active TCP override

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
2023-05-27 21:33:24 -07:00
Raja Subramanian 9dd2ebc960 Change too many packets log to error to get back trace. (#1744) 2023-05-27 12:19:30 +05:30
Raja Subramanian 1c920812d3 Return max spatial layer from selectors. (#1743)
* Return max spatial layer from selectors.

With differing requirements of SVC and allowing overshoot in Simulcast,
selectors are best placed to indicate what is the max spatial layer when
they indicate a switch to max spatial layer.

* fix test

* prevent race
2023-05-26 12:49:31 +05:30
cnderrauber fc8375f150 Fix dynacast for svc codec (#1742) 2023-05-26 14:34:35 +08:00
Raja Subramanian 0354626bfc Adjust sender report time stamp for slow publishers. (#1740)
It is possible that the publisher paces the media.
So, the RTCP sender report from the publisher could be ahead of
what is being forwarded by a good amount (have seen up to 2 seconds
ahead). Using the forwarded time stamp for the RTCP sender report
in the downstream leads to jumps back and forth in the down track
RTCP sender report.

So, look at the publisher's RTCP sender report to check for it being
ahead and use the publisher rate as a guide.
2023-05-25 21:55:54 +05:30
Raja Subramanian 11c5737e04 Filter another expected error. (#1738)
Actually, was not filtering the not last sender report error before.
Previous PR did that. This PR restores the old no last sender report
filter. Both are filterable errors.
2023-05-24 12:41:47 +05:30
Raja Subramanian 07252b7ce3 Filter not last SR error (#1737) 2023-05-24 12:32:12 +05:30
David Zhao 61d393e709 Disable active TCP by rolling back to ICE v2.3.3 (#1735)
* Revert "Disable active TCP (#1726)"

This reverts commit 5260907ffe.

* Disable active TCP by rolling back to ICE v2.3.3
2023-05-23 21:27:03 -07:00
Raja Subramanian bbbe815260 Init min to max MOS (#1734)
* Init min to max MOS

Could have been contributing to low p50 score in prom stats.

* don't need to reset on no tracks as default is that
2023-05-23 12:55:24 +05:30
David Zhao 12c6f1e12c Added Xiaomi 2201117TI to devices that do not support H.264 (#1728) 2023-05-22 21:38:56 -07:00
Raja Subramanian cba37389da mediatransportutil to get wrap back fix (#1732) 2023-05-23 10:02:36 +05:30
renovate[bot] ceac340ddc Update github.com/livekit/mediatransportutil digest to cc5a379 (#1731)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-05-22 13:41:27 -07:00
renovate[bot] 04bcd601f0 Update livekit deps (#1599)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-05-22 13:29:52 -07:00
Raja Subramanian d9e682a0d2 Fix unwrap (#1729)
* Fix unwrap

An out-of-order packet wrapping back after a wrap-around had already happened
was not using the proper cycle counter to calculate the unwrapped value.

* update mediatransportutil
2023-05-22 18:46:56 +05:30
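
To illustrate the case described in the commit above, here is a minimal sketch of unwrapping 16-bit sequence numbers with a cycle counter. It is not the mediatransportutil implementation; the `Unwrapper` type and its fields are hypothetical, and packets older than the very first one are not handled.

```go
package main

import "fmt"

// Unwrapper extends 16-bit sequence numbers to 64 bits using a cycle
// (wrap-around) counter. Hypothetical type, for illustration only.
type Unwrapper struct {
	initialized bool
	highest     uint16 // highest in-order sequence number seen so far
	cycles      uint64 // number of wrap-arounds of highest
}

// Unwrap returns the extended value of sn.
func (u *Unwrapper) Unwrap(sn uint16) uint64 {
	if !u.initialized {
		u.initialized = true
		u.highest = sn
		return uint64(sn)
	}

	diff := sn - u.highest // modulo 2^16 arithmetic
	if diff < 1<<15 {
		// In order (or a duplicate): a numerically smaller value means
		// the sequence number wrapped around.
		if sn < u.highest {
			u.cycles++
		}
		u.highest = sn
		return u.cycles<<16 | uint64(sn)
	}

	// Out of order: the packet is older than highest. If it is numerically
	// larger than highest, it belongs to the previous cycle, so the current
	// cycle counter must not be used for it.
	cycles := u.cycles
	if sn > u.highest && cycles > 0 {
		cycles--
	}
	return cycles<<16 | uint64(sn)
}

func main() {
	u := &Unwrapper{}
	for _, sn := range []uint16{65534, 65535, 1, 65533, 0, 2} {
		fmt.Printf("sn=%5d -> %d\n", sn, u.Unwrap(sn))
	}
}
```

In the sample sequence, 65533 arrives after the wrap to 1. Using the current cycle counter for it would place it a full cycle too far ahead; stepping back one cycle keeps it before 65534.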
David Zhao 5260907ffe Disable active TCP (#1726)
Active TCP was added in pion/ice v2.3.4. This is causing a couple of issues for us.

Active TCP does not make sense for an SFU. Clients are expected to be behind NAT and we should not be dialing them. Instead, LiveKit exposes a TCP port so clients can dial in.
Active TCP is causing all iOS clients to become disconnected immediately. This is impacting all versions of libwebrtc-based iOS clients (tested from M104 to M111).
2023-05-19 23:00:06 -07:00
shishirng 3de51181ec Fix setting minscore - initialized to 0 (#1725)
Signed-off-by: shishir gowda <shishir@livekit.io>
2023-05-19 11:00:32 -04:00
Raja Subramanian 0bb89575eb Fix min TS before first sender report (#1724) 2023-05-19 12:43:19 +05:30
David Zhao 93d6651d60 Improve error message when WaitUntil fails. (#1723) 2023-05-18 14:10:40 -07:00
Paul Wells 5f3ea75a1e conditionally block on signal relay close (#1722) 2023-05-18 13:53:20 -07:00
Paul Wells e03b7ef8de start signal relay sessions with the correct node (#1721)
* start signal relay sessions with the correct node

* enable signal relay in multiregion integration test
2023-05-18 12:39:02 -07:00
shishirng 2e93d386fe send min/median connection score along with avg (#1720)
* send min/median connection score along with avg
* guard against divide by zero for avg score calculation
* update median calculation

Signed-off-by: shishir gowda <shishir@livekit.io>
2023-05-18 13:50:54 -04:00
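
A small sketch of computing min, median, and average scores with a guard against dividing by zero, as the commit above describes. The `ScoreSummary` type, the `summarize` function, and the `maxMOS` value of 5 are assumptions made for this example and are not taken from the LiveKit code.

```go
package main

import (
	"fmt"
	"sort"
)

// maxMOS is an assumed upper bound for a score, used as the starting value
// when tracking the minimum.
const maxMOS = 5.0

// ScoreSummary holds the aggregate values for one reporting window.
type ScoreSummary struct {
	Min, Median, Avg float64
}

// summarize returns the aggregates for scores. The boolean is false when
// there are no samples, which avoids a division by zero for the average.
func summarize(scores []float64) (ScoreSummary, bool) {
	n := len(scores)
	if n == 0 {
		return ScoreSummary{}, false
	}

	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]float64(nil), scores...)
	sort.Float64s(sorted)

	// Start the minimum at the maximum possible score. Starting at zero
	// would leave it stuck at zero.
	lowest := maxMOS
	sum := 0.0
	for _, s := range sorted {
		if s < lowest {
			lowest = s
		}
		sum += s
	}

	median := sorted[n/2]
	if n%2 == 0 {
		median = (sorted[n/2-1] + sorted[n/2]) / 2
	}
	return ScoreSummary{Min: lowest, Median: median, Avg: sum / float64(n)}, true
}

func main() {
	if s, ok := summarize([]float64{4, 2, 3}); ok {
		fmt.Printf("min=%.1f median=%.1f avg=%.1f\n", s.Min, s.Median, s.Avg)
	}
	if _, ok := summarize(nil); !ok {
		fmt.Println("no samples, nothing to report")
	}
}
```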
Raja Subramanian 1d3faefc5e More scoring tweaks (#1719)
1. Completely removing RTT and jitter from score calculation.
   Need to do more work there.
   a. Jitter is slow moving (RFC 3550 formula is designed that way).
      But, we still get high values at times. Ideally, that should
      penalise the score, but due to jitter buffer, effect may not be
      too bad.
   b. Need to smooth RTT. It is based on receiver report and if one
      sample causes a high number, score could be penalised
      (this was being used in down track direction only). One option
      is to smooth it like the jitter formula above and try using it.
      But, for now, disabling that also.

2. When receiving fewer packets (for example DTX), reduce the
   weight of packet loss with a quadratic relationship to the ratio of
   packets received to packets expected. Previously, a square root was
   used and it was potentially
   weighting it too high. For example, if only 5 packets were received
   due to DTX instead of 50, we were still giving 30% weight
   (sqrt(0.1)). Now, it gets 1% weight. So, if one of those 5 packets
   was lost (20% packet loss ratio), it still does not get much weight
   as the number of packets is low.

3. Slightly slower decrease in score (in EWMA)

4. When using RED, increase packet loss weight thresholds to be able to
   take more loss before penalizing score.
2023-05-18 20:16:43 +05:30
David Colburn c3d6ecca6e check egress status on UpdateStream failure (#1716) 2023-05-17 16:46:22 -07:00
Benjamin Pracht f401c44a46 Move TURNServers back to livekit-server (#1715) 2023-05-17 15:24:17 -07:00