livekit

mirror of https://github.com/livekit/livekit.git synced 2026-06-09 17:12:51 +00:00

Author	SHA1	Message	Date
Raja Subramanian	faa66d1138	Make sender report pass through an option. (#2861 ) Enabled by default. Also, tweak the long term propagation delay a bit. The first propagation delay itself was too high and the long term initialized with a high value. Prevent that and also ensure large negtaives do not have an effect by using a lower bound of 0. Lower bound of 0 is okay as the main purpose is to track sustained high positive values.	2024-07-15 11:27:31 +05:30
Raja Subramanian	27f6794e77	Check sender report against media path. (#2843 ) Seeing cases (mostly across relay) of large first packet time adjustment getting ignored. From data, it looks like the first packet is extremely delayed (some times of the order of minutes) which does not make sense. Adding some checks against media path, i. e. compare RTP timestamp from sender report against expected RTP timestamp based on media path arrivals and log deviations more than 5 seconds. Another puzzling case. Trying to understand more. Also, refactoring SetRtcpSenderReportData() function as it was getting unwieldy.	2024-07-09 09:20:27 +05:30
Raja Subramanian	acbd4ea104	Handle cases of long mute/rollover of time stamp. (#2842 ) * Handle cases of long mute/rollover of time stamp. There are cases where the track is muted for long enough for timestamp roll over to happen. There are no packets in that window (typically there should be black frames (for video) or silence (for audio)). But, maybe the pause based implementation of mute is causing this. Anyhow, use time since last packet to gauge how much roll over should have happened and use that to update time stamp. There will be really edge cases where this could also fail (for e. g. packet time is affected by propagation delay, so it could theoretically happen that mute/unmute + packet reception could happen exactly around that rollover point and miscalculate, but should be rare). As this happen per packet on receive side, changing time to `UnixNano()` to make it more efficient to check this. * spelling * tests * test util * tests	2024-07-08 11:07:20 +05:30
Raja Subramanian	b4134edf40	Log rtp stats state on large jumps. (#2829 ) Forgot to include in receiver.	2024-07-01 11:51:05 +05:30
Raja Subramanian	57980fcc36	fix logging ignored key (#2826 )	2024-06-28 10:34:41 +05:30
Raja Subramanian	fa490dd510	Log rtp stats more consistently. (#2816 ) * Log rtp stats more consistently. Thank you Paul for the logging tip. Also update deps. * remove duplicate logging field * nil check	2024-06-25 14:55:42 +05:30
Raja Subramanian	6bb48dd6f1	Do not log duplicate on large negative on send side (#2815 )	2024-06-24 12:51:35 +05:30
Raja Subramanian	091eab556d	Update mediatransportutil (#2812 )	2024-06-22 11:32:28 +05:30
Raja Subramanian	d4e50b633f	Do not log warns on duplicate. (#2807 ) With RTX, some clients use very old packets for probing. Check for duplicate before logging warning about old packet/negative sequence number jump. Also, double the history so that duplicate tracking is better. Adds about 1/2 KB per RTP stream.	2024-06-20 10:52:12 +05:30
Raja Subramanian	a31f59b689	Log first time adjustment total. (#2776 ) * Log first time adjustment total. Seeing cases where the first time is 400ms+ before start time. Possible it is getting that much adjustment, but would be good to see how much total adjustment happens. * log propagation delay	2024-06-09 23:07:01 +05:30
Raja Subramanian	73852d0a13	Reduce large sequence number jump threshold for logging. (#2770 ) Seeing some unexplained large jumps on remotes across relay. Unclear if there was a jump on origin side at some point. Reducing threshold for large jump so that we can catch unexpected jumps more.	2024-06-07 12:36:02 +05:30
Raja Subramanian	7d035deef8	Clean up logging fields a bit (#2767 )	2024-06-06 23:03:21 +05:30
Raja Subramanian	9781d30611	Do not propagate RTCP if report is not processed. (#2739 )	2024-05-28 19:29:54 +05:30
Raja Subramanian	8be2005e0f	More detailed logging to understand old packets. (#2730 )	2024-05-25 18:34:55 +05:30
Raja Subramanian	96cb829b84	Log more info when adjusting start timestamp. (#2722 ) Seeing some large time stamp jump in relay down track once in a while. Logging more details on time stamp switch to learn more.	2024-05-23 13:03:26 +05:30
Raja Subramanian	a2a8810734	log the correct new propagation delay (#2694 )	2024-04-30 08:29:59 +05:30
Raja Subramanian	c8b289daa5	(Attempted) Simplify time stamp calculation on switches. (#2688 ) * Simplify time stamp calculation on switches. Trying to simplify time stamp calculation on restarts. The additional checks take effect rarely and it not worth the extra complication. Also, doing the reference time stamp in extended range. The challenge with that is when publisher migrates the extended timestamp could change post migration (i. e. post migration would not know about rollovers). To address that, maintain an offset that is updated on resync. * WIP * Revert to resume threshold * typo * clean up	2024-04-28 12:13:52 +05:30
Raja Subramanian	2ad0efc28f	Handle large jumps in RTCP sender report timestamp. (#2674 ) * Handle large jumps in RTCP sender report timestamp. Seeing cases of RTCP Sender Report spaced apart by more than half the RTP Timestamp range. Maybe a case of laptop going to sleep and waking up. Handle it using time diff from last report and calculating expected timestamp. * try go 1.22	2024-04-22 23:04:56 +05:30
Raja Subramanian	af0b0c4734	Connection quality LOST only if RTCP is also not available. (#2670 ) * Connection quality LOST only if RTCP is also not available. It is possible that sender stops all layers of video due to some constraint (CPU or bandwidth). Packet reception going dry due to that should not trigger `LOST` quality. Add last received RTCP time also to distinguish the case of real `LOST` and sender stopping traffic. Some bits to watch for - With audio, RTCP reports could be more than 5 seconds apart (5 seconds is the default interval for connection quality scorer), but audio senders usually send silence packets even when there is no input. So audio completely stopping can be considered `LOST`. - With video, have to observe if all clients continue to send RTCP even if all layers are stopped. - RTCP bandwidth is not supposed to exceed the primary stream bandwidth. libwebrtc calculates that and spaces out RTCP reports accordingly. That is the reason why audio reports are that far apart. If a video stream is encoded at a very low bit rate, it could also be sending RTCP rarely. So, there is the case of LOST being indistinguishable from sender stopping all layers. But, this should be a rare case. * typo	2024-04-21 23:35:24 +05:30
Raja Subramanian	21fbda3470	Silence some noisy debug logs (#2643 )	2024-04-11 10:58:19 +05:30
Raja Subramanian	860702e9dc	Prevent large spikes in propagation delay (#2615 ) * Prevent large spikes in propagation delay A few tweaks - Large spike in propagation delay due to congested channel results in long term estimate getting high value. Ignore outliers in long term estimate. - Introduce a new field for adjusted arrival time as adjusting the arrival time in place meant it got applied again across the relay and that caused different propagation delay on remote nodes. - Reset path change counters as long as there is any sample that is not higher than the multiple of long term. There was a case of o Sample with high value that triggered path change start. o Then some samples with high enough delta, but did not meet the criteria for increasing counter further. o Some time later, another sample met the threshold and that triggered a path change re-init. * do not adapt to large delta	2024-04-02 14:21:20 +05:30
Raja Subramanian	4c9e59dc25	Small tweaks to propagation delay adaptation. (#2607 )	2024-03-30 21:53:18 +05:30
Raja Subramanian	b5de646073	Remove redundant check. (#2605 ) * Remove redundant check. That check is already at the ouside check. * print string * space	2024-03-30 00:31:26 +05:30
Raja Subramanian	0480f99a83	Tweak adaptation to increase in propagation delay. (#2598 ) * Tweak adaptation to increase in propagation delay. A couple of issues - RTCP Sender Reports rate will vary based on underying track bitrate. (at least in theory, not all entities will do it though, for example SFU does standard rate of one per three seconds irrespective of track bit rate). So, adapt the long term estimate of propagation delay delta based on spacing of reports. - Re-init of propagation delay to adapt to path change was taking the last value before the switch. But, that one value could have been an outlier and accepting it is not great. So, adapt spike time propagation delay in a smoother fashion to ensure that all values during spike contribute to the final value. * clean up	2024-03-26 17:33:24 +05:30
Raja Subramanian	7945c01dbe	Reset sharp increase if received delta is small. (#2592 )	2024-03-21 10:25:45 +05:30
Raja Subramanian	03ada9ba76	Proper RTCP report past mute. (#2588 ) - When audio is muted, server injects silence frames which moves the time stamp forward and adjusts offset. That cannot be used against publisher side sender report. Use a pinned version. - Ignore small changes to propagation delay even while checking for sharp increase. That is spamming a lot for small changes, i.e. existing delta is 100 micro seconds or so and the new one is 300 micro seconds. Also rename to `longTerm` from `smoothed` as it is a slow varying long term estimate of propagation delay delta. And slow down that adaptation more.	2024-03-19 11:59:24 +05:30
Raja Subramanian	4e96ad2e5b	Missed clean up in the last PR (#2576 ) * Missed clean up in the last PR * Infow -> Debugw	2024-03-13 23:01:19 +05:30
Raja Subramanian	3e43f75143	Forward publisher sender report. (#2572 ) * Forward publisher sender report. Publisher side RTCP sernfer report is rebased to SFU time base and used to send sender rerport to subscriber. Will wait to merge till previous versions are out as this will require a bunch of testing. * - Add rebased report drift - update protocol dep - fix path change check, it has to check against delta of propagation delay and not propagation delay as the two side clocks could be way off.	2024-03-13 14:31:39 +05:30
Raja Subramanian	50c48ff29d	Ignore out-of-order receiver side sender reports. (#2567 )	2024-03-11 11:30:01 +05:30
Raja Subramanian	bdbc9dcbc7	Use start time stamp to calculate down stream sender report. (#2564 ) * Use start time stamp to calculate down stream sender report. With first packet time adjustment, using the first time stamp is more accurate. This still suffers if the up stream clock rate changes (happens in cases like noise suppression which is not well understood). Will be looking at pass through of sender report from publisher to subscriber. * similar log strings * avoid early sender reports * log messages * Reduce first packet adjustment threshold to 15 seconds	2024-03-10 23:18:54 +05:30
Raja Subramanian	a08b058abc	Structured logging for sender report data. (#2563 )	2024-03-10 01:29:37 +05:30
Raja Subramanian	0618cb39df	Logging time and rtp diff for easier debugging (#2548 )	2024-03-05 20:31:44 +05:30
Raja Subramanian	d0128b19cd	Reset sender reports before measuring clock skew. (#2437 )	2024-02-02 21:52:43 +05:30
cnderrauber	e1cc9d6b3c	Fix log marshal error (#2295 )	2023-12-06 00:08:48 +08:00
Raja Subramanian	c62382c76e	Clean up restart a bit. (#2247 )	2023-11-17 00:40:00 +05:30
Raja Subramanian	12a9d74acb	Do not restart on receiver side. (#2224 ) * Do not restart on receiver side. Restart with wrap back causes issues in the forwarding path as the subscriber assumes the extended type from receiver side does not restart. Restart was an attempt to include as many packets as possible, but in practice is not super useful. So, taking it out. Can clean up a bit more stuff, but want to run this first and check for any oddities. * fix test	2023-11-06 10:41:56 +05:30
Raja Subramanian	fa01297d96	Slight sequencer tweaks. (#2184 ) The buffer is not for padding packets. So, calculate adjusted sequence numbers before comparing against size. Also, it is possible that invalidated slot is accessed due to not being able to exclude padding range. This was causing time stamp reset to 0. Will remove the error log after this goes out and the condition does not show up for a few days.	2023-10-25 23:12:14 +05:30
Raja Subramanian	f622fc2490	Sample clock skew down by an order of magnitude (#2173 )	2023-10-23 16:58:02 +05:30
Raja Subramanian	3e9450c774	Log more details in warns. (#2166 ) Logging more details in warns so that we do not have to enable Infow for some logs later.	2023-10-21 11:02:34 +05:30
Raja Subramanian	b591c56aa3	Logging reduction. (#2165 ) Move some to Debugw and add sampling for a few.	2023-10-21 10:26:30 +05:30
Raja Subramanian	0d7477178e	More fine grained filtering NACKs after a key frame. (#2159 ) * More fine grained filtering NACKs after a key frame. There are applications with periodic key frame. So, a packet lost before a key frame will not be retransmitted. But, decoder could wait (jitter buffer, play out time) and cause a stutter. Idea behind disabling NACKs after key frame was another knob to throttle retransmission bit rate. But, with spaced out retransmissions and max retransmissions per sequence number, there are throttles. This would provide more throttling, but affects some applications. So, disabling filtering NACKs after a key frame. Introducing another flag to disallow layers. This would still be quite useful, i. e. under congestion the stream allocator would move the target lower. But, because of congestion, higher layer would have lost a bunch of packets. Client would NACK those. Retransmitting those higher layer packets would congest the channel more. The new flag (default enabled) would disallow higher layers retransmission. This was happening before this change also, just splitting out the flag for more control. * split flag	2023-10-20 00:44:39 +05:30
Raja Subramanian	e461e9cd79	Log skew in clock rate. (#2158 ) * Log skew in clock rate. Remember seeing sender report time stamp moving backward across mute with replaceTrack(null). Not able to reproduce it in JS sample app, but have seen it elsewhere. Logging to understand it better. Wondering if the sender report should be reset on time stamp moving backward or if we should drop backwards moving reports. * set threshold at 20%	2023-10-19 13:58:50 +05:30
Raja Subramanian	f97242c8ba	Use 32-bit time stamp to get reference time stamp on a switch. (#2153 ) * Use 32-bit time stamp to get reference time stamp on a switch. With relay and dyncast and migration, it is possible that different layers of a simulcast get out of sync in terms of extended type, i. e. layer 0 could keep running and its timestamp could have wrapped around and bumped the extended timestamp. But, another layer could start and stop. One possible solution is sending the extended timestamp across relay. But, that breaks down during migration if publisher has started afresh. Subscriber could still be using extended range. So, use 32-bit timestamp to infer reference timestamp and patch it with expected extended time stamp to derive the extended reference. * use calculated value * make it test friendly	2023-10-18 21:48:41 +05:30
Raja Subramanian	3e4cd3a161	Accept more range for first packet time adjustment. (#2150 )	2023-10-17 23:52:14 +05:30
Raja Subramanian	70b60101f4	do the proper large negative check (#2139 )	2023-10-10 11:01:49 +05:30
Raja Subramanian	31b042ddce	Log larga negative gap. (#2138 ) Seeing a large positive gap which I am not able to explain. Wondering if at some other time, a large negative is happening and the large positive is just a correction.	2023-10-09 14:32:10 +05:30
Raja Subramanian	ebe1470e46	Simplify (#2137 )	2023-10-07 13:15:51 +05:30
Raja Subramanian	9fc276481a	Increase accuracy of delay since last sender report. (#2136 ) It is in 1 / 65536 seconds units. That is about 0.015 ms. So, use microseconds to increase accuracy.	2023-10-07 12:25:47 +05:30
Raja Subramanian	c710ab901e	Log stream start at Infow (#2125 ) Seeing some sequence number adjustment where timestamp is 0. Want to check the stream start.	2023-10-05 00:12:41 +05:30
Raja Subramanian	180ad541fc	Mark packet not handled if restart is rejected (#2115 ) * Mark packet not handled if restart is rejected * log both sn and ts on restart/rollback	2023-09-30 10:24:13 +05:30

1 2

55 Commits