- New bucket API to pass in max packet size, sequence number offset,
and a sequence number size generic type
- Move OWD estimator to mediatransportutil.
* Use sync.Pool for objects in packet path.
Seeing cases of forwarding latency spikes that align with GC.
This might be a bit of overkill, but using sync.Pool for small,
short-lived objects in the packet path.
Before this, all these allocations were increasing in alloc_space heap
profile samples over time. With these changes, there is no increase
(the lines corresponding to getting from the pool do not even show up
in heap accounting when doing `list` in `pprof`).
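A minimal sketch of the pattern, assuming a small per-packet object; the type
and helper names here are illustrative, not the actual ones in the packet path:

```go
package sfu

import "sync"

// extPacket stands in for a small, short-lived object allocated per packet.
type extPacket struct {
	payload [1500]byte
	size    int
}

var extPacketPool = sync.Pool{
	New: func() any { return &extPacket{} },
}

func getExtPacket() *extPacket {
	return extPacketPool.Get().(*extPacket)
}

func putExtPacket(p *extPacket) {
	p.size = 0 // reset before returning to the pool
	extPacketPool.Put(p)
}
```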
* merge
* Paul feedback
* Forwarding latency measurement tweaks.
- make prom transmission type public
- do not measure short term values as they are not used; this potentially
saves some lock contention time in the packet path. Adding a separate
method for that.
- Change latency/jitter summary reporting to `ns` as well to match the
histogram.
* add GetShortStats
* Higher resolution forwarding latency histogram.
Was using the average latency/jitter of the last second to populate the
forwarding latency/jitter histogram. But that is too coarse, i.e. the
average value of latency/jitter is very low and those summarised samples
always end up in the lowest bucket.
A few things to address it
- record per packet forwarding latency in the histogram
- adjust histogram bins to include smaller values
- drop the jitter histogram
This is a per packet call, but the prometheus histogram is supposedly
fast/lightweight. It would be good to get better resolution histograms,
hence doing this. Please let me know if there are performance concerns.
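A sketch of the shape of the change, assuming the prometheus Go client; the
metric name and the exact bucket boundaries here are illustrative:

```go
package metrics

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

// Buckets extended downwards so per-packet latencies (often well under a
// millisecond) do not all collapse into the lowest bucket. Values are in ns.
var forwardLatency = prometheus.NewHistogram(prometheus.HistogramOpts{
	Namespace: "livekit",
	Name:      "forward_latency_ns",
	Buckets:   []float64{1e3, 5e3, 1e4, 5e4, 1e5, 5e5, 1e6, 5e6, 1e7, 5e7, 1e8},
})

func init() {
	prometheus.MustRegister(forwardLatency)
}

// RecordForwardLatency is called once per forwarded packet.
func RecordForwardLatency(d time.Duration) {
	forwardLatency.Observe(float64(d.Nanoseconds()))
}
```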
* typo
* one more typo
* Debug high forwarding latency missing.
* log highest
* log condition
* update log
* log
* log
* change log
* Track start up delay.
Digging into forwarding latency, a few observations
1. It seems to be caused by forwarding packets queued before bind. They
   would sit in the queue till bind. There are two ways it shows up
   a. Bind itself is delayed and releasing queued packets causes the
      high forwarding latency.
   b. There is a significant gap between bind and the first packet being
      pulled off the queue to be forwarded, in one example 100ms.
(a) is understandable if signalling delays things. Can drop these
packets without forwarding, or mark the packet as a queued packet and
drop it from the forwarding latency calculation. Dropping is probably
better as downstream components like egress will see a burst in these
situations.
(b) looks like go scheduling latency? Unsure.
Logging more to understand this better.
* log start
* Use buffered indicator to exclude from forwarding latency.
Buffered packets live in the queue for a while before Bind releases
them. They have high(ish) queuing latency and are not a true
representation of forwarding latency.
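A minimal sketch of the idea; the field and function names are hypothetical:

```go
package sfu

import "time"

// packetMeta stands in for whatever per-packet state the queue carries.
// buffered marks packets that were queued before Bind.
type packetMeta struct {
	arrivalTime time.Time
	buffered    bool
}

// observeForwardingLatency excludes buffered packets: their queuing delay is
// dominated by how long Bind took, not by the forwarding path itself.
func observeForwardingLatency(meta packetMeta, observe func(time.Duration)) {
	if meta.buffered {
		return
	}
	observe(time.Since(meta.arrivalTime))
}
```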
* Log write count atomic.
* Return write count from WriteRTP.
Apologies for the frequent changes on this. With relays, the down track
could write to several targets. So, use a count to have an accurate
indication of how many subscribers were written to.
Packets not being forwarded were getting included in the forwarding
stats calculation and skewing the measurement towards a smaller number.
The latency measurement does not include the batched IO of packets on
send. With 2ms batching, that will add an average latency of 1ms.
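A sketch of the shape of the API; the types and the `writers` field are
illustrative stand-ins, not the actual livekit DownTrack:

```go
package sfu

import "github.com/pion/rtp"

// rtpWriter and DownTrackSketch are hypothetical stand-ins.
type rtpWriter interface {
	WriteRTP(pkt *rtp.Packet) error
}

type DownTrackSketch struct {
	writers []rtpWriter // with relays, there can be several write targets
}

// WriteRTP returns how many targets were actually written to, so forwarding
// stats only count packets that were really forwarded.
func (d *DownTrackSketch) WriteRTP(pkt *rtp.Packet) (int, error) {
	written := 0
	var lastErr error
	for _, w := range d.writers {
		if err := w.WriteRTP(pkt); err != nil {
			lastErr = err
			continue
		}
		written++
	}
	return written, lastErr
}
```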
* Broadcast cond var on RTX write.
High forwarding latency logs all show high queuing delay so far. From
code inspection, RTX writes were not signaling the cond var. Not sure if
that is the reason, but adding a signal there for further tests.
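A sketch of the fix with illustrative names (a queue whose consumer waits on a
sync.Cond); the real send queue differs:

```go
package sfu

import "sync"

type sendQueue struct {
	mu    sync.Mutex
	cond  *sync.Cond
	items [][]byte
}

func newSendQueue() *sendQueue {
	q := &sendQueue{}
	q.cond = sync.NewCond(&q.mu)
	return q
}

// pushRTX enqueues a retransmission and wakes the consumer goroutine;
// the wake-up was previously missing on the RTX write path.
func (q *sendQueue) pushRTX(p []byte) {
	q.mu.Lock()
	q.items = append(q.items, p)
	q.cond.Broadcast()
	q.mu.Unlock()
}
```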
* Remove return values from writeRTX as they are not used
* Prevent leakage of previous codec after codec regression.
In the window between forwarder restart and determining the codec, an old
codec packet could leak through. Prevent that by doing the restart and
codec determination atomically on a codec regression.
* tidy
* use locked function
* Use the optimal allocation function for opportunistic allocation.
Allocation functions set the `lastAllocation` state also.
This might have been causing an e2e failure with v1 client on migration.
* annotate args
When SetMaxSpatialLayer() is called with target/current layers in
InvalidLayerSpatial state, opportunistically initialize the target
layer to avoid dropped packets during async stream allocator
initialization.
Guards:
- Only sets target if not congestion-throttled (isDeficientLocked)
- Does not set current layer (deferred to keyframe-based forwarder start)
- Logs at Debug level to avoid log noise
This prevents undefined layer state during manual subscription
with immediate quality upgrades (WithAutoSubscribe(false) +
SetVideoQuality(HIGH)).
When doing code changes for dynamic rid, inadvertently relied on the
ordering of qualities in track info layers to pick the highest layer if
the requested quality is higher than the available qualities.
@cnderrauber addressed it in
https://github.com/livekit/livekit/pull/3998. Just adding some more
robustness behind that by doing a full search when the requested quality
is not available.
Tested using JS SDK demo app and picking different qualities from
subscriber side with adaptive streaming turned off.
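A sketch of the fallback search; the quality representation here is an
illustrative int32 level (higher is better), not the actual track info types:

```go
package sfu

// selectQuality does a full search over the published layers instead of
// relying on their ordering in the track info. If the requested quality is
// not published, it falls back to the highest available layer below it.
func selectQuality(requested int32, available []int32) int32 {
	best := int32(-1) // invalid / not found
	for _, q := range available {
		if q == requested {
			return q
		}
		if q < requested && q > best {
			best = q // highest available layer below the request
		}
	}
	return best
}
```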
Effectively reverts https://github.com/livekit/livekit/pull/3984.
Using padding only packets for audio dummy start introduces dependencies
on other services and is not a necessary change. It would have been good
to use padding only packets for audio also from t=0. We can re-visit
this for better compatibility down the line.
Seeing cases of `ConnectionTimeout` and `ResponseTimeout`.
So, logging the destination identity in the RPC request and also logging
the ACK and response. Will pare back the logs/log level of these messages
after getting some data.
Also a small change I noticed and had sitting in my local tree to set
the previous RTP marker on a padding packet.
Even when encrypted, can set up opus as the second codec to support the
case of RED interspersed with Opus packets when the RED packet is too
big to fit in one packet.
The change here is to not go through all upstream codecs when trying to
find a match in DownTrack.Bind when the source is encrypted. When
encrypted, the down track codec should match the primary upstream codec,
i.e. the codec at index 0.
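A simplified sketch of that match, assuming pion codec parameter types; the
function and the mime-type-only comparison are illustrative, not the actual
Bind logic:

```go
package sfu

import (
	"strings"

	"github.com/pion/webrtc/v4"
)

// matchCodec skips the full scan when the source is encrypted: only the
// primary upstream codec (index 0) can be a valid match in that case.
func matchCodec(encrypted bool, upstream []webrtc.RTPCodecParameters, want webrtc.RTPCodecParameters) (webrtc.RTPCodecParameters, bool) {
	if encrypted {
		if len(upstream) > 0 && strings.EqualFold(upstream[0].MimeType, want.MimeType) {
			return upstream[0], true
		}
		return webrtc.RTPCodecParameters{}, false
	}
	for _, c := range upstream {
		if strings.EqualFold(c.MimeType, want.MimeType) {
			return c, true
		}
	}
	return webrtc.RTPCodecParameters{}, false
}
```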
It is possible that a subscriber joins when a publisher has reconnected
and has received a flood of retransmitted packets due to NACKing the
gap caused by the publisher reconnecting. Starting on that spurt means
the subscriber gets a burst of unpaced packets that could lead to issues
with calculating render time (especially obvious in cases like egress).
publisher peer connection.
While cleaning up during the single peer connection changes, the handler
was unintentionally removed.
Also, another small change to log first packet time adjustment after
increment.
* Adjust for hold time when forwarding RTCP report.
When passing through an RTCP sender report, holding it for some time
before sending means the remote receiver could see a varying amount of
propagation delay if the remote uses something like local_clock -
ntp_sender_report_time and adapts to it.
Ideally, the SFU should just forward the RTCP Sender Report, but the
current pull model to group RTCP sender reports makes that a bigger
change. So, adjust it by the hold time.
Also add an initial condition for the one-way-delay estimator which can
init with a smaller value of latency if the first sample used to measure
one-way-delay itself experienced higher delay than the prevailing
conditions.
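A sketch of the hold-time adjustment, assuming the standard 64-bit NTP
timestamp layout in RTCP sender reports (seconds in the upper 32 bits,
fraction in the lower 32); the function name is illustrative:

```go
package sfu

import "time"

// adjustNTPForHold advances a 64-bit NTP timestamp by the time the sender
// report was held before being forwarded, so the remote's
// local_clock - sr_ntp_time propagation estimate is not inflated by the hold.
func adjustNTPForHold(ntp uint64, hold time.Duration) uint64 {
	// Convert the hold duration to NTP fixed point: seconds in the upper
	// 32 bits, fractional seconds in the lower 32 bits.
	sec := uint64(hold / time.Second)
	frac := uint64(hold%time.Second) << 32 / uint64(time.Second)
	return ntp + sec<<32 + frac
}
```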
* variable name
* log as duration
* Adjust stream allocator ping interval based on state.
In steady state, it pings every 15 seconds.
While deficient, to be able to react to probes faster, it pings at a
100ms interval.
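A sketch of the interval selection; the constants mirror the description above,
but the actual state checks in the stream allocator differ:

```go
package streamallocator

import "time"

// pingInterval picks the stream allocator ping interval based on whether the
// allocation is currently deficient (congestion-throttled).
func pingInterval(deficient bool) time.Duration {
	if deficient {
		// react to probe results quickly while deficient
		return 100 * time.Millisecond
	}
	// steady state
	return 15 * time.Second
}
```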
* clean up
* log ops queue not able to wake up
* Tweak thresholds for logging high forwarding latency/jitter.
The previous attempt showed skewed jitter (i.e. more than 10x latency),
but no large latency.
So, reducing the latency threshold used to declare high latency.
Also keeping track of the lowest/highest per reporting window and
logging those along with short term and long term measurements.
NOTE: previously short term and long term were separate calls with locks
acquired. Now, it is all in one lock. So, it does increase the lock
duration a bit, but hopefully not by too much as the welford merge for
the short term would go over about 20 samples (at a 50 ms sampling
interval and 1 s reporting window).
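For reference, a minimal Welford accumulator with the parallel merge of the
kind described; this is illustrative, not the exact struct used:

```go
package stats

import "math"

// welford tracks a running mean and variance; Merge uses the Chan et al.
// parallel combination formula.
type welford struct {
	n    float64
	mean float64
	m2   float64
}

// Update folds in a single sample.
func (w *welford) Update(x float64) {
	w.n++
	d := x - w.mean
	w.mean += d / w.n
	w.m2 += d * (x - w.mean)
}

// Merge folds another accumulator (e.g. a short-term window of ~20 samples
// at a 50ms sampling interval over a 1s reporting window) into this one.
func (w *welford) Merge(o welford) {
	if o.n == 0 {
		return
	}
	n := w.n + o.n
	d := o.mean - w.mean
	w.mean += d * o.n / n
	w.m2 += o.m2 + d*d*w.n*o.n/n
	w.n = n
}

// StdDev is the jitter (standard deviation) of the accumulated samples.
func (w *welford) StdDev() float64 {
	if w.n < 2 {
		return 0
	}
	return math.Sqrt(w.m2 / w.n)
}
```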
* revert skew factor
* Log some information around high forwarding latency.
Latency is not 0 after switching to microseconds resolution.
But, still seeing high jitter. Logging a bit more to understand under
what conditions it happens.
More notes inline.
* compact
Latency is always 0, but jitter is high.
Not sure how that happens as latency is the welford mean and jitter is
the welford standard deviation. Feels like some mis-labeling.
Anyhow, switching to microsecond units to get better resolution.
* Add debugging for DD frame number wrap around.
On a DD parser restart, the extended highest sequence number does not
seem to be updated. Adding some debug to understand it better.
* more logs
* log incoming sequence number and frame number
* Set publisher codec preferences after setting remote description
Munging SDP prior to setting the remote description was becoming
problematic in single peer connection mode. In that mode, it is possible
that a subscribe track m-section is added which sets the fmtp of H.265
to a value that is different from the one the client uses when it
publishes. That gets locked in as the negotiated codecs when pion
processes the remote description. Later, when the client publishes
H.265, the munged offer sent to SetRemoteDescription produces only a
partial match for H.265 due to the different fmtp line, and the codec
gets put at the end of the list. So, the answer does not enforce the
preferred codec. Changing pion to put partial matches up front is more
risky given other projects. So, switch setting codec preferences to
after the remote description is set and operate directly on the
transceiver, which is a better place to make these changes without
munging SDP.
This fixes the case of
- Firefox joins first
- Chrome, preferring H.265, joins next. This causes a subscribe track
  m-section (for Firefox's tracks) to be created first. So, the
  preferred codec munging was not working. Works after this change.
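The shape of the change with pion, as a hedged sketch; error handling is
minimal, the function name is hypothetical, and the real code only touches the
relevant publisher transceiver(s) and builds the preference list itself:

```go
package rtc

import "github.com/pion/webrtc/v4"

// setPublisherCodecPreferences applies codec preferences on the transceivers
// after the remote description has been processed, instead of munging the SDP
// before SetRemoteDescription.
func setPublisherCodecPreferences(pc *webrtc.PeerConnection, offer webrtc.SessionDescription, preferred []webrtc.RTPCodecParameters) error {
	if err := pc.SetRemoteDescription(offer); err != nil {
		return err
	}
	for _, tr := range pc.GetTransceivers() {
		// In the real code, only the publisher's transceiver for the
		// relevant mid/track would be touched.
		if err := tr.SetCodecPreferences(preferred); err != nil {
			return err
		}
	}
	return nil
}
```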
* clean up
* mage generate
* test
* clean up
- Move downTrack instantiation to SubscribedTrack as it should own that
DownTrack. Still more to do here as `DownTrack` is fetched from
`SubscribedTrack` in a few places and used. Would like to avoid that,
but doing this initially.
- Use an interface from sfu.DownTrack and replace a bunch of callbacks.
SubscribedTrack is the implementation of DownTrackListener.