When a participant is closing, RTCP readers should be cleaned up from
the factory even if the participant is expected to resume. The resumed
participant will get a new participant session and peer connection(s),
and everything will be set up again.
- New bucket API to pass in max packet size, sequence number offset,
and sequence number size as a generic type
- Move OWD estimator to mediatransportutil.
After adding more fields in
https://github.com/livekit/livekit/pull/4105/files, it was not even
logging. Access to one of the added fields must have ended up waiting on
a lock and blocked.
Unfortunately, the deadlock fix in https://github.com/pion/ice/pull/840
did not address the peer connection close hang.
Splitting the logs so that the base log still happens. The ordering is
based on reading the code and guessing what could still log, to see if
we get more of the logs and learn more about the state and which lock
ends up being the first one to block.
* Add checks for participant and sub-components close.
Looks like there might be some memory leak with participant sessions not
getting closed properly. Adding checks (to be cleaned up later) to see
if there is a consistent place where things might hang.
* init with right type
* Remove unnecessary goroutine, thank you @milos-lk
* clean up
Correct triple-d spelling of "address" field in transport logs.
I’m not sure whether this was intentional, but I noticed it
while creating Grafana queries and filters. This matters because
anyone filtering logs using the correct spelling may
unintentionally miss relevant data, leading to incomplete or
misleading analysis.
* Use sync.Pool for objects in packet path.
Seeing cases of forwarding latency spikes that align with GC.
This might be a bit overkill, but using sync.Pool for small,
short-lived objects in the packet path.
Before this, all of these were increasing in alloc_space heap profile
samples over time. With these changes, there is no increase (the lines
corresponding to getting from the pool do not even show up in heap
accounting when doing `list` in `pprof`).
* merge
* Paul feedback
* Forwarding latency measurement tweaks.
- prom transmission type public
- do not measure short-term values, as they are not used; this
potentially saves some lock contention time in the packet path. Adding
a separate method for that.
- Change latency/jitter summary reporting to `ns` also to match the
histogram.
* add GetShortStats
* Higher resolution forwarding latency histogram.
Was using the average latency/jitter of the last second to populate the
forwarding latency/jitter histogram. But it is too coarse, i.e., the
average value of latency/jitter is very low and those summarised
samples always end up in the lowest bucket.
A few things to address it
- record per packet forwarding latency in histogram
- adjust histogram bins to include smaller values
- Drop jitter histogram
This is a per-packet call, but prometheus histograms are supposedly
fast/lightweight. It would be good to get better resolution histograms,
hence doing this. Please let me know if there are performance concerns.
* typo
* one more typo
* Debug high forwarding latency missing.
* log highest
* log condition
* update log
* log
* log
* change log
* Track start up delay.
Digging into forwarding latency, there are a few things
1. Seems to be caused due to forwarding packets queued before bind. They
would be in the queue till bind. There are two ways it is showing up
a. Bind itself is delayed and releasing queued packets causes the
high forwarding latency.
b. There is a significant gap between bind and first packet being
pulled off the queue to be forwarded, in one example 100ms.
(a) is understandable if the signalling delays things. We can drop
these packets without forwarding, or indicate in the packet that it is
a queued packet and drop it from the forwarding latency calculation.
Dropping is probably better, as downstream components like egress will
see a burst in these situations.
(b) looks like go scheduling latency? Unsure.
Logging more to understand this better.
* log start
* Use buffered indicator to exclude from forwarding latency.
Buffered packets live in the queue for a while before Bind releases
them. They have high(ish) queuing latency and are not a true
representation of forwarding latency.
* Log write count atomic.
* Return write count from WriteRTP.
Apologies for the frequent changes on this. With relays, the down track
could write to several targets. So, use a count to have an accurate
indication of how many subscribers were written to.
Packets not being forwarded were getting included in forwarding stats
calculation and skewing the measurement towards a smaller number.
The latency measurement does not include the batch IO of packets on
send. With a 2ms batching, that will add an average latency of 1ms.
* Add prom histogram for forwarding latency and jitter.
Using short term stats for histogram.
An example setting is
1s - short term
1m - long term
Using the 1s (short term) data for histogram. In that 1 second, all
packet forwarding latencies are averaged for latency and std. dev. of
the collection is used as jitter.
* try different staticcheck
* Broadcast cond var on RTX write.
High forwarding latency logs all show high queuing delay so far. From
code inspection, RTX writes were not signaling the cond var. Not sure if
that is the reason, but adding a signal there for further tests.
* Remove return values from writeRTX as they are not used
* if RingingTimeout is provided, deadline should be set to that timeout.
This is because the SIP bridge will not return until RingingTimeout
which may be longer than the 30 second default deadline.
* handle Deadline being "before" timeout.
* Prevent leakage of previous codec after codec regression.
In the window between forwarder restart and determining codec, the old
codec packet could leak through. Prevent that by doing the restart and
codec determination atomically on a codec regression.
* tidy
* use locked function
* Use the optimal allocation function for opportunistic allocation.
Allocation functions also set the `lastAllocation` state.
This might have been causing an e2e failure with a v1 client on migration.
* annotate args