livekit

mirror of https://github.com/livekit/livekit.git synced 2026-04-25 15:32:09 +00:00

Author	SHA1	Message	Date
Raja Subramanian	69aa94797b	Some drive-by clean up (#4452 )	2026-04-15 12:23:33 +05:30
Paul Wells	c6e6c0215f	add debug metric for tracking references (#4134 )	2025-12-07 11:39:21 -08:00
Raja Subramanian	7f10e18bac	Record join/publish/subscribe cancellations. (#4102 ) To get better picture of success/failure rate.	2025-11-25 14:06:02 +05:30
Raja Subramanian	f8b994d491	Forwarding latency measurement tweaks. (#4080 ) * Forwarding latency measurement tweaks. - prom transmission type public - do not measure short term values as it is not used and saves some lock contention time in packet path potentially. Adding a separate method for that. - Change latency/jitter summary reporting to `ns` also to match the histogram. * add GetShortStats	2025-11-13 18:39:49 +05:30
Raja Subramanian	4ce07bedeb	Higher resolution forwarding latency histogram. (#4067 ) * Higher resolution forwarding latency histogram. Was using the average latency/jitter of last second to populate forwarding latency/jitter histogram. But, it is too coarse, i.e. the average value of latency/jitter is very low and those summarised samples end up in the lowest bucket always. A few things to address it - record per packet forwarding latency in histogram - adjust histogram bins to include smaller values - Drop jitter histogram This is a per packet call, but prometheus histogram is supposedly fast/light weight. Would be good to get better resolution histograms. Hence doing this. Please let me know if there are performance concerns. * typo * one more typo	2025-11-09 17:29:40 +05:30
Raja Subramanian	9d5c351d36	Fix prom units for forwarding latency/jitter. (#4045 )	2025-11-02 14:38:25 +05:30
Raja Subramanian	e183657cff	Add prom histogram for forwarding latency and jitter. (#4044 ) * Add prom histogram for forwarding latency and jitter. Using short term stats for histogram. An example setting is 1s - short term 1m - long term Using the 1s (short term) data for histogram. In that 1 second, all packet forwarding latencies are averaged for latency and std. dev. of the collection is used as jitter. * try different staticcheck	2025-11-01 23:25:03 +05:30
Raja Subramanian	ca0d5ee972	Count request/response packets on both client and server side. (#4001 ) Currently, the signal requests are counted on media side and signal responses are counted on controller side. This does not provide the granularity to check how many response messages each media node is sending. Seeing some cases where track subscriptions are slow under load. This would be good to see if the media node is doing a lot of signal response messages.	2025-10-14 16:58:36 +05:30
Raja Subramanian	10103449c5	Add country label to edge prom stats. (#3816 ) * Add country label to edge prom stats. * data channel country stats * test * pub/sub time country	2025-07-24 13:23:05 +05:30
Raja Subramanian	fc867c5b8e	Webhook prom stats (#3697 )	2025-06-04 14:31:28 -07:00
Raja Subramanian	1c8307c72c	Use cgroup for memstats. (#3573 ) * Use cgroup for memstats. * deps	2025-04-05 11:54:36 +05:30
Raja Subramanian	3238ab8d77	Calculate rates for memory used and total. (#3570 ) Calculating rate for total does seem odd, but keeping it consistent/lined up with used memory calculation.	2025-04-02 10:23:38 +05:30
Raja Subramanian	8cc17f8f8b	Rework node stats a bit. (#3555 ) * Rework node stats a bit. Related protocol PR - https://github.com/livekit/protocol/pull/1023 - Make a config for node stats measurements. Wanted to put the config in `routing` package, but a circular dependency forced me to put it in config.go - Make rate calculations explicit, i.e. requested via config. Previously, it had some odd checks to decide when to calculate rate and it would have been calculating over different windows. - Report signal/data channel bytes every 5 seconds to stats collection module. Previously, it was doing it every 30 seconds and that meant some windows could have had a large spike. NOTE: Still need to think about this for load calculations as a large number of participants leaving could flush in a small window and that could report a large spike in bytes/packets. Maybe need to ignore signal bytes for load calculation? * deps * use default node stats config if given config is nil * split out node stats into a struct for re-use * update config	2025-03-27 12:42:19 +05:30
Paul Wells	3167266495	add datapacket stream metrics (#3450 ) * add datapacket stream metrics * normalize mime type	2025-02-19 22:28:10 -08:00
Raja Subramanian	3b0077f2fe	Log connection quality changes. (#3311 ) Also remove the connection quality drop prom as it is unused and also adds state/complexity.	2025-01-07 10:58:31 +05:30
Raja Subramanian	86383b2271	De-centralize some configs to where they are used. (#3162 ) * De-centralize some configs to where they are used. And make default variables. Renaming a bit, but these are all internal config and have not been added to documented config. * Keep documented config as is. * test * typo	2024-11-08 12:47:30 +05:30
cnderrauber	cf59267631	Add counter for pub&sub time metrics (#3084 ) * Add counter for pub&sub time metrics The pub&sub time shows large values in migration-related cases such as muted/disabled migration: the subscription time depends on when the publisher unmutes the track (sending RTP packets after migration). Add a counter to distinguish these, since we can't control the time in such cases and the first subscription attempt is also more meaningful than those cases. * Add info log for high publish delay	2024-10-11 12:07:24 +08:00
cnderrauber	978db00034	Add sdk, participant_kind to pub sub metrics (#3023 ) * exclude go client from track publication metric * add sdk,participant_kind labels * fix test	2024-09-19 10:42:47 +08:00
Raja Subramanian	787b8450e9	Record out-of-packet count/rate in prom. (#2980 ) * Record out-of-packet count/rate in prom. Adding a field to AnalyticsStream to make this easier to report. Let me know if adding to AnalyticsStream is not ok. Will set up a protocol PR if it is okay. * deps	2024-09-07 00:19:54 +05:30
cnderrauber	947e8f5909	Speed up track publication (#2952 ) * speed up track publication Add metrics for track publication and subscription Return EnabledCodecs in JoinResponse so client can choose codec without server side codec fallback Cache remote webrtc track without AddTrackRequest to let client send publisher offer before AddTrackRequest response * go mod * clean code	2024-08-23 18:38:32 +08:00
Lukas Herman	8a229fda9d	add participant session duration metric (#2801 )	2024-06-17 17:52:08 -04:00
cnderrauber	7ed1284b96	report average forward metrics (#2737 ) * report average forward metrics * unused parameter	2024-05-28 17:03:18 +08:00
cnderrauber	2288e402ac	register forward metrics (#2735 )	2024-05-27 15:47:01 +08:00
Paul Wells	38470f378b	add message bytes metric (#2731 )	2024-05-26 14:01:13 -07:00
cnderrauber	e6aa36fdd6	Add forward stats (#2725 ) * Add forward metrics * ignore packets that were not forwarded * rename	2024-05-24 17:43:28 +08:00
Mathew Kamkar	10c8582a6b	get cpu stats from cgroup, remove env (#2636 ) * get cpu stats from cgroup, remove env * undo rand seed removal * tests	2024-04-08 21:15:17 -07:00
Paul Wells	e5b8e25064	use shared psrpc utils (#2506 ) * use shared psrpc utils * fix * deps	2024-02-24 00:38:49 -08:00
Mathew Kamkar	7508560fde	larger buckets for jitter prometheus histogram (#2468 )	2024-02-09 12:09:51 -08:00
Paul Wells	c726cbf2ba	increase max session start time bin size (#2380 )	2024-01-12 03:49:23 -08:00
Paul Wells	2fe2a9c9f2	add session start time metric (#2377 )	2024-01-11 23:23:51 -08:00
Paul Wells	f4a984d446	preallocate prometheus packet counters (#1942 )	2023-08-08 01:06:14 -07:00
David Zhao	981fb7cac7	Adding license notices (#1913 ) * Adding license notices * remove from config	2023-07-27 16:43:19 -07:00
David Zhao	7e5a7ae79f	Fixed windows build (#1768 )	2023-06-04 00:17:25 -07:00
David Zhao	956735ae05	Fix node stats updates on Windows (#1748 ) Because we aren't able to get CPU count/load info on Windows, they are stubbed out to return placeholders. This restores compatibility to run on Windows.	2023-05-29 10:53:08 -07:00
Raja Subramanian	a085afc6ee	Send quality stats to prometheus. (#1708 )	2023-05-12 09:44:03 +05:30
David Colburn	ab6c994db4	update protocol/psrpc (#1643 ) * update protocol/psrpc * metadata references	2023-04-21 12:43:20 -07:00
Paul Wells	6636e37664	add prometheus psrpc metrics observer (#1571 ) * add prometheus psrpc metrics observer * record rpc error counts * update psrpc * update protocol	2023-04-05 03:50:43 -07:00
Dan McFaul	1848a21eda	add configurable environment value (#1421 ) * add configurable prometheus env label * Update pkg/config/config.go Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com> * Update cmd/server/main.go Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com> * Update config-sample.yaml Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com> * set config.Environment value to dev when in dev mode * be more precise for config-sample --------- Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>	2023-02-15 14:41:44 -07:00
Mathew Kamkar	937256d89e	don't error when get tc stats fails (#1386 )	2023-02-10 10:05:45 -08:00
Paul Wells	52fd0a641b	adjust jitter histogram buckets (#1347 ) * adjust jitter histogram buckets * typo	2023-01-29 22:45:32 -08:00
David Zhao	2fa46e2df4	Retry initial connection attempt should it fail (#1335 ) Sometimes the initial selected node could fail. In that case, we'll give it a few more attempts to locate a media node for the session instead of failing it after the first try.	2023-01-25 22:59:57 -08:00
David Zhao	cd6b8b80b9	feat: SubscriptionManager to consolidate subscription handling (#1317 ) Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are: improve subscription resilience by separating desired state and current state reduce complexity of synchronous processing better detect failures with the ability to trigger full reconnect	2023-01-24 23:06:16 -08:00
Dan McFaul	9e3ca1e989	adding rtc_init stat (#1316 ) * adding rtc_initiated stat * clean up signal and rtc init/connected * update naming and break out stats update funcs * update protocol dependency	2023-01-23 12:49:15 -07:00
Paul Wells	1ef7c46fd7	publish stream stats to prometheus (#1313 ) * add prometheus stats for rtt/jitter/packet loss * add track source to metrics * better packet loss bins * add track type to metrics * remove source from AnalyticsStat * regenerate telemetry service fake * compute loss from per stream packet count	2023-01-19 19:37:15 -08:00
Benjamin Pracht	edc39da0b1	Add TwirpRequestStatusReporter twirp server hook to count requests (#1309 )	2023-01-18 11:53:20 -08:00
Dan McFaul	4d6f0cd0f7	Stats collect v2 (#1291 ) * initial commit * add correct label * clean up * more cleanup on adding stats * cleanup * move things to pub and sub monitors, ensure stats are correctly updated * fix merge conflict * Fix panic on MacOS (#1296) * fixing last feedback Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>	2023-01-11 14:49:50 -07:00
Raja Subramanian	1db218a5b1	Fix panic on MacOS (#1296 )	2023-01-11 10:08:56 +05:30
Mathew Kamkar	7c970da974	add memory used and total to node stats (#1293 ) * add memory used and total to node stats * raja review: consistency * update protocol	2023-01-10 12:32:04 -08:00
Mathew Kamkar	caae389717	node type prometheus metric labels (#1197 )	2022-11-29 20:36:35 -08:00
Raja Subramanian	1e8cc0dc76	Consolidate getMemoryStats (#1122 ) * Consolidate getMemoryStats * Avoid divide-by-0	2022-10-26 09:16:39 +05:30

78 Commits