
70 Commits

Author SHA1 Message Date
Raja Subramanian
10103449c5 Add country label to edge prom stats. (#3816)
* Add country label to edge prom stats.

* data channel country stats

* test

* pub/sub time country
2025-07-24 13:23:05 +05:30
Raja Subramanian
fc867c5b8e Webhook prom stats (#3697) 2025-06-04 14:31:28 -07:00
Raja Subramanian
1c8307c72c Use cgroup for memstats. (#3573)
* Use cgroup for memstats.

* deps
2025-04-05 11:54:36 +05:30
Raja Subramanian
3238ab8d77 Calculate rates for memory used and total. (#3570)
Calculating rate for total does seem odd, but keeping it consistent/lined
up with used memory calculation.
2025-04-02 10:23:38 +05:30
Raja Subramanian
8cc17f8f8b Rework node stats a bit. (#3555)
* Rework node stats a bit.

Related protocol PR - https://github.com/livekit/protocol/pull/1023

- Make a config for node stats measurements. Wanted to put the config in
  `routing` package, but a circular dependency forced me to put it in
  config.go
- Make rate calculations explicit, i.e. requested via config.
  Previously, it had some odd checks to decide when to calculate rate
  and it would have been calculating over different windows.
- Report signal/data channel bytes every 5 seconds to stats collection
  module. Previously, it was doing it every 30 seconds and that meant
  some windows could have had a large spike
  NOTE: Still need to think about this for load calculations as a large
  number of participants leaving could flush in a small window and that
  could report a large spike in bytes/packets. Maybe need to ignore
  signal bytes for load calculation?

* deps

* use default node stats config if given config is nil

* split out node stats into a struct for re-use

* update config
2025-03-27 12:42:19 +05:30
Paul Wells
3167266495 add datapacket stream metrics (#3450)
* add datapacket stream metrics

* normalize mime type
2025-02-19 22:28:10 -08:00
Raja Subramanian
3b0077f2fe Log connection quality changes. (#3311)
Also remove the connection quality drop prom as it is unused and also
adds state/complexity.
2025-01-07 10:58:31 +05:30
Raja Subramanian
86383b2271 De-centralize some configs to where they are used. (#3162)
* De-centralize some configs to where they are used.

And make default variables.

Renaming a bit, but these are all internal config and have not been
added to documented config.

* Keep documented config as is.

* test

* typo
2024-11-08 12:47:30 +05:30
cnderrauber
cf59267631 Add counter for pub&sub time metrics (#3084)
* Add counter for pub&sub time metrics

The pub&sub time shows large values in migration-related cases such as
muted/disabled migration, where the subscription time depends on
when the publisher unmutes the track (sending RTP packets
after migration). Add a counter to distinguish these cases, since we
can't control the time in them and the first subscription
attempt is more meaningful.

* Add info log for high publish delay
2024-10-11 12:07:24 +08:00
cnderrauber
978db00034 Add sdk, participant_kind to pub sub metrics (#3023)
* exclude go client from track publication metric

* add sdk,participant_kind labels

* fix test
2024-09-19 10:42:47 +08:00
Raja Subramanian
787b8450e9 Record out-of-packet count/rate in prom. (#2980)
* Record out-of-packet count/rate in prom.

Adding a field to AnalyticsStream to make this easier to report.
Let me know if adding to AnalyticsStream is not ok.

Will set up a protocol PR if it is okay.

* deps
2024-09-07 00:19:54 +05:30
cnderrauber
947e8f5909 Speed up track publication (#2952)
* speed up track publication

Add metrics for track publication and subscription

Return EnabledCodecs in JoinResponse so client can
choose codec without server side codec fallback

Cache remote webrtc track without AddTrackRequest to
let client send publisher offer before AddTrackRequest response

* go mod

* clean code
2024-08-23 18:38:32 +08:00
Lukas Herman
8a229fda9d add participant session duration metric (#2801) 2024-06-17 17:52:08 -04:00
cnderrauber
7ed1284b96 report average forward metrics (#2737)
* report average forward metrics

* unused parameter
2024-05-28 17:03:18 +08:00
cnderrauber
2288e402ac register forward metrics (#2735) 2024-05-27 15:47:01 +08:00
Paul Wells
38470f378b add message bytes metric (#2731) 2024-05-26 14:01:13 -07:00
cnderrauber
e6aa36fdd6 Add forward stats (#2725)
* Add forward metrics

* ignore packets that were not forwarded

* rename
2024-05-24 17:43:28 +08:00
Mathew Kamkar
10c8582a6b get cpu stats from cgroup, remove env (#2636)
* get cpu stats from cgroup, remove env

* undo rand seed removal

* tests
2024-04-08 21:15:17 -07:00
Paul Wells
e5b8e25064 use shared psrpc utils (#2506)
* use shared psrpc utils

* fix

* deps
2024-02-24 00:38:49 -08:00
Mathew Kamkar
7508560fde larger buckets for jitter prometheus histogram (#2468) 2024-02-09 12:09:51 -08:00
Paul Wells
c726cbf2ba increase max session start time bin size (#2380) 2024-01-12 03:49:23 -08:00
Paul Wells
2fe2a9c9f2 add session start time metric (#2377) 2024-01-11 23:23:51 -08:00
Paul Wells
f4a984d446 preallocate prometheus packet counters (#1942) 2023-08-08 01:06:14 -07:00
David Zhao
981fb7cac7 Adding license notices (#1913)
* Adding license notices

* remove from config
2023-07-27 16:43:19 -07:00
David Zhao
7e5a7ae79f Fixed windows build (#1768) 2023-06-04 00:17:25 -07:00
David Zhao
956735ae05 Fix node stats updates on Windows (#1748)
Because we aren't able to get CPU count/load info on Windows, they are
stubbed out to return placeholders. This restores compatibility to run
on Windows.
2023-05-29 10:53:08 -07:00
Raja Subramanian
a085afc6ee Send quality stats to prometheus. (#1708) 2023-05-12 09:44:03 +05:30
David Colburn
ab6c994db4 update protocol/psrpc (#1643)
* update protocol/psrpc

* metadata references
2023-04-21 12:43:20 -07:00
Paul Wells
6636e37664 add prometheus psrpc metrics observer (#1571)
* add prometheus psrpc metrics observer

* record rpc error counts

* update psrpc

* update protocol
2023-04-05 03:50:43 -07:00
Dan McFaul
1848a21eda add configurable environment value (#1421)
* add configurable prometheus env label

* Update pkg/config/config.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update cmd/server/main.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update config-sample.yaml

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* set config.Environment value to dev when in dev mode

* be more precise for config-sample

---------

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>
2023-02-15 14:41:44 -07:00
Mathew Kamkar
937256d89e don't error when get tc stats fails (#1386) 2023-02-10 10:05:45 -08:00
Paul Wells
52fd0a641b adjust jitter histogram buckets (#1347)
* adjust jitter histogram buckets

* typo
2023-01-29 22:45:32 -08:00
David Zhao
2fa46e2df4 Retry initial connection attempt should it fail (#1335)
Sometimes the initial selected node could fail. In that case, we'll give it a few more attempts to locate a media node for the session instead of failing it after the first try.
2023-01-25 22:59:57 -08:00
David Zhao
cd6b8b80b9 feat: SubscriptionManager to consolidate subscription handling (#1317)
Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are:

- improve subscription resilience by separating desired state and current state
- reduce complexity of synchronous processing
- better detect failures with the ability to trigger full reconnect
2023-01-24 23:06:16 -08:00
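
The reconciler pattern named in cd6b8b80b9 (#1317) separates desired state from current state and computes the actions needed to converge them. The sketch below is a generic illustration under that assumption, not the SubscriptionManager implementation: `reconcile` and the track IDs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the desired and current subscription sets and returns
// the track IDs to subscribe to and the track IDs to unsubscribe from.
// Results are sorted so the output is deterministic.
func reconcile(desired, current map[string]struct{}) (toAdd, toRemove []string) {
	for id := range desired {
		if _, ok := current[id]; !ok {
			toAdd = append(toAdd, id)
		}
	}
	for id := range current {
		if _, ok := desired[id]; !ok {
			toRemove = append(toRemove, id)
		}
	}
	sort.Strings(toAdd)
	sort.Strings(toRemove)
	return toAdd, toRemove
}

func main() {
	desired := map[string]struct{}{"camera": {}, "mic": {}}
	current := map[string]struct{}{"mic": {}, "screen": {}}
	add, remove := reconcile(desired, current)
	fmt.Println(add, remove) // prints [camera] [screen]
}
```

A real reconciler would run this comparison repeatedly and retry failed actions, which is what allows failures to be detected and escalated instead of being lost in a one-shot synchronous call.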
Dan McFaul
9e3ca1e989 adding rtc_init stat (#1316)
* adding rtc_initiated stat

* clean up signal and rtc init/connected

* update naming and break out stats update funcs

* update protocol dependency
2023-01-23 12:49:15 -07:00
Paul Wells
1ef7c46fd7 publish stream stats to prometheus (#1313)
* add prometheus stats for rtt/jitter/packet loss

* add track source to metrics

* better packet loss bins

* add track type to metrics

* remove source from AnalyticsStat

* regenerate telemetry service fake

* compute loss from per stream packet count
2023-01-19 19:37:15 -08:00
Benjamin Pracht
edc39da0b1 Add TwirpRequestStatusReporter twirp server hook to count requests (#1309) 2023-01-18 11:53:20 -08:00
Dan McFaul
4d6f0cd0f7 Stats collect v2 (#1291)
* initial commit

* add correct label

* clean up

* more cleanup on adding stats

* cleanup

* move things to pub and sub monitors, ensure stats are correctly updated

* fix merge conflict

* Fix panic on MacOS (#1296)

* fixing last feedback

Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
2023-01-11 14:49:50 -07:00
Raja Subramanian
1db218a5b1 Fix panic on MacOS (#1296) 2023-01-11 10:08:56 +05:30
Mathew Kamkar
7c970da974 add memory used and total to node stats (#1293)
* add memory used and total to node stats

* raja review: consistency

* update protocol
2023-01-10 12:32:04 -08:00
Mathew Kamkar
caae389717 node type prometheus metric labels (#1197) 2022-11-29 20:36:35 -08:00
Raja Subramanian
1e8cc0dc76 Consolidate getMemoryStats (#1122)
* Consolidate getMemoryStats

* Avoid divide-by-0
2022-10-26 09:16:39 +05:30
Raja Subramanian
96a058b503 Populate memory load in node stats. (#1121) 2022-10-25 21:31:23 +05:30
cnderrauber
c401ca58af turn packet and bytes stats used for telemetry and load control (#969)
* stats for turn

* add connections stats

* stats for standalone turn server only

* wire update
2022-08-31 11:00:27 +08:00
Mathew Kamkar
767d660809 Use LocalNode ID in Prometheus metrics (#959) 2022-08-25 22:16:20 -07:00
Mathew Kamkar
e0676132d4 Packet stats from TC (#832)
* system level packet stats from tc

* drop percent

* test fix

* formatting

* formatting/wording

* prometheus metrics

* update livekit protocol go module
2022-07-15 10:41:40 -07:00
David Zhao
b316698409 Release with GoReleaser. Allow start without key configuration (#788) 2022-06-26 12:27:43 -07:00
Raja Subramanian
f19815754c Do not re-compute average on real time metric change (#743) 2022-05-31 10:33:17 +05:30
Raja Subramanian
508aa471a9 Track participant join total + rate in node stats (#741)
* Track participant join total + rate in node stats

* update protocol
2022-05-30 15:58:30 +05:30
Raja Subramanian
33032f6c4b Fix some test races and other things found with go test -race (#711) 2022-05-24 10:16:36 +05:30