Commit Graph

136 Commits

Author SHA1 Message Date
Benjamin Pracht
552e3758d5 Add IngressUpdated event (#1775) 2023-06-16 10:58:49 -07:00
David Zhao
f71544e27a Do not send ParticipantJoined webhook if connection was resumed (#1795)
* Do not send ParticipantJoined webhook if connection was resumed

* isResume -> isMigration
2023-06-15 15:39:04 -07:00
shishirng
2dd4e1365b Send EgressUpdated event (#1792)
Signed-off-by: shishir gowda <shishir@livekit.io>
2023-06-14 18:56:07 -04:00
David Zhao
7e5a7ae79f Fixed windows build (#1768) 2023-06-04 00:17:25 -07:00
Benjamin Pracht
e7879a46fc Add ingress telemetry support (#1763) 2023-06-02 17:38:19 -07:00
David Zhao
956735ae05 Fix node stats updates on Windows (#1748)
Because we aren't able to get CPU count/load info on Windows, they are
stubbed out to return placeholders. This restores compatibility to run
on Windows.
2023-05-29 10:53:08 -07:00
shishirng
3de51181ec Fix setting minscore - initialized to 0 (#1725)
Signed-off-by: shishir gowda <shishir@livekit.io>
2023-05-19 11:00:32 -04:00
shishirng
2e93d386fe send min/median connection score along with avg (#1720)
* send min/median connection score along with avg
* guard against divide by zero for avg score calculation
* update median calculation

Signed-off-by: shishir gowda <shishir@livekit.io>
2023-05-18 13:50:54 -04:00
Raja Subramanian
a085afc6ee Send quality stats to prometheus. (#1708) 2023-05-12 09:44:03 +05:30
David Colburn
2ccee369a6 update notifier (#1702) 2023-05-09 20:52:22 -07:00
David Colburn
ab6c994db4 update protocol/psrpc (#1643)
* update protocol/psrpc

* metadata references
2023-04-21 12:43:20 -07:00
David Zhao
40ceddd18b Integrate QueuedNotifier, fixes out-of-order delivery (#1615) 2023-04-15 01:20:23 -07:00
Paul Wells
6636e37664 add prometheus psrpc metrics observer (#1571)
* add prometheus psrpc metrics observer

* record rpc error counts

* update psrpc

* update protocol
2023-04-05 03:50:43 -07:00
David Colburn
108b251045 egress updated webhook (#1555) 2023-03-27 16:34:44 -07:00
David Colburn
191a9e8014 update core to 0.0.5 (#1540)
* update core

* sort imports

* fix typos

* redundant types
2023-03-22 16:53:23 -07:00
David Zhao
5ff72a99b9 Report publish & subscribe RTPStats as Telemetry events (#1506) 2023-03-10 10:28:54 -08:00
shishirng
8856ce6422 Bump up interval for sending telemetry stats to 30 seconds (#1430)
Signed-off-by: shishir gowda <shishir@livekit.io>
2023-02-16 15:53:58 -05:00
Dan McFaul
1848a21eda add configurable environment value (#1421)
* add configurable prometheus env label

* Update pkg/config/config.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update cmd/server/main.go

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* Update config-sample.yaml

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>

* set config.Environment value to dev when in dev mode

* be more precise for config-sample

---------

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>
2023-02-15 14:41:44 -07:00
Mathew Kamkar
937256d89e don't error when get tc stats fails (#1386) 2023-02-10 10:05:45 -08:00
David Zhao
2851a8ac98 Improved robustness of subscription stack (#1382)
UpdateSubscription had a shortcoming where when it couldn't find the
participant, it ignored the request.

This PR further removes the reliance of current publisher state from
subscribers.
- SubscribeToTrack only takes in a trackID
- Introduced RoomTrackManager to maintain all published tracks to a room
- Added TrackUnpublished event to clearly indicate when a track has been removed
- SubscribeRequested event no longer include information about the publisher
2023-02-06 18:08:26 -08:00
cnderrauber
8b6dab780c Add reconnect reason and signal rtt calculation (#1381)
* Add connect reason and signal rtt calculate

* Update protocol

* solve comment
2023-02-06 11:12:25 +08:00
David Zhao
b023c531c2 Fix incorrect unsubscribed track telemetry (#1350)
- only log unsubscribe on close if Track was actually bound
- update subscribed counters even if a failure had been logged before
2023-01-30 10:16:21 -08:00
Paul Wells
52fd0a641b adjust jitter histogram buckets (#1347)
* adjust jitter histogram buckets

* typo
2023-01-29 22:45:32 -08:00
David Zhao
2fa46e2df4 Retry initial connection attempt should it fail (#1335)
Sometimes the initial selected node could fail. In that case, we'll give it a few more attempts to locate a media node for the session instead of failing it after the first try.
2023-01-25 22:59:57 -08:00
David Zhao
cd6b8b80b9 feat: SubscriptionManager to consolidate subscription handling (#1317)
Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are:

improve subscription resilience by separating desired state and current state
reduce complexity of synchronous processing
better detect failures with the ability to trigger full reconnect
2023-01-24 23:06:16 -08:00
Dan McFaul
9e3ca1e989 adding rtc_init stat (#1316)
* adding rtc_initiated stat

* clean up signal and rtc init/connected

* update naming and break out stats update funcs

* update protocol dependency
2023-01-23 12:49:15 -07:00
Paul Wells
1ef7c46fd7 publish stream stats to prometheus (#1313)
* add prometheus stats for rtt/jitter/packet loss

* add track source to metrics

* better packet loss bins

* add track type to metrics

* remove source from AnalyticsStat

* regenerate telemetry service fake

* compute loss from per stream packet count
2023-01-19 19:37:15 -08:00
Benjamin Pracht
edc39da0b1 Add TwirpRequestStatusReporter twirp server hook to count requests (#1309) 2023-01-18 11:53:20 -08:00
David Zhao
732309a8c1 Added track success & muted events (#1308)
Related to livekit/protocol#273

This PR adds:
- ParticipantResumed - for when ICE restart or migration had occurred
- TrackPublishRequested - when we initiate a publication
- TrackSubscribeRequested - when we initiate a subscription
- TrackMuted - publisher muted track
- TrackUnmuted - publisher unmuted track
- TrackPublish/TrackSubcribe events will indicate when those actions have been successful, to differentiate.
2023-01-15 15:40:20 -08:00
Dan McFaul
4d6f0cd0f7 Stats collect v2 (#1291)
* initial commit

* add correct label

* clean up

* more cleanup on adding stats

* cleanup

* move things to pub and sub monitors, ensure stats are correctly updated

* fix merge conflict

* Fix panic on MacOS (#1296)

* fixing last feedback

Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
2023-01-11 14:49:50 -07:00
Raja Subramanian
1db218a5b1 Fix panic on MacOS (#1296) 2023-01-11 10:08:56 +05:30
Mathew Kamkar
7c970da974 add memory used and total to node stats (#1293)
* add memory used and total to node stats

* raja review: consistency

* update protocol
2023-01-10 12:32:04 -08:00
shishirng
dceb624b83 Send participant object in telemetry on Active event (#1282)
Signed-off-by: shishir gowda <shishir@livekit.io>
2023-01-04 08:55:49 -05:00
David Zhao
c1d7dbd4fc Tweaks to prometheus participant counter (#1240)
* Tweaks to prometheus participant counter

Ensure that we don't miss adding a count in migration scenarios

* avoid nil ICEConfig
2022-12-19 14:30:14 -08:00
David Zhao
120335da00 Allow skipping of sending ParticipantJoined analytics event (#1236)
In certain scenarios such as migration, we do not want a duplicate event
to be sent when the participant is reconnecting. The Prometheus metric
should still be updated though.
2022-12-18 22:09:20 -08:00
David Zhao
33902a9f2a Do not send ParticipantLeft webhook event unless connected successfully. (#1234)
Fixes #1130
2022-12-18 17:37:55 -08:00
cnderrauber
3c907ed460 Add stats for data channel and signal (#1198)
* Add stats for data channel and signal

* Solve comment
2022-11-30 14:53:19 +08:00
Mathew Kamkar
caae389717 node type prometheus metric labels (#1197) 2022-11-29 20:36:35 -08:00
Raja Subramanian
1e8cc0dc76 Consolidate getMemoryStats (#1122)
* Consolidate getMemoryStats

* Avoid divide-by-0
2022-10-26 09:16:39 +05:30
Raja Subramanian
96a058b503 Populate memory load in node stats. (#1121) 2022-10-25 21:31:23 +05:30
shishirng
8dc5a899a9 Create stats worker for participant on Active if not exists (#1059)
on migration, participants don't send JOIN event, so create it in
PARTICIPANT_ACTIVE event
2022-09-29 17:26:43 -04:00
David Zhao
3da908302a Do not warn when notifier isn't configured (#1043)
By default there are no webhook URLs to notify, so a notifier isn't created.
2022-09-26 13:30:27 -07:00
David Colburn
803046b882 Auto egress (#1011)
* auto egress

* fix room service test

* reuse StartTrackEgress

* add timestamp

* update prefixed filename explicitly

* update protocol

* clean up telemetry

* fix telemetry tests

* separate room internal storage

* auto participant egress

* remove custom template url

* fix internal key

* use map for stats workers

* remove sync.Map

* remove participant composite
2022-09-21 12:04:19 -07:00
cnderrauber
c401ca58af turn packet and bytes stats used for telemetry and load control (#969)
* stats for turn

* add connections stats

* stats for standalone turn server only

* wire update
2022-08-31 11:00:27 +08:00
Mathew Kamkar
767d660809 Use LocalNode ID in Prometheus metrics (#959) 2022-08-25 22:16:20 -07:00
shishirng
79cf614783 Send egressInfo in telemetry event (#941)
Signed-off-by: shishir gowda <shishir@livekit.io>

Signed-off-by: shishir gowda <shishir@livekit.io>
2022-08-23 08:18:12 -04:00
David Zhao
b8bda3f14b Separate calls to Telemetry vs Prometheus room lifecycle (#935)
* Separate calls to Telemetry vs Prometheus room lifecycle

* remove unused import
2022-08-20 20:22:16 -07:00
shishirng
a3e8304b56 send participant info/identity during track_published event (#846)
Signed-off-by: shishir gowda <shishir@livekit.io>
2022-07-21 17:34:52 -04:00
Raja Subramanian
29039b4e76 Use a go routine to clean up stats workers. (#836)
* Use a go routine to clean up stats workers.

It is possible that certain events (like TrackUnpublished) can
happen after the participant is closed. For webhooks pertaining
to those events, need details like room name/id. So,reap stats
workers a little while after the participant left event happens.

* handle data race report

* log analytics worker reap

* debug log
2022-07-18 11:47:43 +05:30
Mathew Kamkar
e0676132d4 Packet stats from TC (#832)
* system level packet stats from tc

* drop percent

* test fix

* formatting

* formatting/wording

* prometheus metrics

* update livekit protocol go module
2022-07-15 10:41:40 -07:00