413 Commits

Author SHA1 Message Date
David Zhao 2fa46e2df4 Retry initial connection attempt should it fail (#1335)
Sometimes the initially selected node can fail. In that case, we'll make a few more attempts to locate a media node for the session instead of failing after the first try.
2023-01-25 22:59:57 -08:00
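The retry behavior described in #1335 could look roughly like the sketch below. It is illustrative only: `selectWithRetry`, `errNodeUnavailable`, and the node names are hypothetical and are not taken from the commit, and the attempt count is an assumed parameter.

```go
package main

import (
	"errors"
	"fmt"
)

// errNodeUnavailable is a hypothetical error used only in this sketch.
var errNodeUnavailable = errors.New("selected node unavailable")

// selectWithRetry calls selectNode up to maxAttempts times and returns the
// first node that is selected successfully. If every attempt fails, the last
// error is wrapped and returned.
func selectWithRetry(maxAttempts int, selectNode func() (string, error)) (string, error) {
	if maxAttempts < 1 {
		maxAttempts = 1 // always try at least once
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		node, err := selectNode()
		if err == nil {
			return node, nil
		}
		lastErr = err
	}
	return "", fmt.Errorf("no media node after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	// A selector that fails twice before succeeding (assumed behavior for the demo).
	flaky := func() (string, error) {
		calls++
		if calls < 3 {
			return "", errNodeUnavailable
		}
		return "node-c", nil
	}
	node, err := selectWithRetry(3, flaky)
	fmt.Println(node, err, calls)
}
```

A bounded loop keeps the worst-case delay predictable; a production version would likely add a backoff between attempts.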
David Zhao bd39a96eac Tweak call stack depth to show more helpful error lines (#1333) 2023-01-25 15:35:18 -08:00
David Zhao cd6b8b80b9 feat: SubscriptionManager to consolidate subscription handling (#1317)
Added a new manager to handle all subscription needs. Implemented using reconciler pattern. The goals are:

improve subscription resilience by separating desired state and current state
reduce complexity of synchronous processing
better detect failures with the ability to trigger full reconnect
2023-01-24 23:06:16 -08:00
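A minimal sketch of the reconciler pattern mentioned in #1317, assuming subscriptions are keyed by a track ID string. The `Reconciler` type and its methods are hypothetical and do not reflect the actual SubscriptionManager API; the point is only that callers record desired state, and a separate pass moves current state toward it.

```go
package main

import (
	"fmt"
	"sort"
)

// Reconciler is a hypothetical, simplified stand-in for a subscription manager.
// desired holds what the participant wants; current holds what is actually subscribed.
type Reconciler struct {
	desired map[string]bool
	current map[string]bool
}

func NewReconciler() *Reconciler {
	return &Reconciler{desired: map[string]bool{}, current: map[string]bool{}}
}

// SetDesired records intent only; no subscription work happens here.
func (r *Reconciler) SetDesired(trackID string, subscribed bool) {
	if subscribed {
		r.desired[trackID] = true
	} else {
		delete(r.desired, trackID)
	}
}

// Reconcile runs one pass that moves current toward desired and returns the
// actions taken, sorted for stable output. A failed subscribe leaves current
// unchanged, so the next pass tries again.
func (r *Reconciler) Reconcile(subscribe func(trackID string) error) []string {
	var actions []string
	for id := range r.desired {
		if r.current[id] {
			continue
		}
		if err := subscribe(id); err != nil {
			actions = append(actions, "retry-later:"+id)
			continue
		}
		r.current[id] = true
		actions = append(actions, "subscribe:"+id)
	}
	for id := range r.current {
		if !r.desired[id] {
			delete(r.current, id)
			actions = append(actions, "unsubscribe:"+id)
		}
	}
	sort.Strings(actions)
	return actions
}

func main() {
	r := NewReconciler()
	r.SetDesired("camera", true)
	r.SetDesired("mic", true)
	ok := func(string) error { return nil }
	fmt.Println(r.Reconcile(ok))
	r.SetDesired("mic", false)
	fmt.Println(r.Reconcile(ok))
}
```

Because failures only show up as a persistent gap between the two maps, a caller could count consecutive unsuccessful passes and use that as the signal to trigger a full reconnect.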
David Colburn e31b25300d update psrpc (#1312) 2023-01-18 13:52:03 -08:00
Benjamin Pracht edc39da0b1 Add TwirpRequestStatusReporter twirp server hook to count requests (#1309) 2023-01-18 11:53:20 -08:00
David Colburn a87107a0f3 IOInfo service (#1305)
* IOInfo service

* only start if not nil

* use ctx in updateEgressInfo

* updates

* fix merge
2023-01-16 16:26:03 -08:00
David Zhao 732309a8c1 Added track success & muted events (#1308)
Related to livekit/protocol#273

This PR adds:
- ParticipantResumed - for when ICE restart or migration had occurred
- TrackPublishRequested - when we initiate a publication
- TrackSubscribeRequested - when we initiate a subscription
- TrackMuted - publisher muted track
- TrackUnmuted - publisher unmuted track
- TrackPublish/TrackSubscribe events now indicate that those actions succeeded, to differentiate them from the requests.
2023-01-15 15:40:20 -08:00
David Zhao 17236799bb Fix handling of non-monotonic timestamps (#1304)
* Fix handling of non-monotonic timestamps

Timed version is inspired by Hybrid Clock. We used to have a mixed behavior
by using time.Time:
* during local comparisons, it does increment monotonically
* when deserializing remote timestamps, we lose that attribute

So it's possible for two requests to be sent in the same microsecond, and
for the latter one to be dropped.

To fix that, I'm switching to keeping plain timestamps so the behavior is consistent,
and accepting multiple updates in the same ms by incrementing ticks.

Also using @paulwe's idea of a version generator.
2023-01-12 11:57:26 -08:00
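The timestamp-plus-ticks idea in #1304 can be illustrated with the sketch below. It assumes millisecond resolution and an injectable clock; `TimedVersion`, `Generator`, and their fields are hypothetical names chosen for this example and are not the types used in the commit.

```go
package main

import (
	"fmt"
	"time"
)

// TimedVersion is a hypothetical (timestamp, ticks) pair. Ticks break ties
// between versions generated within the same millisecond.
type TimedVersion struct {
	Millis int64
	Ticks  int32
}

// After reports whether v is strictly newer than other.
func (v TimedVersion) After(other TimedVersion) bool {
	if v.Millis != other.Millis {
		return v.Millis > other.Millis
	}
	return v.Ticks > other.Ticks
}

// Generator hands out strictly increasing versions. It is not safe for
// concurrent use; a real implementation would need synchronization.
type Generator struct {
	now  func() int64 // returns the current time in milliseconds
	last TimedVersion
}

func NewGenerator(now func() int64) *Generator {
	return &Generator{now: now}
}

// Next returns a version newer than every version returned before it, even
// if the clock has not advanced (or has gone backwards) since the last call.
func (g *Generator) Next() TimedVersion {
	ms := g.now()
	if ms > g.last.Millis {
		g.last = TimedVersion{Millis: ms}
	} else {
		g.last.Ticks++
	}
	return g.last
}

func main() {
	g := NewGenerator(func() int64 { return time.Now().UnixMilli() })
	a, b := g.Next(), g.Next()
	fmt.Println(b.After(a))
}
```

Since both fields are plain integers, a version survives serialization unchanged, so local and remote comparisons behave the same way.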
Paul Wells a052ebd644 Ingress psrpc (#1295)
* add ingress psrpc codegen

* use psrpc for ingress

* merge entity/info update psrpc services

* split update/delete ingress methods

* add race helper test

* add race context cancel test

* sync race result with mutex
2023-01-12 11:00:43 -08:00
Dan McFaul 4d6f0cd0f7 Stats collect v2 (#1291)
* initial commit

* add correct label

* clean up

* more cleanup on adding stats

* cleanup

* move things to pub and sub monitors, ensure stats are correctly updated

* fix merge conflict

* Fix panic on MacOS (#1296)

* fixing last feedback

Co-authored-by: Raja Subramanian <raja.gobi@tutanota.com>
2023-01-11 14:49:50 -07:00
Benjamin Pracht 0ca80a4fa7 Fix log statement in egress service (#1301) 2023-01-11 11:42:47 -08:00
cnderrauber 25debc6d35 add reconnect response to update configuration while reconnecting (#1300)
* add reconnect response to update configuration while reconnecting

* fix test
2023-01-11 17:40:12 +08:00
Raja Subramanian 4ba7e57683 Make an IsDisconnected interface and use it (#1278) 2022-12-31 12:53:02 +05:30
David Zhao 112d6fc18b Reduced log verbosity for pieces that are stable (#1274) 2022-12-29 23:47:36 -08:00
Benjamin Pracht 86bf5cb62e Ensure we create an Egress ID with PsRPC (#1273) 2022-12-30 13:46:45 +13:00
Benjamin Pracht 7778cdf2cd Do not use the egress version stored in redis to decide whether to enable PsRPC. Use a conf entry instead (#1262) 2022-12-30 09:32:55 +13:00
David Colburn 5d3f644667 update psrpc (#1266) 2022-12-27 13:43:32 -08:00
David Zhao 988858a98a Update dependencies to generic versions (#1259) 2022-12-26 22:29:13 -08:00
David Colburn 976d4ea9db Update psrpc, egressStore interface (#1256)
* Update psrpc, egressStore interface

* psrpc v0.2.0
2022-12-24 00:49:31 -08:00
Raja Subramanian 1a48cc6a8b Track subscription operations per source track. (#1248) 2022-12-23 12:23:26 +05:30
David Colburn 6719a3c714 Updated egress rpc (#1252)
* updated egress rpc

* check if egress exists on stop

* fix static check

* remove old migration code

* rename

* regenerate, update test

* latest staticcheck

* update to psrpc 0.1.0

* fix tests

* dual write rpcs on running egress

* remove unused field

* fix race, change service for egress impl

* return nil if bus is nil

* id -> ids

* add affinityFunc to StartEgress
2022-12-22 21:03:27 -08:00
Raja Subramanian 50e39b9985 Check participant SID also while removing a participant. (#1237) 2022-12-19 22:53:11 +05:30
David Zhao 120335da00 Allow skipping of sending ParticipantJoined analytics event (#1236)
In certain scenarios such as migration, we do not want a duplicate event
to be sent when the participant is reconnecting. The Prometheus metric
should still be updated though.
2022-12-18 22:09:20 -08:00
Raja Subramanian 241a7120f5 ICE config using protocol model (#1233)
* ICE config using protocol model

* use pointers consistently

* protocol pointer

* mage generate
2022-12-19 10:25:08 +05:30
David Zhao 33902a9f2a Do not send ParticipantLeft webhook event unless connected successfully. (#1234)
Fixes #1130
2022-12-18 17:37:55 -08:00
Haibo Chen 8a6c6de1db update name of participant (#1213) 2022-12-15 22:03:59 -08:00
David Zhao 7a1273151f Update to new logging library, using sampling participant logger (#1219) 2022-12-09 00:09:03 -08:00
Raja Subramanian 6bd5504bff Add option to issue full reconnect on a publication error. (#1214)
* Add option to issue full reconnect on a publication error.

Leaving the publication error timeout at 30 seconds, as some
publications take a long time. There are also cases
where the peer connection fails after 30 seconds, so the peer
connection failure happens after the publication error is detected.
Still, 30 seconds is a good amount of time for a publication to establish.

* prevent recursive lock
2022-12-06 14:46:59 +05:30
David Zhao e9abb47020 Added logging fields for Ingress & Egress services (#1205) 2022-12-04 21:44:16 -08:00
David Zhao 14de2bec9c Fixed single-node routing breakage. (#1209)
* Fixed single-node routing breakage.

Due to a regression from a previous change, Redis was always enabled even
when no configuration was provided.

* updated go modules
2022-12-04 16:23:35 -08:00
David Zhao 12ae179be2 Configurable RoomService execution timeout (#1206)
* API execution timeout is now configurable

In certain environments, it can take longer than the default 2s to
fully execute API requests. Making execution timeout a configurable option.

* do not expose api to YAML. internal for now.
2022-12-04 10:13:09 -08:00
David Zhao d146ec7a1f Improve logging messages with RoomService (#1203) 2022-11-30 22:17:28 -08:00
cnderrauber 3c907ed460 Add stats for data channel and signal (#1198)
* Add stats for data channel and signal

* Solve comment
2022-11-30 14:53:19 +08:00
cnderrauber 6711060cdb Add enable loopback candidate option (#1185) 2022-11-23 16:01:36 +08:00
Tom Xiong e5dabd466e Support redis cluster mode (#1181)
* use redisConfig of protocol instead of redisConfig and use redis of protocol to create redis client to support redis cluster mode too
2022-11-22 10:36:43 -08:00
Benjamin Pracht 2c2c6f9da2 Do not append the stream key to the ingress URL for rtmp (#1167) 2022-11-15 09:19:25 -08:00
David Zhao e2d775588f Confirm room creation prior to returning from CreateRoom (#1157) 2022-11-09 23:47:41 -08:00
David Zhao e5d21cb1d9 CreateRoom API to actually create the room on an RTC node (#1155)
Previously, CreateRoom only created the room in storage, but did not
hydrate it on an RTC node. This has caused strange behaviors such as
emptyTimeout not working correctly (#1109).

Also reduced room reap worker to consistently reap rooms. Fixes #241
2022-11-09 23:35:35 -08:00
MaxnSter 7e89ad3fbd RedisStore: make UnlockRoom atomic (#1044)
Co-authored-by: David Zhao <dz@livekit.io>
2022-11-07 23:04:46 -08:00
Benjamin Pracht c735668f67 Use the redis.UniversalClient interface instead of *redis.Client when interacting with go-redis (#1149)
* Use the redis.UniversalClient interface instead of *redis.Client when interacting with go-redis

* Update protocol to v1.2.1
2022-11-07 17:27:28 -08:00
cnderrauber 0310aa9250 Make sure client get participant info before track fired (#1147) 2022-11-07 14:50:45 +08:00
Benjamin Pracht 9a45b59414 Use ingress specific grants (#1125) 2022-10-26 21:37:36 -07:00
David Colburn 7223d9c132 web egress (#1126) 2022-10-26 13:43:56 -07:00
Mathew Kamkar 26fe910e88 Generated CLI Flags (#1112) 2022-10-25 22:24:08 -07:00
Raja Subramanian 82c2ec8273 Remove named returns from room service. (#1124) 2022-10-26 10:24:13 +05:30
Raja Subramanian 4344af6fd3 Some misc changes (#1107)
- ticker.Stop always
- clean up timer func (if they are added) on participant close
- sequencer test enhancement to add a real packet after a padding packet
2022-10-20 11:11:45 +05:30
David Colburn dff5379b78 remove record check on CreateRoom (#1096) 2022-10-17 11:25:20 -07:00
cnderrauber 759e3bb1f2 Refine nat 1to1 mapping setting (#1094)
Now the mapping is only set when use_external_ip is enabled or node_ip is
explicitly set. If multiple local addresses resolve to the same external
IP, only the first one is mapped to the external IP, which avoids candidate
conflicts between different clients.
2022-10-17 16:11:52 +08:00
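The "first local address wins" rule from #1094 might be sketched as follows. The function name, the `Mapping` type, and the sample addresses are all hypothetical; the commit does not show how resolution is done, so a lookup function is passed in as an assumption.

```go
package main

import "fmt"

// Mapping pairs a local address with the external address advertised for it.
type Mapping struct {
	Local, External string
}

// firstLocalPerExternal keeps, for each external IP, only the first local
// address (in input order) that resolved to it. Local addresses that do not
// resolve are skipped.
func firstLocalPerExternal(locals []string, resolve func(local string) (string, bool)) []Mapping {
	seen := map[string]bool{}
	var out []Mapping
	for _, local := range locals {
		ext, ok := resolve(local)
		if !ok || seen[ext] {
			continue
		}
		seen[ext] = true
		out = append(out, Mapping{Local: local, External: ext})
	}
	return out
}

func main() {
	// Sample data for illustration: two local addresses share one external IP.
	resolved := map[string]string{
		"10.0.0.2": "203.0.113.7",
		"10.0.0.3": "203.0.113.7",
		"10.0.1.2": "203.0.113.9",
	}
	resolve := func(local string) (string, bool) {
		ext, ok := resolved[local]
		return ext, ok
	}
	locals := []string{"10.0.0.2", "10.0.0.3", "10.0.1.2"}
	fmt.Println(firstLocalPerExternal(locals, resolve))
}
```

Processing the addresses in a fixed order keeps the choice deterministic, so the same local address is mapped on every run.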
David Zhao 4161768530 Log Service API requests (#1091) 2022-10-17 00:16:54 -07:00
Samuel Humeau 00ec859dd1 Add default handler to 404 (#1088) 2022-10-16 10:30:34 -07:00