888 Commits

Author SHA1 Message Date
Raja Subramanian 835ef1b353 Metrics for participant active, i. e. fully established. (#4557)
* Metrics for participant active, i. e. fully established.

- Egress stub for v2 API
- Fix the participant canceled counter 🤦
- Add active counter -> this is incremented when a participant becomes
  active, i. e. primary peer connection established. Can be used to
  monitor node-wise connection establishment issues.
- Add signalling validation fail counter.

With this, we have
- signalling validation fail
- signalling failed --> this is when the `startSession` fails
- signalling connected -> signalling is successful and can send back
  joinResponse to client

on media connection side
- rtc_init -> start
- rtc_connected -> participant session created (joined)
- rtc_active -> primary peer connection established
- rtc_canceled -> could not proceed with RTC connection due to not being
  able to resume.

* signalling counters deps

* revert pion/webrtc to 4.2.12 to get SCTP without interleaving

* go back to pion/webrtc 4.2.11 and sctp 1.9.5
2026-06-03 19:50:19 +05:30
cnderrauber 356ae211a3 Config documentation for advertise_internal_ip and skip_external_ip_validation (#4552)
See https://github.com/livekit/mediatransportutil/pull/88
2026-06-01 14:37:08 +08:00
Paul Wells 2dd5e63207 telemetry: split webhook-processed hook out of NewTelemetryService (#4548)
* telemetry: split webhook-processed hook registration out of NewTelemetryService

NewTelemetryService used to register a notifier processed-hook on the inner
*telemetryService directly. That made it impossible for downstream wrappers
(e.g. cloud's TelemetryService that overrides Webhook to fan out to a v3
observability pipeline) to intercept webhook events without double-firing
the legacy emission.

Lift the registration into a new exported helper RegisterWebhookHook, and
have the standalone server's wire provider createTelemetryService call it
right after construction so behavior is unchanged for callers that don't
wrap the service.
2026-05-27 09:40:55 -07:00
Paul Wells 222177a9e4 service: prevent nil deref in validate with wrapped join request (#4547)
When a client hits /rtc/v[01]/validate with a base64 WrappedJoinRequest
whose embedded JoinRequest.ClientInfo is unset, validateInternal called
AugmentClientInfo with a nil *ClientInfo and panicked at ci.Address =
GetClientIP(req). The non-wrapped branch already allocates via
ParseClientInfo; do the same here so pi.Client always gets at least the
resolved client Address.
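
A minimal sketch of the guard described above, assuming simplified stand-in types: `ClientInfo`, `JoinRequest`, `augmentClientInfo` and `clientInfoFor` here are hypothetical and only mirror the shape of the real code, not its actual signatures.

```go
package main

import "fmt"

// ClientInfo and JoinRequest are simplified stand-ins for the protocol types.
type ClientInfo struct {
	Address string
}

type JoinRequest struct {
	ClientInfo *ClientInfo
}

// augmentClientInfo writes through ci, so passing a nil pointer would panic.
func augmentClientInfo(ci *ClientInfo, clientIP string) {
	ci.Address = clientIP
}

// clientInfoFor allocates a ClientInfo when the request carries none, so the
// result always has at least the resolved client address.
func clientInfoFor(jr *JoinRequest, clientIP string) *ClientInfo {
	ci := jr.ClientInfo
	if ci == nil {
		ci = &ClientInfo{}
	}
	augmentClientInfo(ci, clientIP)
	return ci
}

func main() {
	// A request whose embedded ClientInfo is unset no longer panics.
	ci := clientInfoFor(&JoinRequest{}, "203.0.113.7")
	fmt.Println(ci.Address)
}
```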
2026-05-26 08:34:15 -07:00
Raja Subramanian dd7580b454 Protect against nil clientInfo (#4546) 2026-05-26 20:32:11 +05:30
Ninad Pundalik 145689e627 Start tracking Twirp method request latency in prometheus too, not just in logs (#4545)
* Start tracking Twirp method request latency in prometheus too, not just datadog
* Simplify latency tracking, do it in the logger itself
2026-05-26 14:53:16 +05:30
Raja Subramanian 2e22911dcd Remove backwards compatibility support for TURN auth. (#4539)
This was indicated in release v1.12.0 - https://github.com/livekit/livekit/releases/tag/v1.12.0
2026-05-22 17:00:42 +05:30
He Chen 77595d387a TEL-336: fix sip error categorization (#4528) 2026-05-18 15:44:44 -07:00
cnderrauber f303f499ef Always enable rtx codec (#4533)
The SFU falls back to retransmitting packets on the media stream SSRC
if RTX is not negotiated (the client doesn't support it), so we should
not disable RTX explicitly (by codec config).

Fix #4519
2026-05-18 15:51:10 +08:00
cnderrauber 89faaeba82 Apply ttl check only when authenticate allocation creating (#4526)
* Apply ttl check only when authenticate allocation creating

The TTL check added in security enhancement #4505 could reject
allocation/permission refreshes, causing long-lived sessions to
disconnect once the TURN credential expired.
Only check the TTL on allocation creation to prevent abuse of leaked
credentials while keeping long-lived sessions working.
2026-05-15 14:55:05 +08:00
Denys Smirnov 8b79ec9e47 Support SIP auth realm for inbound. (#4522) 2026-05-14 10:45:16 +02:00
networkException d123675008 feat: auto create rooms for tokens with the RoomCreate grant (#4320)
This patch updates the check for auto creating rooms to also
consider the RoomCreate grant per token instead of just the
global config option.

With this patch, applications can decide on their own whether
users or which users can auto create rooms. This allows
applications that rely on auto creation (saving an API call)
to co-exist with those who might want to mint tokens for
subscribe-only users.

Specifically LaSuite Meet relies on the auto create behavior,
however enabling the global config option would make a
MatrixRTC deployment vulnerable to abuse, as users on remote
homeservers get tokens in order to subscribe.
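
As a rough sketch of the check described above (the `Config` and `VideoGrant` types and the `canAutoCreate` helper are simplified, hypothetical stand-ins, not the server's actual types):

```go
package main

import "fmt"

// Config stands in for the global server configuration.
type Config struct {
	AutoCreate bool
}

// VideoGrant stands in for the grant carried by a token.
type VideoGrant struct {
	RoomCreate bool
}

// canAutoCreate allows auto creation when either the global option is set
// or the token itself carries the RoomCreate grant.
func canAutoCreate(conf Config, grant *VideoGrant) bool {
	if conf.AutoCreate {
		return true
	}
	return grant != nil && grant.RoomCreate
}

func main() {
	conf := Config{AutoCreate: false}
	fmt.Println(canAutoCreate(conf, &VideoGrant{RoomCreate: true}))  // token may create the room
	fmt.Println(canAutoCreate(conf, &VideoGrant{RoomCreate: false})) // subscribe-only token may not
}
```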
2026-05-13 11:25:08 +05:30
Théo Monnom 7a3e595bde apply room tags from JWT grant room configuration (#4518) 2026-05-12 21:21:42 -07:00
Raja Subramanian cf20c9cd05 Add expiry to TURN password. (#4515)
* Add expiry to TURN password.

Defaults to 5m. For backwards compatibility expiry = 0 skips adding it.

* fix variable shadowing
2026-05-09 12:15:01 +05:30
Paul Wells 12fff29a12 allow setting agent job assignment url (#4512) 2026-05-07 13:13:21 -07:00
Paul Wells 8fbc5adfce update protocol for protojson (#4510) 2026-05-07 00:55:00 -07:00
Raja Subramanian 3de6f517e5 Add TURN permission handler. (#4505)
* Add TURN permission handler.

- Turn off permissions to private/link local/multicast and internal IPs
- Add a list of additional CIDRs to deny permission to.

* unused

* add config for allowing private IPs, used in testing

* add a TTL to user name and use it to auth

* allow list for restricted peer CIDRs
2026-05-06 23:43:11 +05:30
Denys Smirnov 8ffcef93b2 Update protocol to support SIP media config. (#4509) 2026-05-06 18:18:21 +02:00
Raja Subramanian c4fd71a5dd Fix sense check in DeltaInfo gathering (#4507) 2026-05-06 13:34:26 +05:30
Paul Wells 803999efad rename agent environment to deployment (#4506)
* rename agent environment to deployment

* deps
2026-05-05 14:19:40 -07:00
Paul Wells bacc21e6c0 add helper to check for agent worker endpoint (#4503) 2026-05-05 13:38:53 -07:00
Paul Wells 253f977d32 add duration seconds reporting (#4500)
* add duration seconds reporting

* deps

* deps
2026-05-02 06:19:23 -07:00
Paul Wells ffab3bd308 add agent environment (#4498)
* add agent environment

* lint

* psrpc error

* deps
2026-05-01 19:30:06 -07:00
Théo Monnom af1dcc8843 Add CloseWithReason to agent SignalConn interface (#4492) 2026-04-28 22:14:06 -07:00
David Chen 743d9c8b3a add support for client capabilities (#4461)
* update protocol version

* only check for client capability to strip packet trailer
2026-04-27 17:58:36 -07:00
Fabian Stehle f3b80b2886 fix: wrap IPv6 addresses in brackets in UDP TURN URLs (RFC 3986) (#4476)
`iceServersForParticipant` builds UDP TURN URLs by interpolating the
node IP directly into a format string:

    fmt.Sprintf("turn:%s:%d?transport=udp", ip, port)

When `NodeIP.V6` is set, `ToStringSlice()` includes the bare IPv6
address, producing URLs like:

    turn:2a05:d014:ee4:1201:7039:38c:f652:a252:443?transport=udp

RFC 3986 §3.2.2 requires IPv6 addresses in URIs to be enclosed in
square brackets. Without them the port is ambiguous and WebRTC clients
(e.g. libdatachannel) reject the URL with "Invalid ICE server port".

Use `net.JoinHostPort` which handles bracketing for IPv6 and is a
no-op for IPv4, producing well-formed URLs:

    turn:[2a05:d014:ee4:1201:7039:38c:f652:a252]:443?transport=udp
    turn:1.2.3.4:443?transport=udp
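
For reference, a self-contained sketch of the approach. The helper name `turnUDPURL` is made up for this example; the actual change lives in `iceServersForParticipant`.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// turnUDPURL builds a UDP TURN URL. net.JoinHostPort wraps the host in
// square brackets when it contains a colon (i.e. an IPv6 literal) and
// leaves IPv4 addresses untouched.
func turnUDPURL(ip string, port int) string {
	hostPort := net.JoinHostPort(ip, strconv.Itoa(port))
	return fmt.Sprintf("turn:%s?transport=udp", hostPort)
}

func main() {
	fmt.Println(turnUDPURL("2a05:d014:ee4:1201:7039:38c:f652:a252", 443))
	// turn:[2a05:d014:ee4:1201:7039:38c:f652:a252]:443?transport=udp
	fmt.Println(turnUDPURL("1.2.3.4", 443))
	// turn:1.2.3.4:443?transport=udp
}
```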
2026-04-24 14:28:25 +05:30
Anunay Maheshwari 1d804737f9 fix: limit join request and WHIP request body to http.DefaultMaxHeaderBytes (#4450)
* fix: CS-1665

* cleanup

* cleanup and tests

* updates
2026-04-16 01:12:33 +05:30
cnderrauber ce1bf47b5c Revert "fix: ensure num_participants is accurate in webhook events (#4265) (#…" (#4448)
This reverts commit cdb0769c38.
2026-04-13 22:21:22 +08:00
Onyeka Obi cdb0769c38 fix: ensure num_participants is accurate in webhook events (#4265) (#4422)
* fix: ensure num_participants is accurate in webhook events (#4265)

  Three fixes for stale/incorrect num_participants in webhook payloads:

  1. Move participant map insertion before MarkDirty in join path so
     updateProto() counts the new participant.
  2. Use fresh room.ToProto() for participant_joined webhook instead of
     a stale snapshot captured at session start.
  3. Remove direct NumParticipants-- in leave path (inconsistent with
     updateProto's IsDependent check), force immediate proto update,
     and wait for completion before triggering onClose callbacks.

* fix: use ToProtoConsistent for webhook events instead of forcing immediate updates
2026-04-13 09:26:14 +08:00
Raja Subramanian c91e79af35 Switch to stdlib maps, slices (#4445)
* Switch to stdlib maps, slices

* slices
2026-04-13 00:11:48 +05:30
David Zhao 4b3856125c chore: pin GH commits and switch to golangci-lint (#4444)
* chore: pin GH commits

* switch to golangci-lint-action

* fix lint issues
2026-04-11 13:04:22 -07:00
Paul Wells 88c77dc666 compute agent dispatch affinity from target load (#4442)
* compute agent dispatch affinity from target load

* fix test config
2026-04-09 13:49:43 -07:00
Raja Subramanian 8fe9937770 Log join duration. (#4433)
* Log join duration.

Also revert the "unresolved" init. Defeated the purpose of log resolver
as it was resolving with those values even if not forced. Instead set it
to "unresolved" if not set when forced.

Join duration is not reset if resolver is reset as that happens on
moving a participant and there is no new join duration in that case.

* explode
2026-04-05 14:01:43 +05:30
Raja Subramanian 050909e627 Enable data tracks by default. (#4429) 2026-04-04 00:54:48 +05:30
David Zhao 72c7e65c25 chore: log API key during worker registration (#4428) 2026-04-03 09:48:42 -07:00
Raja Subramanian 8a67dd1b9f Do not close publisher peer connection to aid migration. (#4427) 2026-04-03 21:50:59 +05:30
Raja Subramanian 91e90c1020 Add some more logging around migration. (#4426)
Some e2e tests are failing due to subscriptions happening late and the
expected order of m-lines being different. Not a hard failure, but
logging more to make seeing this easier.
2026-04-03 13:07:32 +05:30
Raja Subramanian 7d06cfca8b Keep subscription synchronous when publisher is expected to resume. (#4424)
Subscription can switch between remote track and local track or
vice-versa. When that happens, closing the subscribed track of one or
the other asynchronously means the re-subscribe could race with
subscribed track closing.

Keeping the case of `isExpectedToResume` sync to prevent the race.

Would be good to support multiple subscribed tracks per subscription.
So, when subscribed track closes, subscription manager can check and
close the correct subscribed track. But, it gets complex to clearly
determine if a subscription is pending or not and other events. So,
keeping it sync.
2026-04-02 19:54:14 +05:30
Omar Pakker e9b113c8f2 Make the TURN bind address configurable and allow for multiple addresses. (#4315) 2026-03-30 14:46:10 +08:00
Anunay Maheshwari ff7fd7ed56 feat(agent-dispatch): add job restart policy (#4401)
* feat(agent-dispatch): add job restart policy

* deps
2026-03-27 21:32:04 +05:30
Raja Subramanian 9055a34981 Path check helpers (#4392)
* Path check helpers

* remove trailing slash
2026-03-25 10:53:07 +05:30
cnderrauber 1f1eeb6832 Fallback to servicestore if rpc is unavailable (#4391)
* Fallback to servicestore if rpc is unavailable

compatibility mode for #4387

* conf
2026-03-25 11:09:52 +08:00
Raja Subramanian 59e9bb41b9 Fix TURN server URL (#4389)
Addresses https://github.com/livekit/livekit/issues/4384
2026-03-24 15:52:05 +05:30
cnderrauber 9474c807c0 route participant reads through PSRPC instead of Redis (#4387)
rel: #4373
2026-03-24 16:25:11 +08:00
Charlie Tonneslan 8cdd6f4cc7 Replace deprecated io/ioutil with io in whipservice (#4375)
ioutil.ReadAll has been deprecated since Go 1.16 in favor of io.ReadAll.
This was the last remaining io/ioutil usage in the codebase.
2026-03-21 10:01:30 -07:00
Paul Wells c8bb2578be Rename log field "pID" to "participantID" for consistency (#4365)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 04:32:02 -07:00
cnderrauber e963953817 Refine ipv6 support (#4352)
* Refine ipv6 support

* go mod

* check ipv4 is set in turn
2026-03-09 20:43:00 +08:00
Milos Pesic b34b047247 Add StopEgress function to the EgressLauncher interface (#4353)
This allows for abstracting away how the stop is implemented - default implementation stays the same - the existing OSS egress launcher just calls the existing Stop method on the client.
2026-03-09 13:17:05 +01:00
He Chen cb7dc2d02a TEL-405: support originating calls from custom domains (#4349) 2026-03-06 12:25:40 -08:00
Denys Smirnov 493e87dfd4 Fix SIP client timeout. (#4345) 2026-03-05 19:09:17 +02:00