Commit Graph

3750 Commits

Author SHA1 Message Date
cloudwebrtc feec013c4b enable LiveStreamingMode for test. 2026-06-09 16:18:26 +08:00
cloudwebrtc 3fa24cf7c8 fix: acquire requested video layer directly under live streaming mode
A subscriber that requests the top spatial layer would briefly decode a
lower layer before settling on the requested one (e.g. layer 0 -> layer 2),
a visible low->high quality ramp. Two distinct causes:

1. Simulcast.Select latched opportunistically onto the first key frame of
   any layer <= target, so a lower layer's key frame (which usually arrives
   first) was selected before the requested layer's.

2. When the subscriber joined before the publisher started, the layers are
   detected gradually and `maxSeen` climbs 0->1->2. AllocateOptimal caps the
   target at `min(maxSeen, requested)`, so the target itself ramped up and
   Select followed it.

The new behavior is gated behind a `LiveStreamingMode` Room config option
(default false -> original opportunistic behavior unchanged):

- Select latches directly onto the target layer during initial acquisition.
- Forwarder gains an initial-acquisition grace: while not yet streaming and
  the requested layer has not been seen, the target/key-frame-request aim
  straight at the requested layer instead of the highest seen so far. Gated
  on `maxSeen < requested` so steady-state behavior (incl. overshoot) is
  unchanged.
- If the requested layer never shows up within the grace, the key frame
  requester triggers a re-allocation so the target falls back to the highest
  layer actually seen, avoiding a stall/black screen.
- On bind, a live-streaming subscriber requests the highest layer up front
  (instead of the adaptive-stream LOW start) so it is acquired directly.

LiveStreamingMode is threaded from config.RoomConfig through the participant
(GetLiveStreamingMode) and SubscribedTrack/DownTrack params into the Forwarder
and Simulcast layer selector.
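
The target selection described in this message can be sketched as below. This is a minimal illustration under stated assumptions, not the actual Forwarder or AllocateOptimal code: `targetLayer` and all of its parameters are hypothetical names, and the grace period is reduced to a single boolean.

```go
package main

import "fmt"

// targetLayer is a hypothetical helper (not the real Forwarder code).
//   maxSeen:           highest spatial layer detected from the publisher so far
//   requested:         layer the subscriber asked for
//   liveStreamingMode: the Room config option named in the commit message
//   streaming:         whether the subscriber is already forwarding a layer
//   graceExpired:      whether the initial-acquisition grace has run out
func targetLayer(maxSeen, requested int32, liveStreamingMode, streaming, graceExpired bool) int32 {
	// Initial-acquisition grace: aim straight at the requested layer while it
	// has not been seen yet, instead of ramping up along with maxSeen.
	if liveStreamingMode && !streaming && !graceExpired && maxSeen < requested {
		return requested
	}
	// Original behavior, and the fallback once the grace expires:
	// min(maxSeen, requested).
	if maxSeen < requested {
		return maxSeen
	}
	return requested
}

func main() {
	// Subscriber joined before the publisher: layer 2 requested, lower layers seen.
	fmt.Println(targetLayer(0, 2, false, false, false)) // default mode: 0
	fmt.Println(targetLayer(0, 2, true, false, false))  // live streaming, within grace: 2
	fmt.Println(targetLayer(1, 2, true, false, true))   // grace expired, fall back: 1
}
```

With the option off the target still follows `maxSeen`, which matches the "original opportunistic behavior unchanged" note above.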

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 16:04:51 +08:00
David Zhao 46c4309554 fix goreleaser workflow, version 1.13.1 (#4577) v1.13.1 2026-06-08 16:15:09 -07:00
David Zhao e0815be27d chore: improve docker test shutdown reliability (#4576) 2026-06-08 08:27:15 -07:00
Dan Root bfd9deffd7 expose TCPFallbackRTTThreshold and AllowUDPUnstableFallback via config (#4556) 2026-06-08 22:07:08 +08:00
Raja Subramanian b93c1e1607 Release v1.13.0. (#4573)
Please see note about TURN authentication without TTL backwards
compatibility removal.
v1.13.0
2026-06-08 13:46:52 +05:30
Raja Subramanian fd452212c7 Update mediatransportutil to get ICE candidate timeout config (#4572) 2026-06-08 12:42:58 +05:30
renovate[bot] 8be8c74a59 Update github workflows (#4463)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2026-06-07 23:08:50 -07:00
renovate[bot] c4e41872c5 Update go deps to v1.17.2 (#4462)
Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2026-06-07 23:08:05 -07:00
renovate[bot] dc8e0310ad Update go deps to v4 (#4482)
* Update go deps to v4

Generated by renovateBot

* update dockertest to v4

* fix

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
2026-06-07 23:07:40 -07:00
Ben Mayer 20fd1ad2c1 turn: allow for providing secret via file (#4564)
* turn: allow for providing secret via file

* turn: improve secret_file changes
2026-06-08 11:18:14 +08:00
Raja Subramanian 6590570d7c Pin pion/dtls to v3.1.2 (#4570) 2026-06-06 20:36:10 +05:30
Paul Wells cdbbee1f8e deps: bump protocol + psrpc to latest tips (#4565)
* deps: bump psrpc + protocol + cloud-protocol + backend-common to latest

* deps: go get -u sweep + bump counterfeiter

Direct deps bumped via go get -u ./...:
  clipperhouse/displaywidth 0.10.0 -> 0.11.0
  clipperhouse/uax29/v2 2.6.0 -> 2.7.0
  fatih/color 1.18.0 -> 1.19.0
  florianl/go-tc 0.4.7 -> 0.4.8
  hashicorp/go-version 1.8.0 -> 1.9.0
  livekit/protocol -> a7a83da5 (latest)
  ua-parser bumped
  urfave/cli/v3 3.8.0 -> 3.9.0
  otel/sdk 1.43.0 -> 1.44.0
  yaml.in/yaml/v2 2.4.2 -> 2.4.4

Indirect: mdlayher/netlink + socket bumped, mattn/* bumped, olekukonko/*
bumped, counterfeiter v6.11.1 -> v6.12.2.

Newer-major audit: no actionable majors. +incompatible: twitchtv/twirp
v8.1.3 (upstream choice, stays); docker/cli and docker/docker indirect.

Notable stuck patterns worth a separate cleanup:
- pkg/errors v0.9.1 direct dep (unmaintained; stdlib supplants)
- go.uber.org/atomic + multierr direct deps
- ory/dockertest/v3 v3.12.0 (v4 is available — cascade has migrated
  cloud-protocol, backend-common, psrpc to v4).

* deps: pin pion/webrtc/v4 v4.2.11 + pion/sctp v1.9.5

* deps: pin protocol+psrpc+MTU to landed versions
2026-06-05 14:48:51 -07:00
cnderrauber d290de8165 Correct config comment (#4563) 2026-06-04 16:43:59 +08:00
Paul Wells 77ecf920ff rtc: report participant session end time on room move (#4561)
MoveToRoom resets the participant reporter resolver to receive new
(room, participant_session) keys for the destination, but the source
room's participant_session row never gets an end_time — the periodic
duration scrape only emits one once disconnectedAt is set, and a move
doesn't transition the participant to DISCONNECTED. Report end_time
immediately before the reset so the row is closed out cleanly.
2026-06-03 21:35:39 -07:00
cnderrauber 63be96f631 Prevent panic from nil(illegal) syncState.Subscriptions message (#4560) 2026-06-04 10:32:24 +08:00
Raja Subramanian 835ef1b353 Metrics for participant active, i. e. fully established. (#4557)
* Metrics for participant active, i. e. fully established.

- Egress stub for v2 API
- Fix the participant canceled counter 🤦
- Add active counter -> this is incremented when a participant becomes
  active, i. e. primary peer connection established. Can be used to
  monitor node-wise connection establishment issues.
- Add signalling validation fail counter.

With this, we have
- signalling validation fail
- signalling failed --> this is when the `startSession` fails
- signalling connected -> signalling is successful and can send back
  joinResponse to client

on media connection side
- rtc_init -> start
- rtc_connected -> participant session created (joined)
- rtc_active -> primary peer connection established
- rtc_canceled -> could not proceed with RTC connection due to not being
  able to resume.

* signalling counters deps

* revert pion/webrtc to 4.2.12 to get SCTP without interleaving

* go back to pion/webrtc 4.2.11 and sctp 1.9.5
2026-06-03 19:50:19 +05:30
cnderrauber 5bd425346c Document of advertise_internal_ip and external_ip_only (#4554) 2026-06-02 09:50:45 +08:00
cnderrauber 356ae211a3 Config documentation for advertise_internal_ip and skip_external_ip_validation (#4552)
See https://github.com/livekit/mediatransportutil/pull/88
2026-06-01 14:37:08 +08:00
shishirng 7c319a67d4 rtc: prevent duration reporting for inactive participants (#4550)
Added a check to ensure that duration is not published for participants
that never became active.
2026-05-27 14:39:04 -04:00
Paul Wells 2dd5e63207 telemetry: split webhook-processed hook out of NewTelemetryService (#4548)
* telemetry: split webhook-processed hook registration out of NewTelemetryService

NewTelemetryService used to register a notifier processed-hook on the inner
*telemetryService directly. That made it impossible for downstream wrappers
(e.g. cloud's TelemetryService that overrides Webhook to fan out to a v3
observability pipeline) to intercept webhook events without double-firing
the legacy emission.

Lift the registration into a new exported helper RegisterWebhookHook, and
have the standalone server's wire provider createTelemetryService call it
right after construction so behavior is unchanged for callers that don't
wrap the service.
2026-05-27 09:40:55 -07:00
Paul Wells 222177a9e4 service: prevent nil deref in validate with wrapped join request (#4547)
When a client hits /rtc/v[01]/validate with a base64 WrappedJoinRequest
whose embedded JoinRequest.ClientInfo is unset, validateInternal called
AugmentClientInfo with a nil *ClientInfo and panicked at ci.Address =
GetClientIP(req). The non-wrapped branch already allocates via
ParseClientInfo; do the same here so pi.Client always gets at least the
resolved client Address.
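
A minimal sketch of the guard described above, assuming simplified stand-in types. `ClientInfo`, `JoinRequest` and `ensureClientInfo` here are hypothetical and only mirror the shape of the fix, not the real validateInternal code.

```go
package main

import "fmt"

// ClientInfo and JoinRequest are simplified stand-ins for the protocol types.
type ClientInfo struct {
	Address string
}

type JoinRequest struct {
	ClientInfo *ClientInfo // may be unset in a wrapped join request
}

// ensureClientInfo (hypothetical) allocates a ClientInfo when the request
// carries none, so that setting Address can never dereference nil.
func ensureClientInfo(jr *JoinRequest, clientIP string) *ClientInfo {
	ci := jr.ClientInfo
	if ci == nil {
		ci = &ClientInfo{}
	}
	ci.Address = clientIP
	return ci
}

func main() {
	// A join request whose embedded ClientInfo is unset.
	ci := ensureClientInfo(&JoinRequest{}, "203.0.113.7")
	fmt.Println(ci.Address)
}
```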
2026-05-26 08:34:15 -07:00
Raja Subramanian dd7580b454 Protect against nil clientInfo (#4546) 2026-05-26 20:32:11 +05:30
Ninad Pundalik 145689e627 Start tracking Twirp method request latency in prometheus too, not just in logs (#4545)
* Start tracking Twirp method request latency in prometheus too, not just datadog
* Simplify latency tracking, do it in the logger itself
2026-05-26 14:53:16 +05:30
Paul Wells cde8962709 rtc: emit per-data-track bytes via BytesTrackStats (#4540)
Data tracks (the new _data_track datachannel) previously only updated a
private dataTrackStats that logged a single summary at Close. Bytes never
reached the OnTrackStats -> TelemetryService.TrackStats pipeline that
media tracks and signal channels feed.

Wire DataTrack (UPSTREAM, publisher-home) and DataDownTrack (DOWNSTREAM,
per-subscriber) into BytesTrackStats on the same 5s cadence, mirroring
the media-track convention: subscriber's country and ID with publisher's
track ID for DOWNSTREAM. Cross-region proxy DataTracks leave the stats
pointer nil (no publisher reporter on that node, and relayed bytes would
double-count). Legacy dataTrackStats packet-loss/frame counters are
preserved.
2026-05-23 17:42:55 -07:00
Raja Subramanian 2e22911dcd Remove backwards compatibility support for TURN auth. (#4539)
This was indicated in release v1.12.0 - https://github.com/livekit/livekit/releases/tag/v1.12.0
2026-05-22 17:00:42 +05:30
Raja Subramanian 062d12197f Use NACKQuueInterface type. (#4538)
And some extra logging for subscription permission when it fails.
2026-05-21 23:00:51 +05:30
Paul Wells 7f08b04c1e Add IsIntentionalDisconnect helper (#4537)
Shared helper for callers that need to distinguish intentional/expected
participant closures (client leave, admin action, room teardown, migration)
from connection failures. Extracted from cloud's IsClosedIntentionally
switch so cloud-side code paths can share a single source of truth.
2026-05-20 11:42:51 -07:00
Raja Subramanian 1ab2bf043b Clean up packet size logging (#4536)
Reverting
- https://github.com/livekit/livekit/pull/4521
- https://github.com/livekit/livekit/pull/4525

There are TWCC feedback packets that are larger than MTU. Seems to
happen under a couple of conditions
1. Bad client data, i. e. severely out-of-order packets, bad sequence
   numbers, etc.
2. On an ICE restart - this is rare, but it seemed to be a flaky network
   with some packets arriving and some not, causing a lot of gaps.

In either case, not much to do. If fragmentation/re-assembly back to
publisher works, the feedback will make it through. If not, feedbacks
will be missed and clients have to work with some missing data which is
not unexpected and the protocol is designed to handle.

However, filed pion/interceptor issue just in case - https://github.com/pion/interceptor/issues/416
2026-05-20 23:58:05 +05:30
cnderrauber 8ab92a80f6 Don't require media sections when joining (#4535)
* Don't require media sections when joining

Clients other than browsers (rust/libwebrtc is a known case) can fail
to fire the ontrack event when an extra media section is reused to
subscribe to a track, so disable this feature on the server side and let
the client determine whether extra media sections are needed.

* lint
2026-05-20 13:28:51 +08:00
Paul Wells 019a6640ae rtc: report participant kind code and details (#4534)
* rtc: report participant kind code and details

Plumb ParticipantKind and KindDetails through MediaTrack and
BytesTrackStats so track-level reporting can record the numeric kind
code plus details codes on every participant_session aggregation,
alongside the existing Kind string. Also picks up the new kind fields
on resolved BytesSignalStats participants.

Adds deployment/agentID/version to the agent worker logger.
2026-05-18 23:20:52 -07:00
He Chen 77595d387a TEL-336: fix sip error categorization (#4528) 2026-05-18 15:44:44 -07:00
cnderrauber f303f499ef Always enable rtx codec (#4533)
The SFU falls back to retransmitting packets on the media stream SSRC if
RTX is not negotiated (the client doesn't support it), so RTX should not
be disabled explicitly (via codec config).

Fix #4519
2026-05-18 15:51:10 +08:00
Raja Subramanian e4a8a55c4b Check Less and LessEq in version compare. (#4532)
* Check Less and LessEq in version compare.

Thank you @cnderrauber for catching this.

* add test
2026-05-18 12:38:49 +05:30
Raja Subramanian 37eb7a3276 Release v1.12.0 (#4529)
* Release v1.12.0

Please read the note in the release about TURN related changes and let
me know if it is clear enough that projects should update and prepare
for backwards compatibility removal in the next release.

* space
v1.12.0
2026-05-16 22:11:24 +05:30
Raja Subramanian 4a7b1e8587 Create NACK tracker only once. (#4527)
Not a major issue, but just avoiding duplicate creation of NACK module.
RTCP feedback of `nack` and `nack pli` both end up getting treated as
`nack`, so the module was being created twice.
2026-05-15 12:45:51 +05:30
cnderrauber 89faaeba82 Apply ttl check only when authenticate allocation creating (#4526)
* Apply ttl check only when authenticate allocation creating

The TTL check added in security enhancement #4505 could reject
allocation/permission refreshes, causing long-lived sessions to
disconnect once the TURN credential expired.
Only check the TTL on allocation creation to prevent abuse of leaked
credentials while keeping long-lived sessions working.
2026-05-15 14:55:05 +08:00
Raja Subramanian b32933b0d4 Log details of RTCP packets. (#4525)
* Log details of RTCP packets.

Seeing large (> MTU) packets on publisher peer connection RTCP. The
four types there are
- RTCP Receiver Reports
- NACK
- TWCC
- PLI

Can't think of what would be blowing up in size.

RTCP Receiver Report and PLI are fixed in size

NACKs vary, but the limit is 100 NACKs which should fit in 400 bytes
even if all of them are spread apart in the sequence number space.
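
The NACK size estimate can be checked with a little arithmetic. The sketch below assumes the RFC 4585 generic NACK layout: a 12-byte feedback header followed by 4-byte PID/BLP pairs, where one pair covers a packet ID plus the 16 sequence numbers after it. `nackPairs` and `nackPacketSize` are hypothetical helper names, and sequence-number wrap-around is ignored for simplicity.

```go
package main

import "fmt"

const (
	feedbackHeaderBytes = 12 // RTCP header + sender SSRC + media SSRC
	nackPairBytes       = 4  // 16-bit PID + 16-bit BLP bitmask
)

// nackPairs counts the PID/BLP pairs needed for an ascending list of lost
// sequence numbers. Each pair covers its PID and the following 16 numbers.
func nackPairs(lost []uint16) int {
	pairs := 0
	for i := 0; i < len(lost); {
		pid := lost[i]
		pairs++
		i++
		for i < len(lost) && lost[i]-pid <= 16 {
			i++
		}
	}
	return pairs
}

// nackPacketSize returns the size in bytes of one generic NACK packet.
func nackPacketSize(lost []uint16) int {
	return feedbackHeaderBytes + nackPairBytes*nackPairs(lost)
}

func main() {
	spread := make([]uint16, 100)     // worst case: every loss needs its own pair
	contiguous := make([]uint16, 100) // best case: losses pack into few pairs
	for i := range spread {
		spread[i] = uint16(i * 20)
		contiguous[i] = uint16(i)
	}
	fmt.Println(nackPacketSize(spread))     // 412
	fmt.Println(nackPacketSize(contiguous)) // 36
}
```

Under these assumptions the worst case is 400 bytes of NACK pairs plus the 12-byte header, still far below a typical MTU.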

TWCC varies, but a feedback packet is sent every 100ms or when it holds
100 packets. So, that also should not be too big.

Logging packet details to understand this better.

* revert debug
2026-05-14 18:55:00 +05:30
Denys Smirnov 8b79ec9e47 Support SIP auth realm for inbound. (#4522) 2026-05-14 10:45:16 +02:00
Raja Subramanian 4b8db3cfe5 Add integration test for TURN auth failures (#4524)
* Add integration test for TURN auth failures

Covers four credential-corruption scenarios against the TURN server
embedded in a single-node server: unparseable username, wrong password,
expired username, and unknown API key. Each case drives a raw pion
turn.Client Allocate and asserts the server rejects with a TURN error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* TURN auth test: cover password expiry binding

The TURN password's hash includes the expiry along with the secret and
participant ID. Add two cases that exercise this binding: a password
generated for a different expiry than the username's, and a password
generated without any expiry component paired with a username that has
one.
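
Why these two cases must fail can be illustrated with a toy derivation. This sketch is an assumption throughout: the real hash construction, field order and encoding are not stated in this message, and `turnPassword` is a hypothetical function that simply hashes the secret, participant ID and expiry together with SHA-256.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// turnPassword is a hypothetical derivation, NOT the server's actual scheme.
// It binds the password to the expiry by including it in the hashed material;
// an expiry of 0 stands for a password generated without an expiry component.
func turnPassword(secret, participantID string, expiry int64) string {
	material := secret + "|" + participantID
	if expiry != 0 {
		material += "|" + strconv.FormatInt(expiry, 10)
	}
	sum := sha256.Sum256([]byte(material))
	return hex.EncodeToString(sum[:])
}

func main() {
	const usernameExpiry = 1780000000 // expiry carried in the username (made-up value)
	expected := turnPassword("secret", "PA_123", usernameExpiry)

	// Password generated for a different expiry than the username's.
	fmt.Println(turnPassword("secret", "PA_123", usernameExpiry+60) == expected) // false
	// Password generated without any expiry component.
	fmt.Println(turnPassword("secret", "PA_123", 0) == expected) // false
	// Matching expiry authenticates.
	fmt.Println(turnPassword("secret", "PA_123", usernameExpiry) == expected) // true
}
```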

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 10:48:27 +05:30
Raja Subramanian ef2e5efe14 Log large packets receive/send. (#4521)
* Log large packets receive/send.

Seeing cases of servers reporting need for segmentation/re-assembly of
packets. So, logging packet receive/send for RTP/RTCP to check if
anything is seeing more than 1400 byte packets.

* log downtrack RTCP too
2026-05-13 16:04:53 +05:30
networkException d123675008 feat: auto create rooms for tokens with the RoomCreate grant (#4320)
This patch updates the check for auto creating rooms to also
consider the RoomCreate grant per token instead of just the
global config option.

With this patch, applications can decide on their own whether
users or which users can auto create rooms. This allows
applications that rely on auto creation (saving an API call)
to co-exist with those who might want to mint tokens for
subscribe-only users.

Specifically LaSuite Meet relies on the auto create behavior,
however enabling the global config option would make a
MatrixRTC deployment vulnerable to abuse, as users on remote
homeservers get tokens in order to subscribe.
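
The updated check amounts to an OR of the global option and the per-token grant. A minimal sketch, assuming simplified stand-in types: `RoomConfig`, `VideoGrant` and `shouldAutoCreate` here are hypothetical and only mirror the names used in this message.

```go
package main

import "fmt"

// RoomConfig and VideoGrant are simplified stand-ins for the real types.
type RoomConfig struct {
	AutoCreate bool // global config option
}

type VideoGrant struct {
	RoomCreate bool // per-token grant
}

// shouldAutoCreate (hypothetical) allows room auto-creation when either the
// global option is on or the joining token carries the RoomCreate grant.
func shouldAutoCreate(cfg RoomConfig, grant VideoGrant) bool {
	return cfg.AutoCreate || grant.RoomCreate
}

func main() {
	cfg := RoomConfig{AutoCreate: false} // global auto-create disabled

	fmt.Println(shouldAutoCreate(cfg, VideoGrant{RoomCreate: true}))  // true: token may create
	fmt.Println(shouldAutoCreate(cfg, VideoGrant{RoomCreate: false})) // false: subscribe-only token
}
```

Leaving the global option off and granting RoomCreate only to trusted tokens keeps subscribe-only users from creating rooms.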
2026-05-13 11:25:08 +05:30
Théo Monnom 7a3e595bde apply room tags from JWT grant room configuration (#4518) 2026-05-12 21:21:42 -07:00
Paul Wells ab7fdeab7c add AssignmentHook to AssignJob; propagate websocket write errors (#4516)
* add AssignmentHook to AssignJob; propagate websocket write errors

- Replace the `url *string` parameter on `Worker.AssignJob` with a
  middleware-style `AssignmentHook` so callers can intercept the
  `JobAssignment` send (e.g. to set Url, or to gate hedged attempts so
  only one assignment is written).
- Remove the `sendRequest` helper. Inline `WriteServerMessage` and
  propagate the error: `AssignJob` returns immediately on a failed
  availability or assignment write, leaving the job out of
  `runningJobs`; `TerminateJob` still updates local bookkeeping when
  the wire write fails but surfaces the write error to the caller.

* tidy
2026-05-10 21:14:02 -07:00
Raja Subramanian cf20c9cd05 Add expiry to TURN password. (#4515)
* Add expiry to TURN password.

Defaults to 5m. For backwards compatibility expiry = 0 skips adding it.

* fix variable shadowing
2026-05-09 12:15:01 +05:30
Raja Subramanian 20d4a3a168 Populate data track loggers with context (#4514) 2026-05-09 10:14:48 +05:30
Paul Wells 12fff29a12 allow setting agent job assignment url (#4512) 2026-05-07 13:13:21 -07:00
Denys Smirnov ba366fc712 Fix SIP media config upgrade. (#4511) 2026-05-07 10:12:45 +02:00
Paul Wells 8fbc5adfce update protocol for protojson (#4510) 2026-05-07 00:55:00 -07:00
Raja Subramanian 3de6f517e5 Add TURN permission handler. (#4505)
* Add TURN permission handler.

- Turn off permissions to private/link local/multicast and internal IPs
- Add a list of additional CIDRs to deny permission to.

* unused

* add config for allowing private IPs, used in testing

* add a TTL to user name and use it to auth

* allow list for restricted peer CIDRs
2026-05-06 23:43:11 +05:30