Commit Graph

3015 Commits

Author SHA1 Message Date
Paul Wells
2a04bc3ca8 fix publisher frame count reporting for simulcast streams (#4457) 2026-04-16 11:08:33 -07:00
Anunay Maheshwari
1d804737f9 fix: limit join request and WHIP request body to http.DefaultMaxHeaderBytes (#4450)
* fix: CS-1665

* cleanup

* cleanup and tests

* updates
2026-04-16 01:12:33 +05:30
Raja Subramanian
3cfb71e7ca Use Muted in TrackInfo to propagate published track mute. (#4453)
* Use Muted in TrackInfo to propagate published track mute.

When the track is muted as a receiver is created, the receiver
potentially was not getting the muted property. That would result in
quality scorer expecting packets.

Use TrackInfo consistently for mute and apply the mute on start up of a
receiver.

* update mute of subscriptions
2026-04-16 01:03:40 +05:30
Raja Subramanian
69aa94797b Some drive-by clean up (#4452) 2026-04-15 12:23:33 +05:30
Raja Subramanian
6c81f67858 Add subscriber stream start event notification (#4449) 2026-04-14 22:08:31 +05:30
cnderrauber
ce1bf47b5c Revert "fix: ensure num_participants is accurate in webhook events (#4265) (#…" (#4448)
This reverts commit cdb0769c38.
2026-04-13 22:21:22 +08:00
Onyeka Obi
cdb0769c38 fix: ensure num_participants is accurate in webhook events (#4265) (#4422)
* fix: ensure num_participants is accurate in webhook events (#4265)

  Three fixes for stale/incorrect num_participants in webhook payloads:

  1. Move participant map insertion before MarkDirty in join path so
     updateProto() counts the new participant.
  2. Use fresh room.ToProto() for participant_joined webhook instead of
     a stale snapshot captured at session start.
  3. Remove direct NumParticipants-- in leave path (inconsistent with
     updateProto's IsDependent check), force immediate proto update,
     and wait for completion before triggering onClose callbacks.

* fix: use ToProtoConsistent for webhook events instead of forcing immediate updates
2026-04-13 09:26:14 +08:00
Raja Subramanian
c91e79af35 Switch to stdlib maps, slices (#4445)
* Switch to stdlib maps, slices

* slices
2026-04-13 00:11:48 +05:30
renovate[bot]
97378368dd Update go deps (major) (#3179)
* Update go deps

Generated by renovateBot

* update api usage

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: David Zhao <dz@livekit.io>
2026-04-11 14:28:33 -07:00
David Zhao
4b3856125c chore: pin GH commits and switch to golangci-lint (#4444)
* chore: pin GH commits

* switch to golangci-lint-action

* fix lint issues
2026-04-11 13:04:22 -07:00
Raja Subramanian
2974ba879f Unsubscribe from data track on close (#4443)
* Unsubscribe from data track on close

* clean up
2026-04-10 15:29:25 +05:30
Paul Wells
88c77dc666 compute agent dispatch affinity from target load (#4442)
* compute agent dispatch affinity from target load

* fix test config
2026-04-09 13:49:43 -07:00
Raja Subramanian
8fe9937770 Log join duration. (#4433)
* Log join duration.

Also revert the "unresolved" init. Defeated the purpose of log resolver
as it was resolving with those values even if not forced. Instead set it
to "unresolved" if not set when forced.

Join duration is not reset if resolver is reset as that happens on
moving a participant and there is no new join duration in that case.

* explode
2026-04-05 14:01:43 +05:30
Raja Subramanian
0a503a57f6 Add Close method for UpDataTrackManager and call it on participant (#4432)
* Add `Close` method for UpDataTrackManager and call it on participant
close.

* include out-of-order packets in total packets
2026-04-04 17:09:02 +05:30
Raja Subramanian
55912dff7e Add some simple data track stats (#4431) 2026-04-04 15:23:49 +05:30
Raja Subramanian
050909e627 Enable data tracks by default. (#4429) 2026-04-04 00:54:48 +05:30
David Zhao
72c7e65c25 chore: log API key during worker registration (#4428) 2026-04-03 09:48:42 -07:00
Raja Subramanian
8a67dd1b9f Do not close publisher peer connection to aid migration. (#4427) 2026-04-03 21:50:59 +05:30
Raja Subramanian
91e90c1020 Add some more logging around migration. (#4426)
Some e2e tests are failing because subscriptions happen late and the
expected order of m-lines is different. Not a hard failure, but logging
more to make this easier to see.
2026-04-03 13:07:32 +05:30
Raja Subramanian
c6ddc879e7 isExpectedToResume is based on whether flushing or not. (#4425)
For a participant migrating out, the track could be resumed on a
different node, but ending on the migrating out node. So, `flush` should
be used to indicate if track is going to be resumed.
2026-04-03 00:49:12 +05:30
Raja Subramanian
7d06cfca8b Keep subscription synchronous when publisher is expected to resume. (#4424)
Subscription can switch between remote track and local track or
vice-versa. When that happens, closing the subscribed track of one or
the other asynchronously means the re-subscribe could race with
subscribed track closing.

Keeping the case of `isExpectedToResume` sync to prevent the race.

Would be good to support multiple subscribed tracks per subscription.
So, when subscribed track closes, subscription manager can check and
close the correct subscribed track. But, it gets complex to clearly
determine if a subscription is pending or not and other events. So,
keeping it sync.
2026-04-02 19:54:14 +05:30
Raja Subramanian
934f8598e2 Clean up data track observers on unsubscribe. (#4421)
Media track clean up fixed some leaks. There are more when the
participants thrash. This is not the issue, but doing this to match
media tracks.
2026-04-02 11:55:46 +05:30
Raja Subramanian
9674ac48ab Cleaning up some logs and standardising log frequency. (#4420)
Removing some logs which have not been useful in terms of insights other
than saying that there are a bunch of packets missing. Going to start
looking at gaps in terms of time if the inter-packet gap is too high.

Also, logging these events for the first 20 occurrences and then every 200.
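
The "first 20 and then every 200" rule can be sketched as a small counter. This is an illustration only: the type and method names are hypothetical, and the sketch is not safe for concurrent use.

```go
package main

import "fmt"

// sampledCounter decides whether the current occurrence of an event
// should be logged. Hypothetical type for illustration.
type sampledCounter struct {
	first uint64 // always log this many initial occurrences
	every uint64 // afterwards log every `every`-th occurrence (must be non-zero)
	count uint64
}

// shouldLog counts one occurrence and reports whether it should be logged.
func (s *sampledCounter) shouldLog() bool {
	s.count++
	return s.count <= s.first || s.count%s.every == 0
}

// countLogged returns how many of `total` occurrences would be logged
// with the 20/200 rule.
func countLogged(total int) int {
	s := &sampledCounter{first: 20, every: 200}
	logged := 0
	for i := 0; i < total; i++ {
		if s.shouldLog() {
			logged++
		}
	}
	return logged
}

func main() {
	// 20 initial logs plus occurrences 200, 400, 600, 800 and 1000.
	fmt.Println(countLogged(1000)) // prints 25
}
```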
2026-04-01 21:17:43 +05:30
Raja Subramanian
7b92530461 Drop time inverted packets in RED -> Opus conversion. (#4418)
A bunch of edge cases to note here:
RED packet does not have sequence number for redundant blocks. It only
has timestamp offset compared to the primary payload. The receivers are
supposed to use just timestamp to sequence the payload and decode.

But, when converting from RED -> Opus, the packets extracted from RED
packet should be assigned a sequence number before they can be
forwarded. The simple rule is, if packet N contains X redundant
payloads, they are assigned sequence number of N - X to N - 1.

However there are cases like the following sequence (with 1 packet
redundancy)
- Seq num 10, timestamp 2000, forwarded
- Seq num 11 is lost
- Seq num 12 has a redundant payload. Seq num 12 has timestamp of 4000.
  Ideally would expect the redundant payload to have a timestamp offset
  of 1000, so the redundant payload can be mapped to sequence number 11
  and timestamp 3000 (4000 - 1000). But, in the problematic case, it has
  an offset of 3000 resulting in sequence number 11 and timestamp of
  1000 causing an inversion with packet at sequence number 10.

Unclear if this is a publisher issue, i.e. packing RED wrong, or if this
is some expected behaviour with DTX, i.e. the DTX packets are not included
in redundant payload. For example, the sequence
- Seq num 10 -> DTX
- Seq num 11 -> DTX -> lost
- Seq num 12 -> Regular packet and include sequence num 9 as that is the
  last regular packet.

Anyhow, detect this condition and drop the time inverted packet.

Note however this handles only inversion against the highest sent packet
sequence number and timestamp. So, some old packet inverted with some
other old packet getting forwarded will get through. That has been the
case always though and detecting that would be expensive and
complicated.

At least for egress, will also look at adding a check for inversion so
that it can catch it before sending it down the gstreamer pipeline. As
the egress uses a jitter buffer with ordered sequence number emits, it
will be simpler to detect timestamp going back when sequence number is
moving forward (of course the mute/dtx challenge is there).
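
The numbering rule and the problematic case above can be worked through in a short sketch. It assumes redundant blocks are listed oldest first and carry only a timestamp offset. All type and function names are hypothetical and this is not the code from the PR.

```go
package main

import "fmt"

// redundantBlock is a simplified view of one redundant block in a RED
// packet: it has a timestamp offset but no sequence number.
type redundantBlock struct {
	tsOffset uint32
}

// opusPacket is an extracted payload with its assigned sequence number.
type opusPacket struct {
	seq uint16
	ts  uint32
}

// expand assigns sequence numbers N-X .. N-1 to the X redundant blocks
// of RED packet N and derives each timestamp from the block's offset.
func expand(seq uint16, ts uint32, blocks []redundantBlock) []opusPacket {
	out := make([]opusPacket, 0, len(blocks))
	x := uint16(len(blocks))
	for i, b := range blocks {
		out = append(out, opusPacket{seq: seq - x + uint16(i), ts: ts - b.tsOffset})
	}
	return out
}

// Wraparound-safe "a is newer than b" comparisons.
func newer16(a, b uint16) bool { return int16(a-b) > 0 }
func newer32(a, b uint32) bool { return int32(a-b) > 0 }

// filter compares extracted packets against the highest sent sequence
// number and timestamp. Packets that are not newer in sequence are
// skipped as already forwarded; packets that are newer in sequence but
// not in timestamp are time inverted and dropped.
func filter(highestSeq uint16, highestTS uint32, pkts []opusPacket) (forward, dropped []opusPacket) {
	for _, p := range pkts {
		if !newer16(p.seq, highestSeq) {
			continue
		}
		if !newer32(p.ts, highestTS) {
			dropped = append(dropped, p)
			continue
		}
		forward = append(forward, p)
		highestSeq, highestTS = p.seq, p.ts
	}
	return forward, dropped
}

func main() {
	// Seq 10 / timestamp 2000 was forwarded, seq 11 was lost, seq 12 arrives.
	good := expand(12, 4000, []redundantBlock{{tsOffset: 1000}})
	f, d := filter(10, 2000, good)
	fmt.Println(len(f), len(d)) // recovered as seq 11, timestamp 3000

	bad := expand(12, 4000, []redundantBlock{{tsOffset: 3000}})
	f, d = filter(10, 2000, bad)
	fmt.Println(len(f), len(d)) // seq 11, timestamp 1000 is dropped
}
```

As the message notes, a check like this only catches inversion against the highest sent packet, not between two older packets.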
2026-04-01 11:40:01 +05:30
Paul Wells
4d8d232a19 ensure participant init is correctly serialized for logging (#4417) 2026-03-31 19:33:57 -07:00
Raja Subramanian
4fe80877df Log time inversion between incoming packets (#4415)
* Log time inversion between incoming packets

Log of timestamp inversion within a red packet did not show anything.
Log across packets. Not dropping till there is more evidence of the
cause.

* save

* comment
2026-03-31 20:09:07 +05:30
Raja Subramanian
248d73948d Guard against timestamp inversion in RED -> Opus conversion. (#4414)
* Guard against timestamp inversion in RED -> Opus conversion.

Seeing timestamp inversion (sequence number is +1, but timestamp is
-960, i.e. 20ms) in the RED -> Opus conversion path. Not able to spot
any bugs in code. So, logging details upon detection and also dropping
the packet. If not dropped, downstream components like Egress treat it
as a big timestamp jump (because sequence number is moving forward) and
try to adjust pts which ends up causing drops.

* do not log time reversal at the start

* typo
2026-03-31 17:08:13 +05:30
Paul Wells
9ab8c1d522 clear track notifier observers on subscription teardown (#4413)
When a subscriber disconnects, observer closures registered on the
publisher's TrackChangedNotifier and TrackRemovedNotifier were never
removed. These closures capture the SubscriptionManager, which holds
the ParticipantImpl, preventing the entire participant object graph
(PCTransport, SDPs, RTP stats, DownTracks) from being garbage collected.

In rooms with many participants that disconnect and reconnect frequently,
this causes unbounded memory growth proportional to the number of
disconnect events. The leaked memory is not recoverable while the room
remains open.

Clear notifiers in both handleSubscribedTrackClose (individual
subscription teardown) and SubscriptionManager.Close (full participant
teardown), matching the existing cleanup in handleSourceTrackRemoved.
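
The leak pattern can be illustrated with a simplified model. The notifier and subscriptionManager types below are hypothetical stand-ins, not the LiveKit types. The sketch only shows why a registered closure keeps its captured object reachable until the observer is removed, and it is not safe for concurrent use.

```go
package main

import "fmt"

// notifier is a hypothetical stand-in for a publisher-side notifier
// that stores observer callbacks keyed by subscriber ID.
type notifier struct {
	observers map[string]func()
}

func newNotifier() *notifier { return &notifier{observers: make(map[string]func())} }

func (n *notifier) AddObserver(key string, fn func()) { n.observers[key] = fn }
func (n *notifier) RemoveObserver(key string)         { delete(n.observers, key) }
func (n *notifier) Len() int                          { return len(n.observers) }

// subscriptionManager stands in for the subscriber-side object that the
// observer closures capture.
type subscriptionManager struct {
	id       string
	changed  *notifier
	removed  *notifier
	notified int
}

// subscribe registers closures that capture m. While they stay in the
// notifier maps, m and everything it references remain reachable.
func (m *subscriptionManager) subscribe() {
	m.changed.AddObserver(m.id, func() { m.notified++ })
	m.removed.AddObserver(m.id, func() { m.notified++ })
}

// Close removes the observers so the manager can be garbage collected.
func (m *subscriptionManager) Close() {
	m.changed.RemoveObserver(m.id)
	m.removed.RemoveObserver(m.id)
}

// churn simulates subscribers connecting and disconnecting and returns
// the number of observers still registered afterwards.
func churn(cycles int, cleanup bool) int {
	changed, removed := newNotifier(), newNotifier()
	for i := 0; i < cycles; i++ {
		m := &subscriptionManager{id: fmt.Sprintf("sub-%d", i), changed: changed, removed: removed}
		m.subscribe()
		if cleanup {
			m.Close()
		}
	}
	return changed.Len() + removed.Len()
}

func main() {
	fmt.Println(churn(1000, false)) // observers accumulate with every disconnect
	fmt.Println(churn(1000, true))  // observers are cleared on teardown
}
```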

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 20:48:48 -07:00
Omar Pakker
e9b113c8f2 Make the TURN bind address configurable and allow for multiple addresses. (#4315) 2026-03-30 14:46:10 +08:00
Raja Subramanian
4bc5e6bbef Address malformed H264/H265 parsing issues. (#4407)
* Address malformed H264/H265 parsing issues.

Thank you for the report in
https://github.com/livekit/livekit/security/advisories/GHSA-qxj9-fmqx-r7j8#advisory-comment-179701
with examples. Addressing the parsing issues.

* early continue
2026-03-30 09:30:58 +05:30
Raja Subramanian
77a0a4fcc7 AV1 parser overflow fix. (#4405)
Upstream had patched this a while back -
c0b755f82f.

Addresses https://github.com/livekit/livekit/security/advisories/GHSA-qxj9-fmqx-r7j8
2026-03-29 09:53:15 +05:30
Anunay Maheshwari
ff7fd7ed56 feat(agent-dispatch): add job restart policy (#4401)
* feat(agent-dispatch): add job restart policy

* deps
2026-03-27 21:32:04 +05:30
Raja Subramanian
34bd1e0851 do not log roll over for padding only packets (#4396)
* do not log roll over for padding only packets

* calculate expected earlier
2026-03-26 11:47:29 +05:30
Paul Wells
13d02ee9a8 add deadline to dtls connect context (#4395) 2026-03-25 21:13:23 -07:00
Raja Subramanian
9055a34981 Path check helpers (#4392)
* Path check helpers

* remove trailing slash
2026-03-25 10:53:07 +05:30
cnderrauber
1f1eeb6832 Fallback to servicestore if rpc is unavailable (#4391)
* Fallback to servicestore if rpc is unavailable

compatibility mode for #4387

* conf
2026-03-25 11:09:52 +08:00
Raja Subramanian
59e9bb41b9 Fix TURN server URL (#4389)
Addresses https://github.com/livekit/livekit/issues/4384
2026-03-24 15:52:05 +05:30
Raja Subramanian
9e0a7e545f Close both peer connections to aid migration. (#4382)
* Close both peer connections to aid migration.

In single peer connection case, that would close publisher peer
connection.

@cnderrauber I don't remember why we only closed subscriber peer
connection. I am thinking it is okay to close both (or the publisher
peer connection in single peer connection mode). Please let me know if I
am missing something.

* log change only
2026-03-24 14:19:46 +05:30
cnderrauber
9474c807c0 route participant reads through PSRPC instead of Redis (#4387)
rel: #4373
2026-03-24 16:25:11 +08:00
David Chen
a5333a86bb add packet trailer stripping support (#4361)
* bump protocol version to 17 to enable packet trailer stripping functionality
* check subscriber protocol version for trailer stripping
2026-03-23 13:33:42 -07:00
Charlie Tonneslan
8cdd6f4cc7 Replace deprecated io/ioutil with io in whipservice (#4375)
ioutil.ReadAll has been deprecated since Go 1.16 in favor of io.ReadAll.
This was the last remaining io/ioutil usage in the codebase.
2026-03-21 10:01:30 -07:00
Théo Monnom
89410df74c handle AGENT_ERROR disconnect reason (#4339) 2026-03-17 23:00:16 -07:00
Raja Subramanian
8f984c770a Fix repair stream ID reporting for RTX pairing. (#4369)
If RTX stream got a packet before primary stream, the pairing was not
getting set up properly.
2026-03-17 15:06:57 +05:30
Raja Subramanian
cdfaacfca3 Restart nacker on OOB sequence number restart. (#4368)
Clears NACKs based on old sequence number base and restarts it.
Also rename the function to be more reflective of what stats lite is
for.
2026-03-17 09:16:08 +05:30
Raja Subramanian
750d5904f0 Add API to restart lite stats. (#4366) 2026-03-16 15:20:13 +05:30
Paul Wells
c8bb2578be Rename log field "pID" to "participantID" for consistency (#4365)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 04:32:02 -07:00
Raja Subramanian
77fc74a727 Do not block all ext ID determination on stream allocator listener (#4364)
availability.

Added checks for unexpected changes.
2026-03-15 14:30:47 +05:30
Raja Subramanian
90a46fabb1 Do not kick off migration of closed participant (#4363) 2026-03-15 10:39:55 +05:30
Raja Subramanian
5dc2e7b180 Switch data track extension to 1-byte ID/length. (#4362)
And match design to RTP header extension, i.e. the padding for
extensions is no longer applied per extension (which was the case
before); this PR changes it to pad the aggregate of all extensions.
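
The size difference between the two padding schemes can be shown with simple arithmetic. The commit message does not give the exact wire layout, so the sketch below assumes a 2-byte header per extension (1-byte ID plus 1-byte length) and 4-byte alignment as used for RTP header extensions. Both values and all names are assumptions for illustration.

```go
package main

import "fmt"

const (
	extHeaderLen = 2 // assumed: 1-byte ID + 1-byte length
	alignment    = 4 // assumed: 32-bit alignment, as in RTP header extensions
)

// padTo rounds n up to the next multiple of align.
func padTo(n, align int) int { return (n + align - 1) / align * align }

// perExtensionSize pads every extension individually (the old scheme).
func perExtensionSize(payloadLens []int) int {
	total := 0
	for _, l := range payloadLens {
		total += padTo(extHeaderLen+l, alignment)
	}
	return total
}

// aggregateSize packs extensions back to back and pads only the total
// (the scheme described for this PR).
func aggregateSize(payloadLens []int) int {
	total := 0
	for _, l := range payloadLens {
		total += extHeaderLen + l
	}
	return padTo(total, alignment)
}

func main() {
	lens := []int{1, 3, 5}
	fmt.Println(perExtensionSize(lens)) // 4 + 8 + 8 = 20
	fmt.Println(aggregateSize(lens))    // 3 + 5 + 7 = 15, padded to 16
}
```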
2026-03-14 13:29:40 +05:30
Raja Subramanian
7323ad02b7 Sample data send error logging. (#4358)
There are cases where the data channel is potentially not created, and
logging every one of those errors is verbose.
2026-03-12 12:02:18 +05:30