Commit Graph

186 Commits

Author SHA1 Message Date
Raja Subramanian c3964ba2eb Use sync.Pool for objects in packet path. (#4066)
* Use sync.Pool for objects in packet path.

Seeing cases of forwarding latency spikes that aling with GC.

This might be a bit overkill, but using sync.Pool for small +
short-lived objects in packet path.

Before this, all these were increasing in alloc_space heap profile
samples over time. With these, there is no increase (actually the lines
corresponding to geting from pool does not even show up in heap
accounting when doing `list` in `pprof`)

* merge

* Paul feedback
2025-11-14 16:13:23 +05:30
Raja Subramanian 1dc9b8fc5c Use buffered indicator to exclude from forwarding latency. (#4062)
* Debug high forwarding latency missing.

* log highest

* log condition

* update log

* log

* log

* change log

* Track start up delay.

Digging into forwarding latency, there are a few things
1. Seems to be caused due to forwarding packets queued before bind. They
   would be in the queue till bind. There are two ways it is showing up
   a. Bind itself is delayed and releasing queued packets causes the
      high forwarding latency.
   b. There is a significant gap between bind and first packet being
      pulled off the queue to be forwarded, in one example 100ms.

(a) is understandable if the signalling delays things. Can drop these
packets without forwarding or indicate in the packet that it is a queued
packet and drop it from forwarding latency calculation. Dropping is
probably better as down stream components like egress will see a burst
in these situations.

(b) looks like go scheduling latency? Unsure.

Logging more to understand this better.

* log start

* Use buffered indicator to exclude from forwarding latency.

Buffered packets live the queue for a while before Bind releases them.
They have high(ish) queuing latency and not true representation of
forwarding latency.
2025-11-07 21:46:14 +05:30
Raja Subramanian f117ee511f Track start up delay. (#4061)
* Debug high forwarding latency missing.

* log highest

* log condition

* update log

* log

* log

* change log

* Track start up delay.

Digging into forwarding latency, there are a few things
1. Seems to be caused due to forwarding packets queued before bind. They
   would be in the queue till bind. There are two ways it is showing up
   a. Bind itself is delayed and releasing queued packets causes the
      high forwarding latency.
   b. There is a significant gap between bind and first packet being
      pulled off the queue to be forwarded, in one example 100ms.

(a) is understandable if the signalling delays things. Can drop these
packets without forwarding or indicate in the packet that it is a queued
packet and drop it from forwarding latency calculation. Dropping is
probably better as down stream components like egress will see a burst
in these situations.

(b) looks like go scheduling latency? Unsure.

Logging more to understand this better.

* log start
2025-11-07 16:55:18 +05:30
cnderrauber c264b504c4 Don't warn 0 payload type for PCMU (#4039) 2025-10-28 23:11:51 +08:00
Raja Subramanian 32fc35254e Broadcast cond var on RTX write. (#4038)
* Broadcast cond var on RTX write.

High forwarding latency logs all show high queuing delay so far. From
code inspection, RTX writes were not signaling the cond var. Not sure if
that is the reason, but adding a signal there for further tests.

* Remove return values from writeRTX as they are not used
2025-10-28 11:27:02 +05:30
Raja Subramanian a2ce73e0d0 Do not bind buffer if codec is invalid. (#4028)
Seeing cases of codec with zero clock rate. Do not bind to those.
2025-10-25 14:30:30 +05:30
Raja Subramanian b07e7a3828 Use difference in key frame counter to stop seeder. (#3936)
The key frame seeder could be started multiple times.
Use difference to detect stop condition.
2025-09-19 15:15:26 +05:30
Raja Subramanian 6489237e33 Simulcast audio fixes (#3925)
* Simulcast audio fixes

* clean up
2025-09-14 09:41:40 +05:30
cnderrauber 5026de2bea handle frame number wrap back in svc (#3885)
* handle frame number wrap back in svc

* Add Slack Notifier

* check nil dd ext

* log format
2025-08-29 17:11:49 +08:00
cnderrauber b660c3b582 Extract video size from media stream (#3856)
* Extract video size from media stream

* fix test
2025-08-18 09:06:57 +08:00
cnderrauber 8c2fc0bcd9 Fix svc encoding for chrome mobile on iOS (#3751)
The browser could send rtp packets of svc encoding without
DD extension while the sdp negotiates it, sfu detects extension
in rtp packet for this case.
2025-06-23 22:39:12 +08:00
Raja Subramanian 086704128c Limit buffer queue before Bind. (#3634)
* Limit buffer queue before Bind.

* more generic
2025-05-01 13:49:06 +05:30
Raja Subramanian f24152b4c0 Call Broadcast in lock scope. (#3625)
* Call Broadcast in lock scope.

Seems like there is a possible window where things can hang forever
if a goroutine enters the Wait) after the lock is released but before
Broadcast gets called, it will never see that broadcast and will hang forever.

* RLock
2025-04-25 12:12:10 +05:30
Raja Subramanian 8eb81388e6 Use a generation to counter to stop key frame seeder on codec change (#3531) 2025-03-18 07:41:30 +05:30
Raja Subramanian a6cb00b31e Reduce seeder duration to 30s and also do not force send PLI. (#3525)
Can use the normal PLI throttle cadence.
2025-03-13 10:41:42 +05:30
Raja Subramanian c823320528 Add a key frame seeder in up track. (#3524) 2025-03-12 22:11:27 +05:30
cnderrauber ff9115b228 Disable dd parser for vp8 if extension is not found (#3492)
Browser would not send dd extension for vp8 in some case even if
it is negotiated.
2025-03-06 17:20:00 +08:00
Raja Subramanian 9551c52c85 Try 2 to consolidate mime type (#3407)
* Normalize mime type and add utilities.

An attempt to normalize mime type and avoid string compares remembering
to do case insensitive search.

Not the best solution. Open to ideas. But, define our own mime types
(just in case Pion changes things and Pion also does not have red mime
type defined which should be easy to add though) and tried to use it everywhere.
But, as we get a bunch of callbacks and info from Pion, needed conversion in
more places than I anticipated. And also makes it necessary to carry
that cognitive load of what comes from Pion and needing to process it
properly.

* more locations

* test

* Paul feedback

* MimeType type

* more consolidation

* Remove unused

* test

* test

* mime type as int

* use string method

* Pass error details and timeouts. (#3402)

* go mod tidy (#3408)

* Rename CHANGELOG to CHANGELOG.md (#3391)

Enables markdown features in this otherwise already markdown'ish formatted document

* Update config.go to properly process bool env vars (#3382)

Fixes issue https://github.com/livekit/livekit/issues/3381

* fix(deps): update go deps (#3341)

Generated by renovateBot

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>

* Use a Twirp server hook to send API call details to telemetry. (#3401)

* Use a Twirp server hook to send API call details to telemetry.

* mage generate and clean up

* Add project_id

* deps

* - Redact requests
- Do not store responses
- Extract top level fields room_name, room_id, participant_identity,
  participant_id, track_id as appropriate
- Store status as int

* deps

* Update pkg/sfu/mime/mimetype.go

* Fix prefer codec test

* handle down track mime changes

---------

Co-authored-by: Denys Smirnov <dennwc@pm.me>
Co-authored-by: Philzen <Philzen@users.noreply.github.com>
Co-authored-by: Pablo Fuente Pérez <pablofuenteperez@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Paul Wells <paulwe@gmail.com>
Co-authored-by: cnderrauber <zengjie9004@gmail.com>
2025-02-10 10:44:15 +05:30
cnderrauber aeec75edeb H265 supoort and codec regression (#3358)
* H265 supoort and codec regression

Support H265 codec.
Add optional codec regression for subscribers don't
support advanced codecs like H265, AV1, VP9.

* restart forwarder on upstream codec change

* tests

* Reneogitate new codec if client doesn't support change

* Add option to disable codec regression

---------

Co-authored-by: boks1971 <raja.gobi@tutanota.com>
2025-02-06 11:56:49 +08:00
Raja Subramanian 7c58fdf329 move unrolled mime type check for broader use (#3326)
* move unrolled mime type check for broader use

* Use in IsSvcCodec and make MimeType exported

* test

* tidy branches

* tidy

---------

Co-authored-by: Paul Wells <paulwe@gmail.com>
2025-01-13 10:24:03 +05:30
cnderrauber 384e21abc0 vp8 temporal layer selection with dependency descriptor (#3302)
* vp8 with dd

* make temporal layer selection work with DD

* fix test

---------

Co-authored-by: boks1971 <raja.gobi@tutanota.com>
2025-01-03 21:26:03 +08:00
Raja Subramanian c8b644934f Update deque and friends. (#3276) 2024-12-20 07:16:14 +05:30
cnderrauber 54f9f7de51 upgrade to pion/webrtc v4 (#3213) 2024-11-28 16:05:38 +08:00
Raja Subramanian ceb8a70696 Use same components when logger is updated (#3166)
Logger in buffer can get updated when the layer is known. Use the same
components used in destructor.
2024-11-11 11:38:48 +05:30
cnderrauber ca77df8212 warn for multiple dd ext (#3135)
* warn for multiple dd ext

* unused
2024-10-24 16:59:24 +08:00
Raja Subramanian 40b10af960 Use monotonic time util. (#3112)
Thank you @paulwe for doing this. I was promising to do this for a
while, but just like other times, empty promises :-(
2024-10-17 10:49:24 +05:30
Raja Subramanian 2491ee7c7c Make lite version of RTPStatsReceiver called RTPStatsReceiverLite. (#3065)
* Make lite version of RTPStatsReceiver called RTPStatsReceiverLite.

Refactor around that.

Will probably make some more flavors to have lighter versions still.

* update deps

* use MarshalLogArray

* use util
2024-10-05 10:50:25 +05:30
Raja Subramanian 8ac33a868c Splitting out rtp stats stuff into its own package. (#3060)
* Splitting out rtp stats stuff into its own package.

Going to be making some lighter versions of these.
Will be cleaner to have all of these grouped together.
So, as a first step, just making a package for it.

* tests
2024-10-03 15:51:24 +05:30
Raja Subramanian b678ccdd66 Cache RTCP sender report in forwarder state. (#2994)
* Cache RTCP sender report in forwarder state.

To be used in migration.

TODO: need to check more places to operate pure in unix nano rather than
converting.

* match name
2024-09-10 20:50:50 +05:30
Raja Subramanian 08b8ef56de Use monotonic clock in packet path. (#2940)
Set up a base time when starting a receiver and use that clock as base
for other packet times to ensure that clock is monotonic.
2024-08-17 23:19:27 +05:30
Raja Subramanian 7018e485f2 Do not start forwarding on an out-of-order packet. (#2917)
It is possible that old packets arrive on receiver. If subscriber starts
on that, the first packet time would be incorrect. Do not start
forwarding on out-of-order packets.
2024-08-08 23:15:04 +05:30
Raja Subramanian 01100650f6 Clean up packet checks. (#2910)
Still leaving the utility `ValidateRTPPacket` in helpers as it could be
useful.
2024-08-06 14:30:08 +05:30
Raja Subramanian d68dd3033d Use extended sequence number in bucket (#2895) 2024-07-30 14:21:37 +05:30
cnderrauber f6f6cca133 don't push 0 ssrc probing packets to pending queue (#2888) 2024-07-23 17:58:04 +08:00
cnderrauber 0c5b5537b2 Don't create DDParser for non-svc codec (#2883) 2024-07-19 10:52:27 +08:00
Raja Subramanian 95f4b304ef Prevent data race. (#2881)
* Prevent data race.

CI is reporting some data race warnings. Prevent that.

* prevent recursive lock

* prevent more recursive locks

* more lock dance
2024-07-18 19:53:41 +05:30
Raja Subramanian 91782b68be Recalc gap of sequence number after forcing rollover. (#2880) 2024-07-18 18:58:10 +05:30
Raja Subramanian f3d3ec1ce7 Record packet/octet count in sender report. (#2864)
Seeing cases of huge jumps in sender erport rtp time stamp
(of the order of minutes) a few hundred ms after start of track.
Only less than 20 packets have been published at that time as seen by
server. Adding these to sender report to check if client thinks it has
sent much more.
2024-07-16 07:59:27 +05:30
Raja Subramanian acbd4ea104 Handle cases of long mute/rollover of time stamp. (#2842)
* Handle cases of long mute/rollover of time stamp.

There are cases where the track is muted for long enough for timestamp
roll over to happen. There are no packets in that window (typically
there should be black frames (for video) or silence (for audio)). But,
maybe the pause based implementation of mute is causing this.

Anyhow, use time since last packet to gauge how much roll over should
have happened and use that to update time stamp. There will be really
edge cases where this could also fail (for e. g. packet time is affected
by propagation delay, so it could theoretically happen that mute/unmute
+ packet reception could happen exactly around that rollover point and
  miscalculate, but should be rare).

As this happen per packet on receive side, changing time to `UnixNano()`
to make it more efficient to check this.

* spelling

* tests

* test util

* tests
2024-07-08 11:07:20 +05:30
Raja Subramanian 39c59d913d Do not warn on padding (#2839) 2024-07-07 12:30:54 +05:30
Raja Subramanian bfb7db2d91 RTP packet validity check. (#2833)
Adding some checks before packet is forwarded to check for anomalies.
Will remove after a round of debug.
2024-07-04 12:42:25 +05:30
Raja Subramanian d4e50b633f Do not log warns on duplicate. (#2807)
With RTX, some clients use very old packets for probing. Check for
duplicate before logging warning about old packet/negative sequence
number jump.

Also, double the history so that duplicate tracking is better. Adds
about 1/2 KB per RTP stream.
2024-06-20 10:52:12 +05:30
Raja Subramanian ea60368100 Do not error out on invalid packet. (#2789)
Remove the return when encountering invalid packet.
Also, log more sparesely.
Proper error returns from util so that we can selectively drop packets
based on error type, for example SSRC mismatches are okay type of thing.
2024-06-14 11:10:57 +05:30
Raja Subramanian 129ba62d61 Validate RTP packets. (#2778)
* Validate RTP packets.

Check version, payload type (if available) and SSRC (if available)
and drop bad packets. And let repair mechanisms take effect for those
packets.

* address data race reported by test

* fix an unlock and test packets
2024-06-10 15:43:59 +05:30
Raja Subramanian 38d213ed10 Do not compare payload type before bind (#2775) 2024-06-09 01:03:38 +05:30
Raja Subramanian b58db82254 Log invalid RTP packet (#2774) 2024-06-08 10:36:05 +05:30
cnderrauber 908baeb942 initialize bucket size by publish bitrates (#2763) 2024-06-06 14:31:20 +08:00
Raja Subramanian 03bb468472 Log range map for debugging. (#2754)
* Log range map for debugging.

* log details on errors

* log details
2024-06-04 08:00:26 +05:30
Raja Subramanian 9781d30611 Do not propagate RTCP if report is not processed. (#2739) 2024-05-28 19:29:54 +05:30
Raja Subramanian 8be2005e0f More detailed logging to understand old packets. (#2730) 2024-05-25 18:34:55 +05:30