Commit Graph

1144 Commits

Author SHA1 Message Date
Raja Subramanian b91cd2e4ea Rework receiver restart. (#4202)
* Rework receiver restart.

- Protect against concurrent restarts
- Clean up and consolidate code around restarts
- Use `RestartStream` of buffer rather than creating new buffers.

* fix test
2025-12-27 17:17:16 +05:30
Raja Subramanian bb00c86489 Restart API on receiver. (#4200)
Can be used when track moves forward/backward.
2025-12-27 03:42:23 +05:30
Raja Subramanian 25ece1e911 Minor refactor in buffer base and audio level (#4198)
* Minor refactor in buffer base and audio level

- Make a function for `restartStream`. Will be useful
  when external signal needs to restart a stream. Also restart all the
  bits (audio level, dd parser and frame rate calculator)
- make an audio level mode with RTP timestamp so that some state can be
  moved out of buffer base

* clean up

* log restart
2025-12-26 20:13:38 +05:30
Raja Subramanian 599002f890 ignore PLI requests for non-video (#4196)
* ignore PLI requests for non-video

* under lock
2025-12-26 12:26:22 +05:30
Raja Subramanian 2510b9462e Taking a bunch of go modernize suggestions. (#4194)
This is not all of it as it is not possible (or at least I do not know
of a way) to get all suggestions for a repo/project. Did this via loop
searching mainly and taking the modernize suggestions.
2025-12-25 16:55:58 +05:30
Raja Subramanian ed8e6afcd7 Handle repair SSRC of simulcast tracks during migration. (#4193)
* Handle repair SSRC of simulcast tracks during migration.

* fix

* fix comment
2025-12-25 14:45:48 +05:30
Raja Subramanian c6bf7a2786 Fix logging key and other clean up around stream restart. (#4192) 2025-12-24 22:48:25 +05:30
Raja Subramanian 8b0efb8c89 Resolve RTX pair via OnTrack also. (#4190)
* Resolve RTX pair via OnTrack also.

In simulcast probing path, the interceptor chain is not invoked
for primary stream. Not sure if this is a recent change. Due to this,
the RTX pair does not get resolved.

Use the onTrack callback to resolve the pair.

* remove debug
2025-12-24 15:27:13 +05:30
Raja Subramanian 381bce03ae Return extended sequence number only and not packet. (#4189)
* Return extended sequence number only and not packet.

Callers need only the extended sequence number.
The extended packet could get released if the forwarder processes it
before the caller accesses it, causing a data race.

* grow bucket in a go routine
2025-12-24 10:44:56 +05:30
Raja Subramanian 6bcbf54ea1 Always instantiate nacker when using out-of-band sequence numbers. (#4187) 2025-12-24 04:10:11 +05:30
Raja Subramanian e71184dea0 Store buffer after creating it. (#4186)
* Refactor receiver and buffer into Base and higher layer.

To be able to share code/functionality with relay.

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* clean up

* deps

* fix test

* fix test

* Store buffer after creating it.

Also changing signature of creator function as it could call TrackInfo()
and get into a deadlock.

* fix double unlock

* add some more debug logging
2025-12-24 02:55:51 +05:30
Raja Subramanian 7c8ea11505 Refactor receiver and buffer into Base and higher layer. (#4185)
* Refactor receiver and buffer into Base and higher layer.

To be able to share code/functionality with relay.

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* WIP

* clean up

* deps

* fix test

* fix test
2025-12-23 21:35:48 +05:30
Raja Subramanian 32cd0370c7 Flush the ext packets on restart/close and release packets. (#4179) 2025-12-19 20:25:22 +05:30
Raja Subramanian fb849edc6a Minor clean up (#4172) 2025-12-18 08:46:40 +05:30
Raja Subramanian 47324abd0e Drop run away receiver reports. (#4170) 2025-12-17 21:58:47 +05:30
Paul Wells 462ec324be prevent uint overflow setting packet not found count (#4169) 2025-12-17 06:54:23 -08:00
Raja Subramanian 5c841b8ea1 Some logging changes. (#4168)
* Some logging changes.

Trying to chase a case of large sequence number gap on subscriber side
where packets are sent after a long time.

* return values instead of logging
2025-12-17 18:05:29 +05:30
Paul Wells 2f2d0a5735 skip lost sequence number ranges in getIntervalStats (#4166) 2025-12-17 00:02:51 -08:00
Raja Subramanian a26c48304a Add support for RTP stream restart. (#4161)
* Add support for RTP stream restart.

When an unhandled packet is encountered, try a restart sequence.
Restart happens when 5 packets with contiguous sequence numbers and same
or increasing time stamps are received. Note that this does not work for
B-frame type of scenarios, but that is true for receive path handling
even before this. As WebRTC does not use B-frames, it is fine. But,
needs to be looked at again if B-frames are necessary.

It is controlled by a config that is disabled by default.

* clean up

* debug log
2025-12-16 13:21:39 +05:30
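The restart condition described in a26c48304a (5 packets with contiguous sequence numbers and same or increasing time stamps) can be sketched roughly as below. This is an illustrative sketch only, not the code from the PR: `restartDetector`, `newRestartDetector` and `observe` are hypothetical names, and the wrap-around handling is an assumption about how such a check would typically be written.

```go
package main

import "fmt"

// restartDetector is a hypothetical helper (not the repository's type).
// It counts a run of packets that could justify restarting a stream.
type restartDetector struct {
	threshold int    // contiguous packets required; 5 per the commit message
	count     int    // length of the current contiguous run
	lastSN    uint16 // RTP sequence number of the previous candidate packet
	lastTS    uint32 // RTP timestamp of the previous candidate packet
}

func newRestartDetector(threshold int) *restartDetector {
	return &restartDetector{threshold: threshold}
}

// observe feeds one otherwise unhandled packet into the detector and
// reports whether the current run is long enough to trigger a restart.
func (d *restartDetector) observe(sn uint16, ts uint32) bool {
	// uint16 arithmetic wraps, so 65535 followed by 0 is contiguous.
	contiguous := d.count > 0 && sn == d.lastSN+1
	// The signed 32-bit difference treats timestamp wrap-around as forward
	// movement; zero means "same timestamp" (same frame).
	notOlder := int32(ts-d.lastTS) >= 0

	if contiguous && notOlder {
		d.count++
	} else {
		// Any gap or backward timestamp starts a new run at this packet.
		d.count = 1
	}
	d.lastSN, d.lastTS = sn, ts
	return d.count >= d.threshold
}

func main() {
	d := newRestartDetector(5)
	sns := []uint16{65533, 65534, 65535, 0, 1}
	tss := []uint32{1000, 1000, 4000, 4000, 7000}
	for i := range sns {
		fmt.Printf("sn=%d ts=%d restart=%v\n", sns[i], tss[i], d.observe(sns[i], tss[i]))
	}
}
```

Because the check requires non-decreasing time stamps, a stream with B-frames (decode order differing from presentation order) would keep resetting the run, which matches the limitation noted in the commit message.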
Raja Subramanian 97aba5e77b Consistently undo update to sequence number and timestamp when the (#4156)
incoming packet cannot be sequenced.
2025-12-13 15:46:04 +05:30
Raja Subramanian ca4b56d2d5 Handle case of sequence number jump just after start. (#4150)
It is possible that the stream stops just after start and
restarts much later introducing a large gap in sequence number.
That could look like an unhandled case because the wrap back handler
does not have enough packets yet.

Let other checks based on time stamp gap take effect and only if that
also leaves the sequence number unhandled, drop the packet.
2025-12-12 00:29:15 +05:30
changgesi d7db7cb389 chore: fix a large number of spelling issues (#4147)
Signed-off-by: changgesi <changgesi@outlook.com>
2025-12-11 09:34:13 +05:30
Raja Subramanian 498304cdd9 defensive nil check (#4144) 2025-12-10 13:33:08 +05:30
Raja Subramanian 20f6a49780 Store ddParser in atomic.Pointer (#4143)
* Store ddParser in atomic.Pointer

as release is handled outside lock

* log space

* make non-struct methods to release packets
2025-12-10 13:01:17 +05:30
Raja Subramanian 037cb9062f release ext packet if patching fails (#4142) 2025-12-10 12:09:49 +05:30
Raja Subramanian dd598ef23f Release ExtPacket if dependency descriptor or other parsing fails (#4141) 2025-12-10 11:05:19 +05:30
Raja Subramanian 1c1a836c3c Mark RTCP buffer Write as noinline. (#4138)
Seeing a bunch of objects in ReadStreamSRTP.write which does not make
a lot of sense as the function does not allocate anything
(https://github.com/pion/srtp/blob/8fe528a0c4ebb5c46d40a9fd5b77e5b6655fa919/stream_srtp.go#L68-L77)

RTP buffer was marked noinline in an earlier PR.
Marking RTCP buffer write also as noinline to check if heap reporting
changes.
2025-12-08 22:30:30 +05:30
Raja Subramanian 64f3d1e972 switch participant callbacks to room to listener interface (#4136)
* switch participant callbacks to room to listener interface

* mage generate

* clean up

* clear listener

* clean up

* use interface in up data track manager

* tweaks

* Paul feedback - should reduce the diff as this keeps the room handlers as is except making methods for a couple of anonymous handlers

* clean up
2025-12-08 15:59:45 +05:30
Raja Subramanian a30c79fa6d Use isEnding to indicate if down track could be resumed. (#4132)
There is no need to cache down track if participant is going away.
2025-12-06 19:55:20 +05:30
Raja Subramanian 8c241ecf12 Fix RTCP reader leak in DownTrack. (#4131)
When a participant is closing, RTCP readers should be cleaned up from
factory even if the participant is expected to resume. The resumed
participant will be a new participant session and peer connection(s) and
everything will be set up again.
2025-12-06 17:49:23 +05:30
Raja Subramanian 3eef869a68 Do not pause rid in SDP (#4129) 2025-12-05 15:57:31 +05:30
cnderrauber fa0633aa3e move utils.WrapAround to mediatransportutil (#4124) 2025-12-04 17:45:11 +08:00
Raja Subramanian 7954748d7a Data tracks (#4089)
* WIP

* WIP

* Starting to add some signalling integration testing.

* Working tests.

* fix tests

* Forward data packets (#4096)

* WIP commit

* WIP

* WIP

* fix forwarding

* address PR comments

* move some methods from LocalParticipant to Participant interface

* handle subscription update

* add extensions and tests

* more packet tests

* add test for replace extension and fix a bug

* update protocol and add config
2025-12-04 10:44:34 +05:30
Raja Subramanian 7158d98366 log bucket growth (#4122) 2025-12-03 18:48:02 +05:30
Raja Subramanian 64c651431e Update mediatransportutil (#4115)
- New bucket API to pass in max packet size and sequence number offset
  and sequence number size generic type
- Move OWD estimator to mediatransportutil.
2025-11-28 21:51:53 +05:30
Raja Subramanian ffbabcc772 Switch forwarding latency log to Debugw (#4098) 2025-11-23 11:22:10 +05:30
cnderrauber 54cf7d46c8 Control latency of lossy data channel (#4088)
* Control latency of lossy data channel

* remove log

* test
2025-11-18 16:30:16 +08:00
Raja Subramanian c3964ba2eb Use sync.Pool for objects in packet path. (#4066)
* Use sync.Pool for objects in packet path.

Seeing cases of forwarding latency spikes that align with GC.

This might be a bit overkill, but using sync.Pool for small +
short-lived objects in packet path.

Before this, all of these were increasing in alloc_space heap profile
samples over time. With these, there is no increase (actually the lines
corresponding to getting from the pool do not even show up in heap
accounting when doing `list` in `pprof`)

* merge

* Paul feedback
2025-11-14 16:13:23 +05:30
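The pooling approach described in c3964ba2eb follows the usual `sync.Pool` pattern for small, short-lived objects. A minimal sketch, assuming a hypothetical `packetMeta` type and helper names (`getPacketMeta`, `putPacketMeta`) that are not taken from the repository:

```go
package main

import (
	"fmt"
	"sync"
)

// packetMeta is a hypothetical small per-packet object; the real objects
// pooled in the PR are not shown in this log.
type packetMeta struct {
	sequenceNumber uint16
	timestamp      uint32
	arrivalNanos   int64
}

// packetMetaPool hands out reusable *packetMeta values so the packet path
// does not allocate a fresh object for every packet.
var packetMetaPool = sync.Pool{
	New: func() any { return &packetMeta{} },
}

// getPacketMeta returns a zeroed object, either reused or newly allocated.
func getPacketMeta() *packetMeta {
	return packetMetaPool.Get().(*packetMeta)
}

// putPacketMeta clears the object before returning it to the pool so that
// stale fields never leak into the next user.
func putPacketMeta(m *packetMeta) {
	*m = packetMeta{}
	packetMetaPool.Put(m)
}

func main() {
	m := getPacketMeta()
	m.sequenceNumber = 42
	m.timestamp = 90000
	fmt.Printf("processing sn=%d ts=%d\n", m.sequenceNumber, m.timestamp)
	putPacketMeta(m) // the caller must not touch m after this point
}
```

The main hazard with this pattern is using an object after it has been returned to the pool, which is the kind of release-ordering issue several later commits in this log address.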
Raja Subramanian f8b994d491 Forwarding latency measurement tweaks. (#4080)
* Forwarding latency measurement tweaks.

- prom transmission type public
- do not measure short term values as it is not used and saves some lock
  contention time in packet path potentially. Adding a separate method
  for that.
- Change latency/jitter summary reporting to `ns` also to match the
  histogram.

* add GetShortStats
2025-11-13 18:39:49 +05:30
Raja Subramanian 4ce07bedeb Higher resolution forwarding latency histogram. (#4067)
* Higher resolution forwarding latency histogram.

Was using the average latency/jitter of the last second to populate the
forwarding latency/jitter histogram. But, it is too coarse, i.e. the
average value of latency/jitter is very low and those summarised samples
always end up in the lowest bucket.

A few things to address it
- record per packet forwarding latency in histogram
- adjust histogram bins to include smaller values
- Drop jitter histogram

This is a per packet call, but prometheus histogram is supposedly
fast/light weight. Would be good to get better resolution histograms.
Hence doing this. Please let me know if there are performance concerns.

* typo

* one more typo
2025-11-09 17:29:40 +05:30
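The reasoning in 4ce07bedeb (averaged samples always land in the lowest bucket) can be illustrated with a small, self-contained sketch. It uses a plain slice of counters rather than the Prometheus client, and the bucket bounds and latency values are made-up assumptions chosen only to show the effect:

```go
package main

import "fmt"

// Hypothetical bucket upper bounds in nanoseconds: 1ms, 10ms, 100ms.
// Anything above the last bound falls into an overflow bucket.
var upperBoundsNs = []int64{1_000_000, 10_000_000, 100_000_000}

// bucketIndex returns the index of the first bucket whose upper bound is
// at least the given latency.
func bucketIndex(latencyNs int64) int {
	for i, ub := range upperBoundsNs {
		if latencyNs <= ub {
			return i
		}
	}
	return len(upperBoundsNs)
}

// perPacket records every latency sample individually.
func perPacket(samples []int64) []int {
	counts := make([]int, len(upperBoundsNs)+1)
	for _, s := range samples {
		counts[bucketIndex(s)]++
	}
	return counts
}

// averaged records a single sample: the mean of all latencies in the window.
func averaged(samples []int64) []int {
	counts := make([]int, len(upperBoundsNs)+1)
	var sum int64
	for _, s := range samples {
		sum += s
	}
	counts[bucketIndex(sum/int64(len(samples)))]++
	return counts
}

func main() {
	// 99 packets forwarded in 100us and one packet delayed by 50ms.
	samples := make([]int64, 0, 100)
	for i := 0; i < 99; i++ {
		samples = append(samples, 100_000)
	}
	samples = append(samples, 50_000_000)

	fmt.Println("per packet:", perPacket(samples))
	fmt.Println("averaged:  ", averaged(samples))
}
```

In this made-up window the mean is 599us, so the averaged histogram records one sample in the sub-millisecond bucket and the 50ms outlier is invisible, while the per-packet histogram keeps it in the 100ms bucket.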
Raja Subramanian 1dc9b8fc5c Use buffered indicator to exclude from forwarding latency. (#4062)
* Debug high forwarding latency missing.

* log highest

* log condition

* update log

* log

* log

* change log

* Track start up delay.

Digging into forwarding latency, there are a few things
1. Seems to be caused due to forwarding packets queued before bind. They
   would be in the queue till bind. There are two ways it is showing up
   a. Bind itself is delayed and releasing queued packets causes the
      high forwarding latency.
   b. There is a significant gap between bind and first packet being
      pulled off the queue to be forwarded, in one example 100ms.

(a) is understandable if the signalling delays things. Can drop these
packets without forwarding or indicate in the packet that it is a queued
packet and drop it from forwarding latency calculation. Dropping is
probably better as down stream components like egress will see a burst
in these situations.

(b) looks like go scheduling latency? Unsure.

Logging more to understand this better.

* log start

* Use buffered indicator to exclude from forwarding latency.

Buffered packets live in the queue for a while before Bind releases them.
They have high(ish) queuing latency and are not a true representation of
forwarding latency.
2025-11-07 21:46:14 +05:30
Raja Subramanian f117ee511f Track start up delay. (#4061)
* Debug high forwarding latency missing.

* log highest

* log condition

* update log

* log

* log

* change log

* Track start up delay.

Digging into forwarding latency, there are a few things
1. Seems to be caused due to forwarding packets queued before bind. They
   would be in the queue till bind. There are two ways it is showing up
   a. Bind itself is delayed and releasing queued packets causes the
      high forwarding latency.
   b. There is a significant gap between bind and first packet being
      pulled off the queue to be forwarded, in one example 100ms.

(a) is understandable if the signalling delays things. Can drop these
packets without forwarding or indicate in the packet that it is a queued
packet and drop it from forwarding latency calculation. Dropping is
probably better as down stream components like egress will see a burst
in these situations.

(b) looks like go scheduling latency? Unsure.

Logging more to understand this better.

* log start
2025-11-07 16:55:18 +05:30
Raja Subramanian 4872f2051d Return write count from WriteRTP. (#4059)
* Log write count atomic.

* Return write count from WriteRTP.

Apologies for the frequent changes on this. With relays, the down track
could write to several targets. So, use count to have an accurate
indication of how many subscribers were written to.
2025-11-06 13:29:21 +05:30
Raja Subramanian d0ba46b460 Log write count atomic. (#4057) 2025-11-06 13:00:08 +05:30
Raja Subramanian ae5fb7e882 Add packet to forwarding stats only if packet is forwarded. (#4056)
Packets not being forwarded were getting included in forwarding stats
calculation and skewing the measurement towards a smaller number.

The latency measurement does not include the batch IO of packets on
send. With a 2ms batching, that will add an average latency of 1ms.
2025-11-06 12:31:49 +05:30
cnderrauber c264b504c4 Don't warn 0 payload type for PCMU (#4039) 2025-10-28 23:11:51 +08:00
Raja Subramanian 32fc35254e Broadcast cond var on RTX write. (#4038)
* Broadcast cond var on RTX write.

High forwarding latency logs all show high queuing delay so far. From
code inspection, RTX writes were not signaling the cond var. Not sure if
that is the reason, but adding a signal there for further tests.

* Remove return values from writeRTX as they are not used
2025-10-28 11:27:02 +05:30
Raja Subramanian 061eb8b4e8 AddDownTrack to regressed codec after restarting forwarder. (#4037)
Without that the new codec was skipping through with old selector and
not working correctly.
2025-10-27 20:14:33 +05:30
Raja Subramanian ab906d710c Prevent leakage of previous codec after codec regression. (#4035)
* Prevent leakage of previous codec after codec regression.

In the window between forwarder restart and determining codec, the old
codec packet could leak through. Prevent that by doing the restart and
codec determination atomically on a codec regression.

* tidy

* use locked function
2025-10-27 17:40:39 +05:30
Raja Subramanian 79b03f97a2 Log queueing latency when encountering high forwarding latency (#4034) 2025-10-27 15:27:03 +05:30