836 Commits

Author SHA1 Message Date
cnderrauber f6f6cca133 don't push 0 ssrc probing packets to pending queue (#2888) 2024-07-23 17:58:04 +08:00
Raja Subramanian bd616d6074 Split ICE candidate queue. (#2885)
Shared ICE candidate queue meant only one of PUBLISHER/SUBSCRIBER pc got
final candidate notification. Split the queue.
2024-07-20 10:25:06 +05:30
cnderrauber 0c5b5537b2 Don't create DDParser for non-svc codec (#2883) 2024-07-19 10:52:27 +08:00
Raja Subramanian 95f4b304ef Prevent data race. (#2881)
* Prevent data race.

CI is reporting some data race warnings. Prevent that.

* prevent recursive lock

* prevent more recursive locks

* more lock dance
2024-07-18 19:53:41 +05:30
Raja Subramanian 91782b68be Recalc gap of sequence number after forcing rollover. (#2880) 2024-07-18 18:58:10 +05:30
Raja Subramanian 4733e864bc Rollover sequence number when time stamp is moving forward. (#2876)
* Rollover sequence number when time stamp is moving forward.

Seeing large gaps in sequence number due to potential network issues.
In that gap, the sequence number could roll over.

Using packet time jumps to figure out if a roll over could have happened
and force roll over the sequence number to ensure that it does not flow
backwards.

* fix test
2024-07-18 11:02:52 +05:30
Raja Subramanian 3621e9957e Prevent deadlock when adding down track. (#2869)
* Prevent deadlock when adding down track.

* revert down track lock scope change
2024-07-16 12:24:45 +05:30
Raja Subramanian f3d3ec1ce7 Record packet/octet count in sender report. (#2864)
Seeing cases of huge jumps in sender erport rtp time stamp
(of the order of minutes) a few hundred ms after start of track.
Only less than 20 packets have been published at that time as seen by
server. Adding these to sender report to check if client thinks it has
sent much more.
2024-07-16 07:59:27 +05:30
Raja Subramanian e815c99eac Log roll back reason. (#2865)
There are cases of subscriber of a FF publisher not able to lock onto
layer 0 and kept rolling back till it switched to a higher layer and
then it could switch to layer 0. Must have been due to not having a
sender report. Can't think of a reason why that would be missing.
Logging more to debug this further.

Also, using a wrapping logger because of this bug: https://github.com/uber-go/zap/issues/836
Tried using `json:"" yaml:""` tag for `refInfos` field and all fields
inside `refInfo` struct. But, they were still logging nils. So, using
the wrapper logger.
2024-07-15 23:54:45 +05:30
Raja Subramanian faa66d1138 Make sender report pass through an option. (#2861)
Enabled by default.

Also, tweak the long term propagation delay a bit. The first propagation
delay itself was too high and the long term initialized with a high
value. Prevent that and also ensure large negtaives do not have an
effect by using a lower bound of 0. Lower bound of 0 is okay as the main
purpose is to track sustained high positive values.
2024-07-15 11:27:31 +05:30
cnderrauber a995c71f84 Fallback to primary encoding if redundant block overflow (#2858)
* Fallback to primary encoding if redundant block overflow

* revert mtu change
2024-07-12 16:39:54 +08:00
cnderrauber deee816d0a Forward correct payload type for mixed up red/primary payload (#2847)
* Forward correct payload type for mixed up red/primary payload

* empty line

* log field & test case
2024-07-09 23:04:47 +08:00
Raja Subramanian 27f6794e77 Check sender report against media path. (#2843)
Seeing cases (mostly across relay) of large first packet time adjustment
getting ignored. From data, it looks like the first packet is extremely
delayed (some times of the order of minutes) which does not make sense.

Adding some checks against media path, i. e. compare RTP timestamp from
sender report against expected RTP timestamp based on media path
arrivals and log deviations more than 5 seconds.

Another puzzling case. Trying to understand more.

Also, refactoring SetRtcpSenderReportData() function as it was getting
unwieldy.
2024-07-09 09:20:27 +05:30
cnderrauber 4e9ab46091 use correct payload type for red primary encoding (#2845) 2024-07-09 10:30:05 +08:00
Raja Subramanian acbd4ea104 Handle cases of long mute/rollover of time stamp. (#2842)
* Handle cases of long mute/rollover of time stamp.

There are cases where the track is muted for long enough for timestamp
roll over to happen. There are no packets in that window (typically
there should be black frames (for video) or silence (for audio)). But,
maybe the pause based implementation of mute is causing this.

Anyhow, use time since last packet to gauge how much roll over should
have happened and use that to update time stamp. There will be really
edge cases where this could also fail (for e. g. packet time is affected
by propagation delay, so it could theoretically happen that mute/unmute
+ packet reception could happen exactly around that rollover point and
  miscalculate, but should be rare).

As this happen per packet on receive side, changing time to `UnixNano()`
to make it more efficient to check this.

* spelling

* tests

* test util

* tests
2024-07-08 11:07:20 +05:30
Raja Subramanian 39c59d913d Do not warn on padding (#2839) 2024-07-07 12:30:54 +05:30
Raja Subramanian 2bcfcbc354 Log mime and subscriberID for downtrack and descendents (#2838) 2024-07-06 23:15:23 +05:30
cnderrauber 70fbba2844 Add track subscribed notification to publisher (#2834)
* Add track subscribed notification to publisher

* update go mod
2024-07-05 21:57:03 +08:00
Raja Subramanian bfb7db2d91 RTP packet validity check. (#2833)
Adding some checks before packet is forwarded to check for anomalies.
Will remove after a round of debug.
2024-07-04 12:42:25 +05:30
Raja Subramanian b4134edf40 Log rtp stats state on large jumps. (#2829)
Forgot to include in receiver.
2024-07-01 11:51:05 +05:30
Raja Subramanian 57980fcc36 fix logging ignored key (#2826) 2024-06-28 10:34:41 +05:30
Raja Subramanian fa490dd510 Log rtp stats more consistently. (#2816)
* Log rtp stats more consistently.

Thank you Paul for the logging tip.
Also update deps.

* remove duplicate logging field

* nil check
2024-06-25 14:55:42 +05:30
Raja Subramanian 6bb48dd6f1 Do not log duplicate on large negative on send side (#2815) 2024-06-24 12:51:35 +05:30
Raja Subramanian cdb5f3ed68 Log more around unexpected cases (#2813)
- too many padding packets
- also fix cse of snapshot not getting any packets
2024-06-23 00:33:56 +05:30
Raja Subramanian 091eab556d Update mediatransportutil (#2812) 2024-06-22 11:32:28 +05:30
Raja Subramanian d4e50b633f Do not log warns on duplicate. (#2807)
With RTX, some clients use very old packets for probing. Check for
duplicate before logging warning about old packet/negative sequence
number jump.

Also, double the history so that duplicate tracking is better. Adds
about 1/2 KB per RTP stream.
2024-06-20 10:52:12 +05:30
Raja Subramanian ef838e4fa2 Indicate if track is expectd to be resumed in onClose callback. (#2800)
That is the main change. Changed variable name to `isExpectedToResume`
everywhere to be consistent.

Planning to use the callback value in relays to determine if the down
track should be closed or switched to a different up track.
2024-06-17 23:51:00 +05:30
Raja Subramanian 5d969ba35b remove some debug (#2797) 2024-06-17 12:57:04 +05:30
Raja Subramanian ea60368100 Do not error out on invalid packet. (#2789)
Remove the return when encountering invalid packet.
Also, log more sparesely.
Proper error returns from util so that we can selectively drop packets
based on error type, for example SSRC mismatches are okay type of thing.
2024-06-14 11:10:57 +05:30
Raja Subramanian 129ba62d61 Validate RTP packets. (#2778)
* Validate RTP packets.

Check version, payload type (if available) and SSRC (if available)
and drop bad packets. And let repair mechanisms take effect for those
packets.

* address data race reported by test

* fix an unlock and test packets
2024-06-10 15:43:59 +05:30
Raja Subramanian a31f59b689 Log first time adjustment total. (#2776)
* Log first time adjustment total.

Seeing cases where the first time is 400ms+ before start time.
Possible it is getting that much adjustment, but would be good to see
how much total adjustment happens.

* log propagation delay
2024-06-09 23:07:01 +05:30
Raja Subramanian 38d213ed10 Do not compare payload type before bind (#2775) 2024-06-09 01:03:38 +05:30
Raja Subramanian b58db82254 Log invalid RTP packet (#2774) 2024-06-08 10:36:05 +05:30
Raja Subramanian cee3fdb25e Better lock for sender report TS offset. (#2771)
* Better lock for sender report TS offset.

It is possible that a resume has happened and new time stamp offset
calculated. But, a sender report from publisher comes with a time stamp
prior to the time stamp which was used for offset calculation. Using
that sender report in the forwarding path causes jumps.

Example
- Track forwarding, let us tsOffset = `a`
- Unmute/layer switch - one of those events happens, a new tsOffset will
  be calculated, let us say that offset is `b` and it is based on
  incoming time stmap of `c`.
- A sender report from publisher could arrive with timestamp = `d`.
  o If `d` >= `c`,  the offset `b` is correct and can be applied.
  o But, it is possible that `d` < `c`, in that case, offset `a` should
    be used and not `b`.

To address this, keep track of incoming extended timestamp at switch
point and accept incoming sender reports which have a timestamp >=
switch point timestamp.

* clean up

* log more details on invalid layer
2024-06-07 23:56:10 +05:30
Raja Subramanian 73852d0a13 Reduce large sequence number jump threshold for logging. (#2770)
Seeing some unexplained large jumps on remotes across relay. Unclear if
there was a jump on origin side at some point. Reducing threshold for
large jump so that we can catch unexpected jumps more.
2024-06-07 12:36:02 +05:30
Raja Subramanian c265ab7104 Log invalid spatial layer (#2769) 2024-06-07 08:21:06 +05:30
Raja Subramanian 7d035deef8 Clean up logging fields a bit (#2767) 2024-06-06 23:03:21 +05:30
cnderrauber 908baeb942 initialize bucket size by publish bitrates (#2763) 2024-06-06 14:31:20 +08:00
Raja Subramanian d3f9e8b8e5 Do not add tracker for invalid layers. (#2759)
Previously, the bit rate interval config was checked first. That would
have returned `!ok` for invalid layers. A recent change to prevent
duplicate tracker addition re-arranged the code and the tracker array
was accessed out-of-bounds.

Unclear why an invalid layer is passed in. Need to investigate that.
2024-06-05 23:06:22 +05:30
Raja Subramanian fdde44d926 Attach time stamp offset to publisher sender report. (#2757)
There are cases where the RTP time stamp does not increment acros
mute/unmute. Seems to happen fairly consistently with React Native
clients.

Something like the following happens
- Track is progressing
- Mute at `t = x`, assume RTP time stamp at the point is `y` and RTP clock
  rate is `z`.
- Through mute, more RTCP sender reports come in from publisher and the
  RTP time stamp in those reports are progressing at expected rate of
  `z` RTP clock ticks per second.
- Forwarding path uses those sender reports from publisher to build the
  sender report for subscribers.
- Unmute happens at `t = x + a` seconds.
- Ideally packets coming in after that, should have a time stamp of `y +
  (a * z)`, but they tend to have something a little bit more than `y`.
- RTCP sender reports also have a time stamp that goes back. SFU ignores
  these.
- Mean while the forwarding path has adjusted to the new RTP time stamp
  base and it has calculated a TS offset (from publisher -> subscriber).
  Effectively, that offset comes out close to `(a * z)`, i. e. jump
  corresponding to the mute interval.
- When it is time to send a RTCP sender report to subscriber, the old
  sender report from publisher is used (as intervening ones from
  publisher were rejected because time stamp is moving backwards).
  The problem is that the old report is used with new offset.
  So, it looks like time stamp jumped ahead by `a` seconds.

Address it by storing time stamp offset at the time of receiving the
publisher side sender report. And use that while sending subscriber side
sender report. There are very edge cases where this can
get mismatched, but should be rare. Hopefully, this should prevent
unnecessary jumps in time stamp in RTCP sender report to subscribers.
2024-06-05 13:02:33 +05:30
Raja Subramanian 49e1848d1d Move resync to TrackSender interface so that it can be called directly from up track (#2755) 2024-06-04 22:38:39 +05:30
Raja Subramanian 03bb468472 Log range map for debugging. (#2754)
* Log range map for debugging.

* log details on errors

* log details
2024-06-04 08:00:26 +05:30
Raja Subramanian 1e52b09e24 Reset tracker on expected layer increase. (#2753)
Was hitting the edge case mentioned in the (now deleted in this PR)
comments. It is fine to reset and let it declare available again.
Available layer handler will ignore repeats.
2024-06-03 13:58:58 +05:30
cnderrauber b90ade81da add log for unexpected jitter (#2752) 2024-06-03 15:12:53 +08:00
Raja Subramanian b7ea733492 Fix DD tracker addition. (#2751)
DD uses a wrapped tracker. Check only stream tracker manager field for
existing tracker.
2024-06-03 12:14:42 +05:30
Raja Subramanian e483cee662 Prevent tracker override. (#2750)
When relaying buffers are stopped and restarted. On a restart,
the buffer adds a tracker. But, the tracker is not destroyed till the
end. So, the old tracker and new tracker for the same layer stomp on
each other and declare layer unavailable (the old tracker is not getting
any packets).

Fix by not creating a new tracker if one exists already.
2024-06-03 11:55:59 +05:30
cnderrauber 526d3a017e ignore abnormal forward delay (#2748) 2024-05-31 10:23:37 +08:00
Raja Subramanian 3ad0a69c86 Stop probe on probe controller reset (#2744) 2024-05-29 18:26:22 +05:30
Raja Subramanian f9f761b223 Demote some less useful/noisy logs. (#2743) 2024-05-29 12:05:18 +05:30
Raja Subramanian 447793d077 Move RTT errors to Debugw. (#2742)
With the move to forwarding NTP timestamp as is, we get a bunch more of
this error logged as the remote is basing it off of previous report and
local (i. e. server-side) bases it off of a more recent report.

Anyhow, this code has been around for a long time and there is nothing
new to learn from those errors. Just log it at Debugw in case we can
learn something from it for specific projects or environments where
Debugw is okay.
2024-05-29 11:26:30 +05:30