Commit Graph

43 Commits

Author SHA1 Message Date
Raja Subramanian e93611eafa Log sender reports. (#2625) 2024-04-05 18:21:38 +05:30
Raja Subramanian 63b1fba082 Add start/end time to AnalyticsStream. (#2618)
* Add start/end time to AnalyticsStream.

* fix test
2024-04-03 12:23:18 +05:30
Raja Subramanian 860702e9dc Prevent large spikes in propagation delay (#2615)
* Prevent large spikes in propagation delay

A few tweaks
- Large spike in propagation delay due to congested channel results in
  long term estimate getting high value. Ignore outliers in long term
  estimate.
- Introduce a new field for adjusted arrival time as adjusting the
  arrival time in place meant it got applied again across the relay and
  that caused different propagation delay on remote nodes.
- Reset path change counters as long as there is any sample that is not
  higher than the multiple of long term. There was a case of
  o Sample with high value that triggered path change start.
  o Then some samples with high enough delta, but did not meet the
    criteria for increasing counter further.
  o Some time later, another sample met the threshold and that triggered
    a path change re-init.

* do not adapt to large delta
2024-04-02 14:21:20 +05:30
Raja Subramanian 3e43f75143 Forward publisher sender report. (#2572)
* Forward publisher sender report.

Publisher side RTCP sernfer report is rebased to SFU time base
and used to send sender rerport to subscriber.

Will wait to merge till previous versions are out as this will require a
bunch of testing.

* - Add rebased report drift
- update protocol dep
- fix path change check, it has to check against delta of propagation
  delay and not propagation delay as the two side clocks could be way
  off.
2024-03-13 14:31:39 +05:30
Raja Subramanian 610d68a409 Clean up using publisher side clock rate. (#2568)
It is not used any more.
2024-03-11 12:25:07 +05:30
Raja Subramanian 50c48ff29d Ignore out-of-order receiver side sender reports. (#2567) 2024-03-11 11:30:01 +05:30
Raja Subramanian 93c7d1f4fb Adjust first packet time on down track resume. (#2566)
Allows subscriber sender report to line up better quicker.
2024-03-11 00:40:16 +05:30
Raja Subramanian bdbc9dcbc7 Use start time stamp to calculate down stream sender report. (#2564)
* Use start time stamp to calculate down stream sender report.

With first packet time adjustment, using the first time stamp is more
accurate.

This still suffers if the up stream clock rate changes (happens in cases
like noise suppression which is not well understood). Will be looking at
pass through of sender report from publisher to subscriber.

* similar log strings

* avoid early sender reports

* log messages

* Reduce first packet adjustment threshold to 15 seconds
2024-03-10 23:18:54 +05:30
Raja Subramanian a08b058abc Structured logging for sender report data. (#2563) 2024-03-10 01:29:37 +05:30
Raja Subramanian 46257c1d24 Skip large RR intervals. (#2544) 2024-03-04 19:12:53 +05:30
Raja Subramanian 49fd332e91 Store first SR also as it can get reset (#2472) 2024-02-12 12:14:25 +05:30
Raja Subramanian 7c16ca6a0c Log feed Sender Report to better understand forwarded sender report (#2443)
anomalies.
2024-02-04 11:12:22 +05:30
Raja Subramanian a1ca41a4e1 Skip reporting skew for out-of-order reports (#2369) 2024-01-08 13:11:04 +05:30
Raja Subramanian 1f335dd564 Convert to formatter string for lazy evaluation. (#2298) 2023-12-06 18:11:05 +05:30
Raja Subramanian 2ee5aa7c98 Add optional supervisor disable. (#2277)
* Add optional supervisor disable.

Used `DisableSupervisor` so that default can be enabled and
it can be disabled explicity. But, open to defaulting to disable
(i. e. change param to `EnableSupervisor`).

* Move nil check to call site
2023-11-30 13:04:31 +05:30
Raja Subramanian 45346d7c76 Clean up condition that is not happening (#2207) 2023-11-01 15:17:09 +05:30
Raja Subramanian c93a88bd9b Log starts on metadata cache overflow. (#2206) 2023-11-01 10:55:07 +05:30
Raja Subramanian ce8f64176a Log correct time difference (#2192) 2023-10-26 15:34:49 +05:30
Raja Subramanian 047a4ac870 Apply repair to the newest cached report (#2186) 2023-10-26 03:43:52 +05:30
Raja Subramanian fa01297d96 Slight sequencer tweaks. (#2184)
The buffer is not for padding packets. So, calculate
adjusted sequence numbers before comparing against size.

Also, it is possible that invalidated slot is accessed
due to not being able to exclude padding range. This was
causing time stamp reset to 0. Will remove the error log
after this goes out and the condition does not show up
for a few days.
2023-10-25 23:12:14 +05:30
Raja Subramanian f4a3618000 Log error on 0 time stamp. (#2174)
Need backtrace for source of it.
Also, do not reset start if 0, that is incorrect.
2023-10-23 23:00:03 +05:30
Raja Subramanian 3e9450c774 Log more details in warns. (#2166)
Logging more details in warns so that we do not have to enable Infow
for some logs later.
2023-10-21 11:02:34 +05:30
Raja Subramanian b591c56aa3 Logging reduction. (#2165)
Move some to Debugw and add sampling for a few.
2023-10-21 10:26:30 +05:30
Raja Subramanian 0407eb4833 Log audio packets in forwarding path. (#2162)
Seeing a time stamp jump that I am not able to explain.
Basically, it looks like the time stamp doubles at some
point. There is no code which doubles the timestamp.
Can understand an erroneous roll over/wrap around, but
doubling is very strange.

So, logging only audio packets. Will disable as soon
as I have some smaples from canary.
2023-10-21 01:37:30 +05:30
Raja Subramanian 5bf2e5fd4a Log clock deviations in sender report. (#2161)
Seeing some unexplained jumps in sender report time stamp
in canary. Wonder if the calculated clock rate is way off
during some interval. Logging clock deviations to understand
better.
2023-10-20 23:06:34 +05:30
Raja Subramanian 43a0ca57b5 Clear flags in packet metadata cache before setting them. (#2160)
Not sure if this could have resulted in bad FPS calculation,
but could have contributed to it.
2023-10-20 12:13:29 +05:30
Raja Subramanian f653efcf10 Do not update highest time on padding packet. (#2157)
* Error log of padding updating highest time to get backtrace.

* Do not update highest time on padding packet.

Padding packets use time stamp of last packet sent.
Padding packets could be sent when probing much after last packet
was sent. Updating highest time on that screws up sender report
calculations. We have ways of making sure sender reports do not
get too out-of-whack, but it logs during that repair.
That repair should be unnecessary unless the source is behaving weird
(things like publisher sending all packets at the same time, publisher
sample rate is incorrect, etc.)
2023-10-19 12:01:48 +05:30
Raja Subramanian 7c830ea5b9 Log highest time update on padding packet. (#2154)
* Log highest time update on padding packet.

Seeing a strange case of what looks like highest time getting
updated on a padding packet. Can't see how it happens in code.
So, logging to check. Will be removing log after checking.

* log sequence number also
2023-10-19 00:53:50 +05:30
Raja Subramanian f97242c8ba Use 32-bit time stamp to get reference time stamp on a switch. (#2153)
* Use 32-bit time stamp to get reference time stamp on a switch.

With relay and dyncast and migration, it is possible that different
layers of a simulcast get out of sync in terms of extended type,
i. e. layer 0 could keep running and its timestamp could have
wrapped around and bumped the extended timestamp. But, another layer
could start and stop.

One possible solution is sending the extended timestamp across relay.

But, that breaks down during migration if publisher has started afresh.
Subscriber could still be using extended range.

So, use 32-bit timestamp to infer reference timestamp and patch it with
expected extended time stamp to derive the extended reference.

* use calculated value

* make it test friendly
2023-10-18 21:48:41 +05:30
Raja Subramanian 3e4cd3a161 Accept more range for first packet time adjustment. (#2150) 2023-10-17 23:52:14 +05:30
Raja Subramanian 70b60101f4 do the proper large negative check (#2139) 2023-10-10 11:01:49 +05:30
Raja Subramanian 31b042ddce Log larga negative gap. (#2138)
Seeing a large positive gap which I am not able to explain.
Wondering if at some other time, a large negative is happening
and the large positive is just a correction.
2023-10-09 14:32:10 +05:30
Raja Subramanian 4ea284fae0 Log potential sequence number de-sync in receiver reports. (#2128)
* Log potential sequence number de-sync in receicer reports.

Seeing some cases of a roll over being missed. That ends up
as largish range to search in an interval and reports missing packets
in the packet metadata cache.

Logging some details.

* just log in one place
2023-10-05 13:10:13 +05:30
Raja Subramanian 5aa093f65d Log time when there are too many packets. (#2127)
Ideally, can remove the nil return when there are too many packets
as we have more information with extended sequence numbers, but
logging duration first to understand what is happening better.
2023-10-05 12:02:55 +05:30
Raja Subramanian 6c49d1a160 Logging a few bits at Infow (#2126)
Seeing sequencer errors with egress (related to dummy start).
So, logging a few bits at Infow to understand them better.
2023-10-05 11:16:31 +05:30
Raja Subramanian cb0a48c12d Handle RED extended sequence number. (#2123)
When converting from RED -> Opus, if there is a loss, SFU recovers
that loss if it can using a subsequent redundant packet. That path
was not setting the extended sequence number properly.

Also, ensuring use of monotonic clock for first packet time adjustment
also.
2023-10-03 19:38:55 +05:30
Raja Subramanian 989d621c97 A small change to move out of order check before RTT calc. (#2117) 2023-10-02 11:51:46 +05:30
Raja Subramanian ee3a7c01bc Log resync at Infow. (#2114)
* Log resync at Infow.

Seeing potentially large sequence number jumps on a resync.
And it seems to happen on a lot of subscribe/unsubscribe.
Logging at Infow to understand better.

Probably need to find a way to avoid resync. But, logging for now to
check if I can catch one.

* Remove resync and log large sequence number jumps
2023-09-30 08:42:46 +05:30
Raja Subramanian 97048a923c Reducing rtp stats memory consumption - part 2 (#2078)
* WIP commit

* move a struct to sender only

* Snapshot intervals

* make receiver history 4K too
2023-09-16 18:54:18 +05:30
Raja Subramanian 044f6cec40 Reduce packet meta data cache - part 1 (#2073)
* Reduce packet meta data cache - part 1

Packet meta data cache takes a good amount of space.
That cache is 8K entries deep and each entry is 8 bytes.
So, that takes 64KB per RTP stream.

It is mostly needed for down stream to line up with receiver reports.

So, removing cache from up stream (RTPStatsReceiver) as part 1.
Will look at optimising the down stream in part 2.

* Remove caching from RTPStatsReceiver

* clean up a bit more

* maintain history and fix test
2023-09-15 21:39:03 +05:30
Raja Subramanian 5f701ece34 Include top bits from start in highest sequence number from RR. (#2064)
Streaming could start after 16-bits has rolled over. So, have to add
that base back to what is received in receiver report.

Otherwise, it looks like there are not packets received in window
leading to poor quality.
2023-09-12 14:36:25 +05:30
Raja Subramanian f333ce05ed fix start (#2060) 2023-09-11 20:59:47 +05:30
Raja Subramanian c09d8d0878 Split RTPStats into receiver and sender. (#2055)
* Split RTPStats into receiver and sender.

For receiver, short types are input and need to calculate extended type.

For sender (subscriber), it can operate only in extended type.
This makes the subscriber side a little simpler and should make it more
efficient as it can do simple comparisons in extended type space.

There was also an issue with subscriber using shorter type and
calculating extended type. When subscriber starts after the publisher
has already rolled over in sequence number OR timestamp, when
subsequent publisher side sender reports are used to adjust subscriber
time stamps, they were out of whack. Using extended type on subscriber
does not face that.

* fix test

* extended types from sequencer

* log
2023-09-11 07:33:39 +05:30