Commit Graph

1801 Commits

Author SHA1 Message Date
Paul Wells a5abf61a56 update psrpc (#2188) 2023-10-25 20:20:49 -07:00
Raja Subramanian 047a4ac870 Apply repair to the newest cached report (#2186) 2023-10-26 03:43:52 +05:30
Raja Subramanian d8e4933dd1 Reference time stamp for SVC. (#2185)
SVC has only one stream and when calculating reference time stamp,
irrespective of reference layer, reference time stamp will be the
same as the given time stamp as there is only one stream and no offset.

TODO: Need better all around SVC handling.
2023-10-25 23:27:43 +05:30
Raja Subramanian fa01297d96 Slight sequencer tweaks. (#2184)
The buffer is not for padding packets. So, calculate
adjusted sequence numbers before comparing against size.

Also, it is possible that invalidated slot is accessed
due to not being able to exclude padding range. This was
causing time stamp reset to 0. Will remove the error log
after this goes out and the condition does not show up
for a few days.
2023-10-25 23:12:14 +05:30
cnderrauber 0296a5bd86 Remove un-preferred codecs for android firefox (#2183)
* Remove un-preferred codecs for android firefox

Android firefox don't comply with the codec order in answer sdp and
has problem to publish h.264, remove other codecs to fix this.

* false(false) is true
2023-10-25 16:59:37 +08:00
Paul Wells 48dba9d589 reduce closing signal stream log level (#2182) 2023-10-24 17:46:07 -07:00
David Colburn b8ac836b9b Only launch room egress once (#2175)
* only launch room egress once

* regenerate fakes
2023-10-24 13:05:23 -07:00
Paul Wells f80e87b216 skip psrpc service registration unless the config is enabled (#2181) 2023-10-24 11:41:52 -07:00
Raja Subramanian 66750e4ba8 Fix deadlock (#2180)
* Fix deadlock

My previous PR to wrap layer notifier post in bind lock was
problematic as `onBinding` callback happens within that lock
and that onBinding callback can call set max layer which will
post to channel. Use a separate mutex.

* RUnlock
2023-10-24 22:12:38 +05:30
Raja Subramanian d6ad857506 Do not post to closed channels. (#2179)
* Do not post to closed channels.

Perils of atomics. Hard to imagine, but I guess it could happen.
The postMaxLayerNotifier checked for closed and down track was not
closed. But, between that check and posting to channel (which is
a very small window), the down track could have been closed and
the channel (maxLayerNotiferCh) is closed.

Protect that channel post + close with the bind lock.

* reduce the change

* Check for closed inside lock
2023-10-24 18:21:59 +05:30
Raja Subramanian df9d6ee0f4 Update protocol. (#2177) 2023-10-24 12:59:25 +05:30
cnderrauber 1ee808ec7d Fix frame chain can't detect broken if currentLayer is not valid (#2176) 2023-10-24 14:09:40 +08:00
Raja Subramanian f4a3618000 Log error on 0 time stamp. (#2174)
Need backtrace for source of it.
Also, do not reset start if 0, that is incorrect.
2023-10-23 23:00:03 +05:30
David Colburn 0f27dda281 move CreateEgress call (#2168) 2023-10-23 09:15:18 -07:00
Raja Subramanian f622fc2490 Sample clock skew down by an order of magnitude (#2173) 2023-10-23 16:58:02 +05:30
cnderrauber eca32792b8 Add configuration to limit MaxBufferedAmount for data channel (#2170)
* Add configuration to limit MaxBufferedAmount for data channel

* comment

* Fix generate flags

* fix test

* Don't disconnect slow subscriber
2023-10-23 15:03:58 +08:00
Paul Wells 0bc932e57e fix config typo (#2172)
* fix config typo

* tidy

* add sample config

* cleanup
2023-10-22 23:43:03 -07:00
Paul Wells 325e5ca753 add psrpc room service (#2171)
* add psrpc room service

* update deps

* disable by default

* feedback

* config

* test
2023-10-22 22:49:38 -07:00
Raja Subramanian 08997c96b0 Drop not relevant packet only if contiguous. (#2167)
The probing + munging has not been set up to drop packets that follow
a gap. Dropping such a packet leads to padding packet sequence numbers
overlapping with regular packets.

This change does two things though.
- The not relevant packet will still not be sent over the wire. That could
create holes in the sequence number leading to NACKs
- Would the hole cause decode issues? Unclear as making this condition is hard.
Simulating it is not showing issues, but that may not be producing the bad
sequence if any.

Will look at the ability to drop a packet after a gap later.
2023-10-22 00:08:41 +05:30
Raja Subramanian 3e9450c774 Log more details in warns. (#2166)
Logging more details in warns so that we do not have to enable Infow
for some logs later.
2023-10-21 11:02:34 +05:30
Raja Subramanian b591c56aa3 Logging reduction. (#2165)
Move some to Debugw and add sampling for a few.
2023-10-21 10:26:30 +05:30
Raja Subramanian 39edfab2b5 Fix extended TS calculated during retransmit. (#2164)
May have caused the large time stamp jump in sender reports.
2023-10-21 02:25:03 +05:30
Raja Subramanian 4f8bbdbaab Keeping revert of debug logs ready (#2163) 2023-10-21 01:47:50 +05:30
Raja Subramanian 0407eb4833 Log audio packets in forwarding path. (#2162)
Seeing a time stamp jump that I am not able to explain.
Basically, it looks like the time stamp doubles at some
point. There is no code which doubles the timestamp.
Can understand an erroneous roll over/wrap around, but
doubling is very strange.

So, logging only audio packets. Will disable as soon
as I have some smaples from canary.
2023-10-21 01:37:30 +05:30
Raja Subramanian 5bf2e5fd4a Log clock deviations in sender report. (#2161)
Seeing some unexplained jumps in sender report time stamp
in canary. Wonder if the calculated clock rate is way off
during some interval. Logging clock deviations to understand
better.
2023-10-20 23:06:34 +05:30
Raja Subramanian 43a0ca57b5 Clear flags in packet metadata cache before setting them. (#2160)
Not sure if this could have resulted in bad FPS calculation,
but could have contributed to it.
2023-10-20 12:13:29 +05:30
Raja Subramanian 0d7477178e More fine grained filtering NACKs after a key frame. (#2159)
* More fine grained filtering NACKs after a key frame.

There are applications with periodic key frame.
So, a packet lost before a key frame will not be retransmitted.
But, decoder could wait (jitter buffer, play out time) and cause
a stutter.

Idea behind disabling NACKs after key frame was another knob to
throttle retransmission bit rate. But, with spaced out retransmissions
and max retransmissions per sequence number, there are throttles.
This would provide more throttling, but affects some applications.
So, disabling filtering NACKs after a key frame.

Introducing another flag to disallow layers. This would still be quite
useful, i. e. under congestion the stream allocator would move the
target lower. But, because of congestion, higher layer would have lost
a bunch of packets. Client would NACK those. Retransmitting those higher
layer packets would congest the channel more. The new flag (default
enabled) would disallow higher layers retransmission. This was happening
before this change also, just splitting out the flag for more control.

* split flag
2023-10-20 00:44:39 +05:30
Raja Subramanian e461e9cd79 Log skew in clock rate. (#2158)
* Log skew in clock rate.

Remember seeing sender report time stamp moving backward
across mute with replaceTrack(null). Not able to reproduce
it in JS sample app, but have seen it elsewhere.

Logging to understand it better. Wondering if the sender report
should be reset on time stamp moving backward or if we should drop
backwards moving reports.

* set threshold at 20%
2023-10-19 13:58:50 +05:30
Raja Subramanian f653efcf10 Do not update highest time on padding packet. (#2157)
* Error log of padding updating highest time to get backtrace.

* Do not update highest time on padding packet.

Padding packets use time stamp of last packet sent.
Padding packets could be sent when probing much after last packet
was sent. Updating highest time on that screws up sender report
calculations. We have ways of making sure sender reports do not
get too out-of-whack, but it logs during that repair.
That repair should be unnecessary unless the source is behaving weird
(things like publisher sending all packets at the same time, publisher
sample rate is incorrect, etc.)
2023-10-19 12:01:48 +05:30
David Colburn b290c233ea fix CreateEgress not completing (#2156) 2023-10-18 23:18:53 -07:00
David Colburn 62b057b4c1 Egress store/IO cleanup (#2152)
* egress store cleanup

* client wrapper, regenerate

* put WithClusterID back

* rename clent

* infinite loops

* client wrapper -> interface

* remove StopEgress update

* remove Update from IOClient

* avoid duplicate EgressStarted events

* update protocol
2023-10-18 14:48:51 -07:00
Raja Subramanian 7c830ea5b9 Log highest time update on padding packet. (#2154)
* Log highest time update on padding packet.

Seeing a strange case of what looks like highest time getting
updated on a padding packet. Can't see how it happens in code.
So, logging to check. Will be removing log after checking.

* log sequence number also
2023-10-19 00:53:50 +05:30
Raja Subramanian f97242c8ba Use 32-bit time stamp to get reference time stamp on a switch. (#2153)
* Use 32-bit time stamp to get reference time stamp on a switch.

With relay and dyncast and migration, it is possible that different
layers of a simulcast get out of sync in terms of extended type,
i. e. layer 0 could keep running and its timestamp could have
wrapped around and bumped the extended timestamp. But, another layer
could start and stop.

One possible solution is sending the extended timestamp across relay.

But, that breaks down during migration if publisher has started afresh.
Subscriber could still be using extended range.

So, use 32-bit timestamp to infer reference timestamp and patch it with
expected extended time stamp to derive the extended reference.

* use calculated value

* make it test friendly
2023-10-18 21:48:41 +05:30
Raja Subramanian 3e4cd3a161 Accept more range for first packet time adjustment. (#2150) 2023-10-17 23:52:14 +05:30
Raja Subramanian 11c9c56a4d Defer close of source and sink to prevent error logs. (#2149)
When a room is created via room service, when `StartSession`
runs, it sees a closed request source and returns an error
and that gets logged. It is not a real error.

Defer the sink and source close so that room creation can finish without
errors.
2023-10-17 11:37:34 +05:30
cnderrauber 53e757fd2c Fix panic on streamtracker_dd (#2147) 2023-10-17 10:37:11 +08:00
David Zhao 65934e6486 Fix ICE connection fallback (#2144)
* Fix ICE connection fallback

Short connection detection relied on iceFailedTimeout, which previously
had been misinterpreted. Since we've reduced iceFailedTimeout, it is
creating false negatives.

We'll instead use PingTimeout since clients are expected to keep the
signal connection active.

* reduce ping interval to align with total ice failure timeout
2023-10-15 14:36:12 -07:00
Raja Subramanian 70b60101f4 do the proper large negative check (#2139) 2023-10-10 11:01:49 +05:30
Raja Subramanian 31b042ddce Log larga negative gap. (#2138)
Seeing a large positive gap which I am not able to explain.
Wondering if at some other time, a large negative is happening
and the large positive is just a correction.
2023-10-09 14:32:10 +05:30
Raja Subramanian ebe1470e46 Simplify (#2137) 2023-10-07 13:15:51 +05:30
David Zhao 0931dc0300 Handle playoutDelay for Firefox (#2135)
We will need to disable playoutDelay for FF users if the developer is
assuming streams are synced.
2023-10-07 00:00:52 -07:00
Raja Subramanian 9fc276481a Increase accuracy of delay since last sender report. (#2136)
It is in 1 / 65536 seconds units. That is about 0.015 ms.
So, use microseconds to increase accuracy.
2023-10-07 12:25:47 +05:30
Raja Subramanian 2ff8fe3b78 Prevent old packets resolution. (#2134)
* Prevent old packets resolution.

With range map, we are just looking up ranges and not exactly
which packets were missing. This caused the case of old packets
being resolved after layer switch.

For example,
- Packet 10 is layer switch, range map gets reset
- Packet 11, 12, 13 are forwarded
- Packet 9 comes, it should ideally be dropped as pre-layer switch old
  packet. But, when looking up range map, it gets an offset and hence
  gets re-mapped to something before layer switch. This was probably
  okay as decoders would have had a key frame at the switch point and
  moved ahead, but incorrect technically.

Fix is to reset the start point in the range map to the switch point
and not 0. So, when packet 9 comes, range map will return "key too old"
error and that packet will be dropped as missing from cache.

* fix tests
2023-10-07 10:56:34 +05:30
David Zhao f40a97fd3e Allow playout delay even when sync stream isn't used. (#2133)
The two options should function independently. Users could still improve
video jitter by using up to 100ms of playout delay.
2023-10-06 14:51:11 -07:00
Benjamin Pracht 4253845505 Create stub for update outputs in egress service (#2132) 2023-10-05 12:00:47 -07:00
Raja Subramanian 4ea284fae0 Log potential sequence number de-sync in receiver reports. (#2128)
* Log potential sequence number de-sync in receicer reports.

Seeing some cases of a roll over being missed. That ends up
as largish range to search in an interval and reports missing packets
in the packet metadata cache.

Logging some details.

* just log in one place
2023-10-05 13:10:13 +05:30
Raja Subramanian 5aa093f65d Log time when there are too many packets. (#2127)
Ideally, can remove the nil return when there are too many packets
as we have more information with extended sequence numbers, but
logging duration first to understand what is happening better.
2023-10-05 12:02:55 +05:30
Raja Subramanian 6c49d1a160 Logging a few bits at Infow (#2126)
Seeing sequencer errors with egress (related to dummy start).
So, logging a few bits at Infow to understand them better.
2023-10-05 11:16:31 +05:30
Raja Subramanian c710ab901e Log stream start at Infow (#2125)
Seeing some sequence number adjustment where timestamp is 0.
Want to check the stream start.
2023-10-05 00:12:41 +05:30
Raja Subramanian cb0a48c12d Handle RED extended sequence number. (#2123)
When converting from RED -> Opus, if there is a loss, SFU recovers
that loss if it can using a subsequent redundant packet. That path
was not setting the extended sequence number properly.

Also, ensuring use of monotonic clock for first packet time adjustment
also.
2023-10-03 19:38:55 +05:30