168 Commits

Author SHA1 Message Date
Raja Subramanian a64bd23b6d Do server PLI when sync is required. (#2197)
* Do server PLI when sync is required.

A few changes
- Run key frame requester goroutine always. Runs every 200 ms which is
  not bad.
- Post a key frame request when server knows it needs one, like after an
  allocation. This ensures that the initial request is not delayed.
- Periodic check will ensure PLI for cases like all frame chains of a
  dependency descriptor being broken.

* simplify
2023-10-27 15:16:39 +05:30
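The requester pattern described in #2197 above (a goroutine that checks periodically, plus an on-demand post so the first request is not delayed) could look roughly like the sketch below. All type and function names here are hypothetical and are not taken from the LiveKit code; only the 200 ms interval comes from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// keyFrameRequester is a hypothetical helper; none of these names come from
// the LiveKit code base.
type keyFrameRequester struct {
	kick      chan struct{} // holds at most one pending on-demand request
	needsSync func() bool   // reports whether a key frame is still needed
	sendPLI   func()        // sends the PLI upstream
}

func newKeyFrameRequester(needsSync func() bool, sendPLI func()) *keyFrameRequester {
	return &keyFrameRequester{
		kick:      make(chan struct{}, 1),
		needsSync: needsSync,
		sendPLI:   sendPLI,
	}
}

// Post asks for an immediate check without blocking the caller.
func (r *keyFrameRequester) Post() {
	select {
	case r.kick <- struct{}{}:
	default: // a request is already pending, nothing to add
	}
}

// check sends a PLI only if sync is still required.
func (r *keyFrameRequester) check() {
	if r.needsSync() {
		r.sendPLI()
	}
}

// Run checks on every tick and on every posted request until done is closed.
func (r *keyFrameRequester) Run(interval time.Duration, done <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-done:
			return
		case <-r.kick:
		case <-ticker.C:
		}
		r.check()
	}
}

func main() {
	sent := make(chan struct{}, 16)
	r := newKeyFrameRequester(
		func() bool { return true },
		func() { sent <- struct{}{} },
	)
	done := make(chan struct{})
	go r.Run(200*time.Millisecond, done)

	r.Post() // e.g. right after an allocation
	<-sent   // served by the posted request, no need to wait for a tick
	close(done)
	fmt.Println("PLI requested")
}
```

The buffered channel of size one means repeated posts coalesce into a single check, while the ticker covers cases nobody posts for, such as broken frame chains.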
Raja Subramanian 8b16db2270 Log PLI requests. (#2194)
A few things
- Log PLI requests from client.
- Pass in marker to RTP munger as SVC can insert marker.
- Adjusting first packet time should be aware of SVC as there is single
  stream in SVC
2023-10-26 21:07:36 +05:30
Raja Subramanian 08997c96b0 Drop not relevant packet only if contiguous. (#2167)
The probing + munging has not been set up to drop packets that follow
a gap. Dropping such a packet leads to padding packet sequence numbers
overlapping with regular packets.

This change does two things though.
- The not relevant packet will still not be sent over the wire. That could
create holes in the sequence number leading to NACKs
- Would the hole cause decode issues? Unclear as reproducing this condition is hard.
Simulating it is not showing issues, but that may not be producing the bad
sequence if any.

Will look at the ability to drop a packet after a gap later.
2023-10-22 00:08:41 +05:30
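A much simplified illustration of the rule in #2167 above: a not relevant packet closes the sequence number hole only when it directly follows the previous packet; after a gap it is still withheld, but the hole stays. The `munger` type below is hypothetical, assumes in-order arrival, and ignores padding and probing entirely.

```go
package main

import "fmt"

// munger is a hypothetical, stripped-down stand-in for an RTP sequence
// number rewriter. It assumes packets arrive in order (gaps allowed).
type munger struct {
	started bool
	lastIn  uint16 // last incoming sequence number seen
	offset  uint16 // subtracted from incoming numbers to get outgoing ones
}

// process returns the outgoing sequence number and whether to send.
func (m *munger) process(sn uint16, relevant bool) (out uint16, send bool) {
	contiguous := !m.started || sn == m.lastIn+1 // uint16 arithmetic wraps
	m.started = true
	m.lastIn = sn

	if !relevant {
		if contiguous {
			m.offset++ // safe to close the hole left by this packet
		}
		// After a gap the packet is still not sent, but the offset is left
		// alone, so the outgoing sequence keeps a hole.
		return 0, false
	}
	return sn - m.offset, true
}

func main() {
	m := &munger{}
	in := []struct {
		sn       uint16
		relevant bool
	}{{10, true}, {11, false}, {12, true}, {14, false}, {15, true}}

	for _, p := range in {
		out, send := m.process(p.sn, p.relevant)
		fmt.Println(p.sn, p.relevant, out, send)
	}
}
```

In this run incoming 13 is lost, so dropping 14 does not compact the numbering; the receiver sees two missing outgoing numbers and may NACK both, which is the trade-off the commit message mentions.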
Raja Subramanian b591c56aa3 Logging reduction. (#2165)
Move some to Debugw and add sampling for a few.
2023-10-21 10:26:30 +05:30
Raja Subramanian 4f8bbdbaab Keeping revert of debug logs ready (#2163) 2023-10-21 01:47:50 +05:30
Raja Subramanian 0407eb4833 Log audio packets in forwarding path. (#2162)
Seeing a time stamp jump that I am not able to explain.
Basically, it looks like the time stamp doubles at some
point. There is no code which doubles the timestamp.
Can understand an erroneous roll over/wrap around, but
doubling is very strange.

So, logging only audio packets. Will disable as soon
as I have some samples from canary.
2023-10-21 01:37:30 +05:30
Raja Subramanian 0d7477178e More fine grained filtering NACKs after a key frame. (#2159)
* More fine grained filtering NACKs after a key frame.

There are applications with periodic key frame.
So, a packet lost before a key frame will not be retransmitted.
But, decoder could wait (jitter buffer, play out time) and cause
a stutter.

Idea behind disabling NACKs after key frame was another knob to
throttle retransmission bit rate. But, with spaced out retransmissions
and max retransmissions per sequence number, there are throttles.
This would provide more throttling, but affects some applications.
So, disabling filtering NACKs after a key frame.

Introducing another flag to disallow layers. This would still be quite
useful, i.e. under congestion the stream allocator would move the
target lower. But, because of congestion, higher layer would have lost
a bunch of packets. Client would NACK those. Retransmitting those higher
layer packets would congest the channel more. The new flag (default
enabled) would disallow higher layers retransmission. This was happening
before this change also, just splitting out the flag for more control.

* split flag
2023-10-20 00:44:39 +05:30
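The flag split out in #2159 above can be pictured as a filter applied to NACKed packets before retransmission. This is only a sketch: `nackFilter`, `nackedPacket`, and the field names are assumptions for illustration, not the identifiers used in the change.

```go
package main

import "fmt"

// nackedPacket is a hypothetical record of a packet the client NACKed.
type nackedPacket struct {
	sn    uint16
	layer int32 // spatial layer the packet belongs to
}

// nackFilter is hypothetical; the flag mirrors the "disallow higher layers"
// behaviour described in the commit message.
type nackFilter struct {
	disallowHigherLayers bool
}

// shouldRetransmit reports whether a NACKed packet may be resent.
func (f nackFilter) shouldRetransmit(packetLayer, targetLayer int32) bool {
	if f.disallowHigherLayers && packetLayer > targetLayer {
		// Resending packets above the current target would add load to an
		// already congested channel.
		return false
	}
	return true
}

// filter returns the sequence numbers that are still eligible.
func (f nackFilter) filter(pkts []nackedPacket, targetLayer int32) []uint16 {
	var out []uint16
	for _, p := range pkts {
		if f.shouldRetransmit(p.layer, targetLayer) {
			out = append(out, p.sn)
		}
	}
	return out
}

func main() {
	f := nackFilter{disallowHigherLayers: true}
	nacked := []nackedPacket{{100, 0}, {101, 2}, {102, 1}}
	// The allocator has moved the target down to layer 1.
	fmt.Println(f.filter(nacked, 1))
}
```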
Raja Subramanian f97242c8ba Use 32-bit time stamp to get reference time stamp on a switch. (#2153)
* Use 32-bit time stamp to get reference time stamp on a switch.

With relay and dyncast and migration, it is possible that different
layers of a simulcast get out of sync in terms of extended type,
i.e. layer 0 could keep running and its timestamp could have
wrapped around and bumped the extended timestamp. But, another layer
could start and stop.

One possible solution is sending the extended timestamp across relay.

But, that breaks down during migration if publisher has started afresh.
Subscriber could still be using extended range.

So, use 32-bit timestamp to infer reference timestamp and patch it with
expected extended time stamp to derive the extended reference.

* use calculated value

* make it test friendly
2023-10-18 21:48:41 +05:30
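One way to read the approach in #2153 above: take the 32-bit reference timestamp and place it in the extended (64-bit) range at the position closest to the expected extended timestamp. The function below is a sketch under that reading; the name `extendNear` and the `uint64` representation are assumptions.

```go
package main

import "fmt"

// extendNear returns the extended timestamp whose low 32 bits equal ts and
// which lies closest to expectedExt. Hypothetical helper, not LiveKit code.
func extendNear(ts uint32, expectedExt uint64) uint64 {
	// Signed distance from the expected value, computed in 32-bit space so
	// that wrap-around is handled by the conversion to int32.
	diff := int32(ts - uint32(expectedExt))
	ext := int64(expectedExt) + int64(diff)
	if ext < 0 {
		// Expected is so close to zero that "slightly behind" would be
		// negative; move up by one cycle instead.
		ext += 1 << 32
	}
	return uint64(ext)
}

func main() {
	expected := uint64(1)<<32 + 16 // subscriber side has already wrapped once

	// A reference slightly ahead of expected.
	fmt.Printf("%#x\n", extendNear(150, expected))
	// A reference slightly behind expected, i.e. just before the wrap.
	fmt.Printf("%#x\n", extendNear(0xFFFFFFF0, expected))
}
```

Because only the low 32 bits of the reference are used, it does not matter whether the layer or the publisher that produced it has kept its own cycle count.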
Raja Subramanian 6c49d1a160 Logging a few bits at Infow (#2126)
Seeing sequencer errors with egress (related to dummy start).
So, logging a few bits at Infow to understand them better.
2023-10-05 11:16:31 +05:30
Raja Subramanian 69177b8b6e Debug logs to check on sequencer missing offset. (#2071)
* Debug logs to check on sequencer missing offset.

* spelling

* close range on value decrement
2023-09-14 12:20:33 +05:30
Raja Subramanian 68aebb0106 Do not mute when stream is paused. (#2069) 2023-09-13 19:59:24 +05:30
Raja Subramanian 5f701ece34 Include top bits from start in highest sequence number from RR. (#2064)
Streaming could start after 16-bits has rolled over. So, have to add
that base back to what is received in receiver report.

Otherwise, it looks like there are no packets received in window
leading to poor quality.
2023-09-12 14:36:25 +05:30
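The arithmetic behind #2064 above can be sketched as follows. A receiver starts its own cycle count at zero, so the extended highest sequence number in a receiver report has to be offset by the cycles that had already elapsed on the sending side when streaming began. The function name and the `uint64` representation are assumptions for illustration.

```go
package main

import "fmt"

// extendRRHighest adds the top bits of the extended sequence number at
// stream start to the extended highest sequence number from a receiver
// report. Hypothetical helper, not the LiveKit implementation.
func extendRRHighest(extStartSN uint64, rrExtHighestSN uint32) uint64 {
	base := extStartSN &^ 0xFFFF // cycles already elapsed when streaming started
	return base + uint64(rrExtHighestSN)
}

func main() {
	// Streaming starts after two roll-overs, at 16-bit sequence 0xFFF0.
	start := uint64(0x2FFF0)

	// Receiver has seen up to 0xFFF5 and counts zero cycles so far.
	fmt.Printf("%#x\n", extendRRHighest(start, 0x0000FFF5))
	// Receiver has observed one roll-over of its own and is at 0x0010.
	fmt.Printf("%#x\n", extendRRHighest(start, 0x00010010))
}
```

Without the base, the reported value would sit far below the sender's own extended numbers, which matches the symptom in the commit message of no packets appearing to be received in the window.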
Raja Subramanian 254a35543d Fix down stream stats. (#2063)
Need to pass in the correct time. Previously streaming start was
determined by another delta snap shot which was removed for efficiency.
Did not realise that we were passing in zero time for stats.

Also, revert of the change (the part which did not re-pause) from this
PR (https://github.com/livekit/livekit/pull/2037). That change affects
other paths. The edge it was trying to fix is more rare. Need to think
about a way which covers all cases.
2023-09-12 08:34:28 +05:30
Raja Subramanian b5f2f83278 Fix time stamp adjustment when starting with dummy packets. (#2053)
* Fix time stamp adjustment when starting with dummy packets.

- Populated extended values in ExtPacket on dummy packet.
- Have to pass reference time stamp offset to first packet time
  adjustment.

* display participant version info
2023-09-09 17:33:26 +05:30
Raja Subramanian 0fffaf3282 Some small optimisations (#2042)
* WIP commit

* WIP commit

* WIP commit

* Revert unintended delete
2023-09-07 13:25:09 +05:30
Raja Subramanian c122c20f49 Do not re-pause a paused track. (#2037) 2023-09-06 08:27:16 +05:30
Raja Subramanian 1590b96686 Need to set reference layer when starting with dummy packets. (#2034)
Dummy packets are used at start to trigger Pion's OnTrack.
2023-09-05 12:00:00 +05:30
cnderrauber b85ff8f063 Support non-SVC AV1 track publishing (#2030) 2023-09-04 12:39:14 +08:00
Raja Subramanian f0ca262bcf Prevent erroneous stream pause. (#2008) 2023-08-29 13:21:57 +05:30
Raja Subramanian 3b30f49ad5 Extended type for RTP timestamp. (#2001) 2023-08-27 17:28:44 +05:30
Raja Subramanian 55d5edcf73 Use range map in RTPMunger. (#2000)
* WIP commit

* Make lastSN 32-bit

* Remove unused TSCycles
2023-08-27 10:49:17 +05:30
Raja Subramanian 10893b9b33 Store referenceLayerSpatial in Forwarder state. (#1986)
When restoring state, reference layer could change before this change.
That meant the time stamp base would change and cause jumps.

But, the solution in this change to store the reference layer state
and restoring it has a different issue. It is possible that the reference is
layer 2 (HIGH) for example. On a migration when the down track has
to re-attach and resume to a moved up stream track, it is possible that
layer 2 is not published due to bandwidth constraint after publisher
migrates to new node. In that case, the stream cannot be resumed as
time stamp adjustment cannot be calculated.

An option is to set referenceSpatialLayer always at layer 0 (LOW).
But, that also has a couple of issues
- Browsers like FF have shown issues with layer mapping.
- Layer 0 is lowest bit rate. So, it will have RTCP at lower frequency.
  That could introduce a slight latency in stream start as we need
  RTCP sender report to calculate the time stamp.

Open to ideas on how to handle this better.
2023-08-22 00:03:37 +05:30
Raja Subramanian ee88115097 Demote noisy logs (#1976) 2023-08-18 09:52:02 +05:30
Raja Subramanian 129b1df8e6 Use VP9 Key frame detection from Galene. (#1973)
* Use VP9 Key frame detection from Galene.

With ffmpeg generated VP9 file with single layer
and publishing via Go SDK, the key picture determination
outlined at https://datatracker.ietf.org/doc/html/draft-ietf-payload-vp9-16#page-13
under the F bit explanation does not work. It declares key frame for
pretty much all frames. Unclear if ffmpeg generated bitstream has issues
or if that procedure in the above document does not work for single
layer case.

Using the bit stream explained here
https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf
(pages 28, 62, 63) implemented in Galene.
That is more expensive as it has to parse more, but works in all cases.

* Add AV1-TODo

* add some TODOs
2023-08-17 22:33:11 +05:30
Raja Subramanian ce1fde451c Get next higher using bit rate. (#1960) 2023-08-11 17:22:56 +05:30
Raja Subramanian 114888e7c7 log next sequence number also, easier to check layer switches (#1957) 2023-08-11 10:50:11 +05:30
Raja Subramanian 51650ea301 Use refTS if ahead. (#1956)
As expectedTS is tied to first packet and first packet adjustment
may not have happened, refTS being ahead is not a bad thing.

In one example,
- first packet was late
- a layer switch happened around 110ms later
- in that time, 190ms worth of media was forwarded
- but first packet adjustment did not happen yet
- so at that layer switch, expected was behind
- choosing ref at that switch is the right thing
2023-08-11 10:17:13 +05:30
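The decision in #1956 above comes down to a wrap-safe "is ref ahead of expected" comparison on 32-bit RTP timestamps. Below is a minimal sketch; the function name is hypothetical, and falling back to the expected value when ref is not ahead is an assumption, not something the commit message states. The numbers in `main` replay the example from the message at an assumed 90 kHz video clock.

```go
package main

import "fmt"

// pickSwitchTS chooses the timestamp to continue from at a layer switch.
// Hypothetical helper; the fallback to expectedTS is an assumption.
func pickSwitchTS(refTS, expectedTS uint32) uint32 {
	// Subtract in unsigned space, then reinterpret as signed so that the
	// comparison stays correct across a 32-bit wrap-around.
	if int32(refTS-expectedTS) > 0 {
		return refTS
	}
	return expectedTS
}

func main() {
	const clockRate = 90000 // assumed video clock rate in Hz
	first := uint32(1000)

	// 110 ms of wall clock elapsed, but 190 ms of media was forwarded.
	expected := first + 110*clockRate/1000
	ref := first + 190*clockRate/1000

	fmt.Println(pickSwitchTS(ref, expected) == ref)
}
```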
Raja Subramanian 7802310830 Restrict resume behind check to publisher mute only. (#1951)
A publisher mute is when the reference can fall behind because
of replaceTrack(nil). On a subscriber mute, should not jump ahead
to expected because publisher could still be lagging and behind.

Also consolidate logging.
2023-08-10 13:48:39 +05:30
Raja Subramanian 0e9ec9a21e Ignore lagging layer switches. (#1948) 2023-08-09 17:42:33 +05:30
Raja Subramanian c14c58b4ae Layer switches at log info to better understand A/V sync issues. (#1947) 2023-08-09 11:28:48 +05:30
Raja Subramanian 9a96abc11f Intermediate signed type casting (#1944) 2023-08-08 23:44:03 +05:30
Raja Subramanian 0dc92ef273 Remove parked layer feature. (#1927)
* Remove parked layer feature.

Not worth the added complexity.

Several reasons
- Not seeing black frames on pub mute always.
- If they are there, it can consume more than 30kbps if the parked layer
  is high res. That is wasted bandwidth downstream when pub is muted.
- On resume, client sometimes sends PLI and that triggers a key frame
  request.

But, leaving the separate `PubMuted` flag in forwarder in case we can
use it for better handling.

* need the request spatial
2023-08-02 14:02:29 +05:30
cnderrauber f7a1776f4c Add control of playout delay (#1838)
* Add control of playout delay

Add config to enable playout delay. The delay will be limited by
[min,max] in the config option and calculated by upstream & downstream
RTT.

* check protocol version to enable playout delay

* Move config to room, limit playout-delay update interval, solve comments

* Remove adaptive playout-delay

* Remove unused config
2023-08-02 16:12:23 +08:00
Raja Subramanian 0c34f12fa1 Demote some high frequency logs to Debugw (#1925) 2023-08-02 00:03:38 +05:30
David Zhao 981fb7cac7 Adding license notices (#1913)
* Adding license notices

* remove from config
2023-07-27 16:43:19 -07:00
Raja Subramanian 7a10f60be7 Remove packet debug. (#1909)
Not showing anything too useful.
2023-07-27 10:04:04 +05:30
Raja Subramanian 9702d3b541 A couple of more opportunities in stream allocator. (#1906)
1. When re-allocating for a track in DEFICIENT state, try to use
   available headroom to accommodate change before trying to steal
   bits from other tracks.
2. If the changing track gives back bits (because of muting or
   moving to a lower layer subscription), use the returned bits
   to try and boost deficient track(s).
2023-07-26 15:35:07 +05:30
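The two opportunities listed in #1906 above can be sketched as two small budget calculations: cover an increase from headroom before taking bits from other tracks, and hand returned bits to deficient tracks. Every name below is hypothetical, and the first-come order used when boosting is an assumption; the real allocator has its own priorities.

```go
package main

import "fmt"

// plan describes where the extra bits for a changing track come from.
type plan struct {
	fromHeadroom int64 // bps taken from unused channel capacity
	fromOthers   int64 // bps taken from other tracks
}

// planIncrease covers `needed` bps using headroom first and only then the
// bits other tracks could give up (`stealable`).
func planIncrease(needed, headroom, stealable int64) (plan, error) {
	var p plan
	if needed <= 0 {
		return p, nil
	}
	if headroom > 0 {
		p.fromHeadroom = needed
		if headroom < needed {
			p.fromHeadroom = headroom
		}
	}
	if remaining := needed - p.fromHeadroom; remaining > 0 {
		if remaining > stealable {
			return plan{}, fmt.Errorf("short by %d bps", remaining-stealable)
		}
		p.fromOthers = remaining
	}
	return p, nil
}

// boostDeficient hands `returned` bps to deficient tracks in slice order and
// reports how much each one receives.
func boostDeficient(returned int64, deficits []int64) []int64 {
	grants := make([]int64, len(deficits))
	for i, d := range deficits {
		if returned <= 0 {
			break
		}
		if d <= 0 {
			continue
		}
		g := d
		if returned < g {
			g = returned
		}
		grants[i] = g
		returned -= g
	}
	return grants
}

func main() {
	p, err := planIncrease(300_000, 200_000, 500_000)
	fmt.Println(p, err)
	fmt.Println(boostDeficient(250_000, []int64{100_000, 200_000}))
}
```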
Raja Subramanian 0484a68342 Plug a couple of holes in stream transitions. (#1905)
* Plug a couple of holes in stream transitions.

1. Missed negative sign meant stealing bits from other tracks was not
   working.
2. When a track change (mute, unmute, subscription change) cannot be
   allocated, explicitly pause so that stream state update happens.

Refactor stream state update a bit to make it a bit cleaner.

* correct comment
2023-07-26 13:36:58 +05:30
Raja Subramanian cf8cf1a87f Forgot to log important bits :-( (#1891) 2023-07-19 10:22:51 +05:30
Raja Subramanian 66de9ff4a0 Add debug log for RTCP sender report. (#1890)
* Add debug log for RTCP sender report.

Temporary to collect more data. Hitting scenarios under congestion
where the sender report gets off sync. Need some data to pore through
and understand and implement changes.

* Debugw
2023-07-18 23:21:06 +05:30
Raja Subramanian 11e1eb00fa Attempt to avoid out-of-order max subscribed layer notifications. (#1882)
* Check for request layer lock only in the goroutine

* check before sending PLI

* max layer notifier worker

* test cleanup

* clean up

* do notification in the callback
2023-07-16 23:28:20 +05:30
Raja Subramanian 4c02a6d717 Time stamp adjustments v2 (I think) (#1875)
* WIP commit

* WIP commit

* WIP commit

* Some clean up
- Removed a chatty debug log
- some spelling, punctuation correction in comments
- missed an `Abs` in check, add it.
2023-07-14 11:47:07 +05:30
Raja Subramanian 8dc2c005c3 Add ability to roll back video layer selection. (#1871)
* Add ability to roll back video layer selection.

Not currently useful, but it is possible to do things like not
applying a layer switch if the switch point time stamp is too far back.

Add ability to roll back a layer switch and invoke rollback if
a packet was selected for forwarding, but a subsequent error or decision
to drop the packet can rollback layer switch if that was the switching
packet.

In current code, the paths where a packet can be dropped after selection
do not happen at switch points. So, it was okay to apply the selection
unconditionally. But, adding the call to rollback in the current code
also in all paths where packet is dropped after selection for consistent
code flow.

* separate switch for temporal layer
2023-07-12 14:12:00 +05:30
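The rollback idea in #1871 above, reduced to its core: remember the layer that was current before a switch so the switch can be undone if the switching packet ends up being dropped. The `layerSelector` type and its rules (switch only on a key frame of the target layer) are simplifying assumptions for illustration.

```go
package main

import "fmt"

// layerSelector is a hypothetical, minimal spatial layer selector.
type layerSelector struct {
	current  int32
	previous int32 // layer in effect before the most recent switch
}

func newLayerSelector(initial int32) *layerSelector {
	return &layerSelector{current: initial, previous: initial}
}

// Select decides whether to forward a packet and reports whether this
// packet caused a layer switch.
func (s *layerSelector) Select(packetLayer, targetLayer int32, isKeyFrame bool) (forward, isSwitch bool) {
	if packetLayer == s.current {
		return true, false
	}
	if packetLayer == targetLayer && isKeyFrame {
		s.previous = s.current
		s.current = targetLayer
		return true, true
	}
	return false, false
}

// Rollback undoes the most recent switch.
func (s *layerSelector) Rollback() {
	s.current = s.previous
}

func main() {
	s := newLayerSelector(0)

	forward, isSwitch := s.Select(2, 2, true)
	fmt.Println(forward, isSwitch, s.current)

	// Suppose a later step decides to drop this packet after all.
	dropped := true
	if dropped && isSwitch {
		s.Rollback()
	}
	fmt.Println(s.current)
}
```

Calling `Rollback` only when the dropped packet was the switching packet keeps the selector's state aligned with what was actually sent.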
Raja Subramanian 5459bd2931 Push track quality to poor on a bandwidth constrained pause. (#1867)
* Push track quality to poor on a bandwidth constrained pause.

* add tests

* scale distance by divisor

* fix test distance to desired

* wait longer for subscription manager to reconcile
2023-07-11 15:29:35 +05:30
cnderrauber 873c87f24b Fix nack issue for svc codecs (#1856)
* Fix nack issue for svc codecs

* Fix test
2023-07-07 15:46:18 +08:00
cnderrauber 5b975af55f Refine dependency descriptor based selection forwarder (#1808)
* Don't update dependency info if unordered packet received

* Trace all active svc chains for downtrack

* Try to keep lower decode target decodable

* remove comments

* Test case

* clean code

* solve comments
2023-06-27 15:11:06 +08:00
Raja Subramanian 6946d0a3a1 Do not mute forwarder when paused to bandwidth congestion. (#1796)
* Do not mute forwarder when paused to bandwidth congestion.

Detailed notes in code.

* remove word
2023-06-16 12:08:01 +05:30
Raja Subramanian afa7733748 Promote switch logs to Infow. (#1790) 2023-06-12 17:30:56 +05:30
Raja Subramanian 3d696ac39f Keep next timestamp on switch closer to ref. (#1784)
If ref is coming in slow (due to pacing), it is possible that
expected is ahead. Pulling next too far towards expected causes
warps in a subsequent report. Keep switches closer to ref.
2023-06-10 11:38:46 +05:30
Raja Subramanian 7ed3af193a No proof that this helps (#1772) 2023-06-06 11:28:13 +05:30