Before this change, the reference layer could change when restoring state.
That changed the time stamp base and caused jumps.
However, the fix in this change (storing the reference layer state
and restoring it) has a different issue. Suppose the reference is
layer 2 (HIGH). On a migration, when the down track has
to re-attach and resume on a moved up stream track, it is possible that
layer 2 is not published due to a bandwidth constraint after the publisher
migrates to the new node. In that case, the stream cannot be resumed as
the time stamp adjustment cannot be calculated.
One option is to always set referenceSpatialLayer to layer 0 (LOW).
But that also has a couple of issues:
- Browsers like FF have shown issues with layer mapping.
- Layer 0 has the lowest bit rate, so its RTCP is sent at a lower frequency.
That could introduce a slight latency in stream start as we need an
RTCP sender report to calculate the time stamp.
Open to ideas on how to handle this better.
* Use VP9 Key frame detection from Galene.
With an ffmpeg generated single layer VP9 file
published via the Go SDK, the key picture determination
outlined at https://datatracker.ietf.org/doc/html/draft-ietf-payload-vp9-16#page-13
under the F bit explanation does not work. It declares a key frame for
pretty much all frames. Unclear if the ffmpeg generated bitstream has issues
or if the procedure in the above document does not work for the single
layer case.
Instead, use the bitstream parsing explained in
https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf
(pages 28, 62, 63), as implemented in Galene.
That is more expensive as it has to parse more, but works in all cases.
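A rough sketch of the bitstream based check, assuming the input is the start of a VP9 frame with the RTP payload descriptor already stripped (so it is only meaningful on the first packet of a frame). `isVP9KeyFrame` is a hypothetical name; it reads the leading fields of the uncompressed header (frame_marker, profile bits, show_existing_frame, frame_type) and is not a copy of the Galene code.

```go
package main

import "fmt"

// isVP9KeyFrame inspects the first byte of the VP9 uncompressed header.
// Bit layout, MSB first: frame_marker(2), profile_low_bit(1),
// profile_high_bit(1), [reserved_zero(1) if profile 3],
// show_existing_frame(1), frame_type(1). frame_type 0 is a key frame.
func isVP9KeyFrame(frame []byte) bool {
	if len(frame) < 1 {
		return false
	}
	b := frame[0]
	// frame_marker must be binary 10.
	if b&0xC0 != 0x80 {
		return false
	}
	// Both profile bits set means profile 3, which carries an extra
	// reserved bit before show_existing_frame.
	if (b>>4)&0x3 == 3 {
		return b&0x06 == 0
	}
	// show_existing_frame and frame_type must both be 0.
	return b&0x0C == 0
}

func main() {
	fmt.Println(isVP9KeyFrame([]byte{0x82})) // profile 0, frame_type 0
	fmt.Println(isVP9KeyFrame([]byte{0x86})) // profile 0, frame_type 1
}
```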
* Add AV1 TODO
* add some TODOs
* Adjust TS and cycles when adjusting start.
Chasing some AddPacket errors across relay.
Noticed that in one case the start/end sequence numbers were flipped.
This is known to happen with resync, but it is
unclear if this instance was due to resync or not.
The start was close to the edge (64513), so looked at the
adjustment of start and noticed that the cycle count may need
to be incremented if start wraps back. In this case, the
start is about 1000 before the wrap point, so this may not be a wrap back
issue, but addressing what I found anyway.
* fix test
Not super useful. It does happen a bunch of times, especially at the lower
end of the estimate where the next layer up is high. We have to probe
anyway. Effects of large jumps have been mitigated by doing it for a short
time.
If room metadata changes between when a participant joins and
when they become active, that participant will not have the latest
room metadata.
Seeing a good chunk of logs using the default offset,
and they are heavily concentrated on a few tracks.
Logging more to understand this better before
potentially demoting this log.
As expectedTS is tied to first packet and first packet adjustment
may not have happened, refTS being ahead is not a bad thing.
In one example,
- first packet was late
- a layer switch happened around 110ms later
- in that time, 190ms worth of media was forwarded
- but first packet adjustment did not happen yet
- so at that layer switch, expected was behind
- choosing ref at that switch is the right thing
* Use 32-bit ts for first packet adjustment.
Otherwise, a new subscriber on a long-running track sees a huge difference
if the publisher side has rolled over.
As this adjustment happens only in the first two minutes of a track's lifecycle,
it is fine to not consider rollover.
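A small sketch of the difference between comparing extended (64-bit) and plain 32-bit time stamps. `tsDiff32` is a hypothetical helper; the assumption is that the publisher side keeps an extended time stamp that includes rollover cycles while a new subscriber starts without any.

```go
package main

import "fmt"

// tsDiff32 compares two time stamps using only their low 32 bits and
// returns the signed distance, so publisher side rollovers are ignored.
func tsDiff32(extTS, extRefTS uint64) int32 {
	return int32(uint32(extTS) - uint32(extRefTS))
}

func main() {
	pub := uint64(1)<<32 + 1000 // publisher has rolled over once
	sub := uint64(900)          // new subscriber has no cycles yet
	fmt.Println(pub-sub, tsDiff32(pub, sub))
}
```

The 64-bit difference includes the full rollover, while the 32-bit difference is just the 100 ticks that actually separate the two time stamps.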
* log RTP in anachronous report
On a publisher mute, the reference can fall behind because
of replaceTrack(nil). On a subscriber mute, do not jump ahead
to expected because the publisher could still be lagging behind.
Also consolidate logging.
* Integrate logger components
Dividing into the following components
* pub - publisher
* pub.sfu
* sub - subscriber
* transport
* transport.pion
* transport.cc
* api
* webhook
* update go modules
Previous change to check for non-zero width caused test failures
as subscribed track settings can use the quality field and not
necessarily width/height.
* Remove parked layer feature.
Not worth the added complexity.
Several reasons
- Not seeing black frames on pub mute always.
- If they are there, it can consume more than 30kbps if the parked layer
is high res. That is wasted bandwidth downstream when pub is muted.
- On resume, the client sometimes sends a PLI and that triggers a key frame
request.
But, leaving the separate `PubMuted` flag in forwarder in case we can
use it for better handling.
* need the request spatial
* Add control of playout delay
Add config to enable playout delay. The delay is calculated from the
upstream and downstream RTT and limited to the [min,max] range set in
the config option.
* check protocol version to enable playout delay
* Move config to room, limit playout-delay update interval, solve comments
* Remove adaptive playout-delay
* Remove unused config
Server could have closed the subscriber PC to aid migration.
But, if a resume lands back on that node, a resume of
the participant session is not possible as the subscriber PC is already
closed. While it is theoretically possible to form a new subscriber
peer connection, reducing complexity and issuing a full reconnect
as this should be a rare case.