* WIP commit
* WIP commit
* Remove debug
* Revert to reduce diff
* Fix tests
* Determine spatial layer from track info quality if non-simulcast
* Adjust for invalid layer on no rid, previously that function was returning 0 for no rid case
* Fall back to top level width/height if there are no layers
* Use duration from RTPDeltaInfo
- SenderSSRC was not set for NACK, RTCP_RR - so SRTP context
was using SSRC = 0 which is not bad, but let us set the SSRC properly.
- PLI was using a random SSRC on every PLI. So, that would have
created a new SRTP stream (not bad as that stream context is small)
on every PLI. It is wasteful. So, set the SenderSSRC to the mediaSSRC.
- Reduce re-start of higher layers to 10 seconds. That is long enough
to declare that a stream layer has restarted.
* Use grants clone
* Fix a couple of more races
Use a shadow copy of down tracks in DownTrackSpreader.
Read always uses the shadow.
On Add/Delete of down track, make a new copy.
Copying is done only on add/delete.
If somebody is holding reference to a shadow, it will be in tact as Add/Delete create a new slice.
With this, not seeing any more races in test. So, enabling CI tests with `-race`.
Also fixing another race reported in #603
There are a couple of more races in that bug report that needs to be
chased down.
* Use env suggested in https://lifesaver.codes/answer/runtime-race-detector-sigabrt-or-sigsegv-on-macos-monterey-49138
* staticcheck, did not fail locally, but reported by CI
* use API to get down tracks
* Stats of NACKs acked and number of repeated NACKs.
Also making a change in delta stat to drop negative packet loss
counts to 0. Because of windowing it is a legitimate case.
The receiver could have seen a loss in window we are measuring
and in the subsequent window, the receiver could have gotten
a retransmission and reduced the packet loss count resulting
in a negative delta. When we report negative delta, it could
get dropped by analytics validator. That will be lost data.
Avoid that.
* Remove unused code
* Pick up latest protocol
* Remove `Head` field from `ExtPacket` structure.
Although we do not intend to, but if packets get out-of-order
in the forwarding path (maybe reading in multiple goroutines
or using some worker pool to distribute packets), the `Head`
indicator could lead to wrong behaviour. It is possible that
at the receiver, the order is
- Seq Num N, Head = true
- N + 1, Head = true
If the forwarding path sees `N + 1` first, the Head flag
when it sees `N` packet is incorrect and will lead to incorrect
behaviour.
The alternative check is very simple. So, remove `Head` flag.
* Remove unused field
* Use delta stats throughout and avoid calculating deltas in telemetry
* Fix a few things after testing
* Remove debug
* Fix tests
* delete instead of setting to nil
* Point to the latest protocol
* Handle an edge in layer lock.
A very edge case
- Available layer: [0, 1, 2], but bitrate is not yet available.
We set it to layer 2 awaiting measurement.
- Measurement for layers 0 and 1 come through.
- Still no key frame for layer 2.
- Finalize layers runs and sees that bitrate is available for 0 and 1.
It finalizes layer 1.
- Layer 1 key frame comes (because we asked key frame of layer 2,
publisher sends key frame for all layers). Locks to layer 1.
- No more events happen to switch to layer 2.
Changes
-------
- Move bit rate measurement to StreamTrackerManager. Allows re-use
in relay.
- Make bit rate availability (from zero -> non-zero) an event
and let it flow through the stream allocation flow so that we
always have an event when bit rate measurement becomes available.
This gets rid of finalization which I was unhappy with it anyway as
it was polling every second.
- Removing REMB stuff from buffer. We do not use it.
It is incorrect anyway. REMB should be ay peer connection level.
* Fix test
* fix test
* Simplify allocate
* Simplify/clean up
This still does not address root cause of large loss, but at least
does not display crazy thing like packets = 0, but packet rate is 45/s.
Also, RLock in ToString() as there are bits of structure used in
stringification.
* Key frames
- Keep track of key frame stats
- Split out PLI from down track used for purpose of layer locking.
This will give us a good picture of down stream issues forcing a PLI.
- Use key frame requester whenever there is a layer lock required.
Not just the first key frame. With the synchronous thing, the counter
was just ridiculously high like 150 or something because of all
the initial padding packets. Also, use RTT in key frame requester.
* send first PLI before waiting
* Turn off key frame requester when disabled
* simplify
* Introducing OpsQueue
Creating a PR to get feedback on standardizing on this concept.
Can be used for callbacks.
Already a couple of places use this construct. Wondering if we
should standardize on this across the board.
Just changing one place to use the new struct. Another place
that I know of which uses this pattern is the telemetry package.
* atomic flag -> bool