Seeing an error in an e2e test: after migration, no packets are
forwarded. The only plausible cause seems to be a payload type mismatch
(assuming there are no errors in the forwarding loop pulling packets
from the buffer). So, logging some packet stats in the forwarding loop.
* Use atomic to store codec.
It can change on an upstream codec change, but not seeing any racy
behaviour with atomic access.
Reverting the previous change to mute with this change.
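A minimal sketch of the atomic codec storage, assuming hypothetical type and field names (`Codec`, `DownTrack` here are stand-ins, not the actual structs):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Codec is a stand-in for the real codec parameters struct.
type Codec struct {
	MimeType    string
	PayloadType uint8
}

// DownTrack holds the current codec in an atomic.Pointer so the
// forwarding loop can read it without taking the bind lock, while
// an upstream codec change can swap it in concurrently.
type DownTrack struct {
	codec atomic.Pointer[Codec]
}

// SetCodec atomically replaces the codec on an upstream codec change.
func (d *DownTrack) SetCodec(c *Codec) { d.codec.Store(c) }

// Codec atomically reads the current codec for forwarding decisions.
func (d *DownTrack) Codec() *Codec { return d.codec.Load() }

func main() {
	d := &DownTrack{}
	d.SetCodec(&Codec{MimeType: "audio/opus", PayloadType: 111})
	fmt.Println(d.Codec().MimeType)
}
```

Readers and the writer never hold a lock; `atomic.Pointer` swaps the whole struct, so a reader always sees a consistent mime type / payload type pair.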
* no mime arg
Need to revisit the bind lock scope and maybe make the codec/mime
atomic so they can be accessed without the bind lock. But, playing a
bit of whack-a-mole first to move things forward. Will look at making
them atomics.
With publish RED and subscribe Opus, RTCP sender reports were not sent
on the down track because publisher sender reports were not being
forwarded to it.
* Dependent participants should not count towards FirstJoinedAt
According to the API, empty timeout should be honored as long as no
independent participant joins the room. If we counted Agents and Egress
as part of FirstJoinedAt, it would have the side effect of using
departureTimeout instead of emptyTimeout for idle calculations.
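A sketch of the intended check; `ParticipantKind`, `isDependent`, and the room fields are illustrative names, not the actual API:

```go
package main

import (
	"fmt"
	"time"
)

// ParticipantKind is a stand-in for the real participant kind enum.
type ParticipantKind int

const (
	KindStandard ParticipantKind = iota
	KindAgent
	KindEgress
)

// isDependent reports whether a participant should be ignored for
// FirstJoinedAt, i.e. it never keeps a room alive on its own.
func isDependent(k ParticipantKind) bool {
	return k == KindAgent || k == KindEgress
}

type Room struct {
	firstJoinedAt time.Time
}

// onParticipantJoined records FirstJoinedAt only for independent
// participants, so a room holding only agents/egress keeps using
// emptyTimeout instead of departureTimeout for idle calculations.
func (r *Room) onParticipantJoined(k ParticipantKind) {
	if isDependent(k) || !r.firstJoinedAt.IsZero() {
		return
	}
	r.firstJoinedAt = time.Now()
}

func main() {
	r := &Room{}
	r.onParticipantJoined(KindAgent) // dependent: does not set FirstJoinedAt
	fmt.Println(r.firstJoinedAt.IsZero())
	r.onParticipantJoined(KindStandard) // independent: sets it
	fmt.Println(!r.firstJoinedAt.IsZero())
}
```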
* use Room logger
- With probing, the packet rate can spike suddenly and the remote may
  not have sent a receiver report yet, as it might still be reporting
  for the non-spiky rate. That causes metadata cache overflows. So,
  give RTX a bigger cache.
- Don't need a large cache for primary: either reports come in
  regularly, or they are missing for a long time and a bigger cache is
  not the solution for that. So, reduce primary cache size.
- Check for the receiver report falling back by exactly (1 << 16). Had
  made that change in the inner for loop, but missed the top level
  check :-(
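The top-level check could look like the following sketch. The extended highest sequence number combines a 16-bit cycle count with the 16-bit sequence number, so a report that comes in exactly one cycle (1 << 16) lower than the previous one suggests the remote under-counted a wrap rather than a real regression (function names are illustrative):

```go
package main

import "fmt"

// extHighest builds RTCP's 32-bit extended highest sequence number
// from a 16-bit cycle count and a 16-bit sequence number.
func extHighest(cycles, seq uint16) uint32 {
	return uint32(cycles)<<16 | uint32(seq)
}

// isCycleFallback reports whether a receiver report's extended
// highest sequence number fell back by exactly one cycle (1 << 16)
// relative to the previous report, which points at a cycle count
// that is off by one rather than the sequence genuinely regressing.
func isCycleFallback(prevExt, newExt uint32) bool {
	return prevExt >= (1<<16) && prevExt-newExt == (1<<16)
}

func main() {
	prev := extHighest(3, 1234)
	fmt.Println(isCycleFallback(prev, extHighest(2, 1234))) // one full cycle behind
	fmt.Println(isCycleFallback(prev, extHighest(3, 1300))) // normal progression
}
```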
This is mostly to clean up the forwarder state cache for already
started tracks.
A scenario like the following could apply the seed twice and end up with
an incorrect state, resulting in a large jump:
1. Say Participant A is the one showing the problem.
2. Participant A migrates first. So, it tries to restore its down track states by querying state from the previous node.
3. But, its down tracks start before the response can be received. However, the response remains in the cache.
4. Participant B migrates from a different node to the node Participant A is on. So, the down track of Participant A gets switched from relay up track publisher -> local up track publisher.
5. I am guessing the seeding gets applied twice in this case and the cached value from step 3 causes the huge jump.
In those cases, the cache needs to be cleaned up.
(NOTE: I think this seeding of down track on migration is not necessary
as the SSRC of down track changes and the remote side seems to be
treating it like a fresh start because of that. But, doing this step
first and will remove the related parts after observing for a bit more)
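One way the cleanup could be shaped, sketched with hypothetical names (`stateCache`, `takeState`, the track ID key are all assumptions): the cached state is consumed at most once, so a later publisher switch cannot seed the forwarder a second time.

```go
package main

import (
	"fmt"
	"sync"
)

// ForwarderState is a stand-in for the cached per-track state
// fetched from the previous node.
type ForwarderState struct {
	ExtHighestSN uint64
}

// stateCache holds forwarder states keyed by track ID.
type stateCache struct {
	mu     sync.Mutex
	states map[string]ForwarderState
}

// takeState returns the cached state at most once; the entry is
// deleted on first use so a later publisher switch (e.g. relay up
// track -> local up track after another participant migrates over)
// cannot re-apply the seed and cause a large sequence number jump.
func (c *stateCache) takeState(trackID string) (ForwarderState, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	s, ok := c.states[trackID]
	if ok {
		delete(c.states, trackID)
	}
	return s, ok
}

func main() {
	c := &stateCache{states: map[string]ForwarderState{"TR_a": {ExtHighestSN: 42}}}
	_, ok := c.takeState("TR_a")
	fmt.Println(ok) // first seed succeeds
	_, ok = c.takeState("TR_a")
	fmt.Println(ok) // second seed attempt finds nothing
}
```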
Also, moving the fetching of forwarder state to a goroutine as it
involves a network call to the previous node via the Director.