* Handle multiple codecs in renegotiation
update pion to v3.1.9 so the answer keeps the same codec order as the publisher.
register enabled codecs when the subscriber PeerConnection is created.
add codec parameter to buffer.bind
buffer should use the codec of TrackRemote as its codec mime.
send h264 blank frames when DownTrack is closing
When we RLock during write cycles, the mutex spends the majority of its time
staying locked. As new participants join, they have to acquire the WLock
before downtracks can be added.
In load test scenarios (25 participants joining together), it's common to see
goroutine dumps showing MediaTrack.AddSubscriber -> DownTrack.storeDownTrack trying to acquire the mutex and never being able to acquire it.
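One way to keep joins from waiting behind the write path is to not hold a read
lock during writes at all. The sketch below is not necessarily what this commit
does. It assumes a hypothetical `downTrackSet` that publishes an immutable
snapshot through `atomic.Value`, so the write loop only loads the snapshot and
`add` copies the slice under a short writer-only mutex.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// downTrack is a hypothetical stand-in for a subscriber's DownTrack.
type downTrack struct {
	id string
}

// downTrackSet (hypothetical) keeps an immutable snapshot of down tracks.
type downTrackSet struct {
	mu       sync.Mutex   // serializes writers only; readers never take it
	snapshot atomic.Value // always holds a []*downTrack
}

func newDownTrackSet() *downTrackSet {
	s := &downTrackSet{}
	s.snapshot.Store([]*downTrack{})
	return s
}

// add copies the current snapshot, appends dt, and publishes the copy.
func (s *downTrackSet) add(dt *downTrack) {
	s.mu.Lock()
	defer s.mu.Unlock()
	old := s.snapshot.Load().([]*downTrack)
	next := make([]*downTrack, len(old), len(old)+1)
	copy(next, old)
	next = append(next, dt)
	s.snapshot.Store(next)
}

// load returns the current snapshot. Callers must not modify it.
func (s *downTrackSet) load() []*downTrack {
	return s.snapshot.Load().([]*downTrack)
}
```

The write cycle would range over `load()` without locking, so a joining
participant only contends with other joins.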
* Fixing edge cases in picture id munging.
Changes
1. Check the RTP sequence number order before VP8 temporal layer
filtering and use that ordering result while doing temporal
layer filtering.
In a sequence like below
o Packet 10 -> Picture ID 10
o Packet 11 -> missing
o Packet 12 -> Picture ID 11
it is not known if packet 11 will belong to Picture ID 10 or
Picture ID 11. The problem gets trickier with a burst loss,
which also leaves a larger hole in the picture id space.
So, in the event of a packet loss, forward even if the current
packet belongs to a layer that can be dropped. More comments
in code.
2. Use result of sequence number ordering check while doing VP8 picture id munging.
3. When adding to the missing picture id cache, include the picture ids at both
ends of the gap. A picture can span multiple packets and it is not known which
picture a missing packet belongs to, so both ends have to go into the missing
picture id cache in the event of a gap.
4. As a picture can span multiple packets, it is not possible to have a simple
map of missing picture ids as an entry cannot be deleted if an out-of-order
picture id is received. There may be more missing packets belonging to that
picture id that are yet to be received.
So, have to use an ordered map and truncate the map if it grows too large.
Picked this for ordered map - https://github.com/elliotchance/orderedmap.
It has a simple API and had the highest number of stars of all the ones I checked.
And there are benchmarks.
The author also wrote a Medium post at https://medium.com/swlh/an-ordered-map-in-go-436634692381
Another one which I looked at is - https://github.com/wk8/go-ordered-map.
The author of that wrote at https://morioh.com/p/990229f32171 and has a
bunch of other options at the end of that post (but does not include the
one I picked above). None of those have that many stars.
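A rough sketch of points 1, 3 and 4, for illustration only. It uses
`container/list` from the standard library instead of the ordered map package
picked above, so the snippet has no external dependencies. All names
(`isNewerSeq`, `missingPictureIDs`, `offset`) are hypothetical, and picture ids
are assumed to be already extended so that they do not wrap.

```go
package main

import "container/list"

// isNewerSeq reports whether RTP sequence number a is ahead of b,
// accounting for 16-bit wrap-around.
func isNewerSeq(a, b uint16) bool {
	return a != b && a-b < 1<<15
}

// missingEntry pairs a picture id with a hypothetical munging offset.
type missingEntry struct {
	pictureID int32
	offset    int32
}

// missingPictureIDs is an insertion-ordered cache that drops its oldest
// entries once it grows past maxSize.
type missingPictureIDs struct {
	maxSize int
	order   *list.List // oldest entry at the front
	index   map[int32]*list.Element
}

func newMissingPictureIDs(maxSize int) *missingPictureIDs {
	return &missingPictureIDs{
		maxSize: maxSize,
		order:   list.New(),
		index:   make(map[int32]*list.Element),
	}
}

// addRange records every picture id in [first, last], both ends included,
// because packets of either end picture may still be outstanding.
func (m *missingPictureIDs) addRange(first, last, offset int32) {
	for id := first; id <= last; id++ {
		if el, ok := m.index[id]; ok {
			el.Value = missingEntry{pictureID: id, offset: offset}
			continue
		}
		m.index[id] = m.order.PushBack(missingEntry{pictureID: id, offset: offset})
		if m.order.Len() > m.maxSize {
			oldest := m.order.Front()
			delete(m.index, oldest.Value.(missingEntry).pictureID)
			m.order.Remove(oldest)
		}
	}
}

// lookup returns the offset stored for id. The entry is kept, since more
// packets of the same picture may arrive later.
func (m *missingPictureIDs) lookup(id int32) (int32, bool) {
	el, ok := m.index[id]
	if !ok {
		return 0, false
	}
	return el.Value.(missingEntry).offset, true
}
```

For the sequence above, a gap between Picture ID 10 and Picture ID 11 would be
recorded as `addRange(10, 11, offset)`.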
Testing:
--------
- Set max temporal layers to 0 so that temporal filtering happens and run for
an hour on sample app.
* do not let padding packets through VP8
* Correct comment
* fix comment
* Review comments from Jie
* golang naming convention
* Try sending small key frames to clear decoded buffer
Problem:
--------
With transceiver re-use, client disables/enables tracks.
With video, this retains the last picture and when a new track
starts, there is a brief moment when the old stream is displayed
till there is data from new stream to decode and display.
Fix:
---
Send small key frames before closing DownTrack to try and clear
the decoder display buffer.
Testing:
--------
Tried with Chrome/Safari/Firefox and they worked. But, very rarely,
the last frames do not seem to show up. In fact, 6 frames are sent
and webrtc internals (in Firefox) reports anywhere from 1 - 6 frames
at the small resolution. Unclear as to why it does not get all the
frames or why it reports fewer than 6. A not so small percentage of
times (maybe 1 in 15 - 20), have seen no small frame reported at all.
TODO:
----
- Have to support more video codecs
- Would this be an issue for audio also? Should we send something to handle that?
Probably not necessary as video is more jarring.
* Make VP8 Key Frame a const
* Need a packet factory buffer for simple tracks too
as we are using the VP8 munger for simple tracks too because
of the need to send blank frames at the end.
Also, making writeBlankFrameRTP a private function.
And adding a check to not send blank frames if nothing has been sent
on that DownTrack.
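The check can be pictured roughly as below. This is a simplified sketch, not
the DownTrack code: `closingTrack`, its fields and the `write` callback are
hypothetical, and the payload is an opaque placeholder instead of a real VP8
key frame. The count of 6 is taken from the testing notes above.

```go
package main

// numBlankFrames is how many blank frames are sent on close.
const numBlankFrames = 6

// closingTrack is a hypothetical reduction of a DownTrack to the two
// things this check needs.
type closingTrack struct {
	packetsSent int                        // packets forwarded so far
	write       func(payload []byte) error // hypothetical RTP write hook
}

// writeBlankFrames sends the blank frames unless nothing was ever sent on
// the track. It returns how many frames were written.
func (c *closingTrack) writeBlankFrames(blankFrame []byte) (int, error) {
	if c.packetsSent == 0 {
		// Nothing was displayed, so there is no stale picture to clear.
		return 0, nil
	}
	for i := 0; i < numBlankFrames; i++ {
		if err := c.write(blankFrame); err != nil {
			return i, err
		}
	}
	return numBlankFrames, nil
}
```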
* Separate from ion-sfu
changes:
1. extract pkg/buffer, twcc, sfu, relay, stats, logger
2. to resolve the import cycle, move ion-sfu/pkg/logger to pkg/sfu/logger
3. replace pion/ion-sfu => ./
reason: imports of pion/ion-sfu/pkg/* will change to livekit-server/pkg/*
after this PR is merged. No code is changed in this PR, because that would
be confused with the code separated from ion-sfu during review.
* Move code from ion-sfu to pkg/sfu
* fix build error from conflict resolution
Co-authored-by: cnderrauber <zengjie9004@gmail.com>
* Fix faulty participant update buffering.
* Fix bug with broadcasting out of order
* dedicated participant update worker, without locks
* use tracker to drop duplicate/out of date messages
* additional lock around filter logic
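A minimal sketch of the worker and tracker idea, assuming each update carries a
per-participant version number. The types and the `version` field are
hypothetical and not taken from the protocol. The tracker here is owned by a
single goroutine, which is the lock-free variant, so it does not show the
additional lock from the last bullet.

```go
package main

// participantUpdate is a hypothetical update message.
type participantUpdate struct {
	identity string
	version  uint32 // assumed to increase with every change
}

// updateTracker remembers the last version forwarded per participant.
// It is not safe for concurrent use; one worker goroutine owns it.
type updateTracker struct {
	last map[string]uint32
}

func newUpdateTracker() *updateTracker {
	return &updateTracker{last: make(map[string]uint32)}
}

// accept reports whether u is newer than anything seen so far for that
// participant, and records it if so.
func (t *updateTracker) accept(u participantUpdate) bool {
	if prev, ok := t.last[u.identity]; ok && u.version <= prev {
		return false // duplicate or out of date
	}
	t.last[u.identity] = u.version
	return true
}

// runWorker drains in on the calling goroutine, forwards accepted updates
// to out in arrival order, and closes out when in is closed.
func runWorker(in <-chan participantUpdate, out chan<- participantUpdate) {
	tracker := newUpdateTracker()
	for u := range in {
		if tracker.accept(u) {
			out <- u
		}
	}
	close(out)
}
```

In use, `runWorker` would be started with `go runWorker(in, out)` and every
producer would only send on `in`.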
Got the following error on a fresh install
```
wire: /root/ws/livekit-server/pkg/service/interfaces.go:35:2: DeleteRoom redeclared
wire: /root/ws/livekit-server/pkg/service/interfaces.go:38:2: other declaration of DeleteRoom
wire: generate failed
Error: exit status 1
```
Probably something from the latest `wire` version.
After consulting David, removing the duplicate.
Testing:
--------
- Server builds and runs. Client is able to connect.
* Prevent missing entry in pending tracks
Problem:
--------
A track received via signalling request `AddTrack` is stored
in `pendingTracks` of participant. A MediaTrack is created
when `onTrack` fires after `SetRemoteDescription`. At that
time, pending tracks are searched to find a matching track
and look up an already published MediaTrack.
This is because `onTrack` fires once for every layer of
Simulcast and MediaTrack abstraction is for a media track and
not one for every layer of Simulcast track.
To accomplish that, pending tracks are cleaned up 5 seconds
after the MediaTrack is created. The theory there is that
`onTrack` will fire on all layers within 5 seconds. But, have
observed several instances on my slow machine of that firing
after 5 seconds which results in the search failing and we end
up creating a new MediaTrack.
The above is probably the reason (I am guessing though) for
subscriber PC having an extra m-line sometimes.
Considered fix:
---------------
One possible option is to increase that 5 seconds timeout to a
very large value. But, it has another issue.
`getPendingTrack` is given the track id which comes in the SDP.
Entries are added to the pending tracks using track id received
via the `AddTrack` signalling message.
And those two need not be the same. Firefox in particular has different ids
every time. Not sure if that is something we do on client side which
causes that, but it does look like a real possibility.
To handle that case, `getPendingTrack` looks up tracks by media kind
(audio/video) if the look up by SDP client id fails.
Here, it is possible that there are two pending tracks of type video
(think camera and screen sharing as an example) and looking up by kind
might end up picking the wrong one.
Fix:
----
Store the signalled client id and SDP client id in the MediaTrack and
look up the published tracks by SDP client id for a track match.
If there is no match, create a new MediaTrack and add it to publishedTracks
and delete the corresponding pending track all within the lock (yeah not
great to have a lot of code within the lock, but this is probably worth
it to have the correctness).
This does solve the problems caused by deferred pending track removal.
However, note that kind based look up may do some switching. In a scenario
where there are two pending tracks of kind video and the look up has to
rely on kind, it is possible that signalCid and sdpCid get cross matched
(i.e. client might have sent a signalCid for a Simulcast track, but during
kind based look up it gets assigned to a non-simulcast track). I think
that is okay as there is no strong correlation between the two.
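The look up order described above can be sketched as follows. This is a
simplified model, not the participant code: `participant`, `pendingTrack`,
`mediaTrack` and `onTrack` are hypothetical reductions that keep only the
fields needed to show the matching, the fallback by kind, and the removal of
the pending entry inside the same lock.

```go
package main

import "sync"

type trackKind int

const (
	kindAudio trackKind = iota
	kindVideo
)

// pendingTrack is what an `AddTrack` request leaves behind (hypothetical).
type pendingTrack struct {
	signalCid string
	kind      trackKind
}

// mediaTrack keeps both client ids so later layers can be matched.
type mediaTrack struct {
	signalCid string
	sdpCid    string
	kind      trackKind
}

type participant struct {
	mu        sync.Mutex
	pending   []pendingTrack
	published []*mediaTrack
}

// onTrack returns the MediaTrack for sdpCid and whether it was created by
// this call. It returns nil if nothing pending can be matched.
func (p *participant) onTrack(sdpCid string, kind trackKind) (*mediaTrack, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()

	// Another Simulcast layer of an already published track.
	for _, mt := range p.published {
		if mt.sdpCid == sdpCid {
			return mt, false
		}
	}
	// Match a pending track by client id first, then fall back to kind.
	idx := -1
	for i, pt := range p.pending {
		if pt.signalCid == sdpCid {
			idx = i
			break
		}
	}
	if idx < 0 {
		for i, pt := range p.pending {
			if pt.kind == kind {
				idx = i
				break
			}
		}
	}
	if idx < 0 {
		return nil, false
	}
	mt := &mediaTrack{signalCid: p.pending[idx].signalCid, sdpCid: sdpCid, kind: kind}
	p.published = append(p.published, mt)
	p.pending = append(p.pending[:idx], p.pending[idx+1:]...)
	return mt, true
}
```

Because the pending entry is removed in the same critical section that
publishes the track, a late `onTrack` for another layer finds the published
track by SDP client id and no timer is involved.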
Testing:
--------
- Connect from Chrome, Firefox (both orders, Chrome joining first, Firefox joining first) and ensure that media subscriptions and publishing are correct
- Ensure that DTX munging works properly too.
* Fix tests
Add back adding track to publishedTracks for testing purposes.
* Add a test to check case of `AddTrack` rejecting already published track
* Remove debug.
* Address PR comments - do not need to return SDP cid from `getPendingTracks`.
- Update ion-sfu to v1.20.14
- Enable `abs-send-time` for video tracks
Reference: ion-sfu PR - https://github.com/livekit/ion-sfu/pull/12
Testing:
--------
- Look at SDP offer in subscriber PC and ensure that abs-send-time is negotiated.
- Ensure that downstream packets have `abs-send-time` extension for video packets.
TODO:
-----
- Not yet setting this for audio tracks. Eventually we want to move
to TWCC. This is just a step along the way.
* small refactor
* extra line
* fix room allocator test
* selector fakes not used
* keep decisions out of router
* put nodeId logic back
* fix room allocator test
* more generic turn server
* public turn realm name
* support turn cert itself in config
* remove cert/key from config
* double auth handler
* generate
Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>
* LK-105 (Opus DTX)
https://linear.app/livekit/issue/LK-105/allow-enabling-of-opus-dtx
Enable/Disable Opus DTX using SDP answer based on setting in
`AddTrack` request.
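For illustration, DTX is signalled through the Opus `usedtx=1` fmtp parameter.
The helper below is a sketch of toggling that parameter on an fmtp string. The
function name is hypothetical and how the server actually rewrites the answer
is not shown here.

```go
package main

import "strings"

// setOpusDTX returns fmtp with `usedtx=1` present when enable is true and
// with any `usedtx` parameter removed otherwise. Other parameters keep
// their order.
func setOpusDTX(fmtp string, enable bool) string {
	var params []string
	for _, p := range strings.Split(fmtp, ";") {
		p = strings.TrimSpace(p)
		if p == "" || strings.HasPrefix(p, "usedtx=") {
			continue
		}
		params = append(params, p)
	}
	if enable {
		params = append(params, "usedtx=1")
	}
	return strings.Join(params, ";")
}
```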
Testing:
--------
Chrome and Firefox work. Having audio problems with Safari
(maybe the Safari 15 issue as I am not getting media)
* Check that receiver has no tracks
* Skip non-audio transceivers
* A small clean up to not use pending track outside lock and also append with spread
* Address comments from review by David
* Update pkg/rtc/participant.go
Co-authored-by: David Zhao <david@davidzhao.com>
* Pull in tagged version of webrtc and lk protocol
Co-authored-by: David Zhao <david@davidzhao.com>
* cli: Allow setting the current node region with flag or env variable
Also add region to "starting LiveKit server" log.
* routing: Add region to node registration
Register the node's region on the selected router so it can be used for
region aware node selection.
Also add the region to the list-nodes output.
* regionaware: Set minDist to zero for the current node
If you don't set the minDist when leaving the loop early for a node that
matches the current region, the minDist value will still be at max. This
causes the wrong node to be selected if the current node is the
first one the loop passes through.
Add a test that validates this change. The new test fails if this new
change is not in place.
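A sketch of the failure mode, as a guess at the shape of the selector and not
a copy of it: `node`, `nearestNodes` and the `distance` callback are
hypothetical. The first loop finds the minimum distance and the second keeps
the nodes at that distance. If `minDist` is left at max on the early exit,
every node passes the second loop.

```go
package main

import "math"

// node is a hypothetical reduction of a registered node.
type node struct {
	id     string
	region string
}

// nearestNodes returns the nodes closest to currentRegion. distance gives
// the distance from the current region to another region.
func nearestNodes(nodes []node, currentRegion string, distance func(region string) float64) []node {
	dist := func(n node) float64 {
		if n.region == currentRegion {
			return 0
		}
		return distance(n.region)
	}

	minDist := math.MaxFloat64
	for _, n := range nodes {
		if n.region == currentRegion {
			// Without this assignment minDist stays at max when the first
			// node visited is in the current region.
			minDist = 0
			break
		}
		if d := dist(n); d < minDist {
			minDist = d
		}
	}

	var nearest []node
	for _, n := range nodes {
		if dist(n) <= minDist {
			nearest = append(nearest, n)
		}
	}
	return nearest
}
```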