379 Commits

Author SHA1 Message Date
cnderrauber b7f32dfffd Handle multiple codecs in renegotiation (#191)
* Handle multiple codecs in renegotiation

update pion to v3.1.9 so the answer keeps the same codec order as the publisher.
register enabled codecs when the subscriber peerconnection is created.

add codec parameter to buffer.bind
buffer should use the codec of TrackRemote as its codec mime.

send h264blankframe when DownTrack is closing
2021-11-17 21:18:43 +08:00
cnderrauber c4c93eaad6 Fix issue #159 (#195)
* Fix issue #159

use timestamp for AudioLevel observer
change smoothinterval default to 2 for more sensitivity
2021-11-16 21:59:15 +08:00
David Colburn 95e29d3766 Interface updates (#194)
* update interfaces, a bit of cleaning

* regenerate

* return interface for RoomService

* export packetBufferSize

* update router interface

* move participant key into router

* change locks back

* read only room store

* fix server rm locks

* update SendJoinResponse

* clean up imports

* update room messaging

* regenerate
2021-11-15 15:25:50 -06:00
David Zhao ffb2c50a70 Fixed room API breakage (#190) 2021-11-14 11:18:01 -08:00
David Zhao ceae58ac20 Fixed deadlocks occurring in Receiver writeRTP (#189)
When we RLock during write cycles, the mutex spends the majority of its time
staying locked. As new participants join, they have to acquire the WLock
before downtracks can be added.

In load test scenarios (25 participants joining together), it's common to see
a goroutine dump showing MediaTrack.AddSubscriber -> DownTrack.storeDownTrack trying to acquire the mutex and never succeeding.
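
One common way to keep a hot write path from holding a read lock is a copy-on-write snapshot. The sketch below illustrates that general technique only; it is not taken from this commit, and `receiver`, `downTrack`, `addDownTrack`, and `writeRTP` are hypothetical stand-ins rather than the project's actual types.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// downTrack is a hypothetical stand-in for a subscriber's outgoing track.
type downTrack struct{ id string }

// receiver keeps its subscribers in an immutable slice held by atomic.Value.
type receiver struct {
	mu     sync.Mutex   // serializes writers only; readers never take it
	tracks atomic.Value // always holds a []*downTrack that is never mutated in place
}

func newReceiver() *receiver {
	r := &receiver{}
	r.tracks.Store([]*downTrack{})
	return r
}

// addDownTrack copies the current slice, appends to the copy, and publishes it.
func (r *receiver) addDownTrack(dt *downTrack) {
	r.mu.Lock()
	defer r.mu.Unlock()
	old := r.tracks.Load().([]*downTrack)
	next := make([]*downTrack, len(old), len(old)+1)
	copy(next, old)
	next = append(next, dt)
	r.tracks.Store(next)
}

// writeRTP walks the current snapshot without taking any lock and
// returns how many down tracks it visited.
func (r *receiver) writeRTP() int {
	n := 0
	for range r.tracks.Load().([]*downTrack) {
		n++ // a real implementation would forward the packet here
	}
	return n
}

func main() {
	r := newReceiver()
	r.addDownTrack(&downTrack{id: "a"})
	r.addDownTrack(&downTrack{id: "b"})
	fmt.Println("down tracks visited:", r.writeRTP())
}
```

The trade-off is that adding a subscriber costs a slice copy, which is usually acceptable when packet writes vastly outnumber joins.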
2021-11-13 22:59:53 -08:00
David Colburn 4e16e4275c move NewStatsInterceptorFactory 2021-11-12 20:21:54 -08:00
David Colburn 92838d75a8 Analytics events + stats (#187)
* events

* bump

* update incoming stats

* publisher stats

* outgoing rtcp

* stats

* remove unnecessary struct

* merge mediaTrack

* put comment back
2021-11-12 16:36:10 -06:00
David Zhao 6500ce262d Fix publisher loss reporting (#186) 2021-11-11 23:29:54 -08:00
David Zhao 5a057f3d19 Handle nil when buffer pair doesn't exist 2021-11-11 23:16:21 -08:00
Raja Subramanian 0a20acdf68 Do not parse padding only retransmit packets as VP8 payload. (#185) 2021-11-12 10:09:25 +05:30
David Zhao 90f3c43dc5 Fixed deadlock in pion with SRTP and DataChannel 2021-11-11 13:25:08 -08:00
Raja Subramanian 2ec5f2bd3d Fixing edge cases in picture id munging. (#180)
* Fixing edge cases in picture id munging.

Changes

1. Check the RTP sequence number order before VP8 temporal layer
   filtering and use that ordering result while doing temporal
   layer filtering.

    In a sequence like below
       o Packet 10 -> Picture ID 10
       o Packet 11 -> missing
       o Packet 12 -> Picture ID 11
     it is not known if packet 11 will belong to Picture ID 10 or
     Picture ID 11. The problem becomes a lot more tricky if there
     is a burst loss and there is a larger hole in the picture id
     space also as a result.

     So, in the event of a packet loss, forward even if the current
     packet belongs to a layer that can be dropped. More comments
     in code.

2. Use result of sequence number ordering check while doing VP8 picture id munging.

3. When adding to missing picture id cache, have to include picture ids including
   both ends. As a picture can span multiple packets and it is not known which
   picture the packet belongs to, have to include both ends also in missing
   picture id cache in the event of a gap.

4. As a picture can span multiple packets, it is not possible to have a simple
   map of missing picture ids as an entry cannot be deleted if an out-of-order
   picture id is received. There may be more missing packets belonging to that
   picture id that is yet to be received.

   So, have to use an ordered map and truncate the map if it grows too large.

   Picked this for ordered map - https://github.com/elliotchance/orderedmap.
   Has a simple API, had the highest number of stars of all the ones I checked.
   And there are benchmarks.
   The author also wrote a medium post at https://medium.com/swlh/an-ordered-map-in-go-436634692381

   Another one which I looked at is - https://github.com/wk8/go-ordered-map.
   The author of that wrote at https://morioh.com/p/990229f32171 and has a
   bunch of other options at the end of that post (but does not include the
   one I picked above). None of those have that many stars.
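
The points above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `isNewer`, `shouldForward`, and `missingCache` are hypothetical names, the stored "offset" value is a guess at what such a cache might hold, picture id wraparound is ignored, and a plain slice plus map stands in for the ordered map library mentioned above.

```go
package main

import "fmt"

// isNewer reports whether seq comes after highest, treating RTP sequence
// numbers as 16-bit values that wrap around.
func isNewer(seq, highest uint16) bool {
	diff := seq - highest // unsigned subtraction wraps modulo 65536
	return diff != 0 && diff < 0x8000
}

// hasGap reports whether at least one packet is missing between highest and seq.
func hasGap(seq, highest uint16) bool {
	return isNewer(seq, highest) && seq-highest > 1
}

// shouldForward follows item 1: after a loss, forward the packet even if it
// belongs to a temporal layer that could otherwise be dropped.
func shouldForward(seq, highest uint16, droppableLayer bool) bool {
	if hasGap(seq, highest) {
		return true
	}
	return !droppableLayer
}

// missingCache remembers picture ids in insertion order and is truncated
// from the oldest end when it grows past max entries (item 4).
type missingCache struct {
	keys   []int       // picture ids, oldest first
	values map[int]int // picture id -> hypothetical munging offset
	max    int
}

func newMissingCache(max int) *missingCache {
	return &missingCache{values: make(map[int]int), max: max}
}

// addRange records every picture id from first to last, both ends included (item 3).
func (c *missingCache) addRange(first, last, offset int) {
	for id := first; id <= last; id++ {
		if _, ok := c.values[id]; !ok {
			c.keys = append(c.keys, id)
		}
		c.values[id] = offset
	}
	for len(c.keys) > c.max {
		delete(c.values, c.keys[0])
		c.keys = c.keys[1:]
	}
}

func main() {
	// Packet 11 is missing, so packet 12 is forwarded even on a droppable layer.
	fmt.Println(shouldForward(12, 10, true))
	c := newMissingCache(4)
	c.addRange(10, 11, 0) // both ends go into the cache
	fmt.Println(c.keys)
}
```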

Testing:
--------
- Set max temporal layers to 0 so that temporal filtering happens and run for
an hour on sample app.

* do not let padding packets through VP8

* Correct comment

* fix comment

* Review comments from Jie

* golang naming convention
2021-11-11 19:03:33 +05:30
Raja Subramanian fc52b18776 Try sending small key frames to clear decoded buffer (#179)
* Try sending small key frames to clear decoded buffer

Problem:
--------
With transceiver re-use, client disables/enables tracks.
With video, this retains the last picture and when a new track
starts, there is a brief moment when the old stream is displayed
till there is data from new stream to decode and display.

Fix:
---
Send small key frames before closing DownTrack to try and clear
the decoder display buffer.

Testing:
--------
Tried with Chrome/Safari/Firefox and they worked. But, very rarely,
the last frames do not seem to show up. In fact, 6 frames are sent
and webrtc internals (in Firefox) reports anywhere from 1 - 6 frames
at the small resolution. Unclear as to why it does not get all the
frames or why it reports less than 6. A not so small percentage of
times (maybe 1 in 15 - 20), have seen no small frame reported at all.

TODO:
----
- Have to support more video codecs
- Would this be an issue for audio also? Should we send something to handle that?
  Probably not necessary as video is more jarring.

* Make VP8 Key Frame a const

* Need a packet factory buffer for simple tracks too
as we are using the VP8 munger for simple tracks too because
of the need to send blank frames at the end.

Also, making the writeBlankFrameRTP a private function.
And adding a check to not send blank frames if nothing has been sent
on that DownTrack.
2021-11-11 14:38:38 +05:30
Raja Subramanian 3ff3e91165 Update pion/rtp and pion/webrtc to the latest (#182)
* Update pion/rtp and pion/webrtc to the latest

* introduce additional delay to fix test reliability

Co-authored-by: David Zhao <david@davidzhao.com>
2021-11-10 14:12:14 -08:00
Mathew Kamkar 9336a0dab5 health check depends on updated stats (#183) 2021-11-10 14:11:44 -08:00
Mathew Kamkar 94aec3b98d Node updates stats with KeepAlive message to self (#177)
* node sends KeepAlive message to self

* use WriteRTCNodeMessage instead of participants[0]
2021-11-09 17:19:46 -08:00
David Colburn 01cf22f2c4 remove error message 2021-11-09 09:33:21 -08:00
cnderrauber 8b7a776af6 Remove unused files (#174)
* remove unused files from ion-sfu

* remove comment code

* go mod tidy
2021-11-09 17:15:14 +08:00
David Colburn bf46e998b2 Sfu/buffer stats for telemetry (#173)
* more buffer stats for analytics

* update names

* fix jitter and lost rate

* don't return on participantLeft if they never published
2021-11-09 02:06:07 -06:00
David Zhao c5830f9060 add ion-sfu NOTICE 2021-11-08 20:56:53 -08:00
David Zhao 749446274f Use time.Unix instead of UnixMilli (for Go 1.15 compat) 2021-11-08 20:47:41 -08:00
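
For context on the commit above: `time.UnixMilli` was added in Go 1.17, so code that must build on older toolchains can convert through `time.Unix` instead. A minimal sketch, with `fromMillis` and `toMillis` as hypothetical helper names that are not taken from the commit:

```go
package main

import (
	"fmt"
	"time"
)

// fromMillis builds a time.Time from milliseconds since the Unix epoch
// using only APIs available before Go 1.17.
func fromMillis(ms int64) time.Time {
	return time.Unix(ms/1000, (ms%1000)*int64(time.Millisecond))
}

// toMillis is the inverse conversion, also without Time.UnixMilli.
func toMillis(t time.Time) int64 {
	return t.UnixNano() / int64(time.Millisecond)
}

func main() {
	t := fromMillis(1636433261123)
	fmt.Println(t.UTC(), toMillis(t))
}
```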
cnderrauber 1e1aaeb86b Separate from ion-sfu (#171)
* Separate from ion-sfu

changes:
1. extract pkg/buffer, twcc, sfu, relay, stats, logger

2. to solve cycle import, move ion-sfu/pkg/logger to pkg/sfu/logger

3. replace pion/ion-sfu => ./
reason: imports of pion/ion-sfu/pkg/* will change to livekit-server/pkg/*
after this PR is merged. No code is changed in this PR, because that would
be confused with the code separated from ion-sfu during review.

* Move code from ion-sfu to pkg/sfu

* fix build error from resolving conflict

Co-authored-by: cnderrauber <zengjie9004@gmail.com>
2021-11-09 12:03:16 +08:00
David Colburn 289ebd32ff Telemetry refactor (#172)
* telemetry refactor

* fix imports

* update protocol
2021-11-08 20:00:34 -06:00
David Zhao 8344466629 Delete unused code 2021-11-07 21:10:40 -08:00
David Zhao 6a80beedfc Only send connection quality updates for protocol 5+ 2021-11-03 23:09:25 -07:00
David Zhao aa9534b7fb Server-driven connection quality detection (#167) 2021-11-03 21:05:20 -07:00
David Colburn 862a212b93 idOrName -> name (#169) 2021-11-03 15:11:44 -05:00
Mathew Kamkar 45aeafd3af more room logging 2021-11-02 14:05:52 -07:00
Mathew Kamkar 05c4df4e23 Room logger with room name (#165)
* room with logger

* participant with room logger

* transport with room logger

* simplify room logger usage

* simplify logger

* update protocol

* more room logging, test fix
2021-11-02 14:02:45 -07:00
David Zhao 7442551ae5 Fix missing participant updates race (#163)
* Fix faulty participant update buffering.

* Fix bug with broadcasting out of order

* dedicated participant update worker, without locks

* use tracker to drop duplicate/out of date messages

* additional lock around filter logic
2021-10-31 15:20:46 -07:00
Mathew Kamkar f3e916e2fe Room Allocator Interface (#161)
* room allocator interface

* remove wire bind

* fix test
2021-10-28 21:02:17 -07:00
David Zhao 0898c17e8a Select video quality using provided dimensions (#158) 2021-10-28 21:01:05 -07:00
Raja Subramanian 4789ae4c7d Fix interface duplicate definition. (#157)
Got the following error on a fresh install
```
wire: /root/ws/livekit-server/pkg/service/interfaces.go:35:2: DeleteRoom redeclared
wire: /root/ws/livekit-server/pkg/service/interfaces.go:38:2: 	other declaration of DeleteRoom
wire: generate failed
Error: exit status 1
```
Probably something from the latest `wire` version.

After consulting David, removing the duplicate.

Testing:
--------
- Server builds and runs. Client is able to connect.
2021-10-25 21:25:46 +05:30
David Colburn 1f643dc96b remove SignalRequest_Simulcast (#154) 2021-10-21 17:11:43 -05:00
Raja Subramanian e0e46e079d Prevent missing entry in pending tracks (#152)
* Prevent missing entry in pending tracks

Problem:
--------
A track received via signalling request `AddTrack` is stored
in `pendingTracks` of participant. A MediaTrack is created
when `onTrack` fires after `SetRemoteDescription`. At that
time, pending tracks are searched to find a matching track
and look up an already published MediaTrack.

This is because `onTrack` fires once for every layer of
Simulcast and MediaTrack abstraction is for a media track and
not one for every layer of Simulcast track.

To accomplish that, pending tracks are cleaned up 5 seconds
after the MediaTrack is created. The theory there is that
`onTrack` will fire on all layers within 5 seconds. But, have
observed several instances on my slow machine of that firing
after 5 seconds which results in the search failing and we end
up creating a new MediaTrack.

The above is probably the reason (I am guessing though) for
subscriber PC having an extra m-line sometimes.

Considered fix:
---------------
One possible option is to increase that 5 seconds timeout to a
very large value. But, it has another issue.

`getPendingTrack` is given the track id which comes in the SDP.

Entries are added to the pending tracks using track id received
via the `AddTrack` signalling message.

And those two need not be the same. Especially Firefox has different ids
every time. Not sure if that is something we do on client side which
causes that, but it does look like a real possibility.

To handle that case, `getPendingTrack` looks up tracks by media kind
(audio/video) if the look up by SDP client id fails.

Here, it is possible that there are two pending tracks of type video
(think camera and screen sharing as an example) and looking up by kind
might end up picking the wrong one.

Fix:
----
Store the signalled client id and SDP client id in the MediaTrack and
look up the published tracks by SDP client id for a track match.

If there is no match, create a new MediaTrack and add it to publishedTracks
and delete the corresponding pending track all within the lock (yeah not
great to have a lot of code within the lock, but this is probably worth
it to have the correctness).

This does solve the issue of deferred pending track removal causing issues.

However, note that kind based look up may do some switching. In a scenario
where there are two pending tracks of kind video and the look up has to
rely on kind, it is possible that signalCid and sdpCid get cross matched
(i.e. client might have sent a signalCid for a Simulcast track, but during
kind based look up it gets assigned to a non-simulcast track). I think
that is okay as there is no strong correlation between the two.
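
The lookup order described in the fix can be sketched like this. It is a simplified illustration, not the PR's code: `participant`, `pendingTrack`, `mediaTrack`, `findPending`, and `onTrack` are hypothetical names, and the real types carry far more state.

```go
package main

import (
	"fmt"
	"sync"
)

// pendingTrack is what an AddTrack signalling request leaves behind.
type pendingTrack struct {
	signalCid string
	kind      string // "audio" or "video"
}

// mediaTrack stores both ids so later simulcast layers can be matched by sdpCid.
type mediaTrack struct {
	signalCid string
	sdpCid    string
}

type participant struct {
	mu        sync.Mutex
	pending   []pendingTrack
	published []*mediaTrack
}

// findPending looks up by client id first and falls back to media kind.
func findPending(pending []pendingTrack, sdpCid, kind string) (int, bool) {
	for i, p := range pending {
		if p.signalCid == sdpCid {
			return i, true
		}
	}
	for i, p := range pending {
		if p.kind == kind {
			return i, true
		}
	}
	return -1, false
}

// onTrack returns the track for sdpCid and whether it was newly created.
// Publishing and removing the pending entry happen under one lock.
func (p *participant) onTrack(sdpCid, kind string) (*mediaTrack, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, mt := range p.published {
		if mt.sdpCid == sdpCid {
			return mt, false // a later layer of an already published track
		}
	}
	i, ok := findPending(p.pending, sdpCid, kind)
	if !ok {
		return nil, false
	}
	mt := &mediaTrack{signalCid: p.pending[i].signalCid, sdpCid: sdpCid}
	p.published = append(p.published, mt)
	p.pending = append(p.pending[:i], p.pending[i+1:]...)
	return mt, true
}

func main() {
	p := &participant{pending: []pendingTrack{{"cam", "video"}, {"screen", "video"}}}
	mt, created := p.onTrack("sdp-1", "video") // ids differ, so kind is used
	fmt.Println(mt.signalCid, created)
	_, created = p.onTrack("sdp-1", "video") // second layer reuses the track
	fmt.Println(created, len(p.pending))
}
```

Because the pending entry is removed at the moment the track is published, no timer-based cleanup is needed in this sketch.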

Testing:
--------
- Connect from Chrome, Firefox (both orders, Chrome joining first, Firefox joining first) and ensure that media subscriptions and publishing are correct
- Ensure that DTX munging works properly too.

* Fix tests

Add back adding track to publishedTracks for testing purposes.

* Add a test to check case of `AddTrack` rejecting already published track

* Remove debug.

* Address PR comments - do not need to return SDP cid from `getPendingTracks`.
2021-10-21 12:50:07 +05:30
David Colburn 86d7fe8241 take iceServers out of room (#151) 2021-10-19 19:56:34 -05:00
David Zhao 833e497f37 Pass TrackInfo entirely to ensure fields aren't missing (#150) 2021-10-19 17:27:32 -07:00
Raja Subramanian 2d76c672e3 Use abs-send-time RTP header extension for video downstream (#149)
- Update ion-sfu to v1.20.14
- Enable `abs-send-time` for video tracks

Reference: ion-sfu PR - https://github.com/livekit/ion-sfu/pull/12

Testing:
--------
- Look at SDP offer in subscriber PC and ensure that abs-send-time is negotiated.
- Ensure that downstream packets have `abs-send-time` extension for video packets.

TODO:
-----
- Not yet setting this for audio tracks. Eventually we want to move
to TWCC. This is just a step along the way.
2021-10-19 23:46:04 +05:30
David Colburn 0c8fe361b2 Small refactor (#148)
* small refactor

* extra line

* fix room allocator test

* selector fakes not used

* keep decisions out of router

* put nodeId logic back

* fix room allocator test
2021-10-18 21:49:16 -05:00
David Colburn 1d626ba053 Update turn (#147)
* more generic turn server

* public turn realm name

* support turn cert itself in config

* remove cert/key from config

* double auth handler

* generate

Co-authored-by: Mathew Kamkar <578302+matkam@users.noreply.github.com>
2021-10-18 16:14:27 -05:00
David Zhao 43079866a2 Update to Pion v3.1.5, fixed simulcast / non-simulcast mixing 2021-10-15 09:35:01 -07:00
David Zhao 81712f9502 revert to pion v3.1.0-beta.3
mixing of simulcast and non-simulcast tracks is broken
2021-10-15 00:31:57 -07:00
David Zhao eba0c23375 Handle TrackInfo.Source attribute (#146)
* Support passing along Source attribute
2021-10-14 13:10:57 -07:00
Raja Subramanian d08646f17e Reuse transceivers (#145)
* WIP - Re-use transceiver

* Support both old and new methods for adding transceiver
2021-10-14 11:57:59 +05:30
Raja Subramanian ac4db4575f LK-105 (Opus DTX) (#140)
* LK-105 (Opus DTX)

https://linear.app/livekit/issue/LK-105/allow-enabling-of-opus-dtx

Enable/Disable Opus DTX using SDP answer based on setting in
`AddTrack` request.

Testing:
--------
Chrome and Firefox work. Having audio problems with Safari
(maybe the Safari 15 issue as I am not getting media)

* Check that receiver has no tracks

* Skip non-audio transceivers

* A small clean up to not use pending track outside lock and also append with spread

* Address comments from review by David

* Update pkg/rtc/participant.go

Co-authored-by: David Zhao <david@davidzhao.com>

* Pull in tagged version of webrtc and lk protocol

Co-authored-by: David Zhao <david@davidzhao.com>
2021-10-13 11:22:33 +05:30
Mathew Kamkar 84ab0f82af Prometheus counters for RTC connection steps (#143)
* signal ws connection, participant join, ice connection

* must register

* offer negotiation

* dz review: offer and offer_response

* dz review: answer
2021-10-12 15:22:17 -07:00
David Zhao 4149c4a314 removed duplicate region log 2021-10-10 22:52:51 -07:00
David Zhao 575b99840a Fixed handling of multiple nodes in region-aware routing 2021-10-10 22:25:29 -07:00
Brint E. Kriebel 822f8c3944 Region Aware node selection fixes and enhancements (#141)
* cli: Allow setting the current node region with flag or env variable

Also add region to "starting LiveKit server" log.

* routing: Add region to node registration

Register the node's region on the selected router so it can be used for
region aware node selection.

Also add the region to the list-nodes output.

* regionaware: Set minDist to zero for the current node

If you don't set the minDist when leaving the loop early for a node that
matches the current region, the minDist value will still be at max. This
causes the wrong node to be selected if the current node is the
first one the loop passes through.

Add a test that validates this change. The new test fails if this new
change is not in place.
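
A rough sketch of the behavior this message describes, assuming a two-pass selection (find the minimum distance, then keep the nodes at that distance). The structure, the `node` type, `nearestNodes`, and the precomputed `dist` map are all assumptions for illustration and are not taken from the selector's actual code.

```go
package main

import (
	"fmt"
	"math"
)

// node is a hypothetical stand-in for a registered server node.
type node struct {
	id     string
	region string
}

// nearestNodes returns the nodes closest to the current region. dist maps a
// region name to its distance from the current region.
func nearestNodes(nodes []node, current string, dist map[string]float64) []node {
	minDist := math.MaxFloat64
	for _, n := range nodes {
		if n.region == current {
			// Without this assignment minDist would stay at max after the
			// early exit when the matching node is the first one visited.
			minDist = 0
			break
		}
		if d, ok := dist[n.region]; ok && d < minDist {
			minDist = d
		}
	}

	var out []node
	for _, n := range nodes {
		d, ok := dist[n.region]
		if n.region == current {
			d, ok = 0, true
		}
		if ok && d == minDist {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nodes := []node{{"a", "us-west"}, {"b", "eu"}}
	fmt.Println(nearestNodes(nodes, "us-west", map[string]float64{"eu": 100}))
}
```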
2021-10-10 22:21:37 -07:00
cnderrauber 8ff18e0326 forward fraction lost from subscriber to publisher (#142)
Co-authored-by: cnderrauber <zengjie9004@gmail.com>
2021-10-10 21:56:13 -07:00