Release to Docker / docker (push) Failing after 3m42s
* Key telemetry stats work using combination of roomID, participantID
With forwarded participant, the same participantID can existing in two
rooms.
NOTE: This does not yet allow a participant session to report its
events/track stats into multiple rooms. That would require regitering
multiple listeners (from rooms a participant is forwarded to).
* missed file
* data channel stats
* PR comments + pass in room name so that telemetry events have proper room name also
When a participant is closing, RTCP readers should be cleaned up from
factory even if the participant is expected to resume. The resumed
participant will be a new participant session and peer connection(s) and
everything will be set up again.
* Add checks for participant and sub-components close.
Looks like there might be some memory leak with participant sessions not
getting closed properly. Adding checks (to be cleaned up later) to see
if there is a consistent place where things might hang.
* init with right type
* Remove unnecessary goroutine, thank you @milos-lk
* clean up
Currently, it is a bit of a mish-mash
- some compose the message fully and just call send()
- some give parameters and the message is composed in
participant_signal.go
Was thinking about making an interface for signalling and have v1/v2
impls, but did not want to repeat composing messages if there are common
messages. And some of those function reach into `ParicipantImpl` object
and use information (simple example of p.IsReady()) which would become
more elaborate if the signaller is split out into its own struct.
Maybe, just need to make an interface for the sink and send to the
correct sink based on v1 /v2 signal transport.
But, for now, just grouping all signal messaages in one file
so that it is easier to manage later.
* Make sure moving out track has been unsubscribed
Remove start time checking in subscription manager
as We always use new track ID for republished track at #3020
so there is no race condition now.
Also RemoveSubscriber for moving out tracks for safety,
the subscription manager will handle the removed event but
RemoveSubscriber again will not be bad.
* Clear subscriber node max quality for moving out tracks
* Add Moving participant to another room
it is implemented in cloud only since the destination
room can exist in different node with the source room
* Update pkg/service/errors.go
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* rename
* test panic
* fake LocalParticipantHelper
* revert delete line
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Normalize mime type and add utilities.
An attempt to normalize mime type and avoid string compares remembering
to do case insensitive search.
Not the best solution. Open to ideas. But, define our own mime types
(just in case Pion changes things and Pion also does not have red mime
type defined which should be easy to add though) and tried to use it everywhere.
But, as we get a bunch of callbacks and info from Pion, needed conversion in
more places than I anticipated. And also makes it necessary to carry
that cognitive load of what comes from Pion and needing to process it
properly.
* more locations
* test
* Paul feedback
* MimeType type
* more consolidation
* Remove unused
* test
* test
* mime type as int
* use string method
* Pass error details and timeouts. (#3402)
* go mod tidy (#3408)
* Rename CHANGELOG to CHANGELOG.md (#3391)
Enables markdown features in this otherwise already markdown'ish formatted document
* Update config.go to properly process bool env vars (#3382)
Fixes issue https://github.com/livekit/livekit/issues/3381
* fix(deps): update go deps (#3341)
Generated by renovateBot
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
* Use a Twirp server hook to send API call details to telemetry. (#3401)
* Use a Twirp server hook to send API call details to telemetry.
* mage generate and clean up
* Add project_id
* deps
* - Redact requests
- Do not store responses
- Extract top level fields room_name, room_id, participant_identity,
participant_id, track_id as appropriate
- Store status as int
* deps
* Update pkg/sfu/mime/mimetype.go
* Fix prefer codec test
* handle down track mime changes
---------
Co-authored-by: Denys Smirnov <dennwc@pm.me>
Co-authored-by: Philzen <Philzen@users.noreply.github.com>
Co-authored-by: Pablo Fuente Pérez <pablofuenteperez@gmail.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Paul Wells <paulwe@gmail.com>
Co-authored-by: cnderrauber <zengjie9004@gmail.com>
* Set down track connected flag in one-shot-signalling mode.
Also, added maintaing ICE candidates for info purposes.
And doing analytics events (have to maintain the subscription inside
subscriptionmanager to get list of subscribed tracks, so added enough
bits from the async path into sync path to get the analytics bits also)
* comment typo
* method to check if a track name is subscribed
* Set down track connected flag in one-shot-signalling mode.
Also, added maintaing ICE candidates for info purposes.
And doing analytics events (have to maintain the subscription inside
subscriptionmanager to get list of subscribed tracks, so added enough
bits from the async path into sync path to get the analytics bits also)
* comment typo
* WIP
* comment
* Verify method on LocalParticipant
* cleanup
* clean up
* pass in one-shot-mode to StartSession
* null message source and sink
* feedback and also remove check in ParticipantImpl for one-shot-mode-filtering as a null sink can be used for that
* Add counter for pub&sub time metrics
The pub&sub shows large value in migration related case like
muted/disabled migration, the subscription time depends on
the time when publisher unmute the track(sending rtp packet
after migration), add a counter to distinguish since we
can't control the time in such cases and the first subscription
attemps also is more meaningful than those cases.
* Add info log for high publish delay
* Initial plumbing for metrics.
This implements
- metrics received from participant.
- callback to room.
- room distributes it to all other participants (excluding the sending
participant).
- other participants forward to client.
- counting metrics bytes in data channel stats
TODO:
- recording/processing/batching
- should recording/processing/batching happen on publisher side or
subscriber side?
- should metrics be echoed back to publisher?
- grants to publish/subscribe metrics.
* mage generate
* clear OnMetrics on close
* - CanSubscribeMetrics permission.
- Echo back to sender.
* update deps
* No destination identities for metrics
* WIP
* use normalized timestamp for server injected timestamps
* compile
* debug log metrics batch
* correct comment
* add baseTime to wire
* protocol dep
* Scope metrics forwarding to only participants that a participant is
subscribed to.
Also remove the participant_metrics.go file as it was not doing anything
useful.
* update comment
* utils.ErrorIsOneOf
* couple of more utils.CloneProto
* Do not remove from subscription map on unsubscribe.
Notes in line as to why.
* Avoiding the extra check. It is fine to check under lock for clean up.
It is done in other places.
* comments
* speed up track publication
Add metrics for track publication and subscription
Return EnabledCodecs in JoinResponse so client can
choose codec without server side codec fallback
Cache remote webrtc track without AddTrackRequest to
let client send publisher offer before AddTrackRequest response
* go mod
* clean code
That is the main change. Changed variable name to `isExpectedToResume`
everywhere to be consistent.
Planning to use the callback value in relays to determine if the down
track should be closed or switched to a different up track.
* Notify initial permissions
NOTE: This does add an initial subscription permission notification
which should be fine, but something to watch for.
A stress test combining
- mute/unmute on publisher side.
- allowing/revoking permission for subscriber from publisher side.
- subscribing/unsubscribing from subscriber side.
results in a scenario where a subscription permission update of
`not_allowed` being sent and on a re-subscribe, an `allowed` update does
not happen.
It happens like so
- Subscription revoke cloes the down track of subscriber.
- The subscription is still desired.
- So, a subscription reconcile runs and sees `permission: false`. This
sends subscription permission of `not_allowed`.
- Unsubscribe request comes in and sets `desired: false`.
- Reconsiler runs again and sees `desired: false` and `subscribedTrack:
nil`. This cleans up the subscription.
- Publisher grants permission for the subscriber.
- Subscriber subscribes to the track again. A new subscription is
created.
- Reconciler runs and sees `permission: true`, but there is no
permission change as it is a new subscription object. So, `allowed`
subscription permission update is not sent and the client is stuck at
`not_allowed`.
Fix, maintain if permission has been initialized. Has the effect of
sending an initial update which should be fine.
* clean up comment
* no default
* Maintain subscription count.
Does not affect function as it is not decremented only if limits are
configured. But, good to maintain proper count anyway.
* wire
* Remove some logs.
Also, changing Errorw -> Warnw in a bunch of places.
Going to move towards using `Errorw` for cases where a functionally
unexpected condition happens, i.e by design a condition should not
happen yet it triggered kind of scenarios.
* log error
* Use a participant worker queue in room.
Removes selectively needing to call things in goroutine from
participant.
Also, a bit of drive-by clean up.
* spelling
* prevent race
* don't need to remove in goroutine as it is already running in the worker
* worker will get cleaned up in state change callback
* create participant worker only if not created already
* ref count participant worker
* maintain participant list
* clean up oldState
* Log cleanup pass
Demoted a bunch of logs to DEBUG, consolidated logs.
* use context logger and fix context var usage
* moved common error types, fixed tests
* Fix ICE connection fallback
Short connection detection relied on iceFailedTimeout, which previously
had been misinterpreted. Since we've reduced iceFailedTimeout, it is
creating false negatives.
We'll instead use PingTimeout since clients are expected to keep the
signal connection active.
* reduce ping interval to align with total ice failure timeout
* Close subscriptions promptly
Two things:
-----------
1. Because the desired is not changed, the notifiers are not notified
that the subscription is not observing any more. So, that holds
a refernce to the subscription manager.
Address the above by setting `setDesired` to false on all subscriptions
when subscription manager closes. That will remove observer from the
notifiers.
2. When subscription manager is closed, the down track close
is invoked which flows back (with onClose callback of downtrack) to
subscription manager "handleSubscribedTrackClose". That callback
handler sets the subscribed track to nil for that subscription.
A couple of scenarios here
a. Without the above change, desired could have been true and it would
have looked that the track needs to try subscription again because
`needsSubscribe == true` (desired == true && subscribedTrack == nil)
b. Even with the change above, there is a new condition of
`desired == false && subscribedTrack == nil` and there was no handler
for that condition in the reconciler.
Address this by adding a `needsCleanup` function and delete subscription
from the map. Note that the reconciler may not be running to execute
this action as subscription manager would have closed the `closeCh`, but
doing the code in the interest of proper clean up.
* clean up
A subscription in subscription manager could live till the source
track goes away even though the participant with that subscription
is long gone due to closure on source track removal. Handle it by using
trackID to look up on source track removal.
Also, logging SDPs when a negotiation failure happens to check
if there are any mismatches.
* Perform unsubscribe in parallel to avoid blocking
When unsubscribing from tracks, we flush a blank frame in order to prepare
the transceivers for re-use. This process is blocking for ~200ms. If
the unsubscribes are performed serially, it would prevent other subscribe
operation from continuing.
This PR parallelizes that operation, and ensures subsequent subscribe
operations could reuse the existing transceivers.
* also perform in parallel when uptrack close
* fix a few log fields
* Avoid reconnect loop for unsupported downtrack
If the client subscribes to a track which codec is unsupported by the
client, sfu will trigger negotiation failed and issue a full reconnect
after received client answer. If the client try to subscribe that track
then it will got full reconnect again. That will cause a infinite
reconnect loop until the client don't subscribe that track. This PR
will unsubscribe the error track for the client and send a
SubscriptionResponse that contain the reason to indicates the track's
codec is not supported to avoid the reconnect loop.