livekit

mirror of https://github.com/livekit/livekit.git synced 2026-05-24 10:15:28 +00:00

Author	SHA1	Message	Date
Raja Subramanian	c2335968de	Prevent evaluation over small window. (#1516 ) With push model (i.e. connection quality evaluation triggered by reception of RTCP receiver report), it is possible that a report is received quickly after a track is started (especially with video). Those should not trigger a quality evaluation. Set `lastStatsAt` in `Start` routine and ensure that start has been called and enough time has passed since last stats time to avoid small windows.	2023-03-14 16:27:39 +05:30
Raja Subramanian	c70aa616a9	Expected vs actual Layer based connection quality. (#1509 ) * Expected vs actual Layer based connection quality. With VBR streams (like screen share), bit rate is not a good indicator of whether desired layer (spatial/temporal) is achieved due to high variance. Using expected vs actual layer (i.e. distance to desired) can capture any shortfall and include it in quality scoring. This PR uses distance to desired, i.e. how many steps it would take to go from actual spatial/temporal -> desired spatial/temporal and that distance is proportionally used (currently it is just linear) to decrease score. * wire up layer transitions for screen share tracks	2023-03-10 13:08:36 +05:30
Raja Subramanian	e893d30fd0	Use EWMA (Exponentially Weighted Moving Average) for score updates. (#1507 ) * Use EWMA (Exponentially Weighted Moving Average) for score updates. Makes code simpler, but makes it harder to test as the inflection points are not exact. Score falls a bit slower to be conservative on dropping quality too quickly. Still fall factor is higher (i.e. newer scores get more weight) than rise factor (i.e. newer scores get lower weight). Slower rise factor to introduce hysteresis on things climbing back too quickly. In the extreme case, asymptotic conditions could cause unexpected results. For example, having 4% loss of video continuously will never drop quality to `POOR`. It will get close to 60, but it will always stay above 60 forever and hence quality will never drop to POOR. Maybe, need some sort of variable thresholding to deal with that. But, that is an extreme case and may not happen in real life. * remove unused stuff	2023-03-09 13:52:01 +05:30
Raja Subramanian	14b0b48b15	Push/pull for connection stats/quality scoring. (#1505 ) * Push/pull for connection stats/quality scoring. Was not happy with pure pull method missing a window because RTCP RR timing is slightly off for audio and using a much larger window of data in the next update. That also resulted in RTP stats getting some bits of code. As that is per-packet processing, was not a good idea. Switching to push-pull method. For up track, it is pull, i.e. connection stats worker will pull stats. For down track, there is a new notification about receiver report reception. Using this to check for time to run stats. And adding a bit of tolerance for processing window (currently set so that as long as it is > 95% of usual processing interval). This allows two things - for video, RTCP RR are more frequent, but we will still not process till enough time has passed - for audio, RTCP RR could be once in 5 seconds or so. Can process when it is available rather than miss a window and use a much larger window later. * uber atomic	2023-03-09 11:51:20 +05:30
Raja Subramanian	99601e6d41	Handle the case of no packets in down stream tracks better. (#1500 )	2023-03-07 14:32:43 +05:30
Raja Subramanian	04269c100c	Connection quality misc changes (#1496 ) * Connection quality misc changes 1. Call scorer.Update() with nil stat when no data available so that scorer can synthesise window with proper window time. 2. Subtract out loss in interval to account for packets not sent at all. 3. Fix `packetsNotFound` variable in `getIntervalStats`. I remember this working at some point. Not sure if I fat fingered in another PR and deleted the increment line. 4. Logging a bit more when no packets expected. Those can get noisy especially when track is muted. But, seeing some unexplained instances of no packets leading to quality drop. So, temporary logging to get a bit more information. * correct spelling * Limit packet score minimum to 0.0	2023-03-07 09:08:19 +05:30
Raja Subramanian	9e327b1f3c	Connection quality (#1490 ) * Make connection quality not too optimistic. With score normalization, the quality indicator showed good under conditions which should have normally shown some badness. So, a few things in this PR - Do not normalize scores - Pick the weakest link as the representative score (moving away from averaging) - For down track direction, when reporting delta stats, take the number of packets sent actually. If there are holes in the feed (upstream packet loss), down tracks should not be penalised for that loss. State of things in connection quality feature - Audio uses rtcscore-go (with a change to accommodate RED codec). This follows the E-model. - Camera uses rtcscore-go. No change here. NOTE: The rtcscore here is purely based on bits per pixel per frame (bpf). This has the following existing issues (no change, these were already there) o Does not take packet loss, jitter, rtt into account o Expected frame rate is not available. So, measured frame rate is used as expected frame rate also. If expected frame rate were available, the score could be reduced for lower frame rates. - Screen share tracks: No change. This uses the very old simple loss based thresholding for scoring. As the bit rate varies a lot based on content and rtcscore video algorithm used for camera relies on bits per pixel per frame, this could produce a very low value (large width/height encoded in a small number of bits because of static content) and hence a low score. So, the old loss based thresholding is used. 
* clean up * update rtcscore pointer * fix tests * log lines reformat * WIP commit * WIP commit * update mute of receiver * WIP commit * WIP commit * start adding tests * take min score if quality matches * start adding bytes based scoring * clean up * more clean up * Use Fuse * log quality drop * clean up debug log * - Use number of windows for wait to make things simpler - track no layer expected case - always update transition - always call updateScore	2023-03-05 12:55:04 +05:30
Raja Subramanian	fe0502c886	Demote some stable logs to Debugw (#1158 ) * Demote some stable logs to Debugw * Add 'discard message from' to ignore list	2022-11-11 10:17:47 +05:30
David Zhao	02537a121d	Store initial track MimeType in TrackInfo (#1065 )	2022-09-30 23:33:22 -07:00
Raja Subramanian	c03003becf	Logging some connection quality stuff to get some data. (#1008 ) * Logging some connection quality stuff to get some data. Setting it at 4.5 as normalised scores are higher. * log average score	2022-09-15 17:16:59 +05:30
Raja Subramanian	d76f7811e9	An attempt to use consistent layer mapping (#986 ) * WIP commit * Consistent layers. * slight re-arrangement of code * log mime * fix tests * map -> array	2022-09-07 09:57:31 +05:30
Raja Subramanian	c75f38bce6	Protect against looking up dimensions for invalid spatial layer (#977 ) Also use loss based scoring when track dimensions are not available.	2022-09-03 00:59:47 +05:30
Raja Subramanian	b5c023f986	Connection quality changes (#913 ) * WIP commit * Connection quality changes - Fix Firefox showing poor quality o The issue was that we were using max available layer and calculating quality. The rationale being that even if server sends dynacast messages, client may not implement dynacast and still stream all layers. But, with Firefox (maybe a Firefox bug), it sends some small amount of data on layer 2 even when that layer is disabled. Guessing it is probing (or actually we might be using some small value for high layers as Firefox cannot turn off layers). That higher layer gets used in quality calculation. As the bit rate on that layer is extremely low, it yields low score. Fixed by considering the max expected layer. That is of most interest. Yes, clients may ignore dynacast and stream all layers, but, max expected is the one of interest. So, look for quality in the max expected layer and not max available layer. - Lots of clean up around connection quality stuff o Use a dynamic scaling thing to ensure that we do not get bitten by absolute values. Calculate best possible scenario score and map that to maximum MOS score. This will ensure that different codecs, different settings do not mess up the scoring. For example, a client might use 1 Mbps for 720p, but a different client could use 2 Mbps for 720p. As an SFU/infrastructure middlebox, we do not have control over quality at those rates. We can only ensure that streaming happens smoothly at those rates. So, in that example, for client 1, 1 Mbps will map to MOS 5.0 and for client 2, 2 Mbps will map to MOS 5.0. Any impairments after that will reflect in the score. o Penalise for missing target layer by one level for one layer missed. o Move tests to connection quality directory. The participant test was not super useful. * Add missed file * Remove debug code * use more constants and initialise normalisation factor * rtcscore pointer	2022-08-15 13:21:07 +05:30
Raja Subramanian	dbcc53f04e	Use media payload size in scoring. (#912 ) * Use media payload size in scoring. Subtract out header bytes when calculating score. This does not seem to affect the score (under perfect conditions), but, using header bytes will inflate the bit rate and will affect scoring. * Add header bytes to ToProto * protocol pointer * fix test	2022-08-14 13:22:58 +05:30
David Zhao	f09885825e	Return ServerInfo to clients on join (#904 ) * checkpoint * Return ServerInfo in join response * also include node information * less verbose quality score * update go modules	2022-08-10 17:04:17 -07:00
David Zhao	53f51c8cb0	Logging cleanup (#843 ) * Logging cleanup Changes log levels to better match significance * fix lock	2022-07-21 00:39:49 -07:00
Raja Subramanian	c15eeeff2b	Run connection quality worker every 5 seconds. (#795 ) With a small window, the quality is volatile even on small disturbances. For example losing 2 audio packets in a 2 second window could drop the quality metric.	2022-06-30 09:10:18 +05:30
Raja Subramanian	2c48eafd6e	Retain previous audio score if number of packets is low (#793 ) * Retain previous audio score if number of packets is low * better comment and correct spelling	2022-06-29 14:48:06 +05:30
Raja Subramanian	45ed8ce85a	Look for stable max expected layer before calculating score. (#774 )	2022-06-21 17:24:34 +05:30
Raja Subramanian	ac1e55fa27	Use current layer for actual dimension when calculating quality of muxed (#773 ) tracks.	2022-06-21 11:31:51 +05:30
Raja Subramanian	1e6a12167b	Use loss based scoring for screen share tracks. (#771 ) * Use loss based scoring for screen share tracks. * Remove named TODO markers and file issues	2022-06-20 12:08:30 +05:30
Raja Subramanian	62943f2096	Set DtxDisabled from TrackInfo in score calculation. (#770 ) * Set DtxDisabled from TrackInfo in score calculation. Also, fix sending connection quality update on a new subscription. * comments tweaks * Move TrackInfo into StreamTrackerManager as this is used by cloud as well	2022-06-19 21:12:09 +05:30
Raja Subramanian	9032db857c	Connection quality clean up (#766 ) * WIP commit * WIP commit * Remove debug * Revert to reduce diff * Fix tests * Determine spatial layer from track info quality if non-simulcast * Adjust for invalid layer on no rid, previously that function was returning 0 for no rid case * Fall back to top level width/height if there are no layers * Use duration from RTPDeltaInfo	2022-06-18 21:58:47 +05:30
Raja Subramanian	4701119885	Proto clone VideoLayer (#756 ) Otherwise, there are warnings about copying locks.	2022-06-09 09:32:57 +05:30
shishirng	cb9f0d37c2	Use rtcscore-go to calculate audio/video score (#689 ) * Use rtcscore-go to calculate audio/video score Signed-off-by: shishir gowda <shishir@livekit.io> * Get max expected layer and find max actual layer from stream Signed-off-by: shishir gowda <shishir@livekit.io> * Cleanup unused methods Signed-off-by: shishir gowda <shishir@livekit.io> * Cleanup code - address review comments Signed-off-by: shishir gowda <shishir@livekit.io> * get expected layer info instead of just quality Signed-off-by: shishir gowda <shishir@livekit.io> * Move SpatialLayerForQuality to utils/helpers method is required in rtc,sfu and connectionstats pkg Moved to utils/helpers.go to remove cyclic deps Signed-off-by: shishir gowda <shishir@livekit.io> * update tests Signed-off-by: shishir gowda <shishir@livekit.io> * Pick stream stats with max layer Signed-off-by: shishir gowda <shishir@livekit.io> * Update rtcscore-go pkg to make rtt/jitter optional when passing 0, rtcscore-go was setting default values Signed-off-by: shishir gowda <shishir@livekit.io> * update score to rating Signed-off-by: shishir gowda <shishir@livekit.io> * Update rtcscore-go pkg to use simulcast layer info for score Signed-off-by: shishir gowda <shishir@livekit.io> * Update score ratings to reflect rtcscore range Signed-off-by: shishir gowda <shishir@livekit.io> * update test params for new rtcscore Signed-off-by: shishir gowda <shishir@livekit.io> * Delay sending scores to connections only till full data is available first interval can have partial data leading to lower scores Signed-off-by: shishir gowda <shishir@livekit.io> * Check for inf values in quality params Signed-off-by: shishir gowda <shishir@livekit.io> * Clean up initial score calculation. Default to 5 Signed-off-by: shishir gowda <shishir@livekit.io> Co-authored-by: David Zhao <dz@livekit.io>	2022-05-27 14:58:26 -04:00
cnderrauber	f958fbcc1c	simulcast codecs support (#720 ) simulcast codecs support Co-authored-by: David Zhao <dz@livekit.io>	2022-05-27 19:55:50 +08:00
David Zhao	bd7e3beda4	Improve frequency of stats update (#673 ) * Improve frequency of stats update Prometheus stats are updated as the data becomes available, instead of aggregated along with telemetry batches. Node availability decisions can now react much faster to these stats. * use the same intervals for connection quality updates	2022-05-09 08:55:06 -07:00
Raja Subramanian	a98d955284	Delta stats throughout (#615 ) * Use delta stats throughout and avoid calculating deltas in telemetry * Fix a few things after testing * Remove debug * Fix tests * delete instead of setting to nil * Point to the latest protocol	2022-04-16 21:11:32 +05:30
Raja Subramanian	92009b6428	Consistently stop tickers (#593 )	2022-04-05 20:42:06 +05:30
David Colburn	0b8a180554	Code inspection (#581 ) * Code inspection * fix [4]int64 conversion	2022-03-30 13:49:53 -07:00
Raja Subramanian	ae85e55fd4	Using RTPStats across the board (#515 ) * WIP commit * Clean up	2022-03-15 17:47:19 +05:30
Raja Subramanian	778d1aa141	`utils.AtomicFlag` -> `atomic.Bool` (#466 ) * Replacing hand rolled ion-sfu atomic with uber/atomic * Remove another hand rolled atomic * utils.AtomicFlag -> atomic.Bool	2022-02-25 12:19:49 +05:30
Raja Subramanian	0170cc1cb6	Staticcheck (#464 ) Using `go get -u honnef.co/go/tools/cmd/staticcheck` Unearthed a couple of real bugs	2022-02-25 12:04:08 +05:30
Raja Subramanian	bce3a9b10a	Tighter scoping on connection stats lock. (#415 )	2022-02-08 13:51:44 +05:30
Raja Subramanian	36289bbca7	FPS (#410 ) * WIP commit * WIP commit * WIP commit * WIP commit * WIP commit * WIP commit * Clean up * Clean up * Store RTT in stats * spelling mistake * Make tests compile * Fix test compilation error * fix tests * clone * latest protocol	2022-02-08 12:53:14 +05:30
Raja Subramanian	a1f88faed1	Add a resync API to sfu.DownTrack (#389 ) * Add a resync API to sfu.DownTrack Also passing in logger with context into sfu package. More to do here with proper logging context in all modules, but this is a start * Remove debug code * fix tests	2022-01-30 10:59:47 +05:30
Raja Subramanian	5b57522c05	Refactoring connection stats (#384 )	2022-01-29 00:55:00 +05:30
shishirng	26eea78b54	Telemetry connection scores (#377 ) * octets - total bytes needs to be uint64 uint32 wraps at 4GB Signed-off-by: shishir gowda <shishir@livekit.io> * Cleanup stats handler to use connectionQuality stats remove per packet rtcp handlers, buffer stats * cleanup connection stats * Update mediatrack to store rtcp stats in connection stats * Update downstream handling of connection stats and telemetry * Update telemetry tests Signed-off-by: shishir gowda <shishir@livekit.io> * Misc fixes Signed-off-by: shishir gowda <shishir@livekit.io> * Minor fix to avoid accessing buffer before it's allocated Signed-off-by: shishir gowda <shishir@livekit.io> * start updateStats worker in AddReciever() Signed-off-by: shishir gowda <shishir@livekit.io> * Use previous score to calculate avg scores * Restructure connectionStats Signed-off-by: shishir gowda <shishir@livekit.io>	2022-01-27 11:24:54 -05:00
David Colburn	5bea9debb7	Code cleanup (#353 )	2022-01-19 02:13:06 -08:00
shishirng	e6543f3b9e	Convert jitter from MicroSecs to MilliSecs (#282 ) Signed-off-by: shishir gowda <shishir@livekit.io>	2021-12-22 12:37:45 -05:00
shishirng	0f728b0b72	Connection quality v1 (#260 ) * audio connection quality mos for publisher stats Signed-off-by: shishir gowda <shishir@livekit.io> * Update tests Signed-off-by: shishir gowda <shishir@livekit.io> * Change ratings range, increase default rtt to 80 Signed-off-by: shishir gowda <shishir@livekit.io> * Use stats worker to get total packets to find %lost in window Signed-off-by: shishir gowda <shishir@livekit.io> * Update go dep Signed-off-by: shishir gowda <shishir@livekit.io> * Increase interval of score cal to 5 seconds Signed-off-by: shishir gowda <shishir@livekit.io> * use lastSequenceNumber in reports to find total packets Signed-off-by: shishir gowda <shishir@livekit.io> * Account for delay while calculating scores Signed-off-by: shishir gowda <shishir@livekit.io> * Fix minor typo Signed-off-by: shishir gowda <shishir@livekit.io> * Add connection stats/score to subscribed audio tracks Signed-off-by: shishir gowda <shishir@livekit.io> * Cleanup Signed-off-by: shishir gowda <shishir@livekit.io> * Ignore duplicate LastSequenceNumbers in rtcp reports Ignore if sequence number is less than what was received Signed-off-by: shishir gowda <shishir@livekit.io> * Move video track score calc to media/downtracks Signed-off-by: shishir gowda <shishir@livekit.io> * Deprecate SubscribeLossPercentage() as score calc is now handled downstream Signed-off-by: shishir gowda <shishir@livekit.io> * Initialize connection score to excellent score is calc at 5sec interval. Client fetches score before first score is computed * Update test cases for connection quality Signed-off-by: shishir gowda <shishir@livekit.io>	2021-12-20 07:54:14 -05:00
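
Note on commit e893d30fd0 (EWMA score updates): the message describes a moving average whose fall factor is higher than its rise factor, and an asymptotic case where the score approaches 60 without crossing it. A minimal Go sketch of that idea follows. It is not the code from the commit: the type `ewmaScore`, the factors 0.5 and 0.2, and the assumption that every window's raw sample sits exactly at 60 are all hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// ewmaScore is a hypothetical scorer; the names and factors are
// illustrative assumptions, not taken from the LiveKit source.
type ewmaScore struct {
	score      float64 // current smoothed score
	riseFactor float64 // weight of a new sample that is above the current score
	fallFactor float64 // weight of a new sample that is at or below the current score
}

// update folds one raw sample into the smoothed score and returns it.
// A larger fallFactor than riseFactor makes the score drop faster than
// it recovers, which gives the hysteresis the commit message mentions.
func (e *ewmaScore) update(sample float64) float64 {
	factor := e.riseFactor
	if sample < e.score {
		factor = e.fallFactor
	}
	e.score = factor*sample + (1-factor)*e.score
	return e.score
}

func main() {
	s := &ewmaScore{score: 100, riseFactor: 0.2, fallFactor: 0.5}

	// Assumed scenario: every window yields a raw sample of exactly 60.
	// The smoothed score moves toward 60 but is still above it here.
	for i := 0; i < 5; i++ {
		fmt.Printf("after window %d: %.2f\n", i+1, s.update(60))
	}

	// A single good window pulls the score up only part of the way.
	fmt.Printf("after one good window: %.2f\n", s.update(100))
}
```

In this sketch the gap to 60 halves on each update, so a check of the form "score below 60" does not fire within these windows. That mirrors the concern in the commit message about a fixed threshold under steady impairment.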

41 Commits