mirror of
https://github.com/agessaman/meshcore-bot.git
synced 2026-05-11 18:14:50 +00:00
132 lines
4.6 KiB
Markdown
132 lines
4.6 KiB
Markdown
# Race Condition Fix for RF Data Correlation
|
|
|
|
## Problem Description
|
|
|
|
The MeshCore Bot was experiencing a race condition where channel messages would sometimes show "unknown" SNR/RSSI values instead of actual measurements. This occurred because:
|
|
|
|
1. **Event Timing Dependency**: The bot expected `RX_LOG_DATA` events (containing SNR/RSSI) to arrive before or very close to `CHANNEL_MSG_RECV` events
|
|
2. **Short Time Window**: The original 5-second correlation window was too narrow for network timing variations
|
|
3. **No Fallback Strategy**: When immediate correlation failed, there was no robust fallback mechanism
|
|
|
|
## Root Cause Analysis
|
|
|
|
From the logs, we can see the issue:
|
|
```
|
|
2025-09-04 17:24:42 - MeshCoreBot - WARNING - ❌ NO RF DATA found for channel message
|
|
```
|
|
|
|
This happened because:
|
|
- The `RX_LOG_DATA` event arrived outside the 5-second correlation window
|
|
- The events were processed in an unexpected order
|
|
- The timing gap between RF data and message events was too large
|
|
|
|
## Solutions Implemented
|
|
|
|
### 1. Extended Time Window
|
|
- **Before**: 5-second correlation window
|
|
- **After**: 15-second correlation window (configurable)
|
|
- **Benefit**: Handles network timing variations and device processing delays
|
|
|
|
### 2. Multi-Strategy Correlation System
|
|
The bot now uses 4 correlation strategies in sequence:
|
|
|
|
#### Strategy 1: Immediate Correlation
|
|
- Try to find RF data immediately using exact pubkey match
|
|
- Fastest and most accurate when events arrive in order
|
|
|
|
#### Strategy 2: Message Queuing (Enhanced Mode)
|
|
- Store messages temporarily and wait 100ms for RF data
|
|
- Only enabled when `enable_enhanced_correlation = true`
|
|
- Handles cases where RF data arrives slightly after message
|
|
|
|
#### Strategy 3: Extended Timeout
|
|
- Search with 2x the normal timeout (30 seconds)
|
|
- Catches RF data that arrived much earlier than expected
|
|
|
|
#### Strategy 4: Most Recent Fallback
|
|
- Use the most recent RF data available
|
|
- Ensures we always have some signal strength information
|
|
|
|
### 3. Improved Pubkey Matching
|
|
- **Exact Match**: Full pubkey comparison (most reliable)
|
|
- **Partial Match**: First 16 characters (handles truncated pubkeys)
|
|
- **Fallback**: Most recent data (handles timing issues)
|
|
|
|
### 4. Enhanced Data Storage
|
|
- **Timestamp Index**: Fast lookup by time
|
|
- **Pubkey Index**: Fast lookup by sender
|
|
- **Automatic Cleanup**: Removes old data to prevent memory leaks
|
|
|
|
### 5. Configuration Options
|
|
Added to `config.ini`:
|
|
```ini
|
|
[Bot]
|
|
# RF Data Correlation Settings
|
|
rf_data_timeout = 15.0 # Time window for correlation
|
|
message_correlation_timeout = 10.0 # Time to wait for correlation
|
|
enable_enhanced_correlation = true # Enable advanced strategies
|
|
```
|
|
|
|
## Performance Impact
|
|
|
|
### Positive Impacts
|
|
- **Higher Success Rate**: More messages will have accurate SNR/RSSI values
|
|
- **Better User Experience**: Users see actual signal strength instead of "unknown"
|
|
- **Robust Operation**: Handles network timing variations gracefully
|
|
|
|
### Minimal Overhead
|
|
- **Memory**: Slightly more memory for correlation indexes (cleaned up automatically)
|
|
- **CPU**: Negligible impact from additional correlation attempts
|
|
- **Latency**: 100ms additional wait only when needed (Strategy 2)
|
|
|
|
## Testing Results
|
|
|
|
All correlation strategies tested successfully:
|
|
- ✅ Immediate correlation
|
|
- ✅ Message correlation system
|
|
- ✅ Extended timeout correlation
|
|
- ✅ Partial pubkey matching
|
|
- ✅ Cleanup functionality
|
|
|
|
## Configuration Recommendations
|
|
|
|
### For Stable Networks
|
|
```ini
|
|
rf_data_timeout = 10.0
|
|
enable_enhanced_correlation = false
|
|
```
|
|
|
|
### For Unstable Networks (Default)
|
|
```ini
|
|
rf_data_timeout = 15.0
|
|
enable_enhanced_correlation = true
|
|
```
|
|
|
|
### For Very Unstable Networks
|
|
```ini
|
|
rf_data_timeout = 30.0
|
|
enable_enhanced_correlation = true
|
|
```
|
|
|
|
## Monitoring
|
|
|
|
The bot now logs correlation success/failure:
|
|
- `🔍 FOUND RF DATA`: Successful correlation
|
|
- `❌ NO RF DATA found for channel message after all correlation attempts`: All strategies failed
|
|
|
|
Monitor these logs to tune the configuration for your network conditions.
|
|
|
|
## Backward Compatibility
|
|
|
|
- All changes are backward compatible
|
|
- Default configuration provides improved behavior
|
|
- Can be disabled by setting `enable_enhanced_correlation = false`
|
|
- Original 5-second behavior available by setting `rf_data_timeout = 5.0`
|
|
|
|
## Future Improvements
|
|
|
|
1. **Adaptive Timeouts**: Automatically adjust timeouts based on network conditions
|
|
2. **Machine Learning**: Learn optimal correlation strategies from historical data
|
|
3. **Network Quality Metrics**: Track correlation success rates and adjust accordingly
|
|
4. **Event Ordering**: Implement event sequence numbers for more reliable correlation
|