One-Stage Object Detection Advantages in Semi-Supervised Learning

Overview of One-Stage Object Detection

One-stage object detectors such as YOLO and RetinaNet predict object classes and bounding-box coordinates directly in a single forward pass, without a separate region-proposal stage.
Advantages over two-stage detectors like Faster R-CNN:
- Faster inference speed
- Simpler architecture
- End-to-end training

Low-Quality Pseudo-Labels:
- One-stage detectors output dense class and bounding-box predictions directly, without a proposal-refinement stage, which makes high-quality pseudo-labels harder to generate than with two-stage detectors
- Pseudo-labels may have inaccurate bounding boxes and unreliable classification scores
- Low-quality pseudo-labels can lead to optimization conflicts and performance degradation
Optimization Conflict:
- One-stage detectors optimize the classification and regression tasks jointly, which can lead to conflicts during semi-supervised training
- Classification and regression losses may interfere with each other when both labeled and pseudo-labeled data are used
- The conflict can be more severe in one-stage detectors than in two-stage detectors
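
The conflict can be made concrete with a generic semi-supervised detection objective. The notation below is an assumption for illustration and is not taken from a specific method: the subscript s marks losses on labeled data, the subscript u marks losses on pseudo-labeled data, and \lambda_u is a weighting factor.

```latex
\mathcal{L}
  = \underbrace{\mathcal{L}^{cls}_{s} + \mathcal{L}^{reg}_{s}}_{\text{labeled}}
  + \lambda_u \underbrace{\left( \mathcal{L}^{cls}_{u} + \mathcal{L}^{reg}_{u} \right)}_{\text{pseudo-labeled}}
```

When one shared set of pseudo-labels feeds both unsupervised terms, a pseudo-label with a confident class score but a poorly localized box (or the reverse) may pass a noisy training signal to one of the two branches.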

Simpler Architecture:
- One-stage detectors have a simpler architecture than two-stage detectors
- This simplicity makes them easier to adapt and optimize for semi-supervised learning
- There are fewer hyperparameters and training components to tune
End-to-End Training:
- One-stage detectors can be trained end-to-end, which is beneficial for semi-supervised learning
- The entire network can be optimized jointly on both labeled and unlabeled data
- Avoids the separate training stages that some two-stage pipelines require
Potential for Improved Pseudo-Label Quality:
- Recent advancements in one-stage detectors, such as YOLOv5, have improved the quality of their predictions
- This can lead to better pseudo-labels for the unlabeled data in semi-supervised learning
- Techniques like Multi-View Pseudo-Label Refinement can further enhance the pseudo-label quality
Decoupled Semi-Supervised Optimization:
- Separating the classification and regression losses during semi-supervised training can help address the optimization conflict
- Allows the network to focus on learning robust features for both tasks independently
- Can lead to better performance compared to jointly optimizing the losses

Multi-View Pseudo-Label Refinement:
- Generate pseudo-labels from multiple augmented views of the same image
- Combine the predictions from different views to obtain more accurate and robust pseudo-labels
- Helps address the issue of low-quality pseudo-labels in one-stage detectors
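
The combination step above can be sketched for two views as follows. This is a minimal illustration under stated assumptions, not the refinement procedure of any particular paper: the second view is assumed to be a horizontal flip, detections are matched by IoU, and matched boxes are fused with a score-weighted mean. The names `refine_two_views`, `unflip`, and `iou_thr` are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def unflip(box, image_width):
    """Map a box predicted on a horizontally flipped image back to the original frame."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def refine_two_views(dets_a, dets_b_flipped, image_width, iou_thr=0.5):
    """Fuse detections from the original view (A) and a flipped view (B).

    Each detection is a (box, score) pair with score > 0 (assumed).
    A detection from view A is kept only if view B confirms it with
    IoU >= iou_thr; the two boxes are then averaged, weighted by score.
    """
    dets_b = [(unflip(box, image_width), score) for box, score in dets_b_flipped]
    refined = []
    if not dets_b:
        return refined
    for box_a, score_a in dets_a:
        # Best-overlapping detection from the other view.
        box_b, score_b = max(dets_b, key=lambda d: iou(box_a, d[0]))
        if iou(box_a, box_b) < iou_thr:
            continue  # not confirmed by the second view, so discard
        total = score_a + score_b
        fused = tuple((score_a * ca + score_b * cb) / total
                      for ca, cb in zip(box_a, box_b))
        refined.append((fused, 0.5 * total))
    return refined


dets_a = [((10, 10, 50, 50), 0.8), ((60, 60, 90, 90), 0.6)]
dets_b_flipped = [((48, 10, 88, 50), 0.8)]  # predicted on the flipped image
refined = refine_two_views(dets_a, dets_b_flipped, image_width=100)
```

In this toy input only the first detection is confirmed by the flipped view, so the second one is dropped rather than used as a pseudo-label.
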
Decoupled Semi-Supervised Optimization:
- Separate the classification and regression losses during semi-supervised training
- Allows the network to focus on learning robust features for both tasks independently
- Can help address the optimization conflict in one-stage detectors
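
One way to picture the separation is to let each loss choose its own pseudo-labels. The sketch below is an assumption-laden illustration, not a reference implementation: the per-candidate fields (`cls_loss`, `reg_loss`, `cls_score`, `box_uncertainty`) and the thresholds `cls_thr` and `unc_thr` are hypothetical names.

```python
def decoupled_unsup_loss(candidates, cls_thr=0.7, unc_thr=0.3):
    """Average the unsupervised losses over task-specific subsets.

    candidates: list of dicts, one per pseudo-labeled prediction, with keys
    cls_loss, reg_loss, cls_score, box_uncertainty (all assumed precomputed).
    The classification loss uses candidates with a confident class score;
    the regression loss uses candidates with a low box uncertainty.
    """
    cls_terms = [c["cls_loss"] for c in candidates if c["cls_score"] >= cls_thr]
    reg_terms = [c["reg_loss"] for c in candidates if c["box_uncertainty"] <= unc_thr]
    cls_loss = sum(cls_terms) / len(cls_terms) if cls_terms else 0.0
    reg_loss = sum(reg_terms) / len(reg_terms) if reg_terms else 0.0
    return cls_loss, reg_loss


candidates = [
    # confident class, poorly localized box: classification only
    {"cls_loss": 0.2, "reg_loss": 0.5, "cls_score": 0.9, "box_uncertainty": 0.5},
    # uncertain class, well-localized box: regression only
    {"cls_loss": 0.4, "reg_loss": 0.1, "cls_score": 0.5, "box_uncertainty": 0.1},
    # reliable for both tasks
    {"cls_loss": 0.6, "reg_loss": 0.3, "cls_score": 0.8, "box_uncertainty": 0.2},
]
cls_loss, reg_loss = decoupled_unsup_loss(candidates)
```

With a single shared filter, the first two candidates would either be dropped for both tasks or would add noise to one of them; separate filters let each candidate contribute only where it is reliable.
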
Dynamic Self-Adaptive Threshold (DSAT):
- Automatically adjusts the threshold for selecting high-quality pseudo-labels in the classification branch
- Balances the trade-off between the quantity and quality of pseudo-labels
- Helps address the class imbalance issue in one-stage detectors
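
The notes above do not give the DSAT update rule, so the following is only a sketch of one plausible adaptive scheme, not DSAT itself: a per-class threshold tracks an exponential moving average of that class's mean confidence, clamped to a fixed range. The class name `AdaptiveThreshold` and the parameters `init`, `momentum`, `floor`, and `ceil` are all hypothetical.

```python
class AdaptiveThreshold:
    """Per-class confidence thresholds updated from recent predictions."""

    def __init__(self, num_classes, init=0.5, momentum=0.9, floor=0.3, ceil=0.9):
        self.thr = [init] * num_classes
        self.momentum = momentum
        self.floor = floor
        self.ceil = ceil

    def update(self, predictions):
        """predictions: iterable of (class_id, score) pairs on unlabeled data."""
        sums, counts = {}, {}
        for cls, score in predictions:
            sums[cls] = sums.get(cls, 0.0) + score
            counts[cls] = counts.get(cls, 0) + 1
        for cls, total in sums.items():
            mean_score = total / counts[cls]
            # Move the threshold toward the class's current mean confidence.
            new_thr = self.momentum * self.thr[cls] + (1 - self.momentum) * mean_score
            self.thr[cls] = min(self.ceil, max(self.floor, new_thr))

    def select(self, predictions):
        """Keep predictions whose score reaches their class's threshold."""
        return [(cls, score) for cls, score in predictions if score >= self.thr[cls]]


dsat_like = AdaptiveThreshold(num_classes=2, init=0.5, momentum=0.5)
dsat_like.update([(0, 0.9), (0, 0.7), (1, 0.2)])
selected = dsat_like.select([(0, 0.6), (1, 0.4)])
```

In this toy run the confidently predicted class 0 ends up with a higher threshold than the weakly predicted class 1, so a class-1 prediction at 0.4 is kept while a class-0 prediction at 0.6 is rejected. That is the quantity/quality trade-off the bullets describe, applied per class.
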
Regression Uncertainty Estimation:
- Evaluates the regression quality of pseudo-labels based on the uncertainty of bounding box predictions
- Avoids the impact of low-quality pseudo-labels on the regression task
- Applicable to one-stage detectors without the need for a separate region proposal network
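
As a rough illustration of the idea, box uncertainty can be approximated by how much repeated predictions for the same object disagree. This is an assumed proxy, not the estimator of any specific method: `box_samples` stands for several predictions of one object (for example under small input perturbations), and `max_uncertainty` is a hypothetical cutoff.

```python
from statistics import pstdev


def box_uncertainty(box_samples):
    """Mean coordinate standard deviation, normalized by mean box size.

    box_samples: list of (x1, y1, x2, y2) predictions for the same object,
    assumed to have positive width and height.
    """
    n = len(box_samples)
    mean_w = sum(b[2] - b[0] for b in box_samples) / n
    mean_h = sum(b[3] - b[1] for b in box_samples) / n
    scales = (mean_w, mean_h, mean_w, mean_h)  # x coords by width, y coords by height
    stds = [pstdev([b[i] for b in box_samples]) / scales[i] for i in range(4)]
    return sum(stds) / 4


def use_for_regression(box_samples, max_uncertainty=0.1):
    """Use a pseudo-box for the regression loss only if its predictions agree."""
    return box_uncertainty(box_samples) <= max_uncertainty


stable = [(0, 0, 10, 10), (2, 0, 12, 10)]
unstable = [(0, 0, 10, 10), (8, 6, 20, 18)]
stable_unc = box_uncertainty(stable)
```

Because the estimate relies only on the detector's own box outputs, it does not depend on a region proposal network.
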
