
Computer Vision Annotation Tool (CVAT)
The industry-standard open-source platform for professional data labeling and computer vision management.

Real-time semantic segmentation for embedded autonomous systems using factorized residual layers.

ERFNet (Efficient Residual Factorized ConvNet) is a deep learning architecture for semantic segmentation, designed to balance computational efficiency against accuracy. Originally developed for autonomous driving and intelligent transportation systems, it is built from 'non-bottleneck-1D' residual blocks. These blocks use factorized convolutions, replacing each standard 3x3 kernel with a 3x1 convolution followed by a 1x3 convolution. For a layer with C input and output channels this cuts the weight count from 9C² to 6C², a reduction of one third, while the pair still covers a 3x3 receptive field. ERFNet remains a common benchmark and a practical backbone for edge deployment on low-power hardware such as NVIDIA Jetson modules and specialized NPUs. The network pairs an encoder, which extracts features, with a lightweight decoder, which recovers spatial resolution, and achieves over 70% mIoU on the Cityscapes dataset at real-time speeds (>50 FPS on modern hardware). Its small memory footprint makes it a common choice for industrial robotics, drone navigation, and real-time ADAS (Advanced Driver Assistance Systems), where latency is the primary constraint.
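The saving from factorizing a 3x3 kernel can be checked with a few lines of arithmetic. The sketch below counts convolution weights only (biases and normalization layers are ignored) and compares a residual block of two 3x3 convolutions with one built from two 3x1 and 1x3 pairs. The channel width of 128 and the helper names are assumptions chosen for illustration.

```python
def conv_weights(kh, kw, channels):
    """Weights in a kh x kw convolution with `channels` inputs and outputs (no bias)."""
    return kh * kw * channels * channels


def standard_block_weights(channels):
    # Two stacked 3x3 convolutions, as in a plain non-bottleneck residual block.
    return 2 * conv_weights(3, 3, channels)


def factorized_block_weights(channels):
    # Each 3x3 convolution is replaced by a 3x1 followed by a 1x3 convolution.
    return 2 * (conv_weights(3, 1, channels) + conv_weights(1, 3, channels))


channels = 128  # assumed width, for illustration only
full = standard_block_weights(channels)
fact = factorized_block_weights(channels)
saving = 1 - fact / full
print(f"3x3 block: {full} weights, factorized block: {fact} weights, saving: {saving:.0%}")
```

The ratio is independent of the channel width: 6 weights per channel pair instead of 9, so one third of the convolution weights are removed.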
Splits 3x3 kernels into 3x1 and 1x3 components to reduce the number of operations without sacrificing receptive field.
Residual layers that avoid the bottleneck design to prevent information loss in shallow networks.
Incorporates dilated (atrous) kernels in the late stages of the encoder to capture global context.
A structured downsampling and upsampling strategy that balances feature extraction and spatial recovery.
Designed for limited VRAM environments with a highly efficient parameter count (~2M parameters).
Compatible with training regimes that utilize various input resolutions for improved robustness.
Fully compatible with ONNX export for acceleration via TensorRT or OpenVINO.
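The factorized, residual, and dilated ideas listed above can be combined in one block. The following is a minimal PyTorch sketch of a non-bottleneck-1D style block, not the reference implementation: the class name, the placement of batch normalization and dropout, and the choice to dilate only the second convolution pair are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NonBottleneck1D(nn.Module):
    """Residual block built from two factorized (3x1 then 1x3) convolution pairs."""

    def __init__(self, channels: int, dilation: int = 1, dropout: float = 0.0):
        super().__init__()
        # First pair: undilated 3x1 and 1x3 convolutions.
        self.conv3x1_1 = nn.Conv2d(channels, channels, (3, 1), padding=(1, 0))
        self.conv1x3_1 = nn.Conv2d(channels, channels, (1, 3), padding=(0, 1))
        self.bn1 = nn.BatchNorm2d(channels)
        # Second pair: dilated; padding equal to the dilation keeps the spatial size.
        self.conv3x1_2 = nn.Conv2d(channels, channels, (3, 1),
                                   padding=(dilation, 0), dilation=(dilation, 1))
        self.conv1x3_2 = nn.Conv2d(channels, channels, (1, 3),
                                   padding=(0, dilation), dilation=(1, dilation))
        self.bn2 = nn.BatchNorm2d(channels)
        self.dropout = nn.Dropout2d(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.conv3x1_1(x))
        out = F.relu(self.bn1(self.conv1x3_1(out)))
        out = F.relu(self.conv3x1_2(out))
        out = self.bn2(self.conv1x3_2(out))
        out = self.dropout(out)
        # Identity shortcut: no bottleneck, so input and output widths match.
        return F.relu(out + x)


block = NonBottleneck1D(channels=16, dilation=4).eval()
x = torch.randn(1, 16, 32, 64)
with torch.no_grad():
    y = block(x)
print(tuple(y.shape), sum(p.numel() for p in block.parameters()))
```

Because every convolution preserves the spatial size and channel count, blocks like this can be stacked with increasing dilation to widen the receptive field without further downsampling.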
Clone the official repository from GitHub: git clone https://github.com/Eromera/erfnet
Set up an environment with Python 3.10+ and PyTorch 2.0+.
Install dependencies: torchvision, numpy, pillow, and opencv-python.
Download pre-trained weights for Cityscapes or Pascal VOC from the repository's model zoo.
Prepare input data by resizing images to 1024x512 (default resolution for optimal Cityscapes performance).
Execute 'eval.py' to run inference on a validation set and verify the mIoU accuracy.
Modify 'main.py' to point to your custom dataset if training from scratch.
Configure the hyperparameters: the Adam optimizer, a learning rate of 5e-4, and L2 weight decay.
Export the trained model to TorchScript or ONNX for optimized edge deployment.
Integrate the ONNX model into a C++ TensorRT pipeline for maximum real-time performance.
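For the evaluation step above, it helps to know how mIoU is computed when checking a reported score. The sketch below is not the repository's evaluation script. It is a small NumPy illustration that builds a confusion matrix from predicted and ground-truth label maps and averages the per-class intersection-over-union. The function names and the ignore label of 255 are assumptions.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Rows are ground-truth classes, columns are predicted classes."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    keep = target != ignore_index  # drop pixels without a valid label
    idx = target[keep].astype(np.int64) * num_classes + pred[keep].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Mean of per-class IoU = TP / (TP + FP + FN), skipping absent classes."""
    tp = np.diag(conf).astype(np.float64)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    return float((tp[valid] / union[valid]).mean())


# Tiny 2x3 label maps with three classes; 255 marks an ignored pixel.
target = np.array([[0, 0, 1], [1, 2, 255]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
conf = confusion_matrix(pred, target, num_classes=3)
miou = mean_iou(conf)
print(conf)
print(f"mIoU = {miou:.4f}")
```

On a full validation set the confusion matrices of all images would be summed before calling `mean_iou`, so that each class is weighted by its total pixel count rather than per image.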
"Highly regarded in the academic and engineering community for its exceptional speed-to-accuracy ratio on edge devices."

Professional-grade edge matting and semantic segmentation for high-volume digital workflows.

The AI-native data platform for data-centric computer vision development.

The performance-first computer vision augmentation library for high-accuracy deep learning pipelines.

Enterprise-grade automated data labeling and dataset curation for production-ready AI models.

Criss-Cross Network for Semantic Segmentation using attention mechanisms.