An open-source dataset released collaboratively by Google for computer vision research, offering annotated images for object detection, segmentation, and visual relationship detection.

Open Images Dataset is a collaborative open-source dataset released by Google, designed to advance computer vision research. It contains millions of images annotated for object detection, instance segmentation, and visual relationship detection. The annotations include bounding boxes for 600 object classes, instance segmentations for 350 classes, and relationship annotations covering 1,466 distinct relationship triplets, along with localized narratives, point-level annotations, and image-level labels spanning thousands of classes. The dataset is intended for researchers and developers building machine learning models for image analysis, object recognition, and other computer vision applications. Its scale and breadth of annotation support the training and evaluation of robust, accurate models.
Provides bounding box annotations for 600 different object classes, enabling the training of object detection models.
Offers instance segmentation annotations for 350 object classes, allowing for pixel-level segmentation of individual objects.
Includes annotations for relationships between objects, such as 'person riding bike', enabling the training of models that understand object interactions.
Provides localized narratives: textual descriptions of specific regions within images, linking visual content with natural language.
Offers fine-grained point-level annotations for various object classes.
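To make the relationship annotations above more concrete, here is a minimal Python sketch of how a triplet such as 'person riding bike' might be held in memory alongside the two boxes it connects. The class names, field names, and coordinate values are hypothetical and chosen only for illustration; they are not the dataset's file format or an official API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box; coordinates are assumed normalized to [0, 1]."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class Relationship:
    """One subject-predicate-object triplet (hypothetical layout)."""
    subject_label: str
    predicate: str
    object_label: str
    subject_box: Box
    object_box: Box

    def as_phrase(self) -> str:
        # Join the three parts into a readable phrase.
        return f"{self.subject_label} {self.predicate} {self.object_label}"


# Illustrative values only, not taken from the dataset.
rel = Relationship(
    subject_label="person",
    predicate="riding",
    object_label="bike",
    subject_box=Box(0.30, 0.10, 0.70, 0.80),
    object_box=Box(0.25, 0.50, 0.75, 0.95),
)
print(rel.as_phrase())
```

Keeping both boxes on the triplet is one possible design; a model that consumes relationships usually needs the two regions as well as the three labels.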
Visit the Open Images Dataset website: https://storage.googleapis.com/openimages/web/index.html
Review the dataset description and available annotations.
Download the desired image subsets and annotation files.
Familiarize yourself with the dataset format and annotation structure.
Choose a machine learning framework (e.g., TensorFlow, PyTorch) for model training.
Load the dataset and annotations into your chosen framework.
Start training your computer vision model.
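As a rough illustration of the loading step, the sketch below parses bounding box rows from a CSV and converts them to pixel coordinates using only the Python standard library. It assumes the annotation file has ImageID, LabelName, XMin, XMax, YMin, and YMax columns with coordinates normalized to [0, 1]; verify this against the dataset's format description before relying on it. The sample rows, image ID, label IDs, and image size are placeholders, not real dataset entries.

```python
import csv
import io

# Placeholder rows in the assumed column layout (not real dataset entries).
SAMPLE_CSV = """ImageID,LabelName,XMin,XMax,YMin,YMax
example_image_1,/m/example_a,0.25,0.75,0.10,0.90
example_image_1,/m/example_b,0.40,0.60,0.50,1.00
"""


def to_pixel_box(row, width, height):
    """Convert one row's normalized box to (x0, y0, x1, y1) in pixels."""
    return (
        round(float(row["XMin"]) * width),
        round(float(row["YMin"]) * height),
        round(float(row["XMax"]) * width),
        round(float(row["YMax"]) * height),
    )


def load_boxes(csv_file, width, height):
    """Group (label, pixel_box) pairs by image ID.

    For simplicity this assumes every image has the same size; in practice
    the size would be looked up per image.
    """
    boxes = {}
    for row in csv.DictReader(csv_file):
        entry = (row["LabelName"], to_pixel_box(row, width, height))
        boxes.setdefault(row["ImageID"], []).append(entry)
    return boxes


boxes = load_boxes(io.StringIO(SAMPLE_CSV), width=640, height=480)
for image_id, entries in boxes.items():
    print(image_id, entries)
```

The resulting dictionary can then be wrapped in whatever dataset abstraction the chosen framework expects.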
Verified feedback from other users.
"Open Images Dataset provides a large and diverse dataset that enables researchers and developers to train high-performing computer vision models. The availability of annotations for object detection, instance segmentation, and visual relationships is widely appreciated by the community."
STDC-Seg is a real-time semantic segmentation approach for efficient scene understanding.
ICNet for Real-Time Semantic Segmentation on High-Resolution Images.
ShapeNet is a richly-annotated, large-scale dataset of 3D shapes designed to enable research in computer graphics, computer vision, robotics, and related disciplines.
The VCTK Corpus provides diverse English speech data from 110 speakers, ideal for voice cloning and speech synthesis research.
SNLI is a large, annotated corpus for learning natural language inference, providing a benchmark for evaluating text representation systems.
nuScenes is a public large-scale dataset for autonomous driving, providing a comprehensive suite of sensor data and annotations.
Cityscapes is a large-scale dataset for semantic urban scene understanding, providing high-quality pixel-level annotations of street scenes from 50 different cities.
KITTI Dataset provides a suite of real-world computer vision benchmarks for autonomous driving research and development.