

Open-source, browser-based image labeling for high-velocity computer vision pipelines.

MakeSense.ai is an open-source image annotation tool for quickly building computer vision datasets. It is built with React, uses TensorFlow.js for client-side model execution, and runs entirely in the user's browser. Because images never leave the local machine, the workflow suits regulated industries and other settings where data privacy matters. The tool supports bounding boxes, polygons, lines, and points (keypoints). Its AI-assisted labeling feature lets users load a pre-trained model (such as YOLOv5 or SSD) to generate initial labels automatically, which reduces manual effort. For ML engineers and researchers, MakeSense.ai is a zero-infrastructure, lightweight alternative to larger suites such as Labelbox or CVAT. It exports directly to common formats, including YOLO, VOC XML, and VGG JSON, so its output fits into most existing MLOps pipelines.
Uses TensorFlow.js to run model inference locally on the GPU via WebGL, enabling real-time object detection suggestions.
Support for complex n-sided polygons with vertex-snapping and interactive editing.
On-the-fly conversion of internal state to COCO, YOLO, and VOC formats.
Project state can be exported as a .json file and reloaded later, which compensates for the lack of a server-side database.
Comprehensive keyboard mapping for all annotation actions.
Supports placing individual keypoints for pose estimation tasks.
Ability to upload custom TF.js-converted models for specialized object detection.
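The format conversion mentioned in the feature list comes down to simple coordinate arithmetic. As an illustration only (this is not MakeSense.ai's own code, which runs in the browser), the Python sketch below turns a pixel-space bounding box into a YOLO-style label line: a class index followed by the box center and size, each normalized by the image dimensions. The function name and the (x, y, width, height) top-left input layout are assumptions made for this example.

```python
def to_yolo_line(class_id, x, y, w, h, img_w, img_h):
    """Convert a top-left pixel box (x, y, w, h) into a YOLO label line.

    Hypothetical helper for illustration; the input layout is an assumption.
    """
    if img_w <= 0 or img_h <= 0:
        raise ValueError("image dimensions must be positive")
    # YOLO stores the box center and size as fractions of the image size.
    x_center = (x + w / 2) / img_w
    y_center = (y + h / 2) / img_h
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{w / img_w:.6f} {h / img_h:.6f}"
    )


# A 200x100 box whose top-left corner is at (100, 50) in a 640x480 image.
print(to_yolo_line(0, 100, 50, 200, 100, 640, 480))
```

Normalized coordinates keep the labels valid if the images are later resized, which is one reason this layout is common in detection pipelines.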
Navigate to the makesense.ai official web application.
Drag and drop a local dataset of images into the browser window.
Select the 'Object Detection' or 'Image Recognition' workflow based on project goals.
Define a list of labels/classes (e.g., 'Car', 'Pedestrian', 'Traffic Light').
(Optional) Load a pre-trained TF.js model for automated bounding box suggestion.
Use hotkeys (W for box, P for polygon) to annotate objects in the viewport.
Adjust coordinates and refine edge cases manually using the zoom and pan tools.
Save the project session locally to preserve state without cloud sync.
Select 'Export Labels' from the Actions menu.
Choose the desired ML framework format (e.g., YOLOv8) and download the ZIP.
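After downloading an export, a quick sanity check is to convert a few label lines back into pixel coordinates and compare them with what was drawn. The sketch below assumes the common YOLO text layout (class index, then normalized center x, center y, width, and height on each line). The layout of the downloaded ZIP is not described here, so the hypothetical function works on the text of a single label file rather than on the archive.

```python
def parse_yolo_labels(text, img_w, img_h):
    """Parse YOLO label text into (class_id, left, top, width, height) pixel boxes.

    Hypothetical helper for illustration; assumes five fields per line.
    """
    boxes = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != 5:
            raise ValueError(f"line {line_no}: expected 5 fields, got {len(parts)}")
        class_id = int(parts[0])
        x_center, y_center, w, h = (float(p) for p in parts[1:])
        # Undo the normalization and move from box center to top-left corner.
        left = (x_center - w / 2) * img_w
        top = (y_center - h / 2) * img_h
        boxes.append(
            (class_id, round(left), round(top), round(w * img_w), round(h * img_h))
        )
    return boxes


sample = "0 0.5 0.5 0.25 0.5\n"
print(parse_yolo_labels(sample, 640, 480))
```

Rounding to whole pixels is a choice made for readability; keep the float values if sub-pixel precision matters downstream.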
"Users praise the tool for its 'no-nonsense' approach and speed. The lack of account requirements and data privacy are cited as the top reasons for use in professional settings."