Ultralytics YOLOv8, developed by Ultralytics, is a state-of-the-art (SOTA) model that builds upon the success of previous YOLO versions and introduces new features and improvements to further boost performance and flexibility. YOLOv8 is designed to be fast, accurate, and easy to use, making it an excellent choice for a wide range of object detection, image segmentation, and image classification tasks.
Backbone | Arch | Size | Mask Refine | SyncBN | AMP | Mem (GB) | box AP | TTA box AP | Config | Download |
---|---|---|---|---|---|---|---|---|---|---|
YOLOv8-n | P5 | 640 | No | Yes | Yes | 2.8 | 37.2 | | config | model / log |
YOLOv8-n | P5 | 640 | Yes | Yes | Yes | 2.5 | 37.4 (+0.2) | 39.9 | config | model / log |
YOLOv8-s | P5 | 640 | No | Yes | Yes | 4.0 | 44.2 | | config | model / log |
YOLOv8-s | P5 | 640 | Yes | Yes | Yes | 4.0 | 45.1 (+0.9) | 46.8 | config | model / log |
YOLOv8-m | P5 | 640 | No | Yes | Yes | 7.2 | 49.8 | | config | model / log |
YOLOv8-m | P5 | 640 | Yes | Yes | Yes | 7.0 | 50.6 (+0.8) | 52.3 | config | model / log |
YOLOv8-l | P5 | 640 | No | Yes | Yes | 9.8 | 52.1 | | config | model / log |
YOLOv8-l | P5 | 640 | Yes | Yes | Yes | 9.1 | 53.0 (+0.9) | 54.4 | config | model / log |
YOLOv8-x | P5 | 640 | No | Yes | Yes | 12.2 | 52.7 | | config | model / log |
YOLOv8-x | P5 | 640 | Yes | Yes | Yes | 12.4 | 54.0 (+1.3) | 55.0 | config | model / log |
Note
- We use 8x V100 for training, and the single-GPU batch size is 16. This is different from the official code, but has no effect on performance.
- Performance is unstable and may fluctuate by about 0.3 mAP, and the best-performing weights in `COCO` training of `YOLOv8` may not come from the last epoch. The results shown above are from the best model.
- `SyncBN` means using SyncBN; `AMP` indicates training with mixed precision.
- The performance of `Mask Refine` training corresponds to that of the weights officially released by YOLOv8. `Mask Refine` means refining the bbox by mask while loading annotations and after the `YOLOv5RandomAffine` transform; the L and X models also use `Copy Paste`.
- `TTA` means Test Time Augmentation. It performs 3 multi-scale transformations on the image, followed by 2 flipping transformations (flipped and not flipped). To enable it, specify `--tta` when testing. See TTA for details.
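
The TTA note above can be made concrete with a small sketch. It is not the implementation behind `--tta`: the three scales, the function names `tta_views` and `restore_box`, and the plain-resize model (no aspect-ratio padding) are all assumptions chosen for illustration. It only shows how 3 scales combined with 2 flip states give 6 views, and how a box predicted on one view could be mapped back to the original image.

```python
from itertools import product


def tta_views(scales, flips=(False, True)):
    """Enumerate every (scale, flip) combination: 3 scales x 2 flips = 6 views."""
    return [{"scale": s, "flip": f} for s, f in product(scales, flips)]


def restore_box(box, view, orig_size):
    """Map an (x1, y1, x2, y2) box from an augmented view back to the original image.

    Assumes the view was made by a plain resize to view["scale"] (width, height),
    optionally followed by a horizontal flip. Padding is not modelled here.
    """
    x1, y1, x2, y2 = box
    w_aug, h_aug = view["scale"]
    w, h = orig_size
    if view["flip"]:
        # Undo the horizontal flip inside the augmented image.
        x1, x2 = w_aug - x2, w_aug - x1
    # Undo the resize.
    sx, sy = w / w_aug, h / h_aug
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# Hypothetical scales (width, height); the real values come from the TTA config.
scales = [(640, 640), (320, 320), (960, 960)]
views = tta_views(scales)
print(len(views))

# A box found in the left half of a flipped 320x320 view lies in the
# right half of the original 640x640 image.
flipped_small = {"scale": (320, 320), "flip": True}
print(restore_box((0, 0, 160, 160), flipped_small, (640, 640)))
```

In a full pipeline the restored boxes from all views would then be merged, typically with non-maximum suppression; that step is omitted here.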