Posted on 2021-08-18
Ryax Technologies

General object detection (overhead imagery)


This model is trained to detect 60 different types of objects in overhead imagery (including buses, vehicle lots, buildings, and oil tankers) using one of the largest and highest-quality public datasets of annotated high-resolution satellite imagery. The dataset covers a total area of over 45,000 square kilometers and contains many variations of each of the 60 object classes, which makes the resulting model more robust.

Analytics status

  • Beta

Business benefit

The ability to quickly and accurately identify objects in aerial imagery can be useful in a variety of scenarios, such as urban planning, crop surveillance, and traffic surveillance.

Data inputs (mandatory)

▪ Image (100 MB max; .png, .jpg, .jpeg, or .tiff)
The input file should be a 3-channel electro-optical satellite image.
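
The size and format limits above can be checked before a file is submitted. The following is a minimal sketch, not part of the service itself: the function name `check_input` is hypothetical, and 1 MB is assumed to mean 10^6 bytes. It does not verify that the image has three channels, which would require an image library.

```python
import os

# Limits taken from the "Data inputs" entry above.
# Assumption: 1 MB is counted as 10**6 bytes.
MAX_BYTES = 100 * 10**6
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff"}


def check_input(filename, size_bytes):
    """Return a list of problems; an empty list means the file passes these checks."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        problems.append("unsupported extension: " + (ext or "(none)"))
    if size_bytes > MAX_BYTES:
        problems.append("file too large: %d bytes" % size_bytes)
    return problems
```

For example, `check_input("scene.tiff", 5_000_000)` returns an empty list, while a 200 MB `.bmp` file is reported with two problems.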

Data Output

▪ Text file (1 MB max). The output file contains the bounding boxes of the detected objects. Each bounding box is reported with its class name, confidence score, and the x,y coordinates of its top-left and bottom-right corners.

The model can detect the following object classes:
Fixed-wing Aircraft, Small Aircraft, Cargo Plane, Helicopter,
Small Car, Bus, Pickup Truck, Utility Truck,
Truck, Cargo Truck, Truck with Box, Truck Tractor,
Trailer, Truck with Flatbed, Truck with Liquid, Crane Truck,
Railway Vehicle, Passenger Car, Cargo Car, Flat Car,
Tank Car, Locomotive, Maritime Vessel, Motorboat,
Sailboat, Tugboat, Barge, Fishing Vessel,
Ferry, Yacht, Container Ship, Oil Tanker,
Engineering Vehicle, Tower Crane, Container Crane, Reach Stacker,
Straddle Carrier, Mobile Crane, Dump Truck, Haul Truck,
Scraper or Tractor, Front Loader or Bulldozer, Excavator, Cement Mixer,
Ground Grader, Hut or Tent, Shed, Building,
Aircraft Hangar, Damaged Building, Facility, Construction Site,
Vehicle Lot, Helipad, Storage Tank, Shipping Container Lot,
Shipping Container, Pylon, Tower, Passenger Vehicle
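
The exact layout of the output file is not specified here. As an illustration only, the sketch below assumes a hypothetical JSON list in which each record has the keys `class_name`, `confidence`, and `box` (top-left x, top-left y, bottom-right x, bottom-right y), and shows how such records could be loaded and filtered by confidence. The key names and the `parse_detections` helper are assumptions, not the documented format.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    class_name: str
    confidence: float
    x_min: float  # top-left x
    y_min: float  # top-left y
    x_max: float  # bottom-right x
    y_max: float  # bottom-right y


def parse_detections(text, min_confidence=0.0):
    """Parse a JSON list of detections (hypothetical schema) and drop low-confidence ones."""
    records = json.loads(text)
    detections = [
        Detection(r["class_name"], float(r["confidence"]),
                  *(float(v) for v in r["box"]))
        for r in records
    ]
    return [d for d in detections if d.confidence >= min_confidence]


# Made-up sample data, for illustration only.
sample = (
    '[{"class_name": "Bus", "confidence": 0.91, "box": [10, 20, 50, 60]},'
    ' {"class_name": "Small Car", "confidence": 0.30, "box": [5, 5, 12, 9]}]'
)
confident = parse_detections(sample, min_confidence=0.5)
```

With the made-up sample above, `confident` holds a single `Detection` for the bus.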

Technical description

23.2% 60-class mAP

The mean of Average Precision scores across all classes.

This model was trained and validated using the public xView2 dataset, which consists of high-resolution satellite imagery annotated with building locations and damage scores before and after natural disasters. This model achieves a 60-class mean average precision (mAP) score of 0.2319, calculated using the methodology outlined by SIMRDWN. An Intersection over Union (IoU) threshold of 0.5 was used for most classes, and a threshold of 0.25 was used for comparatively small objects, such as vehicles. The model performs best on electro-optical satellite imagery with a ground sample distance of 0.3 meters. For fast inference, this model should have access to at least one GPU.
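
To make the metric concrete, here is a minimal sketch of the IoU test and the final averaging step. It is not the SIMRDWN evaluation code: the set `SMALL_OBJECT_CLASSES` is a hypothetical stand-in, since the text only says "such as vehicles", and computing each per-class average precision from a precision-recall curve is left out.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumption: which classes count as "small" is not listed in the text.
SMALL_OBJECT_CLASSES = {"Small Car", "Passenger Vehicle"}


def is_match(pred_box, truth_box, class_name):
    """A prediction matches a ground-truth box if IoU reaches the class threshold."""
    threshold = 0.25 if class_name in SMALL_OBJECT_CLASSES else 0.5
    return iou(pred_box, truth_box) >= threshold


def mean_average_precision(ap_by_class):
    """mAP is the plain mean of the per-class average precision scores."""
    return sum(ap_by_class.values()) / len(ap_by_class)
```

Two 10 × 10 boxes offset by 5 pixels horizontally have an IoU of 1/3, so under these assumptions the pair would count as a match for a small car (threshold 0.25) but not for a building (threshold 0.5).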

This model detects 60 classes of objects within overhead electro-optical (EO) satellite imagery. It uses an adapted version of YOLO called YOLT2, provided by the open-source SIMRDWN framework. YOLT2 is a pipeline tailored to satellite imagery, in which larger convolutional filters are replaced by smaller 3×3 filters to improve the model's ability to detect small objects from a distance. The model accepts a TIFF, PNG, or JPEG image as its input and returns a JSON file containing the detected bounding boxes with their object class names and confidence scores.

The training set consists of 80% of the original images and annotations from the public xView2 dataset, randomly sampled without replacement. The images were then chipped to fit the YOLT network's window size of 416 × 416 pixels and processed at their native resolution. Image chips that contained no objects were discarded.
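
The preparation steps described above can be sketched as follows. This is an illustration under stated assumptions, not the SIMRDWN implementation: the function names are hypothetical, chips are laid out without overlap with the last chip in each row and column shifted inward to stay inside the image, and a chip is kept when it contains the center of at least one annotated box. The source does not say how edges or partially covered objects were actually handled.

```python
import random

CHIP = 416  # YOLT window size in pixels, as stated above


def train_val_split(image_ids, train_fraction=0.8, seed=0):
    """Randomly sample a fraction of the images, without replacement, for training."""
    ids = list(image_ids)
    rng = random.Random(seed)
    train = set(rng.sample(ids, int(round(train_fraction * len(ids)))))
    return [i for i in ids if i in train], [i for i in ids if i not in train]


def chip_origins(width, height, chip=CHIP):
    """Top-left corners of chips covering a width x height image."""
    def starts(length):
        if length <= chip:
            return [0]
        s = list(range(0, length - chip + 1, chip))
        if s[-1] + chip < length:
            s.append(length - chip)  # shift the last chip inward (assumption)
        return s
    return [(x, y) for y in starts(height) for x in starts(width)]


def chips_with_objects(origins, boxes, chip=CHIP):
    """Keep chips that contain the center of at least one (x_min, y_min, x_max, y_max) box."""
    kept = []
    for x0, y0 in origins:
        for x_min, y_min, x_max, y_max in boxes:
            cx, cy = (x_min + x_max) / 2, (y_min + y_max) / 2
            if x0 <= cx < x0 + chip and y0 <= cy < y0 + chip:
                kept.append((x0, y0))
                break
    return kept
```

For a 1000 × 416 image, `chip_origins` yields chips starting at x = 0, 416, and 584. With a single object near the right edge, only the chips that contain its center are kept.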