SCD: A Stacked Carton Dataset for Detection and Segmentation
Even the most advanced Warehouse Management Systems (WMS) depend on humans, and humans make errors. One such error, which drives up operational costs, is the misplacement of pallets. A misplaced pallet is a lost pallet: the WMS shows its location, but the pallet is not there. To find lost pallets, warehouses have to perform periodic inventory counts, which are costly and slow when done manually.
Manual counting of pallets can be replaced with AI/ML. The following steps are needed to implement automated AI-assisted pallet counting/identification:
The system described above can be connected to a drone that performs autonomous or semi-autonomous flights inside a warehouse and sends a video stream or images to a server, which performs object detection, recognition, and matching.
This repository contains notebooks and code to fine-tune and deploy several models. We use the Stacked Carton Dataset (SCD) to fine-tune a model to recognize segments of carton boxes stacked on pallets in warehouses.
The two images below show an example from the SCD dataset:
Our goal is to fine-tune the Florence-2 model so that it annotates carton segments in an image with maximum accuracy.


Notebooks will be added as the project progresses.
0_SCD_download_explore.ipynb
In this notebook we download the Stacked Carton Dataset and explore its contents.
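As a rough illustration of what such an exploration step might look like, the sketch below summarizes an annotation file. It assumes the annotations are available as a COCO-style dictionary with `images`, `annotations`, and `categories` keys; that layout, the function name `summarize_coco`, and the toy category name `carton` are assumptions for illustration, not taken from the notebook.

```python
from collections import Counter
from typing import Any, Dict


def summarize_coco(coco: Dict[str, Any]) -> Dict[str, Any]:
    """Return basic counts for a COCO-style annotation dictionary (assumed layout)."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    annotations = coco["annotations"]

    # Number of annotations attached to each image id.
    per_image = Counter(a["image_id"] for a in annotations)
    # Number of annotations for each category name.
    per_category = Counter(id_to_name[a["category_id"]] for a in annotations)

    n_images = len(coco["images"])
    return {
        "images": n_images,
        "annotations": len(annotations),
        "mean_per_image": len(annotations) / n_images if n_images else 0.0,
        "per_category": dict(per_category),
        "images_without_annotations": sum(
            1 for im in coco["images"] if per_image[im["id"]] == 0
        ),
    }


# Toy data standing in for a real annotation file (hypothetical values).
toy = {
    "images": [{"id": 1}, {"id": 2}, {"id": 3}],
    "categories": [{"id": 1, "name": "carton"}],
    "annotations": [
        {"image_id": 1, "category_id": 1},
        {"image_id": 1, "category_id": 1},
        {"image_id": 2, "category_id": 1},
    ],
}
print(summarize_coco(toy))
```

A summary like this helps spot images without labels and gives a feel for how many carton instances a typical image contains before any model is trained.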
1_SCD_create_dataset_ft_detr.ipynb
In this notebook we make the necessary transformations and create the following 3 datasets:
3_florence_2_zero_shot.ipynb
In this notebook we test the zero-shot performance of Florence-2 on a selection of images from val_df. The model shows reasonable accuracy out of the box, which makes it a good candidate for fine-tuning.
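One common way to put a number on "reasonable accuracy" is to match predicted boxes against ground-truth boxes by intersection over union (IoU) and report precision and recall. The sketch below is a minimal, assumed evaluation helper, not the notebook's own code: the box format `(x_min, y_min, x_max, y_max)`, the 0.5 threshold, and the function names are illustrative choices.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_boxes(
    preds: List[Box], truths: List[Box], threshold: float = 0.5
) -> Tuple[int, int, int]:
    """Greedy one-to-one matching; returns (true_pos, false_pos, false_neg)."""
    unmatched = list(range(len(truths)))
    tp = 0
    for p in preds:
        best_j, best_iou = None, threshold
        for j in unmatched:
            v = iou(p, truths[j])
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j is not None:
            unmatched.remove(best_j)  # each ground-truth box is matched at most once
            tp += 1
    return tp, len(preds) - tp, len(unmatched)


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision and recall, defined as 0.0 when the denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Running the same helper on zero-shot and fine-tuned predictions for the same images gives a like-for-like comparison. The greedy matching here is simpler than the full COCO mAP protocol, so the numbers are only a quick sanity check.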
4_fine_tune_rt_detr.ipynb
In this notebook we fine-tune the RT-DETR model. Unfortunately, we did not achieve good enough results with this model.