Automatically scrape images for your query from popular search engines,
using either an easy-to-use front end or scripts.

This code is part of a paper (see the citation below). Also check the project page if you are interested in creating a dataset for instance segmentation.
Start the front end with a single command (replace /PATH/TO/OUTPUT with your desired output path):

    docker run -it --rm --name easy_image_scraping --mount type=bind,source=/PATH/TO/OUTPUT,target=/usr/src/app/output -p 5000:5000 ghcr.io/a-nau/easy-image-scraping:latest

Enter your query and wait for the results to appear in the output folder. The web application also shows a preview of the
downloaded images.
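
If you would rather launch the container from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of the project: `build_docker_command` and its parameters are hypothetical names, while the image name, mount target, and flags are copied from the command above. It creates the output folder first, because Docker refuses a `--mount type=bind` whose source path does not exist.

```python
from pathlib import Path

# Image name taken from the docker command in this README.
IMAGE = "ghcr.io/a-nau/easy-image-scraping:latest"


def build_docker_command(output_dir, image=IMAGE, host_port=5000, extra_args=()):
    """Return the `docker run` argument list for the given output folder.

    Hypothetical helper: it only builds the command, it does not run it.
    """
    source = Path(output_dir).expanduser().resolve()
    # A bind mount via --mount fails if the source folder is missing,
    # so create it up front.
    source.mkdir(parents=True, exist_ok=True)
    return [
        "docker", "run", "-it", "--rm",
        "--name", "easy_image_scraping",
        "--mount", f"type=bind,source={source},target=/usr/src/app/output",
        "-p", f"{host_port}:5000",
        image,
        *extra_args,  # e.g. ["bash"] to get a shell instead of the front end
    ]
```

To start the container, pass the result to `subprocess.run(cmd, check=True)`; `shlex.join(cmd)` prints the equivalent shell command for inspection.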
To use the command line instead, start the container with

    docker run -it --rm --name easy_image_scraping --mount type=bind,source=/PATH/TO/OUTPUT,target=/usr/src/app/output -p 5000:5000 ghcr.io/a-nau/easy-image-scraping:latest bash

If you just want to search for a single keyword, adjust and run search_by_keyword.py.
To search for a list of keywords, the relevant files are search_terms_eng.txt, config.py (to define the search engines for each language), and search_by_keywords_from_files.

The following steps are optional; you can also directly use our provided container.
To build the image yourself, run

    docker build -t easy_image_scraping .

Then run it using

    docker run -it --rm --name easy_image_scraping -p 5000:5000 --mount type=bind,source=/PATH/TO/OUTPUT,target=/usr/src/app/output easy_image_scraping

For a local setup without Docker, the setup commands are

    conda env create -f environment.yml
    pip install -r requirements.txt

If you use a local Chrome driver, add its path where the web driver is created:

    with webdriver.Chrome(
        executable_path="path/to/chromedriver.exe",  # add this line
        options=set_chrome_options()
    ) as wd:

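
Note that the `executable_path` argument of `webdriver.Chrome` was deprecated in Selenium 4 and removed in later 4.x releases, so the snippet above may not work with a recent Selenium. A minimal sketch of the equivalent call through a `Service` object follows, assuming Selenium 4 is installed. `make_chrome_driver` is a hypothetical helper, not part of this project, and `options` stands for whatever `set_chrome_options()` returns.

```python
def make_chrome_driver(driver_path, options=None):
    """Create a Chrome WebDriver from an explicit driver path (Selenium 4 style).

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this module still loads where Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    # In Selenium 4 the driver location is passed via a Service object
    # rather than directly to webdriver.Chrome.
    service = Service(executable_path=str(driver_path))
    return webdriver.Chrome(service=service, options=options)
```

It can be used as a drop-in for the call above, for example `with make_chrome_driver("path/to/chromedriver.exe", options=set_chrome_options()) as wd:`.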
Unless stated otherwise, this project is licensed under the MIT license.
If you use this code for scientific research, please consider citing

    @inproceedings{naumannScrapeCutPasteLearn2022,
      title = {Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics},
      author = {Naumann, Alexander and Hertlein, Felix and Zhou, Benchun and Dörr, Laura and Furmans, Kai},
      booktitle = {{{IEEE Conference}} on {{Machine Learning}} and Applications ({{ICMLA}})},
      date = 2022
    }

Please be aware of copyright restrictions that might apply to images you download.