image_search

IMAGE SEARCH APP

Project Intro

The advent of the internet revolutionized the way we access information through potent search engines such as Google, Bing, and Yandex. With just a few keywords, we can swiftly locate web pages pertinent to our queries. As technology, particularly AI, advances, many search engines now facilitate online image searches.

Various techniques for image searching have emerged, including:

Image search by metadata: the search is based not on the image itself but on the metadata that accompanies it (keywords, text, filename, date, etc.).
Image search based on image content: this approach uses state-of-the-art computer vision techniques to extract shape, colour and other relevant features from an image. This is the technique we use here.

In this project, we will use a pre-trained Convolutional Neural Network (CNN) to extract valuable features from the images. This methodology, a key component of content-based image search, provides the following benefits:

CNNs are robust: CNNs have proven very effective at extracting key features from an image.
CNNs can reduce dimensionality: not every pixel holds significant information, so the CNN output is a condensed, relevant representation of the image, often called a feature map, embedding or vector. This representation usually has fewer dimensions than the raw image.

In summary, this project aims to answer the following question: do two similar images also have similar embeddings?
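One way to make this question concrete is to compare embeddings with a similarity measure. The sketch below uses cosine similarity on short made-up vectors standing in for 512-dimensional embeddings; the vectors and the function name are illustrative assumptions, not values or code from this project.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up 4-dimensional stand-ins for real 512-dimensional embeddings.
bird_a = [0.9, 0.1, 0.0, 0.3]
bird_b = [0.8, 0.2, 0.1, 0.4]
truck = [0.0, 0.9, 0.7, 0.1]

print(cosine_similarity(bird_a, bird_b))  # close to 1: similar direction
print(cosine_similarity(bird_a, truck))   # much lower
```

If the answer to the question is yes, pairs of images from the same category should tend to score higher than pairs from different categories.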

Technologies / Frameworks used

Project Description

For this project, we used CIFAR-10, a freely available dataset comprising 60,000 color images, each measuring 32x32 pixels. These images belong to 10 distinct categories: Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, and Truck. To obtain their embeddings, we applied a pre-trained CNN model, VGG-16, as a feature extractor. The resulting vector is 512-dimensional. In Pinecone, we created an index named "images" with a dimension of 512, where all these vectors are stored.
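The 512-dimensional size is consistent with using only the convolutional part of VGG-16 on 32x32 inputs. The following sketch works through that arithmetic. It assumes the classifier head is dropped and that the standard VGG-16 layout is used (five blocks, each ending in 2x2 max-pooling, with 64, 128, 256, 512 and 512 channels), which the description above does not state explicitly.

```python
# Channel count of each of the five VGG-16 convolutional blocks.
VGG16_BLOCK_CHANNELS = [64, 128, 256, 512, 512]


def vgg16_conv_output_shape(height, width):
    """Return (height, width, channels) after the five VGG-16 conv blocks.

    The 3x3 convolutions use 'same' padding, so only the 2x2 max-pooling
    at the end of each block changes the spatial size (it halves it).
    """
    channels = 3  # RGB input
    for block_channels in VGG16_BLOCK_CHANNELS:
        height //= 2
        width //= 2
        channels = block_channels
    return height, width, channels


h, w, c = vgg16_conv_output_shape(32, 32)
print((h, w, c))  # (1, 1, 512)
print(h * w * c)  # 512 values per image
```

Under these assumptions a 32x32 image is halved five times down to 1x1, leaving exactly one value per channel of the last block.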

The idea behind this project is to check whether similar images (of birds, for example) have similar embeddings. To do so, we uploaded the embeddings of 50,000 of the 60,000 images to a Pinecone index. The remaining 10,000 images are held out so that queries can use entirely new images, distinct from those already stored as vectors in Pinecone. Note that CIFAR-10 already ships with this partition: the train and test batches are serialized versions of the original image arrays.
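For reference, the CIFAR-10 "python version" stores each batch as a pickled dictionary whose `data` entry holds one row of 3,072 bytes per image: 1,024 red values, then 1,024 green, then 1,024 blue, each in row-major order. The sketch below shows how such a batch could be loaded and one row turned back into a 32x32 RGB image. The helper names are hypothetical, and the project's own loading code may differ.

```python
import pickle


def load_batch(path):
    """Unpickle one CIFAR-10 batch file into a dict with bytes keys."""
    with open(path, "rb") as f:
        return pickle.load(f, encoding="bytes")


def row_to_image(row, size=32):
    """Convert a flat channel-first row (R plane, G plane, B plane)
    into a size x size grid of (r, g, b) tuples."""
    plane = size * size
    if len(row) != 3 * plane:
        raise ValueError("expected %d values, got %d" % (3 * plane, len(row)))
    red, green, blue = row[:plane], row[plane:2 * plane], row[2 * plane:]
    return [
        [(red[y * size + x], green[y * size + x], blue[y * size + x])
         for x in range(size)]
        for y in range(size)
    ]


# Usage with a real batch file (the path is an assumption about your download;
# unpickling the real files also requires NumPy to be installed):
# batch = load_batch("cifar-10-batches-py/data_batch_1")
# image = row_to_image(list(batch[b"data"][0]))
# label = batch[b"labels"][0]
```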

Working principle

The picture below describes the whole process of storing the embeddings in a Pinecone index: reading the images, applying a pre-trained VGG-16 network to generate 512-dimensional embeddings, and upserting (i.e. storing) them in a Pinecone index.

[Figure: working principle]
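The final step of this pipeline, sending vectors to the index, is usually done in batches rather than one vector at a time. Below is a minimal sketch of that batching. `upsert_fn` is a hypothetical stand-in for the Pinecone client's upsert call, and the batch size of 100 and the `img-<n>` ID scheme are assumptions, not values taken from this project.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upsert_embeddings(embeddings, upsert_fn, batch_size=100):
    """Send (id, vector) pairs to `upsert_fn` in batches.

    `upsert_fn` stands in for the real client call (hypothetical here);
    it receives a list of (id, vector) tuples.
    Returns the number of batches sent.
    """
    records = [("img-%d" % i, vector) for i, vector in enumerate(embeddings)]
    batches = 0
    for batch in chunked(records, batch_size):
        upsert_fn(batch)
        batches += 1
    return batches


# Demo with a fake client that only records what it was given.
sent = []
fake_vectors = [[0.0] * 512 for _ in range(250)]
n_batches = upsert_embeddings(fake_vectors, sent.append, batch_size=100)
print(n_batches, [len(b) for b in sent])  # 3 [100, 100, 50]
```

Passing the upsert call in as a parameter keeps the batching logic testable without a Pinecone account.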

Running time

In this project, we handle 50,000 images, which poses computational challenges, especially when reading the images, unpickling them (we downloaded the serialized version of the CIFAR-10 dataset) and extracting features with a CNN. We use multithreading so that the code runs as fast as possible on multiple CPU cores.
Note: If possible, run this project in a GPU-powered environment for faster computation.
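A common way to spread this kind of per-image work over several workers is a thread pool. The sketch below uses Python's standard `concurrent.futures` module; `extract_features` is a placeholder for the real CNN forward pass, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_features(image):
    """Placeholder for the real feature extractor (hypothetical).

    Here it just returns the mean pixel value so the example runs anywhere.
    """
    return sum(image) / len(image)


def extract_all(images, max_workers=4):
    """Apply `extract_features` to every image using a pool of threads.

    `Executor.map` returns results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_features, images))


images = [[i, i + 2] for i in range(6)]  # tiny fake "images"
print(extract_all(images))  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```

Threads help most when the work releases Python's global interpreter lock, as file I/O and most numerical libraries do; purely Python-level CPU-bound work would need processes instead.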

Getting Started

Create a free Pinecone account.
Get the API key and environment associated with your Pinecone account.
Clone this repo (for help see this tutorial).
Create a virtual environment in the project folder (for help see this tutorial).
Run the following command to install the necessary packages.

For linux users:

pip3 install -r requirements.txt

For windows users:

pip install -r requirements.txt

Launch the image insertion script using the following.

python insert_data.py -key <API_KEY>  -env <ENV>  -metric <METRIC>

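Replace <ENV> and <API_KEY> with the values from your Pinecone account, and <METRIC> with the similarity metric to use for the index (Pinecone supports cosine, euclidean and dotproduct). Wait for the script to finish.

Launch the app using the following.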

streamlit run app.py -- -key <API_KEY> -env <ENV>
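The flags in the two commands above could be parsed with `argparse` along the following lines. This is a guess at the interface based only on the flag names shown here; the default, the help texts and the list of allowed metrics are assumptions, and the actual scripts may differ.

```python
import argparse


def build_parser():
    """Build a parser for the -key, -env and -metric flags (assumed interface)."""
    parser = argparse.ArgumentParser(description="Image search helper (sketch)")
    parser.add_argument("-key", required=True, help="Pinecone API key")
    parser.add_argument("-env", required=True, help="Pinecone environment")
    parser.add_argument(
        "-metric",
        default="cosine",
        choices=["cosine", "euclidean", "dotproduct"],
        help="similarity metric for the index (assumed default: cosine)",
    )
    return parser


# Parse an explicit list instead of sys.argv so the example runs as-is.
args = build_parser().parse_args(["-key", "MY_KEY", "-env", "MY_ENV"])
print(args.key, args.env, args.metric)  # MY_KEY MY_ENV cosine
```

In the Streamlit command, the bare `--` separates Streamlit's own options from the arguments passed on to `app.py`.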

Once everything is done, you should see something like this:

[Screenshot: home page]

Additional Information

Version 1.0.0
Type Other source code
Update Time 2025-05-31
size 162.7MB
From Github
