Protex is a semantic search tool that lets researchers search for both known and novel proteins using natural-language functional descriptions (e.g. "Find proteins that can help degrade tyrosine phosphatase 1").
Made with: @rdilip, @alexub, @shawndimantha. 2nd place at Scale AI's Generative AI Hackathon.
Demonstration Video
OpenAI's CLIP model is used to embed both the text descriptions of proteins (scraped from InterPro) and the protein sequence embeddings (obtained using Meta's ESM embedding model). The embeddings are stored in ChromaDB, and a nearest-neighbor vector search finds the most relevant proteins for each user query.
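As a rough illustration of the nearest-neighbor step, here is a minimal sketch in plain Python. It is not the project's code: the in-memory dictionary `TOY_INDEX` stands in for ChromaDB, the three-dimensional vectors stand in for real CLIP/ESM embeddings, and the names `cosine_similarity`, `nearest_proteins`, and the protein IDs are all hypothetical. Cosine similarity is also an assumption; the actual distance metric used is not stated here.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def nearest_proteins(query_vec, index, k=2):
    """Return the IDs of the k stored vectors most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), pid) for pid, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [pid for _, pid in scored[:k]]

# Hypothetical toy index: in the real system these would be embeddings stored in ChromaDB.
TOY_INDEX = {
    "protein_A": [0.9, 0.1, 0.0],
    "protein_B": [0.1, 0.9, 0.0],
    "protein_C": [0.7, 0.3, 0.1],
}

# Hypothetical query embedding: in the real system this would come from embedding the user's text.
query = [1.0, 0.0, 0.0]
top = nearest_proteins(query, TOY_INDEX, k=2)
print(top)
```

A brute-force scan like this is fine for a handful of vectors; a vector database is what makes the same lookup practical over a large protein collection.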
Our site is built with React, Next.js, Tailwind CSS, and Flask.

The frontend is available at this link.
To get the backend running, simply:
protex.py
index.py
(You might be prompted to install some Python packages.)
The website should now be fully functional.