Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models


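Official implementation of the paper "Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models", presented at IEEE-RAS Humanoids 2024.

This repository contains the implementation of the LLM/VLM part of the project. For the multi-contact whole-body controller, please visit this repository.

For more details, visit the paper website.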

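Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models
- Table of Contents
- Repository Structure
- Prerequisites
- Installation
- Usage
  - Setup
  - Start the Docker Container
  - Quick Start
  - Command-Line Options
  - Using Local LLMs
- Contact
- Citing Words2Contact
- Acknowledgments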

Repository Structure

 .
├── .ci/                       # Docker configurations
│   └── Dockerfile             # Dockerfile to build the project's container
├── config/                    # Configuration files for models
│   └── GroundingDINO_SwinT_OGC.py # GroundingDINO configuration
├── data/                      # Test data and outputs
│   ├── test.png               # Example input image
│   └── test_output.png        # Example output image
├── media/                     # Media assets
│   ├── ack.png                # Acknowledgment image
│   └── concept_figure_wide.png # Conceptual figure for the project
├── submodules/                # External submodules
│   └── CLIP_Surgery/          # CLIP Surgery code and resources
├── words2contact/             # Core project source code
│   ├── grammar/               # Grammars for constraining language models
│   │   ├── classifier.gbnf    # Grammar for classifying outputs
│   │   └── README.md          # Grammar module documentation
│   ├── prompts/               # Prompts for LLMs
│   │   └── prompts.json       # JSON file with pre-defined prompts
│   ├── geom_utils.py          # Utilities for geometric calculations
│   ├── math_pars.py           # Parsing mathematical expressions
│   ├── saygment.py            # Language-grounded segmentation
│   ├── words2contacts.py      # Core script for Words2Contact
│   └── yello.py               # Language-grounded object detection
├── main.py                    # Entry point for the project
├── launch.sh                  # Docker launch script
├── object_detection.py        # Object detection testing
├── object_segmentation.py     # Object segmentation testing
└── README.md                  # Documentation (this file)

Prerequisites

Before you begin, make sure you have the following:

- Docker
- NVIDIA Container Toolkit (if you use a GPU, which is recommended)
- An OpenAI API key (if you use GPT-based LLMs). You can obtain one from OpenAI.
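
As a quick pre-flight check, the sketch below tests whether the required tools are on your `PATH`. It is not part of the repository: the function name `check_prerequisites` is hypothetical, and the presence of `nvidia-smi` is only a rough proxy for a working GPU driver. It does not verify that the NVIDIA Container Toolkit itself is installed.

```python
import shutil


def check_prerequisites(which=shutil.which):
    """Return a dict mapping each tool name to whether it was found on PATH.

    `which` is injectable so the check can be exercised without the tools.
    """
    tools = ["docker", "nvidia-smi"]  # nvidia-smi: proxy for a GPU driver (assumption)
    return {tool: which(tool) is not None for tool in tools}


if __name__ == "__main__":
    for tool, found in check_prerequisites().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```

A missing `nvidia-smi` is not fatal if you plan to run on CPU; a missing `docker` is, since Docker is currently the only supported installation path.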

Installation

Currently only Docker is supported; Conda and pip installation will be added soon.

Clone the repository:

git clone https://github.com/hucebot/words2contact.git
cd words2contact

Build the Docker image:

docker build -t words2contact -f .ci/Dockerfile .

Usage

Setup

If you plan to use OpenAI's GPT-based LLMs, set your API key as an environment variable before starting the Docker container:

export OPENAI_KEY=<your_openai_api_key>
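
To confirm the variable is visible before launching the container, a minimal check could look like the following. The helper name `openai_key_is_set` is hypothetical; only the variable name `OPENAI_KEY` comes from this README.

```python
import os


def openai_key_is_set(env=os.environ):
    """Return True if OPENAI_KEY is present and non-empty in `env`."""
    return bool(env.get("OPENAI_KEY", "").strip())


if __name__ == "__main__":
    if openai_key_is_set():
        print("OPENAI_KEY is set.")
    else:
        print("OPENAI_KEY is not set; --use_gpt will not work without it.")
```

The check only confirms that a value exists. It does not validate the key against the OpenAI API.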

Start the Docker Container

To start the container, run the following command:

bash launch.sh

This creates a models/ folder in the project root, where the models will be downloaded and stored.

Quick Start

To test Words2Contact on the provided example image:

python main.py --image_path data/test.png --prompt "Place your hand above the red bowl."

The output is saved as data/test_output.png.
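
If you want to try several instructions on the same image, the sketch below builds one command per prompt using only the documented flags (`--image_path`, `--prompt`, `--output_path`). The helper `build_command`, the second prompt, and the output file names are illustrative assumptions, not part of the repository.

```python
import shlex


def build_command(image_path, prompt, output_path):
    """Build the argument list for one main.py run."""
    return [
        "python", "main.py",
        "--image_path", image_path,
        "--prompt", prompt,
        "--output_path", output_path,
    ]


# Hypothetical prompts; only the first one appears in this README.
prompts = {
    "above": "Place your hand above the red bowl.",
    "left": "Place your hand to the left of the red bowl.",
}

for name, prompt in prompts.items():
    cmd = build_command("data/test.png", prompt, f"data/test_output_{name}.png")
    # Print a shell-safe command line; pass `cmd` to subprocess.run(cmd, check=True)
    # inside the container to actually execute it.
    print(shlex.join(cmd))
```

Writing each result to its own output path keeps one run from overwriting another.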

More examples are coming soon!

Command-Line Options

 usage: main.py [-h] [--image_path IMAGE_PATH] [--prompt PROMPT] [--use_gpt] [--yello_vlm YELLO_VLM] [--output_path OUTPUT_PATH] [--llm_path LLM_PATH] [--chat_template CHAT_TEMPLATE]

Run Words2Contact with an image and a text prompt.

options:
  -h, --help                    show this help message and exit
  --image_path IMAGE_PATH       Path to the input image file. Default: 'data/test.png'.
  --prompt PROMPT               Text prompt for Words2Contact. Default: 'Place your hand above the red bowl.'.
  --use_gpt                     Use OpenAI API for the LLM (requires `OPENAI_KEY`).
  --yello_vlm YELLO_VLM         Model to use for YELLO VLM. Default: 'GroundingDINO'.
  --output_path OUTPUT_PATH     Path to save the output image. Default: 'data/test_output.png'.
  --llm_path LLM_PATH           Path to the `.gguf` LLM model weights.
  --chat_template CHAT_TEMPLATE Chat template to use for local LLMs. Default: 'ChatML'.
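
For reference, the help text above can be reproduced by an `argparse` parser along the following lines. This is a reconstruction from the usage message, not the actual contents of `main.py`; in particular, the default of `None` for `--llm_path` is an assumption, since the help text does not state one.

```python
import argparse


def build_parser():
    """Reconstruct a parser matching the documented options (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Run Words2Contact with an image and a text prompt.",
    )
    parser.add_argument("--image_path", default="data/test.png",
                        help="Path to the input image file.")
    parser.add_argument("--prompt", default="Place your hand above the red bowl.",
                        help="Text prompt for Words2Contact.")
    parser.add_argument("--use_gpt", action="store_true",
                        help="Use OpenAI API for the LLM (requires OPENAI_KEY).")
    parser.add_argument("--yello_vlm", default="GroundingDINO",
                        help="Model to use for YELLO VLM.")
    parser.add_argument("--output_path", default="data/test_output.png",
                        help="Path to save the output image.")
    parser.add_argument("--llm_path", default=None,  # default is an assumption
                        help="Path to the .gguf LLM model weights.")
    parser.add_argument("--chat_template", default="ChatML",
                        help="Chat template to use for local LLMs.")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--use_gpt", "--output_path", "data/out.png"])
print(args.use_gpt, args.image_path, args.output_path)
```

Because `--use_gpt` is a boolean switch, it takes no value; every other option expects exactly one argument.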

Using Local LLMs

Download the .gguf weights for a local LLM from a trusted source (e.g., TheBloke's Hugging Face models).
Place the weights in the models/ folder.

Specify the --llm_path argument when running the script:

python main.py --image_path data/test.png --llm_path models/local_model.gguf
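
Before pointing `--llm_path` at a downloaded file, it can help to confirm that the file is actually in GGUF format. GGUF files begin with the four ASCII bytes `GGUF`, so a minimal sanity check (the helper name `looks_like_gguf` is hypothetical, and the check does not validate the rest of the file) could be:

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def looks_like_gguf(path):
    """Return True if `path` is an existing .gguf file with the GGUF magic bytes."""
    p = Path(path)
    if not p.is_file() or p.suffix != ".gguf":
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    # Example path taken from the command above.
    print(looks_like_gguf("models/local_model.gguf"))
```

This catches common mistakes such as a truncated download or an HTML error page saved under a `.gguf` name.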

Contact

For questions or support, please contact:

Dionis Totsila: [email protected]

Citing Words2Contact

If you use Words2Contact, our dataset, or parts of this code in your research, please cite our paper:

@INPROCEEDINGS{10769902,
  author={Totsila, Dionis and Rouxel, Quentin and Mouret, Jean-Baptiste and Ivaldi, Serena},
  booktitle={2024 IEEE-RAS 23rd International Conference on Humanoid Robots (Humanoids)},
  title={Words2Contact: Identifying Support Contacts from Verbal Instructions Using Foundation Models},
  year={2024},
  volume={},
  number={},
  pages={9-16},
  keywords={Accuracy;Large language models;Pipelines;Natural languages;Humanoid robots;Transforms;Benchmark testing;Iterative methods;Surface treatment},
  doi={10.1109/Humanoids58906.2024.10769902}}