
RepBelief

Language Models Represent Beliefs of Self and Others

This repository provides the code for the paper "Language Models Represent Beliefs of Self and Others". The paper shows that LLMs internally represent their own beliefs and those of other agents, and that manipulating these representations can significantly affect their Theory of Mind reasoning capabilities.

Installation

conda create -n lm python=3.8 anaconda
conda activate lm
# Please install PyTorch (<2.4) according to your CUDA version.
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

Then download the language models (e.g., Mistral-7B-Instruct-v0.2, deepseek-llm-7b-chat) to models/. Alternatively, specify their file paths in lm_paths.json.
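
The lookup described above could be sketched as follows. This is an illustration only: it assumes lm_paths.json is a flat mapping from model name to path, which this README does not confirm, and `resolve_model_path` is a hypothetical helper, not a function from the repository.

```python
import json
from pathlib import Path


def resolve_model_path(name, config_file="lm_paths.json", default_dir="models"):
    """Return the configured path for `name`, else fall back to models/<name>.

    Assumption: lm_paths.json is a flat {model_name: path} mapping.
    The actual schema used by the repository may differ.
    """
    config = Path(config_file)
    if config.is_file():
        with config.open(encoding="utf-8") as f:
            paths = json.load(f)
        if name in paths:
            return Path(paths[name])
    # No config entry: use the default download location.
    return Path(default_dir) / name


if __name__ == "__main__":
    print(resolve_model_path("Mistral-7B-Instruct-v0.2"))
```

Falling back to models/ keeps the default layout working without any configuration, while an entry in the JSON file takes precedence when present.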

Extracting Representations

sh scripts/save_reps.sh 0_forward belief
sh scripts/save_reps.sh 0_forward action
sh scripts/save_reps.sh 0_backward belief

Probing

Binary:

python probe.py --belief=protagonist --dynamic=0_forward --variable belief 
python probe.py --belief=oracle --dynamic=0_forward --variable belief

python probe.py --belief=protagonist --dynamic=0_forward --variable action 
python probe.py --belief=oracle --dynamic=0_forward --variable action

python probe.py --belief=protagonist --dynamic=0_backward --variable belief 
python probe.py --belief=oracle --dynamic=0_backward --variable belief
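
For readers unfamiliar with probing, the idea behind a binary linear probe can be sketched in a few lines. The example below is not the code in probe.py: it trains a plain logistic regression on synthetic vectors that stand in for saved model activations, and every name in it is hypothetical.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    # Probability that the probe assigns to class 1.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_probe(features, labels, lr=0.5, epochs=200):
    """Fit logistic-regression weights with batch gradient descent."""
    dim, n = len(features[0]), len(features)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            err = score(w, b, x) - y
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, features, labels):
    hits = sum((score(w, b, x) >= 0.5) == bool(y) for x, y in zip(features, labels))
    return hits / len(labels)


def make_synthetic(n, dim, rng):
    """Synthetic stand-in for activations: the two classes differ along axis 0."""
    data, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 0.5) for _ in range(dim)]
        x[0] += 1.0 if y else -1.0
        data.append(x)
        labels.append(y)
    return data, labels


rng = random.Random(0)
train_x, train_y = make_synthetic(200, 8, rng)
test_x, test_y = make_synthetic(100, 8, rng)
w, b = train_probe(train_x, train_y)
print(f"held-out probe accuracy: {accuracy(w, b, test_x, test_y):.2f}")
```

High held-out accuracy indicates that the label is linearly decodable from the vectors, which is the property a probe is meant to test.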

Multinomial:

python probe_multinomial.py --dynamic=0_forward --variable belief
python probe_multinomial.py --dynamic=0_forward --variable action
python probe_multinomial.py --dynamic=0_backward --variable belief
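
One plausible reading of the multinomial setting, not confirmed by this README, is that a single probe predicts the oracle's and the protagonist's belief status jointly, giving four classes. Under that assumption, the label encoding could look like the sketch below; `joint_label` and `split_label` are hypothetical helpers.

```python
def joint_label(oracle_true, protagonist_true):
    """Encode two binary belief labels as one class index in {0, 1, 2, 3}.

    Assumption: the four classes are the combinations of the two binary
    labels. The encoding used by probe_multinomial.py may differ.
    """
    return 2 * int(oracle_true) + int(protagonist_true)


def split_label(index):
    """Invert joint_label: return (oracle_true, protagonist_true)."""
    if index not in (0, 1, 2, 3):
        raise ValueError(f"class index out of range: {index}")
    return bool(index // 2), bool(index % 2)


if __name__ == "__main__":
    for oracle in (False, True):
        for protagonist in (False, True):
            print(oracle, protagonist, joint_label(oracle, protagonist))
```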

BigToM Evaluation

sh scripts/0_forward_belief.sh
sh scripts/0_forward_action.sh
sh scripts/0_backward_belief.sh

Intervention

Intervention for the Forward Belief task:

sh scripts/0_forward_belief_interv_oracle.sh
sh scripts/0_forward_belief_interv_protagonist.sh
sh scripts/0_forward_belief_interv_o0p1.sh
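
As a rough illustration of what manipulating a representation can mean, the sketch below shifts an activation vector along a fixed direction by a chosen strength. This is a generic activation-steering example under stated assumptions, not the procedure implemented by the scripts above; the functions, the direction, and `alpha` are all hypothetical.

```python
import math


def normalize(v):
    """Scale v to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def projection(activation, direction):
    """Signed length of the activation along the unit direction."""
    unit = normalize(direction)
    return sum(a * u for a, u in zip(activation, unit))


def steer(activation, direction, alpha):
    """Move the activation by alpha units along the direction.

    Assumption: the direction would come from a trained probe,
    e.g. its weight vector. Here it is simply given.
    """
    unit = normalize(direction)
    return [a + alpha * u for a, u in zip(activation, unit)]


if __name__ == "__main__":
    act = [1.0, 2.0, 3.0]
    direction = [0.0, 3.0, 4.0]
    shifted = steer(act, direction, alpha=2.0)
    print(projection(shifted, direction) - projection(act, direction))
```

Because the direction is normalized, the projection of the activation onto it changes by exactly `alpha` up to floating-point error, which makes the intervention strength easy to interpret.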

Cross-task intervention:

sh scripts/cross_0_forward_belief_to_forward_action_interv_o0p1.sh
sh scripts/cross_0_forward_belief_to_backward_belief_interv_o0p1.sh

Citation

@inproceedings{zhu2024language,
    title = {Language Models Represent Beliefs of Self and Others},
    author = {Zhu, Wentao and Zhang, Zhining and Wang, Yizhou},
    booktitle = {Forty-first International Conference on Machine Learning},
    year = {2024}
}
