Code and data for the paper "Linking Emergent and Natural Languages via Corpus Transfer" at ICLR 2022 (spotlight).
```bibtex
@inproceedings{yao2022linking,
  title     = {Linking Emergent and Natural Languages via Corpus Transfer},
  author    = {Yao, Shunyu and Yu, Mo and Zhang, Yang and Narasimhan, Karthik and Tenenbaum, Joshua and Gan, Chuang},
  booktitle = {International Conference on Learning Representations (ICLR)},
  year      = {2022},
  html      = {https://openreview.net/pdf?id=49A1Y6tRhaq},
}
```

The Google Drive folder includes:
- `image_features`: image features of the MS-COCO 2014 (`coco.pt`) and Conceptual Captions (`cc.pt`) datasets from a pre-trained ResNet, to be used in EC pre-training.
- `lm_corpora`: corpora used for the language modeling transfer experiments.
| Name | Usage | Comment |
|---|---|---|
| cc.pt | pre-train | Emergent language |
| paren-zipf.pt | pre-train | Regular language of nesting parentheses |
| wiki-es.pt | pre-train | Spanish (IE-Romance) Wikipedia |
| wiki-da.pt | fine-tune | Danish (IE-Germanic) Wikipedia |
| wiki-eu.pt | fine-tune | Basque (Basque) Wikipedia |
| wiki-ja.pt | fine-tune | Japanese (Japanese) Wikipedia |
| wiki-ro.pt | fine-tune | Romanian (IE-Romance) Wikipedia |
| wiki-fi.pt | fine-tune | Finnish (Uralic) Wikipedia |
| wiki-id.pt | fine-tune | Indonesian (Austronesian) Wikipedia |
| wiki-kk.pt | fine-tune | Kazakh (Turkic) Wikipedia |
| wiki-he.pt | fine-tune | Hebrew (Afro-Asiatic) Wikipedia |
| wiki-ur.pt | fine-tune | Urdu (IE-Indic) Wikipedia |
| wiki-fa.pt | fine-tune | Persian (IE-Iranian) Wikipedia |
This part generates emergent language corpora for downstream tasks.
Download `image_features` from Google Drive to `./ec-pretrain/data`.
To run the emergent communication training:

```bash
cd ec-game
python train.py
```

Some major options:

- `--dataset`: use the Conceptual Captions (`cc`) or MS-COCO (`coco_2014`) dataset.
- `--vocab_size`: vocabulary size (default 4035).
- `--seq_len`: sequence length limit (default 15).

Game training automatically stores EC agents (e.g. `./ckpt/cc_vocab_4035_seq_15_reset_-1_nlayers_1/run77926/model_90.6_1000_4035.pt`) and emergent language corpora (e.g. `./ckpt/cc_vocab_4035_seq_15_reset_-1_nlayers_1/run77926/model_90.6_1000_4035.pt-cc.pt`, which can be used in place of `lm_corpora/cc.pt` from Google Drive) at different training steps. In this example, `90.6_1000_4035` gives the game accuracy, the number of game training steps, and the game vocabulary size, respectively.
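For reference, here is an explicit invocation that spells out the documented defaults (flag names and values are taken from the option list above; see `train.py` for the full set of options):

```bash
# Minimal sketch: an EC-game training run with the documented default values spelled out.
cd ec-game
python train.py --dataset cc --vocab_size 4035 --seq_len 15
```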
This part aims to reproduce Figure 2 of the paper.
Download `lm_corpora` from Google Drive to `./ec-pretrain/data`.
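If you generated your own emergent-language corpus with the EC game above, you can use it in place of the packaged `cc.pt`. A minimal sketch, run from the repository root (the source path is the example from the EC section; your run directory and file name will differ):

```bash
# Replace the packaged emergent-language corpus with a self-generated one.
# The source path is only an example; substitute your own run directory and checkpoint name.
cp ec-game/ckpt/cc_vocab_4035_seq_15_reset_-1_nlayers_1/run77926/model_90.6_1000_4035.pt-cc.pt \
   ec-pretrain/data/cc.pt
```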
To run the pre-training:

```bash
export size=2            # one of 2, 5, 10, 15, 30
export pt_name="wiki-es" # or "paren-zipf", "cc"
. pretrain.sh
```

To run the fine-tuning:
```bash
export size=2            # one of 2, 5, 10, 15, 30
export pt_name="wiki-es" # or "paren-zipf", "cc"
export ft_name="wiki-ro"
export ckpt=3000
. finetune.sh
```

Meaning of the variables above:

- `size`: token count (in millions) of the pre-training corpus (2, 5, 10, 15, or 30).
- `pt_name`: name of the pre-training corpus (`"wiki-es"`, `"paren-zipf"`, or `"cc"`).
- `ft_name`: name of the fine-tuning corpus (any of the fine-tune corpora listed above, e.g. `"wiki-ro"`, `"wiki-da"`).
- `ckpt`: which pre-training checkpoint to use for fine-tuning (default 3000).
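To sweep all pre-training corpus sizes for a single pre-train/fine-tune pair, here is a minimal sketch (not part of the original scripts; it assumes `pretrain.sh` and `finetune.sh` read only the exported variables above):

```bash
# Sweep every documented pre-training corpus size for one corpus pair.
# Assumes pretrain.sh and finetune.sh read only the exported variables above.
export pt_name="cc"
export ft_name="wiki-ro"
export ckpt=3000
for s in 2 5 10 15 30; do
  export size=$s
  . pretrain.sh
  . finetune.sh
done
```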
The EC part of the code is based on ECNMT, which was in turn partly based on Translagent. The LM part of the code is based on Hugging Face's `run_clm.py`.
The datasets for our EC experiments include MS COCO and Conceptual Captions.
The datasets for our LM experiments derive from tilt-transfer.
Please cite these resources accordingly. For any questions, contact Shunyu.