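Word-level language modeling RNN

This example trains a multi-layer RNN (Elman, GRU, or LSTM) on a language modeling task. By default, the training script uses the provided PTB dataset. The generate script can then use the trained model to generate new text. This is a port of pytorch/examples/word_language_model.

Usage

The main.py script accepts the following arguments: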

optional arguments:
  -h, --help         show this help message and exit
  --data DATA        location of the data corpus
  --model MODEL      type of recurrent net (RNN_TANH, RNN_RELU, LSTM, GRU)
  --emsize EMSIZE    size of word embeddings
  --nhid NHID        number of hidden units per layer
  --nlayers NLAYERS  number of layers
  --lr LR            initial learning rate
  --clip CLIP        gradient clipping
  --epochs EPOCHS    upper epoch limit
  --batch-size N     batch size
  --bptt BPTT        sequence length
  --dropout DROPOUT  dropout applied to layers (0 = no dropout)
  --decay DECAY      learning rate decay per epoch
  --tied             tie the word embedding and softmax weights
  --seed SEED        random seed
  --cuda             use CUDA
  --log-interval N   report interval
  --save SAVE        path to save the final model
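
For readers who want to see how an option list like this is usually declared, here is a minimal argparse sketch covering a subset of the flags above. The function name build_parser and all default values are assumptions made for illustration; they are not taken from main.py.

```python
import argparse


def build_parser():
    # Sketch of a parser for a subset of the flags listed above.
    # NOTE: every default value here is an illustrative assumption.
    parser = argparse.ArgumentParser(description="Word-level language model (sketch)")
    parser.add_argument("--model", type=str, default="LSTM",
                        choices=["RNN_TANH", "RNN_RELU", "LSTM", "GRU"],
                        help="type of recurrent net")
    parser.add_argument("--emsize", type=int, default=200,
                        help="size of word embeddings")
    parser.add_argument("--nhid", type=int, default=200,
                        help="number of hidden units per layer")
    parser.add_argument("--dropout", type=float, default=0.2,
                        help="dropout applied to layers (0 = no dropout)")
    parser.add_argument("--batch-size", type=int, default=20, metavar="N",
                        help="batch size")
    parser.add_argument("--tied", action="store_true",
                        help="tie the word embedding and softmax weights")
    parser.add_argument("--cuda", action="store_true", help="use CUDA")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--emsize", "650", "--nhid", "650", "--dropout", "0.5", "--tied"]
)
print(args.model, args.emsize, args.nhid, args.dropout, args.tied, args.batch_size)
```

Boolean switches such as --tied and --cuda use action="store_true", so they are off unless passed, which matches how they appear without a value in the help text.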

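With these arguments, a variety of models can be tested. For example, the following arguments produce models that are slower to train but better: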

python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40           # Test perplexity of 80.97
python main.py --cuda --emsize 650 --nhid 650 --dropout 0.5 --epochs 40 --tied    # Test perplexity of 75.96
python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40        # Test perplexity of 77.42
python main.py --cuda --emsize 1500 --nhid 1500 --dropout 0.65 --epochs 40 --tied # Test perplexity of 72.30

These perplexities are equal to or better than those in Recurrent Neural Network Regularization (Zaremba et al. 2014) and are similar to those in Using the Output Embedding to Improve Language Models (Press & Wolf 2016) and Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling (Inan et al. 2016), though both of these papers have improved perplexities by using a form of recurrent dropout (variational dropout).
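
To make the perplexity figures concrete: perplexity is the exponential of the average per-word negative log-likelihood (cross-entropy measured in nats). The sketch below shows that relationship in plain Python; the helper names and the sample probabilities are illustrative assumptions, not code from this project.

```python
import math


def perplexity(avg_nll):
    """Convert an average per-word negative log-likelihood (in nats) to perplexity."""
    return math.exp(avg_nll)


def avg_nll_from_probs(probs):
    """Average negative log-likelihood of the probabilities assigned to the true next words."""
    return -sum(math.log(p) for p in probs) / len(probs)


# A model that gives probability 1/4 to every correct word is, on average,
# as uncertain as a uniform choice among 4 words: perplexity is about 4.
uniform_quarter = [0.25] * 10
print(perplexity(avg_nll_from_probs(uniform_quarter)))

# Going the other way: a reported perplexity maps back to an average loss.
print(math.log(72.30))  # average loss in nats behind a perplexity of 72.30
```

Under this definition, a test perplexity of 72.30 corresponds to an average loss of roughly 4.28 nats per word.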

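Architecture

Coming soon.

Run on FloydHub

Here are the commands for training, evaluating, and serving your language modeling task on FloydHub.

Project setup

Before you start, log in to FloydHub with the floyd login command, then fork and initialize the project: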

$ git clone https://github.com/floydhub/word-language-model.git
$ cd word-language-model
$ floyd init word-language-model

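Training

Before you start, you need to upload the Penn Treebank-3 dataset as a FloydHub dataset, following this guide: Create and Upload a Dataset. You will then be ready to experiment with different language models.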

# Train an LSTM on PTB with CUDA, reaching perplexity of 114.22
floyd run --gpu --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input "python main.py --cuda --epochs 7"

# Train a tied LSTM on PTB with CUDA, reaching perplexity of 110.44
floyd run --gpu --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input "python main.py --cuda --epochs 7 --tied"

# Train a tied LSTM on PTB with CUDA for 40 epochs, reaching perplexity of 87.17
floyd run --gpu --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input "python main.py --cuda --tied"

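Notes:

--gpu runs your job on a FloydHub GPU instance.
--env pytorch-0.2 prepares a PyTorch environment for Python 3.
--data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input mounts the previously uploaded Penn Treebank-3 dataset in the /input folder, where the job can read it.

The model uses the nn.RNN module (and its sister modules nn.GRU and nn.LSTM), which will automatically use the cuDNN backend if run on CUDA with cuDNN installed.

During training, if a keyboard interrupt (Ctrl-C) is received, training is stopped and the current model is evaluated against the test dataset.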
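
The interrupt behavior described above is commonly implemented by wrapping the epoch loop in a try/except block. The following is a minimal sketch of that pattern, not the project's actual training loop; train_with_interrupt, fake_epoch, and the stand-in test loss are hypothetical names and values used only for the demonstration.

```python
def train_with_interrupt(train_one_epoch, evaluate_on_test, epochs):
    """Run up to `epochs` epochs; on Ctrl-C, stop early and still evaluate."""
    completed = 0
    try:
        for epoch in range(1, epochs + 1):
            train_one_epoch(epoch)
            completed = epoch
    except KeyboardInterrupt:
        # Ctrl-C raises KeyboardInterrupt in the main thread; catch it so the
        # program falls through to the evaluation step instead of exiting.
        print("Exiting from training early")
    # Evaluation runs whether training finished normally or was interrupted.
    return completed, evaluate_on_test()


# Stand-in training step (hypothetical) that simulates Ctrl-C during epoch 3.
def fake_epoch(epoch):
    if epoch == 3:
        raise KeyboardInterrupt


completed, test_loss = train_with_interrupt(fake_epoch, lambda: 4.5, epochs=10)
print(completed, test_loss)
```

Because the evaluation call sits after the try/except rather than inside it, an interrupted run still reports a test result for whatever state the model reached.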

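You can follow the progress by using the logs command. The first two training examples should complete in about 5 minutes on a GPU instance and about 40 minutes on a CPU instance. The last example should take about 30 minutes on a GPU instance and more than 3 hours on a CPU instance.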

Evaluating

It's time to evaluate our model by generating some text:

# Generate samples from the trained LSTM model.
floyd run --gpu --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input --data <REPLACE_WITH_JOB_OUTPUT_NAME>:model "python generate.py --cuda"

Try our pre-trained model

We have provided a pre-trained model for you, trained for 40 epochs and reaching a perplexity of 87.17:

# Generate samples from the trained LSTM model.
floyd run --gpu --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input --data <REPLACE_WITH_JOB_OUTPUT_NAME>:model "python generate.py --cuda"

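Serve the model through a REST API

FloydHub supports a serving mode for demo and testing purposes. Before serving your model through a REST API, you need to create a floyd_requirements.txt file and declare the flask requirement in it. If you run a job with the --mode serve flag, FloydHub will run the app.py file in your project and attach it to a dynamic service endpoint:

floyd run --gpu --mode serve --env pytorch-0.2 --data <USERNAME>/dataset/<PENN-TB3>/<VERSION>:input --data <REPLACE_WITH_JOB_OUTPUT_NAME>:model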

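The above command will print out a service endpoint for this job in your terminal console.

The service endpoint will take a couple of minutes to become ready. Once it is up, you can interact with the model by sending a POST request containing the number of words to generate and the temperature the model should use when generating text:

# Template
# curl -X POST -o <NAME_&_PATH_DOWNLOADED_GENERATED_TEXT> -F "words=<NUMBER_OF_WORDS_TO_GENERATE>" -F "temperature=<TEMPERATURE>" <SERVICE_ENDPOINT>

curl -X POST -o generated.txt -F "words=100" -F "temperature=3" https://www.floydlabs.com/expose/vk47ixT8NeYBTFeMavbWta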
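
The temperature value controls how peaked the sampling distribution is: a low temperature concentrates probability on the highest-scoring words, while a high temperature flattens the distribution and yields more varied text. The sketch below illustrates the idea in plain Python, assuming the model produces one unnormalized score (logit) per vocabulary word. The function names and the toy vocabulary are hypothetical and are not taken from generate.py or app.py.

```python
import math
import random


def temperature_weights(logits, temperature):
    """Turn logits into a probability distribution scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exp() for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_word(vocab, logits, temperature, rng=random):
    """Draw one word according to the temperature-scaled distribution."""
    probs = temperature_weights(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]


# Toy example (hypothetical vocabulary and scores).
vocab = ["the", "market", "fell"]
logits = [2.0, 1.0, 0.1]

cold = temperature_weights(logits, 0.5)  # sharper: favors "the"
hot = temperature_weights(logits, 3.0)   # flatter: closer to uniform
print(cold)
print(hot)
print(sample_word(vocab, logits, 1.0, random.Random(0)))
```

With these toy scores, the top word receives a larger share of the probability at temperature 0.5 than at temperature 3, which is why a request with a high temperature tends to return less predictable text.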

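Any job running in serving mode will stay up until it reaches its maximum runtime. So once you are done testing, remember to shut down the job!

Note that this feature is in preview mode and is not production-ready yet.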