PBLM Domain Adaptation
1.0.0
Authors: Yftah Ziser, Roi Reichart (Technion - Israel Institute of Technology).
This is the code repository used to reproduce the results reported in Pivot Based Language Modeling for Improved Neural Domain Adaptation.
If you use this implementation in your paper, please cite :)
@inproceedings{ziser2018pivot,
  title={Pivot Based Language Modeling for Improved Neural Domain Adaptation},
  author={Ziser, Yftah and Reichart, Roi},
  booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
  volume={1},
  pages={1241--1251},
  year={2018}
}

PBLM requires the following packages:
Python >= 2.7
NumPy
SciPy
Theano/TensorFlow
Keras
scikit-learn
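The repository does not pin versions, so treat the following as a sketch of a typical installation; the PyPI package names are an assumption here, and either Theano or TensorFlow can serve as the Keras backend (TensorFlow shown):

pip install numpy scipy tensorflow keras scikit-learn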
You can find an explained example in Run.py:
import tr
import sentiment
import pre
import os
import itertools
if __name__ == '__main__':
    domain = []
    domain.append("books")
    domain.append("kitchen")
    domain.append("dvd")
    domain.append("electronics")

    # Training the PBLM model in order to create a structure-aware
    # representation for domain adaptation: a shared representation
    # for both the source domain and the target domain.
    # Input:
    #     first param: the source domain
    #     second param: the target domain
    #     third param: number of pivots
    #     fourth param: appearance threshold for pivots in the source and target domains
    #     fifth param: the embedding dimension
    #     sixth param: maximum number of words to work with
    #     seventh param: maximum review length to work with
    #     eighth param: number of hidden units for the PBLM model
    # Output: the software will create a corresponding directory containing the model.
    tr.train_PBLM(domain[0], domain[1], 500, 10, 256, 10000, 500, 256)
    # Training the sentiment CNN using PBLM's representation
    # (a shared representation for both the source and target domains).
    # This phase needs a corresponding trained PBLM model in order to work.
    #     first param: the source domain
    #     second param: the target domain
    #     third param: number of pivots
    #     fourth param: maximum review length to work with
    #     fifth param: the embedding dimension
    #     sixth param: maximum number of words to work with
    #     seventh param: number of hidden units for the PBLM model
    #     eighth param: the number of filters for the CNN
    #     ninth param: the kernel size for the CNN
    # Output: the results file will be created in the model's directory,
    # under the results directory in the "cnn" dir.
    sentiment.PBLM_CNN(domain[0], domain[1], 500, 500, 256, 10000, 256, 250, 3)
    # Training the sentiment LSTM using PBLM's representation
    # (a shared representation for both the source and target domains).
    # This phase needs a corresponding trained PBLM model in order to work.
    #     first param: the source domain
    #     second param: the target domain
    #     third param: number of pivots
    #     fourth param: maximum review length to work with
    #     fifth param: the embedding dimension
    #     sixth param: maximum number of words to work with
    #     seventh param: number of hidden units for the PBLM model
    #     eighth param: number of hidden units for the LSTM model
    # Output: the results file will be created in the model's directory,
    # under the results directory in the "lstm" dir.
    sentiment.PBLM_LSTM(domain[0], domain[1], 500, 500, 256, 10000, 256, 256)
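Run.py trains a single source-target pair. To reproduce the full cross-domain grid, the itertools module that Run.py already imports can iterate over all ordered domain pairs. A minimal sketch, reusing the hyperparameters from the example above (each pair trains its own PBLM model before the sentiment classifiers run):

import itertools
import tr
import sentiment

domains = ["books", "kitchen", "dvd", "electronics"]

# Every ordered (source, target) pair, e.g. books->kitchen as well as kitchen->books.
for source, target in itertools.permutations(domains, 2):
    # PBLM must be trained first; PBLM_CNN/PBLM_LSTM load the saved model.
    tr.train_PBLM(source, target, 500, 10, 256, 10000, 500, 256)
    sentiment.PBLM_CNN(source, target, 500, 500, 256, 10000, 256, 250, 3)
    sentiment.PBLM_LSTM(source, target, 500, 500, 256, 10000, 256, 256)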