Rasa NLU (Natural Language Understanding) is a tool for understanding natural language. The example on the official website takes a sentence such as:
"I'm looking for a Mexican restaurant in the center of town"
and returns structured data like:
intent: search_restaurant
entities:
- cuisine : Mexican
- location : center
The original project is on branch 0.2.7, and you can switch to it freely. This version is based on the latest version of rasa: the existing components in rasa_nlu_gao have been modified to work with it, and no new components have been added. The previous approach was also somewhat cumbersome; the rasa source code no longer needs to be modified. You can load the components directly as addons, build on the latest version of rasa, and follow its updates as they are released.
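Loading a component as an addon means that the pipeline entry names the class by its full dotted path, for example `rasa_nlu_gao.extractors.jieba_pseg_extractor.JiebaPsegExtractor`. The following is a minimal sketch of how such a path can be resolved to a class with the standard library. `load_class` is a hypothetical helper written for illustration, not Rasa's actual loader, and the demo resolves a standard-library class so that it runs without rasa-nlu-gao installed.

```python
import importlib


def load_class(dotted_path):
    """Resolve 'package.module.ClassName' to the class object (hypothetical helper)."""
    module_name, _, class_name = dotted_path.rpartition(".")
    if not module_name:
        raise ValueError("expected a dotted path, got %r" % dotted_path)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Demo with a standard-library class. With rasa-nlu-gao installed, the same call
# could be made with a path such as
# "rasa_nlu_gao.extractors.jieba_pseg_extractor.JiebaPsegExtractor".
ordered_dict_cls = load_class("collections.OrderedDict")
print(ordered_dict_cls.__name__)
```

Because the class is found by import path, nothing has to be registered inside the rasa source tree; the package only needs to be importable.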
The features added so far are listed below (please install the latest version of rasa-nlu-gao; edited 2019.06.24):
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "CountVectorsFeaturizer"
  token_pattern: '(?u)\b\w+\b'
- name: "EmbeddingIntentClassifier"
- name: "rasa_nlu_gao.extractors.bilstm_crf_entity_extractor.BilstmCRFEntityExtractor"
  lr: 0.001
  char_dim: 100
  lstm_dim: 100
  batches_per_epoch: 10
  seg_dim: 20
  num_segs: 4
  batch_size: 200
  tag_schema: "iobes"
  model_type: "bilstm" # two model types are supported: idcnn (dilated convolution) or bilstm (bidirectional LSTM)
  clip: 5
  optimizer: "adam"
  dropout_keep: 0.5
  steps_check: 100
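The `tag_schema: "iobes"` setting refers to the IOBES labelling scheme, in which a single-token entity is tagged `S-` and the last token of a multi-token entity is tagged `E-`. As a rough illustration only, IOB tags can be converted to IOBES as sketched below. The helper `iob_to_iobes` is hypothetical and is not taken from rasa-nlu-gao.

```python
def iob_to_iobes(tags):
    """Convert a list of IOB tags (B-X, I-X, O) to IOBES (hypothetical helper)."""
    result = []
    for i, tag in enumerate(tags):
        if tag == "O":
            result.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            result.append(tag if continues else "S-" + label)
        else:  # prefix == "I"
            result.append(tag if continues else "E-" + label)
    return result


print(iob_to_iobes(["B-LOC", "I-LOC", "I-LOC", "O", "B-PER"]))
```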
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "CRFEntityExtractor"
- name: "rasa_nlu_gao.extractors.jieba_pseg_extractor.JiebaPsegExtractor"
  part_of_speech: ["nr", "ns", "nt"]
- name: "CountVectorsFeaturizer"
  OOV_token: oov
  token_pattern: '(?u)\b\w+\b'
- name: "EmbeddingIntentClassifier"
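The `token_pattern` option is a regular expression. Written with its backslashes, `(?u)\b\w+\b` matches runs of one or more word characters, so single-character tokens are kept; that matters for Chinese, where many words are one character long. The sketch below compares it with `(?u)\b\w\w+\b`, the default in scikit-learn's `CountVectorizer`, on a whitespace-joined sentence. The sample sentence is made up for illustration, and using plain `re` instead of the featurizer itself is a simplification.

```python
import re

# Pattern from the pipeline config: one or more word characters.
CONFIG_PATTERN = r"(?u)\b\w+\b"
# scikit-learn CountVectorizer default: two or more word characters.
DEFAULT_PATTERN = r"(?u)\b\w\w+\b"

# Made-up example: tokens produced by a Chinese tokenizer, joined with spaces.
text = "我 想 找 餐厅"

config_tokens = re.findall(CONFIG_PATTERN, text)
default_tokens = re.findall(DEFAULT_PATTERN, text)

print(config_tokens)   # single-character tokens are kept
print(default_tokens)  # single-character tokens are dropped
```

In a YAML file the pattern is safest in single quotes, since backslashes inside double-quoted YAML strings are treated as escape sequences.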
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "CRFEntityExtractor"
- name: "JiebaPsegExtractor"
- name: "CountVectorsFeaturizer"
  OOV_token: oov
  token_pattern: '(?u)\b\w+\b'
- name: "EmbeddingIntentClassifier"
- name: "rasa_nlu_gao.classifiers.entity_edit_intent.EntityEditIntent"
  entity: ["nr"]
  intent: ["enter_data"]
  min_confidence: 0
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "rasa_nlu_gao.featurizers.bert_vectors_featurizer.BertVectorsFeaturizer"
  ip: '127.0.0.1'
  port: 5555
  port_out: 5556
  show_server_config: True
  timeout: 10000
- name: "EmbeddingIntentClassifier"
- name: "CRFEntityExtractor"
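The `ip`, `port`, `port_out`, `show_server_config` and `timeout` options have the same names as the constructor arguments of `BertClient` from the bert-serving-client package (bert-as-service). Assuming the featurizer talks to such a server, which this section does not state explicitly, a quick connectivity check outside Rasa could look like the sketch below. `client_kwargs` is a hypothetical helper, and `encode_with_server` only works when bert-serving-client is installed and a server is already running at that address.

```python
# Options copied from the BertVectorsFeaturizer entry in the pipeline above.
FEATURIZER_CONFIG = {
    "name": "rasa_nlu_gao.featurizers.bert_vectors_featurizer.BertVectorsFeaturizer",
    "ip": "127.0.0.1",
    "port": 5555,
    "port_out": 5556,
    "show_server_config": True,
    "timeout": 10000,
}

CLIENT_KEYS = ("ip", "port", "port_out", "show_server_config", "timeout")


def client_kwargs(component_config):
    """Pick the connection options out of a component config (hypothetical helper)."""
    return {key: component_config[key] for key in CLIENT_KEYS if key in component_config}


def encode_with_server(sentences, component_config=FEATURIZER_CONFIG):
    """Send sentences to a running bert-as-service server and return the vectors."""
    # Imported lazily so the rest of the sketch runs without the package.
    from bert_serving.client import BertClient

    client = BertClient(**client_kwargs(component_config))
    try:
        return client.encode(sentences)
    finally:
        client.close()


print(client_kwargs(FEATURIZER_CONFIG))
```

If `encode_with_server(["你好"])` returns an array, the server is reachable with the configured ports; the timeout value is in milliseconds.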
EmbeddingIntentClassifier and ner_bilstm_crf, the two components that use TensorFlow, can be configured as follows (config_proto is optional; if it is omitted, the defaults use all available resources):
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "CountVectorsFeaturizer"
  token_pattern: '(?u)\b\w+\b'
- name: "EmbeddingIntentClassifier"
  config_proto: {
    "device_count": 4,
    "inter_op_parallelism_threads": 0,
    "intra_op_parallelism_threads": 0,
    "allow_growth": True
  }
- name: "rasa_nlu_gao.extractors.bilstm_crf_entity_extractor.BilstmCRFEntityExtractor"
  config_proto: {
    "device_count": 4,
    "inter_op_parallelism_threads": 0,
    "intra_op_parallelism_threads": 0,
    "allow_growth": True
  }
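The keys under `config_proto` correspond to fields of TensorFlow's session configuration (`ConfigProto`). How rasa-nlu-gao maps them is not shown here, so the sketch below is only one plausible mapping, with two assumptions: `device_count` is interpreted as the number of CPU devices, and `allow_growth` is applied to `gpu_options`. `session_config_args` and `build_config_proto` are hypothetical helper names.

```python
def session_config_args(config_proto):
    """Split a config_proto dict into ConfigProto kwargs and the allow_growth flag."""
    kwargs = {
        # Assumption: the integer is the number of CPU devices to use.
        "device_count": {"CPU": config_proto["device_count"]},
        # 0 lets TensorFlow choose the thread-pool sizes itself.
        "inter_op_parallelism_threads": config_proto.get("inter_op_parallelism_threads", 0),
        "intra_op_parallelism_threads": config_proto.get("intra_op_parallelism_threads", 0),
    }
    return kwargs, bool(config_proto.get("allow_growth", True))


def build_config_proto(config_proto):
    """Build a tf.compat.v1.ConfigProto; needs TensorFlow installed."""
    import tensorflow as tf  # imported lazily so the helper above runs without it

    kwargs, allow_growth = session_config_args(config_proto)
    config = tf.compat.v1.ConfigProto(**kwargs)
    # Assumption: allow_growth makes GPU memory grow on demand.
    config.gpu_options.allow_growth = allow_growth
    return config


example = {
    "device_count": 4,
    "inter_op_parallelism_threads": 0,
    "intra_op_parallelism_threads": 0,
    "allow_growth": True,
}
print(session_config_args(example))
```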
An embedding_bert_intent_classifier classifier has been added; the corresponding configuration is as follows:
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "rasa_nlu_gao.featurizers.bert_vectors_featurizer.BertVectorsFeaturizer"
  ip: '127.0.0.1'
  port: 5555
  port_out: 5556
  show_server_config: True
  timeout: 10000
- name: "rasa_nlu_gao.classifiers.embedding_bert_intent_classifier.EmbeddingBertIntentClassifier"
- name: "CRFEntityExtractor"
An intent_estimator_classifier_tensorflow_embedding_bert classifier has also been added; the corresponding configuration is as follows:
language: "zh"
pipeline:
- name: "JiebaTokenizer"
- name: "rasa_nlu_gao.featurizers.bert_vectors_featurizer.BertVectorsFeaturizer"
  ip: '127.0.0.1'
  port: 5555
  port_out: 5556
  show_server_config: True
  timeout: 10000
- name: "rasa_nlu_gao.classifiers.embedding_bert_intent_estimator_classifier.EmbeddingBertIntentEstimatorClassifier"
- name: "SpacyNLP"
- name: "CRFEntityExtractor"
pip install rasa-nlu-gao
For specific examples, please see rasa_chatbot_cn