sentiment_analysis_fine_grain

AI source code, version 1.0.0

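Introduction

With this repository you can train a multi-label classifier with BERT and deploy BERT for online prediction.

A short tutorial on using BERT with Chinese is also available: BERT short Chinese tutorial

An introduction to fine-grained sentiment analysis is available from AI Challenger.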

Basic Ideas

[ADD SOMETHING HERE]

Experiment on New Models

For more details, see Model/Bert_CNN_Fine_Grain_Model.py

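Performance

| Model | TextCNN (no pre-train) | TextCNN (pre-train, fine-tuning) | BERT (base_model_zh) | BERT (base_model_zh, pre-train on corpus) |
| --- | --- | --- | --- | --- |
| F1 score | 0.678 | 0.685 | ADD A NUMBER HERE | ADD A NUMBER HERE |

Notice: F1 scores are reported on the validation set.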

Usage

BERT for Multi-Label Classification [data for fine-tuning and pre-training]

export BERT_BASE_DIR=BERT_BASE_DIR/chinese_L-12_H-768_A-12
export TEXT_DIR=TEXT_DIR
nohup python run_classifier_multi_labels_bert.py \
  --task_name=sentiment_analysis \
  --do_train=true \
  --do_eval=true \
  --data_dir=$TEXT_DIR \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=512 \
  --train_batch_size=4 \
  --learning_rate=2e-5 \
  --num_train_epochs=3 \
  --output_dir=./checkpoint_bert &

1. First, download the pre-trained model from Google and put it in a folder (e.g. BERT_BASE_DIR):

 chinese_L-12_H-768_A-12 from <a href='https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip'>bert</a>

2. Second, you need training data (e.g. train.tsv) and validation data (e.g. dev.tsv); put them under a folder (e.g. TEXT_DIR). You can also download the data from here: <a href='https://pan.baidu.com/s/1ZS4dAdOIAe3DaHiwCDrLKw'>data to train bert for AI challenger-Sentiment Analysis</a>.
  
 It contains processed data that you can use both for fine-tuning on sentiment analysis and for pre-training with BERT.

 It was generated by following this notebook step by step:

 preprocess_char.ipynb

 You can also generate the data yourself, as long as the format is compatible with the processor SentimentAnalysisFineGrainProcessor (alias: sentiment_analysis).

 Data format: label1,label2,label3\there is sentence or sentences\t

 Each line has only two columns, separated by a tab (\t): the first is the target (one or more labels), the second is the input string.

 The input does not need to be tokenized.
 
 sample:"0_1,1_-2,2_-2,3_-2,4_1,5_-2,6_-2,7_-2,8_1,9_1,10_-2,11_-2,12_-2,13_-2,14_-2,15_1,16_-2,17_-2,18_0,19_-2 浦东五莲路站，老饭店福瑞轩属于上海的本帮菜，交通方便，最近又重新装修，来拨草了，饭店活动满188元送50元钱，环境干净，简单。朋友提前一天来预订包房也没有订到，只有大堂，五点半到店基本上每个台子都客满了，都是附近居民，每道冷菜量都比以前小，味道还可以，热菜烤茄子，炒河虾仁，脆皮鸭，照牌鸡，小牛排，手撕腊味花菜等每道菜都很入味好吃，会员价划算，服务员人手太少，服务态度好，要能团购更好。可以用支付宝方便"
 
 Check the sample data in the ./BERT_BASE_DIR folder.

 For more detail, see create_model and SentimentAnalysisFineGrainProcessor in run_classifier.py.
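
 The data format above can be illustrated with a minimal parsing sketch. It is not the repository's processor code. It assumes that each label has the form aspect_polarity (as in the sample line), that aspects are numbered 0 to 19, and that the polarity values are 1, 0, -1 and -2. The function names and the 80-dimensional multi-hot layout are hypothetical; the real label-to-index mapping may differ.

```python
from typing import List, Tuple

NUM_ASPECTS = 20             # assumption: aspects are numbered 0..19, as in the sample line
POLARITIES = (1, 0, -1, -2)  # assumption: the four possible values per aspect


def parse_line(line: str) -> Tuple[List[str], str]:
    """Split one tab-separated line into its label list and its raw text."""
    columns = line.rstrip("\n").split("\t")
    labels = columns[0].split(",")
    text = columns[1]
    return labels, text


def to_multi_hot(labels: List[str]) -> List[int]:
    """Turn labels such as '4_1' into a multi-hot vector (hypothetical layout)."""
    vector = [0] * (NUM_ASPECTS * len(POLARITIES))
    for label in labels:
        aspect, polarity = label.split("_")
        index = int(aspect) * len(POLARITIES) + POLARITIES.index(int(polarity))
        vector[index] = 1
    return vector


# A shortened, made-up line in the same shape as the sample above.
labels, text = parse_line("0_1,1_-2,2_0\t环境干净，服务态度好\t\n")
vector = to_multi_hot(labels)
print(labels, text, sum(vector))
```

 Because every aspect contributes exactly one active position, the vector for a full 20-label line would contain 20 ones under this layout.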

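Pre-train a BERT model based on the open-source model, then do the classification task

Generate raw data: [ADD SOMETHING HERE]
Make sure each line is a sentence, with a blank line between documents.
You can find the generated data in the zip file.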
```
 use write_pre_train_doc() from preprocess_char.ipynb 
```
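
The layout described above (one sentence per line, one blank line between documents) can be sketched as follows. This is not the notebook's write_pre_train_doc(); the helper names and the punctuation-based sentence splitter are assumptions made for illustration.

```python
import re
from typing import Iterable, List


def split_sentences(document: str) -> List[str]:
    """Split a document after Chinese or Western sentence-ending punctuation."""
    pieces = re.findall(r"[^。！？!?]+[。！？!?]*", document)
    return [piece.strip() for piece in pieces if piece.strip()]


def format_pretrain_text(documents: Iterable[str]) -> str:
    """Return one sentence per line, with one blank line between documents."""
    blocks = []
    for document in documents:
        sentences = split_sentences(document)
        if sentences:  # skip empty documents so no double blank lines appear
            blocks.append("\n".join(sentences))
    return "\n\n".join(blocks) + "\n"


docs = ["环境干净。味道还可以！", "交通方便。"]
output = format_pretrain_text(docs)
print(output)
```

The resulting string can be written to a text file that matches the input pattern used in the next step.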

Generate data for the pre-training stage using:

export BERT_BASE_DIR=./BERT_BASE_DIR/chinese_L-12_H-768_A-12
nohup python create_pretraining_data.py \
  --input_file=./PRE_TRAIN_DIR/bert_*_pretrain.txt \
  --output_file=./PRE_TRAIN_DIR/tf_examples.tfrecord \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --do_lower_case=True \
  --max_seq_length=512 \
  --max_predictions_per_seq=60 \
  --masked_lm_prob=0.15 \
  --random_seed=12345 \
  --dupe_factor=5 > nohup_pre.out &
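
To see how masked_lm_prob and max_predictions_per_seq interact, here is a small arithmetic sketch. It assumes the script follows the upstream google-research/bert logic, where the number of masked positions is a rounded share of the sequence length, at least 1, and capped by max_predictions_per_seq. The function name is hypothetical.

```python
def num_masked_tokens(num_tokens: int,
                      masked_lm_prob: float = 0.15,
                      max_predictions_per_seq: int = 60) -> int:
    """Masked positions per sequence: a rounded share, at least 1, capped."""
    wanted = max(1, int(round(num_tokens * masked_lm_prob)))
    return min(max_predictions_per_seq, wanted)


# Sequence lengths up to max_seq_length=512, with the flag values used above.
for length in (5, 128, 400, 512):
    print(length, num_masked_tokens(length))
```

With these settings, a full 512-token sequence would ask for about 77 masked positions, so the cap of 60 is what applies; sequences of roughly 400 tokens or more all hit the cap.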

Pre-train the model with the generated data:
python run_pretraining.py
Fine-tune:
python run_classifier.py

TextCNN

Download the cache file for sentiment analysis (tokens are at word level).
Train the model:
python train_cnn_fine_grain.py

 The cache file for the TextCNN model was generated by following the steps in preprocess_word.ipynb.

 It contains everything you need to run TextCNN: the processed train/validation/test sets, the word vocabulary, and a dict that maps labels to indices.

 Take train_valid_test_vocab_cache.pik and put it under the preprocess_word/ folder.

 The raw data are also included in this zip file.
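
 For a quick look at what the cache holds, a sketch like the following may help. It assumes the .pik file is a standard Python pickle, which the file name suggests but the text does not confirm. The internal layout of the real file is not documented here, so the demonstration uses a made-up stand-in object with hypothetical keys.

```python
import os
import pickle
import tempfile


def load_cache(path: str):
    """Load the pickled cache object; only unpickle files from a source you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Stand-in cache for demonstration only; the real file's layout may differ.
fake_cache = {"train": [], "valid": [], "test": [],
              "vocab": {"好": 0}, "label2index": {"0_1": 0}}
demo_path = os.path.join(tempfile.mkdtemp(), "train_valid_test_vocab_cache.pik")
with open(demo_path, "wb") as handle:
    pickle.dump(fake_cache, handle)

cache = load_cache(demo_path)
print(type(cache).__name__, sorted(cache))
```

 Printing the type and the top-level keys (or the tuple length) of the real object is usually enough to find out how the train/validation/test sets, vocabulary and label dict are arranged.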

Pre-train TextCNN

Pre-train TextCNN with a masked language model:
python train_cnn_lm.py
Fine-tune TextCNN:
python train_cnn_fine_grain.py
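
The masked language model objective used for pre-training can be sketched in a simplified form. This is not the code in train_cnn_lm.py: the function name and the mask symbol are hypothetical, and every selected token is replaced by the mask symbol, whereas BERT-style masking also keeps or randomly swaps some of the selected tokens.

```python
import random
from typing import List, Tuple

MASK_TOKEN = "[MASK]"  # hypothetical mask symbol


def mask_tokens(tokens: List[str], mask_prob: float = 0.15,
                seed: int = 12345) -> Tuple[List[str], List[int], List[str]]:
    """Replace a random share of tokens with MASK_TOKEN.

    Returns the masked sequence, the masked positions, and the original
    tokens at those positions (the prediction targets).
    """
    if not tokens:
        return [], [], []
    rng = random.Random(seed)
    num_to_mask = max(1, int(round(len(tokens) * mask_prob)))
    positions = sorted(rng.sample(range(len(tokens)), num_to_mask))
    masked = list(tokens)
    for position in positions:
        masked[position] = MASK_TOKEN
    targets = [tokens[position] for position in positions]
    return masked, positions, targets


# Word-level tokens, as used by the TextCNN cache.
tokens = ["环境", "干净", "味道", "还", "可以", "服务", "态度", "好"]
masked, positions, targets = mask_tokens(tokens)
print(masked, positions, targets)
```

During pre-training the model sees the masked sequence and is trained to predict the targets at the masked positions; fine-tuning then starts from those weights.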

Deploy BERT for online prediction

 With the session-and-feed style you can easily deploy BERT.

Online prediction with BERT; check more details here

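References

Bidirectional Encoder Representations from Transformers for Language Understanding
google-research/bert
pengshuang/AI-Comp
AI Challenger 2018
Convolutional Neural Networks for Sentence Classification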

Additional information

Version: 1.0.0
Type: AI source code
Updated: 2025-09-06
Size: 3.31MB
Source: GitHub
