The medical knowledge graph and automatic question answering system are built with reference to project [1]. The following optimizations have been made on top of it:
After the knowledge graph and question answering system were built, front-end interaction and KG display were added, using the force-directed graph of ECharts. The implementation follows project [4].
neo4j-community-4.1.4 % bin/neo4j start
medical_knowledge_graph_app-master % python med_kg/manage.py runserver
img screenshots of the functional interface
kg/prepare_data crawler files
kg/data/medical_rebuild.json final processed data
kg/build_medicalgraph.py creates the neo4j graph database
med_kg/el_model entity linking model
med_kg/el_model/embedding embeddings of the disease/drug/symptom dictionaries
med_kg/el_model/entity_linking.py entity linking script
med_kg/med_kg django views and configuration files
med_kg/templates django templates
med_kg/static front-end bootstrap files
med_kg/ner_model named entity recognition model
med_kg/ner_model/models code of the named entity recognition model
med_kg/ner_model/data data for training the model
med_kg/ner_model/losses loss functions for training the model
med_kg/ner_model/outpus/1101medselfner-finetune fine-tuned (trained) model
med_kg/ner_model/prev_trained_model pretrained PyTorch model
med_kg/util utilities, cold start
med_kg/Model scripts that interact with the neo4j graph database
med_kg/MedModel automatic question answering
med_kg/MedModel/question_classifier.py intent recognition script
med_kg/MedModel/question_parser.py script that converts the recognized mentions and intents into query statements
med_kg/MedModel/answer_search.py queries the graph database and returns the answer
med_kg/MedModel/dict domain dictionary
An asterisk (*) marks an item that was changed relative to the original project.
| Entity type | Meaning | Number of entities | Examples |
|---|---|---|---|
| Check | Diagnostic examination items | 3,353 | Bronchography; arthroscopy |
| Department | Medical departments | 54 | Plastic surgery department; burn department |
| Disease | Diseases | 8,807 | Thromboangiitis obliterans; descending thoracic aortic aneurysm |
| Drug | Drugs | 3,828 | Jingwanhong Hemorrhoid Cream; Brinzolamide Eye Drops |
| Food | Foods | 4,870 | Tomato and vegetables beef ball soup; bamboo shoots stewed with lamb |
| Producer | Major categories of medicines | 17,201 | Tongyao Pharmaceutical Penicillin V potassium tablets; Qingyang dexamethasone acetate tablets |
| Symptom* | Disease symptoms | 4,377 | Hypertrophy of breast tissue; deep bleeding in the brain parenchyma |
| Total | Total | 44,111 | About 44,000 entities |
An asterisk (*) marks an item that was changed relative to the original project.
| Relationship type | Meaning | Number of relationships | Example |
|---|---|---|---|
| belongs_to | Belongs to | 8,844 | <Gynecology, belongs to, obstetrics and gynecology> |
| common_drug | Commonly used drugs for a disease | 14,649 | <Yangqiang, commonly used drug, phentolamine mesylate dispersible tablets> |
| do_eat | Foods suitable for a disease | 22,238 | <Suppleness fracture, suitable food, black fish> |
| drugs_of | Drugs on sale | 17,315 | <Penicillin V potassium tablets, on sale as, Tongyao Pharmaceutical Penicillin V potassium tablets> |
| need_check | Examinations required for a disease | 39,422 | <Unilateral emphysema, required examination, bronchography> |
| no_eat | Foods to avoid for a disease | 22,247 | <Lip disease, food to avoid, almonds> |
| recommended_drug | Recommended drugs for a disease | 59,467 | <Mixed hemorrhoids, recommended drug, Jingwanhong hemorrhoid cream> |
| recommended_eat | Recommended recipes for a disease | 40,221 | <Halvesting, recommended recipe, tomato and beef ball soup> |
| has_symptom* | Symptoms of a disease | 99,492 | <Early breast cancer, symptom, breast tissue hypertrophy> |
| acompany_with | Complications (disease accompanied by disease) | 12,029 | <Valve incompetence of the communicating veins of the lower limbs, complication, thromboangiitis obliterans> |
| Total | Total | 294,149 | About 300,000 relationships |
| Attribute type | Meaning | Example |
|---|---|---|
| name | Disease name | Wheezing bronchitis |
| desc | Disease description | Also known as asthmatic bronchitis... |
| cause | Causes of the disease | Common ones include syncytial viruses... |
| prevent | Preventive measures | Pay attention to the family's and the child's allergy history... |
| cure_lasttime | Treatment cycle | 6-12 months |
| cure_way | Treatment method | "Drug treatment", "supportive treatment" |
| cured_prob | Probability of cure | 95% |
| easy_get | Susceptible population | No specific population |
| Question type | Meaning | Example question | Notes |
|---|---|---|---|
| disease_symptom | Symptoms of a disease | What are the symptoms of breast cancer? | Relationship between different entity types |
| symptom_disease | Possible diseases for a known symptom | What illness causes a runny nose? | Relationship between different entity types |
| disease_cause | Causes of a disease | Why do some people suffer from insomnia? | Entity attribute |
| disease_acompany | Complications of a disease | What are the complications of insomnia? | Relationship between entities of the same type |
| disease_not_food | Foods to avoid for a disease | What should people with insomnia not eat? | Relationship between different entity types |
| disease_do_food | Foods recommended for a disease | What should I eat if I have tinnitus? | Relationship between different entity types |
| food_not_disease | Diseases for which a food should be avoided | Who should avoid eating honey? | Relationship between different entity types |
| food_do_disease | Diseases for which a food is beneficial | What are the benefits of goose meat? | Relationship between different entity types |
| disease_drug | Drugs to take for a disease | What medicine should I take for liver disease? | Relationship between different entity types |
| drug_disease | Diseases a drug can treat | What diseases can isatis root granules treat? | Relationship between different entity types |
| disease_check | Examinations needed for a disease | How can meningitis be detected? | Relationship between different entity types |
| check_disease | Diseases an examination can detect | What can a complete blood count detect? | Relationship between different entity types |
| disease_prevent | Preventive measures | How to prevent kidney deficiency? | Entity attribute |
| disease_lasttime | Treatment cycle | How long does a cold take to heal? | Entity attribute |
| disease_cureway | Treatment method | How to treat hypertension? | Entity attribute |
| disease_cureprob | Probability of cure | Can leukemia be cured? | Entity attribute |
| disease_easyget | Susceptible population | Who is prone to hypertension? | Entity attribute |
| disease_desc | Disease description | What is diabetes? | Entity attribute |
| disease_getprob(todo) | Probability of illness | How high is the prevalence of diabetes? | Entity attribute |
(1) Mention recognition: dictionary-based matching + NER based on BERT_CRF; the longer of the two results is taken as the mention.
(2) Entity linking: based on SBERT semantic matching. The embeddings of the domain dictionary are stored in advance, and the mention is matched against the domain dictionary. Among the 20 candidate entities with the highest similarity, a candidate is taken as the target entity if the number of characters it shares with the mention is greater than or equal to half the length of the mention.
(3) Intent recognition: based on question words + domain dictionary. For example, in the question "What illness causes dry eyes?", the mention is "symptom: dry eyes" and the disease question word is "illness", so the intent is taken to be symptom_disease: find possible diseases for a known symptom.
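
Step (3) can be sketched roughly as follows. This is only an illustration of the "question words + mention type" idea, not the code in question_classifier.py: the question-word lists, the function name `classify_intent` and the fallback to disease_desc are assumptions made for this example.

```python
from typing import List, Set

# Hypothetical question-word lists; the real project keeps its own lists.
DISEASE_QWDS = ["what illness", "what disease"]
SYMPTOM_QWDS = ["symptom"]
CAUSE_QWDS = ["why", "cause"]


def classify_intent(question: str, mention_types: Set[str]) -> List[str]:
    """Combine question words with the types of the recognized mentions."""
    q = question.lower()
    intents = []
    # A symptom mention plus a disease question word -> symptom_disease.
    if "symptom" in mention_types and any(w in q for w in DISEASE_QWDS):
        intents.append("symptom_disease")
    # A disease mention plus a symptom question word -> disease_symptom.
    if "disease" in mention_types and any(w in q for w in SYMPTOM_QWDS):
        intents.append("disease_symptom")
    # A disease mention plus a cause question word -> disease_cause.
    if "disease" in mention_types and any(w in q for w in CAUSE_QWDS):
        intents.append("disease_cause")
    # Assumed fallback: a bare disease mention is treated as a description request.
    if not intents and "disease" in mention_types:
        intents.append("disease_desc")
    return intents


example_intents = classify_intent("What illness causes dry eyes?", {"symptom"})
```

The remaining question types in the table above would be added as further rules of the same shape.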

Mention recognition: BERT-based NER combined with domain dictionary matching extracts the symptom mention "nose mucosal swelling" from the query.
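
A minimal sketch of the "take the longer of the two" rule is given below. The helper names `dict_match` and `pick_mention` and the toy dictionary are hypothetical, the BERT_CRF output is hard-coded rather than predicted, and preferring the dictionary hit on a tie is an assumption.

```python
from typing import Dict, Optional, Tuple

Mention = Tuple[str, str]  # (surface text, entity type)


def dict_match(question: str, dictionary: Dict[str, str]) -> Optional[Mention]:
    """Return the longest dictionary term that occurs in the question, if any."""
    hits = [(term, etype) for term, etype in dictionary.items() if term in question]
    return max(hits, key=lambda hit: len(hit[0])) if hits else None


def pick_mention(dict_hit: Optional[Mention], ner_hit: Optional[Mention]) -> Optional[Mention]:
    """Keep the longer of the dictionary hit and the NER hit."""
    if dict_hit is None:
        return ner_hit
    if ner_hit is None:
        return dict_hit
    # Assumption: on a tie the dictionary hit wins.
    return ner_hit if len(ner_hit[0]) > len(dict_hit[0]) else dict_hit


toy_dict = {"swelling": "symptom"}  # toy dictionary, not the project's
question = "what causes nose mucosal swelling"
dict_hit = dict_match(question, toy_dict)
ner_hit = ("nose mucosal swelling", "symptom")  # stand-in for the BERT_CRF output
mention = pick_mention(dict_hit, ner_hit)
```

Here the dictionary only finds "swelling", so the longer NER span is kept as the mention.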

Entity linking: SBERT sentence matching links the mention "nose mucosal swelling" to the target entity "nose mucosa swelling".
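
The linking rule (top-20 candidates by similarity, then an overlap check) might look like the sketch below. It is not the code in entity_linking.py: in the project the vectors would come from SBERT and the stored dictionary embeddings, whereas here they are hand-written toy vectors, and counting mention characters that also occur in the candidate is only one possible reading of the overlap criterion.

```python
import math
from typing import Dict, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def overlap_count(mention: str, candidate: str) -> int:
    """Number of characters of the mention that also occur in the candidate."""
    return sum(1 for ch in mention if ch in candidate)


def link_entity(mention: str, mention_vec: Sequence[float],
                entity_vecs: Dict[str, Sequence[float]], top_k: int = 20) -> Optional[str]:
    """Return the best top-k candidate sharing at least half of the mention's characters."""
    ranked = sorted(entity_vecs,
                    key=lambda name: cosine(mention_vec, entity_vecs[name]),
                    reverse=True)
    for name in ranked[:top_k]:
        if overlap_count(mention, name) * 2 >= len(mention):
            return name
    return None


# Toy 2-d vectors standing in for stored SBERT embeddings.
toy_entities = {"nose mucosa swelling": [1.0, 0.1], "headache": [0.0, 1.0]}
linked = link_entity("nose mucosal swelling", [0.9, 0.2], toy_entities)
```

Character overlap is most meaningful for Chinese text, where each character carries meaning; for English strings a token-level overlap would probably be a better fit.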




One or more mentions are identified in the question and linked to the corresponding KG entities; the answer is then retrieved by combining the linked entities with the result of intent recognition.
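
As a rough illustration of this last step, a linked entity and an intent can be turned into a parameterized Cypher query. The templates are assumptions rather than the statements in question_parser.py: they use the labels, relationship names and attributes from the tables above, assume that has_symptom points from Disease to Symptom, and cover only three intents.

```python
from typing import Dict, Tuple

# Hypothetical templates; $name is a Cypher query parameter.
QUERY_TEMPLATES = {
    "disease_symptom": (
        "MATCH (d:Disease)-[:has_symptom]->(s:Symptom) "
        "WHERE d.name = $name RETURN s.name AS answer"
    ),
    "symptom_disease": (
        "MATCH (d:Disease)-[:has_symptom]->(s:Symptom) "
        "WHERE s.name = $name RETURN d.name AS answer"
    ),
    "disease_cause": (
        "MATCH (d:Disease) WHERE d.name = $name RETURN d.cause AS answer"
    ),
}


def build_query(intent: str, entity_name: str) -> Tuple[str, Dict[str, str]]:
    """Return the Cypher text and its parameters for one intent/entity pair."""
    if intent not in QUERY_TEMPLATES:
        raise KeyError(f"no query template for intent {intent!r}")
    return QUERY_TEMPLATES[intent], {"name": entity_name}


query, params = build_query("symptom_disease", "nose mucosa swelling")
```

Passing the entity name as a parameter instead of formatting it into the string avoids quoting problems when names contain apostrophes.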

When a model that was saved on a GPU is loaded on a CPU-only machine, the following error occurs:
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Following [5], one workaround is to edit the load function in ./site-package/torch/serialization.py, changing def load(f, map_location=None, pickle_module=pickle, **pickle_load_args): to def load(f, map_location='cpu', pickle_module=pickle, **pickle_load_args):
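
An alternative that avoids editing the installed library is to pass map_location at the call site, as the error message itself suggests. The helper name `load_checkpoint_on_cpu` below is hypothetical, and where the corresponding torch.load call sits in this project's code is not shown here.

```python
def load_checkpoint_on_cpu(path):
    """Load a checkpoint saved on a GPU machine onto the CPU."""
    # Imported inside the function so the helper can be defined without torch installed.
    import torch

    # map_location remaps CUDA storages to the CPU during deserialization.
    return torch.load(path, map_location=torch.device("cpu"))
```

This keeps the fix inside the project, so it survives reinstalling or upgrading PyTorch.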
[1] https://github.com/liuhuanyong/QASystemOnMedicalKG
[2] https://github.com/lonePatient/BERT-NER-Pytorch
[3] https://github.com/UKPLab/sentence-transformers
[4] https://github.com/jiangnanboy/movie_knowledge_graph_app
[5] https://stackoverflow.com/questions/56369030/runtimeerror-attempting-to-deserialize-object-on-a-cuda-device