HierarchyTransformers
v0.1.1 - Refactor code and add customised HiT trainer
Project | Hugging Face | arXiv | Zenodo
Embedding hierarchies with language models.
News (Changelog)
v0.1.0 - Significant development to align with sentence-transformers>=3.4.0.dev0.
v0.0.3 - Initial release (with sentence-transformers<3.0.0) and bug fixes.

Hierarchy Transformer (HiT) is a framework that enables transformer encoder-based language models (LMs) to learn hierarchies in hyperbolic space. The main idea is to construct a Poincaré ball that directly circumscribes the output embedding space of the LMs, leveraging the exponential expansion of hyperbolic space to organise entities hierarchically. In addition to presenting this framework (see the code on GitHub), we are committed to training and releasing HiT models for various hierarchies. Models and datasets will be accessible on HuggingFace.
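As a minimal sketch of this geometric setup (assuming curvature c = 1/d, which gives a ball of radius √d circumscribing a d-dimensional embedding space; see the released code for the exact configuration), such a manifold can be constructed with geoopt:

import geoopt
import torch

embed_dim = 384  # illustrative; e.g. the output dimension of a MiniLM encoder
# a Poincaré ball of radius sqrt(embed_dim) that circumscribes the
# LM's output embedding space (a sketch, not the released implementation)
manifold = geoopt.PoincareBall(c=1.0 / embed_dim)

# hyperbolic distance grows rapidly towards the boundary, which is what
# lets deeper (more specific) entities fan out without crowding
x = manifold.projx(torch.randn(embed_dim))
y = manifold.projx(torch.randn(embed_dim))
print(manifold.dist(x, y))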
This repository follows a layout similar to the sentence-transformers library, and the main model class directly extends the Sentence Transformer architecture. We also use deeponto for extracting hierarchies from source data and constructing datasets from hierarchies, and geoopt for arithmetic in hyperbolic space.
sentence-transformers==3.3.1 contains bugs in evaluation that are fixed in its GitHub dev version sentence-transformers==3.4.0.dev0; please update this dependency manually until the official 3.4.0 release.
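Until then, the dev version can be installed directly from GitHub with a standard pip-from-git command, e.g.:

pip install git+https://github.com/UKPLab/sentence-transformers.git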
# requiring Python>=3.9
pip install hierarchy_transformers

Or install from the GitHub repository:

pip install git+https://github.com/KRR-Oxford/HierarchyTransformers.git

Our HiT models and datasets are released on the HuggingFace Hub.
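Datasets can be loaded with the HuggingFace datasets library. A minimal sketch, assuming a hypothetical repository name (check the Hierarchy-Transformers organisation on the Hub for the exact datasets):

from datasets import load_dataset

# hypothetical dataset repository name, for illustration only
dataset = load_dataset("Hierarchy-Transformers/WordNetNoun")

To get started with a released HiT model: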
from hierarchy_transformers import HierarchyTransformer
# load the model
model = HierarchyTransformer.from_pretrained('Hierarchy-Transformers/HiT-MiniLM-L12-WordNetNoun')
# entity names to be encoded.
entity_names = ["computer", "personal computer", "fruit", "berry"]
# get the entity embeddings
entity_embeddings = model.encode(entity_names)

Use the entity embeddings to predict the subsumption relationships between them.
# suppose we want to compare "personal computer" and "computer", "berry" and "fruit"
child_entity_embeddings = model.encode(["personal computer", "berry"], convert_to_tensor=True)
parent_entity_embeddings = model.encode(["computer", "fruit"], convert_to_tensor=True)
# compute the hyperbolic distances and norms of entity embeddings
dists = model.manifold.dist(child_entity_embeddings, parent_entity_embeddings)
child_norms = model.manifold.dist0(child_entity_embeddings)
parent_norms = model.manifold.dist0(parent_entity_embeddings)
# use the empirical function for subsumption prediction proposed in the paper
# `centri_score_weight` and the overall threshold are determined on the validation set
subsumption_scores = - (dists + centri_score_weight * (parent_norms - child_norms))

Use the example scripts in our repository to reproduce existing models and to train and evaluate your own.
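To make the probing step above concrete, here is a minimal sketch of turning the scores into binary predictions; the weight and threshold values below are placeholders, not values from the paper (both are determined on the validation set):

# placeholder values; tune both on the validation set
centri_score_weight = 1.0
threshold = -5.0

subsumption_scores = -(dists + centri_score_weight * (parent_norms - child_norms))
# scores above the threshold are predicted as valid subsumptions
predictions = subsumption_scores > threshold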
Copyright 2023 Yuan He.
All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
If you find this repository or the released models useful, please cite our publication:
Yuan He, Zhangdie Yuan, Jiaoyan Chen, and Ian Horrocks. Language Models as Hierarchy Encoders. To appear at NeurIPS 2024. [arXiv] [NeurIPS]
@article{he2024language,
title={Language Models as Hierarchy Encoders},
author={He, Yuan and Yuan, Zhangdie and Chen, Jiaoyan and Horrocks, Ian},
journal={arXiv preprint arXiv:2401.11374},
year={2024}
}