inseq

v0.6.0: Context Attribution CLI, New Attribution Methods, Performance Improvements and more

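Interpretability for Sequence Generation Models

Inseq is a PyTorch-based hackable toolkit that democratizes access to common post-hoc interpretability analyses of sequence generation models.

Installation

Inseq is available on PyPI and can be installed with pip for Python >= 3.10, <= 3.12: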

# Install latest stable version
pip install inseq

# Alternatively, install latest development version
pip install git+https://github.com/inseq-team/inseq.git

To install the extras for visualization in Jupyter notebooks and for Hugging Face datasets attribution, use pip install inseq[notebook,datasets].
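Before installing, it can help to confirm that the interpreter falls inside the supported range. The following is a minimal sketch; the helper name is_supported is hypothetical and not part of Inseq, and the bounds are copied from the install note above.

```python
import sys

# Supported range taken from the install note above: Python >= 3.10, <= 3.12.
MIN_SUPPORTED = (3, 10)
MAX_SUPPORTED = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    `version` is any tuple-like starting with (major, minor); it defaults to
    the running interpreter. This helper is illustrative, not an Inseq API.
    """
    major, minor = tuple(version or sys.version_info)[:2]
    return MIN_SUPPORTED <= (major, minor) <= MAX_SUPPORTED


print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported: {is_supported()}")
```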

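Dev Installation

To install the package, clone the repository and run the following commands:

cd inseq
make uv-download # Download and install the UV package manager
make install # Installs the package and all dependencies

Library developers can use the make install-dev command to install all development dependencies (quality, docs, extras).

After installation, you should be able to run make fast-test and make lint without errors.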

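FAQ Installation

Installing the tokenizers package requires a Rust compiler. You can install Rust from https://rustup.rs and add $HOME/.cargo/env to your PATH.
Installing sentencepiece requires several system packages; install them with sudo apt-get install cmake build-essential pkg-config or brew install cmake gperftools pkg-config.

Example usage in Python

This example uses the Integrated Gradients attribution method to attribute the English-French translation of a sentence taken from the WinoMT corpus: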

import inseq

model = inseq.load_model("Helsinki-NLP/opus-mt-en-fr", "integrated_gradients")
out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    n_steps=100
)
out.show()

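This produces a visualization of the attribution scores for each token in the input sentence (token-level aggregation is handled automatically). In a Jupyter notebook, the visualization is rendered inline in the cell output.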
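To give an intuition for what n_steps controls, the sketch below approximates integrated gradients on a toy function with a midpoint Riemann sum. It is not Inseq's or Captum's implementation: the function f, its analytic gradient and the all-zero baseline are assumptions chosen so that the result can be checked by hand.

```python
def f(x):
    """Toy differentiable 'model': f(x) = x0 * x1 ** 2 (hypothetical)."""
    return x[0] * x[1] ** 2


def grad_f(x):
    """Analytic gradient of f with respect to x0 and x1."""
    return [x[1] ** 2, 2 * x[0] * x[1]]


def integrated_gradients(x, baseline, n_steps):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(n_steps):
        alpha = (k + 0.5) / n_steps  # midpoint of the k-th interval on [0, 1]
        # Point on the straight path from the baseline to the input
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad_f(point)):
            avg_grad[i] += g / n_steps
    # Scale the path-averaged gradients by the input-baseline difference
    return [(xi - b) * g for xi, b, g in zip(x, baseline, avg_grad)]


x, baseline = [1.0, 2.0], [0.0, 0.0]
for n_steps in (1, 10, 100):
    attr = integrated_gradients(x, baseline, n_steps)
    # Completeness: attributions should sum to f(x) - f(baseline)
    gap = abs(sum(attr) - (f(x) - f(baseline)))
    print(f"n_steps={n_steps:>3}  attributions={attr}  completeness gap={gap:.6f}")
```

The completeness gap shrinks as n_steps grows, which is the usual reason for raising n_steps at the cost of more forward and backward passes.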

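Inseq also supports decoder-only models such as GPT-2, enabling the use of a variety of attribution methods and customizable settings directly from the console: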

import inseq

model = inseq.load_model("gpt2", "integrated_gradients")
model.attribute(
    "Hello ladies and",
    generation_args={"max_new_tokens": 9},
    n_steps=500,
    internal_batch_size=50
).show()
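The interaction between n_steps and internal_batch_size can be pictured as simple chunking. The sketch below assumes that internal_batch_size caps how many interpolated inputs are processed per forward/backward pass for a single input and a single generation step; the README does not spell this out, and batch_sizes is a hypothetical helper, not an Inseq function.

```python
def batch_sizes(n_steps, internal_batch_size):
    """Split n_steps interpolation points into chunks of at most internal_batch_size."""
    full, rest = divmod(n_steps, internal_batch_size)
    return [internal_batch_size] * full + ([rest] if rest else [])


chunks = batch_sizes(500, 50)
print(f"{len(chunks)} passes, sizes: {chunks}")
```

Under that assumption, a smaller internal_batch_size lowers peak memory use and increases the number of passes.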

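Features

Feature attribution of sequence generation for most ForConditionalGeneration (encoder-decoder) and ForCausalLM (decoder-only) models from Hugging Face Transformers.
Support for multiple feature attribution methods, extending the ones supported by Captum.
Post-processing, filtering and merging of attribution maps via Aggregator classes.
Attribution visualization in notebooks, browser and command line.
Efficient attribution of single examples or entire Hugging Face datasets with the Inseq CLI.
Custom attribution of target functions, supporting advanced methods such as contrastive feature attributions and context reliance detection.
Extraction and visualization of custom scores (e.g. probability, entropy) at every generation step alongside attribution maps.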
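As an illustration of what post-processing an attribution map can involve, the sketch below collapses per-dimension attributions to one score per token and then merges subword pieces into words. It does not use Inseq's Aggregator classes: the function names, the L2-norm reduction, the sum used for merging and the SentencePiece-style word marker are all assumptions made for this example.

```python
import math

WORD_START = "\u2581"  # SentencePiece-style word-boundary marker (assumption)


def l2_norm_per_token(attributions):
    """Collapse a [tokens x hidden_dims] attribution matrix to one score per token."""
    return [math.sqrt(sum(v * v for v in row)) for row in attributions]


def merge_subwords(tokens, scores):
    """Merge subword pieces into words, summing the scores of their pieces."""
    words, merged = [], []
    for token, score in zip(tokens, scores):
        if token.startswith(WORD_START) or not words:
            # A marked piece (or the very first piece) opens a new word
            words.append(token.lstrip(WORD_START))
            merged.append(score)
        else:
            # An unmarked piece continues the previous word
            words[-1] += token
            merged[-1] += score
    return words, merged


tokens = [WORD_START + "The", WORD_START + "develop", "er", WORD_START + "argued"]
attributions = [[3.0, 4.0], [0.0, 1.0], [1.0, 0.0], [0.0, 2.0]]  # made-up values

token_scores = l2_norm_per_token(attributions)
words, word_scores = merge_subwords(tokens, token_scores)
print(list(zip(words, word_scores)))
```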

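Supported methods

Use the inseq.list_feature_attribution_methods function to list all available method identifiers and inseq.list_step_functions to list all available step functions. The following methods are currently supported:

Gradient-based attribution

saliency: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps (Simonyan et al., 2013)
input_x_gradient: Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps (Simonyan et al., 2013)
integrated_gradients: Axiomatic Attribution for Deep Networks (Sundararajan et al., 2017)
deeplift: Learning Important Features Through Propagating Activation Differences (Shrikumar et al., 2017)
gradient_shap: A Unified Approach to Interpreting Model Predictions (Lundberg and Lee, 2017)
discretized_integrated_gradients: Discretized Integrated Gradients for Explaining Language Models (Sanyal and Ren, 2021)
sequential_integrated_gradients: Sequential Integrated Gradients: a simple but effective method for explaining language models (Enguehard, 2023)

Internals-based attribution

attention: Attention Weight Attribution, from Neural Machine Translation by Jointly Learning to Align and Translate (Bahdanau et al., 2014)

Perturbation-based attribution

occlusion: Visualizing and Understanding Convolutional Networks (Zeiler and Fergus, 2014)
lime: "Why Should I Trust You?": Explaining the Predictions of Any Classifier (Ribeiro et al., 2016)
value_zeroing: Quantifying Context Mixing in Transformers (Mohebbi et al., 2023)
reagent: ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models (Zhao et al., 2024)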
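The identifiers listed above can also be retrieved programmatically with the two listing functions named in this section. The sketch below wraps them in a hypothetical helper that returns None when Inseq is not installed, so it can be run safely in any environment; it only assumes that both functions return a collection of string identifiers.

```python
def list_inseq_identifiers():
    """Return available method and step-function identifiers, or None if Inseq is missing."""
    try:
        import inseq
    except ImportError:
        return None
    return {
        "attribution_methods": inseq.list_feature_attribution_methods(),
        "step_functions": inseq.list_step_functions(),
    }


identifiers = list_inseq_identifiers()
if identifiers is None:
    print("Inseq is not installed; run `pip install inseq` first.")
else:
    for kind, names in identifiers.items():
        print(kind, sorted(names))
```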

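Step functions

Step functions are used to extract custom scores from the model at each step of the attribution process, by passing them to the step_scores argument of model.attribute. They can also be used as targets for attribution methods that rely on model outputs (e.g. gradient-based methods) by passing them as the attributed_fn argument. The following step functions are currently supported:

logits: Logits of the target token.
probability: Probability of the target token. Can also be used for log-probability by passing logprob=True.
entropy: Entropy of the predictive distribution.
crossentropy: Cross-entropy loss between the target token and the predicted distribution.
perplexity: Perplexity of the target token.
contrast_logits / contrast_prob: Logits / probabilities of the target token when different contrastive inputs are provided to the model. Equivalent to logits / probability when no contrastive inputs are provided.
contrast_logits_diff / contrast_prob_diff: Difference in logits / probability between the original and foil target token pair; can be used for contrastive evaluation as in contrastive attribution (Yin and Neubig, 2022).
pcxmi: Point-wise Contextual Cross-Mutual Information (P-CXMI) for the target token given original and contrastive contexts (Yin et al., 2021).
kl_divergence: KL divergence of the predictive distribution given original and contrastive contexts. Can be restricted to the most likely target token options using the top_k and top_p parameters.
in_context_pvi: In-context Pointwise V-usable Information (PVI) to measure the amount of contextual information used in model predictions (Lu et al., 2023).
mc_dropout_prob_avg: Average probability of the target token across multiple samples using MC Dropout (Gal and Ghahramani, 2016).
top_p_size: The number of tokens with cumulative probability greater than top_p in the predictive distribution of the model.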
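The first few step functions in the list can be understood from a toy example computed on the logits of a single generation step. This is a sketch rather than Inseq's code: the four-token vocabulary and logit values are made up, natural logarithms are assumed, and the "logprob" key is only a label for what probability returns with logprob=True.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def step_scores(logits, target_id):
    """Toy versions of a few step scores for one generation step."""
    probs = softmax(logits)
    p_target = probs[target_id]
    return {
        "logits": logits[target_id],
        "probability": p_target,
        "logprob": math.log(p_target),
        # Entropy of the full predictive distribution, in nats
        "entropy": -sum(p * math.log(p) for p in probs if p > 0),
        # With a one-hot target, cross-entropy reduces to -log p(target)
        "crossentropy": -math.log(p_target),
    }


scores = step_scores([2.0, 1.0, 0.0, -1.0], target_id=0)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```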

The following example computes contrastive attributions using the contrast_prob_diff step function:

import inseq

attribution_model = inseq.load_model("gpt2", "input_x_gradient")

# Perform the contrastive attribution:
# Regular (forced) target -> "The manager went home because he was sick"
# Contrastive target      -> "The manager went home because she was sick"
out = attribution_model.attribute(
    "The manager went home because",
    "The manager went home because he was sick",
    attributed_fn="contrast_prob_diff",
    contrast_targets="The manager went home because she was sick",
    # We also visualize the corresponding step score
    step_scores=["contrast_prob_diff"]
)
out.show()
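To make the quantity being attributed more concrete, the sketch below computes a probability difference between a target token and a foil token at the step where the two targets diverge ("he" vs. "she"). The three-word vocabulary and the logits are invented, and the function is a simplified stand-in for the contrast_prob_diff step function, not its actual implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def prob_diff(probs_original, target, probs_contrast, foil):
    """p(target | original input) - p(foil | contrastive input)."""
    return probs_original[target] - probs_contrast[foil]


VOCAB = ["he", "she", "they"]

# Hypothetical next-token logits after "The manager went home because".
# Both targets share this prefix, so the two distributions coincide at this step.
probs = dict(zip(VOCAB, softmax([2.0, 1.0, 0.0])))

diff = prob_diff(probs, "he", probs, "she")
print(f"p(he) = {probs['he']:.3f}, p(she) = {probs['she']:.3f}, difference = {diff:.3f}")
```

A positive difference means the model prefers the original target over the foil; attributing this difference highlights the input tokens that drive that preference.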

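Refer to the documentation for an example including custom function registration.

Using the Inseq CLI

The Inseq library also provides useful client commands to enable repeated attribution of individual examples and even entire Hugging Face datasets directly from the console. After installing the package, type inseq -h in the terminal to see the available options.

Three commands are supported:

inseq attribute: Wrapper for using model.attribute from the console.
inseq attribute-dataset: Extends attribute to a full dataset using the Hugging Face datasets.load_dataset API.
inseq attribute-context: Detects and attributes context dependence for generation tasks using the approach of Sarti et al. (2023).

All commands support the full range of parameters available for attribute, attribution visualization in the console and saving outputs to disk.

inseq attribute example

The following example performs a simple feature attribution of an English sentence using a MarianNMT translation model from transformers. The final result is printed to the console.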

inseq attribute \
  --model_name_or_path Helsinki-NLP/opus-mt-en-it \
  --attribution_method saliency \
  --input_texts "Hello world this is Inseq! Inseq is a very nice library to perform attribution analysis"
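When the command needs to be launched from a script, the argument list can be assembled in Python instead of being typed by hand. The sketch below only builds and prints the command using the flags shown above; build_attribute_command is a hypothetical helper, and actually running the command requires Inseq to be installed and will download the model.

```python
import shlex


def build_attribute_command(model, method, text):
    """Assemble the argv list for `inseq attribute` from the flags shown above."""
    return [
        "inseq", "attribute",
        "--model_name_or_path", model,
        "--attribution_method", method,
        "--input_texts", text,
    ]


argv = build_attribute_command(
    "Helsinki-NLP/opus-mt-en-it",
    "saliency",
    "Hello world this is Inseq!",
)
# shlex.join quotes arguments so the printed line can be pasted into a shell
print(shlex.join(argv))

# To execute it for real (not done here):
# import subprocess
# subprocess.run(argv, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with input texts that contain spaces or punctuation.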

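inseq attribute-dataset example

The following code performs attribution (both source and target side) of Italian translations for a dummy sample of 20 English sentences taken from the FLORES-101 parallel corpus, using a MarianNMT translation model from Hugging Face transformers. The visualizations are saved in HTML format in the file attributions.html. See the --help flag for more options.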

inseq attribute-dataset \
  --model_name_or_path Helsinki-NLP/opus-mt-en-it \
  --attribution_method saliency \
  --do_prefix_attribution \
  --dataset_name inseq/dummy_enit \
  --input_text_field en \
  --dataset_split "train[:20]" \
  --viz_path attributions.html \
  --batch_size 8 \
  --hide

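inseq attribute-context example

The following example uses a small LM to generate a continuation of input_current_text, and uses the additional context provided by input_context_text to estimate its influence on the generation. In this case, the output is "to the hospital. He said he was fine", and the generation of the token hospital is found to depend on the context token sick according to the contrast_prob_diff step function.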

inseq attribute-context \
  --model_name_or_path HuggingFaceTB/SmolLM-135M \
  --input_context_text "George was sick yesterday." \
  --input_current_text "His colleagues asked him to come" \
  --attributed_fn "contrast_prob_diff"
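The idea of measuring how much a context changes a prediction can be illustrated with two of the contrastive step scores listed earlier, kl_divergence and pcxmi. The sketch below is not the procedure of Sarti et al. (2023), which this README does not detail: the vocabulary and logits are invented, natural logarithms are used, and the direction of the KL divergence is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def pcxmi(p_with_context, p_without_context, target_id):
    """Pointwise CXMI: log-probability gain of the target token due to the context."""
    return math.log(p_with_context[target_id]) - math.log(p_without_context[target_id])


VOCAB = ["hospital", "office", "party"]

# Hypothetical logits for the token following "His colleagues asked him to come to the"
with_context = softmax([3.0, 1.0, 0.5])     # "George was sick yesterday." prepended
without_context = softmax([1.0, 1.5, 1.2])  # no context provided

kl = kl_divergence(with_context, without_context)
gain = pcxmi(with_context, without_context, VOCAB.index("hospital"))
print(f"KL divergence: {kl:.3f}  P-CXMI for 'hospital': {gain:.3f}")
```

In this toy setting, a large divergence marks a generation step as sensitive to the context, and a positive P-CXMI indicates that the context made the target token more likely.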

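Planned Development

Support more attention-based and occlusion-based feature attribution methods (documented in #107 and #108).
Interoperability with ferret for attribution plausibility and faithfulness evaluation.
Rich and interactive visualizations in a tabbed interface using Gradio Blocks.

Contributing

Our vision for Inseq is to create a centralized, comprehensive and robust set of tools to enable fair and reproducible comparisons in the study of sequence generation models. To achieve this goal, contributions from researchers and developers interested in these topics are more than welcome. Please see our contributing guidelines and our code of conduct for more information.

Citing Inseq

If you use Inseq in your research, we suggest mentioning the specific release (e.g. v0.6.0), and we kindly ask you to cite our reference paper as: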

@inproceedings{sarti-etal-2023-inseq,
    title = "Inseq: An Interpretability Toolkit for Sequence Generation Models",
    author = "Sarti, Gabriele and
      Feldhus, Nils and
      Sickert, Ludwig and
      van der Wal, Oskar and
      Nissim, Malvina and
      Bisazza, Arianna",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-demo.40",
    doi = "10.18653/v1/2023.acl-demo.40",
    pages = "421--435",
}