1.1.10: macOS and Linux ARM support

Hivemind: decentralized deep learning in PyTorch

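Hivemind is a PyTorch library for decentralized deep learning across the Internet. Its intended use is training one large model on hundreds of computers from different universities, companies, and volunteers.

Key Features

Distributed training without a master node: a Distributed Hash Table connects the computers in a decentralized network.
Fault-tolerant backpropagation: forward and backward passes succeed even if some nodes are unresponsive or take too long to respond.
Decentralized parameter averaging: iteratively aggregate updates from multiple workers without the need to synchronize across the entire network (paper).
Train neural networks of arbitrary size: parts of their layers are distributed across the participants with the Decentralized Mixture-of-Experts (paper).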
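
The decentralized parameter averaging item above can be illustrated with a toy simulation. This sketch does not use hivemind and is not the library's algorithm: it assumes every peer holds a single number and that peers average within small random groups in each round. The names average_in_groups and simulate are hypothetical and exist only for this illustration.

```python
import random


def average_in_groups(values, group_size, rng):
    """Run one round: shuffle the peers, split them into groups, and
    replace every peer's value with the mean of its group."""
    order = list(range(len(values)))
    rng.shuffle(order)
    result = list(values)
    for start in range(0, len(order), group_size):
        group = order[start:start + group_size]
        group_mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = group_mean
    return result


def simulate(num_peers=16, group_size=4, rounds=10, seed=0):
    """Return (global_mean, initial_spread, final_values) for a toy run."""
    rng = random.Random(seed)
    # Each peer starts with its own "parameter" (a single float here).
    values = [rng.uniform(-1.0, 1.0) for _ in range(num_peers)]
    global_mean = sum(values) / len(values)
    initial_spread = max(values) - min(values)
    for _ in range(rounds):
        values = average_in_groups(values, group_size, rng)
    return global_mean, initial_spread, values


if __name__ == "__main__":
    mean, spread_before, final = simulate()
    print("global mean:", mean)
    print("spread before:", spread_before)
    print("spread after:", max(final) - min(final))
```

Each group average leaves the sum of its members unchanged, so the global mean is preserved while the values move closer together, and no round requires all peers to communicate at once.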

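To learn more about the ideas behind this library, see the full list of our papers below.

Example Use Cases

This section lists projects that use hivemind for decentralized training. If you have successfully trained a model or created a downstream repository with the help of our library, feel free to submit a pull request that adds your project to this list.

Petals (webpage, code) - a decentralized platform for inference and fine-tuning of 100B+ language models.
Training Transformers Together (webpage, code) - a NeurIPS 2021 demonstration that trained a collaborative text-to-image Transformer model.
CALM (webpage, code) - a masked language model trained on a combination of Arabic datasets.
sahajBERT (blog post, code) - a collaboratively pretrained ALBERT-xlarge for the Bengali language.
PyTorch Lightning Integration (docs). Integration into PyTorch Lightning lets you adapt existing pipelines to training over a slow network with unreliable peers.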

Installation

Before installing, make sure that your environment has Python 3.8+ and PyTorch 1.9.0 or newer. They can be installed either natively or with Anaconda.
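
As a convenience, the version requirements above can be checked from Python before installing. This is a minimal sketch using only the standard library; the helper names (parse_version, meets_minimum, check_environment) are hypothetical, and it assumes PyTorch is installed under the distribution name torch.

```python
import re
import sys
from importlib import metadata

MIN_PYTHON = (3, 8)
MIN_TORCH = (1, 9, 0)


def parse_version(text):
    """Return the leading numeric part of a version string as a tuple of
    at least three ints, ignoring suffixes such as "+cu111"."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version: {text!r}")
    parts = [int(part) for part in match.group(0).split(".")]
    # Pad short versions like "2.1" so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(version_text, minimum):
    """Compare a version string against a minimum given as a tuple of ints."""
    return parse_version(version_text) >= minimum


def check_environment():
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.8 or newer is required")
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        problems.append("PyTorch is not installed")
    else:
        if not meets_minimum(torch_version, MIN_TORCH):
            problems.append(f"PyTorch {torch_version} is older than 1.9.0")
    return problems


if __name__ == "__main__":
    for problem in check_environment() or ["environment looks compatible"]:
        print(problem)
```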

You can get the latest release with pip or build hivemind from source.

With pip

If your versions of Python and PyTorch match the requirements, you can install hivemind from pip:

 pip install hivemind

If you also want to use blockwise 8-bit compression from bitsandbytes during data transfer, install it with pip install hivemind[bitsandbytes]. After that, you can use the BlockwiseQuantization class in hivemind.compression.
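
To give a rough intuition for what blockwise 8-bit compression means, the sketch below quantizes a list of floats block by block in plain Python. It assumes a simple linear scheme with one absolute-maximum scale per block; this is an illustration of the general idea only, not the scheme implemented by bitsandbytes or by the BlockwiseQuantization class, and the function names are hypothetical.

```python
def quantize_blockwise(values, block_size=4):
    """Split values into blocks and store each block as
    (scale, list of integer codes in [-127, 127])."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Use the largest magnitude in the block as its scale;
        # fall back to 1.0 for an all-zero block to avoid dividing by zero.
        scale = max(abs(v) for v in block) or 1.0
        codes = [round(v / scale * 127) for v in block]
        blocks.append((scale, codes))
    return blocks


def dequantize_blockwise(blocks):
    """Rebuild approximate float values from (scale, codes) blocks."""
    return [code * scale / 127 for scale, codes in blocks for code in codes]


if __name__ == "__main__":
    data = [0.01, -0.02, 0.03, 0.0, 100.0, -50.0, 25.0, 1.0]
    packed = quantize_blockwise(data)
    print(packed)
    print(dequantize_blockwise(packed))
```

Because every block has its own scale, small values in one block are not swamped by a large outlier in another block, which is the usual motivation for quantizing blockwise rather than over the whole tensor.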

From source

To install hivemind from source, simply run the following:

 git clone https://github.com/learning-at-home/hivemind.git
cd hivemind
pip install .

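If you would like to verify that your installation is working properly, install with pip install .[dev] instead. Then you can run the tests with pytest tests/.

By default, hivemind uses the precompiled binary of the go-libp2p-daemon library. If you face compatibility issues or want to build the binary yourself, you can recompile it by running pip install . --global-option="--buildgo". Before running the compilation, make sure that your machine has a recent version of the Go toolchain (1.15 or 1.16 are supported).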
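
Before attempting the --buildgo build, it may help to confirm which Go toolchain is on your PATH. The following is a small sketch, not part of hivemind: the helper names are hypothetical, and it assumes the usual go version output of the form "go version go1.16.5 linux/amd64".

```python
import re
import shutil
import subprocess

# Versions listed as supported in the text above.
SUPPORTED_GO = {(1, 15), (1, 16)}


def parse_go_version(output):
    """Extract (major, minor) from `go version` output, or None if absent."""
    match = re.search(r"\bgo(\d+)\.(\d+)", output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_go_version():
    """Return (major, minor) of the `go` binary found on PATH, or None."""
    go_path = shutil.which("go")
    if go_path is None:
        return None
    completed = subprocess.run(
        [go_path, "version"],
        capture_output=True,
        text=True,
        check=False,
        timeout=30,
    )
    return parse_go_version(completed.stdout)


if __name__ == "__main__":
    version = installed_go_version()
    if version is None:
        print("Go toolchain not found on PATH")
    elif version in SUPPORTED_GO:
        print("Go %d.%d is listed as supported" % version)
    else:
        print("Go %d.%d is not in the supported list" % version)
```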

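System requirements

Linux is the default OS for which hivemind is developed and tested. We recommend Ubuntu 18.04+ (64-bit), but other 64-bit distros should work as well. Legacy 32-bit is not recommended.
macOS is partially supported. If you have issues, you can run hivemind using Docker instead. We recommend using our Docker image.
Windows 10+ (experimental) can run hivemind using WSL. You can configure WSL to use a GPU by following sections 1-3 of this guide by NVIDIA. After that, simply follow the instructions above to install with pip or from source.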

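Documentation

The quickstart tutorial walks through installation and training a simple neural network with several peers.
examples/albert contains the starter kit and instructions for training a Transformer masked language model collaboratively.
The Mixture-of-Experts tutorial covers the usage of Decentralized Mixture-of-Experts layers.
API reference and additional tutorials are available at learning-at-home.readthedocs.io

If you have any questions about installing and using hivemind, feel free to ask them in our Discord chat or file an issue.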

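Contributing

Hivemind is currently at the active development stage, and we welcome all contributions. Everything from bug fixes and documentation improvements to entirely new features is appreciated.

If you want to contribute to hivemind but don't know where to start, take a look at the unresolved issues. Open a new issue or join our chat room if you want to discuss new functionality or report a possible bug. Bug fixes are always welcome, but new features should preferably be discussed with the maintainers beforehand.

If you want to start contributing to the source code of hivemind, please see the contributing guidelines first. To learn more about other ways to contribute, read our guide.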

Citation

If you found hivemind or its underlying algorithms useful for your research, please cite the following source:

 @misc { hivemind ,
  title = { {H}ivemind: {D}ecentralized {D}eep {L}earning in {P}y{T}orch } ,
  author = { Max Ryabinin and Alexander Borzunov and Michael Diskin and Anton Gusev and Denis Mazur and Vsevolod Plokhotnyuk and Alexey Bukhtiyarov and Pavel Samygin and Anton Sinitsin and Artem Chumachenko } ,
  month = apr,
  year = 2020 ,
  address = { Online } ,
  url = { https://github.com/learning-at-home/hivemind }
}

You can also cite the paper that inspired the creation of this library (a prototype implementation of hivemind is available at mryab/learning-at-home):

 @inproceedings { ryabinin2020crowdsourced ,
  title = { Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts } ,
  author = { Ryabinin, Max and Gusev, Anton } ,
  year = 2020 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  volume = 33 ,
  url = { https://proceedings.neurips.cc/paper/2020/file/25ddc0f8c9d3e22e03d3076f98d83cb2-Paper.pdf }
}

Additional publications

"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices"

 @inproceedings { ryabinin2021moshpit ,
  title = { Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices } ,
  author = { Ryabinin, Max and Gorbunov, Eduard and Plokhotnyuk, Vsevolod and Pekhimenko, Gennady } ,
  year = 2021 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  volume = 34 ,
  url = { https://proceedings.neurips.cc/paper/2021/file/97275a23ca44226c9964043c8462be96-Paper.pdf }
}

"Distributed Deep Learning in Open Collaborations"

 @inproceedings { diskin2021distributed ,
  title = { Distributed Deep Learning In Open Collaborations } ,
  author = { Michael Diskin and Alexey Bukhtiyarov and Max Ryabinin and Lucile Saulnier and Quentin Lhoest and Anton Sinitsin and Dmitry Popov and Dmitriy Pyrkin and Maxim Kashirin and Alexander Borzunov and Albert Villanova del Moral and Denis Mazur and Ilia Kobelev and Yacine Jernite and Thomas Wolf and Gennady Pekhimenko } ,
  year = 2021 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  url = { https://openreview.net/forum?id=FYHktcK-7v }
}

"Secure Distributed Training at Scale"

 @inproceedings { gorbunov2022secure ,
  title = { Secure Distributed Training at Scale } ,
  author = { Gorbunov, Eduard and Borzunov, Alexander and Diskin, Michael and Ryabinin, Max } ,
  year = 2022 ,
  month = { 17--23 Jul } ,
  booktitle = { Proceedings of the 39th International Conference on Machine Learning } ,
  series = { Proceedings of Machine Learning Research } ,
  volume = 162 ,
  url = { https://proceedings.mlr.press/v162/gorbunov22a.html }
}

"Training Transformers Together"

 @misc { borzunov2022training ,
  title = { Training Transformers Together } ,
  author = { Alexander Borzunov and Max Ryabinin and Tim Dettmers and Quentin Lhoest and Lucile Saulnier and Michael Diskin and Yacine Jernite and Thomas Wolf } ,
  year = 2022 ,
  eprint = { 2207.03481 } ,
  archiveprefix = { arXiv } ,
  primaryclass = { cs.LG }
}

"Petals: Collaborative Inference and Fine-tuning of Large Models"

 @inproceedings { borzunov-etal-2023-petals ,
  title = { Petals: Collaborative Inference and Fine-tuning of Large Models } ,
  author = { Borzunov, Alexander  and Baranchuk, Dmitry  and Dettmers, Tim  and Ryabinin, Max  and Belkada, Younes  and Chumachenko, Artem  and Samygin, Pavel  and Raffel, Colin } ,
  year = 2023 ,
  month = jul,
  booktitle = { Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) } ,
  publisher = { Association for Computational Linguistics } ,
  address = { Toronto, Canada } ,
  pages = { 558--568 } ,
  doi = { 10.18653/v1/2023.acl-demo.54 } ,
  url = { https://aclanthology.org/2023.acl-demo.54 } ,
  editor = { Bollegala, Danushka  and Huang, Ruihong  and Ritter, Alan } ,
}

"SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"

 @inproceedings { ryabinin2023swarm ,
  title = { {SWARM} Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient } ,
  author = { Ryabinin, Max and Dettmers, Tim and Diskin, Michael and Borzunov, Alexander } ,
  year = 2023 ,
  month = { 23--29 Jul } ,
  booktitle = { Proceedings of the 40th International Conference on Machine Learning } ,
  publisher = { PMLR } ,
  series = { Proceedings of Machine Learning Research } ,
  volume = 202 ,
  pages = { 29416--29440 } ,
  url = { https://proceedings.mlr.press/v202/ryabinin23a.html } ,
  editor = { Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan } ,
  pdf = { https://proceedings.mlr.press/v202/ryabinin23a/ryabinin23a.pdf }
}

"Distributed Inference and Fine-tuning of Large Language Models Over The Internet"

 @inproceedings { borzunov2023distributed ,
  title = { Distributed Inference and Fine-tuning of Large Language Models Over The Internet } ,
  author = { Alexander Borzunov and Max Ryabinin and Artem Chumachenko and Dmitry Baranchuk and Tim Dettmers and Younes Belkada and Pavel Samygin and Colin Raffel } ,
  year = 2023 ,
  booktitle = { Thirty-seventh Conference on Neural Information Processing Systems } ,
  url = { https://openreview.net/forum?id=XmN7ZNbUAe }
}