hivemind

Python

1.1.10: macOS and Linux ARM support

Hivemind: decentralized deep learning in PyTorch

Hivemind is a PyTorch library for decentralized deep learning across the Internet. Its intended use is training one large model on hundreds of computers from different universities, companies, and volunteers.

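Key Features

Distributed training without a master node: a Distributed Hash Table lets computers connect to each other in a decentralized network.
Fault-tolerant backpropagation: forward and backward passes succeed even if some nodes are unresponsive or take too long to respond.
Decentralized parameter averaging: iteratively aggregate updates from multiple workers without the need to synchronize across the entire network (paper).
Train neural networks of arbitrary size: parts of their layers are distributed across the participants with the Decentralized Mixture-of-Experts (paper).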
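
To make the idea of decentralized parameter averaging more concrete, here is a toy sketch in plain Python. It is not Hivemind's implementation and does not use its API: the peer names, the `average_within_groups` helper, and the fixed grouping schedule are all assumptions made for illustration. Each round averages parameters only inside small groups, yet after two rounds with different groupings every peer holds the global mean, without any step that involves the whole network at once.

```python
from statistics import fmean
from typing import Dict, List, Sequence


def average_within_groups(
    params: Dict[str, List[float]], groups: Sequence[Sequence[str]]
) -> Dict[str, List[float]]:
    """Return new parameters where every peer holds the mean of its group."""
    updated = dict(params)
    for group in groups:
        # Element-wise mean over the parameter vectors of the group members.
        mean = [fmean(column) for column in zip(*(params[peer] for peer in group))]
        for peer in group:
            updated[peer] = list(mean)
    return updated


# Four hypothetical peers, each holding a 2-element parameter vector.
peers = {"a": [1.0, 0.0], "b": [3.0, 0.0], "c": [5.0, 4.0], "d": [7.0, 4.0]}

# Round 1 groups (a, b) and (c, d); round 2 regroups into (a, c) and (b, d).
round_1 = average_within_groups(peers, [("a", "b"), ("c", "d")])
round_2 = average_within_groups(round_1, [("a", "c"), ("b", "d")])

# After the second round every peer holds the global mean [4.0, 2.0].
```

In a real deployment groups would be formed dynamically and some peers may drop out mid-round, so the result is generally an approximation that improves over iterations rather than an exact mean.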

To learn more about the ideas behind this library, see the full list of papers below.

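Example Use Cases

This section lists projects that use Hivemind for decentralized training. If you have successfully trained a model or created a downstream repository with the help of this library, feel free to submit a pull request that adds your project to the list.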

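Petals (webpage, code) - a decentralized platform for inference and fine-tuning of 100B+ language models.
Training Transformers Together (webpage, code) - a NeurIPS 2021 demonstration that trained a collaborative text-to-image Transformer model.
CALM (webpage, code) - a masked language model trained on a combination of Arabic datasets.
sahajBERT (blog post, code) - a collaboratively pretrained ALBERT-xlarge for the Bengali language.
PyTorch Lightning Integration (docs) - integration into PyTorch Lightning lets you adapt existing pipelines to training over slow networks with unreliable peers.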

Installation

Before installing, make sure that your environment has Python 3.8+ and PyTorch 1.9.0 or newer. Both can be installed either natively or with Anaconda.
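
The version requirements above can be checked programmatically before installing. The following is a minimal sketch, not part of Hivemind: the helper names `parse_version` and `meets_requirements` are hypothetical, and only the two minimum versions come from the text.

```python
import re
import sys
from typing import Sequence, Tuple

MIN_PYTHON = (3, 8)    # from the requirement "Python 3.8+"
MIN_TORCH = (1, 9, 0)  # from the requirement "PyTorch 1.9.0 or newer"


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1)."""
    numbers = []
    # Drop a local build suffix like '+cu117', then read leading digits per part.
    for piece in text.split("+", 1)[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    # Pad so that '1.9' compares equal to '1.9.0'.
    numbers += [0] * (3 - len(numbers))
    return tuple(numbers)


def meets_requirements(python_version: Sequence[int], torch_version: str) -> bool:
    """Check an interpreter version tuple and a PyTorch version string."""
    python_ok = tuple(python_version[:2]) >= MIN_PYTHON
    torch_ok = parse_version(torch_version) >= MIN_TORCH
    return python_ok and torch_ok


# Typical use, assuming PyTorch is already installed:
#   import torch
#   meets_requirements(sys.version_info, str(torch.__version__))
```

The check only looks at numeric release components, so pre-release tags such as `2.0.0a0` are treated like the corresponding release.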

You can get the latest release with pip or build Hivemind from source.

With pip

If your versions of Python and PyTorch match the requirements, you can install Hivemind from pip:

 pip install hivemind

If you also want to use blockwise 8-bit compression from bitsandbytes during data transfer, install with pip install hivemind[bitsandbytes]. After that, you can use the BlockwiseQuantization class in hivemind.compression.
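
For readers unfamiliar with blockwise 8-bit compression, the sketch below illustrates the general idea in plain Python: a tensor is split into blocks, and each block is stored as one floating-point scale plus small integer codes. This is a simplified linear scheme written for illustration only. It is not the bitsandbytes algorithm and not the BlockwiseQuantization class, and the function names and the `block_size` parameter are assumptions.

```python
from typing import List, Sequence, Tuple

Block = Tuple[float, List[int]]  # (scale, integer codes in [-127, 127])


def quantize_blockwise(values: Sequence[float], block_size: int = 4) -> List[Block]:
    """Store each block as its largest absolute value plus 8-bit codes."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # Fall back to 1.0 for an all-zero block to avoid dividing by zero.
        scale = max(abs(v) for v in block) or 1.0
        codes = [round(v / scale * 127) for v in block]
        blocks.append((scale, codes))
    return blocks


def dequantize_blockwise(blocks: Sequence[Block]) -> List[float]:
    """Reverse the mapping; the result is an approximation of the input."""
    return [code * scale / 127 for scale, codes in blocks for code in codes]


values = [0.5, -1.0, 0.25, 0.0, 100.0, -50.0]
blocks = quantize_blockwise(values, block_size=4)
restored = dequantize_blockwise(blocks)
```

Because each block has its own scale, the large values in the second block do not reduce the precision available to the small values in the first one, which is the main reason for quantizing per block instead of per tensor.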

From source

To install Hivemind from source, run the following:

 git clone https://github.com/learning-at-home/hivemind.git
cd hivemind
pip install .

To verify that your installation works properly, install with pip install .[dev] instead. You can then run the tests with pytest tests/.

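By default, Hivemind uses a precompiled binary of the go-libp2p-daemon library. If you face compatibility issues or want to build the binary yourself, you can recompile it by running pip install . --global-option="--buildgo". Before running the compilation, make sure that your machine has a recent version of the Go toolchain (1.15 and 1.16 are supported).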

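System requirements

Linux is the default OS for which Hivemind is developed and tested. Ubuntu 18.04+ (64-bit) is recommended, but other 64-bit distributions should work as well. Legacy 32-bit is not recommended.
macOS is partially supported. If you run into issues, you can run Hivemind using Docker instead. Using the project's Docker image is recommended.
Windows 10+ (experimental) can run Hivemind using WSL. You can configure WSL to use a GPU by following sections 1-3 of this guide by NVIDIA. After that, follow the instructions above to install with pip or from source.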

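Documentation

The quickstart tutorial walks through installation and training a simple neural network with several peers.
examples/albert contains the starter kit and instructions for training a Transformer masked language model collaboratively.
The Mixture-of-Experts tutorial covers the usage of Decentralized Mixture-of-Experts layers.
API reference and additional tutorials are available at learning-at-home.readthedocs.io.

If you have any questions about installing and using Hivemind, feel free to ask them in the Discord chat or file an issue.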

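Contributing

Hivemind is currently in active development, and all contributions are welcome. Everything from bug fixes and documentation improvements to entirely new features is appreciated.

If you want to contribute to Hivemind but don't know where to start, take a look at the unresolved issues. Open a new issue or join the chat room if you want to discuss new functionality or report a possible bug. Bug fixes are always welcome, but new features should preferably be discussed with the maintainers beforehand.

If you want to start contributing to the source code of Hivemind, please see the contributing guidelines first. To learn more about other ways to contribute, read the guide.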

Citation

If you found Hivemind or its underlying algorithms useful for your research, please cite the following source:

 @misc { hivemind ,
  title = { {H}ivemind: {D}ecentralized {D}eep {L}earning in {P}y{T}orch } ,
  author = { Max Ryabinin and Alexander Borzunov and Michael Diskin and Anton Gusev and Denis Mazur and Vsevolod Plokhotnyuk and Alexey Bukhtiyarov and Pavel Samygin and Anton Sinitsin and Artem Chumachenko } ,
  month = apr,
  year = 2020 ,
  address = { Online } ,
  url = { https://github.com/learning-at-home/hivemind }
}

You can also cite the paper that inspired the creation of this library (a prototype implementation of Hivemind is available at mryab/learning-at-home):

 @inproceedings { ryabinin2020crowdsourced ,
  title = { Towards Crowdsourced Training of Large Neural Networks using Decentralized Mixture-of-Experts } ,
  author = { Ryabinin, Max and Gusev, Anton } ,
  year = 2020 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  volume = 33 ,
  url = { https://proceedings.neurips.cc/paper/2020/file/25ddc0f8c9d3e22e03d3076f98d83cb2-Paper.pdf }
}

Additional publications

"Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices"

 @inproceedings { ryabinin2021moshpit ,
  title = { Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices } ,
  author = { Ryabinin, Max and Gorbunov, Eduard and Plokhotnyuk, Vsevolod and Pekhimenko, Gennady } ,
  year = 2021 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  volume = 34 ,
  url = { https://proceedings.neurips.cc/paper/2021/file/97275a23ca44226c9964043c8462be96-Paper.pdf }
}

"Distributed Deep Learning in Open Collaborations"

 @inproceedings { diskin2021distributed ,
  title = { Distributed Deep Learning In Open Collaborations } ,
  author = { Michael Diskin and Alexey Bukhtiyarov and Max Ryabinin and Lucile Saulnier and Quentin Lhoest and Anton Sinitsin and Dmitry Popov and Dmitriy Pyrkin and Maxim Kashirin and Alexander Borzunov and Albert Villanova del Moral and Denis Mazur and Ilia Kobelev and Yacine Jernite and Thomas Wolf and Gennady Pekhimenko } ,
  year = 2021 ,
  booktitle = { Advances in Neural Information Processing Systems } ,
  url = { https://openreview.net/forum?id=FYHktcK-7v }
}

"Secure Distributed Training at Scale"

 @inproceedings { gorbunov2022secure ,
  title = { Secure Distributed Training at Scale } ,
  author = { Gorbunov, Eduard and Borzunov, Alexander and Diskin, Michael and Ryabinin, Max } ,
  year = 2022 ,
  month = { 17--23 Jul } ,
  booktitle = { Proceedings of the 39th International Conference on Machine Learning } ,
  series = { Proceedings of Machine Learning Research } ,
  volume = 162 ,
  url = { https://proceedings.mlr.press/v162/gorbunov22a.html }
}

"Training Transformers Together"

 @misc { borzunov2022training ,
  title = { Training Transformers Together } ,
  author = { Alexander Borzunov and Max Ryabinin and Tim Dettmers and Quentin Lhoest and Lucile Saulnier and Michael Diskin and Yacine Jernite and Thomas Wolf } ,
  year = 2022 ,
  eprint = { 2207.03481 } ,
  archiveprefix = { arXiv } ,
  primaryclass = { cs.LG }
}

"Petals: Collaborative Inference and Fine-tuning of Large Models"

 @inproceedings { borzunov-etal-2023-petals ,
  title = { Petals: Collaborative Inference and Fine-tuning of Large Models } ,
  author = { Borzunov, Alexander  and Baranchuk, Dmitry  and Dettmers, Tim  and Ryabinin, Max  and Belkada, Younes  and Chumachenko, Artem  and Samygin, Pavel  and Raffel, Colin } ,
  year = 2023 ,
  month = jul,
  booktitle = { Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations) } ,
  publisher = { Association for Computational Linguistics } ,
  address = { Toronto, Canada } ,
  pages = { 558--568 } ,
  doi = { 10.18653/v1/2023.acl-demo.54 } ,
  url = { https://aclanthology.org/2023.acl-demo.54 } ,
  editor = { Bollegala, Danushka  and Huang, Ruihong  and Ritter, Alan } ,
}

"SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient"

 @inproceedings { ryabinin2023swarm ,
  title = { {SWARM} Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient } ,
  author = { Ryabinin, Max and Dettmers, Tim and Diskin, Michael and Borzunov, Alexander } ,
  year = 2023 ,
  month = { 23--29 Jul } ,
  booktitle = { Proceedings of the 40th International Conference on Machine Learning } ,
  publisher = { PMLR } ,
  series = { Proceedings of Machine Learning Research } ,
  volume = 202 ,
  pages = { 29416--29440 } ,
  url = { https://proceedings.mlr.press/v202/ryabinin23a.html } ,
  editor = { Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan } ,
  pdf = { https://proceedings.mlr.press/v202/ryabinin23a/ryabinin23a.pdf }
}

"Distributed Inference and Fine-tuning of Large Language Models Over The Internet"

 @inproceedings { borzunov2023distributed ,
  title = { Distributed Inference and Fine-tuning of Large Language Models Over The Internet } ,
  author = { Alexander Borzunov and Max Ryabinin and Artem Chumachenko and Dmitry Baranchuk and Tim Dettmers and Younes Belkada and Pavel Samygin and Colin Raffel } ,
  year = 2023 ,
  booktitle = { Thirty-seventh Conference on Neural Information Processing Systems } ,
  url = { https://openreview.net/forum?id=XmN7ZNbUAe }
}