Mesin Penyebaran Universal LLM dengan kompilasi ML
Mulai | Dokumentasi | Blog
MLC LLM adalah kompiler pembelajaran mesin dan mesin penyebaran kinerja tinggi untuk model bahasa besar. Misi dari proyek ini adalah untuk memungkinkan semua orang mengembangkan, mengoptimalkan, dan menggunakan model AI secara asli di platform semua orang.
| AMD GPU | NVIDIA GPU | GPU Apple | Intel GPU | |
|---|---|---|---|---|
| Linux / Win | ✅ Vulkan, Rocm | ✅ Vulkan, Cuda | N/a | ✅ vulekan |
| MacOS | ✅ Logam (DGPU) | N/a | ✅ Metal | ✅ Logam (IGPU) |
| Browser web | ✅ Webgpu dan Wasm | |||
| iOS / ipados | ✅ Logam pada GPU A-Series Apel | |||
| Android | ✅ OpenCl tentang Adreno GPU | ✅ OpenCl on Mali GPU | ||
MLC LLM mengkompilasi dan menjalankan kode pada MLCEngine-mesin inferensi LLM berkinerja tinggi terpadu di platform di atas. MLCEngine menyediakan API yang kompatibel dengan openai yang tersedia melalui REST Server, Python, JavaScript, iOS, Android, semuanya didukung oleh mesin dan kompiler yang sama yang kami terus perbaikan dengan komunitas.
Silakan kunjungi dokumentasi kami untuk memulai dengan MLC LLM.
Harap pertimbangkan mengutip proyek kami jika Anda merasa berguna:
@software { mlc-llm ,
author = { {MLC team} } ,
title = { {MLC-LLM} } ,
url = { https://github.com/mlc-ai/mlc-llm } ,
year = { 2023-2024 }
}Teknik yang mendasari MLC LLM meliputi:
@inproceedings { tensorir ,
author = { Feng, Siyuan and Hou, Bohan and Jin, Hongyi and Lin, Wuwei and Shao, Junru and Lai, Ruihang and Ye, Zihao and Zheng, Lianmin and Yu, Cody Hao and Yu, Yong and Chen, Tianqi } ,
title = { TensorIR: An Abstraction for Automatic Tensorized Program Optimization } ,
year = { 2023 } ,
isbn = { 9781450399166 } ,
publisher = { Association for Computing Machinery } ,
address = { New York, NY, USA } ,
url = { https://doi.org/10.1145/3575693.3576933 } ,
doi = { 10.1145/3575693.3576933 } ,
booktitle = { Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 2 } ,
pages = { 804–817 } ,
numpages = { 14 } ,
keywords = { Tensor Computation, Machine Learning Compiler, Deep Neural Network } ,
location = { Vancouver, BC, Canada } ,
series = { ASPLOS 2023 }
}
@inproceedings { metaschedule ,
author = { Shao, Junru and Zhou, Xiyou and Feng, Siyuan and Hou, Bohan and Lai, Ruihang and Jin, Hongyi and Lin, Wuwei and Masuda, Masahiro and Yu, Cody Hao and Chen, Tianqi } ,
booktitle = { Advances in Neural Information Processing Systems } ,
editor = { S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh } ,
pages = { 35783--35796 } ,
publisher = { Curran Associates, Inc. } ,
title = { Tensor Program Optimization with Probabilistic Programs } ,
url = { https://proceedings.neurips.cc/paper_files/paper/2022/file/e894eafae43e68b4c8dfdacf742bcbf3-Paper-Conference.pdf } ,
volume = { 35 } ,
year = { 2022 }
}
@inproceedings { tvm ,
author = { Tianqi Chen and Thierry Moreau and Ziheng Jiang and Lianmin Zheng and Eddie Yan and Haichen Shen and Meghan Cowan and Leyuan Wang and Yuwei Hu and Luis Ceze and Carlos Guestrin and Arvind Krishnamurthy } ,
title = { {TVM}: An Automated {End-to-End} Optimizing Compiler for Deep Learning } ,
booktitle = { 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) } ,
year = { 2018 } ,
isbn = { 978-1-939133-08-3 } ,
address = { Carlsbad, CA } ,
pages = { 578--594 } ,
url = { https://www.usenix.org/conference/osdi18/presentation/chen } ,
publisher = { USENIX Association } ,
month = oct,
}