mars download - mars source code download

mars

Python

v0.10.0


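Mars is a tensor-based unified framework for large-scale data computation which scales Numpy, pandas, scikit-learn and many other libraries.

Documentation, Chinese documentation (中文文档)

Installation

Mars is easy to install with pip: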

pip install pymars

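Installation for Developers

If you want to contribute code to Mars, follow the instructions below to install Mars for development:

git clone https://github.com/mars-project/mars.git
cd mars
pip install -e ".[dev]"

More details about installing Mars can be found in the installation section of the Mars documentation.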
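After installation, a quick way to confirm that the package is importable is to look up the `mars` module (the import name used in the examples in this README) without actually importing it. The helper name `is_installed` is hypothetical and only for illustration; this is a minimal sketch using the standard library.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "mars" is the import name used throughout this README.
    print("mars importable:", is_installed("mars"))
```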

Architecture Overview

Getting Started

Start a new runtime locally via:

>>> import mars
>>> mars.new_session()

Or connect to a Mars cluster that is already initialized:

>>> import mars
>>> mars.new_session('http://<web_ip>:<ui_port>')

Mars Tensor

Mars tensor provides a familiar interface like Numpy.

Numpy:

import numpy as np
N = 200_000_000
a = np.random.uniform(-1, 1, size=(N, 2))
print((np.linalg.norm(a, axis=1) < 1)
      .sum() * 4 / N)

3.14174502
CPU times: user 11.6 s, sys: 8.22 s, total: 19.9 s
Wall time: 22.5 s

Mars tensor:

import mars.tensor as mt
N = 200_000_000
a = mt.random.uniform(-1, 1, size=(N, 2))
print(((mt.linalg.norm(a, axis=1) < 1)
        .sum() * 4 / N).execute())

3.14161908
CPU times: user 966 ms, sys: 544 ms, total: 1.51 s
Wall time: 3.77 s

Mars can leverage multiple cores, even on a laptop, and could be even faster in a distributed setting.
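The expression in both snippets is a Monte Carlo estimate of pi: points are drawn uniformly from the square [-1, 1] x [-1, 1], and the fraction that falls inside the unit circle approaches pi/4. The following is a small pure-Python sketch of the same calculation. The function name `estimate_pi` and the sample size are illustrative assumptions, and this is not how Mars or Numpy evaluates the expression.

```python
import math
import random


def estimate_pi(num_points: int, seed: int = 0) -> float:
    """Estimate pi from the share of random points inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_points):
        x = rng.uniform(-1, 1)
        y = rng.uniform(-1, 1)
        # Same test as `norm(a, axis=1) < 1`, applied to a single point.
        if math.hypot(x, y) < 1:
            inside += 1
    # Circle area / square area = pi / 4, so scale the fraction by 4.
    return inside * 4 / num_points


if __name__ == "__main__":
    print(estimate_pi(200_000))
```

The statistical error shrinks roughly with the square root of the number of points, which is why the README examples use hundreds of millions of samples.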

Mars DataFrame

Mars DataFrame provides a familiar interface like pandas.

Pandas:

import numpy as np
import pandas as pd
df = pd.DataFrame(
    np.random.rand(100000000, 4),
    columns=list('abcd'))
print(df.sum())

CPU times: user 10.9 s, sys: 2.69 s, total: 13.6 s
Wall time: 11 s

Mars DataFrame:

import mars.tensor as mt
import mars.dataframe as md
df = md.DataFrame(
    mt.random.rand(100000000, 4),
    columns=list('abcd'))
print(df.sum().execute())

CPU times: user 1.21 s, sys: 212 ms, total: 1.42 s
Wall time: 2.75 s
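The timings in these comparisons follow the layout of IPython's `%%time` output, which reports CPU time and wall-clock time separately. A rough standard-library sketch of that distinction is shown here. `timed` is a hypothetical helper, and its numbers are not expected to match IPython's exactly.

```python
import time


def timed(func, *args):
    """Run func(*args) and return (result, cpu_seconds, wall_seconds)."""
    # process_time() counts user + system CPU time of this process only.
    cpu_start = time.process_time()
    # perf_counter() measures elapsed real ("wall") time.
    wall_start = time.perf_counter()
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, cpu, wall


if __name__ == "__main__":
    total, cpu, wall = timed(sum, range(1_000_000))
    print(f"result={total}  CPU total: {cpu:.3f} s  Wall time: {wall:.3f} s")
```

CPU time can exceed wall time when several cores are busy inside the same process, and it can be lower than wall time when the process waits or when the work runs in other processes.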

Mars Learn

Mars learn provides a familiar interface like scikit-learn.

Scikit-learn:

from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
X, y = make_blobs(
    n_samples=100000000, n_features=3,
    centers=[[3, 3, 3], [0, 0, 0],
             [1, 1, 1], [2, 2, 2]],
    cluster_std=[0.2, 0.1, 0.2, 0.2],
    random_state=9)
pca = PCA(n_components=3)
pca.fit(X)
print(pca.explained_variance_ratio_)
print(pca.explained_variance_)

Mars learn:

from mars.learn.datasets import make_blobs
from mars.learn.decomposition import PCA
X, y = make_blobs(
    n_samples=100000000, n_features=3,
    centers=[[3, 3, 3], [0, 0, 0],
             [1, 1, 1], [2, 2, 2]],
    cluster_std=[0.2, 0.1, 0.2, 0.2],
    random_state=9)
pca = PCA(n_components=3)
pca.fit(X)
print(pca.explained_variance_ratio_)
print(pca.explained_variance_)

Mars learn also integrates with many libraries:

TensorFlow
PyTorch
XGBoost
LightGBM
Joblib
Statsmodels

Mars remote

Mars remote allows users to execute functions in parallel.

Vanilla function calls:

import numpy as np


def calc_chunk(n, i):
    rs = np.random.RandomState(i)
    a = rs.uniform(-1, 1, size=(n, 2))
    d = np.linalg.norm(a, axis=1)
    return (d < 1).sum()

def calc_pi(fs, N):
    return sum(fs) * 4 / N

N = 200_000_000
n = 10_000_000

fs = [calc_chunk(n, i)
      for i in range(N // n)]
pi = calc_pi(fs, N)
print(pi)

3.1416312
CPU times: user 32.2 s, sys: 4.86 s, total: 37.1 s
Wall time: 12.4 s

Mars remote:

import numpy as np
import mars.remote as mr

def calc_chunk(n, i):
    rs = np.random.RandomState(i)
    a = rs.uniform(-1, 1, size=(n, 2))
    d = np.linalg.norm(a, axis=1)
    return (d < 1).sum()

def calc_pi(fs, N):
    return sum(fs) * 4 / N

N = 200_000_000
n = 10_000_000

fs = [mr.spawn(calc_chunk, args=(n, i))
      for i in range(N // n)]
pi = mr.spawn(calc_pi, args=(fs, N))
print(pi.execute().fetch())

3.1416312
CPU times: user 616 ms, sys: 307 ms, total: 923 ms
Wall time: 3.99 s
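The pattern in this comparison, many independent chunk tasks followed by one combining step, can also be sketched with the standard library alone. This is an analogy rather than Mars code: `concurrent.futures` stands in for `mr.spawn`, the chunk function uses `random` instead of Numpy, the sizes are scaled down, and a thread pool is used only to keep the sketch simple (it does not give real CPU parallelism for pure-Python work). The `run` helper and its defaults are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def calc_chunk(n, i):
    """Count points of chunk i that land inside the unit circle."""
    rs = random.Random(i)  # one seed per chunk, so results are repeatable
    return sum(
        1 for _ in range(n)
        if rs.uniform(-1, 1) ** 2 + rs.uniform(-1, 1) ** 2 < 1
    )


def calc_pi(counts, total):
    """Combine the per-chunk counts into a single estimate of pi."""
    return sum(counts) * 4 / total


def run(total=200_000, chunk=10_000, workers=4):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every chunk first so the tasks can run concurrently ...
        futures = [pool.submit(calc_chunk, chunk, i)
                   for i in range(total // chunk)]
        # ... then gather results in submission order and combine them.
        return calc_pi([f.result() for f in futures], total)


if __name__ == "__main__":
    print(run())
```

Because each chunk owns its random generator, the concurrent run returns the same value as calling `calc_chunk` sequentially with the same seeds.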

Dask on Mars

Refer to Dask on Mars for more information.

Eager Mode

Mars supports eager mode, which makes it friendly for development and easy to debug.

Users can enable eager mode through options, set at the beginning of the program or console session.

>>> from mars.config import options
>>> options.eager_mode = True

Or use a context.

>>> from mars.config import option_context
>>> with option_context() as options:
...     options.eager_mode = True
...     # the eager mode is on only for the with statement
...     ...

If eager mode is on, tensors, DataFrames and other objects are executed immediately by the default session once they are created.

>>> import mars.tensor as mt
>>> import mars.dataframe as md
>>> from mars.config import options
>>> options.eager_mode = True
>>> t = mt.arange(6).reshape((2, 3))
>>> t
array([[0, 1, 2],
       [3, 4, 5]])
>>> df = md.DataFrame(t)
>>> df.sum()
0    3
1    5
2    7
dtype: int64
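The difference between the default deferred behavior (compute on `.execute()`) and eager mode can be illustrated with a toy model. Everything in this sketch (`Lazy`, `eager`, `_EAGER`) is hypothetical and unrelated to Mars internals; it only mimics the observable behavior described in this section.

```python
from contextlib import contextmanager

_EAGER = False  # toy stand-in for a global eager-mode option


@contextmanager
def eager(enabled=True):
    """Temporarily switch the toy eager flag, restoring it on exit."""
    global _EAGER
    previous, _EAGER = _EAGER, enabled
    try:
        yield
    finally:
        _EAGER = previous


class Lazy:
    """Wraps a zero-argument function and runs it at most once."""

    def __init__(self, func):
        self._func = func
        self._done = False
        self._value = None
        if _EAGER:  # eager mode: compute as soon as the object is created
            self.execute()

    @property
    def computed(self):
        return self._done

    def execute(self):
        if not self._done:
            self._value = self._func()
            self._done = True
        return self._value


if __name__ == "__main__":
    deferred = Lazy(lambda: sum(range(6)))
    print(deferred.computed)   # False: nothing has run yet
    with eager():
        immediate = Lazy(lambda: sum(range(6)))
    print(immediate.computed)  # True: ran at construction time
    print(deferred.execute())  # 15
```

Eager evaluation makes intermediate values visible right away, which helps debugging, at the cost of running each step as soon as it is defined.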

Mars on Ray

Mars also has deep integration with Ray. It can run on Ray efficiently and interact with the large ecosystem of machine learning and distributed systems built on top of Ray.

Start a new Mars on Ray runtime locally via:

import mars
mars.new_session(backend='ray')
# Perform computation

Interact with Ray Dataset:

import mars.tensor as mt
import mars.dataframe as md
df = md.DataFrame(
    mt.random.rand(1000_0000, 4),
    columns=list('abcd'))
# Convert mars dataframe to ray dataset
ds = md.to_ray_dataset(df)
print(ds.schema(), ds.count())
ds.filter(lambda row: row["a"] > 0.5).show(5)
# Convert ray dataset to mars dataframe
df2 = md.read_ray_dataset(ds)
print(df2.head(5).execute())

Refer to Mars on Ray for more information.

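Easy to scale in and scale out

Mars can scale in to a single machine and scale out to a cluster with thousands of machines. Migrating from a single machine to a cluster, to process more data or gain better performance, is fairly simple.

Bare Metal Deployment

Mars scales out to a cluster by starting the different components of the Mars distributed runtime on different machines in the cluster.

One node is selected as the supervisor, which also hosts the web service, and the other nodes act as workers. The supervisor can be started with the following command:

mars-supervisor -h <host_name> -p <supervisor_port> -w <web_port>

Workers can be started with the following command:

mars-worker -h <host_name> -p <worker_port> -s <supervisor_endpoint>

After all Mars processes are started, users can run

>>> from mars import new_session
>>> sess = new_session('http://<web_ip>:<ui_port>')
>>> # perform computation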
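When starting many nodes from a script, the two commands can be assembled programmatically. The sketch below only builds argument lists with the flags shown in this section (`-h`, `-p`, `-w`, `-s`). The helper names, host names, port numbers, and the `host:port` form of the supervisor endpoint are made-up placeholders, and nothing is executed.

```python
import shlex


def supervisor_command(host, port, web_port):
    """Argument list for starting a supervisor (flags taken from the README)."""
    return ["mars-supervisor", "-h", host, "-p", str(port), "-w", str(web_port)]


def worker_command(host, port, supervisor_endpoint):
    """Argument list for starting a worker that reports to a supervisor."""
    return ["mars-worker", "-h", host, "-p", str(port), "-s", supervisor_endpoint]


if __name__ == "__main__":
    # Placeholder hosts and ports, chosen only for illustration.
    sup = supervisor_command("node0", 7001, 7005)
    wrk = worker_command("node1", 7002, "node0:7001")
    print(shlex.join(sup))
    print(shlex.join(wrk))
```

Keeping the commands as lists rather than strings avoids quoting problems if they are later passed to `subprocess.run`.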

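Kubernetes Deployment

Refer to Run on Kubernetes for more information.

Yarn Deployment

Refer to Run on Yarn for more information.

Getting involved

Read the development guide.
Join our Slack workgroup: Slack.
Join the mailing list: send an email to [email protected].
Please report bugs by submitting a GitHub issue.
Submit contributions using pull requests.

Thank you in advance for your contributions!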


Additional information

Version: v0.10.0
Type: Python
Updated: 2025-07-13
Size: 22.43MB
Source: GitHub
