MLstatkit

v0.1.4

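MLstatkit is a comprehensive Python library designed to integrate established statistical methods into machine learning projects. It includes DeLong's test for comparing the areas under two correlated Receiver Operating Characteristic (ROC) curves; Bootstrapping for calculating confidence intervals; AUC2OR for converting the Area Under the ROC Curve (AUC) into several related statistics such as Cohen's d, Pearson's rpb, the odds ratio, and the natural log odds ratio; and Permutation_test for assessing the statistical significance of the difference between two models' metrics by randomly shuffling the data and recalculating the metrics to build a distribution of differences. With its modular design, MLstatkit gives researchers and data scientists a flexible toolkit for strengthening their analyses and model evaluations across a broad range of statistical testing needs in machine learning.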

Installation

Install MLstatkit directly from PyPI using pip:

pip install MLstatkit

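Usage

DeLong's Test

The Delong_test function statistically evaluates the difference between the areas under two correlated Receiver Operating Characteristic (ROC) curves, typically produced by two models scored on the same samples. This gives a deeper understanding of comparative model performance.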

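Parameters:

true : array-like of shape (n_samples,)
True binary labels in range {0, 1}.
prob_A : array-like of shape (n_samples,)
Predicted probabilities from the first model.
prob_B : array-like of shape (n_samples,)
Predicted probabilities from the second model.

Returns:

z_score : float
The z-score from comparing the AUCs of the two models.
p_value : float
The p-value from comparing the AUCs of the two models.

Example: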

import numpy as np
from MLstatkit.stats import Delong_test

# Example data: true labels and the probabilities predicted by two models
true = np.array([0, 1, 0, 1])
prob_A = np.array([0.1, 0.4, 0.35, 0.8])
prob_B = np.array([0.2, 0.3, 0.4, 0.7])

# Perform DeLong's test
z_score, p_value = Delong_test(true, prob_A, prob_B)

print(f"Z-Score: {z_score}, P-Value: {p_value}")

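This demonstrates how Delong_test statistically compares the AUCs of two models from their predicted probabilities and the true labels. The returned z-score and p-value help determine whether the difference in model performance is statistically significant.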
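To make the idea behind the test concrete, here is a minimal sketch of DeLong's method written directly in NumPy. It is not MLstatkit's implementation: the function name `delong_sketch` is hypothetical, the sign convention of the z-score is an assumption, and the pairwise comparison is the simple O(m·n) form rather than a fast rank-based variant.

```python
import math
import numpy as np

def delong_sketch(y_true, prob_a, prob_b):
    """Illustrative DeLong test for two correlated AUCs (hypothetical helper)."""
    y_true = np.asarray(y_true)
    scores = np.vstack([prob_a, prob_b]).astype(float)  # shape (2, n_samples)
    pos = scores[:, y_true == 1]                         # shape (2, m)
    neg = scores[:, y_true == 0]                         # shape (2, n)
    m, n = pos.shape[1], neg.shape[1]

    # psi[k, i, j] = 1 if positive i outranks negative j under model k, 0.5 on ties
    diff = pos[:, :, None] - neg[:, None, :]
    psi = (diff > 0) + 0.5 * (diff == 0)

    v10 = psi.mean(axis=2)    # per-positive placement values, shape (2, m)
    v01 = psi.mean(axis=1)    # per-negative placement values, shape (2, n)
    aucs = v10.mean(axis=1)   # AUC of each model

    # Covariance of the two AUC estimates, then variance of their difference
    s = np.cov(v10) / m + np.cov(v01) / n
    var = s[0, 0] + s[1, 1] - 2 * s[0, 1]
    if var <= 0:
        raise ValueError("variance of the AUC difference is zero; test undefined")

    z = (aucs[0] - aucs[1]) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under N(0, 1)
    return aucs, z, p

# Toy data, used only to exercise the sketch
y = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
a = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
b = np.array([0.2, 0.3, 0.25, 0.85, 0.15, 0.35, 0.45, 0.65, 0.01])
aucs, z, p = delong_sketch(y, a, b)
print(f"AUC A: {aucs[0]:.3f}, AUC B: {aucs[1]:.3f}, z: {z:.3f}, p: {p:.3f}")
```

Because both models are scored on the same samples, the covariance term `s[0, 1]` matters: ignoring it would treat the two AUCs as independent and usually overstate the variance of their difference.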

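Bootstrapping for Confidence Intervals

The Bootstrapping function calculates confidence intervals for a specified performance metric by bootstrapping, which gives a measure of the reliability of the estimate. It supports AUROC (area under the ROC curve), AUPRC (area under the precision-recall curve), and F1 score, among other metrics.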

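Parameters:

true : array-like of shape (n_samples,)
True binary labels, where the labels are in {0, 1}.
prob : array-like of shape (n_samples,)
Predicted probabilities, as returned by a classifier's predict_proba method, or binary predictions based on the specified scoring function and threshold.
metric_str : str, default='f1'
Identifier of the scoring function to use. Supported values include 'f1', 'accuracy', 'recall', 'precision', 'roc_auc', 'pr_auc', and 'average_precision'.
n_bootstraps : int, default=1000
Number of bootstrap iterations to perform. Increasing this number improves the reliability of the confidence interval estimate but also increases computation time.
confidence_level : float, default=0.95
Confidence level for the interval estimate. For example, 0.95 represents a 95% confidence interval.
threshold : float, default=0.5
Threshold used to convert probabilities to binary labels for metrics such as 'f1'.
average : str, default='macro'
Averaging method to apply to multi-class/multi-label targets. Other options include 'micro', 'samples', 'weighted', and 'binary'.
random_state : int, default=0
Seed for the random number generator. This parameter ensures reproducible results.

Returns:

original_score : float
The score calculated from the original dataset, without bootstrapping.
confidence_lower : float
The lower bound of the confidence interval.
confidence_upper : float
The upper bound of the confidence interval.

Example: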

import numpy as np
from MLstatkit.stats import Bootstrapping

# Example data
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])

# Calculate confidence intervals for AUROC
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'roc_auc')
print(f"AUROC: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")

# Calculate confidence intervals for AUPRC
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'pr_auc')
print(f"AUPRC: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")

# Calculate confidence intervals for F1 score with an explicit threshold (0.5 is also the default)
original_score, confidence_lower, confidence_upper = Bootstrapping(y_true, y_prob, 'f1', threshold=0.5)
print(f"F1 Score: {original_score:.3f}, Confidence interval: [{confidence_lower:.3f} - {confidence_upper:.3f}]")

# Calculate confidence intervals for AUROC, AUPRC, and F1 score in one loop
for score in ['roc_auc', 'pr_auc', 'f1']:
    original_score, conf_lower, conf_upper = Bootstrapping(y_true, y_prob, score, threshold=0.5)
    print(f"{score.upper()} original score: {original_score:.3f}, confidence interval: [{conf_lower:.3f} - {conf_upper:.3f}]")
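For readers who want to see what a bootstrap confidence interval involves, the following is a minimal sketch of a percentile bootstrap for AUROC using only NumPy. It is an illustration under stated assumptions, not MLstatkit's code: the helper names `auc_pairwise` and `percentile_bootstrap` are hypothetical, and the library may resample, handle single-class resamples, or compute the interval differently.

```python
import numpy as np

def auc_pairwise(y_true, y_score):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

def percentile_bootstrap(y_true, y_score, metric,
                         n_bootstraps=1000, confidence_level=0.95, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_score) (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    rng = np.random.default_rng(seed)
    n = len(y_true)

    resampled = []
    for _ in range(n_bootstraps):
        idx = rng.integers(0, n, size=n)       # sample indices with replacement
        if np.unique(y_true[idx]).size < 2:
            continue                           # AUROC is undefined with one class
        resampled.append(metric(y_true[idx], y_score[idx]))

    alpha = 1.0 - confidence_level
    lower, upper = np.quantile(resampled, [alpha / 2, 1 - alpha / 2])
    return metric(y_true, y_score), float(lower), float(upper)

# Toy data, used only to exercise the sketch
labels = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
probs = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
point, lo, hi = percentile_bootstrap(labels, probs, auc_pairwise)
print(f"AUROC: {point:.3f}, interval: [{lo:.3f} - {hi:.3f}]")
```

With only nine samples the interval is very wide, which is the expected behaviour: the bootstrap reflects how little information a small evaluation set carries.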

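Permutation Test for Statistical Significance

The Permutation_test function assesses the statistical significance of the difference between two models' metrics by randomly shuffling the data and recalculating the metrics to build a distribution of differences. The method does not assume a specific distribution of the data, which makes it a robust choice for comparing model performance.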

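Parameters:

y_true : array-like of shape (n_samples,)
True binary labels, where the labels are in {0, 1}.
prob_model_A : array-like of shape (n_samples,)
Predicted probabilities from the first model.
prob_model_B : array-like of shape (n_samples,)
Predicted probabilities from the second model.
metric_str : str, default='f1'
The metric to compare. Supported metrics include 'f1', 'accuracy', 'recall', 'precision', 'roc_auc', 'pr_auc', and 'average_precision'.
n_bootstraps : int, default=1000
Number of permutation samples to generate.
threshold : float, default=0.5
Threshold used to convert probabilities to binary labels for metrics such as 'f1'.
average : str, default='macro'
Averaging method to apply to multi-class/multi-label targets. Other options include 'micro', 'samples', 'weighted', and 'binary'.
random_state : int, default=0
Seed for the random number generator. This parameter ensures reproducible results.

Returns:

metric_a : float
The metric for model A, calculated on the original data.
metric_b : float
The metric for model B, calculated on the original data.
p_value : float
The p-value from the permutation test: the probability, under the null hypothesis, of observing a difference at least as extreme as the one actually observed.
benchmark : float
The observed difference between the metrics of model A and model B.
samples_mean : float
The mean of the permuted differences.
samples_std : float
The standard deviation of the permuted differences.

Example: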

import numpy as np
from MLstatkit.stats import Permutation_test

# Example data: true labels and the probabilities predicted by two models
y_true = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
prob_model_A = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
prob_model_B = np.array([0.2, 0.3, 0.25, 0.85, 0.15, 0.35, 0.45, 0.65, 0.01])

# Conduct a permutation test to compare F1 scores
metric_a, metric_b, p_value, benchmark, samples_mean, samples_std = Permutation_test(
    y_true, prob_model_A, prob_model_B, 'f1'
)

print(f"F1 Score Model A: {metric_a:.5f}, Model B: {metric_b:.5f}")
print(f"Observed Difference: {benchmark:.5f}, p-value: {p_value:.5f}")
print(f"Permuted Differences Mean: {samples_mean:.5f}, Std: {samples_std:.5f}")
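The shuffling step can be sketched as follows. This is one common paired scheme, assumed here for illustration: for each sample, the two models' predictions are swapped with probability 0.5, and the metric difference is recomputed. MLstatkit's exact shuffling scheme is not described in this document and may differ; the helper names, the `>=` thresholding rule, and the use of plain binary F1 (rather than the library's default 'macro' averaging) are all assumptions.

```python
import numpy as np

def f1_binary(y_true, y_prob, threshold=0.5):
    """Binary F1 after thresholding probabilities (hypothetical helper)."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def paired_permutation_test(y_true, prob_a, prob_b, metric,
                            n_permutations=1000, seed=0):
    """Two-sided paired permutation test on a metric difference (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    prob_a = np.asarray(prob_a, dtype=float)
    prob_b = np.asarray(prob_b, dtype=float)
    observed = metric(y_true, prob_a) - metric(y_true, prob_b)

    diffs = np.empty(n_permutations)
    for i in range(n_permutations):
        swap = rng.random(len(prob_a)) < 0.5          # per-sample coin flip
        perm_a = np.where(swap, prob_b, prob_a)       # swap the two models' outputs
        perm_b = np.where(swap, prob_a, prob_b)
        diffs[i] = metric(y_true, perm_a) - metric(y_true, perm_b)

    # Fraction of permuted differences at least as extreme as the observed one
    p_value = float(np.mean(np.abs(diffs) >= abs(observed)))
    return observed, p_value, diffs.mean(), diffs.std()

# Toy data, used only to exercise the sketch
labels = np.array([0, 1, 0, 0, 1, 1, 0, 1, 0])
model_a = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.3, 0.4, 0.7, 0.05])
model_b = np.array([0.2, 0.3, 0.25, 0.85, 0.15, 0.35, 0.45, 0.65, 0.01])
observed, p_value, perm_mean, perm_std = paired_permutation_test(
    labels, model_a, model_b, f1_binary
)
print(f"Observed difference: {observed:.5f}, p-value: {p_value:.5f}")
```

On this toy data both models produce the same predictions once thresholded at 0.5, so the observed difference is zero and every permutation is at least as extreme; the test correctly reports no evidence of a difference.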

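Conversion of AUC to Odds Ratio (OR)

The AUC2OR function converts an Area Under the Curve (AUC) value to an odds ratio (OR) and can optionally return the intermediate values t, z, d, and ln_OR. The conversion is useful for relating AUC, a common metric in binary classification, to OR, which is widely used in statistical analysis.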

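Parameters:

AUC : float
The Area Under the Curve (AUC) value to convert.
return_all : bool, default=False
If True, also returns the intermediate values (t, z, d, ln_OR).

Returns:

OR : float
The odds ratio (OR) calculated from the given AUC value.
t : float, optional
Intermediate value calculated from AUC.
z : float, optional
Intermediate value calculated from t.
d : float, optional
Intermediate value calculated from z.
ln_OR : float, optional
The natural logarithm of the odds ratio.

Example: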

from MLstatkit.stats import AUC2OR

AUC = 0.7  # Example AUC value

# Convert AUC to OR and retrieve all intermediate values
t, z, d, ln_OR, OR = AUC2OR(AUC, return_all=True)

print(f"t: {t:.5f}, z: {z:.5f}, d: {d:.5f}, ln_OR: {ln_OR:.5f}, OR: {OR:.5f}")

# Convert AUC to OR without intermediate values
OR = AUC2OR(AUC)
print(f"OR: {OR:.5f}")
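The chain from AUC to OR can be sketched with the commonly used relations z = Φ⁻¹(AUC), d = √2·z, and ln(OR) = π·d/√3. This is an assumption about the conversion, not a copy of MLstatkit's code: the function name `auc_to_or_sketch` is hypothetical, the exact inverse normal CDF from the standard library is used for z, and the `t` intermediate is omitted because this document does not say how it is defined, so the library's values may differ slightly.

```python
import math
from statistics import NormalDist

def auc_to_or_sketch(auc):
    """Convert AUC to (z, d, ln_or, or_) using standard effect-size relations."""
    if not 0.0 < auc < 1.0:
        raise ValueError("auc must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(auc)          # standard normal quantile of the AUC
    d = math.sqrt(2.0) * z                 # Cohen's d, from AUC = Phi(d / sqrt(2))
    ln_or = math.pi * d / math.sqrt(3.0)   # logistic approximation: ln(OR) = pi * d / sqrt(3)
    return z, d, ln_or, math.exp(ln_or)

z_val, d_val, ln_or_val, or_val = auc_to_or_sketch(0.7)
print(f"z: {z_val:.5f}, d: {d_val:.5f}, ln_OR: {ln_or_val:.5f}, OR: {or_val:.5f}")
```

An AUC of 0.5 maps to z = 0 and therefore OR = 1, which matches the intuition that a classifier no better than chance carries no association.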