zipml

Version 1.0.0

ZIPML is a lightweight AutoML library designed for small datasets. It provides essential helper functions such as train-test splitting, model comparison, and confusion matrix generation.

Features

Automated Model Training : Automatically train and compare machine learning models on your dataset.
Helper Functions :
- A train-test split function for easy data handling
- Confusion matrix generation, with the option to save the plot as a PNG
- Custom logging for better tracking of model performance
Model Comparison : Easily compare the performance of different models, with metrics and visual feedback.
CLI Support : Run machine learning tasks directly from the command line.
Extensible : Add your own models and customize the workflow as needed.
Visualization Tools : Includes tools for plotting model performance metrics, which helps in understanding model behavior.
Hyperparameter Tuning : Supports hyperparameter tuning to improve model performance.
Data Preprocessing : Built-in preprocessing steps for handling missing values, scaling, and encoding.
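
To make the preprocessing steps concrete (missing values, scaling, encoding), here is a minimal sketch written with plain scikit-learn. It does not use zipml's own preprocessing functions, whose signatures are not documented here; the column names and toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data: one numeric column with a gap, one categorical column.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Paris", "Rome", "Paris", "Oslo"],
})

# Numeric columns: fill missing values with the column mean, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode, ignoring categories unseen during fit.
preprocess = ColumnTransformer([
    ("num", numeric, ["age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])

features = preprocess.fit_transform(df)
# The result may be sparse depending on its density; densify for inspection.
features = features.toarray() if hasattr(features, "toarray") else np.asarray(features)
print(features.shape)
```

Wrapping the steps in a single transformer means the same imputation means, scaling factors, and category list learned on the training data are reused on the test data.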

Package Structure

zipml/
│
├── data/
│   ├── encoding.py
│   ├── file_operations.py
│   ├── split_data.py
│
├── model/
│   ├── analyze_model_predictions.py
│   ├── calculate_model_results.py
│   ├── measure_prediction_time.py
│
├── utils/
│   ├── calculate_sentence_length_percentile.py
│   ├── read_time_series_data.py
│
├── visualization/
│   ├── combine_and_plot_model_results.py
│   ├── plot_random_image.py
│   ├── plot_time_series.py
│   ├── save_and_plot_confusion_matrix.py
│
└── zipml.py

How to Use the `zipml` Package

The zipml package provides a range of utilities for preprocessing data, analyzing models, and visualizing results, all designed to simplify AI and machine learning workflows. Below is a guide to using some of its key functions.

1. Model Evaluation

analyze_model_predictions.py : Evaluates a model's predictions by comparing them with the true values, and returns detailed information on the predictions together with the most wrong predictions.
```
from zipml.model import analyze_model_predictions

val_df, most_wrong = analyze_model_predictions(best_model, X_test, y_test)
```
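
As an illustration of what "most wrong predictions" can mean, the sketch below ranks misclassified rows by how confident the model was in its wrong answer. This is an assumption about the idea, not zipml's implementation: the helper name `rank_wrong_predictions` and the column names are hypothetical, and it relies only on scikit-learn's `predict` and `predict_proba`.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split


def rank_wrong_predictions(model, X, y):
    """Return (all predictions, wrong predictions sorted by model confidence)."""
    proba = model.predict_proba(X)
    results = pd.DataFrame({
        "y_true": list(y),
        "y_pred": model.predict(X),
        "confidence": proba.max(axis=1),  # probability of the predicted class
    })
    wrong = results[results["y_true"] != results["y_pred"]]
    return results, wrong.sort_values("confidence", ascending=False)


# Synthetic data so the sketch runs on its own.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LogisticRegression().fit(X_train, y_train)

all_preds, most_wrong = rank_wrong_predictions(model, X_test, y_test)
print(most_wrong.head())
```

Rows at the top of `most_wrong` are the ones the model got wrong with the highest confidence, which often makes them the most useful to inspect by hand.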

Installation

Install the package via pip:

pip install zipml

Alternatively, clone the repository:

git clone https://github.com/abdozmantar/zipml.git
cd zipml
pip install .

Usage

Example Usage with Code

Here is a practical example of how to use ZIPML:

import pandas as pd
from zipml.model import analyze_model_predictions
from zipml.model import calculate_model_results
from zipml.visualization import save_and_plot_confusion_matrix
from zipml.data import split_data
from zipml import compare_models
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression


# Sample dataset
data = {
    'feature_1': [0.517, 0.648, 0.105, 0.331, 0.781, 0.026, 0.048],
    'feature_2': [0.202, 0.425, 0.643, 0.721, 0.646, 0.827, 0.303],
    'feature_3': [0.897, 0.579, 0.014, 0.167, 0.015, 0.358, 0.744],
    'feature_4': [0.457, 0.856, 0.376, 0.527, 0.648, 0.534, 0.047],
    'feature_5': [0.046, 0.118, 0.222, 0.001, 0.969, 0.239, 0.203],
    'target': [0, 1, 1, 1, 1, 1, 0]
}

# Creating DataFrame
df = pd.DataFrame(data)

# Splitting data into features (X) and target (y)
X = df.drop('target', axis=1)
y = df['target']

# Split the data into training and test sets
X_train, X_test, y_train, y_test = split_data(X, y)

# Define models
models = [
    RandomForestClassifier(),
    LogisticRegression(),
    GradientBoostingClassifier()
]

# Compare models and select the best one
best_model, performance = compare_models(models, X_train, X_test, y_train, y_test)
print(f"Best model: {best_model} with performance: {performance}")

# Calculate performance metrics for the best model
best_model_metrics = calculate_model_results(y_test, best_model.predict(X_test))

# Analyze model predictions
val_df, most_wrong = analyze_model_predictions(best_model, X_test, y_test)

# Save and plot confusion matrix
save_and_plot_confusion_matrix(y_test, best_model.predict(X_test), save_path="confusion_matrix.png")
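
For readers who want to see the idea behind model comparison without the library, the following is a minimal sketch using only scikit-learn. It is not the implementation of `compare_models`: the helper `pick_best_model` is hypothetical, and choosing test-set accuracy as the selection criterion is an assumption. The data is synthetic, since the seven-row sample above is only large enough to exercise the API.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def pick_best_model(models, X_train, X_test, y_train, y_test):
    """Fit every candidate and return (best_model, {model name: accuracy})."""
    scores = {}
    best_model, best_score = None, -1.0
    for model in models:
        model.fit(X_train, y_train)
        score = accuracy_score(y_test, model.predict(X_test))
        scores[type(model).__name__] = score
        if score > best_score:
            best_model, best_score = model, score
    return best_model, scores


X, y = make_classification(n_samples=300, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
candidates = [
    RandomForestClassifier(random_state=42),
    LogisticRegression(max_iter=1000),
]
best, scores = pick_best_model(candidates, X_train, X_test, y_train, y_test)
print(type(best).__name__, scores)
```

On small datasets a single train-test split gives noisy scores, so cross-validation (for example scikit-learn's `cross_val_score`) is a common alternative to the single split used here.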

CLI Usage

You can run ZIPML from the command line using the following commands:

Train a Single Model

zipml --train train.csv --test test.csv --model randomforest --result results.json

--train : Path to the training dataset CSV file
--test : Path to the test dataset CSV file
--model : Name of the model to train (e.g. randomforest , logisticregression , gradientboosting )
--result : Path to the JSON file where the results will be saved

Compare Multiple Models

zipml --train train.csv --test test.csv --compare --compare_models randomforest svc knn --result results.json

--compare : Flag that turns on comparison of multiple models
--compare_models : List of models to compare (e.g. randomforest , logisticregression , gradientboosting )
--result : Path to the JSON file where the comparison results will be saved
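
The compare command combines a boolean flag with a space-separated list of model names. The sketch below shows how such flags can be parsed with Python's standard `argparse` module. It is a hypothetical parser that mirrors the flags documented above, not zipml's actual CLI code; the program name `zipml-sketch` and the help texts are made up for illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented flags; not zipml's own code.
    parser = argparse.ArgumentParser(prog="zipml-sketch")
    parser.add_argument("--train", help="path to the training CSV")
    parser.add_argument("--test", help="path to the test CSV")
    parser.add_argument("--model", help="name of a single model to train")
    # store_true makes --compare a switch that takes no value.
    parser.add_argument("--compare", action="store_true",
                        help="compare several models")
    # nargs="+" collects one or more space-separated names into a list.
    parser.add_argument("--compare_models", nargs="+", default=[],
                        help="models to compare")
    parser.add_argument("--result", help="path to the output JSON file")
    return parser


args = build_parser().parse_args([
    "--train", "train.csv", "--test", "test.csv",
    "--compare", "--compare_models", "randomforest", "svc", "knn",
    "--result", "results.json",
])
print(args.compare, args.compare_models)
```

With `nargs="+"`, the list of model names ends at the next argument that starts with `--`, which is why `--result` can follow it directly.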

Load a Pre-trained Model and Make Predictions

zipml --load_model trained_model.pkl --test test.csv --result predictions.json

--load_model : Path to the saved model file
--test : Path to the test dataset CSV file
--result : Path to the JSON file where the predictions will be saved
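
The same save, load, and predict round trip can be sketched in Python. This assumes the `.pkl` file is a pickled scikit-learn estimator, which the extension suggests but this README does not confirm. The JSON layout with a single `predictions` key is also hypothetical.

```python
import json
import os
import pickle
import tempfile

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the first 80 rows train the model, the last 20 are held out.
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = RandomForestClassifier(n_estimators=10, random_state=0).fit(X[:80], y[:80])

with tempfile.TemporaryDirectory() as tmp:
    model_path = os.path.join(tmp, "trained_model.pkl")
    result_path = os.path.join(tmp, "predictions.json")

    # Save the fitted estimator.
    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    # Later, or in another process: load it back and predict on held-out rows.
    with open(model_path, "rb") as f:
        loaded = pickle.load(f)
    predictions = loaded.predict(X[80:]).tolist()

    # Hypothetical output layout: a single "predictions" list.
    with open(result_path, "w") as f:
        json.dump({"predictions": predictions}, f)

    with open(result_path) as f:
        saved = json.load(f)

print(len(saved["predictions"]))
```

Unpickling can execute arbitrary code, so only load model files that come from a source you trust.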

Save a Trained Model

To save a model after training:

zipml --train train.csv --test test.csv --model randomforest --save_model trained_model.pkl

--save_model : Path to the file where the trained model will be saved

Output

The output of the training and comparison commands includes a range of performance metrics such as accuracy, precision, recall, and F1 score.
Results are saved in JSON format, which makes them easy to inspect and analyze.
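
To show how these four metrics are computed and serialized, here is a small sketch using scikit-learn and the standard `json` module. The key names in the dictionary are an assumption and may not match the file zipml writes. `y_true` reuses the target column from the sample dataset, while `y_pred` is a hypothetical set of predictions.

```python
import json

from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 1, 1]  # hypothetical predictions with two mistakes

# Cast to float so the values are plain Python numbers before serializing.
metrics = {
    "accuracy": float(accuracy_score(y_true, y_pred)),
    "precision": float(precision_score(y_true, y_pred)),
    "recall": float(recall_score(y_true, y_pred)),
    "f1": float(f1_score(y_true, y_pred)),
}

report = json.dumps(metrics, indent=2)
print(report)
```

Here 5 of 7 predictions are correct, so accuracy is 5/7. Of the 5 rows predicted as class 1, 4 are truly class 1 (precision 0.8), and of the 5 rows that are truly class 1, 4 are found (recall 0.8), so the F1 score is also 0.8.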