ml_things下載ml_things源代碼下載

ml_things

其他源碼

v0.0.1:

下載

機器學習的東西

機器學習的東西是一個輕巧的Python庫，其中包含我在日常研究中使用機器學習，深度學習，NLP的功能和代碼片段。

我創建了此存儲庫，因為我已經厭倦了總是從較舊的項目中查找相同的代碼，並且想獲得一些構建Python庫的經驗。通過使每個人都可以使用它，它使我可以輕鬆地訪問我經常使用的代碼，並且可以幫助其他人的機器學習工作。如果您發現任何錯誤或某些內容沒有意義，請隨時打開問題。

那不是全部！該庫還包含Python代碼片段和筆記本，以加快我的機器學習工作流程。

筆記：

更新： Feb 5, 2022再次感謝您的支持和友善！現在可以在PYPI上使用此軟件包！ pip install ml-things
更新： July 16, 2021謝謝大家的支持和友善！當我發放時，我將將此存儲庫移至PIP安裝模塊。
如果我達到100星，我將發布第一個官方版本，並將其添加到PIP安裝模塊中！

ml_things ：
- 有關如何安裝ml_things的安裝詳細信息。
- 數組功能詳細信息ML_THINGS數組相關功能：
  - pad_array
  - batch_array
- 繪圖功能詳細信息ML_THINGS圖相關功能：
  - plot_array
  - plot_dict
  - plot_confusion_matrix
- 文本功能詳細信息ML_THINGS與文本相關功能：
  - clean_text
- Web相關的ML_THINGS WEB相關功能的詳細信息：
  - download_from
摘要：我經常使用的Python片段列表。
評論：示例我喜歡評論我的代碼。這仍然是一項正在進行的工作。
筆記本教程：我轉換為教程並在線發布的機器學習項目。
最後注意：感激不盡。

ml_things

安裝

該倉庫用Python 3.6+測試。

在虛擬環境中安裝ml_things總是一個好習慣。如果您使用Python的虛擬環境進行指導，則可以在此處查看“用戶指南”。

您可以使用GitHub的PIP安裝ml_things ：

pip install git+https://github.com/gmihaila/ml_things

或來自PYPI：

pip install ml-things

功能

ML_THINGS模塊中實現的所有功能。

數組功能

陣列操作相關的功能，在使用機器學習時可能很有用。

pad_array [來源]

PAD變量長度陣列到固定的Numpy陣列。它可以處理單個數組[1,2,3]或嵌套數組[[1,2]，[3]]。

默認情況下，它將padd zeros到檢測到的行的最大長度：

 > >> from ml_things import pad_array
> >> pad_array ( variable_length_array = [[ 1 , 2 ],[ 3 ],[ 4 , 5 , 6 ]])
array ([[ 1. , 2. , 0. ],
       [ 3. , 0. , 0. ],
       [ 4. , 5. , 6. ]])

它還可以粘貼到自定義大小和具有cusotm值：

 > >> pad_array ( variable_length_array = [[ 1 , 2 ],[ 3 ],[ 4 , 5 , 6 ]], fixed_length = 5 , pad_value = 99 )
array ([[ 1. ,  2. , 99. , 99. , 99. ],
       [ 3. , 99. , 99. , 99. , 99. ],
       [ 4. ,  5. ,  6. , 99. , 99. ]])

batch_array [來源]

將列表分成批處理/塊。最後一批大小是列表值的剩餘。注意：這也稱為塊。我稱其為批次，因為我在ML中使用了更多。

最後一批將是轉向值：

 > >> from ml_things import batch_array
> >> batch_array ( list_values = [ 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 8 , 9 , 8 , 6 , 5 , 4 , 6 ], batch_size = 4 )
[[ 1 , 2 , 3 , 4 ], [ 5 , 6 , 7 , 8 ], [ 8 , 9 , 8 , 6 ], [ 5 , 4 , 6 ]]

情節功能

繪圖相關的功能，在使用機器學習時可能很有用。

plot_array [來源]

從單個值數組中創建繪圖。

所有參數均針對快速圖進行了優化。 magnify參數更改為改變圖的大小：

 > >> from ml_things import plot_array
> >> plot_array ([ 1 , 3 , 5 , 3 , 7 , 5 , 8 , 10 ], path = 'plot_array.png' , magnify = 0.1 , use_title = 'A Random Plot' , start_step = 0.3 , step_size = 0.1 , points_values = True , use_ylabel = 'Thid' , use_xlabel = 'This' )

plot_array

plot_dict [源]

從單個值數組中創建繪圖。

所有參數均針對快速圖進行了優化。 magnify參數更改為改變圖的大小：

 > >> from ml_things import plot_dict
> >> plot_dict ({ 'train_acc' :[ 1 , 3 , 5 , 3 , 7 , 5 , 8 , 10 ],
                'valid_acc' :[ 4 , 8 , 9 ]}, use_linestyles = [ '-' , '--' ], magnify = 0.1 , 
                start_step = 0.3 , step_size = 0.1 , path = 'plot_dict.png' , points_values = [ True , False ], use_title = 'Title' )

plot_dict

plot_confusion_matrix [源]

此功能打印並繪製混亂矩陣。可以通過設置normalize=True應用歸一化。

所有參數均針對快速圖進行了優化。 magnify參數更改為改變圖的大小：

 > >> from ml_things import plot_confusion_matrix
> >> plot_confusion_matrix ( y_true = [ 1 , 0 , 1 , 1 , 0 , 1 ], y_pred = [ 0 , 1 , 1 , 1 , 0 , 1 ], magnify = 0.1 , use_title = 'My Confusion Matrix' , path = 'plot_confusion_matrix.png' );
Confusion matrix , without normalization
array ([[ 1 , 1 ],
       [ 1 , 3 ]])

plot_confusion_matrix

文本功能

與文本相關的功能，在使用機器學習時可能很有用。

clean_text [來源]

使用各種技術的清潔文本：

 > >> from ml_things import clean_text
> >> clean_text ( "ThIs is $$$%.  t t n \ so dirtyyy$$ text :'(.   omg!!!" , full_clean = True )
'this is so dirtyyy text omg'

網絡相關

Web相關的功能，在使用機器學習時可能很有用。

download_from [來源]

從URL下載文件。它將返回下載文件的路徑：

 > >> from ml_things import  download_from
> >> download_from ( url = 'https://raw.githubusercontent.com/gmihaila/ml_things/master/setup.py' , path = '.' )
'./setup.py'

摘要

這是沒有特定主題的各種Python片段。我將它們放在最常用的邏輯順序的同時，將它們放在最常用的訂單中。我喜歡讓它們盡可能簡單和高效。

姓名	描述
讀取文件	一個可以讀取任何文件的襯裡。
寫文件	一個將字符串寫入文件的襯裡。
偵錯	在此行之後開始調試。
PIP安裝github	使用`pip`直接從GitHub安裝庫。
解析論點	運行`.py`文件時給出的解析參數。
醫生	如何使用函數documentaiton運行簡單的UnitTESC。在需要時在筆記本內進行UNITSEST時有用。
修復文字	由於文本數據總是很混亂，因此我總是使用它。很好地修復了任何不良的Unicode。
當前日期	如何在Python中獲得當前日期。我在需要命名日誌文件時使用此。
當前時間	在Python獲得當前時間。
刪除標點符號	刪除Python3中標點符號的最快方法。
pytorch-dataset	有關如何創建Pytorch數據集的代碼示例。
pytorch設備	如何在Pytorch中設置設備以檢測是否可用GPU。

 # required import for variables type declaration
from typing import List , Optional , Tuple , Dict

def my_function ( function_argument : str , another_argument : Optional [ List [ int ]] = None ,
                another_argument_ : bool = True ) -> Dict [ str , int ]
       r"""Function/Class main comment. 

       More details with enough spacing to make it easy to follow.

       Arguments:
       
              function_argument (:obj:`str`):
                     A function argument description.
                     
              another_argument (:obj:`List[int]`, `optional`):
                     This argument is optional and it will have a None value attributed inside the function.
                     
              another_argument_ (:obj:`bool`, `optional`, defaults to :obj:`True`):
                     This argument is optional and it has a default value.
                     The variable name has `_` to avoid conflict with similar name.
                     
       Returns:
       
              :obj:`Dict[str: int]`: The function returns a dicitonary with string keys and int values.
                     A class will not have a return of course.

       """
       
       # make sure we keep out promise and return the variable type we described.
       return { 'argument' : function_argument }

筆記本教程

在這裡，我保留了一些以前的項目的筆記本，這些項目將它們變成了小教程。很多時候，我都將它們作為啟動新項目的基礎。

所有筆記本都在Google Colab中。從未聽說過Google Colab？？您必須查看Colaboratory，Colab和Python簡介的概述，以及我認為這是一篇很棒的媒介文章，才能像Pro一樣配置Google Colab。

如果您檢查/ml_things/notebooks/其中許多尚未在此處列出，因為它們尚未以“拋光”形式。這些筆記本足以與所有人分享：

姓名	描述	鏈接
？使用pytorchtext buccetiterator更好的批次	如何使用pytorchtext存儲材料對文本數據進行分類以更好地批處理。
？使用擁抱的臉部變壓器在Pytorch中的預處理變壓器模型	在您的自定義數據集上預處理67個變形金剛模型。
？使用擁抱的臉部變壓器，Pytorch中的微調變壓器	完整的教程有關如何微調文本分類的73個變壓器模型 - 無需更改代碼！
使用擁抱的臉部變壓器在Pytorch中的內部工作	完整有關輸入如何流過Bert的教程。
？ GPT2用於使用擁抱臉的文本分類？變壓器	有關如何使用GPT2進行文本分類的完整教程。