Markdrop

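A Python package for converting PDFs (or PDF URLs) to Markdown while extracting images and tables. Markdrop makes it easy to convert PDF documents to Markdown format while preserving their images and tables.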

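Features

PDF to Markdown conversion with formatting preservation, using Docling
Automatic image extraction with quality preservation, using XRef IDs
Table detection using Microsoft's Table Transformer
PDF URL support for the three features above
Descriptive text generation for any image file or folder of images
Optical character recognition (OCR) for images with embedded text
Enhanced support for structured output formats (e.g. JSON, YAML)
Support for multilingual PDFs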

Installation

pip install markdrop

https://pypi.org/project/markdrop
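
To confirm that the install succeeded, a minimal standard-library sketch follows. The helper name `installed_version` is hypothetical and not part of Markdrop; it only assumes the distribution is registered under the name `markdrop`, as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "markdrop") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"markdrop version: {found}" if found else "markdrop is not installed")
```

Checking the distribution metadata avoids importing the package itself, so the check stays fast even when heavy model dependencies are installed.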

Quick Start

from markdrop import extract_images, make_markdown, extract_tables_from_pdf

source_pdf = 'url/or/path/to/pdf/file'    # Replace with your local PDF file path or a URL
output_dir = 'data/output'                # Replace with the desired output directory path

make_markdown(source_pdf, output_dir)
extract_images(source_pdf, output_dir, verbose=True)
extract_tables_from_pdf(source_pdf, output_dir=output_dir)

from markdrop import setup_keys

# API key setup
# If you use 'openai' or 'gemini' as an llm_client in generate_descriptions, set up the API keys first.

setup_keys()

from markdrop import generate_descriptions

# Image description generation

prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."  # Replace with your desired prompt
input_path = 'path/to/img_file/or/dir'    # Replace with the path to an image file or a directory of images
output_dir = 'data/output'                # Replace with the desired output directory path
llm_clients = ['gemini', 'llama-vision']  # Choose only from ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']

generate_descriptions(input_path=input_path, output_dir=output_dir, prompt=prompt, llm_client=llm_clients)

API Reference

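make_markdown(source, output_dir, verbose=False)

Converts a PDF, given as a local path or a URL, to Markdown format.

Parameters:

source (str): Path to the input PDF, or its URL
output_dir (str): Output directory path
verbose (bool): Enable detailed logging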
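
Because `source` may be either a URL or a local path, it can help to validate it before starting a conversion. The sketch below is one way to do that; `is_url`, `check_source`, and `convert` are hypothetical helpers, not part of Markdrop. Creating the output directory up front is an assumption (Markdrop may already do this itself), and the only Markdrop call used is `make_markdown` with the signature documented above.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """True when source looks like an http(s) URL rather than a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_source(source: str) -> str:
    """Return 'url' or 'file'; raise if a local path does not exist."""
    if is_url(source):
        return "url"
    if not Path(source).is_file():
        raise FileNotFoundError(f"No such PDF file: {source}")
    return "file"


def convert(source: str, output_dir: str, verbose: bool = False) -> None:
    """Validate the inputs, create the output directory, then convert."""
    from markdrop import make_markdown  # lazy import; requires markdrop

    check_source(source)
    # Assumption: creating the directory ourselves is harmless if it exists.
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    make_markdown(source, output_dir, verbose=verbose)
```

Failing early on a missing file gives a clearer error than one raised deep inside the conversion pipeline.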

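extract_images(source, output_dir, verbose=False)

Extracts images from a PDF, given as a local path or a URL, while preserving their quality.

Parameters:

source (str): Path to the input PDF, or its URL
output_dir (str): Output directory path
verbose (bool): Enable detailed logging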

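extract_tables_from_pdf(pdf_path, **kwargs)

Detects tables in a PDF and extracts them as images.

Parameters:

pdf_path (str): Path to the input PDF, or its URL
start_page (int, optional): Starting page number
end_page (int, optional): Ending page number
threshold (float, optional): Detection confidence threshold
output_dir (str): Output directory path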
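
Since the optional settings are passed through `**kwargs`, one approach is to build the keyword dictionary explicitly and leave out anything that was not set, so the library's own defaults apply. The sketch below is illustrative only: `table_kwargs` and `extract_tables` are hypothetical helpers, the `[0, 1]` range for `threshold` is an assumption, and whether page numbers start at 0 or 1 is not stated in this document, so the check only enforces ordering.

```python
from typing import Any, Dict, Optional


def table_kwargs(
    output_dir: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Build keyword arguments, omitting options that were left unset."""
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("start_page must not exceed end_page")
    # Assumption: the threshold is a confidence score between 0 and 1.
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold is assumed to lie in [0, 1]")
    options = {"start_page": start_page, "end_page": end_page, "threshold": threshold}
    kwargs: Dict[str, Any] = {"output_dir": output_dir}
    kwargs.update({k: v for k, v in options.items() if v is not None})
    return kwargs


def extract_tables(pdf_path: str, **options: Any) -> Any:
    """Validate the options, then forward them to Markdrop."""
    from markdrop import extract_tables_from_pdf  # lazy import; requires markdrop

    return extract_tables_from_pdf(pdf_path, **table_kwargs(**options))
```

For example, `table_kwargs("data/output", start_page=1, end_page=3)` produces a dictionary with only `output_dir`, `start_page`, and `end_page`, leaving `threshold` to the library default.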

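generate_descriptions(input_path, output_dir, prompt, llm_client)

Generates descriptions of images, written to a CSV file, based on the given prompt and llm_client.

Supported LLM clients are ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral'].

Parameters:

input_path (str): Path to the input image file or directory of images
output_dir (str): Output directory path
prompt (str): Prompt sent to the model together with the image
llm_client (list): List of the LLM client names to use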
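
Because `llm_client` accepts only a fixed set of names, and 'openai' and 'gemini' need API keys set up first, a small pre-flight check can catch mistakes before any model is loaded. The sketch below is an illustration under those two stated facts; `normalize_clients` and `requires_keys` are hypothetical helpers, and lowercasing the names is an assumption about how Markdrop matches them.

```python
from typing import Iterable, List

SUPPORTED_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")
NEEDS_API_KEY = ("openai", "gemini")  # per the quick-start note on setup_keys


def normalize_clients(names: Iterable[str]) -> List[str]:
    """Lowercase, de-duplicate, and validate client names, keeping their order."""
    cleaned: List[str] = []
    for name in names:
        key = name.strip().lower()
        if key not in SUPPORTED_CLIENTS:
            raise ValueError(f"Unsupported llm_client: {name!r}")
        if key not in cleaned:
            cleaned.append(key)
    return cleaned


def requires_keys(clients: Iterable[str]) -> bool:
    """True when at least one selected client needs an API key."""
    return any(client in NEEDS_API_KEY for client in clients)
```

A caller could run `setup_keys()` only when `requires_keys(...)` is true, then pass the normalized list as `llm_client`.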

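analyze_pdf_images(source, output_dir, verbose=False)

Analyzes the different types of image references in a PDF, given as a local file or a URL.

Parameters:

source (str): Local PDF path, or URL of a PDF
output_dir (str): Directory for temporary files
verbose (bool): Print detailed information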

Contributing

We welcome contributions! Please see our contributing guidelines for details.

Development Setup

Clone the repository:

git clone https://github.com/shoryasethia/markdrop.git  
cd markdrop

Create a virtual environment:

python -m venv venv  
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the development dependencies:

pip install -r requirements.txt

Project Structure

markdrop/  
├── LICENSE  
├── README.md  
├── CONTRIBUTING.md  
├── CHANGELOG.md  
├── requirements.txt  
├── setup.py  
└── markdrop/ 
    ├── models/
    │   ├── .env
    │   ├── img_descriptions.py
    │   ├── logger.py
    │   ├── model_loader.py
    │   ├── responder.py
    │   └── setup_keys.py
    ├── __init__.py  
    ├── main.py  
    ├── utils.py  
    ├── helper.py
    └── ignore_warnings.py