Markdrop

Version: 1.0.0

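A Python package for converting PDFs (or PDF URLs) to Markdown while extracting images and tables. Markdrop makes it easy to convert PDF documents to Markdown format while preserving their images and tables.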

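Features

PDF to Markdown conversion with formatting preservation, using Docling
Automatic image extraction with quality preservation, using XRef IDs
Table detection using Microsoft's Table Transformer
PDF URL support for the three features above
Descriptive text generation for any image file or folder of images
Optical character recognition (OCR) for images with embedded text
Enhanced support for structured output formats (e.g. JSON, YAML)
Support for multilingual PDFs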

Installation

pip install markdrop

https://pypi.org/project/markdrop

Quick Start

from markdrop import extract_images, make_markdown, extract_tables_from_pdf

source_pdf = 'url/or/path/to/pdf/file'    # Replace with your local PDF file path or a URL
output_dir = 'data/output'                # Replace with the desired output directory path

# Convert the PDF to Markdown, then extract its images and tables
make_markdown(source_pdf, output_dir)
extract_images(source_pdf, output_dir, verbose=True)
extract_tables_from_pdf(source_pdf, output_dir=output_dir)
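Because the same `source_pdf` argument may be either a local path or a URL, it can help to check which one you have, and to make sure the output directory exists, before calling the functions above. The following is a minimal sketch using only the standard library; `is_pdf_url` and `prepare_output_dir` are hypothetical helper names, not part of markdrop, and markdrop may well handle both cases on its own.

```python
from pathlib import Path
from urllib.parse import urlparse


def is_pdf_url(source: str) -> bool:
    """Return True if `source` looks like an http(s) URL rather than a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def prepare_output_dir(output_dir: str) -> Path:
    """Create the output directory (and any parents) if missing, and return it."""
    path = Path(output_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

A caller could, for example, verify that a local file exists only when `is_pdf_url(source_pdf)` is false.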

from markdrop import setup_keys

# API key setup
# If you use 'openai' or 'gemini' as an llm_client in generate_descriptions, set up the API keys first.

setup_keys()

from markdrop import generate_descriptions

# Image description generation

prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."  # Replace with your desired prompt
input_path = 'path/to/img_file/or/dir'    # Replace with the path to an image file or a directory of images
output_dir = 'data/output'                # Replace with the desired output directory path
llm_clients = ['gemini', 'llama-vision']  # Choose only from ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']

generate_descriptions(input_path=input_path, output_dir=output_dir, prompt=prompt, llm_client=llm_clients)
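Since `input_path` can point at a single image or at a directory, it may be useful to preview which files would be picked up before sending anything to a model. The sketch below is an illustration only: `list_images` and the `IMAGE_SUFFIXES` set are assumptions made for this example, and markdrop's own file selection may differ.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to match your own files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".bmp"}


def list_images(input_path: str) -> list[Path]:
    """Return the image files found at `input_path` (a single file or a directory)."""
    path = Path(input_path)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_SUFFIXES else []
    if path.is_dir():
        # Only the top level of the directory is scanned, in sorted order.
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    return []
```

An empty result is a cheap signal that the path is wrong before any API calls are made.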

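API Reference

make_markdown(source, output_dir, verbose=False)

Converts a PDF, given as a local path or a URL, to Markdown format.

Parameters:

source (str): Path or URL of the input PDF
output_dir (str): Output directory path
verbose (bool): Enable detailed logging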
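To convert several PDFs in one go, a small wrapper can give each source its own output folder. This is a sketch under stated assumptions: `convert_all` is a hypothetical helper, and the converter is passed in as a callable taking `(source, output_dir)`, which matches the `make_markdown` signature documented above.

```python
from pathlib import Path
from typing import Callable, Iterable


def convert_all(sources: Iterable[str], output_root: str,
                convert: Callable[[str, str], object]) -> dict[str, Path]:
    """Call `convert(source, out_dir)` for each source, one output folder per source."""
    results = {}
    for index, source in enumerate(sources):
        # Name the folder after the file stem; the index prefix avoids collisions.
        stem = Path(source.split("?")[0].rstrip("/")).stem or "document"
        out_dir = Path(output_root) / f"{index:03d}_{stem}"
        out_dir.mkdir(parents=True, exist_ok=True)
        convert(source, str(out_dir))
        results[source] = out_dir
    return results
```

With markdrop installed, this could be used as `convert_all(my_sources, 'data/output', make_markdown)` after `from markdrop import make_markdown`. Passing the converter as a parameter also makes the wrapper easy to exercise with a stand-in function.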

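extract_images(source, output_dir, verbose=False)

Extracts images from a PDF, given as a local path or a URL, while preserving their quality.

Parameters:

source (str): Path or URL of the input PDF
output_dir (str): Output directory path
verbose (bool): Enable detailed logging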

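extract_tables_from_pdf(pdf_path, **kwargs)

Detects tables and extracts them as images.

Parameters:

pdf_path (str): Path or URL of the input PDF
start_page (int, optional): Starting page number
end_page (int, optional): Ending page number
threshold (float, optional): Detection confidence threshold
output_dir (str): Output directory path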
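Because the optional settings are passed as keyword arguments, one way to keep calls tidy is to build the keyword dictionary first and include only the values that were actually set. The sketch below assumes that `threshold` is a confidence between 0 and 1, which the text above does not state; `build_table_kwargs` is a hypothetical helper, and it makes no assumption about whether page numbers start at 0 or 1.

```python
from typing import Optional


def build_table_kwargs(output_dir: str,
                       start_page: Optional[int] = None,
                       end_page: Optional[int] = None,
                       threshold: Optional[float] = None) -> dict:
    """Validate the optional settings and return only those that were provided."""
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("start_page must not exceed end_page")
    # Assumption: the confidence threshold is expressed on a 0..1 scale.
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kwargs = {"output_dir": output_dir}
    optional = {"start_page": start_page, "end_page": end_page, "threshold": threshold}
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs
```

The result can then be unpacked into the call, for example `extract_tables_from_pdf(source_pdf, **build_table_kwargs('data/output', threshold=0.8))`.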

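generate_descriptions(input_path, output_dir, prompt, llm_client)

Generates descriptions of images, written to a CSV file, based on the given prompt and llm_client.

Supported LLM clients are ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral'].

Parameters:

input_path (str): Path to an image file or a directory of images
output_dir (str): Output directory path
prompt (str): Prompt sent to the model together with each image
llm_client (list): List of LLM client names to use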
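A typo in a client name is easy to make, so it can be worth checking the list against the supported names before calling `generate_descriptions`. The following sketch uses hypothetical helpers (`normalize_clients`, `requires_api_keys`); lowercasing the names is an assumption of this example, chosen to match the spelling used in the quick start.

```python
SUPPORTED_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")
# Per the quick start, these two clients need API keys to be set up first.
NEEDS_API_KEY = {"openai", "gemini"}


def normalize_clients(llm_clients):
    """Lowercase, validate, and de-duplicate client names, preserving their order."""
    normalized = []
    for name in llm_clients:
        key = name.strip().lower()
        if key not in SUPPORTED_CLIENTS:
            raise ValueError(f"unsupported llm client: {name!r}")
        if key not in normalized:
            normalized.append(key)
    return normalized


def requires_api_keys(clients):
    """Return True if any selected client needs API keys before use."""
    return any(client in NEEDS_API_KEY for client in clients)
```

If `requires_api_keys(...)` returns true, that is the cue to run `setup_keys()` before generating descriptions.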

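analyze_pdf_images(source, output_dir, verbose=False)

Analyzes the different types of image references in a PDF, given as a local file or a URL.

Parameters:

source (str): Local path or URL of the PDF
output_dir (str): Directory for temporary files
verbose (bool): Print detailed information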

Contributing

We welcome contributions! See our contribution guidelines for details.

Development Setup

Clone the repository:

git clone https://github.com/shoryasethia/markdrop.git  
cd markdrop

Create a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install the development dependencies:

pip install -r requirements.txt

Project Structure

markdrop/  
├── LICENSE  
├── README.md  
├── CONTRIBUTING.md  
├── CHANGELOG.md  
├── requirements.txt  
├── setup.py  
└── markdrop/
    ├── models/
    │   ├── .env
    │   ├── img_descriptions.py
    │   ├── logger.py
    │   ├── model_loader.py
    │   ├── responder.py
    │   └── setup_keys.py
    ├── __init__.py  
    ├── main.py  
    ├── utils.py  
    ├── helper.py
    └── ignore_warnings.py