markdrop

1.0.0

Markdrop

A Python package for converting PDFs (or PDF URLs) to Markdown while extracting images and tables. With Markdrop you can convert PDF documents to Markdown format and keep their images and tables.

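Features

PDF to Markdown conversion with formatting preservation using Docling
Automatic image extraction with quality preservation using XRef IDs
Table detection using Microsoft's Table Transformer
PDF URL support for all three features above
Text description generation for any image file or folder
Optical Character Recognition (OCR) for images with embedded text
Enhanced support for structured output formats (e.g. JSON, YAML)
Support for multilingual PDFs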

Installation

pip install markdrop

https://pypi.org/project/markdrop
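
After installing, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical and is not part of markdrop.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("markdrop")
    print(f"markdrop {version}" if version else "markdrop is not installed")
```

If this reports that markdrop is not installed, the most common cause is that `pip` and `python` point at different environments.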

Quick Start

from markdrop import extract_images, make_markdown, extract_tables_from_pdf

source_pdf = 'url/or/path/to/pdf/file'    # Replace with your local PDF file path or a URL
output_dir = 'data/output'                # Replace it with desired output directory's path

make_markdown(source_pdf, output_dir)
extract_images(source_pdf, output_dir, verbose=True)
extract_tables_from_pdf(source_pdf, output_dir=output_dir)
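
Because `source_pdf` may be either a local path or a URL, a small pre-check can catch a mistyped local path before any conversion starts. The sketch below is only an illustration: `classify_source` and `check_source` are hypothetical helpers, and treating only `http` and `https` as URLs is an assumption. How markdrop itself distinguishes the two cases is not described here.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return 'url' for http(s) sources and 'path' for everything else."""
    scheme = urlparse(source).scheme.lower()
    return "url" if scheme in ("http", "https") else "path"


def check_source(source: str) -> str:
    """Classify the source and fail early if a local path does not exist."""
    kind = classify_source(source)
    if kind == "path" and not Path(source).is_file():
        raise FileNotFoundError(f"No such PDF file: {source}")
    return kind
```

Calling `check_source(source_pdf)` before the three functions above is enough to use it; URLs are passed through without any network access.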

from markdrop import setup_keys

# API Key Setup
# If using 'openai' or 'gemini' as llm_client in the generate_descriptions function, you need to set up the API keys first.

setup_keys()

from markdrop import generate_descriptions

# Image Descriptions Generation

prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."  # Replace it with your desired prompt
input_path = 'path/to/img_file/or/dir'    # Replace it with the path to the images dir or image file
output_dir = 'data/output'                # Replace it with the desired output directory's path
llm_clients = ['gemini', 'llama-vision']  # Replace it with the desired models from ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral'] only

generate_descriptions(input_path=input_path, output_dir=output_dir, prompt=prompt, llm_client=llm_clients)
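
Since `llm_client` only accepts names from a fixed set, and `openai` and `gemini` need API keys, it may help to validate the list before starting a long run. This is a sketch under assumptions: `validate_llm_clients` and `clients_needing_keys` are hypothetical helpers, and lower-casing the names is an assumption based on the lowercase spellings used in the quick start.

```python
from typing import Iterable, List

# Client names as listed in the quick start above.
SUPPORTED_LLM_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")
# Clients that require setup_keys() first, per the API key note above.
NEEDS_API_KEY = ("openai", "gemini")


def validate_llm_clients(clients: Iterable[str]) -> List[str]:
    """Return the normalized client names, or raise if the list is unusable."""
    if isinstance(clients, str):
        raise TypeError("llm_client must be a list of names, not a single string")
    normalized = [name.strip().lower() for name in clients]
    if not normalized:
        raise ValueError("llm_client must contain at least one client")
    unknown = [name for name in normalized if name not in SUPPORTED_LLM_CLIENTS]
    if unknown:
        raise ValueError(f"Unsupported llm_client value(s): {unknown}")
    return normalized


def clients_needing_keys(clients: Iterable[str]) -> List[str]:
    """Return the subset of clients for which API keys must be configured."""
    return [name for name in validate_llm_clients(clients) if name in NEEDS_API_KEY]
```

For the list used above, `clients_needing_keys(['gemini', 'llama-vision'])` returns `['gemini']`, which signals that `setup_keys()` should be run first.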

API Reference

make_markdown(source, output_dir, verbose=False)

Converts a PDF, given as a local path or a URL, to Markdown format.

Parameters:

source (str): Path or URL of the input PDF
output_dir (str): Path of the output directory
verbose (bool): Enable detailed logging
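
When several PDFs need converting, one option is to give each document its own output directory. The sketch below takes the converter as an argument so it can be read and tested without markdrop installed; `output_dir_for` and `convert_all` are hypothetical helpers, and the only assumption about `make_markdown` is that it accepts `(source, output_dir)` positionally, as in the quick start.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable
from urllib.parse import urlparse


def output_dir_for(source: str, output_root: Path) -> Path:
    """Derive a per-document output directory from the source's file name."""
    name = Path(urlparse(source).path).stem or "document"
    return output_root / name


def convert_all(
    sources: Iterable[str],
    output_root: str,
    convert: Callable[[str, str], object],
) -> Dict[str, Path]:
    """Call convert(source, output_dir) once per source; return the directories used."""
    results: Dict[str, Path] = {}
    for source in sources:
        out = output_dir_for(source, Path(output_root))
        out.mkdir(parents=True, exist_ok=True)
        convert(source, str(out))
        results[source] = out
    return results
```

With markdrop installed, usage would look like `convert_all(['a.pdf', 'https://example.com/b.pdf'], 'data/output', make_markdown)`. Two sources with the same file name would share a directory, so a real script may want to disambiguate them.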

extract_images(source, output_dir, verbose=False)

Extracts images from a PDF, given as a local path or a URL, while preserving their quality.

Parameters:

source (str): Path or URL of the input PDF
output_dir (str): Path of the output directory
verbose (bool): Enable detailed logging
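
To see what an extraction run produced, the output directory can be summarized by file type. This sketch makes no assumption about the file names or subfolders markdrop uses; it simply walks whatever is under `output_dir`. The helper name `count_files_by_suffix` is hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import Dict


def count_files_by_suffix(output_dir: str) -> Dict[str, int]:
    """Count files under output_dir (recursively), keyed by lower-cased suffix."""
    counts = Counter(
        p.suffix.lower() for p in Path(output_dir).rglob("*") if p.is_file()
    )
    return dict(counts)
```

Running `count_files_by_suffix('data/output')` after `extract_images` gives a quick check that image files were actually written.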

extract_tables_from_pdf(pdf_path, **kwargs)

Detects tables and extracts them as images.

Parameters:

pdf_path (str): Path or URL of the input PDF
start_page (int, optional): Starting page number
end_page (int, optional): Ending page number
threshold (float, optional): Detection confidence threshold
output_dir (str): Path of the output directory
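
Because the optional settings are passed as keyword arguments, it can be convenient to assemble and sanity-check them in one place. The sketch below uses only the parameter names listed above; `table_kwargs` is a hypothetical helper, and the check that `threshold` lies between 0 and 1 is an assumption about how the confidence value is scaled. Whether page numbers start at 0 or 1 is not stated here, so the helper does not enforce either.

```python
from typing import Any, Dict, Optional


def table_kwargs(
    output_dir: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for extract_tables_from_pdf, omitting unset options."""
    if start_page is not None and end_page is not None and start_page > end_page:
        raise ValueError("start_page must not be greater than end_page")
    # Assumption: the detection threshold is a confidence score in [0, 1].
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kwargs: Dict[str, Any] = {"output_dir": output_dir}
    optional = (("start_page", start_page), ("end_page", end_page), ("threshold", threshold))
    for key, value in optional:
        if value is not None:
            kwargs[key] = value
    return kwargs
```

A call would then look like `extract_tables_from_pdf(source_pdf, **table_kwargs('data/output', start_page=2, end_page=4, threshold=0.8))`.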

generate_descriptions(input_path, output_dir, prompt, llm_client)

Generates descriptions for images in CSV format, based on the given prompt and llm_client.

Supported LLM clients are ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral'].

Parameters:

input_path (str): Path to the input image file or directory of images
output_dir (str): Path of the output directory
prompt (str): Prompt sent to the model together with the image
llm_client (list): List containing at least one of the supported LLM clients
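
Since `input_path` can point at either a single image or a folder, previewing which files would be picked up can avoid surprises before any model is called. This is an illustrative sketch only: `collect_images` is a hypothetical helper, and the suffix list is an assumption, not the set of formats markdrop actually accepts.

```python
from pathlib import Path
from typing import List

# Assumed set of image suffixes; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp"}


def collect_images(input_path: str) -> List[Path]:
    """Return the image files at input_path, which may be a file or a directory."""
    path = Path(input_path)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_SUFFIXES else []
    if path.is_dir():
        return sorted(
            p for p in path.iterdir()
            if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
        )
    raise FileNotFoundError(f"No such file or directory: {input_path}")
```

An empty result for a directory usually means the path points one level too high or too low.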

analyze_pdf_images(source, output_dir, verbose=False)

Analyzes the different types of image references in a PDF from a local file or a URL.

Parameters:

source (str): Local path or URL of the PDF
output_dir (str): Directory for temporary files
verbose (bool): Print detailed information

Contributing

We welcome contributions! Please see the contributing guidelines for details.

Development Setup

Clone the repository:

git clone https://github.com/shoryasethia/markdrop.git  
cd markdrop

Create a virtual environment:

python -m venv venv  
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install development dependencies:

pip install -r requirements.txt

Project Structure

markdrop/  
├── LICENSE  
├── README.md  
├── CONTRIBUTING.md  
├── CHANGELOG.md  
├── requirements.txt  
├── setup.py  
└── markdrop/ 
    ├── models/
    │   ├── .env
    │   ├── img_descriptions.py
    │   ├── logger.py
    │   ├── model_loader.py
    │   ├── responder.py
    │   └── setup_keys.py
    ├── __init__.py  
    ├── main.py  
    ├── utils.py  
    ├── helper.py
    └── ignore_warnings.py