Markdrop is a Python package that converts PDF documents (local files or URLs) to markdown while extracting and preserving their images and tables.

    pip install markdrop

PyPI: https://pypi.org/project/markdrop

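To confirm that the install step worked, a check along the following lines may help. It is a minimal sketch that uses only the standard library's `importlib.metadata`; `installed_version` is a hypothetical helper name and is not part of markdrop.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed.

    Hypothetical helper for illustration; not provided by markdrop.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("markdrop")
    if version is None:
        print("markdrop is not installed; run: pip install markdrop")
    else:
        print(f"markdrop {version} is installed")
```

The helper returns `None` rather than raising, so it can be used in a setup script that decides whether to install the package.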

    from markdrop import extract_images, make_markdown, extract_tables_from_pdf

    source_pdf = 'url/or/path/to/pdf/file'  # Replace with your local PDF file path or a URL
    output_dir = 'data/output'              # Replace with the desired output directory path

    make_markdown(source_pdf, output_dir)
    extract_images(source_pdf, output_dir, verbose=True)
    extract_tables_from_pdf(source_pdf, output_dir=output_dir)

    from markdrop import setup_keys

    # API Key Setup
    # If using 'openai' or 'gemini' as llm_client in the generate_descriptions function, you need to set up the API keys first.
    setup_keys()

    from markdrop import generate_descriptions

    # Image Descriptions Generation
    prompt = "Give textual highly detailed descriptions from this image ONLY, nothing else."  # Replace with your desired prompt
    input_path = 'path/to/img_file/or/dir'  # Replace with the path to an image file or a directory of images
    output_dir = 'data/output'  # Replace with the desired output directory path
    llm_clients = ['gemini', 'llama-vision']  # Choose only from ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral']

    generate_descriptions(input_path=input_path, output_dir=output_dir, prompt=prompt, llm_client=llm_clients)

`make_markdown`: Converts a PDF or its URL to markdown format.

Parameters:

- source (str): Path to input PDF or URL
- output_dir (str): Output directory path
- verbose (bool): Enable detailed logging

`extract_images`: Extracts images from a PDF or its URL while maintaining quality.

Parameters:

- source (str): Path to input PDF or URL
- output_dir (str): Output directory path
- verbose (bool): Enable detailed logging

`extract_tables_from_pdf`: Detects tables and extracts them as images.

Parameters:

- pdf_path (str): Path to input PDF or URL
- start_page (int, optional): Starting page number
- end_page (int, optional): Ending page number
- threshold (float, optional): Detection confidence threshold
- output_dir (str): Output directory path

`generate_descriptions`: Generates descriptions of the image(s), based on the given prompt and llm_client, and writes them to a CSV file.

Supported LLM clients are ['qwen', 'gemini', 'openai', 'llama-vision', 'molmo', 'pixtral'].
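
The supported-client list above, together with the API key note earlier in this README, can be turned into a small pre-flight check. The sketch below is not part of markdrop: `check_llm_clients` and `needs_api_keys` are hypothetical helper names, and the only facts it relies on are the client names listed above and the statement that 'openai' and 'gemini' need API keys.

```python
from typing import Iterable, List

# Client names copied from the supported list in this README.
SUPPORTED_LLM_CLIENTS = ("qwen", "gemini", "openai", "llama-vision", "molmo", "pixtral")

# Per the API key note in this README, these clients need setup_keys() first.
CLIENTS_NEEDING_API_KEYS = frozenset({"openai", "gemini"})


def check_llm_clients(llm_clients: Iterable[str]) -> List[str]:
    """Hypothetical helper: return the clients as a list, or raise on bad input."""
    if isinstance(llm_clients, str):
        # A bare string would otherwise be iterated character by character.
        raise TypeError("llm_client must be a list of names, not a single string")
    clients = list(llm_clients)
    if not clients:
        raise ValueError("llm_client must contain at least one model name")
    unknown = [name for name in clients if name not in SUPPORTED_LLM_CLIENTS]
    if unknown:
        raise ValueError(f"unsupported llm_client value(s): {unknown}")
    return clients


def needs_api_keys(llm_clients: Iterable[str]) -> bool:
    """Hypothetical helper: True if any requested client needs API keys."""
    return any(name in CLIENTS_NEEDING_API_KEYS for name in llm_clients)
```

Running `check_llm_clients(llm_clients)` before `generate_descriptions` surfaces a misspelled client name before any model is loaded, and `needs_api_keys(llm_clients)` indicates whether `setup_keys()` should be run first. The parameters of `generate_descriptions` itself follow.
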
Parameters:

- input_path (str): Path to an image file or a directory of images
- output_dir (str): Output directory path
- prompt (str): Prompt to be sent to the model along with the image
- llm_client (list): List containing at least one model from the supported LLM clients

Analyzes the different types of image references in a PDF from a local file or URL.

Parameters:

- source (str): Local PDF path or URL to PDF
- output_dir (str): Directory for temporary files
- verbose (bool): Print detailed information

We welcome contributions! Please see our Contributing Guidelines for details.

    git clone https://github.com/shoryasethia/markdrop.git
    cd markdrop
    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
    pip install -r requirements.txt

Project structure:

    markdrop/
    ├── LICENSE
    ├── README.md
    ├── CONTRIBUTING.md
    ├── CHANGELOG.md
    ├── requirements.txt
    ├── setup.py
    └── markdrop/
        ├── models/
        │   ├── .env
        │   ├── img_descriptions.py
        │   ├── logger.py
        │   ├── model_loader.py
        │   ├── responder.py
        │   └── setup_keys.py
        ├── __init__.py
        ├── main.py
        ├── utils.py
        ├── helper.py
        └── ignore_warnings.py

This project is licensed under the MIT License - see the LICENSE file for details.
See CHANGELOG.md for version history.
Please note that this project follows our Code of Conduct.