federleicht

v0.1.0

federleicht is a Python package providing a cache decorator for pandas.DataFrame, utilizing the lightweight and efficient pyarrow feather file format.

federleicht.cache_dataframe is designed to decorate functions that return pandas.DataFrame objects. The decorator saves the DataFrame to a feather file on the first call and loads it automatically on subsequent calls if the file exists.

Key Features

- Feather Integration: Save and load pandas.DataFrame effortlessly using the Feather format, known for its speed and simplicity.
- Decorator Simplicity: Add caching functionality to your functions with a single decorator line.
- Efficient Caching: Avoid redundant computations by reusing cached results.

Cache Expiry

To implement cache expiry, federleicht requires all arguments of the decorated function to be serializable. The cache will expire under the following conditions:

- Argument Sensitivity: The cache expires if the arguments (args / kwargs) of the decorated function change. When an os.PathLike object is passed as an argument, the cache also expires if the file's size or modification time changes.
- Code Change Detection: The cache expires if the implementation of the decorated function changes during development.
- Time-based Expiry: The cache expires when it is older than a given timedelta.

In addition to the immutable built-in data types, the following argument types are supported:
- os.PathLike
- pandas.DataFrame
- pandas.Series
- numpy.ndarray
- datetime.datetime
- types.FunctionType
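
The expiry conditions above can be pictured as a cache key plus an age check. The following is a rough sketch under stated assumptions, not the library's actual logic: the helpers describe, cache_key, and is_expired are hypothetical names, arguments are reduced to their repr, and the function's bytecode and constants stand in for "the code of the decorated function".

```python
import hashlib
import os
import pathlib
import tempfile
from datetime import datetime, timedelta


def describe(value):
    """Turn one argument into bytes; paths contribute size and mtime."""
    if isinstance(value, os.PathLike):
        stat = os.stat(value)
        return f"{os.fspath(value)}:{stat.st_size}:{stat.st_mtime_ns}".encode()
    return repr(value).encode()


def cache_key(func, *args, **kwargs):
    """Hypothetical key: hash of the function's code and its arguments."""
    digest = hashlib.md5(usedforsecurity=False)
    # Code change detection: bytecode and constants change with the code.
    digest.update(func.__code__.co_code)
    digest.update(repr(func.__code__.co_consts).encode())
    # Argument sensitivity: every positional and keyword argument counts.
    for value in args:
        digest.update(b"\0" + describe(value))
    for name in sorted(kwargs):
        digest.update(b"\0" + name.encode() + b"=" + describe(kwargs[name]))
    return digest.hexdigest()


def is_expired(cache_file, max_age, now=None):
    """Time-based expiry: True if the cache file is older than `max_age`."""
    now = now or datetime.now()
    modified = datetime.fromtimestamp(os.path.getmtime(cache_file))
    return now - modified > max_age


def load(path, nrows=None):
    return path, nrows


data_file = pathlib.Path(tempfile.mkdtemp()) / "data.csv"
data_file.write_text("a,b\n1,2\n")

key_before = cache_key(load, data_file, nrows=10)
key_same = cache_key(load, data_file, nrows=10)
key_other_args = cache_key(load, data_file, nrows=20)

data_file.write_text("a,b\n1,2\n3,4\n")  # the file grows, so its size changes
key_after = cache_key(load, data_file, nrows=10)

written = datetime.fromtimestamp(os.path.getmtime(data_file))
fresh = is_expired(data_file, timedelta(hours=1), now=written)
stale = is_expired(data_file, timedelta(hours=1), now=written + timedelta(hours=2))
```

In this sketch, a key that no longer matches any stored file simply means the function is run again and a new cache file is written.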

Installation

Install federleicht from PyPI:

pip install federleicht

By default, md5 is used to hash the arguments. For faster hashing, you can install xxhash as an optional dependency:

pip install federleicht[xxhash]
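
An optional dependency like this is commonly handled with an import fallback. The sketch below shows that general pattern only; the helpers new_hasher and hash_bytes are hypothetical, and how federleicht itself selects the hash function is not stated here.

```python
import hashlib

try:
    import xxhash  # optional third-party dependency
except ImportError:
    xxhash = None


def new_hasher():
    """Return an xxhash hasher if available, otherwise fall back to md5."""
    if xxhash is not None:
        return xxhash.xxh64()
    return hashlib.md5(usedforsecurity=False)


def hash_bytes(data: bytes) -> str:
    """Hash a byte string with whichever hasher is available."""
    hasher = new_hasher()
    hasher.update(data)
    return hasher.hexdigest()


digest_a = hash_bytes(b"args: (1, 2)")
digest_b = hash_bytes(b"args: (1, 3)")
```

Both hashers expose update and hexdigest, so the calling code does not need to know which one is in use.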

Usage

Here's a quick example:

import pandas as pd
from federleicht import cache_dataframe

@cache_dataframe
def generate_large_dataframe():
    # Simulate a heavy computation
    return pd.DataFrame({"col1": range(10000), "col2": range(10000)})

df = generate_large_dataframe()

Benchmark

file: Eartquakes-1990-2023.csv
size: 494.8 MB
lines: 3,445,752

The following functions are used to benchmark the performance of the cache_dataframe decorator.

def read_data(file: str, **kwargs) -> pd.DataFrame:
    """
    Read the earthquake dataset from a CSV file to benchmark the cache.

    Perform some data type conversions and return the DataFrame.
    """
    df = pd.read_csv(
        file,
        header=0,
        dtype={
            "status": "category",
            "tsunami": "boolean",
            "data_type": "category",
            "state": "category",
        },
        **kwargs,
    )

    df["time"] = pd.to_datetime(df["time"], unit="ms")
    df["date"] = pd.to_datetime(df["date"], format="mixed")

    return df

The returned pandas.DataFrame is cached in the .pandas_cache directory, without its attrs dictionary, and the cache will only expire if the file changes. For more details, see the Cache Expiry section.

import pathlib

@cache_dataframe
def read_cache(file: pathlib.Path, **kwargs) -> pd.DataFrame:
    return read_data(file, **kwargs)

Benchmark Results

Results depend strongly on the system configuration and the file system. The following results were obtained on:

OS: Windows
OS Version: 10.0.19044
Python: 3.11.9
CPU: AMD64 Family 23 Model 104 Stepping 1, AuthenticAMD

nrows	read_data [s]	build_cache [s]	read_cache [s]
10000	0.060	0.076	0.037
32170	0.172	0.193	0.033
103493	0.536	0.569	0.067
332943	1.658	1.791	0.143
1071093	5.383	5.465	0.366
3445752	16.750	17.720	1.141
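
To make the table easier to interpret, the speedup of a cached read over a plain CSV read, and the one-off cost of building the cache, can be computed from the reported timings. The numbers below are copied from the table; the variable names are only for illustration.

```python
# (nrows, read_data [s], build_cache [s], read_cache [s]) from the table above
rows = [
    (10000, 0.060, 0.076, 0.037),
    (32170, 0.172, 0.193, 0.033),
    (103493, 0.536, 0.569, 0.067),
    (332943, 1.658, 1.791, 0.143),
    (1071093, 5.383, 5.465, 0.366),
    (3445752, 16.750, 17.720, 1.141),
]

# Speedup: how many times faster the cached read is than reading the CSV.
speedups = {nrows: read / cached for nrows, read, _, cached in rows}

# Overhead: extra seconds spent writing the cache on the first call.
overheads = {nrows: build - read for nrows, read, build, _ in rows}

for nrows in speedups:
    print(f"{nrows:>8} rows: {speedups[nrows]:5.1f}x faster, "
          f"{overheads[nrows]:.3f} s to build the cache")
```

On this system, the benefit grows with file size: the cached read is under 2x faster for 10,000 rows and roughly 15x faster for the full file.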

Additional Information

Version v0.1.0
Type AI Source Code
Update Time 2025-08-30
size 123.55KB
From Github
