A software engineering framework to jump start your Machine Learning projects

Full documentation can be accessed here.
You can install lolpop from PyPI using pip:
    pip install lolpop

If you're working in dev mode, you can clone this repo and install lolpop by cd'ing into this directory and executing:

    poetry install

Welcome to lolpop!
lolpop is a software engineering framework for machine learning workflows.
The overarching goal is to provide a framework that helps unify data science and machine learning engineering teams. We believe that by establishing a standard framework for machine learning work, teams can collaborate more cleanly and be more productive.
Good system design is critical in software development, and lolpop tries to adhere to the following principles. A good system design will contain:
Furthermore, the following goals were kept in mind when building lolpop:
lolpop has a relatively flat conceptual model which contains three main resources to understand:
train_model. This method would know how to take incoming data, train a model or set of models, version those models, and return the winning model. This method would work across several components, such as a feature encoder, model trainer, hyperparameter tuner, metadata tracker, and resource version control system.

Components, pipelines, and runners have many common traits. We use the term integration when referring to the set of components, pipelines, and runners.
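To make the `train_model` description above more concrete, here is a minimal, self-contained sketch in plain Python. It does not use lolpop's actual classes: `FeatureEncoder`, `MeanModelTrainer`, `MetadataTracker`, `ModelVersioner`, and `TrainingPipeline` are hypothetical stand-ins, and the loop over candidate values plays the role of a hyperparameter tuner.

```python
class FeatureEncoder:
    """Hypothetical stand-in: rescales every feature value."""

    def transform(self, rows):
        return [[value / 10.0 for value in row] for row in rows]


class MeanModelTrainer:
    """Hypothetical stand-in: the 'model' predicts a shrunken target mean."""

    def fit(self, features, targets, shrinkage):
        # The features are ignored here to keep the sketch tiny.
        mean = sum(targets) / len(targets)
        return {"prediction": mean * shrinkage, "shrinkage": shrinkage}

    def score(self, model, targets):
        # Mean squared error of the constant prediction.
        return sum((t - model["prediction"]) ** 2 for t in targets) / len(targets)


class MetadataTracker:
    """Hypothetical stand-in: keeps metadata records in memory."""

    def __init__(self):
        self.records = []

    def log(self, **metadata):
        self.records.append(metadata)


class ModelVersioner:
    """Hypothetical stand-in: assigns increasing version numbers."""

    def __init__(self):
        self.versions = []

    def version(self, model):
        self.versions.append(model)
        return len(self.versions)


class TrainingPipeline:
    """A pipeline method coordinates several components behind one call."""

    def __init__(self, encoder, trainer, tracker, versioner, candidates):
        self.encoder = encoder
        self.trainer = trainer
        self.tracker = tracker
        self.versioner = versioner
        self.candidates = candidates  # hyperparameter values to try

    def train_model(self, rows, targets):
        features = self.encoder.transform(rows)
        best_model, best_error = None, float("inf")
        for shrinkage in self.candidates:
            model = self.trainer.fit(features, targets, shrinkage)
            error = self.trainer.score(model, targets)
            version = self.versioner.version(model)
            self.tracker.log(version=version, shrinkage=shrinkage, error=error)
            if error < best_error:
                best_model, best_error = model, error
        return best_model  # the winning model


pipeline = TrainingPipeline(
    FeatureEncoder(), MeanModelTrainer(), MetadataTracker(), ModelVersioner(),
    candidates=[0.5, 1.0],
)
winner = pipeline.train_model([[1, 2], [3, 4]], [2.0, 4.0])
```

The caller only sees `train_model`; which encoder, trainer, tracker, or versioner sits behind it can change without touching the calling code.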
There is also a natural hierarchy between components, pipelines, and runners:
lolpop has a straightforward development workflow. We hope all find it delightful to use!
First: write your own components or use pre-built ones:
    from lolpop.component import BaseComponent
    from catboost import CatBoostRegressor, CatBoostClassifier

    class CatboostModelTrainer(BaseComponent):

        def __init__(self, problem_type=None, params=None, *args, **kwargs):
            super().__init__(*args, **kwargs)
            params = params or {}  # avoid sharing a mutable default between instances
            # choose the CatBoost estimator that matches the problem type
            if problem_type == "classification":
                self.model = CatBoostClassifier(**params)
            elif problem_type == "regression":
                self.model = CatBoostRegressor(**params)

        def fit(self, data, *args, **kwargs):
            # data is expected to hold the training split under "X_train" and "y_train"
            self.model.fit(data["X_train"], data["y_train"])
            return self.model

        ...

Components can then be leveraged in pipeline and runner workflows. Instead of referring to specific component classes, these workflows are designed to use generic component types, as shown below.
    from lolpop.pipeline import BasePipeline

    class MyTrainingPipeline(BasePipeline):
        ...

        def train_model(self, data, *args, **kwargs):
            model = self.model_trainer.train_model(data)
            return model

        ...

We then configure which classes to use in our pipeline and runner configuration, as shown below:
    #runner config
    pipeline:
      train: MyTrainingPipeline
    ...

    #pipelines config
    train:
      component:
        model_trainer: CatBoostTrainer
      model_trainer:
        config:
          training_params:
            iterations: 2
            depth: 2
            learning_rate: 1
            loss_function: RMSE
    ...
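One way to picture how a configuration like the one above could be turned into live objects is sketched below. This is not lolpop's actual loading mechanism: `DummyTrainer`, `REGISTRY`, and `build_components` are hypothetical names, and the nesting of the dictionary is an assumption about how the YAML is laid out.

```python
class DummyTrainer:
    """Hypothetical component class that simply stores its configuration."""

    def __init__(self, training_params=None):
        self.training_params = dict(training_params or {})


# Hypothetical lookup table from configured class names to Python classes.
REGISTRY = {"DummyTrainer": DummyTrainer}

# Assumed in-memory form of a pipeline configuration after YAML parsing.
pipelines_config = {
    "train": {
        "component": {"model_trainer": "DummyTrainer"},
        "model_trainer": {
            "config": {"training_params": {"iterations": 2, "depth": 2}},
        },
    }
}


def build_components(pipeline_config, registry):
    """Instantiate each generic component type with its configured class."""
    components = {}
    for component_type, class_name in pipeline_config.get("component", {}).items():
        # Per-component settings live under the component type's own key.
        settings = pipeline_config.get(component_type, {}).get("config", {})
        components[component_type] = registry[class_name](**settings)
    return components


components = build_components(pipelines_config["train"], REGISTRY)
trainer = components["model_trainer"]
```

Because the pipeline code only refers to the generic type `model_trainer`, swapping in a different trainer would be a one-line change in the configuration rather than a code change.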
Finally, workflows can be invoked either via Python code:

    from lolpop.extension import MyRunner

    config_file = "/path/to/dev.yaml"

    runner = MyRunner(conf=config_file)

    ...

    model = runner.train.train_model(data)

    ...

or via the lolpop CLI:

    lolpop run workflow MyRunner --config-file /path/to/dev.yaml

If you're interested in building out your own workflows, it's a good idea to look into some of the provided examples and also look into the extensibility framework.
We've long felt that the ML ecosystem lacked a tool to act as the glue between all the various things that one needs to do in order to successfully execute a production use case. lolpop is an attempt to bridge that gap -- to be that glue. For more information regarding the inspiration behind lolpop, please read our launch blog.
Sometimes it's helpful to understand what a tool is not in order to fully understand what it is. The description 'software engineering framework for machine learning workflows' can be a little abstract, so it might be helpful to understand the following:
lolpop is not an orchestration tool. In fact, you should probably use an orchestrator to run code you create with lolpop. You should easily be able to integrate your orchestration tool of choice with lolpop.
lolpop is not a pipelining tool. There are several good pipelining tools out there, and you might even want to use them with lolpop. For example, we have an example of using Metaflow with lolpop, for those who are so inclined.
lolpop is not a metadata tracker, training platform, experiment tracker, etc. We think you should have and use those if you want to. lolpop will be happy to have those as components and let you build them into your workflows.
lolpop doesn't really do anything on its own; it mainly helps you write better ML workflows, faster. It's unopinionated about what tools you use to do that.
Quickstart: Go here for a quickstart guide. Learn how to install lolpop and get it up and running. Run your first workflow, dance, and celebrate!
User Guide: Go here to learn how to work with lolpop.
Integrations: Go here to learn about pre-built runners, pipelines, and components that you can use to build out your own workflows.
Extensions: Go here to learn all you need to do to start building your own runners, pipelines, and components.
CLI: Go here to learn how to use the lolpop command line interface.
Examples: Go here to find some examples of using lolpop.
Resources: Go here to find out how to get in touch with the lolpop team, how to contribute to lolpop, etc.