From notebook to pipeline in no time with LineaPy
Thomas Fraunholz
The nightmare before data science production: You found a working prototype for your problem using a Jupyter notebook, and now it's time to build a production-grade solution from that notebook. Unfortunately, your notebook looks anything but production-grade. The good news is, there's finally a cure!
The open-source Python package LineaPy aims to automate data science workflow generation and expedite the process of going from data science development to production. And truly, it transforms messy notebooks into data pipelines for frameworks like Apache Airflow, DVC, Argo, Kubeflow, and many more. And if you can't find your favorite orchestration framework, you are welcome to work with the creators of LineaPy to contribute a plugin for it!
In this talk, you will learn the basic concepts of LineaPy and how it supports your everyday tasks as a data practitioner. To this end, we will transform a notebook together, step by step, into a DVC pipeline. Finally, we will discuss what place LineaPy will take in the MLOps universe. Will you only have to check in your notebook in the future?
Thomas Fraunholz
Affiliation: WOGRA AG
Thomas has a great fondness for science. Strictly speaking, for numerics. After his doctorate, he went to the school of embedded programming. During this time he got to know and love DevOps. His enthusiasm for number crunching ultimately led him to the topic of artificial intelligence. He is currently in charge of publicly funded open-source research programs. When he’s not trying to convince his colleagues to use DVC, he’s busy with MLOps, CML and his low-budget bark beetle detection drone – once you’ve done embedded you just can’t get away from it.