PyData: PyData & Scientific Libraries Stack | PyConDE & PyData Berlin 2023

Talk

Apache Arrow: connecting and accelerating dataframe libraries across the PyData ecosystem

Joris Van den Bossche

Learn about recent developments in Apache Arrow and its adoption across the PyData ecosystem, how it connects and accelerates dataframe libraries, and how it can improve your day-to-day data analytics workflows.

Tutorial

Geospatial Data Processing with Python: A Comprehensive Tutorial

Martin Christen

Learn how to use Python to process geospatial data in this comprehensive tutorial! You'll gain hands-on experience with many Geo modules, learning how to read and write spatial data, perform coordinate system transformations, create interactive maps, and more.

Tutorial

Let's contribute to pandas (3 hours) #1

Noa Tamir, Patrick Hoefler

Join our beginner-friendly, mentored workshop on contributing to pandas (@pandas_dev) at PyData Berlin! 🥳 #opensource #pandas

Tutorial

Let's contribute to pandas (3 hours) #2

Noa Tamir, Patrick Hoefler

Join our beginner-friendly, mentored workshop on contributing to pandas (@pandas_dev) at PyData Berlin! 🥳 #opensource #pandas

Talk

Observability for Distributed Computing with Dask

Hendrik Makait

Debugging is hard. Distributed debugging is hell. Let’s dive into distributed logging, automated metrics, event-based monitoring, and root-causing problems with diagnostic tooling to understand how Dask helps you remain sane while identifying and solving your problems.

Talk

Pandas 2.0 and beyond

Joris Van den Bossche, Patrick Hoefler

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas project is working on.

Talk

Shrinking gigabyte sized scikit-learn models for deployment

Pavel Zwerschke, Yasin Tatar

This talk shows how to shrink gigabyte-sized scikit-learn models for deployment, using up to 6x less disk space.

Talk

The Beauty of Zarr

Sanket Verma

Hi all, I’ll be talking about Zarr, an open-source data format for storing chunked, compressed N-dimensional arrays, along with a hands-on session. If you work with huge datasets in local or cloud storage and are looking for an efficient format, please attend my talk. Thanks!
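
To make "chunked, compressed" concrete, here is a toy sketch using only the Python standard library. It is not Zarr's API or on-disk format: the function names, the in-memory dict store, and the "row.col" chunk keys are assumptions made for illustration. It only shows the general idea that an array is split into blocks, each block is compressed separately, and reading one element needs just the chunk that contains it.

```python
import struct
import zlib


def write_chunked(grid, chunk_rows, chunk_cols):
    """Split a 2-D list of floats into compressed chunks keyed by chunk index."""
    store = {}
    n_rows, n_cols = len(grid), len(grid[0])
    for r0 in range(0, n_rows, chunk_rows):
        for c0 in range(0, n_cols, chunk_cols):
            # Cut out one rectangular block (edge blocks may be smaller).
            block = [row[c0:c0 + chunk_cols] for row in grid[r0:r0 + chunk_rows]]
            flat = [v for row in block for v in row]
            raw = struct.pack(f"<{len(flat)}d", *flat)
            key = f"{r0 // chunk_rows}.{c0 // chunk_cols}"  # hypothetical key scheme
            store[key] = (len(block), len(block[0]), zlib.compress(raw))
    return store


def read_value(store, row, col, chunk_rows, chunk_cols):
    """Read one element by decompressing only the chunk that contains it."""
    key = f"{row // chunk_rows}.{col // chunk_cols}"
    n_r, n_c, payload = store[key]
    flat = struct.unpack(f"<{n_r * n_c}d", zlib.decompress(payload))
    return flat[(row % chunk_rows) * n_c + (col % chunk_cols)]


# A 100 x 100 grid stored as sixteen 25 x 25 chunks.
grid = [[float(r * 100 + c) for c in range(100)] for r in range(100)]
store = write_chunked(grid, chunk_rows=25, chunk_cols=25)
value = read_value(store, 60, 42, 25, 25)
```

Because each chunk is an independent compressed blob, the same layout works whether the store is a dict, a directory of files, or keys in an object store.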

Talk

Unlocking Information - Creating Synthetic Data for Open Access.

Antonia Scherz

A lot of data is private, but this talk is not: learn how to synthesize anonymized, reliable data from sensitive, private data.

Talk

You've got trust issues, we've got solutions: Differential Privacy

Vikram Waradpande, Sarthika Dhawan

What if I told you that I could answer everything about you without knowing you, using Differential Privacy?
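
As a rough illustration of the idea, the sketch below applies the Laplace mechanism, one standard differential privacy technique, to a counting query. It is not taken from the talk: the function names, the `ages` data, and the choice of `epsilon` are all hypothetical. The point it shows is that an aggregate question can be answered with calibrated noise so that no single person's record noticeably changes the result.

```python
import random


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records matching the predicate.

    Adding or removing one record changes a count by at most 1, so the
    sensitivity is 1 and the Laplace noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical data: how many people are 40 or older?
ages = [23, 35, 41, 29, 52, 67, 38, 45]
rng = random.Random(0)  # seeded only to make the sketch reproducible
noisy = private_count(ages, lambda age: age >= 40, epsilon=1.0, rng=rng)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one gives answers closer to the true count.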