Accelerate Python with Julia
Stephan Sahm

You want to accelerate your Python code, but going C is too tedious? Julia is a fresh alternative which flows like Python and runs like C. Join this tutorial to learn how to use Julia to easily speed up Python.

Ask-A-Question: an FAQ-answering service for when there's little to no data
Suzin You

Doing data science in international development often means dealing with more resource constraints. This talk will walk you through Ask-A-Question, a simple FAQ-answering service for when there's little to no data that we built for WhatsApp helplines for public health.

Behind the Scenes of tox: The Journey of Rewriting a Python Tool with more than 10 Million Monthly Downloads
Jürgen Gmach

Behind the Scenes of tox: The Journey of Rewriting a Python Tool with Over 10 Million Monthly Downloads

Common issues with Time Series data and how to solve them
Vadim Nelidov

Handling time series data is an important but difficult task. After this talk you will be able to identify, understand, and resolve time series issues such as divergence, delayed data, time series imputation and the impact of outliers.

Contributing to an open-source content library for NLP
Leonard Püttmann

Learn to build amazing open-source enrichments for natural language processing!

Cooking up a ML Platform: Growing pains and lessons learned
Cole Bailey

What is an ML platform and do you even need one? When should you consider investing in your own ML platform? What challenges can you expect building and maintaining one? Join my talk at PyData to hear how we are cooking up our own ML platform at @Delivery Hero!

Data Kata: Ensemble programming with Pydantic #1
Lev Konstantinovskiy, Gregor Riegler, Nitsan Avni

Write code as an ensemble to solve a data validation problem using Pydantic. Working together is not just about code - we will see what it is like to listen to colleagues, make typos in front of everyone, become a supportive team member, defend our ideas and maybe even accept criticism.

Data Kata: Ensemble programming with Pydantic #2
Lev Konstantinovskiy, Gregor Riegler, Nitsan Avni

Write code as an ensemble to solve a data validation problem with Pydantic. Working together is not just about code - we will see what it is like to listen to colleagues, make typos in front of everyone, become a supportive team member, defend our ideas and maybe even accept criticism.

Data-driven design for the Dask scheduler
Guido Imperiale

Historically, changes in the scheduling algorithm of Dask have often been based on theory, single use cases, or even gut feeling. Coiled has now moved to using hard, comprehensive performance metrics for all changes - and it's been a turning point!

Enabling Machine Learning: How to Optimize Infrastructure, Tools and Teams for ML Workflows
Yann Lemonnier

Join us for a deep dive into machine learning enabler engineering! Discover how to optimize infrastructure, tools and teams for ML workflows and reduce time to deployment. Get practical tips and insights for successful projects from an expert in the field. #MLenabler #optimizingM

Fear the mutants. Love the mutants.
Max Kahan

Developers often use code coverage as a target, which makes it a bad measure of test quality. Mutation testing changes the game: use your code to create mutants that break your tests, and you'll quickly start to write better tests! Come and learn to use it in your CI/CD process.

Have your cake and eat it too: Rapid model development and stable, high-performance deployments
Christian Bourjau, Jakub Bachurski

Python's data science tools are fantastic for data exploration and model development, but the price is often a slow and difficult deployment. Join us to find out what tools we have developed to have the cake and eat it too!

How Python enables future computer chips
Tim Hoffmann

Learn how we adopted Python to build the computer chips of the future

How to baseline in NLP and where to go from there
Tobias Sterbak

Join us for a talk on baselines in NLP! We'll cover common tasks like classification, clustering, search, and NER, and discuss how to establish and improve baselines using weak learning. Don't miss out on this opportunity to gain a deeper understanding of NLP baselines!

How to increase diversity in open source communities
Maren Westermann

Learn about strategies for increasing diversity in #opensource projects presented by @MarenWestermann

Hyperparameter optimization for the impatient
Martin Wistuba

HPO does not need to be expensive: see how to speed it up with a couple of simple algorithms.

Improving Machine Learning from Human Feedback
Erin Mikail Staples, Nikolai

While powerful, models built off large datasets like GPT-3 often bring their biases along with them. However, is this the best future for machine learning? Join us to explore Reinforcement Learning from Human Feedback (RLHF) techniques and why they matter more now than ever.

Introducing FastKafka
Tvrtko Sternak

Don't miss our talk on FastKafka, a Python library for easy Kafka communication! #PyCon #Kafka #FastKafka

Keynote - A journey through 4 industries with Python: Python's versatile problem-solving toolkit
Susan Shu Chang

Susan, Principal Data Scientist at Elastic, shares her experiences with Python in 4 industries, from telecom to gaming and beyond.

Keynote - Lorem ipsum dolor sit amet
Miroslav Šedivý

A randomly real and a really random journey to discover the balance between real and random data!

Keynote - Towards Learned Database Systems
Carsten Binnig

ML and DBMSs? Carsten talks about data-driven learning where the idea is to learn the data distribution over a complex relational schema.

Maps with Django
Paolo Melchiorre

Keeping in mind the Pythonic principle that simple is better than complex, we'll see how to create a web map with the Python-based web framework Django using its GeoDjango module, storing geographic data in your local database on which to run geospatial queries.

Performing Root Cause Analysis with DoWhy, a Causal Machine-Learning Library
Patrick Blöbaum

Learn how to use the Python DoWhy library to perform root cause analysis using methods of causal machine-learning.

Postmodern Architecture: The Python Powered Modern Data Stack
John Sandall

Learn how to upgrade your pandas pipelines powering DAG workflows to a Python Powered Modern Data Stack, demystify the jargon from ETL to ELT, and see how tools like dbt can integrate with Python to change how data pipelines are built and maintained.

Prompt Engineering 101: Beginner intro to LangChain, the shovel of our ChatGPT gold rush.
Lev Konstantinovskiy

"A modern AI start-up is a front-end developer plus a prompt engineer" is a popular joke on Twitter. This talk is about LangChain, a Python open-source tool for prompt engineering.

Software Design Pattern for Data Science
Theodore Meynard

I will share some specific software design concepts that can be used by data scientists to build better data products.

Staying Alert: How to Implement Continuous Testing for Machine Learning Models
Emeli Dral

ML monitoring might be easy for a single model, but hard at scale. In this talk, I will introduce the idea of test-based monitoring, and how to standardize data and model checks across models and lifecycle.

Teaching Neural Networks a Sense of Geometry
Jens Agerberg

By taking neural networks back to the school bench and teaching them some elements of geometry and topology we can build algorithms that can reason about the shape of data. This is the promise of the emerging field of Topological Data Analysis (TDA) which we will introduce!

The bumps in the road: A retrospective on my data visualisation mistakes
Artem Kislovskiy

Join us for a talk: The bumps in the road: A retrospective on my data visualisation mistakes, on data visualisation and how it's essential for conveying insights from data. We'll discuss best practices with Matplotlib, the limitations of static visualisations, and how CI can stre

The CPU in your browser: WebAssembly demystified
Antonio Cuni

WebAssembly is essentially a virtual and efficient CPU embedded in your browser. Let's see what it is!

The State of Production Machine Learning in 2023
Alejandro Saucedo

Join us at the PyCon DE conference to learn about the current state of production machine learning in the Python ecosystem! We'll cover key principles, frameworks for end-to-end ML lifecycle, best practices, and recommended tools for deployment, security, and scaling.

WALD: A Modern & Sustainable Analytics Stack
Florian Wilhelm

WALD: A modern & sustainable analytics stack consisting of a warehouse like Snowflake or BigQuery, Airbyte, Lightdash and dbt.

Why GPU Clusters Don't Need to Go Brrr? Leverage Compound Sparsity to Achieve the Fastest Inference Performance on CPUs
Damian Bogunowicz

Fun fact: you can remove 90% of a neural network's weights without losing much accuracy! With model sparsity, you can even run these networks on your CPU with GPU-level performance. Learn about compound sparsity (pruning, quantization, knowledge distillation) for faster inference

You are what you read: Building a personal internet front-page with spaCy and Prodigy
Victoria Slocum

The internet can be overwhelming, so I made a tool to create a personalized summary of it! Through building this internet front-page project, I've learned how the design concepts of tools like spaCy and Prodigy can facilitate the development of both complex and simple software.

You've got trust issues, we've got solutions: Differential Privacy
Vikram Waradpande, Sarthika Dhawan

What if I told you I could answer everything about you without knowing you, using Differential Privacy?
