5 Things about fastAPI I wish we had known beforehand
Alexander CS Hendorf

5 Things about fastAPI I wish we had known beforehand - An opinionated talk about fastAPI in practice.

A concrete guide to time-series databases with Python
Heiner Tholen, Ellen König

A concrete guide to time-series databases with Python - how to choose the right time-series database for your application.

Accelerate Python with Julia
Stephan Sahm

You want to accelerate your Python code, but going C is too tedious? Julia is a fresh alternative which flows like Python and runs like C. Join this tutorial to learn how to use Julia to easily speed up Python.

Accelerating Public Consultations with Large Language Models: A Case Study from the UK Planning Inspectorate
Michele Dallachiesa, Andreas Leed

New study shows Large Language Models can accelerate public consultations by streamlining the analysis process of representations for Local Plans. Results show the potential for 30% faster analysis time and up to 90% classification accuracy #AI #NLP #DataScience #pyconde @PINSgov

Accelerating Python Code
Jens Nie

Struggling to get your Python simulation prototype to production because you think it's too slow? Let's speed it up using #PyPy, #numpy, #numba and friends.

Actionable Machine Learning in the Browser with PyScript
Valerio Maggio

Interactive ML apps in the browser with zero installation and no server needed? Come to my talk to find out how.

Advanced Visual Search Engine with Self-Supervised Learning (SSL) Representations and Milvus
Antoine Toubhans, Noé Achache

Building a Visual Search Engine with Milvus and comparing supervised and self-supervised approaches for image representations

An unbiased evaluation of environment management and packaging tools
Anna-Lena Popkes

Python packaging is quickly evolving and new tools pop up on a regular basis. Lots of talks and posts on packaging exist but none of them give a structured, unbiased overview of the available tools. Let's change this!

Apache Arrow: connecting and accelerating dataframe libraries across the PyData ecosystem
Joris Van den Bossche

Connecting and accelerating dataframe libraries across the PyData ecosystem with Apache Arrow. Learn about the recent developments in Arrow and its adoption, and how it can improve your day-to-day data analytics workflows.

Apache StreamPipes for Pythonistas: IIoT data handling made easy!
Tim Bossenmaier, Sven Oehler

Data enthusiasts love to play with IIoT data. However, the technical challenges remain high (e.g., connecting to devices). @StreamPipes makes this easy by providing a self-service toolbox. In this talk, we introduce a new Python module to work with IIoT data in a Pythonic way.

Ask-A-Question: an FAQ-answering service for when there's little to no data
Suzin You

Doing data science in international development often means dealing with more resource constraints. This talk will walk you through Ask-A-Question, a simple FAQ-answering service for when there's little to no data that we built for WhatsApp helplines for public health.

Aspect-oriented Programming - Diving deep into Decorators
Mike Müller

Effectively programming cross-cutting tasks with decorators - Code re-use via the @ symbol

AutoGluon: AutoML for Tabular, Multimodal and Time Series Data
Caner Turkmen, Oleksandr Shchur

Learn about #AutoML and @AutoGluon, which can handle a range of tasks from regression to image classification and time series forecasting with state-of-the-art performance. #AutoML #datascience

Bayesian Marketing Science: Solving Marketing's 3 Biggest Problems
Dr. Thomas Wiecki

A Bayesian modeling toolkit to solve today's biggest marketing challenges.

Behind the Scenes of tox: The Journey of Rewriting a Python Tool with more than 10 Million Monthly Downloads
Jürgen Gmach

Behind the Scenes of tox: The Journey of Rewriting a Python Tool with Over 10 Million Monthly Downloads

BHAD: Explainable unsupervised anomaly detection using Bayesian histograms
Alexander Vosseler

We present a Bayesian histogram anomaly detector (BHAD). BHAD scales linearly with the size of the data and allows a direct explanation of individual anomaly scores due to its simple linear form.

BLE and Python: How to build a simple BLE project on Linux with Python
Bruno Vollmer

Learn what BLE is and how to use it with Python. @bvollmer5 shows in this talk how you can easily build a Linux-based BLE server for your next project.

Bringing NLP to Production (an end to end story about some multi-language NLP services)
Larissa Haas, Jonathan Brandt

How to bring NLP models to production? Following a use case that runs for over 1 year in 10 different languages, this talk will enable you to ask the right questions before starting to deploy NLP services.

Building a Personal Assistant With GPT and Haystack: How to Feed Facts to Large Language Models and Reduce Hallucination.
Mathis Lucka

Building a Personal Assistant With GPT and Haystack: How to Feed Facts to Large Language Models and Reduce Hallucination.

Building Hexagonal Python Services
Shahriyar Rzayev

Building Hexagonal Python Services from scratch using Repository, Unit of Work and Use Cases patterns

Cloud Infrastructure From Python Code: How Far Could We Go?
Etzik Bega, Asher Sterkin

Why is Infrastructure as Code not enough, and what needs to be done to make Python a truly cloud-native programming language?

Code Cleanup: A Data Scientist's Guide to Sparkling Code
Corrie Bartelheimer

Does your production code look like it’s been copied from Untitled12.ipynb? Are your engineers complaining about the code but nobody got time to clean things up? Check out this talk to learn some of the basics of clean coding and how to implement them in a data science team.

Common issues with Time Series data and how to solve them
Vadim Nelidov

Handling time series data is an important yet not easy task. After this talk you will be able to identify, understand, and resolve time series issues such as divergence, delayed data, time series imputation and impact of outliers.

Contributing to an open-source content library for NLP
Leonard Püttmann

Learn to build amazing open-source enrichments for natural language processing!

Cooking up a ML Platform: Growing pains and lessons learned
Cole Bailey

What is a ML platform and do you even need one? When should you consider investing in your own ML platform? What challenges can you expect building and maintaining one? Join my talk at PyData to hear how we are cooking up our own ML platform at @Delivery Hero!

Create interactive Jupyter websites with JupyterLite
Jeremy Tuloup

Do you want to create your own interactive Jupyter website with JupyterLite? Check out this step-by-step tutorial and learn how to configure and customize your website 💡

Data Kata: Ensemble programming with Pydantic #1
Lev Konstantinovskiy, Gregor Riegler, Nitsan Avni

Write code as an ensemble to solve a data validation problem using Pydantic. Working together is not just about code - we will see what it is like to listen to colleagues, make typos in front of everyone, become a supportive team member, defend our ideas and maybe even accept criticism.

Data Kata: Ensemble programming with Pydantic #2
Lev Konstantinovskiy, Gregor Riegler, Nitsan Avni

Write code as an ensemble to solve a data validation problem with Pydantic. Working together is not just about code - we will see what it is like to listen to colleagues, make typos in front of everyone, become a supportive team member, defend our ideas and maybe even accept criticism.

Data-driven design for the Dask scheduler
Guido Imperiale

Historically, changes in the scheduling algorithm of Dask have often been based on theory, single use cases, or even gut feeling. Coiled has now moved to using hard, comprehensive performance metrics for all changes - and it's been a turning point!

Delivering AI at Scale
Severin Schmitt, Anna Achenbach, Thorsten Kranz

“Unbelievable tricks to integrate AI into a company with 600k colleagues – experts are shocked!” Learn about Deutsche Post DHL Group’s journey towards a Data-Driven company, with Use Cases, technology details and code snippets #yournextcareerstep #ai #datascience #forecasting

Driving down the Memray lane - Profiling your data science work
Cheuk Ting Ho

You should profile your data science work. In this talk, we will introduce Memray and its new Jupyter plugin.

Dynamic pricing at Flix
Amit Verma

How Flixbus designed a dynamic pricing strategy according to market demands

Enabling Machine Learning: How to Optimize Infrastructure, Tools and Teams for ML Workflows
Yann Lemonnier

Join us for a deep dive into machine learning enabler engineering! Discover how to optimize infrastructure, tools and teams for ML workflows and reduce time to deployment. Get practical tips and insights for successful projects from an expert in the field. #MLenabler #optimizingM

evosax: JAX-Based Evolution Strategies
Robert Lange

Tired of having to handle asynchronous processes for neuroevolution? Do you want to leverage high-throughput accelerators for evolution strategies (ES)? evosax allows you to leverage JAX, XLA compilation & auto-vectorization/parallelization to scale ES to accelerators.

Exploring the Power of Cyclic Boosting: A Pure-Python, Explainable, and Efficient ML Method
Felix Wick

We just open-sourced Cyclic Boosting, a pure-Python ML algorithm that's explainable, accurate, robust, easy to use, and fast! Learn more in our presentation #CyclicBoosting #MachineLearning #OpenSource

FastAPI and Celery: Building Reliable Web Applications with TDD
Avanindra Kumar Pandeya

Build reliable and maintainable APIs with FastAPI and Celery using test-driven development (TDD)! Learn how to set up a testing environment, write unit and integration tests, and use mocks and fixtures to isolate and control the tests.

Fear the mutants. Love the mutants.
Max Kahan

Developers often use code coverage as a target, which makes it a bad measure of test quality. Mutation testing changes the game: use your code to create mutants that break your tests, and you'll quickly start to write better tests! Come and learn to use it in your CI/CD process.

From notebook to pipeline in no time with LineaPy
Thomas Fraunholz

The nightmare before data science production: You found a working prototype for your problem using a Jupyter notebook and now it's time to build a production grade solution from that notebook. The good news is, there's finally a cure: The open-source Python package LineaPy!

Geospatial Data Processing with Python: A Comprehensive Tutorial
Martin Christen

Learn how to use Python to process geospatial data in this comprehensive tutorial! You'll gain hands-on experience with many Geo modules, learning how to read and write spatial data, perform coordinate system transformations, create interactive maps, and more.

Getting started with JAX
Simon Pressler

Getting Started with JAX! Hands-on tips to overcome your first hurdles.

Giving and Receiving Great Feedback through PRs
David Andersson

Do you struggle with PRs? Have you ever had to change code even though you disagreed with the change? Have you ever given feedback only to get into a comment war? We'll discuss how to give and receive feedback optimally without the communication problems

Great Security Is One Question Away
Wiktoria Dalach

Security doesn't have to be a nightmare. The 3rd hack will surprise you.

Grokking Anchors: Uncovering What a Machine-Learning Model Relies On
Kilian Kluge

What makes or breaks a machine-learning model's decision? Let's use anchor explanations to find out!

Have your cake and eat it too: Rapid model development and stable, high-performance deployments
Christian Bourjau, Jakub Bachurski

Python's data science tools are fantastic for data exploration and model development, but the price is often a slow and difficult deployment. Join us to find out what tools we have developed to have the cake and eat it too!

Haystack for climate Q/A
Vibha Vikram Rao

Haystack for climate Q/A - How to build POCs quickly and take them to production

Honey, I broke the PyTorch model >.< - Debugging custom PyTorch models in a structured manner
Clara Hoffmann

Honey, I broke the PyTorch model >.< No problem! In this talk, we'll build a toolbox to debug our models and prevent this from happening again - all by leveraging DL logic, synthetic data and pytest. Let's make our models unbreakable <3

How Chatbots work – We need to talk!
Yuqiong Weng, Katrin Reininger

We need to talk - All about concepts, techniques as well as practical experience with the Rasa framework for building a chatbot

How Python enables future computer chips
Tim Hoffmann

Learn how we adopted Python to build the computer chips of the future

How to baseline in NLP and where to go from there
Tobias Sterbak

Join us for a talk on baselines in NLP! We'll cover common tasks like classification, clustering, search, and NER, and discuss how to establish and improve baselines using weak learning. Don't miss out on this opportunity to gain a deeper understanding of NLP baselines!

How to build observability into a ML Platform
Alicia Bargar

Check out Shopify's talk on how to build observability into a #machinelearning platform. They'll share key learnings on how to track model performance, catch unexpected behaviour & how observability could work with large language models and Chat AIs

How to connect your application to the world (and avoid sleepless nights)
Luis Fernando Alvarez

Come and explore some of the common techniques to help you build reliable distributed systems in Python

How to increase diversity in open source communities
Maren Westermann

Learn about strategies for increasing diversity in #opensource projects presented by @MarenWestermann

How to teach NLP to a newbie & get them started on their first project
Lisa Andreevna Chalaguine

Learn how to teach people to analyse textual data with the help of Python

Hyperparameter optimization for the impatient
Martin Wistuba

HPO does not need to be expensive: see how to speed it up with a couple of simple algorithms

Improving Machine Learning from Human Feedback
Erin Mikail Staples, Nikolai

While powerful, models built off large datasets like GPT-3 often bring their biases along with them. However, is this the best future for machine learning? Join us to explore Reinforcement Learning from Human Feedback (RLHF) techniques and why they matter more now than ever.

Incorporating GPT-3 into practical NLP workflows
Ines Montani

Large language models like @OpenAI GPT-3 can complement existing machine learning workflows really well. You can get initial annotations from GPT-3, quickly fix them with an annotation tool like https://prodi.gy , and train a cheaper and better model.

Introducing FastKafka
Tvrtko Sternak

Don't miss our talk on FastKafka, a Python library for easy Kafka communication! #PyCon #Kafka #FastKafka

Introduction to Async programming
Dishant Sethi

Asynchronous programming has been gaining a lot of attention in the past few years, and for good reason. This session is going to be an intro to async programming in Python.

Keynote - A journey through 4 industries with Python: Python's versatile problem-solving toolkit
Susan Shu Chang

Susan, Principal Data Scientist at Elastic, shares her experiences with Python in 4 industries, from telecom to gaming and beyond.

Keynote - How Are We Managing? Data Teams Management IRL
Noa Tamir

The title “Data Scientist” has been in use for 15 years now. We have been attending PyData conferences for over 10 years as well. The hype around data science and AI seems higher than ever before. But how are we managing? Let's talk about Data Science Management IRL.

Keynote - Lorem ipsum dolor sit amet
Miroslav Šedivý

A randomly real and a really random journey to discover the balance between real and random data!

Keynote - Towards Learned Database Systems
Carsten Binnig

ML and DBMSs? Carsten talks about data-driven learning where the idea is to learn the data distribution over a complex relational schema.

Large Scale Feature Engineering and Datascience with Python & Snowflake
Michael Gorkow

Learn how Snowpark for Python enables large scale feature engineering and data science!

Let's contribute to pandas (3 hours) #1
Noa Tamir, Patrick Hoefler

Join our beginner friendly, mentored contributing to @pandas_dev workshop at PyData Berlin! 🥳 #opensource #pandas

Let's contribute to pandas (3 hours) #2
Noa Tamir, Patrick Hoefler

Join our beginner friendly, mentored contributing to @pandas_dev workshop at PyData Berlin! 🥳 #opensource #pandas

Machine Learning Lifecycle for NLP Classification in E-Commerce
Gunar Maiwald, Tobias Senst

idealo.de presents its MLOps solution and ML lifecycle for product classification

Maps with Django
Paolo Melchiorre

"Maps with Django": keeping in mind the Pythonic principle that simple is better than complex, we'll see how to create a web map with the Python-based web framework Django using its GeoDjango module, storing geographic data in your local database on which to run geospatial queries.

Maximizing Efficiency and Scalability in Open-Source MLOps: A Step-by-Step Approach
Paul Elvers

Novel approach to #MLOps combines open-source tech with cloud computing to build scalable, maintainable ML system accessible to ML Engineers & Data Scientists.

Methods for Text Style Transfer: Text Detoxification Case
Daryna Dementieva

How to detoxify texts? How to collect parallel corpus for text style transfer task? How to transfer the knowledge of a style between languages? We answer these questions in this talk.

MLOps in practice: our journey from batch to real-time inference
Theodore Meynard

I will present the challenges we encountered while migrating an ML model from batch to real-time predictions and how we handled them.

Modern typed python: dive into a mature ecosystem from web dev to machine learning
samsja

Typing is at the center of “modern Python”, and tools (mypy, beartype) and libraries (FastAPI, SQLModel, Pydantic, DocArray) based on it are slowly eating the Python world. This talk explores the benefits of Python type hints, and shows how they are infiltrating the next big do

Monorepos with Python
AbdealiLoKo

Monorepos have been successful in other communities - how do they work in Python?

Most of you don't need Spark. Large-scale data management on a budget with Python
Guillem Borrell

Most of you don't need Spark. Large-scale data management on a budget with Python

Neo4j graph databases for climate policy
Marcus Tedesco

Can Neo4j graph databases and Python help us understand climate policy? Find out!

Observability for Distributed Computing with Dask
Hendrik Makait

Debugging is hard. Distributed debugging is hell. Let’s dive into distributed logging, automated metrics, event-based monitoring, and root-causing problems with diagnostic tooling to understand how Dask helps you remain sane while identifying and solving your problems.

Pandas 2.0 and beyond
Joris Van den Bossche, Patrick Hoefler

Pandas has reached a 2.0 milestone in 2023. But what does that mean? And what is coming after 2.0? This talk will give an overview of what happened in the latest releases of pandas and highlight some topics and major new features the pandas project is working on.

Performing Root Cause Analysis with DoWhy, a Causal Machine-Learning Library
Patrick Blöbaum

Learn how to use the Python DoWhy library to perform root cause analysis using methods of causal machine-learning.

Polars - make the switch to lightning-fast dataframes
Thomas Bierhance

Want to learn about a new Python library that can speed up your data science and analytics work? Join us at the conference to hear about polars, a lightning-fast dataframe library based on Apache Arrow and written in Rust!

Postmodern Architecture: The Python Powered Modern Data Stack
John Sandall

Learn how to upgrade your pandas pipelines powering DAG workflows to a Python Powered Modern Data Stack, demystify the jargon from ETL to ELT, and see how tools like dbt can integrate with Python to change how data pipelines are built and maintained.

Practical Session: Learning on Heterogeneous Graphs with PyG
Ramona Bendias, Matthias Fey

Building and learning on heterogeneous graphs with PyG in a practical session

Pragmatic ways of using Rust in your data project
Christopher Prohm

Pragmatic ways of using Rust in your data project - strategies to speed up your data pipelines without rewriting the whole program.

Prompt Engineering 101: Beginner intro to LangChain, the shovel of our ChatGPT gold rush.
Lev Konstantinovskiy

"A modern AI start-up is a front-end developer plus a prompt engineer" is a popular joke on Twitter. This talk is about LangChain, a Python open-source tool for prompt engineering.

PyLadies Panel Session. Tech Illusions and the Unbalanced Society: Finding Solutions for a Better Future

PyLadies chapters around the world reflect on their contributions in advocating for gender representation and leadership as well as combating biases and the gender pay gap.

PyLadies Workshop

Know your rights! PyLadies and Berlin Tech Workers Coalition will unveil important details on work contracts and your rights to get you covered in case of layoffs

Raised by Pandas, striving for more: An opinionated introduction to Polars
Nico Kreiling

Have you also been raised with #pandas for all kinds of data transformations and wonder if there is more? I did. I searched for performance and more concise syntax, and I would like to introduce you to #polars

Rethinking codes of conduct
Tereza Iofciu

Did you know that the Python Software Foundation Code of Conduct is turning 10 years old in 2023? It was voted in as they felt they were “unbalanced and not seeing the true spectrum of the greater community”. Why is that a big thing? Come to my talk and find out!

Rusty Python: A Case Study
Robin Raymond

Talk on optimizing Python performance with Rust and PyO3, including case study, code profiling, and live demonstration of speedup. Discussion on PyO3 features and tradeoffs with other FFI options.

Shrinking gigabyte sized scikit-learn models for deployment
Pavel Zwerschke, Yasin Tatar

Shrinking gigabyte sized scikit-learn models for deployment: this talk shows how to deploy machine learning models with up to 6x disk space improvement

Software Design Pattern for Data Science
Theodore Meynard

I will share some specific software design concepts that can be used by data scientists to build better data products.

Specifying behavior with Protocols, Typeclasses or Traits. Who wears it better (Python, Scala 3, Rust)?
Kolja Maier

Did you ever wonder how to elegantly & safely abstract over concepts in your code? Check out Python's `typing.Protocol`, Scala's Typeclasses, and Rust's Traits!

Staying Alert: How to Implement Continuous Testing for Machine Learning Models
Emeli Dral

ML monitoring might be easy for a single model, but hard at scale. In this talk, I will introduce the idea of test-based monitoring, and how to standardize data and model checks across models and lifecycle.

Streamlit meets WebAssembly - stlite
Yuichiro Tachibana

Streamlit, a pure-Python data app framework, has been ported to Wasm as "stlite". See its power and convenience with many live examples and explore its internals from a technical perspective. You will learn to quickly create interactive in-browser apps using only Python.

Teaching Neural Networks a Sense of Geometry
Jens Agerberg

By taking neural networks back to the school bench and teaching them some elements of geometry and topology we can build algorithms that can reason about the shape of data. This is the promise of the emerging field of Topological Data Analysis (TDA) which we will introduce!

The Battle of Giants: Causality vs NLP => From Theory to Practice
Aleksander Molak

Join us for a workshop on the latest advances in Causal NLP to see the Causal Transformer in action! All in Python! ❤️

The Beauty of Zarr
Sanket Verma

Hi all, I’ll be talking about Zarr, an open-source data format for storing chunked, compressed N-dimensional arrays, along with a hands-on session. If you work with huge datasets in local/cloud storage and are looking for an efficient format, please attend my talk. Thanks!

The bumps in the road: A retrospective on my data visualisation mistakes
Artem Kislovskiy

Join us for a talk: The bumps in the road: A retrospective on my data visualisation mistakes, on data visualisation and how it's essential for conveying insights from data. We'll discuss best practices with Matplotlib, the limitations of static visualisations, and how CI can stre

The CPU in your browser: WebAssembly demystified
Antonio Cuni

WebAssembly is essentially a virtual and efficient CPU embedded in your browser. Let's see what it is!

The future of the Jupyter Notebook interface
Jeremy Tuloup

Jupyter Notebook 7 is the new version of the popular document-oriented notebook interface. It comes packed with a lot of new features, and its future looks bright!

The Spark of Big Data: An Introduction to Apache Spark
Pasha Finkelshteyn

Spark your big data skills! Learn Apache Spark basics: data frames, SQL APIs, and merging data for Python devs new to big data & tech explorers. Don't miss out! #ApacheSpark #BigData #Python

The State of Production Machine Learning in 2023
Alejandro Saucedo

Join us at the PyCon DE conference to learn about the current state of production machine learning in the Python ecosystem! We'll cover key principles, frameworks for end-to-end ML lifecycle, best practices, and recommended tools for deployment, security, and scaling.

Thou Shall Judge But With Fairness: Methods to Ensure an Unbiased Model
Nandana Sreeraj

Biased models can impact each of us. While it may feel abstract, AI fairness can be achieved through many methods and metrics. What's more, mitigation reports can introduce you to responsible AI. Check out my talk & demo at PyData Berlin.

Unlocking Information - Creating Synthetic Data for Open Access.
Antonia Scherz

A lot of data is private but this talk is not - learn how to synthesize anonymized, reliable data from sensitive, private data.

Use Spark from anywhere: A Spark client in Python powered by Spark Connect
Martin Grund

Check out how to participate in the extension of Spark Connect to bring the power of Spark everywhere!

Using transformers – a drama in 512 tokens
Marianne Stecklina

Nearly all pretrained transformers have an annoying limitation: they can only process short input sequences. Watch me rant about it ;-)

Visualizing your computer vision data is not a luxury, it's a necessity: without it, your models are blind and so are you.
Chazareix Arnault

Visualizing your #ComputerVision data is not a luxury, it's a necessity: without it, your models are blind and so are you! Learn how to elevate your projects and #datasets with #DatasetVisualization.

WALD: A Modern & Sustainable Analytics Stack
Florian Wilhelm

WALD: A modern & sustainable analytics stack consisting of a warehouse like Snowflake or BigQuery, Airbyte, Lightdash and dbt.

What are you yield from?
Maxim Danilov

In this talk we will discover why many developers avoid using generators in regular Python code.

What could possibly go wrong? - An incomplete guide on how to prevent, detect & mitigate biases in data products
Lea Petters

Data Ethics: What could possibly go wrong? - An incomplete guide on how to prevent, detect & mitigate biases in data products

When A/B testing isn’t an option: an introduction to quasi-experimental methods
Inga Janczuk

Have you ever wanted to know the causal effect of an action but A/B testing wasn’t an option? Here’s a brief helicopter tour over quasi-experimental methods that can be used instead!

Why GPU Clusters Don't Need to Go Brrr? Leverage Compound Sparsity to Achieve the Fastest Inference Performance on CPUs
Damian Bogunowicz

Fun fact: you can remove 90% of a neural network's weights without losing much accuracy! With model sparsity, you can even run these networks on your CPU with GPU-level performance. Learn about compound sparsity (pruning, quantization, knowledge distillation) for faster inference

Workshop on Privilege and Ethics in Data
Tereza Iofciu, Paula Gonzalez Avalos

Data-driven Products are built by humans. Humans are intrinsically biased. This bias goes into the products, which amplifies the original bias. In this tutorial, you will learn how to identify your biases and reflect on the consequences of unchecked biases in Data Products.

Writing Plugin Friendly Python Applications
Travis Hathaway

Learn how to write plugin friendly applications in Python with the pluggy library!

You are what you read: Building a personal internet front-page with spaCy and Prodigy
Victoria Slocum

The internet can be overwhelming, so I made a tool to create a personalized summary of it! Through building this internet front-page project, I've learned how the design concepts of tools like spaCy and Prodigy can facilitate the development of both complex and simple software.

You've got trust issues, we've got solutions: Differential Privacy
Vikram Waradpande, Sarthika Dhawan

What if I told you I could answer everything about you without knowing you, using Differential Privacy?

“Who is an NLP expert?” - Lessons Learned from building an in-house QA-system
Nico Kreiling, Alina Bickel

Imagine having something like ChatGPT for your work life! Or at least a bot you could ask about all your internal documents? We tried to build something like that @scieneers and will tell you about our journey #haystack #weaviate