Releases: ddotta/awesome-polars
2024-04-23
New Additions
Python
- polars-fuzzy-match - Python package for fuzzy matching with Polars, i.e. matching text elements that are similar but not exactly identical by @bnm3k.
- Polars for Identifiers and Standard Format Strings - Python package for Processing IBAN, ISINs, URLs and other standard format data in Polars by @abstractqqq.
- polars_hash - Python package that provides stable hashing functionality across different Polars versions by @ion-elgreco.
- polars_ta - Python package that provides technical indicator operators rewritten in Polars by @wukan1986.
- QuickEcharts - Python package for fast and easy echarts with Polars backend by @AdrianAntico.
- Polars OLS - Python package that provides efficient rust implementations of common linear regression variants and exposes them as simple Polars expressions by @azmyrajab.
- polars-finance - A collection of Python Polars plugins and functions for market data processing by @ngriffiths13.
- polars-candle - Python package for running candle ML models on Polars DataFrames by @wdoppenberg.
Tutorials & workshops
- Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide - Useful Python notebooks ordered by book chapter by @jeroenjanssens.
- Python-Polars-Tips-and-Tricks - Collection of source code demonstrating tips and tricks in Polars by @StuffbyYuki.
Blog posts
- Groupby in Polars - A post that explains how to group data using Polars by Alexandre Petit.
- DuckDB vs Polars - Thunderdome. - A blog post that compares Polars and DuckDB with the use of 16 GB of data on a machine of only 4 GB by @danielbeach.
- How moving from Pandas to Polars made me write better code without writing better code - A post that describes the process of "Polarification" of code written with Pandas by @duvenagep.
- Revisiting a Classic Cheminformatics Paper with Polars: The Wiener Index - A science blog post that uses Polars to track the information for the molecules in DataFrames by @bertiewooster.
- How to start using Polars & DuckDB together for data analysis - A post that demonstrates the usage of Polars with DuckDB to perform similar data transformations as is done using Pandas by @sumaniitm.
- Anatomy of a Polars Query: A Syntax Comparison of Polars vs SQL - A post that compares Polars syntax to SQL by @bfeif.
- Pandas vs. Polars — Time to Switch? - A blog post that compares Polars to Pandas in a series of 4 benchmarks performed on a csv file with 11 million rows by @daradecic.
- How to JOIN datasets in Polars … compared to Pandas - A blog post that compares dataframe joins in Polars vs Pandas by @danielbeach.
- DuckDB vs Polars - Which One Is Faster? - An unofficial benchmark on DuckDB and Polars by @StuffbyYuki.
- Pandas vs Polars? Bid Adieu to Pandas and Switch To Polars! - An article that compares Polars to Pandas with a dataset of 1.2 GB. Code used is available on Github here.
Talks and videos
- Polars is the Pandas killer | PyData Tel Aviv 2024 ⏳ 22 min - A video that shows how Polars competes head to head on scale, speed and ease of use as a dataframe solution in Python by Igor Mintz.
- Polars-Cookbook in Python - Polars cookbook organized by chapter as Python notebooks by @StuffbyYuki.
What's Changed
- Add groupby blog by @alexandthedataworld by @ddotta in #254
- Add blog post about comparing Polars and DuckDB by @ddotta in #255
- Add Polars for Identifiers and Standard Format Strings by @ddotta in #257
- Add blog post about "Polarification" by @ddotta in #258
- Polars for chemistry by @ddotta in #259
- Add polars_hash python package by @ddotta in #261
- Add blog post about using Polars and DuckDB together by @ddotta in #263
- Add blog post about comparing Polars syntax to SQL by @ddotta in #266
- Add polars_ta python package by @ddotta in #267
- Add QuickEcharts python package by @ddotta in #269
- Add Polars OLS python package by @ddotta in #272
- Add scripts and datasets for O'Reilly book Python Polars by @ddotta in #273
- Add Polars plugins for market data processing by @ddotta in #275
- Add blog post by @daradecic by @ddotta in #279
- Add blog post about comparing joins in Polars vs Pandas by @ddotta in #280
- Add video by Igor Mintz in PyData Tel Aviv by @ddotta in #281
- Add blog post by @StuffbyYuki by @ddotta in #285
- Add Python Polars Cookbook in Python by @ddotta in #286
- Add Python-Polars-Tips-and-Tricks repository by @ddotta in #287
- Add polars candle by @ddotta in #290
- Add blog post by @sm823zw by @ddotta in #291
Full Changelog: 2024-03-01...2024-04-23
2024-03-01
New Additions
Python
- Ibis Python package for Polars - Ibis is a Python library that provides a lightweight, universal interface for data wrangling. It can be used with Polars.
- Python package polars-ds - Python package that contains multiple extensions to simplify common numerical/string data analysis procedures by @abstractqqq.
- Narwhals - Python package that provides an extremely lightweight compatibility layer between Polars, Pandas, cuDF, and Modin by @MarcoGorelli.
- polars-upgrade - Python package that automatically upgrades your Polars code so it's compatible with future versions by @MarcoGorelli.
- polars-fuzzy-match - Python package for fuzzy matching with Polars, i.e. matching text elements that are similar but not exactly identical by @bnm3k.
Cheat Sheets
- Cheatsheet for Pandas to Polars - A Cheat Sheet that shows how to convert some familiar Pandas commands to Polars by @braaannigan.
Tutorials & workshops
- Polars plugins tutorial - How you (yes, you!) can write a Polars Plugin, by @MarcoGorelli.
Blog posts
- Great Tables: The Polars DataFrame Styler of Your Dreams - A post that shows how the Great Tables package uses Polars expressions to make delightful tables by @machow.
- Polars dataframe’s plugins and extensibility: getting started - A post that illustrates the possibility of extending the core Dataframe API of Polars with a few examples by @brunocous.
- 15 Pandas ↔ Polars ↔ SQL ↔ PySpark Translations - A post that depicts the 15 most common tabular operations in Polars and their corresponding translations in Pandas, SQL and PySpark by @ChawlaAvi.
- LazyFrame: Exploring Laziness in Dataframes from Polars in Python - A blog post that introduces LazyFrames with Polars in Python by Manoj Das.
- Data Statistics in Polars - A post that explains how to extract insightful information from your data in Polars by Alexandre Petit.
Talks and videos
- Polars and time zones: everything you need to know | PyData Global 2023 ⏳ 29 min - A video that shows how to use Polars effectively for time series analysis involving different time zones by @MarcoGorelli.
What's Changed
- Add post about Great tables package by @ddotta in #222
- Add Ibis project by @ddotta in #224
- Add cheatsheet for Pandas to Polars by @ddotta in #226
- Ddotta/issue227 by @ddotta in #228
- Add python package polars-ds by @ddotta in #230
- Add blog post by @ChawlaAvi by @ddotta in #233
- Add video about time series by @MarcoGorelli by @ddotta in #234
- Add guide for Lazyframes by @ddotta in #236
- Add post about data statistics in Polars by @ddotta in #238
- polars-xdt by @FBruzzesi in #240
- Typos by @FBruzzesi in #241
- polars plugins tutorial by @FBruzzesi in #243
- Add Narwhals by @MarcoGorelli by @ddotta in #246
- Add polars-upgrade by @MarcoGorelli by @ddotta in #247
- Add polars-fuzzy-match by @MarcoGorelli by @ddotta in #249
New Contributors
- @FBruzzesi made their first contribution in #240
Full Changelog: 2024-01-17...2024-03-01
2024-01-17
New Additions
Official documentation
Python
- polars-business - Polars extension that offers utilities for business day operations with Polars and Python by @MarcoGorelli.
R
- polarssql - Experimental package that provides a DBI-compliant interface to Polars.
Blog posts
- Date and DateTime Manipulation in Polars - A blog post that shows examples of doing a number of date and datetime manipulations in Polars (Python) by @danielbeach. Code used is available on Github here.
- Pandas2 and Polars for Feature Engineering - A blog post that compares Pandas2 and Polars for Feature Engineering tasks with Python by @hopswork.
- Spark vs Polars. Real-life Test Case. - A blog post in which the author tests whether Polars is able to handle "real amounts of data" and "really replace some production Spark workloads." by @danielbeach. Code used is available on Github here.
- Using Polars Plugins for a 14x Speed Boost with Rust - A blog post that shows how to use the Polars plugin system with Rust through some concrete examples by @ngriffiths13.
- Working with DateTime data in Polars - A blog post that helps you with the main operations that can be done with datetime data by Rielly Griffiths.
- Revolutionize Your Data Analysis: Polars Outperforms Pandas by Up to 5x in Numerical Filter Operations! - A blog post that compares Polars with Pandas by examining their performance in the real world by Daniel Builescu.
- Time series Analysis with Polars - A short blog post that explains how to deal with temporal datasets by @gaborschulz. Full helpful notebook available here.
- Interesting thread about Polars on Hacker News
- [Level Up Your Data Analysis with Polars: A Powerful DataFrame Library for Speed and Efficiency](https://python.plainenglish.io/level-up-your-data-analysis-with-polars-a-powerful-dataframe-library-for-speed-and-efficiency-0b82c226c7f1) - A blog post that describes the main features of Polars (with benchmarks) by ravi-m.
- polars’ Rgonomic Patterns - A blog post that dives deep into some of the advanced data wrangling functionality in Python’s Polars package by @emilyriederer.
Talks and videos
- Library of the week 13 : Polars with Python ⏳ 15 min - A video that presents Polars with Python by @enarroied. The accompanying article is available on this page.
What's Changed
- Add blog post about date manipulations by @danielbeach by @ddotta in #194
- Add polars-business by @MarcoGorelli by @ddotta in #196
- Add blog post about Feature Engineering by hopswork by @ddotta in #198
- Add blog post - Real-life Test Case Spark vs Polars by @ddotta in #201
- Add blog post about using Polars plugin system by @ddotta in #202
- Add video by Eric Narro by @ddotta in #205
- Add polarssql R package by @ddotta in #206
- Add blog post about working with datetime data in Polars by @ddotta in #208
- Add blog post by Daniel Builescu by @ddotta in #210
- Add notebook about time series by @gaborschulz by @ddotta in #212
- Add polars plugins by @ddotta in #214
- Add thread on Hacker News by @ddotta in #216
- Add blog post by ravi M by @ddotta in #219
- Add blog post by @emilyriederer by @ddotta in #220
Full Changelog: 2023-10-19...2024-01-17
2023-10-19
New Additions
Official documentation
- Talk about Polars at EuroPython Conference 2023 ⏳ 28 min - Talk by @ritchie46 that introduces Polars and some of its design decisions.
Python
- Python package functime - Machine learning Python package built on Polars for time-series predictions by @neocortexdb. According to the developers, it's the world's fastest and most feature-full machine learning forecasting library!
Blog posts
- Enhancing Data Analytics with Polars and MinIO - A blog post that explains how to use Polars with MinIO’s open-source object storage by @IndexSeek.
- Using Polars with Snowflake - A blog post that shows how to use Polars with Snowflake by @IndexSeek.
- Partitioning Polars DataFrame on S3 with Apache Arrow - A blog post that explains how to partition large Polars DataFrames in AWS S3 by Matteo Arellano.
- Goodbye Spark. Hello Polars + Delta Lake - An article that presents how to use Polars in addition to Delta Lake by @danielbeach.
- How to learn Polars with ChatGPT? - An article that explains how to learn fundamental Polars concepts with ChatGPT by Suhith Illesinghe.
Talks and videos
- Polars DataFrame ⏳ 41 min - A video that shows some basic manipulations with Polars and Python by @vedica1011. Notebook used for the video in this github repo.
- Why I switched from Pandas to Polars ⏳ 53 min - A workshop that breaks down the 3 reasons why you could switch from Pandas to Polars by @bfeif. Notebook used for the video in this github repo.
- Delimiters in Python Polars ⏳ 15 min - A video that explains how to use delimiters in Python Polars by @CodeKlaudia.
- Intro to Polars ⏳ 7 videos - A playlist of 7 videos that introduces the basic concepts of Polars (DataFrames, filtering, splitting...) by Joram Mutenge.
- Machine Learning with Polars ⏳ 6 videos - A playlist of 6 videos that covers analyzing and cleaning data using Polars to train machine learning models by Joram Mutenge.
- Pandas and Polars with Marco Gorelli ⏳ 55 min - A podcast by The Developers' Bakery that compares the performance of Polars to Pandas by @MarcoGorelli.
What's Changed
- Broken link to functime org by @ddotta in #167
- Add blog post about using Polars and MinIO by @ddotta in #169
- Add talk by @ritchie46 at EuroPython Conference 2023 by @ddotta in #171
- Add blog post about using Polars with Snowflake by @ddotta in #173
- Add video by @vedica1011 by @ddotta in #176
- Add video by @bfeif by @ddotta in #177
- Add video about delimiters by @CodeKlaudia by @ddotta in #179
- Add blog post about partitionning Polars DataFrames in AWS S3 by @ddotta in #181
- Add blog post about Polars + Delta Lake by @ddotta in #184
- Add introduction playlist to Polars by Joram Mutenge by @ddotta in #187
- Add machine learning playlist by Joram Mutenge by @ddotta in #188
- Add blog post about learning Polars with ChatGPT by @ddotta in #190
- Add podcast by @MarcoGorelli by @ddotta in #192
Full Changelog: 2023-09-14...2023-10-19
2023-09-14
New Additions
Official documentation
- Keynote on Polars at EuroSciPy 2023 ⏳ 57 min - Talk by @ritchie46 that dives into Polars and sees what makes it so efficient. It will touch on technologies like Arrow, Rust, parallelism, data structures, query optimization and more.
Rust
- Polars CLI - Polars CLI is a command line interface for running SQL queries with Polars as backend.
Tutorials & workshops
- Data Pipelines with Polars: Step-by-Step Guide - A tutorial that explains how to build data pipelines with Polars by @AntonsRuberts. Code used is available on Github here.
- Python Polars: A Lightning-Fast DataFrame Library - A tutorial that shows how to use Polars with Python ecosystem by @hfhoffman1144. Code used is available on Github here.
Blog posts
- All that Polars that Make You Forget Pandas - A blog post that explores some deeper reasons behind the performance gains of Polars over Pandas.
- Polars vs Pandas. Inside an AWS Lambda - A blog post that covers the topic of using Polars vs Pandas inside an AWS Lambda to do data processing by @danielbeach. Code used is available on Github here.
- DuckDB vs Polars for Data Engineering - A blog post that compares Polars and DuckDB with pipelines for Data Engineering by @danielbeach.
- Pandas vs Polars: A database speed test. Who wins? - A blog post that compares the run-time of reading a database into a dataframe using Pandas versus using Polars by Thomas Reid.
- Polars and Pandas : What's the difference ? - A blog post that explains how Polars works under the hood and the best use cases for Polars and Pandas by @t-redactyl.
- Understanding the Polars nested column types - A blog post that helps to understand how nested column types work in Polars by @braaannigan.
- Polars vs DuckDB for Delta Lake ops - A blog post that compares Polars to DuckDB using Delta Lake by @wolliq.
Talks and videos
- Using the Rust Polars DataFrame library in a CLI ⏳ 4 min - A video that shows how to integrate Polars in a command line interface by @paiml.
- The Ultimate Guide to Data Wrangling with Python | Rust Polars Data Frame ⏳ 10 videos - A playlist of 10 videos (WIP) that equips you with all the necessary knowledge required to utilize Python Polars Data Frame by @AmitXShukla.
Follow : Official
- Eitsupi (@eitsupi) - Contributor to R Polars project
- Etienne Bacher (@etiennebacher) - Contributor to R Polars project
What's Changed
- Add blog post about deeper reasons behind the performance gains of Polars by @ddotta in #137
- Add blog post about using Polars inside AWS by @ddotta in #138
- Add blog post about Polars vs DuckDB for Data Engineering by @ddotta in #139
- Add pipelines tutorial by @AntonsRuberts by @ddotta in #141
- Ddotta/issue142 by @ddotta in #143
- Add duration missing for "Manipulación de Datos con Polars en python" video by @ddotta in #145
- Add blog post by Thomas Reid by @ddotta in #147
- Add news about Polars's raisin seed round by @ddotta in #149
- Add real python blog about polars by @ddotta in #155
- Add video about using the Rust Polars DataFrame library in a CLI by @ddotta in #156
- Add video by Amit Shukla by @ddotta in #157
- Add keynote at EuroSciPy 2023 by @ddotta in #158
- Add polars CLI by @ddotta in #159
- chore: typo correction by @ddotta in #160
- chore: minor corrections by @ddotta in #163
- Add blog post about Delta Lake by @wolliq by @ddotta in #165
Full Changelog: 2023-07-19...2023-09-14
2023-07-19
New Additions
Python
- Python package seaborn_polars - Python package to plot Polars DataFrames and LazyFrames with seaborn by @pavelcherepan.
- Python package functime - Machine learning Python package built on Polars for time-series predictions by @DescendantAI. According to the developers, it's the world's fastest and most feature-full machine learning forecasting library!
R
- tidypolars for R - tidypolars package to use Polars with tidyverse syntax.
Blog posts
- EDA with Polars: Step-by-Step Guide for Pandas Users (Part 1) - A blog post that describes the main data processing operations with Polars in Python by @AntonsRuberts. Code used is available in this notebook.
- EDA with Polars: Step-by-Step Guide to Aggregate and Analytic Functions (Part 2) - A blog post that shows how to perform some fairly complex aggregates, rolling statistics and more with Polars and Python by @AntonsRuberts. Code used is available in this notebook.
- Pyspark or Polars — What should you use? - A blog post that explores and breaks down some of the similarities between PySpark and Polars. It provides insights on when to choose one over the other by Vivek Kovvuru.
- Getting Started with the Polars Data Manipulation Library - A blog post that presents some simple features of Polars using Python by Juveriya Mahreen.
- 8 ways pandas really losing to Polars for quick market data analysis - A newsletter that compares the performance of Polars to Pandas for many common data manipulation techniques by PyQuant News.
Talks and videos
- How to update mass data using Polars DataFrame ⏳ 9 min - A video that presents the process of writing code to update mass columns across CSV or data files by @AmitXShukla. Notebook used for the video in this github repo.
What's Changed
- Add first part of blog post by @AntonsRuberts by @ddotta in #119
- Add video by @AmitXShukla by @ddotta in #121
- Add second part of blog post by @AntonsRuberts by @ddotta in #123
- Add blog post about PySpark and Polars by Vivek Kovvuru by @ddotta in #125
- Add blog post by Juveriya Mahreen by @ddotta in #128
- Add newsletter by PyQuant News by @ddotta in #129
- Add functime Python library by @ddotta in #132
- Add tidypolars R package by @etiennebacher by @ddotta in #133
Full Changelog: 2023-06-28...2023-07-19
2023-06-28
New Additions
Tutorials & workshops
- Cookbook Polars for R - A side-by-side comparison of Polars, R base, dplyr and data.table packages by @ddotta.
- Polars Workshop on AWS - A comprehensive workshop comparing Polars to Pandas, exploring a wide range of functions and features by @debnsuma.
- Polars cookbook in Python - This cookbook is a fork of the popular pandas-cookbook and has been modified to use the polars library. By @escobar-west, it uses real-world examples with "all the bugs and weirdness that entails."
Blog posts
- Pandas vs Polars – Speed Comparison - A blog post that compares the performance of Polars, Pandas and Pandas 2.0 by @StuffbyYuki. Code used is available on Github here.
- LazyFrame vs DataFrame in Polars – Performance Comparison - A blog post that introduces what LazyFrame is in Polars and its performance gain compared to DataFrame by @StuffbyYuki. Code used is available on Github here.
- Querying Polars DataFrames using SQL - A blog post that shows how to use the SQLContext object in Python to query a Polars DataFrame directly using SQL by @weimenglee.
- Polars vs Pandas: A Brief Tale of Two DataFrame Libraries - A blog post that compares Polars and Pandas focusing in particular on optional dependencies by @ranggakd.
Talks and videos
- Polars - make the switch to lightning-fast dataframes ⏳ 30 min - A talk that reports an experience switching from Pandas to Polars in a real-world ML project by @datenzauberai. Slides are available here.
- Polars: A highly optimized dataframe library ⏳ 20 min - A video that presents some main features of Polars by @mattharrison.
What's Changed
- Add blog post by @StuffbyYuki by @ddotta in #101
- Add comparison blog post by @StuffbyYuki by @ddotta in #103
- Add blog post about querying Polars DataFrames using SQL by @ddotta in #105
- Add cookbook Polars for R by @ddotta in #108
- Add workshop by @debnsuma by @ddotta in #109
- Add blog post by @ranggakd by @ddotta in #111
- Add link to video - make the switch to lightning-fast dataframes by @ddotta in #114
- Add talk by @mattharrison by @ddotta in #115
- Add polars cookbook in Python by @escobar-west by @ddotta in #117
Full Changelog: 2023-05-30...2023-06-28
2023-05-30
New Additions
Python
- Python package seaborn_polars - Python package to plot Polars DataFrames and LazyFrames with seaborn by @pavelcherepan.
Ruby
- polars for Ruby - Ruby polars-df gem to use Polars with Ruby.
Tutorials & workshops
- Fast String Processing with Polars — Scam Emails Dataset - A tutorial using Polars to implement a text processing pipeline by @AntonsRuberts. Code used is available on Github here.
Blog posts
- Polars in the aRtic! - Another blog post that compares the performance between Pandas and Polars across a range of common data manipulation tasks by @MCodrescu. Code used is available on Github.
- A Polars exploration into Kedro - A blog post that explains how Polars can be used instead of pandas in Kedro for your data catalog and data manipulation by @astrojuanlu.
- High Performance Data Manipulation in Python: pandas 2.0 vs. polars - A blog post that compares differences between Python pandas 2.0 and Polars libraries by @jcanalesluna.
- Lightning-fast queries with Polars - Another blog post that is a good introduction to Polars by @astrojuanlu.
- Polars – Laziness and SQL Context. - A blog post that presents two good reasons to adopt Polars : Lazy and SQL Context by @danielbeach.
- Exploring Polars - The Lightning-Fast DataFrame Library in Python - A blog post on the basics of Polars by @mddas.
What's Changed
- Add seaborn_polars package by @ddotta in #83
- Add blog post by MCodrescu by @ddotta in #85
- Add kedro blog post by @ddotta in #88
- Add blog post by Datacamp by @ddotta in #89
- Add Polars-ruby by @ddotta in #91
- Add blog post by astrojuanlu by @ddotta in #96
- Add new blog post by @danielbeach by @ddotta in #97
- Add blog post by @mddas by @ddotta in #98
- Add tutorial by @AntonsRuberts by @ddotta in #99
Full Changelog: 2023-05-12...2023-05-30
2023-05-12
New Additions
Tutorials & workshops
- Rust Polars: Unlocking High-Performance Data Analysis — Part 1 - First part of an article that explores the world of Rust’s Polars and explains some basic concepts of Polars such as Series by @wiseaidev. Code used is available on Github here.
Blog posts
- Pandas vs Polars vs Pandas 2.0 …. FIGHT - A blog post that runs an ETL process to compare big data processing speed between Pandas, Pandas 2.0 and Polars by @guoliveira.
- Pandas vs Polars vs Pandas 2.0 … ROUND 2) - A blog post that makes a new comparison between Pandas, Pandas 2.0 and Polars by @guoliveira.
- Polars VS PySpark: Lazy Evaluation and Big Data - A blog post that compares lazy evaluation between Polars and Spark by @guoliveira.
Talks and videos
- Polars vs Pandas | detailed test with explained results ⏳ 22 min - A video that presents 8 distinct tests which demonstrate differences between Pandas and Polars by @vb100. Associated github repo is here.
What's Changed
- Add comparison video Polars vs Pandas by Data Science Garage by @ddotta in #72
- Add first blog post by @guoliveira by @ddotta in #76
- Add second blog post by @guoliveira by @ddotta in #77
- Add third blog post by @guoliveira by @ddotta in #79
- Add first blog post by @wiseaidev by @ddotta in #81
Full Changelog: 2023-04-21...2023-05-12
2023-04-21
New Additions
Tutorials & workshops
- How to display Polars dataframes with itables - A tutorial that explains how to display Polars dataframes with itables by @mwouts.
Blog posts
- Polars - modern data frame library - A blog post that describes why Polars could be a better alternative to pandas, dplyr or data.table by @DSkrzypiec.
- The fastest way to read a CSV file in Python - A blog post that compares different ways (including Polars, pyarrow and C) to read a CSV file with Python by Finn Andersen.
Talks and videos
- An opinionated introduction to Polars - Great Polars introduction slides from @krlng at PyCon 2023.
- Polars - make the switch to lightning-fast dataframes - A talk that reports an experience switching from Pandas to Polars in a real-world ML project by @datenzauberai.