The biggest Python topics of 2023

Python Performance Optimization and Speedup Techniques

This topic covers improving performance in Python, with a focus on techniques such as multiprocessing and optimizing the CPython interpreter. The linked discussions include strategies for speeding up code with libraries like NumPy, as well as the performance challenges and solutions related to concurrency and parallelism in Python development.


Limiting Concurrency in Python asyncio Article

This article shows you how to do rate limiting when dealing with repeated tasks within asyncio. It uses a thread pool and imap_unordered() to show you why the answer may not always be to use a Semaphore.

https://death.andgravity.com/limit-concurrency
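For context, here is a minimal sketch of the Semaphore approach the blurb mentions as the usual first answer. The fetch() coroutine, the state dictionary, and the limits are hypothetical stand-ins chosen for illustration; they are not taken from the article.

```python
import asyncio


async def fetch(i, limit, state):
    # Hypothetical stand-in for real I/O: sleeps instead of touching a network.
    async with limit:  # at most `max_concurrent` coroutines pass this point at once
        state["running"] += 1
        state["peak"] = max(state["peak"], state["running"])
        await asyncio.sleep(0.01)
        state["running"] -= 1
        return i * 2


async def main(n_tasks=10, max_concurrent=3):
    limit = asyncio.Semaphore(max_concurrent)
    state = {"running": 0, "peak": 0}
    results = await asyncio.gather(
        *(fetch(i, limit, state) for i in range(n_tasks))
    )
    return results, state["peak"]


results, peak = asyncio.run(main())
print(results, "peak concurrency:", peak)
```

Note that all ten coroutines are still created up front; the semaphore only bounds how many are inside the guarded section at once.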

Speeding Up Your Code When Multiple Cores Aren’t an Option Article

Parallelism isn’t the only answer: often you can optimize low-level code to get significant performance improvements.

https://pythonspeed.com/articles/optimizing-dithering/

Two Kinds of Threads Pools, and Why You Need Both Article

This article talks about thread pools and how tuning them correctly can make a difference in the performance of your concurrent code.

https://pythonspeed.com/articles/two-thread-pools/

CPython Dynamic Dispatch Internals Article

Just how many lines of C does it take to execute a + b in Python? This article goes into detail about the CPython internals.

https://codeconfessions.substack.com/p/cpython-dynamic-dispatch-internals
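The article works at the C level. As a rough Python-level illustration of the same dispatch, the sketch below shows how a + b first tries the left operand's __add__ and falls back to the right operand's __radd__ when that returns NotImplemented. The Meters class is a hypothetical example, not something from the article.

```python
class Meters:
    """Hypothetical value type used only to make the dispatch visible."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Called first for `Meters(...) + other`.
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        return NotImplemented  # tells Python to try the other operand

    def __radd__(self, other):
        # Called for `other + Meters(...)` after other's __add__ gives up.
        if isinstance(other, (int, float)):
            return Meters(other + self.value)
        return NotImplemented


total = Meters(2) + Meters(3)  # handled by Meters.__add__
mixed = 1 + Meters(3)          # int.__add__ returns NotImplemented, so __radd__ runs

try:
    Meters(1) + "x"            # neither side can handle it
except TypeError as exc:
    error = exc
```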

joblib: Lightweight Pipelining With Python Functions Project

Computing with Python functions.

https://github.com/joblib/joblib

Python 3.11 Is Faster, but Pyston & PyPy Still Show Advantages Article

There are many speed improvements in CPython 3.11, but that doesn’t mean the Python alternatives don’t still have some advantages. Pyston and PyPy are still better in some cases.

https://www.phoronix.com/review/python311-pyston-pypy

Four Kinds of Optimisation Article

“Premature optimisation might be the root of all evil, but overdue optimisation is the root of all frustration. No matter how fast hardware becomes, we find it easy to write programs which run too slow.” Read on to learn what to do about it.

https://tratt.net/laurie/blog/2023/four_kinds_of_optimisation.html

ExecuTorch: Run PyTorch Programs on Mobile Article

https://pytorch.org/executorch/stable/index.html

Bypassing the GIL for Parallel Processing in Python Article

In this tutorial, you’ll take a deep dive into parallel processing in Python. You’ll learn about a few traditional and several novel ways of sidestepping the global interpreter lock (GIL) to achieve genuine shared-memory parallelism of your CPU-bound tasks.

https://realpython.com/python-parallel-processing/

Developing an Asynchronous Task Queue in Python Article

This tutorial looks at how to implement several asynchronous task queues using the Python multiprocessing library and Redis.

https://testdriven.io/blog/developing-an-asynchronous-task-queue-in-python/

Is Parallel Programming Hard? Article

https://news.ycombinator.com/item?id=36318280

Visualizing the CPython Release Process Article

This blog post covers how the release process of CPython works and includes a diagram documenting each step. It also highlights supply chain threat spots.

https://sethmlarson.dev/security-developer-in-residence-weekly-report-9

pytz: The Fastest Footgun in the West Article

The pytz library and its interactions with datetime are a source of misunderstandings and ultimately bugs. This article points out the problem cases.

https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html

AsyncIO: Why I Hate It Article

Charles is the creator of Peewee ORM and often gets the question “when will it support asyncio?” In this opinion piece he talks about why he doesn’t like asyncio and the alternatives he prefers.

https://charlesleifer.com/blog/asyncio/

CPython Internals: Understanding the Role of PyObject Article

Understand how objects are implemented in CPython and how CPython emulates inheritance and polymorphism in C using struct embedding.

https://codeconfessions.substack.com/p/cpython-object-system-internals-understanding

Running Python Parallel Applications With Sub Interpreters Article

Python 3.13 is adding programmatic control over sub-interpreters in Python. They spawn faster than creating a new process, but slower than threads. Learn why and how you can use them in the next release of Python.

https://tonybaloney.github.io/posts/sub-interpreter-web-workers.html

Why Python Is Better Than C++ for Algotrading Article

Even in high speed trading, time to market can be more important than the performance of the code. This blog post from a trading systems programmer outlines why he prefers Python to write his code.

https://profitview.net/blog/why-python-is-better-than-cpp-for-algotrading

Faster CPython at PyCon Article

This article summarizes the report the Faster CPython team gave at PyCon 2023. It gives information on PEP 659 Specializing Adaptive Interpreter and other performance improvements on the roadmap.

https://lwn.net/Articles/930705/

Vectorizing Wide PyTorch Expressions? Article

“In scientific computing, code is often naturally expressed as wide, tree-like expressions. Often different branches of that tree contain similar chunks of logic, so there is potential to run many different branches together in parallel vectorized operations.”

https://probablymarcus.com/blocks/2023/10/19/vectorizing-wide-pytorch-expressions.html

Long-Term Vision for a Parallel Programming Model? Article

https://discuss.python.org/t/what-is-the-long-term-vision-for-a-parallel-python-programming-model/39190

Simple Async Queue Article

https://saq-py.readthedocs.io/en/latest/

Guide to Queues in Python Article

A queue is a mechanism for storing information in a system, and is a particularly helpful data structure when dealing with multi-processing. Learn all about queues in Python.

https://stackabuse.com/guide-to-queues-in-python/
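As a quick orientation, the sketch below shows the standard library's three thread-safe queue flavors plus collections.deque. It only illustrates ordering behavior and is not drawn from the guide itself; for passing data between processes, multiprocessing.Queue offers a similar put()/get() interface.

```python
import queue
from collections import deque

# FIFO: items come out in the order they went in.
fifo = queue.Queue()
for item in ("a", "b", "c"):
    fifo.put(item)
fifo_order = [fifo.get() for _ in range(3)]

# LIFO: the most recently added item comes out first (a stack).
lifo = queue.LifoQueue()
for item in ("a", "b", "c"):
    lifo.put(item)
lifo_order = [lifo.get() for _ in range(3)]

# Priority: the smallest entry comes out first; tuples sort by first field.
pq = queue.PriorityQueue()
for entry in [(2, "medium"), (1, "urgent"), (3, "low")]:
    pq.put(entry)
priority_order = [pq.get()[1] for _ in range(3)]

# deque: a fast double-ended queue with no locking, for single-threaded use.
d = deque(["a", "b", "c"])
first = d.popleft()
last = d.pop()
```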

{n} Times Faster Than C …with Python Article

SIMD instructions along with Python calling out to its extensions can provide pretty impressive speed-up. This article shows you one such case.

https://eddieantonio.ca/blog/2023/07/12/faster-than-c-with-python/

Codon Achieves Orders-of-Magnitude Speedups Article

https://news.ycombinator.com/item?id=35165218

Two Ways to Turbo-Charge tox Article

How to pre-build wheels to improve the installation portion of a tox run and how to gain some parallel test execution if run-parallel doesn’t work for you.

https://hynek.me/articles/turbo-charge-tox/

Asyncio, Twisted, Tornado, Gevent Walk Into a Bar… Article

A good introduction to I/O-bound concurrency in Python and the libraries used to achieve it. It compares and contrasts the approaches and finishes with some good advice: you probably don’t need any of them.

https://www.bitecode.dev/p/asyncio-twisted-tornado-gevent-walk

Why & How Python Uses Bloom Filters in String Processing Article

Dive into Python’s clever use of Bloom filters in string APIs for speedier performance. Find out how CPython’s unique implementation makes it more efficient.

https://codeconfessions.substack.com/p/cpython-bloom-filter-usage
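To give a feel for the idea, here is a simplified Python sketch of a one-hash Bloom-style bitmask used as a cheap pre-check before an exact membership test. The 64-bit width and all function names are assumptions for illustration; this is not CPython's actual code.

```python
BLOOM_WIDTH = 64  # assumed mask width for this sketch


def make_mask(chars):
    """Set one bit per character, chosen by the low bits of its code point."""
    mask = 0
    for ch in chars:
        mask |= 1 << (ord(ch) & (BLOOM_WIDTH - 1))
    return mask


def might_contain(mask, ch):
    """False means definitely absent; True means possibly present."""
    return bool(mask & (1 << (ord(ch) & (BLOOM_WIDTH - 1))))


def strip_left(text, chars):
    """Strip leading characters found in `chars`, using the mask as a fast filter."""
    mask = make_mask(chars)
    i = 0
    # The cheap bit test rejects most characters; the exact check confirms hits.
    while i < len(text) and might_contain(mask, text[i]) and text[i] in chars:
        i += 1
    return text[i:]


whitespace_mask = make_mask(" \t\n")
```

A character whose low bits collide with a member of the set (here the backtick collides with the space) passes the bit test, which is why the exact check still follows it.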

Python’s Multiprocessing Performance Problem Article

While multiprocessing allows Python to scale to multiple CPUs, it has some performance overhead compared to threading. This article details why processes have performance issues that threads don’t, ways to work around it, and a sample bad solution.

https://pythonspeed.com/articles/faster-multiprocessing-pickle/
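One source of that overhead is serialization: multiprocessing sends arguments and results between processes by pickling them. The sketch below measures a pickle round trip in a single process, with a hypothetical one-million-integer payload, to show the kind of cost involved. It is an illustration, not the article's benchmark.

```python
import pickle
import time

# Hypothetical payload: the kind of object you might send to a worker process.
data = list(range(1_000_000))

start = time.perf_counter()
payload = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)  # parent side
restored = pickle.loads(payload)                                # worker side
elapsed = time.perf_counter() - start

print(f"{len(payload) / 1e6:.1f} MB serialized, round trip took {elapsed:.3f} s")
```

Threads share memory and skip this copy entirely, which is the contrast the blurb draws.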

The Heisenbug Lurking in Your Async Code Article

When using the create_task() function in asyncio it is very important to maintain a reference to the created tasks. Although this requirement is documented, it is easy to forget and can have some very hard to understand consequences.

https://textual.textualize.io/blog/2023/02/11/the-heisenbug-lurking-in-your-async-code/
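A minimal sketch of the usual remedy, following the pattern recommended in the asyncio documentation: hold each task in a set and drop it when it finishes. The spawn() and work() helpers are hypothetical names used for illustration.

```python
import asyncio

# Strong references live here, so pending tasks cannot be garbage collected.
background_tasks = set()


async def work(n, results):
    await asyncio.sleep(0.01)
    results.append(n)


def spawn(coro):
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Remove the reference once the task is done, so the set does not grow.
    task.add_done_callback(background_tasks.discard)
    return task


async def main():
    results = []
    for n in range(3):
        spawn(work(n, results))
    # Wait for everything that is still in flight before returning.
    await asyncio.gather(*background_tasks)
    return sorted(results)


finished = asyncio.run(main())
```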

Backend of Meta Threads Is Built With Python 3.10 Article

https://news.ycombinator.com/item?id=36612835

The Easy Way to Concurrency With Python Stdlib Article

Although writing concurrent programs can be challenging, certain kinds of parallelism aren’t that bad. This article introduces you to the ThreadPoolExecutor and shows you how to deal with I/O bound processing. Associated HN discussion.

https://www.bitecode.dev/p/the-easy-way-to-concurrency-and-parallelism
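For a sense of what this looks like in practice, here is a small sketch that runs a blocking, I/O-like function across a ThreadPoolExecutor. The fake_download() function and the example.com URLs are placeholders; a real program would perform network or disk I/O there.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url):
    # Placeholder for blocking I/O; sleeping also releases the GIL.
    time.sleep(0.05)
    return url, len(url)


urls = [f"https://example.com/page/{i}" for i in range(8)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() yields results in input order, even if workers finish out of order.
    results = list(pool.map(fake_download, urls))
elapsed = time.perf_counter() - start

print(f"{len(results)} downloads in {elapsed:.2f} s")
```

With four workers the eight simulated downloads overlap, so the wall-clock time should be well under the sum of the individual sleeps, although the exact figure depends on the machine.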

Overhead of Python Asyncio Tasks Article

The Textual library uses a lot of asyncio tasks. To decide whether it was worth spending time optimizing them, Will measured the cost of creating asyncio tasks. TL;DR: optimize something else. This article also spawned a conversation on Hacker News.

https://textual.textualize.io/blog/2023/03/08/overhead-of-python-asyncio-tasks/
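If you want to repeat this kind of measurement yourself, a rough sketch follows. It is not the article's benchmark: the noop() coroutine and the task count are arbitrary choices, and the number it prints will vary by machine and Python version.

```python
import asyncio
import time


async def noop():
    return None


async def measure(n=10_000):
    start = time.perf_counter()
    # Create n tasks, then wait for all of them to complete.
    tasks = [asyncio.create_task(noop()) for _ in range(n)]
    await asyncio.gather(*tasks)
    elapsed = time.perf_counter() - start
    return elapsed / n  # average seconds per task, creation plus completion


per_task = asyncio.run(measure())
print(f"{per_task * 1e6:.1f} microseconds per task")
```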

Why Does Python Code Run Faster in a Function? Article

Python is not necessarily known for its speed, but there are certain things that can help you squeeze out a bit more performance from your code. Surprisingly, putting your code in a function might be one of them.

https://stackabuse.com/why-does-python-code-run-faster-in-a-function/
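One commonly cited reason is that local variables in a function are stored in indexed slots, while module-level variables go through dictionary lookups by name. The sketch below uses the dis module to show the different opcode families that CPython emits for the same two statements. Exact opcode names vary between CPython versions, so it checks only for the FAST and NAME families.

```python
import dis

# The same two statements, once at module level and once inside a function.
SOURCE = "x = 1\ny = x + 1\n"


def in_function():
    x = 1
    y = x + 1
    return y


module_code = compile(SOURCE, "<module-level>", "exec")
module_ops = {ins.opname for ins in dis.get_instructions(module_code)}
function_ops = {ins.opname for ins in dis.get_instructions(in_function)}

# Module-level variables use *_NAME opcodes (dictionary lookups by name).
uses_name_lookup = any(op.endswith("_NAME") for op in module_ops)
# Function locals use *_FAST* opcodes (indexed slots in the frame).
uses_fast_locals = any("FAST" in op for op in function_ops)

print(sorted(module_ops))
print(sorted(function_ops))
```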

Understanding CPUs Can Help Speed Your NumPy Article

With a little understanding of how CPUs and compilers work, you can speed up NumPy using Numba, the just-in-time compiler.

https://pythonspeed.com/articles/speeding-up-numba/

Some Reasons to Avoid Cython Article

Cython lets you seamlessly merge Python syntax with calls into C or C++ code, making it easy to write high-performance extensions, but it is not the best tool in all circumstances. This article goes over some of the limitations and problems with Cython, and suggests alternatives.

https://pythonspeed.com/articles/cython-limitations/

7 Ways to Share a NumPy Array Between Processes Article

If you’re doing multi-processing with NumPy you will need to pass arrays between processes. This article covers different ways of doing just that.

https://superfastpython.com/numpy-share-array-processes/

Faster CPython at PyCon, Part Two Article

This is the second part of an article describing the conversations at PyCon around CPython optimizations and performance improvements being worked on as part of the Faster CPython project.

https://lwn.net/Articles/931197/

Not-So-Casual Performance Optimization in Python Article

Nathaniel did a small project where he implemented the sum of inverse squares in multiple programming languages. The Python version was rather slow. This article talks about alternate ways of writing the Python code for better performance.

https://www.nathom.dev/blog/casual_performance_optimization_python/
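For reference, the sketch below writes the sum of inverse squares two ways in pure Python: an explicit loop and the built-in sum() over a generator. The function names and the value of n are illustrative. It makes no claim about which is faster on your machine, and it does not reproduce the article's own variants.

```python
import math


def inverse_squares_loop(n):
    # Straightforward accumulation with an explicit loop.
    total = 0.0
    for i in range(1, n + 1):
        total += 1.0 / (i * i)
    return total


def inverse_squares_builtin(n):
    # Same computation, with the looping pushed into the built-in sum().
    return sum(1.0 / (i * i) for i in range(1, n + 1))


n = 100_000
loop_result = inverse_squares_loop(n)
builtin_result = inverse_squares_builtin(n)

# The series converges to pi**2 / 6 as n grows.
print(loop_result, builtin_result, math.pi ** 2 / 6)
```

Wrapping each call in timeit.timeit() is one way to compare them on your own setup.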

When NumPy Is Too Slow Article

Sometimes switching to NumPy just isn’t enough of a speed boost. What then? Before you contemplate parallelism, there are other approaches. This article shows you other ways of improving performance.

https://pythonspeed.com/articles/numpy-is-slow/

5 Common Asyncio Errors in Python (And How to Avoid Them) Article

Asyncio is one of several ways of doing concurrency in Python. It is built on coroutines. This article describes five common errors people new to asyncio may make and how to avoid them.

https://superfastpython.com/asyncio-common-errors/

Learn the Latest AI Capabilities with Python 3.11 via OpenVINO™ 2023.0 Article

Looking for more AI optimizations with Python 3.11? Check out OpenVINO™ DevCon for monthly workshops on how you can improve your AI applications with OpenVINO Toolkit’s 2023.0 release. Register. Learn. Connect.

https://software.seek.intel.com/openvino-devcon

New Library Updates in PyTorch 2.0 Article

Learn what’s changed in the newly released PyTorch 2.0 library. Includes new data collectors, augmentation operators, vision features, and loads more.

https://pytorch.org/blog/new-library-updates-in-pytorch-2.0/

Cython vs CPython: Comparing the Speed Difference Article

This article compares the speed of Cython and CPython using eleven different benchmarks. As expected, Cython is generally faster, but not in every scenario.

https://coderslegacy.com/cython-vs-cpython-comparing-speed/

Using NumPy and Linear Algebra for Faster Python Code Article

Are you still using loops and lists to process your data in Python? Have you heard of a Python library with optimized data structures and built-in operations that can speed up your data science code? This week on the show, Jodie Burchell, developer advocate for data science at JetBrains, returns to share secrets for harnessing linear algebra and NumPy for your projects.

https://realpython.com/podcasts/rpp/146/

Accelerating Python Code With Numba Vectorize Article

This article delves into the inner workings of Numba Vectorize. Learn how to harness the power of SIMD operations to improve code performance.

https://coderslegacy.com/python-code-with-numba-vectorize/

PyTorch Performance Features and How They Interact Article

PyTorch in 2023 is a complex beast, with many great performance features hidden away. This article goes through a series of empirically tested tuning techniques and settings in all combinations.

https://paulbridger.com/posts/pytorch-tuning-tips/

Python’s multiprocessing Performance Problem Article

Last week’s issue of PyCoders included a link to Python’s multiprocessing performance problem which now has a Hacker News conversation to go along with it.

https://news.ycombinator.com/item?id=34974480

The Speed of Python: It Ain’t That Bad! Article

This article discusses whether Python is as slow as so many authors claim. Along the way, it covers highly optimized Python frameworks for numerical computation and efficient compilers, and it weighs coding time against execution time. All in all, Python is much faster than most think.

https://medium.com/pythoniq/the-speed-of-python-it-aint-that-bad-9f703dd2924e

Python Bindings for Performance Optimization Article

“This article describes techniques to accelerate a Python codebase by exposing parallelized C++ functions using PyBind.” The example in the article achieves a 3x speed-up through this technique.

https://alexhagiopol.github.io/posts/2023/01/python-bindings/

What Would You Want in Tomorrow’s CPython Build System? Article

https://discuss.python.org/t/what-do-you-want-to-see-in-tomorrow-s-cpython-build-system/28197

Build AI-powered Internal Tools 10x Faster with Python & Superblocks Article

https://www.superblocks.com/webinar/python?utm_campaign=Newsletter-202304PycodersWeekly&utm_source=pycodersweekly

Faster Python 3.13 Plan Project

This brief outline highlights the Faster CPython project’s plan for the 3.13 release. Includes PEP 669, PEP 554, improved memory management, and more.

https://github.com/faster-cpython/ideas/blob/main/3.13/README.md

mpire: Easy, but Faster Multiprocessing Project

A Python package for easy multiprocessing, but faster than multiprocessing

https://github.com/sybrenjansen/mpire

lpython: Python Compiler Project

Python compiler

https://github.com/lcompilers/lpython

Trace Your Python Process Line by Line Project Started in 2023

Trace your python process line by line with low overhead!

https://github.com/furkanonder/beetrace

Optimizing WebSocket Calls Experiment Project Started in 2023

This post demonstrates replacing the Python code that accepts a WebSocket connection with a C++ equivalent. It shows you how to call C++ code from Python and what kind of speed-up to expect.

https://github.com/szabolcsdombi/optimization-demo

fastnumpyio: Fast NumPy I/O Project

Fast NumPy I/O: a fast replacement for numpy.load and numpy.save

https://github.com/divideconcept/fastnumpyio