We asked Python experts which libraries they use and would recommend to other developers.
Python is a dynamically typed programming language. On the one hand, that is an advantage: it promotes quick learning and sets a low entry threshold. At some point, however, the lack of static typing becomes a problem.
With pydantic and the type hints from PEP 484, you can enforce data types at runtime and bring Python much closer to a strictly typed language. This is especially useful when your application has an API but no contracts.
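As a rough illustration of the idea, here is a minimal sketch of a request contract described with type hints. The `CreateUserRequest` model and its fields are made up for the example; only `BaseModel` and `ValidationError` come from pydantic.

```python
from pydantic import BaseModel, ValidationError


class CreateUserRequest(BaseModel):
    """Hypothetical API contract: the field names are illustrative only."""

    name: str
    age: int


def parse_request(payload: dict) -> CreateUserRequest:
    # Raises pydantic.ValidationError if the payload does not match the hints.
    return CreateUserRequest(**payload)


def is_valid(payload: dict) -> bool:
    # Convenience wrapper that turns the exception into a boolean.
    try:
        parse_request(payload)
    except ValidationError:
        return False
    return True
```

With the default settings, a numeric string such as `"30"` is converted to the integer `30`, while a value like `"thirty"` is rejected.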
A handy logger to replace the standard logging module. Aside from not requiring complex configuration, it has a huge number of useful features and works well with asynchronous code.
Colorful log output to the console, informative tracebacks with different levels and hints, a handy built-in log parser, multi-threading support and thread-safety are just a small fraction of the functionality of this great Python library. In terms of ease of use it's comparable to a regular print, but in terms of functionality it's a whole rocket.
If you cover your code with unit tests, you're familiar with the situation when you spend time generating test data and various fixtures. This is especially true when you're using an ORM and you have a lot of models with a lot of fields.
Instead of spending time on writing different datasets and helper generators to cover all possible cases, all you need to do is to create a factory, specify the model to be tested and specify the desired number of fixtures when you call it.
An excellent Python library from Yandex for natural language processing. Unlike pymorphy2, it is not as good at converting a word to its normal form, but is excellent, and most importantly, very fast at determining part of speech and lexemes of a word. It works only with the Russian language.
The PEP 8 code style convention evolved in parallel with Python. Code style is an essential part of any project, especially when you work in a team. It takes a lot of time and practice to follow all the canons, but to make sure your code always looks the way it should, you can use black.
This auto-formatter saves you time, reduces the number of comments in reviews, and keeps your code looking the same regardless of the project.
In Python we mostly develop computer vision solutions, so our selection is related to that in one way or another. Very helpful:
Flask is a micro web framework that allows us to quickly make services and integrate our solutions. It has a lot of useful extensions. Suitable for both experimentation and industrial use.
Keras is a nice high-level API for TensorFlow. Saves a lot of time and keeps code readable. Pretty low entry threshold compared to pure TF.
NumPy is a very handy tool for working with multidimensional arrays and matrices, indispensable in deep learning.
Pillow is a good old Python library for working with images. Lots of formats, pixel manipulation, filters, effects.
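To give a feel for how NumPy and Pillow fit together, here is a small sketch that builds an image from an array, applies a filter, and converts it back for analysis. The function names and image sizes are made up for the example.

```python
import numpy as np
from PIL import Image, ImageFilter


def make_gradient(width: int = 64, height: int = 32) -> Image.Image:
    # One row going from black (0) to white (255), repeated for every line.
    row = np.linspace(0, 255, width, dtype=np.uint8)
    pixels = np.tile(row, (height, 1))  # shape: (height, width)
    return Image.fromarray(pixels)  # a 2-D uint8 array becomes a grayscale image


def blur_to_array(image: Image.Image, radius: float = 2.0) -> np.ndarray:
    # Apply a Pillow filter, then hand the pixels back to NumPy.
    blurred = image.filter(ImageFilter.GaussianBlur(radius=radius))
    return np.asarray(blurred)
```

Note that Pillow reports sizes as `(width, height)`, while the NumPy array is indexed as `(row, column)`, that is `(height, width)`.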
My list includes Python libraries that are not the most popular, but are no less useful.
iuliia transliteration library
Sometimes you need to write Cyrillic words in Latin. There are different standards and rules for transliteration, so somewhere I am Dmitrij, somewhere Dmitry, and somewhere Dmitrii.
The library collects these standards in one place and correctly implements the special rules for certain letter combinations and word endings. The home page briefly describes the differences between the schemes to make it easier to choose one.
Convenient data validation with pydantic
In almost any project, you need to accept data from users or colleagues from other projects.
You need to validate that data so you don't get hurt. The pydantic project allows you to get rid of the hassle and do validation conveniently on all sorts of data. You can describe the data structure you want so that you can not only validate it on input, but also get detailed messages when there are errors. You can write your own validation rules for individual fields, validate some fields together (for example, if the last name field is populated, then the first name field should be populated too) and much more.
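The rules described above can be sketched roughly like this. The example assumes pydantic version 2 (`field_validator` and `model_validator`; version 1 uses different decorator names), and the `Person` model is invented for the illustration.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, field_validator, model_validator


class Person(BaseModel):
    """Hypothetical input structure used only for this example."""

    first_name: Optional[str] = None
    last_name: Optional[str] = None
    age: int

    @field_validator("age")
    @classmethod
    def age_must_not_be_negative(cls, value: int) -> int:
        # Custom rule for a single field.
        if value < 0:
            raise ValueError("age must not be negative")
        return value

    @model_validator(mode="after")
    def first_name_required_with_last_name(self):
        # Rule that looks at several fields together.
        if self.last_name and not self.first_name:
            raise ValueError("first_name is required when last_name is set")
        return self


def explain_errors(payload: dict) -> list:
    # Return human-readable messages instead of raising.
    try:
        Person(**payload)
    except ValidationError as exc:
        return [error["msg"] for error in exc.errors()]
    return []
```

`explain_errors` returns an empty list for valid input and one message per failed rule otherwise.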
Dramatiq to handle distributed tasks
I think almost every Python developer has come across the Celery project.
Dramatiq is the project of a guy who got so tired of some problems in Celery that he decided to write his own library. It turned out very well. If you have tasks that you want to send to the background (recalculating some indicators, mass updating of data from external services, preparing emails for a mailing, generating reports and so on), this library is for you.
httpx - asynchronous HTTP library
In the Python world, the requests library is practically a model for usability and functionality. Unfortunately, it is blocking, so it does not fit well into asynchronous code.
The authors of httpx are building an asynchronous HTTP library with an interface that is largely compatible with requests. Synchronous work with httpx is also possible.
The project is still officially in beta, but is already quite stable and used by many programmers.
Loguru for setting up logging
Setting up logging via the standard logging library is no fun. If your project is not big enough to justify setting up syslog, journald, or an ELK stack, but you still want good logging, try loguru.
The main advantage of this Python library is easy setup and a lot of nice features out of the box:
- message highlighting in different colors,
- good formatting,
- easy to set up logging to files with rotation and archiving, and much more.
Python library for date detection - dateparser
A lifesaving library that comes to the rescue when you need to recognize a date from strings in a wide variety of formats. It will be useful if you are parsing web pages or some logs from a variety of sources.
It can recognize both the usual 'March 2, 2021 at 15:00', '2021-03-04 10:01:02 UTC+3' and such extremes as 'in 5 days', 'a week ago'. Supports many languages.
Python library for creating progress bars - tqdm
Sometimes you need to run a script that processes a lot of data (it can be a Django command or just a script that converts a large number of files). If nothing happens in the console, after a while it starts to look as if the script has hung: it is unclear how fast the process is going and how long you still have to wait.
With the tqdm library you can quickly create indicators (progress bars) that show the progress. Out of the box it integrates with IPython/Jupyter.
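A minimal sketch of both common ways to use it: wrapping an iterable, and updating the bar by hand. The "conversion" is a placeholder for real work.

```python
from tqdm import tqdm


def convert_files(paths: list) -> list:
    converted = []
    # Wrapping the iterable is enough to get a progress bar with rate and ETA.
    for path in tqdm(paths, desc="Converting", unit="file"):
        converted.append(path.upper())  # placeholder for the real conversion
    return converted


def copy_in_chunks(total_bytes: int, chunk_size: int = 1024) -> int:
    copied = 0
    # Manual mode: useful when progress is not tied to a simple loop counter.
    with tqdm(total=total_bytes, unit="B", unit_scale=True) as bar:
        while copied < total_bytes:
            step = min(chunk_size, total_bytes - copied)
            copied += step
            bar.update(step)
    return copied
```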
In my opinion, this is one of the most useful tools when you need to measure, track and analyze the memory usage of specific objects in Python applications.
A handy Python library that allows you to retry a function call if it fails. For example, this may be useful if you need to repeat a request to an external service that did not succeed the first time. Of course, you can configure many parameters, such as the number of attempts, how long to keep trying, and so on.
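The text does not name the library, so instead of guessing its API, here is a plain-Python sketch of the idea using only the standard library. The `retry` decorator and its parameters are hypothetical; a dedicated library offers far more options.

```python
import functools
import time


def retry(attempts: int = 3, delay: float = 0.1, exceptions: tuple = (Exception,)):
    """Re-run the wrapped function until it succeeds or attempts run out."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: let the last error propagate
                    time.sleep(delay)  # wait before the next try

        return wrapper

    return decorator


calls = {"count": 0}


@retry(attempts=3, delay=0, exceptions=(ConnectionError,))
def flaky_request() -> str:
    # Simulated external service: fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"
```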
A tool for imitating responses to requests made with the requests library. It is very useful for testing the behavior of a feature depending on the response returned by an external service.
Faker lets you generate fictitious data in different categories: names, addresses, bank card details, phone numbers and so on. The list of categories is really extensive. Quite often this library is essential for testing functionality.
The tool provides a lot of power and a convenient API for formatting the text displayed by the application in the console. It allows you to change the text style and color, draw tables, highlight language syntax, work with emoji and much more. Note that this library is not written in Python, but it helps with Python applications.
Py-spy is a sampling profiler for Python applications. The distinguishing feature of this profiler is that it can attach to an already running application without any code being added to it. The tool lets you see in real time what is running and how much time it takes, and collect information about the running application to generate, for example, a flamegraph. Very useful when you need to find a problem in the application "here and now" or when there is no way to change the code to integrate with other profilers.
There are situations when you need to write tests for functionality that behaves differently depending on the current time or date. This tool is made for exactly such cases. It lets you fix the time at the value you want by swapping out the datetime module.
Another decent library that contains a set of functional utilities.