Today, Parse.ly is happy to announce the 1.0 release of PyKafka, a new Python driver for Kafka. Kafka is “publish-subscribe messaging rethought as a distributed commit log”, and Parse.ly makes heavy use of it. PyKafka is an upgrade of an older library named samsa, which was only compatible with Kafka 0.7.x. We’ve spent the last few months upgrading it to work with Kafka 0.8.2.1 and revamping the codebase to be cleaner and more efficient.
The biggest difference between PyKafka and other Python drivers is the inclusion of a balanced consumer implementation. The PyKafka consumer uses the same balancing algorithm as the official JVM-based driver, allowing us to run many parallel consumers of the same data stream.
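To give a feel for what "balancing" means here, below is a minimal sketch of a range-style partition assignment, in the spirit of the scheme the JVM consumer is generally described as using. It is not PyKafka's actual code: the function name `assign_partitions` and its arguments are hypothetical, chosen only for illustration. It assumes every consumer in a group sees the same sorted lists of partitions and group members, so each one can work out its own share independently.

```python
def assign_partitions(partitions, consumers, consumer_id):
    """Return the partitions that `consumer_id` should own.

    Hypothetical helper for illustration only, not part of PyKafka.
    Every consumer sorts the same inputs, so all members of the group
    arrive at a consistent, non-overlapping split.
    """
    parts = sorted(partitions)
    members = sorted(consumers)
    idx = members.index(consumer_id)

    # Each consumer gets an equal base share; the first `extra`
    # consumers take one additional partition each.
    per_consumer, extra = divmod(len(parts), len(members))
    start = per_consumer * idx + min(idx, extra)
    count = per_consumer + (1 if idx < extra else 0)
    return parts[start:start + count]


if __name__ == "__main__":
    group = ["consumer-a", "consumer-b"]
    for member in group:
        print(member, assign_partitions(range(5), group, member))
```

In this sketch, adding or removing a consumer changes the sorted member list, so re-running the same function on every member produces the new split. Consumers beyond the number of partitions simply receive an empty list.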
We’ve been using PyKafka in production to feed our data processing backend, which processes thousands of data points per second as it powers the Parse.ly dashboard. We’re confident in its stability. Over the next few months we plan to add more features, and we’re always looking for ways to improve performance.
If you’re interested in PyKafka, please join us in working on it! You can find the project on PyPI and GitHub. We also have a mailing list for questions.
You can also get in touch on Twitter:
- Keith Bourgoin (@kbourgoin)
- Emmett Butler (@sensitiveemmett)