These days, more and more people ask the Python vs. PHP question when they start out with a web application from scratch. I’ve developed PHP applications for 5 years but for the last couple of years I’ve been doing Python. This post is meant to note some of my observations. If you don’t want to read the whole of it, my opinion — opinion — is to stick with PHP for dishing out your *ML. Use Python in the back-end, if you must.
Production Worthiness
In terms of production stability and ubiquity, Python is still playing catch-up to PHP. PHP derives a huge benefit from its custom fit into the web development model, whereas Python is a general purpose language, with an adaptor for web development, called WSGI. PHP doesn’t do multi-threading, which is a great way to guide developers in a “shared nothing” line of thought. PHP and apache also work together to protect you from a lot of issues — you can enforce a limit on the memory your script uses for each request. Memory leaks are not a bother, thanks to periodic reaping of apache processes if you use the default prefork
MPM and set MaxRequestsPerChild
to a sane value (default is 4000).
Most Python web application frameworks would like you to build multi-threaded applications and recommend that you run your Python application as a daemon. From what I’ve seen so far, you really don’t want to be doing this. Multi-threading is not easy to begin with. Then there’s the GIL, which basically prevents a single Python daemon from utilising multiple CPUs and/or cores at the same time. That’s enough to be a show-stopper for an application being written in 2010. Third, I’ve seen random lock-ups, and slow memory leaks happen with Python daemons under the kind of production loads a small website may encounter (~ 1 Mbps network I/O; 20 reqs/sec).
The model that I would recommend is to run your Python application embedded in apache process space through mod_wsgi
. This primarily ensures that you don’t need to renice
up the priority of your Python daemon process to give it sufficient CPU time-share while under load and, more importantly, it enables the MaxRequestsPerChild
clean-up mechanism.
Runtime Model
In terms of runtime model, one of the things to keep under consideration is that PHP would load all of its extensions, whether or not they’re used, while Python modules are loaded on demand. The PHP scenario is not too bad, though, since the memory usage is shared across all child processes, not multiplied across them. One of the differences that don’t stand out but are significant is that PHP’s extension library is almost all written in C, whereas Python libraries are not, to a very large extent. Many Python frameworks choose to implement a lot of important stuff like data structures in native Python. Given that a typical web application spends most of its time iterating through records from a database and doing various manipulations on it, having core functionality implemented in the script is not optimal. Python templates make the whole situation even worse.
That said, if you want to bet on the future, the JIT compilers being developed for Python might help you out. Then there’s HipHop for PHP.
Development Models
One of the big reasons for Python’s attractiveness as a development platform, apart from syntax and language features, is the presence of Ruby On Rails inspired frameworks. The most popular among them is Django, but I’ve also developed on Pylons and Werkzeug. I won’t list down the reasons why these frameworks are attractive, but they’re very good from a development model perspective.
PHP too has frameworks but they seem extremely loaded and superfluous to me. PHP itself is the framework for web development. It offers templating, it offers routing and it has a rich library of web development helpers. What it doesn’t do is a couple of things:
enforce MVC model out-of-the-box. You can write your entire application in one file and do it easily to start with
play well with the (IMO highly retarded) Java and GoF Patterns style of object oriented programming
One large application that I’ve worked on had thousands of files, each corresponding to a PHP class, with layers upon layers of wrapper classes without much individual responsibility. Some poor code organisation decisions also led to a scenario where each user facing page ended up including almost all of the 100k lines of PHP in the application.
Python would, to some extent, let you get away with this atrocity. More importantly, though, if you’re using the web frameworks, it could effectively prevent you from landing up in such a situation to begin with. If you know enough of MVC and shared-nothing fu, I think it’s worthwhile going with PHP as of now, if only for performance concerns.
How About PHP and Python?
Like I mentioned, one of the big reasons for using Python for web applications is the development model and the programming language capabilities itself. PHP takes care of the ‘V’ and ‘C’ parts of MVC but trying to write complicated models in PHP can lead you to pushing at the limits of the language’s development model.
One of the development models I am exploring (and have working in production already) is a two-tier application layer. The “model” lives as a separate tier of the application, accessed through JSON-HTTP services. The “front-end” is done entirely in PHP and is only responsible for generating web pages and handling browser-based interaction. This model has two benefits:
The ‘front-end’ runtime is completely decoupled from the ‘back-end’, whether it be complexity, performance or scaling concerns
You get usable webservice APIs right out of the box.
Someone’s quipped that not having APIs in 2009 is like not having a website in 1999. I wouldn’t disagree. Every web application that I’ve built since 2004 has eventually had to support some kind of APIs. That’s essentially the reason for wsloader‘s existence. It allows you to write your super heavy-duty back-end in Python, while requiring you to provide a portable interface, which it can automatically expose as a JSON-HTTP web-service. Writing 1000 words to make a plug seems preposterous, and probably is, so please don’t consider this as a plug!