Why You Should Love Code Generation

What a nice title, right? I often see people complain about code generation, and I would like to explain my point of view in a pragmatic way. As you may know, I’m the lead developer of Propel, a great PHP5 ORM which uses code generation a lot.

Code generation is about using code to write code. Pretty easy, isn’t it? In Propel2, for instance, we will use Twig to generate PHP code. The golden rule is to never write your templates in the same programming language as the code you want to generate, otherwise it will be a pain to maintain (try to read the Propel 1.6 builders). But in order to generate code, we need information that describes the code we want to generate. This information is called metadata. Metadata is data about data, and that’s exactly what we need!
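To make the idea concrete, here is a minimal sketch of metadata driving a template. It is written in Python with the standard library's string.Template rather than Twig, purely for illustration; the metadata dictionary, the template layout, and the generate_class helper are all assumptions of this sketch, not Propel code. Note that the template language differs from the generated language (PHP), in the spirit of the golden rule above.

```python
from string import Template

# Hypothetical metadata: data that describes the code we want to generate.
metadata = {"class_name": "Book", "columns": ["id", "title", "isbn"]}

# Templates for the generated PHP code. In string.Template, "$$" is a
# literal dollar sign, so "$$$name" renders as "$" followed by the value.
CLASS_TEMPLATE = Template("""<?php

class $class_name
{
$properties
}
""")
PROPERTY_TEMPLATE = Template("    protected $$$name;")


def generate_class(meta):
    """Render PHP source for one class from its metadata."""
    properties = "\n".join(
        PROPERTY_TEMPLATE.substitute(name=column) for column in meta["columns"]
    )
    return CLASS_TEMPLATE.substitute(
        class_name=meta["class_name"], properties=properties
    )


print(generate_class(metadata))
```

Changing the metadata is enough to get a different class; the template itself stays untouched.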

In Propel, the schema.xml (an XML file) contains all the metadata we need to generate both PHP and SQL code. As Propel is an ORM, this metadata is a database name, tables, columns, primary keys, etc. This is your business model! We can call it a Platform Independent Model (PIM) because this schema is database vendor agnostic. It doesn’t know anything about MySQL, Oracle, or whatever you want. It just describes your model with its own notation.

To generate code, Propel relies on builders and platforms. Builders own the logic to write PHP code, and Platforms contain the logic for each database vendor (whether to quote table names or not for instance). This is specific, right? So let’s call this layer a Platform Specific Model (PSM).
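The split between the schema and the platforms can be sketched as follows. This is an illustration under assumptions, not Propel's implementation: the schema is a heavily simplified fragment in the spirit of a schema.xml, and the two platform classes and the build_create_table function are hypothetical Python stand-ins for Propel's PHP platforms and builders.

```python
import xml.etree.ElementTree as ET

# Platform independent model: a simplified, vendor-agnostic schema.
SCHEMA = """
<database name="bookstore">
  <table name="book">
    <column name="id" type="INTEGER" primaryKey="true"/>
    <column name="title" type="VARCHAR"/>
  </table>
</database>
"""


# Platform specific knowledge: each vendor quotes identifiers differently.
class MysqlPlatform:
    def quote(self, identifier):
        return "`" + identifier + "`"


class PgsqlPlatform:
    def quote(self, identifier):
        return '"' + identifier + '"'


def build_create_table(table, platform):
    """Builder: turn one <table> element into DDL for the given platform."""
    lines = []
    for column in table.findall("column"):
        line = "  " + platform.quote(column.get("name")) + " " + column.get("type")
        if column.get("primaryKey") == "true":
            line += " PRIMARY KEY"
        lines.append(line)
    header = "CREATE TABLE " + platform.quote(table.get("name")) + " (\n"
    return header + ",\n".join(lines) + "\n);"


table = ET.fromstring(SCHEMA.strip()).find("table")
print(build_create_table(table, MysqlPlatform()))
print(build_create_table(table, PgsqlPlatform()))
```

The same schema produces two different outputs; only the platform object changes.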

Propel uses both a PIM and a PSM to generate code. More precisely, it uses the PIM to drive the PSM, which depends on configuration parameters, known as build properties in Propel terminology. That’s all, this is how Propel works at build time :-)

But wait, what? It’s all about Model Driven Architecture (MDA)! Have you ever heard of it? It’s the state of the art in Computer Science research (at least in France). It is a software design approach for the development of applications.


But this is a formal approach, and I would like to stay pragmatic here by sticking to Propel. What I love about code generation is that the generated code is easy to debug: you can read it, and learn from it. A few years ago, generated code was a mess, especially in symfony 1.x, but today we are able to generate clean code, just like the code you would write yourself. And, as you write code to generate code, you can test both the generator and its output, and avoid more errors. Generating code also speeds up your application, because you can precompute some parts of it. In Propel (again), we are able to generate SQL statements at build time. That way, we bypass the need to build SQL at runtime, which is much faster than going through the whole Propel stack.
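Here is a small sketch of that last point, generating a SQL statement at build time so that nothing has to be assembled at runtime. It is written in Python with sqlite3 for the sake of a runnable example; generate_finder_source, FIND_BY_PK_SQL, and find_by_pk are hypothetical names, and the code Propel really generates is PHP and looks different.

```python
import sqlite3


def generate_finder_source(table, columns, pk):
    """Build time: compute the SQL once and bake it into generated source."""
    sql = "SELECT " + ", ".join(columns) + " FROM " + table + " WHERE " + pk + " = ?"
    return (
        "FIND_BY_PK_SQL = " + repr(sql) + "\n"
        "\n"
        "def find_by_pk(connection, pk):\n"
        "    # Runtime: no query building, only the pre-generated statement.\n"
        "    return connection.execute(FIND_BY_PK_SQL, (pk,)).fetchone()\n"
    )


source = generate_finder_source("book", ["id", "title"], "id")
print(source)

# Load the generated code and run it against an in-memory database.
namespace = {}
exec(source, namespace)

connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
connection.execute("INSERT INTO book VALUES (1, 'Propel')")
print(namespace["find_by_pk"](connection, 1))
```

The generated source is plain, readable code with the SQL visible as a string literal, which is also what makes it easy to debug.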

In Propel, we also have behaviors. A behavior is a self-contained piece of logic which extends the generated code by using hooks at build time. A behavior is reusable, testable, and easy to write. So you can adopt a behavior driven development: write behaviors you will reuse in order to avoid code duplication.
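A rough sketch of how a behavior can plug into a builder through build-time hooks is shown below. The hook names (extra_columns, pre_save), the build_class function, and the Python output are assumptions made for this illustration; Propel's actual behavior API is PHP and has its own hook names.

```python
class TimestampableBehavior:
    """Hypothetical behavior: adds two columns and a line to save()."""

    def extra_columns(self):
        return ["created_at", "updated_at"]

    def pre_save(self):
        return "self.updated_at = datetime.datetime.now()"


def build_class(table, behaviors):
    """Builder: generate class source, calling each behavior's hooks."""
    columns = list(table["columns"])
    for behavior in behaviors:
        columns += behavior.extra_columns()

    lines = ["import datetime", "", "", "class " + table["class_name"] + ":"]
    lines.append("    def __init__(self):")
    for column in columns:
        lines.append("        self." + column + " = None")
    lines.append("")
    lines.append("    def save(self):")
    for behavior in behaviors:
        lines.append("        " + behavior.pre_save())
    lines.append("        return self")
    return "\n".join(lines) + "\n"


table = {"class_name": "Book", "columns": ["id", "title"]}
source = build_class(table, [TimestampableBehavior()])
print(source)

# Load the generated class and check that the behavior's code is in place.
namespace = {}
exec(source, namespace)
book = namespace["Book"]().save()
print(book.updated_at)
```

The behavior knows nothing about the Book table, so the same object can be reused for any other table passed to the builder.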


I know there are drawbacks to using code generation, especially the need to generate code before deploying your application in production, or the lack of implementation details. But, for most parts of an application, generating code helps a lot, and it also helps you get what you really want.

To sum up, code generation is not a crazy idea. It’s close to MDA, and it has pros and cons, but in my opinion code generation is definitely not a bad practice, and you should use it instead of relying on in-memory code or complex architectures just to avoid generation.

This post is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.
