Tuesday, October 21, 2008

Hand-written and Generated Code: Never the Twain Shall Meet

There are many tools these days that generate code. Before writing such a tool, stop to consider whether you really should be generating code in the first place. After all, you're generating code from a model (you might not think of it as a model, but that's what I call it because that's what it is), so if you have enough information to generate the code that realizes the behavior described by the model, you obviously have enough information to emulate the behavior of the model. Byte code has a cost. It causes bloat. Don't produce it if you don't have to. EMF, for example, can emulate an instance of an Ecore model, including a fully functional editor, without generating a single line of code; just try invoking "Create Dynamic Instance..." from any EClass's pop-up menu. It's a cool thing.
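
To make the "emulate instead of generate" idea concrete, here is a minimal sketch in plain Java. It is not EMF's API: ModelClass and DynamicObject are hypothetical names invented for this illustration, and the model is reduced to a class name plus typed attributes. The point is only that one generic class can stand in for every class the model describes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a model class: a name plus typed attributes.
final class ModelClass {
    final String name;
    final Map<String, Class<?>> attributes = new LinkedHashMap<>();

    ModelClass(String name) {
        this.name = name;
    }

    // Declares an attribute and returns this so declarations can be chained.
    ModelClass attribute(String attributeName, Class<?> type) {
        attributes.put(attributeName, type);
        return this;
    }

    // Creates an instance that is interpreted against this model at runtime.
    DynamicObject newInstance() {
        return new DynamicObject(this);
    }
}

// One generic class serves every model class, so no per-class byte code is produced.
final class DynamicObject {
    private final ModelClass type;
    private final Map<String, Object> values = new LinkedHashMap<>();

    DynamicObject(ModelClass type) {
        this.type = type;
    }

    // Stores a value after checking it against the attribute declared in the model.
    void set(String attributeName, Object value) {
        Class<?> expected = type.attributes.get(attributeName);
        if (expected == null) {
            throw new IllegalArgumentException(type.name + " has no attribute " + attributeName);
        }
        if (value != null && !expected.isInstance(value)) {
            throw new IllegalArgumentException(attributeName + " expects " + expected.getSimpleName());
        }
        values.put(attributeName, value);
    }

    // Returns the stored value, or null if the attribute has not been set yet.
    Object get(String attributeName) {
        if (!type.attributes.containsKey(attributeName)) {
            throw new IllegalArgumentException(type.name + " has no attribute " + attributeName);
        }
        return values.get(attributeName);
    }
}
```

A caller would describe a class once, for example new ModelClass("Book").attribute("title", String.class), and then work with the object returned by newInstance() through set and get. The price is that typos in attribute names are only caught at runtime, which is one of the reasons people still choose to generate a typed API.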


If you do have a good reason to generate code, keep in mind that humans will read it. Hand-writing bad code is unacceptable, but generating bad code is completely inexcusable. Have you ever seen generated code where every referenced class name is fully qualified? It's clearly the simplest way to avoid name collisions, but it seems disrespectful of the human reader. Generating code that isn't of hand-written quality gives generators a bad reputation, so focus on creating a thing of beauty.


In the ideal world, generated code would be complete. It would never need to be sullied from its untouched pristine state. Technically, you would not even need to put it under source code control because of course you can always regenerate it. You'd want to be very careful to version the generator in that case though. And keep in mind that if you don't version control the generated code, all your clients will need to install the right version of the generator tools simply to produce a functional code base. Also, it will be more difficult to detect when changes in the generator produce code that's different from what you've been testing. Treating generated code as if it's ephemeral has definite appeal, but it is something to consider carefully.


You've probably noticed that the world is typically not quite ideal, and often far from it. So it's often the case that clients need to tailor what's generated. Sometimes that's even the whole point: the generated code is just scaffolding or a starting point from which to hand code a complete application. It's typically important to be able to invoke the generator again if its input model changes. Because many generators will simply overwrite any files they generated the last time, keeping hand-written changes separate is obviously important in that case. But many generators also support protected regions where users can write their code such that it will not be overwritten. EMF takes this design to the extreme, effectively inverting it, by marking all the regions that the generator may touch. I like to think it's a bright idea.
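
The inverted design can be sketched roughly as follows. This is an illustration of the idea only, not EMF's actual merge implementation: Member and MarkerMerge are hypothetical names, and a source file is simplified to a map from member name to member, where each member records whether it carries a generator-owned marker.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of one member (method or field) in a source file.
final class Member {
    final String body;
    final boolean generated; // true when the member carries an @generated-style marker

    Member(String body, boolean generated) {
        this.body = body;
        this.generated = generated;
    }
}

final class MarkerMerge {
    // Merges freshly generated members into the members of an existing file.
    static Map<String, Member> merge(Map<String, Member> existing, Map<String, Member> fresh) {
        Map<String, Member> result = new LinkedHashMap<>();
        // Keep everything the user owns. Generator-owned members are dropped here
        // and come back below only if the generator still produces them.
        for (Map.Entry<String, Member> entry : existing.entrySet()) {
            if (!entry.getValue().generated) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        // Add the generator's output unless the user has taken over that member.
        for (Map.Entry<String, Member> entry : fresh.entrySet()) {
            result.putIfAbsent(entry.getKey(), entry.getValue());
        }
        return result;
    }
}
```

With this rule, anything unmarked is safe by default, and the only way to lose hand-written code is to leave the marker on a member you meant to own. A protected-region scheme has the opposite default: anything outside a region is overwritten.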


There are those who believe that one should never modify generated code. I'm not one of those people, though there are clear advantages to avoiding it. For example, it's really easy to see what you've written yourself versus what was generated for you. JDT's support for filters mitigates that advantage by supporting the same thing dynamically, i.e., it hides everything marked @generated. More importantly, it's possible to delete all the generated stuff to do a clean sweep. That's probably the strongest reason. On the downside, more classes result in more bloat. Even an empty class will take close to 0.5k. Worse yet, if you can't anticipate which files a user will wish to specialize, you're liable to double the number of classes. For example, in the implementation of MOF that preceded EMF, for every EClass Foo, it would generate FooGen, Foo, FooGenImpl, and FooImpl, where Foo extends FooGen, FooGenImpl implements Foo, and FooImpl extends FooGenImpl and implements Foo. The whole design caused significant bloat and just looked very stilted; even the public API was very clearly tainted by the fact that a generator was being employed. It's important to realize that small droplets of bloat will tend to add up...
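
For readers who haven't seen that four-type pattern, here is a rough Java sketch of its shape. Only the type names and their relationships come from the description above; the name attribute, the assumption that Foo and FooImpl start out as empty placeholders, and the hand-written override are invented for the illustration.

```java
// Generator-owned: rewritten on every run, so hand edits here would be lost.
interface FooGen {
    String getName();

    void setName(String name);
}

// User-owned: assumed to start out as an empty placeholder for extra API.
interface Foo extends FooGen {
}

// Generator-owned: holds the real implementation of the modeled attribute.
class FooGenImpl implements Foo {
    private String name;

    @Override
    public String getName() {
        return name;
    }

    @Override
    public void setName(String name) {
        this.name = name;
    }
}

// User-owned: this override is the only hand-written logic in all four types.
class FooImpl extends FooGenImpl implements Foo {
    @Override
    public String getName() {
        String name = super.getName();
        return name == null ? "unnamed" : name;
    }
}
```

Four types exist so that one method can be customized, and for every model class the user never touches, two of the four stay empty while still costing byte code.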


So while some will argue that when it comes to hand-written and generated code, never the twain shall meet, I think it's important to keep in mind that, as with most things in life, there are trade-offs to our design decisions. As such, it's more important to explain and understand all the considerations that should be taken into account when making a choice than it is to decide which specific choice is a best practice in general. After all, EMF's generator model generates both the Ecore model and itself, so we're not actually in a position to delete our generated code. We need it to bootstrap the environment. It's a prickly problem.


So while it's often a good practice to separate generated code from hand-written code, and it's not necessary to version control generated code in that case, these decisions come at a price.

8 comments:

ekke said...

For me it's always the most important thing to decide which code is only generated, which is always enhanced by manually written code, and which is partly enhanced manually.
In my projects there are always all three kinds, and (after the difficult decision of which to use where) it's easy to use them all with openArchitectureWare, because I can define different outlets for generated or generated-with-protected-areas code.
If only some parts of the code are manually enhanced, I prefer protected regions and avoid the complex structure of IMPL classes / interfaces that you described above. If my software design needs such a structure, OK, but if I would bloat my code with these classes only for technical reasons because I'm generating code, then I use protected areas instead.

Jan Köhnlein said...

Here are some of the experiences that made me avoid a mix of generated and manually written code:

Mixing generated and hand-written code requires the target platform to support some kind of annotation or comment mechanism. This is not always the case, e.g. for Eclipse plug-in manifests. Furthermore, Eclipse gets quite confused if you change the manifest with an external generator. So I even prefer completely generated and completely manually written plug-ins.

A broken generator run can quickly mess up your whole workspace, making it very hard to recover your manual changes. Checking everything into a VCS before running the generator is not always an option.

Furthermore, many reconcilers I have used (the programs that actually merge generated and existing code) only halfway work. If I have to check for each manual change whether it's still there after generating code, the development process gets rather annoying. I might even lose all the agility of the generative process.

One more thing: Inheritance is not the only mechanism to integrate generated and hand-written code. Consider using dependency injection, generated hooks or callbacks, extension points etc. which will not bloat your codebase.

Ed Merks said...

Jan,

Yes, things like plugin.xml and MANIFEST.MF don't support merge because we don't have a nice mechanism for marking things. They also don't support separating into a generated part and a non-generated part. So they suffer from the generate-once problem. I saw a few examples at MDSD where files are generated once as placeholders for user changes. This seems similar and that's okay. But doubling the number of plugins, like doubling the number of classes, doesn't seem like an ideal approach to advocate as best for all cases.

When you're designing a generator and it could produce a mess, it's of course very frustrating to mess up a code base. Of course EMF's generator doesn't do that, though while developing it, we had to be more careful than we need to be today. Fortunately Eclipse keeps a history so recovering changes is annoying but not impossible. Keeping a backup zip is of course possible as well. But obviously separating the two is easiest from a disaster recovery point of view.

I've been using EMF's merging generator for many years, so I have great confidence that it produces only good results; the worst possible outcome is to overwrite code that's marked as owned by the generator but that I had intended to control myself. It's an exceedingly agile process that an uncounted number of people use every day...

The inheritance example was used because Ecore's purpose is to generate exactly such an API. When designing a DSL, the purpose is to generate infrastructure to realize that DSL's semantics, so all manner of good techniques are available. If it's possible to produce a good design that facilitates separation of generated and hand-written code, I totally agree, that's all the better. But to argue that all generators should conform to this, including and in particular EMF's generator for Ecore, seems to me to be overzealous.
