Wednesday, November 19, 2008

Eclipse Summit Europe 2008: The Symposiums

There was a really good turnout for the modeling symposium.

Well over forty people attended.

Markus Voelter started the symposium with a brief outline of the agenda and a description of how the "open space" discussion would be structured in the afternoon.

Lars Schneider of Siemens presented his position paper on model quality assurance based on his research work.

He defines qualitative metrics for assessing model quality, which he terms smells. The idea is to detect problematic patterns, i.e., bad smells, and then to use refactoring to eliminate such patterns. He described how his refactoring tool can be used to apply basic refactoring transformations, and he also showed how his graphical tool is used to design transformations.

Sandro Boehme of inovex presented a description of how EMF's EStore is used to provide support for Java soft references, a good solution for dealing with models that are too big to fit within memory.

Recall that objects reachable only through soft references can be garbage collected when the JVM runs low on memory. By generating an EObject implementation based on the generator's reflective delegation pattern, it's possible to delegate all data access onto an InternalEObject.EStore. The store itself delegates to a map that manages the soft references under the covers. He uses this approach to implement EMFT's JCRM so it can access and traverse models larger than can fit in available memory.
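To make the soft reference idea concrete, here is a minimal sketch in plain Java. It is not EMF's EStore API and not Sandro's implementation; the class SoftValueStore, its loader function, and the simulateCollection method are all hypothetical names invented for illustration. The sketch only shows a map of soft references that reloads a value from some backing store when the collector has cleared it.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical store: values are held softly and reloaded on demand. */
final class SoftValueStore<K, V> {
    private final Map<K, SoftReference<V>> cache = new HashMap<>();
    private final Function<K, V> loader; // stands in for the persistent backend
    private int loads;

    SoftValueStore(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, reloading it if the soft reference was cleared. */
    V get(K key) {
        SoftReference<V> ref = cache.get(key);
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            loads++;
            cache.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Clears the reference by hand to mimic what the collector does under memory pressure. */
    void simulateCollection(K key) {
        SoftReference<V> ref = cache.get(key);
        if (ref != null) {
            ref.clear();
        }
    }

    /** Number of times the backend had to be consulted. */
    int loadCount() {
        return loads;
    }
}
```

A caller never sees whether a value came from memory or from the backend, which is the property that lets a model be larger than the heap. The price is that any value may have to be loaded again at an unpredictable moment.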

Dimitrios Kolovos of York University described how GMT's Epsilon can be used to navigate heterogeneous models.

It's often the case that we need to establish links between different types of models. He described an example problem of applying probability information to an activity diagram for simulation purposes. In this scenario, it's desirable to be able to treat such information as if it were just properties directly on the model. To solve this problem, the Epsilon runtime queries the available models to determine which ones know something about a specific property being associated with some type of object, i.e., which ones define bridges.
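The lookup described above can be sketched roughly as follows. This is plain Java written for illustration, not Epsilon's API; PropertySource, MapBackedSource, and PropertyResolver are hypothetical names, and the real runtime surely does more. The point is only that each model is asked in turn whether it knows a property for a given element, so the probability stored in a separate model reads as if it lived on the activity itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical view of a model that may know extra properties of an element. */
interface PropertySource {
    boolean knows(Object element, String property);
    Object valueOf(Object element, String property);
}

/** Simple in-memory model: element -> (property name -> value). */
final class MapBackedSource implements PropertySource {
    private final Map<Object, Map<String, Object>> values = new HashMap<>();

    MapBackedSource put(Object element, String property, Object value) {
        values.computeIfAbsent(element, e -> new HashMap<>()).put(property, value);
        return this;
    }

    @Override
    public boolean knows(Object element, String property) {
        Map<String, Object> props = values.get(element);
        return props != null && props.containsKey(property);
    }

    @Override
    public Object valueOf(Object element, String property) {
        return values.get(element).get(property);
    }
}

/** Asks each registered model in order and answers from the first that knows the property. */
final class PropertyResolver {
    private final List<PropertySource> sources = new ArrayList<>();

    PropertyResolver register(PropertySource source) {
        sources.add(source);
        return this;
    }

    Optional<Object> resolve(Object element, String property) {
        for (PropertySource source : sources) {
            if (source.knows(element, property)) {
                return Optional.ofNullable(source.valueOf(element, property));
            }
        }
        return Optional.empty();
    }
}
```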

Hajo Eichler of ikv++ talked about executable models.

He described some existing approaches, which, as he said, are interesting but not EMF based, and hence don't directly solve the problem. He defines operational semantics by extending Ecore and has founded it on a well-defined formalism. There is a graphical editor for specifying the dynamic behavior. He showed the debugger working directly with the specified behavior. Effectively this approach allows him to very quickly specify the full semantics of a complex domain specific language.

Moritz Eysholdt of itemis described his work on meta model evolution.

He described a simple example of a document with an author-name attribute evolving to become a model with an author object to carry that name attribute. The problem is of course what happens to existing instances serialized according to the old model. If one can record the changes made to the model, that information can be used to determine how the corresponding instances need to be transformed. It's also possible to simply compare the original and final versions. In general, one needs to collect information about how the original model was changed. That information is encoded as an Epatch, which groups the changes so that corresponding changes can be interpreted in a meaningful way against the instances.
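The author-name example can be sketched in a few lines of plain Java. This is not the Epatch format and not Moritz's tooling; instances are represented as simple maps, and MoveAttributeToChild with its field names is a hypothetical stand-in for one recorded change together with the instance rewrite it implies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical recorded change: an attribute on the parent moves onto a
 * newly introduced child object, e.g. authorName becomes author.name.
 */
final class MoveAttributeToChild {
    private final String oldAttribute;
    private final String newReference;
    private final String newAttribute;

    MoveAttributeToChild(String oldAttribute, String newReference, String newAttribute) {
        this.oldAttribute = oldAttribute;
        this.newReference = newReference;
        this.newAttribute = newAttribute;
    }

    /** Returns a migrated copy of the instance; the input is left untouched. */
    Map<String, Object> migrate(Map<String, Object> instance) {
        Map<String, Object> result = new LinkedHashMap<>(instance);
        if (result.containsKey(oldAttribute)) {
            Object value = result.remove(oldAttribute);
            Map<String, Object> child = new LinkedHashMap<>();
            child.put(newAttribute, value);
            result.put(newReference, child);
        }
        return result;
    }
}
```

Because the change knows both the old and the new shape, the same record that describes the metamodel edit also tells us how to rewrite each old instance. A plain before-and-after comparison of the two metamodels would see only a deleted attribute and an added reference, with nothing to connect them.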

Markus Herrmannsdoerfer described coupled evolution of meta models and models.

He gave us a choice of pretty slides or a demo, so naturally the audience wanted a demo. He showed how his tool records all the changes being made to his sample model. These changes can be grouped and then annotated with instructions for how to migrate the instances. He also has refactoring transformations that can be applied and have built-in meaning in terms of migrating the corresponding instances of the model. You can learn more at his website.

Miguel Garcia presented on techniques to extend languages in the style supported by LINQ with a focus on how to apply the same types of ideas in Java.

Annotated Ecore models can be used to specify details for how best to query instances in an SQL-like way.

Jabier Martinez of European Software Institute talked about some of the challenges faced in model-driven software development, i.e., versioning and transformation, particularly those that scale to very large instances.

These are the problems they are exploring and building tools to solve.

Peter Friese and Heiko Behrens of itemis talked about Xtext, a framework for developing textual DSLs.

The idea is to support syntax that's more human readable than XML. Certainly managing and working with textual source artifacts has a long history of support. The text of course represents deep structure, but all structure is implied by the text, so only the source files need to be version controlled. It's difficult to work in deep meaningful ways with just source though, so an alternative is to represent the deep structure in a repository, but that implies it's necessary to resurrect the textual form when the user needs to view it, which might be lossy in terms of formatting details. A third approach is to combine these two, i.e., create an index of the deep structure from the source and maintain both.

Jos Warmer of Ordina talked about an alternative approach to large models.

Current solutions include repositories and file storage. Their solution is simply to avoid having large models. They focus on a collection of small independent DSLs along with collections of small instances. So models are composed of a number of model units and scaling happens by virtue of scaling the number of units, not growing the unit size. References across units are soft references, e.g., reference by name. Of course this implies that you'll need an index, like CrossX, to understand the soft references in order to reason about the deep structure implied by them. Effectively this is like compiling your source models into a compact representation of the deep structure.
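As a rough illustration of references by name plus an index, here is a small plain Java sketch. It is not CrossX or any real indexer; ModelUnit and NameIndex are hypothetical names, and element names are assumed to be globally unique to keep the example short. The index answers two questions that the units alone cannot answer cheaply: which unit defines a name, and which elements refer to it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

/** A small model unit: each element refers to other elements by name only. */
final class ModelUnit {
    final String unitName;
    // element name -> names of the elements it references (possibly in other units)
    final Map<String, List<String>> elements = new HashMap<>();

    ModelUnit(String unitName) {
        this.unitName = unitName;
    }

    ModelUnit element(String name, String... referencedNames) {
        elements.put(name, Arrays.asList(referencedNames));
        return this;
    }
}

/** Hypothetical index built by scanning units once. */
final class NameIndex {
    private final Map<String, String> definingUnit = new HashMap<>();
    private final Map<String, Set<String>> referrers = new HashMap<>();

    void add(ModelUnit unit) {
        unit.elements.forEach((name, targets) -> {
            definingUnit.put(name, unit.unitName);
            for (String target : targets) {
                referrers.computeIfAbsent(target, t -> new TreeSet<>()).add(name);
            }
        });
    }

    /** Resolves a name-based reference to the unit that defines the target. */
    Optional<String> unitDefining(String elementName) {
        return Optional.ofNullable(definingUnit.get(elementName));
    }

    /** Inverse references: names of all elements that point at the given element. */
    Set<String> referrersOf(String elementName) {
        return referrers.getOrDefault(elementName, Collections.emptySet());
    }
}
```

Only the index needs to span all units; each unit can be loaded, edited, and versioned on its own.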

After the presentations Markus observed that the community is maturing as evidenced by the more sophisticated topics of the interesting presentations we just saw. He then proceeded to gather topics to focus the afternoon break-out discussions.

One was focused on modeling versus programming. Kenn said "programming is modeling but modeling is not necessarily programming." The implication is that programming involves a significant focus on behavior, not just structure. What is the purpose of a model generally? Typically models are defined as a target for some program to manipulate, but Kenn claims that sometimes models are just meant to communicate information. Pretty pictures, oh no!

It was asked why we don't focus modeling more around things we've done with languages like Java, i.e., much like what Jos talked about earlier. In other words, lots of small source files and indexing them. It's key to be able to answer queries about inverse references. A focus on use cases driven by real user needs is important to achieving the same type of comfortable experience that JDT provides. The two main parts of the issue are how to compute the closure for what's to be indexed and how to identify which specific things are important enough to be indexed. So we all agree we need an indexing mechanism and a query language to exploit it. Sven will propose a project that focuses on indexing.

We digressed a little at that point about the need to uniquely identify every object. A URI should suffice for that. Part of the contentious issue was readable names versus arbitrary fragments that might not be human readable, e.g., a UUID. Even the issue of how the resource is serialized becomes important to understanding the issue, i.e., does the reference to some object include the full URI or is it perhaps just name based, just like a name-based Java reference? It's a bit confusing that we mix up how resources reference each other versus how, in general, we can reference any object.

We discussed scaling via lazy loading and importantly the ability to unload things so that the heap doesn't steadily grow. It's even important to partially load objects, i.e., proxies aren't enough because sometimes specific features can be hard to compute so you'd like to defer that computation. It's important that EMF allows integration with any persistent representation so that XML, databases, object-based stores, or repositories can be used. Often the persistent representation has an impact on what type of query mechanisms are supported.
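Deferring an expensive feature and later unloading it can be sketched like this. It is plain Java, not EMF's proxy or resource unloading machinery; LazyFeature and its methods are hypothetical names used only to show the shape of the idea.

```java
import java.util.function.Supplier;

/** Hypothetical holder for a feature whose value is computed only when first read. */
final class LazyFeature<T> {
    private final Supplier<T> computation; // stands in for an expensive load or query
    private T value;
    private boolean loaded;
    private int computations;

    LazyFeature(Supplier<T> computation) {
        this.computation = computation;
    }

    /** Computes the value on first access and reuses it afterwards. */
    T get() {
        if (!loaded) {
            value = computation.get();
            loaded = true;
            computations++;
        }
        return value;
    }

    /** Drops the value so the heap can shrink; the next get() recomputes it. */
    void unload() {
        value = null;
        loaded = false;
    }

    boolean isLoaded() {
        return loaded;
    }

    int computationCount() {
        return computations;
    }
}
```

An object built from such features can exist in memory with only its cheap parts populated, which is the kind of partial loading a whole-object proxy cannot offer.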

It was also asked if scaling should be transparent to the programming model or if we should expect to write our models and algorithms differently to scale better. It seems generally important to think about scalability at design time; it doesn't generally come for free. We concluded that we need good information about examples and best practices based on experience; after all, EMF's EStore API can solve many of these problems. Kenn will start a wiki about best practices.

We then crashed the e4 symposium where an excellent modeling discussion was already underway. It was interesting to hear some of the arguments against modeling because they fit in so well with my classification of all the reasons modeling is stupid.

After that we had a quick summary of the breakout sessions. It was a long and tiring symposium, but very enjoyable.

Later in the evening, the members of the Architecture Council had a nice dinner after which I decided I had enough blog fodder to choke an army, so I needed to get it out of my system. Tomorrow promises to be full of interesting talks followed by a great reception. Stay tuned...


Wassim Melhem said...

I really enjoy your narrative style.

Anonymous said...

Hey Ed,

there's a typo in your summary. Instead of Hieko, please write Heiko Behrens.

Thanks, see you later.

Ed Merks said...

That's two names I misspelled. How rude is that?! I suppose that's what happens when your name isn't John Smith.