Tuesday, December 4, 2007

Winter's Icy Grip

Sunday night we had freezing rain north of Toronto. We never had such things in Vancouver where I grew up, just lots of normal rain. Predictions of freezing rain always make me worried about the garden, but there was little damage and much beauty. Diversity of weather certainly has its points of interest. This shot of a berry locked away in ice was my favorite for the day.


Monday was also the day that the Friends of Eclipse program was launched. As a committer representative on the board, I received advance notice on Sunday. This is a program the committer representatives asked the foundation to look into setting up, so naturally I was thrilled with Ian's results and at the opportunity to be one of the first to demonstrate my friendship with my wallet, not just with my words. I even outbid Mike, but then the stakes quickly became too high for me! (I must say, Jeff is such a competitive guy!) Have a look at how quickly the donor list is growing! There's even a kind remark about modeling. How nice is that?

I'm proud to display a Friends of Eclipse logo on my blog home page. Thanks to Nathan, it's not just a one size fits all logo either, but rather is available in small, medium, and large. I suppose pride is a sin, so I should say I'm honored to display it. Eclipse has been my friend for a long time, so it's nice to know that now I'm Eclipse's friend...

Today the sun came out again. The combination of sun and ice is quite a thing!


I love a sunny day with fresh snow. You can see from all the ice on my blue spruce offset so nicely by the blue sky why I might worry about damage. It's a lot of extra weight to bear!


Even the girls think the snow is kind of cool, but I have to clear a bit of the backyard for them since they tend to disappear in a foot of snow.


I've not been busy just taking pictures either! I've been working on something that's kind of cool and also relatively simple. Have you ever heard of a substitution group? I see I've piqued your curiosity. (Just ask her where the ducky is and she'll give you this same look!)



A substitution group is an XML Schema thing and is what I call syntactic sugar because its primary purpose is to make your XML serialization look pretty by allowing the instance to avoid the use of xsi:type. Goodness knows it's beyond me why anyone cares that something as ugly as XML should be made slightly less ugly when it's barely human readable in the first place, but hey, there are an awful lot of people who are quite obsessed with the prettiness of their XML.

So here's the basic idea. Suppose I defined a type Resource with two subtypes, Folder and File, where a Folder has a list of members of type Resource. To round it out, we might define a FileSystem as having a list of folders. We could define that in XML Schema like this:
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema
  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
  xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
  xmlns:resource="http://www.example.com/resource"
  targetNamespace="http://www.example.com/resource">

  <xsd:element name="fileSystem"
    type="resource:FileSystem"/>
  <xsd:complexType name="FileSystem">
    <xsd:sequence>
      <xsd:element name="folder"
        ecore:name="folders" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:element name="member"
    ecore:name="members" type="resource:Resource"/>
  <xsd:complexType name="Resource" abstract="true">
    <xsd:attribute name="name" type="xsd:string"/>
  </xsd:complexType>

  <xsd:complexType name="Folder">
    <xsd:complexContent>
      <xsd:extension base="resource:Resource">
        <xsd:sequence>
          <xsd:element ref="resource:member"
            ecore:name="members" maxOccurs="unbounded"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

  <xsd:complexType name="File">
    <xsd:complexContent>
      <xsd:extension base="resource:Resource">
        <xsd:sequence>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>

</xsd:schema>
Generating the EMF code for the above, we could use the generated API to create an instance like this:
// Build a small instance with the generated API.
Resource resource =
  resourceSet.createResource
    (URI.createURI("http:///My.resource"));
DocumentRoot documentRoot =
  ResourceFactory.eINSTANCE.createDocumentRoot();
FileSystem fileSystem =
  ResourceFactory.eINSTANCE.createFileSystem();
documentRoot.setFileSystem(fileSystem);
Folder folder1 =
  ResourceFactory.eINSTANCE.createFolder();
fileSystem.getFolders().add(folder1);
folder1.setName("folder1");
File file1 = ResourceFactory.eINSTANCE.createFile();
file1.setName("file1");
folder1.getMembers().add(file1);
resource.getContents().add(documentRoot);
// Print the serialization.
resource.save(System.out, null);
// Save to memory and load the bytes back into a second resource.
ByteArrayOutputStream out = new ByteArrayOutputStream();
resource.save(out, null);
Resource resource2 =
  resourceSet.createResource
    (URI.createURI("http:///My2.resource"));
resource2.load
  (new ByteArrayInputStream(out.toByteArray()), null);
DocumentRoot loadedDocumentRoot =
  (DocumentRoot)resource2.getContents().get(0);
The resulting serialization looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<resource:fileSystem
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns:resource="http://www.example.com/resource">
  <folder xsi:type="resource:Folder" name="folder1">
    <resource:member xsi:type="resource:File" name="file1"/>
  </folder>
</resource:fileSystem>
To our horror, it's full of xsi:types! Clearly it's metadata and it offends our sensibilities to see such metadata mixed in our data. Of course the element and attribute names are metadata too, and xsi:type is something that any conforming XML processor must handle, but let's not cloud an emotional issue with facts.

You might wonder why these xsi:types are even needed. If you think about it, when you see an element named member, you won't know if it's a folder or a file, so you won't know what elements and attributes to expect in addition to those you would expect for a Resource. For EMF such information is important too because we need to create the right type of object.
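
To make that concrete, here's a minimal sketch of how a reader might pick the concrete type of each element from its xsi:type attribute. It uses only the JDK's DOM parser, not EMF, and the class name XsiTypeReader and its methods are made up for illustration; it is not how EMF's loader is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration only; not part of EMF.
public class XsiTypeReader {
  static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";

  // Returns one line per element, such as "member -> File".
  static List<String> describe(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true);
      Document document = factory.newDocumentBuilder().parse(
        new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      List<String> result = new ArrayList<>();
      visit(document.getDocumentElement(), result);
      return result;
    } catch (Exception exception) {
      throw new IllegalArgumentException("Could not parse XML", exception);
    }
  }

  static void visit(Element element, List<String> result) {
    // getAttributeNS returns "" when the attribute is absent.
    String type = element.getAttributeNS(XSI, "type");
    // xsi:type holds a QName such as "resource:File"; keep the local part.
    String local = type.isEmpty()
      ? "(declared type only)"
      : type.substring(type.indexOf(':') + 1);
    result.add(element.getLocalName() + " -> " + local);
    for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
      if (child instanceof Element) {
        visit((Element) child, result);
      }
    }
  }

  public static void main(String[] args) {
    String xml =
      "<resource:fileSystem"
      + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\""
      + " xmlns:resource=\"http://www.example.com/resource\">"
      + "<folder xsi:type=\"resource:Folder\" name=\"folder1\">"
      + "<resource:member xsi:type=\"resource:File\" name=\"file1\"/>"
      + "</folder>"
      + "</resource:fileSystem>";
    describe(xml).forEach(System.out::println);
  }
}
```

Take the xsi:type off the member element and all this reader can say is that it's some kind of Resource, which is abstract, so there's nothing it could instantiate.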

To deal with this type of issue, XML Schema defines the notion of a substitution group. Here's the idea. We define new elements named folder and file and declare that they are in the substitution group for the member element.
<xsd:element name="folder"
  substitutionGroup="resource:member"
  type="resource:Folder"/>
<xsd:element name="file"
  substitutionGroup="resource:member"
  type="resource:File"/>
Now, wherever a member element may appear, a folder or file element can be substituted. And of course, once we see that element, we'll know the type of that element and hence won't need an xsi:type. If we are feeling particularly clever, we might even declare the member element abstract because Resource is abstract and hence we'd never expect to see a member element.
<xsd:element name="member"
  ecore:name="members"
  type="resource:Resource"
  abstract="true"/>
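
As a rough sketch of what those declarations buy a reader, here is a tiny lookup table in plain Java that mirrors them: each element name maps to its declared type, and each substitutable element points at the head of its group. The class SubstitutionGroupSketch and its maps are hypothetical, and the sketch ignores details such as abstract heads and blocked substitutions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of EMF or any XML Schema library.
public class SubstitutionGroupSketch {
  // Element name -> declared type, mirroring the schema's global elements.
  static final Map<String, String> ELEMENT_TYPES = new LinkedHashMap<>();
  // Element name -> head of its substitution group.
  static final Map<String, String> HEADS = new LinkedHashMap<>();

  static {
    ELEMENT_TYPES.put("member", "Resource");
    ELEMENT_TYPES.put("folder", "Folder");
    ELEMENT_TYPES.put("file", "File");
    HEADS.put("folder", "member");
    HEADS.put("file", "member");
  }

  // True if an element called 'name' may appear where 'head' is expected.
  static boolean canSubstitute(String name, String head) {
    for (String current = name; current != null; current = HEADS.get(current)) {
      if (current.equals(head)) {
        return true;
      }
    }
    return false;
  }

  // The element name alone identifies the type, so no xsi:type is needed.
  static String typeOf(String elementName) {
    String type = ELEMENT_TYPES.get(elementName);
    if (type == null) {
      throw new IllegalArgumentException("Unknown element: " + elementName);
    }
    return type;
  }

  public static void main(String[] args) {
    System.out.println(canSubstitute("file", "member"));
    System.out.println(typeOf("file"));
  }
}
```
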
So now we can go back, generate our API again, and try to rerun our previous example, which ends up failing with the incredibly helpful runtime exception that says
Invalid entry feature 'Folder.members'
Naturally we scurry off to the EMF newsgroup where it's explained to us that the members feature is read-only because the member element is abstract. We must specify a "document root" feature in some horrible feature map that we didn't notice was generated in the Folder interface and that we don't know how to use.
FeatureMap getMembersGroup();
Finally we come to realize, based on the helpful advice, that we must change our example like this:
// folder1.getMembers().add(file1);
folder1.getMembersGroup().add
  (ResourcePackage.Literals.DOCUMENT_ROOT__FILE, file1);
Woo hoo. Finally we have it producing what we want.
<?xml version="1.0" encoding="UTF-8"?>
<resource:fileSystem
  xmlns:resource="http://www.example.com/resource">
  <resource:folder name="folder1">
    <resource:file name="file1"/>
  </resource:folder>
</resource:fileSystem>
But it's kind of a bittersweet victory because clearly our syntactic sugar has turned into semantic rat poison. Many folks won't quite be satisfied, and some I know have even gone to great lengths to suppress the rat poison from the API and to write clever code that figures if you add a File to the getMembers() list, you really must mean you want to use the document root's file feature since that's the only choice available. It's clear that this kind of cleverness could be handled by the serializer itself, so I've added support for XMLResource.OPTION_ELEMENT_HANDLER along with an XML Schema annotation to eliminate the semantic rat poison:
<xsd:schema
  xmlns:ecore="http://www.eclipse.org/emf/2002/Ecore"
  ecore:ignoreSubstitutionGroups="true"
If we reload the schema and regenerate the API, it goes back to what we had before. Then we can change the code back to how we had it and add the new option to the regenerated resource factory's createResource method:
result.getDefaultSaveOptions().put
  (XMLResource.OPTION_ELEMENT_HANDLER,
   new ElementHandlerImpl(false));
The result is that we get the same sugary sweet serialization without the rotten decay in the API.
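
The kind of deduction involved can be sketched in a few lines of plain Java. This is only my illustration of the idea, not the code inside ElementHandlerImpl: the classes Resource, Folder, and File below are hypothetical stand-ins for the generated model classes, and the lookup simply walks up the class hierarchy until it finds a type with a substitution group element.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the EMF implementation.
public class ElementNameDeduction {
  // Stand-ins for the generated model classes.
  static abstract class Resource { String name; }
  static class Folder extends Resource {}
  static class File extends Resource {}

  // Substitution group members for the head element "member", keyed by type.
  static final Map<Class<?>, String> MEMBERS = new LinkedHashMap<>();

  static {
    MEMBERS.put(Folder.class, "folder");
    MEMBERS.put(File.class, "file");
  }

  // Picks the element whose declared type is closest to the value's class.
  // Falls back to the head element, where a real serializer would have to
  // write an xsi:type instead.
  static String elementNameFor(Object value, String headName) {
    for (Class<?> type = value.getClass(); type != null; type = type.getSuperclass()) {
      String name = MEMBERS.get(type);
      if (name != null) {
        return name;
      }
    }
    return headName;
  }

  public static void main(String[] args) {
    System.out.println(elementNameFor(new File(), "member"));
    System.out.println(elementNameFor(new Folder(), "member"));
  }
}
```

The point is that the caller only ever adds to the plain members list; choosing between file and folder happens at save time.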

While I was at it, I thought I might as well look at eliminating the need to use a document root, since that rubs some people the wrong way. Often folks don't understand why it even exists and it's effectively just another example of semantic rat poison. As modelers, we're quite used to the concept of having an object that is an instance of some type, and that corresponds directly to having an instance of some type in XML schema. But the root of an XML document doesn't specify a type name, it specifies an element name. Hence a DocumentRoot is like an invisible root object that has features to contain the real root object; the name of that containing feature determines the root element's name. So clearly if we can deduce a substitution group element during serialization based on the type of object being serialized, we can also deduce a correct root element name in a very similar way. The new element handler option supports that as well, so all that remains is to be able to suppress the document root during load and then we can pretend that document roots don't exist at all.
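
Here's a small sketch of that reasoning in plain Java, again with made-up names (RootElementSketch, its nested DocumentRoot and FileSystem classes, and the GLOBAL_ELEMENTS map) rather than the real generated classes. With a document root, the containing feature's name gives the root element name; without one, the name is looked up from the object's type; and suppressing the document root on load amounts to handing back its content.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the EMF implementation.
public class RootElementSketch {
  // Stand-in for the generated FileSystem class.
  static class FileSystem {}

  // Stand-in for a document root holding one real root object.
  static class DocumentRoot {
    final String featureName;
    final Object content;

    DocumentRoot(String featureName, Object content) {
      this.featureName = featureName;
      this.content = content;
    }
  }

  // Global elements from the schema, keyed by their declared type.
  static final Map<Class<?>, String> GLOBAL_ELEMENTS = new LinkedHashMap<>();

  static {
    GLOBAL_ELEMENTS.put(FileSystem.class, "fileSystem");
  }

  // With a document root the containing feature names the root element;
  // otherwise the name is deduced from the object's type.
  static String rootElementName(Object root) {
    if (root instanceof DocumentRoot) {
      return ((DocumentRoot) root).featureName;
    }
    String name = GLOBAL_ELEMENTS.get(root.getClass());
    if (name == null) {
      throw new IllegalArgumentException("No global element for " + root.getClass());
    }
    return name;
  }

  // Suppressing the document root on load: return the real root object.
  static Object unwrap(Object loaded) {
    return loaded instanceof DocumentRoot ? ((DocumentRoot) loaded).content : loaded;
  }

  public static void main(String[] args) {
    FileSystem fileSystem = new FileSystem();
    System.out.println(rootElementName(new DocumentRoot("fileSystem", fileSystem)));
    System.out.println(rootElementName(fileSystem));
  }
}
```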

The new option XMLResource.OPTION_SUPPRESS_DOCUMENT_ROOT can be used to suppress loading of a document root. We just need to add it to the generated createResource method.
result.getDefaultLoadOptions().put
  (XMLResource.OPTION_SUPPRESS_DOCUMENT_ROOT,
   Boolean.TRUE);
Then we can rewrite our example to omit the document root.
// Same example as before, but with no document root.
Resource resource =
  resourceSet.createResource
    (URI.createURI("http:///My.resource"));
FileSystem fileSystem =
  ResourceFactory.eINSTANCE.createFileSystem();
Folder folder1 = ResourceFactory.eINSTANCE.createFolder();
fileSystem.getFolders().add(folder1);
folder1.setName("folder1");
File file1 = ResourceFactory.eINSTANCE.createFile();
file1.setName("file1");
folder1.getMembers().add(file1);
// The file system itself is now the root object.
resource.getContents().add(fileSystem);
resource.save(System.out, null);
// Save to memory and load the bytes back into a second resource.
ByteArrayOutputStream out = new ByteArrayOutputStream();
resource.save(out, null);
Resource resource2 =
  resourceSet.createResource
    (URI.createURI("http:///My2.resource"));
resource2.load
  (new ByteArrayInputStream(out.toByteArray()), null);
FileSystem loadedFileSystem =
  (FileSystem)resource2.getContents().get(0);
It's a thing of beauty. Sugar without decay.

Here's a final image to capture your imagination. It's like a Rorschach inkblot test. What do you see in this picture?

4 comments:

Anonymous said...

Hi there

Very impressive images. i haven't looked at the eclipse stuff because of them ;)

You should really put more focus on your photography. maybe buy a DSLR and get into postprocessing. it will be great :)

Anonymous said...

"...Here's a final image to capture your imagination. It's like a Rorschach inkblot test. What do you see in this picture?..."

Oh, you filthy pervert!!!

Ron said...

From a developer who likes to keep his code shiny clean (and unfortunately forced to use esoteric XML stuff), thanks for reducing our semantic rat poison :-p

EMF is doing a great job balancing the needs of the modeling people, XML people, plug-in developers, etc. Not an easy thing to do! Truly a showcase of Eclipse' philosophy of "extensible frameworks and exemplary tools".

Unknown said...

Ed, you have just made my night! Been trying to work out how to get rid of the xsi:"type" stuff for the last two hours and after stumbling up your blog post my serialisation now matches the sample data that I was given along with the XSDs!

I can sleep easy, thanks you! :-)

Alan