
[XAR2] New implementation of Blocklayout compiler



In the 2.x development branch I've switched the Blocklayout compiler
to a new XSL-based implementation.

Instead of a hand-crafted character parser and code generator, the
Blocklayout compiler is now made up of an XSLT processor based on the
PHP5 XSL extension, using a set of XSL templates to transform Xaraya
templates to the desired output. As a consequence, besides the PHP5
requirement we already imposed, the 2.x branch now also requires the
XSL extension. This extension is bundled by default with PHP5, so we
hope this will not cause any problems.
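
To give a feel for what "a set of templates, each handling one tag"
means, here is a minimal sketch. It is not the actual compiler: that
one is written in XSLT and run by the PHP5 XSL extension. Python's
standard library has no XSLT processor, so the sketch imitates
template-rule matching by walking an ElementTree. The namespace URI,
the <xar:var name="..."/> rule and the PHP it emits are assumptions
made for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for xar: tags; illustrative only.
XAR = "http://xaraya.com/2004/blocklayout"

def compile_node(node):
    """Return the output for one element, like a matching template rule."""
    if node.tag == f"{{{XAR}}}var":
        # Hypothetical rule: <xar:var name="x"/> becomes a PHP echo.
        out = "<?php echo $" + node.get("name") + "; ?>"
    else:
        # Default rule: copy the element (attributes are ignored in this
        # sketch) and recurse into its children.
        inner = (node.text or "") + "".join(compile_node(c) for c in node)
        out = f"<{node.tag}>{inner}</{node.tag}>"
    # Text following the element belongs to the parent's content.
    return out + (node.tail or "")

template = f'<p xmlns:xar="{XAR}">Hello <xar:var name="user"/>!</p>'
compiled = compile_node(ET.fromstring(template))
print(compiled)
```

In real XSLT each branch would be its own xsl:template with a match
pattern, which is why adding a tag mostly means adding one more rule.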

The implementation is far from complete but works well enough to
warrant testing by a broader group of people. The templates as they
are at this moment are still compatible with both BL compilers, but
this is likely to change rapidly as template constructs move to BL2
syntax where appropriate. It is likely that the BL2 compiler will not
remain 100% backward compatible with the existing compiler in some
areas. We keep a record of these incompatibilities, so tracking down
and changing the relevant template constructs should not cause
unnecessary pain.

All xar: tags defined in RFC-0010 should work as documented (with a
few exceptions which are not finished yet), as should the custom tags
defined in the core repository (base and dynamicdata have their own
tag definitions). Modules have not, in general, been considered yet
for the 2.x branch (but we secretly test them anyway; no disasters
so far).

The implementation has been around for a while here locally. Going the
last mile and implementing the remaining tags was done last week. The
preliminary test results are very promising. If you can stand the
occasional ugly exception here and there, I encourage you to take a
look and report your findings (no bugzilla for 2.x yet, it can't keep
up with the change rate we have ;-) ).

Some observations from testing I'd like to share:

- as XSL is very strict in what it will and won't do for you, there's
very little room for 'cheating' with undefined entities, unclosed tags
or other invalid XML constructs. This in itself can be a pain
sometimes, but the good news is that in core I only had to change a
handful of things to bring it into shape.

- we are transforming the input .xd/.xt/.xml directly to the compiled
cache, skipping the step of explicitly building a node tree in PHP. Of
course, behind the scenes the XSL processor does roughly the same, but
inside the extension and invisibly. This is a lot faster (the
extension is written in C, which in general should be faster than
interpreted PHP). For some templates, running with the compile cache
off is even faster than with the cache turned on, which was something
of a surprise.

- being forced to adhere to the XML rules showed a couple of 
inconsistencies in the definition of the XAR template language. Design 
errors, if you will. These will be corrected in the syntax, where they 
will not lead to massive incompatibilities.

- The new compiler is 'context aware', which means we have much more
control over how templates will be processed. One of the more
interesting consequences could be, for example, that the
<xar:mlstring/> tag in its plain form (that is, not containing #(1)
like constructs) is redundant. The compiler has enough contextual info
to detect text nodes as such and automatically put them up for
translation. Whether this is going to happen is not yet certain, but
it would make templating quite a bit more comfortable in my opinion.

- creating a new tag is very easy, assuming a little XSL knowledge.
Most of the tags were literally implemented in 30 minutes or less.
On the other hand, if a tag *is* complex, it can take a whole day to
figure out how to do it.

- it has become clear that we will probably need to provide some
'convenience' settings to go with the new compiler. For example:
everyone is sort of used to using &nbsp; or &eacute; etc. These are
HTML entities; by default XML has no knowledge of them, so we need to
put some rules into place for those types of situations (either
switching to numeric entities only, or predefining some of the more
common ones). There are a couple of other areas like that which we
will need to address to keep complications to a minimum.

- similarly, for the expressions we use in BL (the things between
#-pairs) there is less room for cheating, especially in attributes.
One of the challenges is to define precisely how '#' is handled, as it
also has a special meaning in numeric entities (like &#160;), in
anchors in XML dialects like XHTML (named #anchors) and in URI
addressing (#target in a URI). Solvable, but easy to do wrong.

- we get a couple of things for free now which were problematic in the
earlier implementation. Most importantly, <![CDATA[...]]> sections
work out of the box now (very, very comfortable in templating).
Another example is processing instructions (PIs) like <?php, <?perl or
<?xar, which can now be processed natively if we like (they are
disabled for now, as they are in BL 1).
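
Several of the observations above (no room for unclosed tags, HTML
entities being unknown to XML, CDATA and PIs coming for free) are
properties of any conforming XML parser, not of our compiler in
particular. As a rough illustration, here is a small sketch using
Python's standard library parsers rather than the PHP5 XSL extension.
The sample fragments and the inline DOCTYPE are made up for the
example, and predefining entities this way is only one possible
approach, not a decision.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

def parses(xml_text):
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# An unclosed tag that HTML tolerates is a hard error in XML.
unclosed = parses("<p>one<br></p>")

# &nbsp; is an HTML entity; plain XML does not know it.
named = parses("<p>a&nbsp;b</p>")

# Option 1: use the numeric form only.
numeric = parses("<p>a&#160;b</p>")

# Option 2: predefine the common entities in an internal DTD subset.
declared = parses('<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>')

# CDATA sections let '<' and '&' through without escaping.
cdata = ET.fromstring("<p><![CDATA[if (a < b && c) {}]]></p>").text

# Processing instructions arrive as their own node type with a target.
pi = minidom.parseString("<p><?xar foo?></p>").documentElement.firstChild

print(unclosed, named, numeric, declared)
print(cdata)
print(pi.target, pi.data)
```

The two entity options in the sketch correspond to the two rules
mentioned above: numeric entities only, or a predefined set of the
common named ones.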

Personally, I'm very excited about this because it opens up a whole
new playground, giving more power to output generation (which is
already pretty powerful in my opinion), and it marks the next step in
creating an output-independent templating system.

So, good times :-)

marcel.




_______________________________________________
Xaraya_devel mailing list
Xaraya_devel (at) xaraya.com
http://xaraya.com/mailman/listinfo/xaraya_devel