ICWiener | 10 years ago

Did you not read my reply, really?

I already mentioned that with a Lisp-like data format, shared sub-expressions can be denoted using CL's reader variables:

      (document
        #1=(author (id "Bob") ... )
        #2=(author (id "Alice") ... )
        (span (author #1#) "written by Bob")
        (span (author #2#) "written by Alice")
        (span (author #1#) "written by Bob"))

I do not claim that this is the most appropriate solution in all cases, just that we are not forced to introduce levels of indirection when they are unnecessary. Now, if I am using Lisp and I want to introduce external references to authors described in other documents, I could introduce metadata with an appropriate semantic structure:

       (external-element (pathname (directory (relative "path" "to")) 
                                   (type "lisp")
                                   (name "file")) 
                         (tree-path 2 1 3 2 2 3))

This would be a practical way to encode a precise location in a tree stored in an external file, and I could use this form everywhere I need to reference an object. The tree-path notation is also handy because it makes no distinction between an attribute and an element: it only says which branch to take at each step from the root.
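To make the tree-path idea concrete, here is a minimal sketch in Python, using nested lists as a stand-in for S-expressions. The function name `follow_tree_path` and the indexing convention (zero-based, with the head symbol at index 0) are assumptions made for illustration; the comment above does not say how branches are numbered.

```python
def follow_tree_path(tree, path):
    """Walk `tree` by taking the branch at each index in `path`.

    Assumption: indices are zero-based and index 0 is the head symbol.
    """
    node = tree
    for index in path:
        if not isinstance(node, list):
            # An atom has no branches, so the path is invalid here.
            raise ValueError(f"cannot descend into atom {node!r}")
        node = node[index]
    return node


# Nested lists standing in for the (document (author (id "Bob")) ...) form.
document = [
    "document",
    ["author", ["id", "Bob"]],
    ["author", ["id", "Alice"]],
]

# Branch 2 is the second author, branch 1 its (id ...) form, branch 1 the value.
alice = follow_tree_path(document, [2, 1, 1])
```

Every step is just "take branch n", so the same walk works whether the target would be called an attribute or an element in XML terms.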

Now, with XML attributes, I would typically have an "xref" attribute. How can we model xref attributes? If we wanted structured data, we would need to create external tags for the same concepts as above, such as <pathname>, give each xref a local identifier, and refer to each xref indirectly through that identifier, because attributes can only hold strings. I mean:

     <author xref="xref02"> 
     ...
     <xref id="xref02">
       <pathname> ... </pathname>
       <tree-path> ... </tree-path>
     </xref>
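As a rough sketch of what this indirection costs the consumer, the Python below resolves such an `xref` attribute with the standard `xml.etree.ElementTree` module. The enclosing `<document>` element, the `resolve_xref` helper, and the contents of `<pathname>` and `<tree-path>` are filled in for illustration only (borrowed from the Lisp form above); they are assumptions, not part of the original markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element contents are illustrative only.
SAMPLE = """
<document>
  <author xref="xref02"/>
  <xref id="xref02">
    <pathname>path/to/file.lisp</pathname>
    <tree-path>2 1 3 2 2 3</tree-path>
  </xref>
</document>
"""


def resolve_xref(root, element):
    """Follow the string-valued xref attribute to the structured <xref> element."""
    key = element.get("xref")
    target = root.find(f".//xref[@id='{key}']")
    if target is None:
        raise KeyError(key)
    return target


root = ET.fromstring(SAMPLE)
author = root.find("author")
xref = resolve_xref(root, author)

# Even inside the element, the path is still a string that must be parsed.
steps = [int(step) for step in xref.findtext("tree-path").split()]
```

The attribute itself carries only the key `"xref02"`; the structure lives elsewhere and has to be looked up.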

Or we do as everybody does and encode it as a complex string, as in XMI, ECORE, or any other custom format, hoping that HTML entities are properly escaped.

Besides, you failed to notice that you had <author> tags, which goes precisely against your idea that there should be a place for "meta-data" and a place for "data": authors are now effectively part of the content of the document, and not only meta-information.

If you think my examples are artificial, open the source code of this page, and observe how any kind of complex information written in attributes has to be properly escaped to bypass the limitation of stringly-typed data:

       href="reply?id=9556252&amp;goto=item%3Fid%3D9555880"

       href="vote?for=9556252&amp;dir=up&amp;auth=0UU000REDACTED000208d8b9f4a45575b4edea3779&amp;goto=item%3Fid%3D9555880"
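The two layers of escaping in these links can be reproduced with Python's standard library. This is only a sketch: the `vote_href` helper and its parameter names are hypothetical, and the `auth` token is left out.

```python
from html import escape, unescape
from urllib.parse import quote, unquote


def vote_href(item_id, direction, return_to):
    """Build an attribute-safe href with a nested, percent-encoded URL."""
    # Layer 1: the nested URL must be percent-encoded so that its own
    # "?" and "=" are not read as part of the outer query string.
    inner = quote(return_to, safe="")
    url = f"vote?for={item_id}&dir={direction}&goto={inner}"
    # Layer 2: the whole URL is HTML-escaped to sit inside an attribute.
    return escape(url, quote=True)


href = vote_href(9556252, "up", "item?id=9555880")

# A consumer has to undo both layers, in reverse order.
goto_value = unquote(unescape(href).split("goto=")[1])
```

Here `href` comes out as `vote?for=9556252&amp;dir=up&amp;goto=item%3Fid%3D9555880`, matching the shape of the link above.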

Notice how you need to escape HTML entities in inline JavaScript attributes (onclick) but not in script tags. Why is inline JavaScript not a tag instead?

(see http://stackoverflow.com/questions/8749001/escaping-html-ent...).

Whatever example you choose, you cannot deny that attributes are not given the same rights as elements: they cannot contain structured data, and they cannot have meta-attributes themselves.

lisper | 10 years ago

> Did you not read my reply, really?

I did read it.

> I already mentioned that with a Lisp-like data format, shared sub-expressions can be denoted using CL's reader variables:

Yes, of course this is possible. But that's just a different way of implementing tags (and not a very good one, because your tags are constrained to be numeric).

> we are not forced to introduce indirection levels when unnecessary

That's a tautology.

> I could use this form everywhere I need to reference an object.

Of course you could. Most problems have more than one reasonable solution. But pointing out one reasonable solution is irrelevant to the question of whether a different solution is also reasonable.

> your idea that there should be a place for "meta-data" and a place for "data"

That wasn't exactly my idea. What I said was that there was value in having a syntactic distinction between data and meta-data. But I didn't say that this distinction should be universal. In fact it is impossible to distinguish between data and metadata in general, so you can always come up with examples where a particular datum's role is ambiguous. That doesn't change the fact that in many practical circumstances, having a syntactic distinction is appropriate and useful.

> observe how any kind of complex information written in attributes has to be properly escaped

Again, citing circumstances where things fall apart does not change the fact that in many practical circumstances, having a syntactic distinction between data and meta-data is appropriate and useful.

If you choose to reply to this, please remember: I'm a Lisp fan. (Look at my HN user ID!) I hate XML. I much prefer S-expressions. When I have to deal with XML, the first thing I do is parse it into S-expressions. The world would be a better place if everything were S-expressions and no one used SGML or any of its devil-spawn syntaxes. But that's not the world we live in. In the world we live in, where markup languages exist and are required to have matching end tags, attributes are a defensible design.