[Notes] Markup Languages: XML and Namespacing

Miguel Menéndez

XML and Namespacing.

Markup languages

  • Presentation-oriented: HTML, XHTML
  • Description-oriented: XML, SGML
  • Procedure-oriented: LaTeX, PostScript

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE books SYSTEM "books.dtd">
<?xml-stylesheet type="text/xsl" href="styles.xsl" ?>
<?xml-stylesheet type="text/css" href="styles.css" ?>
<![CDATA[With CDATA you can write <div id='ID' class="class"></div>]]>

Entities

  • &gt; (>)
  • &lt; (<)
  • &amp; (&)
  • &quot; (")
  • &apos; (')
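
As a rough illustration of how entities and CDATA relate, the following Python sketch (standard library only; the <note> element and the sample string are made up for this example) writes the same text once with entities and once inside a CDATA section, then checks that the parser returns the same characters in both cases.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample text containing characters that need escaping.
raw = '<div id=\'ID\' class="class"></div> & more'

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)

# The same text, once with entities and once inside a CDATA section.
with_entities = ET.fromstring(f"<note>{escaped}</note>")
with_cdata = ET.fromstring(f"<note><![CDATA[{raw}]]></note>")

# All five predefined entities are resolved by the parser.
predefined = ET.fromstring("<note>&gt;&lt;&amp;&quot;&apos;</note>")

print(escaped)
print(with_entities.text == raw)
print(with_cdata.text == raw)
print(predefined.text)
```

Either form is only a way of writing the text in the file; once parsed, the application sees the original characters.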

Well-formed XML

  • A single root element.
  • Elements, attributes and entities with correct syntax:
    • Every start tag has a matching end tag (empty-element tags end in />)
    • Attribute values enclosed in quotation marks or apostrophes
    • Names are case-sensitive

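The text content of elements and the values of attributes cannot contain a literal < or &. An attribute value also cannot contain the quote character that delimits it (" or '). Escaping > is customary, and it is required in the sequence ]]>. Use entities or, in element content, a CDATA section instead. Accented characters and ñ can appear as they are.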

Element and attribute names must be valid XML names, i.e. they begin with a letter (accented or not), an underscore, or a colon (allowed, but best avoided because colons are reserved for namespace prefixes). Any combination of letters, digits, periods, hyphens, underscores, and colons can follow. Spaces and any other punctuation characters are not allowed.

Element names cannot start with the text xml, whether in lowercase, uppercase, or any combination of both; such names are reserved.

Invalid: <9number>, <number?>, <phone num>

Valid: <number>, <_number>, <phone_num>, <number9>

Invalid: <div></div id=""> (attribute on the end tag), <script async> (attribute without a value)

Valid: <div id=""></div>, <script async="async">
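
One way to try these rules out is to hand small fragments to a parser and see which ones it rejects. The sketch below uses Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample lists are assumptions made for this illustration. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET

def is_well_formed(fragment: str) -> bool:
    """Return True if the parser accepts the fragment as a complete document."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError:
        return False
    return True

# Fragments that follow the naming and syntax rules.
valid = ["<number/>", "<_number/>", "<phone_num/>", "<number9/>",
         '<div id=""></div>']

# Fragments that break one rule each.
invalid = [
    "<9number/>",          # name starts with a digit
    "<number?/>",          # punctuation not allowed in names
    "<phone num/>",        # space inside a name
    '<div></div id="">',   # attribute on the end tag
    "<a></A>",             # names are case-sensitive
    "<a/><b/>",            # more than one root element
]

for fragment in valid + invalid:
    print(f"{fragment!r}: {is_well_formed(fragment)}")
```

Note that this parser accepts names beginning with xml, so the reservation of that prefix is not something this check detects.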

Namespaces

<?xml version="1.0" encoding="UTF-8" ?>
<document>
  <miguel.title>Test Document</miguel.title>
  <content>
    <html>
      <head>
        <xhtml.title>XHTML title</xhtml.title>
      </head>
      ...

<?xml version="1.0" encoding="UTF-8" ?>
<miguelmenendez.pro.document>
  <miguelmenendez.pro.title>Test Document</miguelmenendez.pro.title>
  <miguelmenendez.pro.content>
    <www.w3c.org.html>
      <www.w3c.org.head>
        <www.w3c.org.title>XHTML Title</www.w3c.org.title>
        ...

<?xml version="1.0" encoding="UTF-8" ?>
<miguel:document xmlns:miguel="https://miguelmenendez.pro/document"
                xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <miguel:title>Test Document</miguel:title>
  <miguel:content>
    <xhtml:html>
      <xhtml:head>
        <xhtml:title>HTML Title</xhtml:title>
      </xhtml:head>
      ...

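Almost any name can be used as a namespace prefix, as long as it is a valid XML name without a colon and does not start with xml. The prefix is only a local shorthand; what identifies the namespace is its URI. The URI should be unique, but parsers never fetch it over a connection: it is nothing more than a logical name for the namespace.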
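
To see that the prefix is only shorthand, here is a small Python sketch (standard library; the shortened documents and the alternative prefixes m and h are assumptions for this example). It parses two documents that differ only in their prefixes and compares the expanded element names, which ElementTree writes as {URI}localname.

```python
import xml.etree.ElementTree as ET

doc_a = """<miguel:document xmlns:miguel="https://miguelmenendez.pro/document"
                 xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <miguel:title>Test Document</miguel:title>
  <xhtml:title>HTML Title</xhtml:title>
</miguel:document>"""

# Same structure, different (hypothetical) prefixes bound to the same URIs.
doc_b = """<m:document xmlns:m="https://miguelmenendez.pro/document"
            xmlns:h="http://www.w3.org/1999/xhtml">
  <m:title>Test Document</m:title>
  <h:title>HTML Title</h:title>
</m:document>"""

tags_a = [el.tag for el in ET.fromstring(doc_a).iter()]
tags_b = [el.tag for el in ET.fromstring(doc_b).iter()]

for tag in tags_a:
    print(tag)
print(tags_a == tags_b)
```

The two title elements stay distinct because their URIs differ, while the choice of prefix has no effect on the parsed names.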

<?xml version="1.0" encoding="UTF-8" ?>
<document xmlns="https://miguelmenendez.pro/document"
          xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <title>Test Document</title>
  <content>
    <xhtml:html>
      <xhtml:head>
        <xhtml:title>HTML Title</xhtml:title>
      </xhtml:head>
      ...

The scope of a default namespace is the element in which it is declared and its descendant elements. Attributes, however, are not associated with any namespace by default. For an attribute to belong to a namespace, its name must carry a prefix.
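
The difference between elements and attributes can be checked with a short Python sketch (standard library; the lang and xhtml:class attributes are invented for this illustration and are not part of the notes' examples).

```python
import xml.etree.ElementTree as ET

doc = """<document xmlns="https://miguelmenendez.pro/document"
          xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <title lang="en" xhtml:class="main">Test Document</title>
</document>"""

root = ET.fromstring(doc)
title = root[0]

# The unprefixed element inherits the default namespace of its ancestor.
print(title.tag)

# The unprefixed attribute has no namespace; the prefixed one does.
for name in sorted(title.attrib):
    print(name)
```

Here the title element picks up the default namespace, lang stays without a namespace, and only xhtml:class is reported with a URI.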

<?xml version="1.0" encoding="UTF-8" ?>
<document xmlns="https://miguelmenendez.pro/document">
  <title>Test Document</title>
  <content>
    <xhtml:html xmlns:xhtml="http://www.w3.org/1999/xhtml">
      <xhtml:head>
        <xhtml:title>XHTML title</xhtml:title>
      </xhtml:head>
      ...

<?xml version="1.0" encoding="UTF-8" ?>
<document xmlns="https://miguelmenendez.pro/document">
  <!-- start of the miguelmenendez.pro namespace -->
  <title>Test Document</title>
  <content>
    <html xmlns="http://www.w3.org/1999/xhtml">
      <!-- from here on, the default namespace is that of XHTML -->
      <head>
        <title>HTML title</title>
      </head>
      ...

Declaring xmlns="" on an element removes the default namespace, so that element and its unprefixed descendants belong to no namespace:

<p xmlns="">
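
A minimal Python sketch of this behaviour, using the standard library (the em child element and the surrounding document are assumptions added for the example):

```python
import xml.etree.ElementTree as ET

doc = """<document xmlns="https://miguelmenendez.pro/document">
  <title>Test Document</title>
  <p xmlns="">
    <em>plain</em>
  </p>
</document>"""

root = ET.fromstring(doc)
tags = [el.tag for el in root.iter()]

# Elements in a namespace appear as {URI}name; p and em appear bare.
for tag in tags:
    print(tag)
```

The document and title elements keep the default namespace, while p and its child are reported without any URI.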
