Defining Elements and their Children
In our previous example, we explained that we had defined an element named CONTACT that contains a NAME element:
<?xml version = "1.0" encoding="UTF-8" standalone = "yes"?>
<!DOCTYPE CONTACTS [
<!ELEMENT CONTACTS ANY>
<!ELEMENT CONTACT (NAME)>
<!ELEMENT NAME (#PCDATA)>
]>
<CONTACTS>
<CONTACT>
<NAME>Roger Kaplan</NAME>
</CONTACT>
</CONTACTS>
Well, truthfully, we were "mostly" right in our explanation of the DTD. More correctly, the example defined an element named CONTACT that must contain exactly one child element named NAME.

Remember that DTDs give you quite a bit of flexibility to specify exactly what elements can contain. Using regular expression pattern matching, DTDs allow you to specify very complex logical relationships between elements and their children. For example, you could specify such things as: an element may contain exactly one child, one or more children, zero or more children, or at most one child. You could also specify more complex relationships, such as "element X is valid if it contains one or more children named Y OR one child named Z".

Element definitions are described by their Element Content Models (ECM); that is, all the stuff in the parentheses. :) Thus, as we saw, the ECM of the CONTACT element is (NAME):
<!ELEMENT CONTACT (NAME)>

The contents of the ECM are governed by a set of regular expression rules very similar to those used in UNIX. But if you are not familiar with UNIX, don't worry, it is pretty easy. The idea of regular expressions is that certain characters are used to communicate matching logic. Take a look at the possible meta characters:
  ,    Sequence: the children must appear in the order listed
  |    Choice: exactly one of the listed alternatives must appear
  ?    Optional: zero or one occurrence
  *    Zero or more occurrences
  +    One or more occurrences
  ( )  Grouping: the enclosed items are treated as a single unit
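As a quick preview, the "one or more children named Y OR one child named Z" relationship described above might be written along the following lines. This is only a sketch: X, Y, and Z are the placeholder names from the prose, and giving Y and Z plain character data as content is an assumption made so that the document is complete.

```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE X [
<!-- X must hold either one or more Y children, or exactly one Z child -->
<!ELEMENT X (Y+ | Z)>
<!-- Assumption: Y and Z simply hold character data -->
<!ELEMENT Y (#PCDATA)>
<!ELEMENT Z (#PCDATA)>
]>
<X>
<Y>first</Y>
<Y>second</Y>
</X>
```

Replacing the two Y children with a single Z would also be valid. Mixing Y and Z children, or leaving X empty, would not.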
Of course, these are best seen by example. Let's consider the simplest case of defining an order of child elements.

Ordering Child Elements
Consider the following DTD snippet:
<!ELEMENT CONTACT (NAME, EMAIL)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
In this case, we expect to see XML along the lines of
<CONTACT>
<NAME>Jim Sanger</NAME>
<EMAIL>sanger@sanger.com</EMAIL>
</CONTACT>
We used a comma to delimit the list because a comma-separated list is ordered: the children must appear in exactly the sequence given. The following XML would therefore be invalid, because the EMAIL element appears before the NAME element:
<CONTACT>
<EMAIL>sanger@sanger.com</EMAIL>
<NAME>Jim Sanger</NAME>
</CONTACT>
A pipe, by contrast, delimits a list of alternatives, exactly one of which may appear. It does not by itself mean "these children in any order". [Thanks to Jason Suwala for pointing out our error on unordered children. --ed] Thus, if we wanted to accept the two children in either order, we could redefine our DTD to use a choice between the two possible sequences:
<!ELEMENT CONTACT ((NAME, EMAIL) | (EMAIL, NAME))>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
Then both of the XML fragments above would be valid.

Repeated Elements
What do you think the following DTD snippet would imply?
<!ELEMENT CONTACT (NAME, EMAIL+)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
Take a look at the regular expression character chart above and guess. That is right! It would mean that a CONTACT element must contain exactly one NAME child followed by one or more EMAIL children, such as:
<CONTACT>
<NAME>Jim Sanger</NAME>
<EMAIL>sanger@sanger.com</EMAIL>
<EMAIL>sanger@yahoo.com</EMAIL>
<EMAIL>sanger@netscape.com</EMAIL>
</CONTACT>
What about the following?
<CONTACT>
<NAME>Jim Sanger</NAME>
</CONTACT>
Well, that would be invalid because the "+" sign specifies "one or more". To allow for "zero or more" occurrences, you must use a "*", such as
<!ELEMENT CONTACT (NAME, EMAIL*)>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>

Grouping Elements
Children can be grouped using parentheses. Thus, the following DTD snippet would specify that a CONTACT element must contain one or more pairs of children, each pair being a NAME followed by an EMAIL:
<!ELEMENT CONTACT (NAME, EMAIL)+>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT EMAIL (#PCDATA)>
That would look something like the following:
<CONTACT>
<NAME>Jim Sanger</NAME>
<EMAIL>sanger@sanger.com</EMAIL>
<NAME>James Sanger</NAME>
<EMAIL>james.sanger@sanger.com</EMAIL>
<NAME>Kris Kringle</NAME>
<EMAIL>santa@sanger.com</EMAIL>
</CONTACT>

By Selena Sol at eXtropia