XML: CDATA and Entities
CDATA
As we have already said, it is a pretty good rule of thumb to treat anything outside of tags as character data and anything inside of tags as markup. But alas, in one case this is not true. In the special case of CDATA blocks, the XML processor ignores all tags and entity references and treats them just like any old character data.
CDATA blocks are provided as a convenience for when you want to include large blocks of special characters as character data but do not want to use entity references all the time. What if you wanted to write about an XML document in XML? Consider the following example, in which you have an example tag in your XML Guide written in XML:
<EXAMPLE>
As you can see, you would be forced to use entity references for all the tags. YUCK! To avoid the inconvenience of translating all special characters, you can use a CDATA block to specify that everything inside it should be considered character data, whether or not it "looks" like a tag or entity reference. Consider the following example:
<EXAMPLE>
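To see how a parser treats the two approaches, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The sample strings (including the NAME element and the "AT&amp;T" text) are hypothetical stand-ins, not the examples from this guide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; the element names are illustrative only.
escaped = "<EXAMPLE>&lt;NAME&gt;Peter Williams&lt;/NAME&gt;</EXAMPLE>"
cdata = "<EXAMPLE><![CDATA[<NAME>Peter Williams</NAME>]]></EXAMPLE>"
cdata_entity = "<EXAMPLE><![CDATA[AT&amp;T]]></EXAMPLE>"

escaped_root = ET.fromstring(escaped)
cdata_root = ET.fromstring(cdata)
entity_root = ET.fromstring(cdata_entity)

# Both forms yield the same character data, and the CDATA version
# contains no child elements even though it "looks" like it has a tag.
print(escaped_root.text)
print(cdata_root.text)
print(len(cdata_root))

# Entity references inside a CDATA block are left untouched.
print(entity_root.text)
```

The escaped and CDATA versions should come out as the same text, which is the point of the convenience: the CDATA form is simply easier to write and read.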
As you might have guessed, a CDATA block opens with <![CDATA[ and closes with ]]>, so the character string ]]> cannot appear inside the block itself.
Comments
Not only will you sometimes want to include tags in your XML document that you want the XML processor to treat as plain character data, but sometimes you will want to put text in your document that you want the XML processor to ignore (not display at all). This type of text is called COMMENT text. You will be familiar with comments from HTML. In HTML, you specified comments using the <!-- and --> delimiters, and XML comments are written the same way:
<!-- Begin the Names -->
When using comments in your XML documents, however, you should keep in mind a couple of rules. First, you should never have "--" within the text of your comment, and the comment text should not end with a "-", because the XML specification forbids both and a conforming processor will report an error. (A single "-" elsewhere in the comment is fine.) Second, never place a comment within a tag. Thus, the following code is not well-formed XML:
<NAME <!--The name --> >Peter Williams</NAME>
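These rules can be checked with a short Python sketch built on xml.etree.ElementTree. The helper name is_well_formed and the sample documents are assumptions made for illustration; the last sample is the poorly-formed line shown above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents.
good = "<NAMES><!-- Begin the Names --><NAME>Peter Williams</NAME></NAMES>"
single_hyphen = "<NAMES><!-- first - last --><NAME>Peter Williams</NAME></NAMES>"
double_hyphen = "<NAMES><!-- first -- last --><NAME>Peter Williams</NAME></NAMES>"
inside_tag = "<NAME <!--The name --> >Peter Williams</NAME>"

for label, text in [
    ("plain comment", good),
    ("single hyphen", single_hyphen),
    ("double hyphen", double_hyphen),
    ("comment inside a tag", inside_tag),
]:
    print(label, is_well_formed(text))
```

Only the first two samples should be accepted; the parser rejects the double hyphen and the comment placed inside a tag.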
Likewise, never place a comment inside of an entity declaration, and never place a comment before the XML declaration, which must always be the very first thing in any XML document.
Comments can also be used to comment out tag sets. Thus, in the following case, all the names will be ignored except for Barbara Tropp.
<!-- don't show these
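As a rough illustration of commenting out tag sets, the Python sketch below parses a small document with xml.etree.ElementTree. The NAMES wrapper element and the commented-out names are hypothetical; only Barbara Tropp comes from the text above. It also checks the rule about comments placed ahead of the XML declaration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the wrapper element and the hidden names are made up.
doc = """<NAMES>
  <!-- don't show these
  <NAME>Peter Williams</NAME>
  <NAME>Hypothetical Person</NAME>
  -->
  <NAME>Barbara Tropp</NAME>
</NAMES>"""

# Elements inside the comment never reach the application.
names = [name.text for name in ET.fromstring(doc).findall("NAME")]
print(names)

# A comment placed before the XML declaration makes the document unparseable.
try:
    ET.fromstring('<!-- too early --><?xml version="1.0"?><NAMES/>')
    comment_first_ok = True
except ET.ParseError:
    comment_first_ok = False
print(comment_first_ok)
```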
However, if you do comment out blocks of tags, make sure that the remaining XML is well-formed.
Processing Instructions
We have already seen something that looks like a processing instruction: the XML declaration. (Strictly speaking, the XML specification defines the XML declaration as its own construct, but it shares the same syntax.) And if you recall, when we introduced the XML declaration we promised to return to the concept of processing instructions to explain them as a category. So here we are.
A processing instruction is a bit of information meant for the application using the XML document. That is, it is not really of interest to the XML parser. Instead, the instruction is passed intact straight to the application using the parser. The application can then pass it on to another application or interpret it itself. All processing instructions follow the generic format of:
<?NAME_OF_APPLICATION_INSTRUCTION_IS_FOR INSTRUCTIONS?>
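As you might imagine, you cannot use any combination of "xml" (in any mix of upper and lower case) as the NAME_OF_APPLICATION_INSTRUCTION_IS_FOR, since that name is reserved. You might, however, have something like:
<?JAVA_OBJECT JAR_FILE = "/java/myjar.jar"?>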
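To show how a processing instruction reaches the application, here is a small Python sketch using the standard library's xml.dom.minidom, which keeps processing instructions as nodes. The DOCUMENT root element is a hypothetical placeholder; the instruction itself is the one shown above.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

doc = minidom.parseString(
    '<?xml version="1.0"?>'
    '<?JAVA_OBJECT JAR_FILE = "/java/myjar.jar"?>'
    '<DOCUMENT/>'  # hypothetical root element
)

# Collect every processing instruction that sits at the top level.
# The parser hands over the target name and the raw instruction text
# without trying to interpret either one.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)

# The reserved name "xml" cannot be used for an instruction inside the document.
try:
    minidom.parseString('<DOCUMENT><?xml version="1.0"?></DOCUMENT>')
    reserved_ok = True
except ExpatError:
    reserved_ok = False
print(reserved_ok)
```

What the application does with the target and data (here, presumably locating a JAR file) is entirely up to the application.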
Entities
To a large degree, the discussion of entities is more relevant to the next section, on writing "valid" documents, than to this section, on writing "well-formed" documents. As such, we will discuss entities in greater detail in the next section. Nevertheless, some issues make sense within this section, because entities must be well-formed as well as valid. So, in this section, we will introduce entities in terms of their basic syntax and leave the nitty gritty for a little bit later.
As we said before, entities are essentially aliases that allow you to refer to large sections of text without having to type them out every time you want to use them. Suppose you have your letterhead saved as an entity in a shared file. Then, every time you write a letter in XML, you might say something like:
<LETTER>
Notice that the letterhead might expand out to:
My Company
However, instead of typing that out in every letter, you just use:
&letterhead;
There are two types of entities, general entities and parameter entities, and each entity has two parts: the declaration and the entity reference.
General Entities
General entity declarations look something like:
<!ENTITY NAME "text that you want to be represented by the entity">
which might look like the following in the real world:
<!ENTITY full_name "Diego Ramirez Valenzuela Martinez Perez the 5th">
NOTE: You can specify an entity whose text is defined external to the document by using the SYSTEM keyword followed by the location of the external file:
<!ENTITY license_agreement
In this case, the XML processor will replace the entity reference with the contents of the document specified.
Parameter Entities
Parameter entities, which can also be either internal or external, are only used within the DTD, which we will discuss in the next section, so we will defer a serious discussion until then. However, we will mention that a well-formed parameter entity declaration looks the same as a general entity declaration except that it includes the "%" specifier:
<!ENTITY % NAME "text that you want to be represented by the entity">
The DOCTYPE Declaration
If you want to declare entities, you MUST do so within the document type (DOCTYPE) declaration, which follows the XML declaration:
<?xml version="1.0"?>
Thus, you might have something like the following (consider how much easier changing office addresses is when you use entities!):
<?xml version="1.0"?>
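As a minimal sketch of how a declared entity gets expanded, the Python snippet below parses a complete document with xml.etree.ElementTree. The letter body and the short "My Company" replacement text are assumptions for illustration; a real letterhead would likely hold a full address.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter: the entity is declared inside the DOCTYPE declaration,
# which sits between the XML declaration and the root element.
letter = """<?xml version="1.0"?>
<!DOCTYPE LETTER [
  <!ENTITY letterhead "My Company">
]>
<LETTER>&letterhead;: Dear customer</LETTER>"""

root = ET.fromstring(letter)
print(root.text)

# Without the declaration, the same reference is an error.
try:
    ET.fromstring("<LETTER>&letterhead;</LETTER>")
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False
print(undeclared_ok)
```

Changing the address then means editing one declaration rather than every letter.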
Entity References
Well, we have pretty much let the cat out of the bag already. We have shown several examples of entity references above. In short, an entity reference is the key that unlocks an entity which has been declared in an entity declaration. Entity references follow the simple syntax of:
&ENTITY_NAME;
such as:
&letterhead;
WARNINGS:
As you might expect, parameter entity references work much like general entity references. In this case, we use a "%" instead of an "&":
%PARAMETER_ENTITY_NAME;
Now, you have already seen that entity references can take the place of regular character data, and you have seen how useful that is. Before we leave the subject, I would only mention that you could also use entity references within tag attributes. For example, consider the following:
<INVOICE CLIENT = "&IBM;" PRODUCT = "&PRODUCT_ID_8762;" QUANTITY = "5">
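The following Python sketch, again using xml.etree.ElementTree, shows entity references being expanded inside attribute values. The replacement texts declared for IBM and PRODUCT_ID_8762 are hypothetical, since the guide does not show those declarations, and the INVOICE element is written as an empty tag to keep the document complete.

```python
import xml.etree.ElementTree as ET

# The entity values below are assumptions made up for this example.
invoice = """<?xml version="1.0"?>
<!DOCTYPE INVOICE [
  <!ENTITY IBM "International Business Machines">
  <!ENTITY PRODUCT_ID_8762 "Hypothetical Widget">
]>
<INVOICE CLIENT = "&IBM;" PRODUCT = "&PRODUCT_ID_8762;" QUANTITY = "5"/>"""

# By the time the application sees the attributes, the references
# have already been replaced with the declared text.
attributes = ET.fromstring(invoice).attrib
for key in ("CLIENT", "PRODUCT", "QUANTITY"):
    print(key, "=", attributes[key])
```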
WARNINGS:
By Selena Sol at eXtropia