170 likes | 279 Views
XML DTD. XML Validation. XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid" XML. Well Formed XML Documents. XML documents must have a root element XML elements must have a closing tag XML tags are case sensitive XML elements must be properly nested
E N D
XML Validation • XML with correct syntax is "Well Formed" XML. • XML validated against a DTD is "Valid" XML.
Well Formed XML Documents • XML documents must have a root element • XML elements must have a closing tag • XML tags are case sensitive • XML elements must be properly nested • XML attribute values must be quoted
Valid XML Documents • A "Valid" XML document is a "Well Formed" XML document, which also conforms to the rules of a Document Type Definition (DTD).
DTD • The purpose of a DTD (Document Type Definition) is to define the legal building blocks of an XML document. • A DTD defines the document structure with a list of legal elements and attributes.
Use a DTD • With a DTD, each of your XML files can carry a description of its own format. • With a DTD, independent groups of people can agree to use a standard DTD for interchanging data. • Your application can use a standard DTD to verify that the data you receive from the outside world is valid. • You can also use a DTD to verify your own data.
Internal DTD Declaration • If the DTD is declared inside the XML file, it should be wrapped in a DOCTYPE definition with the following syntax: • <!DOCTYPE root-element [element-declarations]>
code <note> <to>Raj</to> <from>IME</from> <heading>Invitation</heading> <body>Welcome to IME!</body> </note>
Corresponding DTD • <?xml version="1.0"?><!DOCTYPE note [<!ELEMENT note (to,from,heading,body)><!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)><!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)>]><note> <to>Raj</to> <from>IME</from> <heading>Invitation</heading> <body>Welcome to IME!</body> </note>
interpretation • !DOCTYPE note defines that the root element of this document is note • !ELEMENT note defines that the note element contains four elements: "to,from,heading,body" • !ELEMENT to defines the to element to be of type "#PCDATA" • !ELEMENT from defines the from element to be of type "#PCDATA" • !ELEMENT heading defines the heading element to be of type "#PCDATA" • !ELEMENT body defines the body element to be of type "#PCDATA"
PCDATA - Parsed Character Data • XML parsers normally parse all the text in an XML document. • When an XML element is parsed, the text between the XML tags is also parsed: • <message>This text is also parsed</message>
The parser does this because XML elements can contain other elements, as in this example, where the <name> element contains two other elements (first and last): • <name><first>Bill</first><last>Gates</last></name>and the parser will break it up into sub-elements like this: • <name> <first>Bill</first> <last>Gates</last></name>
CDATA - (Unparsed) Character Data • The term CDATA is used about text data that should not be parsed by the XML parser. • Characters like "<" and "&" are illegal in XML elements. • "<" will generate an error because the parser interprets it as the start of a new element. • "&" will generate an error because the parser interprets it as the start of an character entity.
cdata • Everything inside a CDATA section is ignored by the parser. • A CDATA section starts with "<![CDATA[" and ends with "]]>":
External DTD Declaration • If the DTD is declared in an external file, it should be wrapped in a DOCTYPE definition with the following syntax: • <!DOCTYPE root-element SYSTEM "filename">
code • <?xml version="1.0"?><!DOCTYPE note SYSTEM "note.dtd"> <note> <to>Raj</to> <from>IME</from> <heading>Invitation</heading> <body>Welcome to IME!</body> </note>
And this is the file "note.dtd" which contains the DTD: • <!ELEMENT note (to,from,heading,body)><!ELEMENT to (#PCDATA)><!ELEMENT from (#PCDATA)><!ELEMENT heading (#PCDATA)><!ELEMENT body (#PCDATA)>