Network Based Application Development. Tony Kombol XML Lecture. XML Overview. XML Basics. Benefits Easy to format Describes any type of data Machine-readable Human-readable Data can be distributed Defines the meaning of data. XML. eXtensible Markup Language
Network Based Application Development Tony Kombol XML Lecture http://courses.coreservlets.com
XML Overview
XML Basics • Benefits • Easy to format • Describes any type of data • Machine-readable • Human-readable • Data can be distributed • Defines the meaning of data
XML • eXtensible Markup Language • Not actually a markup language • Specification for making markup languages • XML documents have two fundamental characteristics • Must be “well-formed” • May be associated with a DTD or XML schema
XML • Well-formed • Must comply with XML syntax rules • For example: • All attributes must have values • Every start tag must have a matching end tag • Has exactly one root element • DTD – Document Type Definition • Dictates what elements and attributes are permitted • HTML Example <img src="eiffel.jpg" alt="Eiffel Tower"> • <img> element (tag) • src and alt: attributes
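The well-formedness rules above can be checked by handing text to any XML parser. A minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative and not part of the lecture:

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# One sample per rule on the slide, plus a well-formed control case.
samples = {
    "ok": '<img src="eiffel.jpg" alt="Eiffel Tower"/>',
    "attribute without a value": "<option selected/>",
    "missing end tag": "<p>text",
    "two root elements": "<a/><b/>",
}
results = {label: is_well_formed(xml) for label, xml in samples.items()}
```

Only the first sample should be accepted. Note that this checks well-formedness only; the standard-library parser does not validate against a DTD.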
XML • Why is XML so popular? • It’s flexible • It enables you to create your own documents with elements (tags) that you define • HTML example: <p>This is a paragraph of text</p> • What does this tell you? • What does it tell a program?
XML • What does the browser see? <p>blah, blah, blah</p>
XML • XML example: <pets> <pet> <type>Dog</type> <name>Rover</name> <age>12</age> <owner>Dave</owner> </pet> </pets> HTML Comparison: <html> <body> <p> <img…> <a href=…> <script…> <form…> </body> </html>
XML • XML tags provide meaning or context • Programs can interpret these meanings • XML is ideal for sharing data between programs • One user can encode data using XML and share it with others • A different user can interpret that data
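Because the tags carry meaning, a receiving program can pick the data apart by name. A short sketch using the pets document from the earlier slide, assuming Python's standard-library `xml.etree.ElementTree`; the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

PETS_XML = """<pets>
  <pet>
    <type>Dog</type>
    <name>Rover</name>
    <age>12</age>
    <owner>Dave</owner>
  </pet>
</pets>"""

root = ET.fromstring(PETS_XML)

# Turn each <pet> into a dict keyed by the child tag names.
pets = [{child.tag: child.text for child in pet} for pet in root.findall("pet")]
```

Each entry in `pets` maps a tag name such as `name` or `owner` to its text, which is the kind of interpretation a browser cannot make from a bare `<p>` element.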
XML • XML documents may have three parts • Prolog • Body • Epilog
XML • Prolog • Comments & processing instructions • Version information • Reference to a specific XML DTD or schema • Body • One or more elements • Exactly one root containing 0 or more elements • Defined by the DTD or schema • Forms a hierarchical tree structure • One top-level element • All others below it in the hierarchy • Epilog • Comments & processing instructions
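The three parts can be seen by listing the top-level nodes of a parsed document. A small sketch, assuming Python's standard-library `xml.dom.minidom`; the sample document is made up for illustration:

```python
from xml.dom import minidom, Node

DOC = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<!-- prolog comment -->"
    "<root><child/></root>"
    "<!-- epilog comment -->"
)
doc = minidom.parseString(DOC)

# Top-level nodes in document order: prolog comment, root element, epilog comment.
kinds = [node.nodeType for node in doc.childNodes]
root_name = doc.documentElement.tagName
```

The XML declaration itself is not exposed as a node; the comment before the root belongs to the prolog and the comment after it to the epilog.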
XML • Why is XML important? • Forms the basis for web services • Used by businesses to exchange information • How is XML different from HTML? • HTML is fairly forgiving of errors • <p>This is a paragraph (with no closing </p>) will probably work OK • HTML can mix upper and lower case in tags • <TabLe>…</table> is ok • Attribute values don’t have to be enclosed in quotes • <font color = #FF0000> is ok
XML • Bottom line: • Poorly-written HTML documents • Usually no big deal • Usually kind of work (at least close enough) • XML is not that forgiving • You have to follow the rules • What are the rules to remember?
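To see how unforgiving XML is, the HTML habits that browsers tolerate can be fed to an XML parser. A sketch assuming Python's standard-library `xml.etree.ElementTree`; the fragments are adapted from the HTML examples and the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

html_habits = [
    "<p>This is a paragraph",          # no end tag
    "<TabLe><tr/></table>",            # start and end tag differ in case
    "<font color=#FF0000>red</font>",  # attribute value not quoted
]

errors = []
for fragment in html_habits:
    try:
        ET.fromstring(fragment)
    except ET.ParseError as exc:
        # Each fragment is rejected outright instead of being repaired.
        errors.append(str(exc))
```

All three fragments should raise `ParseError`, so `errors` ends up with one message per fragment.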
XML Basics • Well-formed or Valid? • Well-formed XML documents conform to basic XML syntax • Has exactly one root • Elements are properly nested (each element completely contains its child elements) • Valid XML documents are well-formed and also conform to definitions in a: • DTD - or - • Schema
XML Basics • DTD – Document Type Definition • Defines the elements (tags) and attributes (properties) allowed in an XML document • Ubiquitous, but “old school” • Original way to define valid tags, elements and structure • Inherited from SGML • XSD – XML Schema Definition • More powerful than DTDs • “Replaces” DTD • Uses XML Schema • Based on XML format
XML Basics • DTDs can be referenced: • Externally (i.e., in a separate file) via DOCTYPE • Internally within the XML document • Multiple DTDs can be referenced • Blended XML document • Relies on several sets of valid elements and attributes
XML Basics • Schema • Similar to a DTD • Written in XML format • Richer set of tools for creating elements and attributes • Easier to specify data types • XSD
XML Basics • Using XML allows developers to: • Define allowed data explicitly • Make unique, well-defined data structure • Pass data from one application to another • Makes it easier for: • Web applications to communicate and work together • Applications to communicate data • Example …
XML Basics <clients> <client> <name>Oliver Wendell Douglas</name> <phone>510-555-1212</phone> </client> <client> <name>Fred Ziffle</name> <phone>510-555-3456</phone> </client> </clients>
XML Basics • What does this tell an application reading it? • Has some “clients” • Refers to a “client” • 1st item is a “name” • 2nd item is a “phone” • Without a DTD the receiving application doesn’t “know” anything about name or phone • Are they required? • Are they allowed? • Where should they be placed? • Is there a required order? • Do they have attributes? • If so, are they required? • Can the element have content? • If so, what type of content? <clients> <client> <name>Emerson Cod</name> <phone>510-555-1212</phone> </client> </clients>
XML Basics • With a DTD or a schema, additional information or rules can be conveyed • E.g.: • A client may have only one name • A name must consist of alphabetic characters only • Parties sharing data must agree on • Meaning of each element • Action to be taken on each element
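Without a DTD or schema, rules like these have to be coded by hand in every receiving application. A rough sketch of the two example rules, assuming Python's standard-library `xml.etree.ElementTree`; the function `check_client` is hypothetical, and allowing spaces inside a name is an assumption made here so that names like "Fred Ziffle" pass:

```python
import xml.etree.ElementTree as ET

def check_client(client):
    """Return a list of rule violations for one <client> element."""
    problems = []
    names = client.findall("name")
    # Rule 1: a client may have only one name.
    if len(names) != 1:
        problems.append("expected exactly one <name>, found %d" % len(names))
    # Rule 2: a name must be alphabetic (spaces allowed by assumption).
    for name in names:
        text = (name.text or "").replace(" ", "")
        if not text.isalpha():
            problems.append("name %r is not alphabetic" % name.text)
    return problems

good = ET.fromstring(
    "<client><name>Fred Ziffle</name><phone>510-555-3456</phone></client>"
)
bad = ET.fromstring("<client><name>R2D2</name><name>Fred</name></client>")
good_problems = check_client(good)
bad_problems = check_client(bad)
```

A shared DTD or schema moves these checks out of application code and into a document both parties can agree on.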
XML Basics • XML vocabularies help accomplish this purpose • Major players in an industry develop a vocabulary • Anyone wishing to use that player’s data must conform to that vocabulary • Basically like a protocol • A set of rules commonly agreed upon
XML Basics • Everyone who has adopted a DTD or schema or vocabulary • Knows what every other party using that DTD/schema/vocabulary means • Can program their applications to • Share data • Share processing • Communicate
XML Basics • XML documents may contain all parts in one file • May also be composed of separate sections in files distributed across the Internet • Tags here • Data there
XML Basics • XML document parts are entities • Entities • Have names • Contain • Data or references to data • References are in a URL format
XML Basics • XML vocabulary • Definitions of elements and attributes contained within a DTD or schema • Think of a vocabulary as a mini-dictionary • Defines only those words and phrases that are permitted in a certain situation
XML Basics • Names used for XML structures must follow rules: • 1st character must be a letter, underscore, or colon • Cannot be a digit • Colons should be used only for namespace prefixes • 2nd and subsequent characters can be letters, digits, hyphens, periods, underscores, or colons (no spaces) • Names beginning with “xml” in any combination of cases (xML, XMl, xml, etc.) are reserved
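The naming rules can be approximated with a regular expression. This is a simplified, ASCII-only sketch: the real XML 1.0 rule also admits many non-ASCII letters, and it limits later characters to letters, digits, hyphens, periods, underscores, and colons. The function name `is_usable_name` is illustrative:

```python
import re

# ASCII-only approximation of an XML name: a letter, underscore, or colon,
# followed by letters, digits, underscores, colons, periods, or hyphens.
NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")

def is_usable_name(name):
    """True if name fits the simplified pattern and avoids the reserved 'xml' prefix."""
    return bool(NAME_RE.match(name)) and not name.lower().startswith("xml")

checks = {
    name: is_usable_name(name)
    for name in ["client", "_id", "street-name", "2fast", "first name", "XmlData"]
}
```

The first three names pass; `2fast` starts with a digit, `first name` contains a space, and `XmlData` uses the reserved prefix.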
XML Document Prolog • Well-formed documents must have a body • They may have • A prolog at the beginning • An epilog at the end • Prolog can include an XML declaration specifying • Version of XML used • Encoding & stand-alone attributes
XML Document Prolog • Example: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> • Note: no space is allowed between <? and xml • Version enables applications reading this XML document to be able to know • Rules to use for each element • How to decide if the document is “well-formed” or not
XML Document Prolog • Encoding names the character encoding used to store the document (e.g., UTF-8) • Standalone tells the processor whether the document depends on declarations in an external DTD (standalone="yes" means it does not) • If present, the XML declaration must be the very first thing in the document
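A parser exposes the values given in the XML declaration. A brief sketch, assuming Python's standard-library `xml.dom.minidom`, which records them on the document object; the second half shows that a space between `<?` and `xml` is rejected:

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

DOC = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><WService/>'
doc = minidom.parseString(DOC)

# Values taken from the declaration at the top of the document.
declaration = (doc.version, doc.encoding, doc.standalone)

# A declaration written as "<? xml" is not well-formed.
try:
    minidom.parseString(b'<? xml version="1.0"?><WService/>')
    space_accepted = True
except ExpatError:
    space_accepted = False
```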
XML Document Prolog • You can write your own DTD - or - • Use one that someone else has written and published • By referencing a DTD • XML processors can check if your document is • Well-formed • Valid • Sharing documents works if everyone is using the same DTD
XML Document Prolog • External DTD <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE WService SYSTEM "http://www.servata.com/DTD02"> • “WService” is the name of the root element • “SYSTEM” means a URL is being used to reference the DTD
XML Document Prolog • Internal DTD <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <!DOCTYPE WService [ <!ENTITY WSNM "Web Service Name"> ]>
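An entity declared in an internal DTD can be referenced in the body as `&WSNM;`, and the parser substitutes its value. A minimal sketch, assuming Python's standard-library `xml.etree.ElementTree`; the `<title>` child element is added here purely for illustration:

```python
import xml.etree.ElementTree as ET

DOC = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE WService [
  <!ENTITY WSNM "Web Service Name">
]>
<WService><title>&WSNM;</title></WService>"""

root = ET.fromstring(DOC)

# The parser replaces the entity reference with its declared text.
title = root.findtext("title")
```

Because parsers expand entities automatically, documents from untrusted sources are usually parsed with entity expansion restricted.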
XML Document Prolog • XML documents declare the elements and attributes that will be used • XML elements that contain content must be written with both a starting and ending tag <myElement>content goes here</myElement> • XML elements without content may be terminated with a slash: <myElement/>
XML Document Prolog • Elements are organized hierarchically • Root element is at the “top” • Root element can occur only once • Child elements may occur many times • If the DTD allows • Elements may contain • Attributes • Always written into the starting tag as name-value pairs • Other elements • Other XML structures
XML Document Prolog <city> <street name="Tryon Street"> <address>101</address> <address>102</address> </street> <street name="Trade Street"> <address>201</address> <address>202</address> </street> </city>
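The hierarchy in this document maps naturally onto nested data structures. A sketch that walks it, assuming Python's standard-library `xml.etree.ElementTree`; the variable names are illustrative:

```python
import xml.etree.ElementTree as ET

CITY_XML = """<city>
  <street name="Tryon Street">
    <address>101</address>
    <address>102</address>
  </street>
  <street name="Trade Street">
    <address>201</address>
    <address>202</address>
  </street>
</city>"""

city = ET.fromstring(CITY_XML)

# Root -> repeated <street> children -> repeated <address> grandchildren.
addresses_by_street = {
    street.get("name"): [address.text for address in street.findall("address")]
    for street in city.findall("street")
}
```

The `name` attribute is read from the starting tag with `get`, while the repeated `address` children are collected with `findall`.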
XML Document Prolog • Both elements and attributes may hold data • Attributes may occur only once in an element • In any order if multiple attributes are present • Should you use an element or an attribute? • No rules that require one over the other • Use which way fits the context
XML Document Prolog Which is better? <city> <street> <name>Tryon Street</name> <address>201</address> <address>202</address> </street> </city> - or - <city> <street name="Trade Street"> <address>201</address> <address>202</address> </street> </city> Assumptions: • only one street name allowed for a street • multiple addresses per street are allowed Remember in the “real world” you will only choose one consistent way!
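One practical difference between the two forms shows up in the reading code and in what the parser enforces by itself. A sketch assuming Python's standard-library `xml.etree.ElementTree`; the shortened documents are illustrative:

```python
import xml.etree.ElementTree as ET

as_element = ET.fromstring(
    "<street><name>Tryon Street</name><address>201</address></street>"
)
as_attribute = ET.fromstring(
    '<street name="Trade Street"><address>201</address></street>'
)

name_from_element = as_element.findtext("name")  # child element lookup
name_from_attribute = as_attribute.get("name")   # attribute lookup

# An attribute may appear only once, so a second name is a syntax error.
try:
    ET.fromstring('<street name="A" name="B"/>')
    duplicate_allowed = True
except ET.ParseError:
    duplicate_allowed = False
```

With the attribute form, the "only one street name" assumption is enforced by well-formedness alone; with the element form, a DTD or schema is needed to forbid a second `<name>`.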
Writing Document Type Definitions • A DTD defines • Elements • “Things” the document works with • Attributes • Properties associated with an element • Entities • Notations
Writing Document Type Definitions • Content is the data in an element • May exist between the starting and ending tags • May exist in child elements • Each element has a specific syntax
Writing Document Type Definitions • Content may be in one of 4 “models” • EMPTY • Elements that have no content like <br/> • ANY • No restrictions on content • MIXED • May contain child elements, text data, attributes • CHILDREN • May have child elements and attributes but no text data
Writing Document Type Definitions • Element syntax • General form: <!ELEMENT name model> • Examples • <!ELEMENT myelement1 ANY> • <!ELEMENT myelement2 (child01, child02)> (children in the order shown) • <!ELEMENT myelement3 (child01 | child02)> (one OR the other)
Examples • Multiple Children • a+ - One or more occurrences of a: <!ELEMENT BOOK (CHAPTER)+> • a* - Zero or more occurrences of a: <!ELEMENT List (Object)*> • a? - Zero or one occurrence of a: <!ELEMENT Table (plate)?> • a, b - a followed by b: <!ELEMENT SUM (val1, val2)> • a | b - a or b but not both: <!ELEMENT POINT (XYCOORDINATES | POLAR)> • (expression) - expression treated as a unit: <!ELEMENT CHAPTER (INTRODUCTION, (P | QUOTE | NOTE)*, DIV*)>
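These occurrence operators behave much like regular-expression operators applied to the sequence of child element names. The sketch below is an analogy, not DTD validation (Python's standard library does not validate against DTDs): it hand-translates the CHAPTER model into a regex. The function name `matches_chapter_model` is hypothetical:

```python
import re
import xml.etree.ElementTree as ET

# Regex counterpart of (INTRODUCTION, (P | QUOTE | NOTE)*, DIV*), applied to
# the child tag names written out with a trailing space after each name.
CHAPTER_MODEL = re.compile(r"INTRODUCTION (?:(?:P|QUOTE|NOTE) )*(?:DIV )*\Z")

def matches_chapter_model(chapter):
    """Check a CHAPTER element's direct children against the model."""
    child_tags = "".join(child.tag + " " for child in chapter)
    return bool(CHAPTER_MODEL.match(child_tags))

ok = ET.fromstring("<CHAPTER><INTRODUCTION/><P/><NOTE/><DIV/></CHAPTER>")
wrong_order = ET.fromstring("<CHAPTER><P/><INTRODUCTION/></CHAPTER>")
div_too_early = ET.fromstring("<CHAPTER><INTRODUCTION/><DIV/><P/></CHAPTER>")

outcomes = [matches_chapter_model(c) for c in (ok, wrong_order, div_too_early)]
```

Only the first document satisfies the model; the other two break the required order.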
Writing Document Type Definitions • Attributes <!ATTLIST myelement myatt1 ID #REQUIRED myatt2 CDATA #IMPLIED ... > • ID: unique name • #REQUIRED: must appear • CDATA: character data • #IMPLIED: may appear
Example • DTD – an element with attributes • <!ELEMENT Rectangle EMPTY> <!ATTLIST Rectangle length CDATA "0px" width CDATA "0px"> • Use • <Rectangle width="80px" length="40px"/>
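The quoted values in the ATTLIST are defaults: if the document omits an attribute, the parser supplies the declared value. A sketch assuming Python's standard-library `xml.parsers.expat`, which reports DTD-supplied defaults unless told otherwise; the DTD is placed in an internal subset so the example is self-contained:

```python
import xml.parsers.expat

DOC = """<!DOCTYPE Rectangle [
  <!ELEMENT Rectangle EMPTY>
  <!ATTLIST Rectangle
      length CDATA "0px"
      width  CDATA "0px">
]>
<Rectangle width="80px"/>"""

seen = {}

def on_start(name, attrs):
    # attrs holds the attributes written in the tag plus any DTD defaults.
    seen[name] = dict(attrs)

parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = on_start
parser.Parse(DOC, True)

rectangle = seen["Rectangle"]
```

Here `width` comes from the document and `length` from the DTD default, so the application sees both attributes even though only one was written.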
Extra References • Examples: • http://www.xmlfiles.com/dtd/dtd_examples.asp
XML Schemas
XML Schema • In databases, a schema defines • Structure • Constraints • Usually reflects business rules • Customer SSN must be part of account information • Therefore, cannot add customer record without an SSN
XML Schema • XML schemas function similarly to DB schemas • Like a DTD, schemas define items • Allows more control over elements and attributes • Frequency of occurrence • Order of appearance • Allowable data types • Custom data types
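The kinds of constraints a schema expresses can be illustrated with a small hand-written rule table. This is a stand-in for the idea, not XSD validation (Python's standard library does not include an XSD validator). The `<customer>` element, its children, the SSN pattern, and the `violations` function are all hypothetical, and order of appearance is not checked:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules: child tag -> (min occurrences, max occurrences, value pattern).
CUSTOMER_RULES = {
    "name": (1, 1, r".+"),
    "ssn": (1, 1, r"\d{3}-\d{2}-\d{4}"),
    "phone": (0, 3, r"[\d-]+"),
}

def violations(customer):
    """List the ways a <customer> element breaks CUSTOMER_RULES."""
    found = []
    for tag, (low, high, pattern) in CUSTOMER_RULES.items():
        values = [child.text or "" for child in customer.findall(tag)]
        # Frequency of occurrence.
        if not low <= len(values) <= high:
            found.append(
                "%s: %d occurrence(s), expected %d to %d" % (tag, len(values), low, high)
            )
        # Allowable data type, approximated by a pattern.
        found.extend(
            "%s: bad value %r" % (tag, value)
            for value in values
            if not re.fullmatch(pattern, value)
        )
    return found

complete = ET.fromstring(
    "<customer><name>Emerson Cod</name><ssn>123-45-6789</ssn></customer>"
)
missing_ssn = ET.fromstring("<customer><name>Emerson Cod</name></customer>")
complete_issues = violations(complete)
missing_issues = violations(missing_ssn)
```

The second record is rejected for lacking an SSN, mirroring the database rule that a customer record cannot be added without one.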