An XML Introduction Next Generation Web Data Ian GRAHAM Centre for Academic Technology Tel: 978-4548 Email: <ian.graham@utoronto.ca> Talk: http://www.utoronto.ca/ian/talks/
Overview • An XML example • -- so what’s so special about XML? • The birth of the Web -- HTML • HTML is not enough -- why? • XML for universal data • Common uses and applications
XML Example: test.xml <?xml version="1.0" encoding="iso-8859-1" ?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head> <body> <div class="myDiv"> <h1> Heading of Page </h1> <p> here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p> <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p> <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p> </div> </body></html>
It Looks Like HTML …. • Sort of …. • Tags look just like HTML tags (although XML lets you ‘create’ your own) • The “red bits” are special XML stuff (we will discuss them later) • It’s got .xml at the end
The Birth of the Web • The HyperText Markup Language • A simple language for distributing text • All that other stuff • URLs, HTTP, CGI ...
HTML Evolution • Started with very few tags … • Language evolved, as more tags were added • forms • tables • fonts • frames
HTML Problems • Desire for personalized tags • Want to put data into HTML form • mathematics, database entries, literary text, poems, purchase orders …. • HTML just isn’t designed for that!
HTML Problems (2) • Software processing • Server management of data • But -- HTML is so ill-formed, this is hard!
Idea: Back to the Basics • HTML was defined using SGML • Standard Generalized Markup Language • A meta-language for defining languages. • Complex, sophisticated, powerful • Idea: Use SGML
Languages based on SGML • HTML • TEI • DocBook • . . .
Problems with SGML • Too complicated a language • Rules are too strict • Can’t distribute ‘loosely’ formatted text (like HTML) • Not good in a distributed environment • Can’t mix different data together • Can’t add arbitrary tags
Idea (2): “Webified” SGML • New eXtensible Markup Language: XML • Can use XML to define new languages • Distributes easily on the Web • Can mix different types of data together • Can easily add new tags, and tell a browser what to do with them
Basic XML Rules • Tags like in HTML, but ... • Technical details • Tag names are case-sensitive • Always need end tags • Special empty-element tags • Always quote attribute values
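The rules above can be checked mechanically, since an XML parser rejects any document that breaks them. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative assumptions, not part of the talk.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: one that follows the rules, three that each break one.
samples = {
    "well-formed": '<p class="note">Hello <br /> world</p>',
    "case mismatch": "<P>Hello</p>",              # tag names are case-sensitive
    "missing end tag": "<p>Hello <br> world</p>",  # <br> is never closed
    "unquoted attribute": "<p class=note>Hello</p>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'accepted' if ok else 'rejected'}")
```

Only the first sample should be accepted. A typical HTML browser would display all four without complaint, which is the looseness the XML rules remove.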
Like this example ….. <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head><body> <div class="myDiv"> <h1> Heading of Page </h1> ….. <p>And here is another paragraph, this one containing an <img src="image.gif" alt="waste of time" /> inline image, and a <br /> line break. </p> </div> </body></html>
XML Things • <?xml version="1.0" encoding="iso-8859-1" ?> • Says that this is an XML document • <html xmlns="http://www.w3.org/TR/xhtml1"> • Says that the meaning of the tags inside (and including) the html “element” is defined here.
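To see what the xmlns attribute does in practice, here is a small sketch using Python's standard xml.etree.ElementTree module. The shortened document and the prefix "x" are assumptions made for illustration. The parser attaches the namespace URL to every element name, so an html tag from this language cannot be confused with an html tag from some other one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/TR/xhtml1"  # namespace URL used on the slide

# Shortened, hypothetical version of the example document.
doc = '''<html xmlns="http://www.w3.org/TR/xhtml1">
  <head><title>Title</title></head>
  <body><p>Some text</p></body>
</html>'''

root = ET.fromstring(doc)

# ElementTree reports each tag as "{namespace}localname".
tags = [el.tag for el in root.iter()]
print(root.tag)

# Lookups must name the namespace too; "x" is an arbitrary local prefix.
paragraph = root.find("x:body/x:p", {"x": XHTML_NS})
print(paragraph.text)
```

The namespace declared on the root element is inherited by every element inside it, which is why head, title, body and p all carry the same URL.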
Evolution of XML • Many XML languages, optimised for different roles • MathML -- for mathematics • SMIL -- for synchronised multimedia • RDF -- for describing “things” • XUL -- for describing the Navigator 5 user interface
MathML • Designed to express semantics of maths • Also can express layout • Cut & paste into Maple, Mathematica • Example: x^2 + 4x + 4 = 0 <mrow> <mrow> <msup> <mi>x</mi> <mn>2</mn> </msup> <mo>+</mo> <mrow> <mn>4</mn> <mo>&InvisibleTimes;</mo> <mi>x</mi> </mrow> <mo>+</mo> <mn>4</mn> </mrow> <mo>=</mo> <mn>0</mn> </mrow>
SMIL • Synchronised Multimedia Integration Language • Integration of multimedia with text, audio, video • Support in RealPlayer G2
SMIL Example <smil> <head> <meta name="title" content="Online Teaching Services promo" /> <meta name="author" content="Jay Moonah, CAT" /> <layout type="text/smil-basic-layout"> <root-layout width="280" height="316" background-color="white"/> <region id="AnimChannel1" title="AnimChannel1" left="0" top="0" height="265" width="280" fit="hidden"/> </layout> </head> <body> <par title="Online Teaching Services promo" author="Jay Moonah, CAT" > <audio src="final.rm" id="Soundtrack" title="Soundtrack"/> <animation src="otscompfin.swf" id="Animation" region="AnimChannel1" title="Animation" fill="freeze"/> <text src="cc.rt" id="caption" region="cc" title="cc" fill="freeze"/> </par> </body></smil>
XHTML: NextGen HTML <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of text XHTML Document </title> </head> <body> <div class="myDiv"> <h1> Heading of Page </h1> <p> here is a paragraph of text. I will include inside this paragraph a bunch of wonky text so that it looks fancy. </p> <p>Here is another paragraph with <em>inline emphasized</em> text, and <b> absolutely no</b> sense of humor. </p> <p>And another paragraph, this one with an <img src="image.gif" alt="waste of time" /> image, and a <br /> line break. </p> </div> </body></html>
XHTML • Just like HTML, but based on XML rules • Will support integration of different data into a single document
XHTML and other Data <?xml version="1.0" encoding="iso-8859-1"?> <html xmlns="http://www.w3.org/TR/xhtml1" > <head> <title> Title of XHTML Document </title> </head><body> <div class="myDiv"> <h1> Heading of Page </h1> <mathml xmlns="http://www.w3.org/TR/mathml"> … MathML markup … </mathml> <p> more html stuff goes here </p> <smil xmlns="http://www.w3.org/TR/smil1"> … SMIL markup … </smil> </div> </body></html>
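One way software could take such a mixed document apart is to group its elements by namespace and hand each group to the right handler. The sketch below assumes a shortened version of the slide's document, with placeholder children (mi, par) standing in for the elided MathML and SMIL markup; split_tag is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Shortened, hypothetical document: <mi> and <par /> are placeholders.
doc = '''<html xmlns="http://www.w3.org/TR/xhtml1">
  <body>
    <h1>Heading of Page</h1>
    <mathml xmlns="http://www.w3.org/TR/mathml"><mi>x</mi></mathml>
    <p>more html stuff goes here</p>
    <smil xmlns="http://www.w3.org/TR/smil1"><par /></smil>
  </body>
</html>'''


def split_tag(tag: str) -> tuple[str, str]:
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


# Collect local element names under the namespace they belong to.
by_namespace = defaultdict(list)
for element in ET.fromstring(doc).iter():
    namespace, local = split_tag(element.tag)
    by_namespace[namespace].append(local)

for namespace, names in by_namespace.items():
    print(namespace, names)
```

Each xmlns attribute applies to the element that carries it and to everything nested inside, so the placeholder children fall into the MathML and SMIL groups without declaring anything themselves.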
Displaying XML • More complicated than HTML • XML represents data only, not how it looks • Need extra instructions (a “style sheet” document) to define how things should look
What Do Browsers Do Now? • Netscape 5 -- ignores the tags ... or so it seems ... • Internet Explorer 5 -- shows a tree of elements • Navigator 4, Internet Explorer 4 • Uggh…… (can’t handle it)
Other Use: Data Abstraction • XML as a universal format for data interchange • Machines exchange data as XML-format messages • Eliminates proprietary data formats • Lots of XML processing software available
XML Messaging [Diagram: a Factory places orders with several Suppliers, and each Supplier sends back a response]
XML Messaging [Diagram: a Database and several Other DBs request and send data to one another]
Example Message <partorders xmlns="http://myco.org/Spec/partorders.desc"> <order ref="x23-2112-2342" date="25aug1999-12:34:23h"> <desc> Gold sprockel grommets, with matching hamster</desc> <part number="23-23221-a12" /> <quantity units="gross"> 12 </quantity> <delivery-date date="27aug1999-12:00h" /> </order> <order ref="x23-2112-2342" date="25aug1999-12:34:23h"> …. Order something else ….. </order> </partorders>
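A receiving machine might read such a message roughly as follows. This is a sketch only. It copies the first order from the slide, written with plain quotes and an empty-element delivery-date tag so that it parses; the function name read_orders and the dictionary keys are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# "po" is an arbitrary local prefix for the namespace used on the slide.
NS = {"po": "http://myco.org/Spec/partorders.desc"}

message = '''<partorders xmlns="http://myco.org/Spec/partorders.desc">
  <order ref="x23-2112-2342" date="25aug1999-12:34:23h">
    <desc> Gold sprockel grommets, with matching hamster</desc>
    <part number="23-23221-a12" />
    <quantity units="gross"> 12 </quantity>
    <delivery-date date="27aug1999-12:00h" />
  </order>
</partorders>'''


def read_orders(text: str) -> list[dict]:
    """Return one dict per <order> element in a partorders message."""
    orders = []
    for order in ET.fromstring(text).findall("po:order", NS):
        quantity = order.find("po:quantity", NS)
        orders.append({
            "ref": order.get("ref"),
            "part": order.find("po:part", NS).get("number"),
            "quantity": int(quantity.text),  # int() tolerates the padding spaces
            "units": quantity.get("units"),
            "deliver_by": order.find("po:delivery-date", NS).get("date"),
        })
    return orders


orders = read_orders(message)
print(orders)
```

Because the structure is explicit in the markup, the receiver needs no knowledge of the sender's database or internal file format, only of the agreed element names.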
The XML Family Tree [Diagram: SGML at the root; HTML, TEI, . . . and XML are based on SGML; SMIL, SpeechML, XUL, XHTML, MathML, RDF, . . . are based on XML]
Other Examples • XUL: XML User Interface Language • How Navigator 5 configures its interface • RDF: Resource Description Framework • For describing things • Used by Netscape Open Catalog project to define Web accessible resources
Summary: XML is • a framework for distributing data on the Web • an integration tool for mixing different types of data • a universal format for exchanging data between machines