3.2 Document Object Model (DOM)


Presentation Transcript


  1. 3.2 Document Object Model (DOM)
     • How can structured documents be accessed uniformly in parsers, browsers, editors, databases, ...?
     • Overview of the W3C DOM Spec
       • Level 1, W3C Rec, Oct. 1998
       • Level 2, W3C Rec, Nov. 2000
       • Level 3 in progress (as 21 modules!); Validation, Core, and Load and Save Recommendations (Spring 2004)
     3.2: Document Object Model

  2. DOM: What is it?
     • An object-based, language-neutral API for XML and HTML documents
     • Allows programs and scripts to build, access, and modify documents
     • Supports the development of querying, filtering, transformation, formatting etc. applications on top of DOM implementations
     • If SAX is read as “Serial Access XML”, DOM could be read as “Directly Obtainable in Memory”

  3. DOM structure model
     • Based on O-O concepts:
       • methods (to access or change an object’s state)
       • interfaces (declaration of a set of methods)
       • objects (encapsulation of data and methods)
     • Roughly similar to the XSLT/XPath data model (to be discussed later): essentially a syntax tree
     • The tree structure is implied by the abstract relationships defined by the API; the data structures of an implementation may differ (but probably seldom do)

  4. DOM structure model

         <invoice form="00" type="estimated">
           <addressdata>
             <name>John Doe</name>
             <address>
               <streetaddress>Pyynpolku 1</streetaddress>
               <postoffice>70460 KUOPIO</postoffice>
             </address>
           </addressdata>
           ...

     [Figure: the corresponding DOM tree; node kinds shown in parentheses]

         Document
           invoice (Element; attributes form="00", type="estimated" held in a NamedNodeMap)
             addressdata (Element)
               name (Element)
                 John Doe (Text)
               address (Element)
                 streetaddress (Element)
                   Pyynpolku 1 (Text)
                 postoffice (Element)
                   70460 KUOPIO (Text)
             ...

  5. Structure of DOM Level 1
     I: DOM Core Interfaces
     • Fundamental interfaces
       • basic interfaces: Document, Element, Attr, Text, ...
     • "Extended" (XML specific) interfaces
       • CDATASection, DocumentType, Notation, Entity, EntityReference, ProcessingInstruction
     II: DOM HTML Interfaces
     • more convenient access to HTML documents
     • (we'll ignore these)

  6. DOM Level 2
     • Level 1: basic representation and manipulation of document structure and content (no access to the contents of a DTD)
     • DOM Level 2 adds
       • support for namespaces
       • accessing elements by ID attribute values
       • optional features (we’ll skip these)
         • interfaces to document views and style sheets
         • an event model (for, say, user actions on elements)
         • methods for traversing the document tree and manipulating regions of a document (e.g., selected by the user of an editor)
     • Load/Save of documents not specified (until Level 3)
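The namespace support added in Level 2 can be sketched in Java roughly as follows. This is only an illustration: the class name `NamespaceLookup`, the namespace URI `urn:example:invoice` and the sample document are assumptions made for the example, not part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceLookup {
    // Hypothetical namespace URI, used only for this illustration.
    static final String INVOICE_NS = "urn:example:invoice";

    // Describes the first element with the given local name in INVOICE_NS.
    static String describeFirst(String xml, String localName) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // JAXP factories are not namespace-aware by default
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList hits = doc.getElementsByTagNameNS(INVOICE_NS, localName);
            if (hits.getLength() == 0) {
                return "no match";
            }
            Element e = (Element) hits.item(0);
            return e.getPrefix() + " " + e.getLocalName() + " " + e.getNamespaceURI();
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<inv:invoice xmlns:inv=\"urn:example:invoice\">"
                + "<inv:name>John Doe</inv:name><name>other</name></inv:invoice>";
        System.out.println(describeFirst(xml, "name"));
    }
}
```

The unprefixed `<name>` element is in no namespace, so only `inv:name` matches the lookup.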

  7. DOM Language Bindings
     • Language-independence:
       • DOM interfaces are defined using the OMG Interface Definition Language (IDL; defined in the CORBA specification)
     • Language bindings (implementations of interfaces) defined in the Recommendation for
       • Java (see the Java API doc) and
       • ECMAScript (standardised JavaScript)

  8. Core Interfaces: Node & its variants
     [Figure: interface hierarchy; interfaces marked (ext) belong to the “Extended interfaces”]

         Node
           Document
           DocumentFragment
           Element
           Attr
           CharacterData
             Comment
             Text
               CDATASection (ext)
           DocumentType (ext)
           Notation (ext)
           Entity (ext)
           EntityReference (ext)
           ProcessingInstruction (ext)

  9. DOM interfaces: Node
     Methods of Node:
     • getNodeType, getNodeName, getNodeValue
     • getOwnerDocument
     • getParentNode
     • hasChildNodes, getChildNodes
     • getFirstChild, getLastChild
     • getPreviousSibling, getNextSibling
     • hasAttributes, getAttributes
     • appendChild(newChild)
     • insertBefore(newChild, refChild)
     • replaceChild(newChild, oldChild)
     • removeChild(oldChild)
     [Figure: the invoice document tree; legend: Document, Element, Text, NamedNodeMap]
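A minimal sketch of how the navigation methods of Node combine into a depth-first walk of the tree. The class name `NodeWalk` and the sample input are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeWalk {
    // Appends one indented line per node, depth first, in document order.
    static void walk(Node node, int depth, StringBuilder out) {
        out.append("  ".repeat(depth)).append(node.getNodeName()).append('\n');
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            walk(child, depth + 1, out);
        }
    }

    // Parses the XML text and returns the indented outline of its DOM tree.
    static String outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            StringBuilder out = new StringBuilder();
            walk(doc, 0, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline("<invoice form=\"00\"><name>John Doe</name></invoice>"));
    }
}
```

Attributes do not show up in the outline: they are not children of their element and are reached through getAttributes() instead.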

  10. Type and Name of a Node
      • node.getNodeType(): short int constants 1, 2, …, 12 for Node.ELEMENT_NODE, Node.ATTRIBUTE_NODE, Node.TEXT_NODE, …
      • node.getNodeName()
        • for an Element = node.getTagName()
        • for an Attr: the name of the attribute
        • for anonymous nodes: "#text", "#document", "#comment" etc.
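The type codes and names above can be inspected with a short Java sketch. The class name `NodeKinds` and the one-line sample document are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeKinds {
    // Formats a node as "<numeric type>:<node name>".
    static String label(Node n) {
        return n.getNodeType() + ":" + n.getNodeName();
    }

    // Labels the document, its root element, the root's attributes and the root's children.
    static String kinds(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            StringBuilder sb = new StringBuilder(label(doc)).append(' ').append(label(root));
            NamedNodeMap attrs = root.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                sb.append(' ').append(label(attrs.item(i)));
            }
            for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
                sb.append(' ').append(label(c));
            }
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints: 9:#document 1:a 2:href 3:#text 8:#comment
        System.out.println(kinds("<a href=\"x\">hi<!--c--></a>"));
    }
}
```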

  11. The Value of a Node
      • node.getNodeValue()
        • content of a text node, value of an attribute, …; null for an Element(!!)
        • (in XSLT/XPath the value of a node is its full textual content)
      • DOM 3 gives access to full textual content with the method node.getTextContent()
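The difference between getNodeValue() and the DOM 3 method getTextContent() can be seen in a small sketch. The class name `NodeValues` is an assumption; the sample markup follows the registration list used later in these slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class NodeValues {
    // Returns "<value of root element>|<value of first text node>|<text content of root>".
    static String compare(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            Node firstText = root.getFirstChild().getFirstChild();
            return root.getNodeValue()          // null for an Element
                    + "|" + firstText.getNodeValue()
                    + "|" + root.getTextContent(); // DOM Level 3
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints: null|Juho|Juho Ahopelto
        System.out.println(compare(
                "<name><given>Juho</given> <family>Ahopelto</family></name>"));
    }
}
```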

  12. Object Creation in DOM
      • Each DOM Node n belongs to a Document: n.getOwnerDocument()
      • Objects implementing interface X are created by factory methods doc.createX(…), where doc is a Document object. E.g.: doc.createElement("A"), doc.createAttribute("href"), doc.createTextNode("Hello!")
      • Loading & saving specified in DOM3 (or via implementation-specific methods, or JAXP)
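A minimal sketch of the factory methods in use, assuming an empty Document obtained through JAXP. The class name `CreateNodes` and the URL `http://example.org/` are illustrative assumptions.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class CreateNodes {
    // Builds <A href="http://example.org/">Hello!</A> and summarises the result.
    static String build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element a = doc.createElement("A");
            Attr href = doc.createAttribute("href");
            href.setValue("http://example.org/");
            a.setAttributeNode(href);
            Text hello = doc.createTextNode("Hello!");
            a.appendChild(hello);   // created nodes are not in the tree until inserted
            doc.appendChild(a);     // A becomes the document element
            boolean sameOwner = hello.getOwnerDocument() == doc;
            return a.getTagName() + " " + a.getAttribute("href")
                    + " " + a.getTextContent() + " " + sameOwner;
        } catch (Exception e) {
            throw new IllegalStateException("could not create a DOM document", e);
        }
    }

    public static void main(String[] args) {
        // prints: A http://example.org/ Hello! true
        System.out.println(build());
    }
}
```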

  13. DOM interfaces: Document
      Methods of Document:
      • getDocumentElement
      • getElementById(IdVal)
      • getElementsByTagName(tagName)
      • createElement(tagName)
      • createTextNode(data)
      [Figure: the invoice document tree; legend: Document, Element, Text, NamedNodeMap, Node]
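The lookup methods of Document might be used as in the sketch below. One assumption needs stating loudly: getElementById only finds attributes that the implementation knows to be of type ID, which normally comes from a DTD. To stay self-contained, the sketch instead marks the attribute with the DOM Level 3 method Element.setIdAttribute. The class name `DocumentLookup` and the sample data are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DocumentLookup {
    // Returns "<root tag> <number of students> <text of the student with the given id>".
    static String lookup(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            NodeList students = doc.getElementsByTagName("student");
            for (int i = 0; i < students.getLength(); i++) {
                // No DTD here, so declare "id" as an ID attribute by hand (DOM Level 3).
                ((Element) students.item(i)).setIdAttribute("id", true);
            }
            Element hit = doc.getElementById(id);
            return root.getTagName() + " " + students.getLength() + " "
                    + (hit == null ? "none" : hit.getTextContent());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<reglist><student id=\"RDK1\">Juho</student>"
                + "<student id=\"RDK2\">Heli</student></reglist>";
        System.out.println(lookup(xml, "RDK2")); // prints: reglist 2 Heli
    }
}
```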

  14. DOM interfaces: Element
      Methods of Element:
      • getTagName()
      • hasAttribute(name)
      • getAttribute(name)
      • setAttribute(attrName, value)
      • removeAttribute(name)
      • getElementsByTagName(name)
      [Figure: the invoice document tree; legend: Document, Element, Text, NamedNodeMap, Node]
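The attribute methods of Element can be tried out with a sketch like the following. The class name `ElementAttributes` and the replacement value "final" are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class ElementAttributes {
    // Reads, changes and removes attributes of the root element, then reports the result.
    static String edit(String xml) {
        try {
            Element invoice = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            String before = invoice.getAttribute("type");
            invoice.setAttribute("type", "final");  // overwrites the existing value
            invoice.removeAttribute("form");
            // getAttribute returns "" (not null) for an attribute that is absent.
            return before + " " + invoice.getAttribute("type") + " "
                    + invoice.hasAttribute("form") + " [" + invoice.getAttribute("form") + "]";
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // prints: estimated final false []
        System.out.println(edit("<invoice form=\"00\" type=\"estimated\"/>"));
    }
}
```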

  15. Text Content Manipulation in DOM
      • for an object c that implements the CharacterData interface (Text, Comments, CDATASections):
        • c.substringData(offset, count)
        • c.appendData(string)
        • c.insertData(offset, string)
        • c.deleteData(offset, count)
        • c.replaceData(offset, count, string)
          ( = c.deleteData(offset, count); c.insertData(offset, string) )
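The CharacterData operations can be followed step by step in a short sketch. The class name `TextEditing` and the sample strings are assumptions for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

class TextEditing {
    // Applies each CharacterData operation once and returns "<substring>|<final data>".
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Text t = doc.createTextNode("Hello, world");
            String sub = t.substringData(0, 5); // "Hello"
            t.appendData("!");                  // "Hello, world!"
            t.insertData(7, "DOM ");            // "Hello, DOM world!"
            t.deleteData(0, 7);                 // "DOM world!"
            t.replaceData(0, 3, "XML");         // "XML world!"
            return sub + "|" + t.getData();
        } catch (Exception e) {
            throw new IllegalStateException("could not create a DOM document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: Hello|XML world!
    }
}
```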

  16. Additional Core Interfaces (1)
      • NodeList for ordered lists of nodes
        • e.g. from Node.getChildNodes() or Element.getElementsByTagName("name")
          • all descendant elements of type "name" in document order ("*" matches any element type)
      • Accessing a specific node, or iterating over all nodes of a NodeList:
        • E.g., to process all children of node:

              for (int i = 0; i < node.getChildNodes().getLength(); i++)
                  process(node.getChildNodes().item(i));
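As a runnable variant of the NodeList iteration, the sketch below lists all elements of a document in document order using the "*" wildcard. The class name `ListElements` is an assumption for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ListElements {
    // Returns the tag names of all elements, in document order, separated by spaces.
    static String allTags(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList all = doc.getElementsByTagName("*"); // "*" matches any element type
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < all.getLength(); i++) {
                if (i > 0) {
                    sb.append(' ');
                }
                sb.append(all.item(i).getNodeName());
            }
            return sb.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(allTags("<a><b><c/></b><d/></a>")); // prints: a b c d
    }
}
```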

  17. Additional Core Interfaces (2)
      • NamedNodeMap for unordered sets of nodes accessed by their name:
        • e.g. from Node.getAttributes()
      • NodeLists and NamedNodeMaps are "live":
        • updates of the document structure are reflected in their contents
        • e.g., this would delete every other child of node n:

              NodeList cList = n.getChildNodes();
              for (int i = 0; i < cList.getLength(); i++)
                  n.removeChild(cList.item(i));

        • That’s strange! (What happens?)
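The effect of liveness can be reproduced with a short sketch. After each removal the list shrinks and the remaining children shift down by one index, while i still advances, so every second child is skipped. The class name `LiveLists` and the alternative loop in removeAll are illustrative assumptions, one of several ways to empty a node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LiveLists {
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Names of the remaining children of n, separated by spaces.
    static String names(Node n) {
        StringBuilder sb = new StringBuilder();
        for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(c.getNodeName());
        }
        return sb.toString();
    }

    // The loop from the slide: removes only every other child.
    static String naiveRemove(String xml) {
        Element n = root(xml);
        NodeList cList = n.getChildNodes();
        for (int i = 0; i < cList.getLength(); i++) {
            n.removeChild(cList.item(i)); // the live list shrinks here
        }
        return names(n);
    }

    // Removes all children by always taking the current first child.
    static String removeAll(String xml) {
        Element n = root(xml);
        while (n.hasChildNodes()) {
            n.removeChild(n.getFirstChild());
        }
        return names(n);
    }

    public static void main(String[] args) {
        String xml = "<n><a/><b/><c/><d/></n>";
        System.out.println("[" + naiveRemove(xml) + "]"); // prints: [b d]
        System.out.println("[" + removeAll(xml) + "]");   // prints: []
    }
}
```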

  18. DOM: XML Implementations
      • Java-based parsers, e.g. Apache Xerces, Apache Crimson, …
      • In the MS IE browser: COM programming interfaces for C/C++ and Visual Basic; ActiveX object programming interfaces for script languages
      • Perl: XML::DOM (implements DOM Level 1)
      • Others? APIs for other applications than parsers?
        • Vendors of different kinds of systems have participated in the W3C DOM WG

  19. A Java-DOM Example
      • Command-line tool RegListMgr for maintaining a course registration list
        • with single-letter commands for listing, adding, updating and deleting student records
      • Example (the command l lists the contents):

            $ java RegListMgr reglist.xml
            Document loaded successfully
            > l
            …
            40: Tero Ulvinen, TKM1, tero@fake.addr.fi, 2
            41: heli viinikainen, tkt5, heli@fake.addr.fi, 1

  20. Registration list: the XML file

          <?xml version="1.0" encoding="ISO-8859-1"?>
          <!DOCTYPE reglist SYSTEM "reglist.dtd">
          <reglist lastID="41">
            <student id="RDK1">
              <name><given>Juho</given>
                <family>Ahopelto</family></name>
              <branchAndYear>TKT4</branchAndYear>
              <email>juho@fake.addr.fi</email>
              <group>2</group>
            </student>
            <!-- … and the other students … -->
          </reglist>

  21. Registration List: the DTD

          <!ELEMENT reglist (student*)>
          <!ATTLIST reglist lastID CDATA #REQUIRED>
          <!ELEMENT student (name, branchAndYear, email, group)>
          <!ATTLIST student id ID #REQUIRED>
          <!ELEMENT name (given, family)>
          <!ELEMENT given (#PCDATA)>
          <!-- … and the same for family, branchAndYear, email, and group -->

  22. Loading and Saving the RegList
      • Loading of the registration list into the DOM Document doc is implemented with a JAXP DocumentBuilder
        • assume this has been done: doc is a handle to the Document
      • Saving is implemented with a JAXP Transformer
      • … to be discussed later
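Until JAXP is discussed, the following sketch gives a rough idea of loading with a DocumentBuilder and saving with an identity Transformer. It is not the RegListMgr code: the class name `LoadAndSave` is an assumption, and the sketch works on strings rather than files so that it runs without reglist.dtd. For files, `builder.parse(new File(...))` and `new StreamResult(new File(...))` would be used instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LoadAndSave {
    // Loading: a JAXP DocumentBuilder parses the input into a DOM Document.
    static Document load(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Saving: a Transformer without a stylesheet copies the DOM tree to the output.
    static String save(Document doc) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Loads, bumps the lastID attribute, and saves again.
    static String roundTrip(String xml, String newLastID) {
        try {
            Document doc = load(xml);
            doc.getDocumentElement().setAttribute("lastID", newLastID);
            return save(doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not load or save XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(
                "<reglist lastID=\"41\"><student id=\"RDK1\"/></reglist>", "42"));
    }
}
```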

  23. Listing student records (1)

          NodeList students = doc.getElementsByTagName("student");
          for (int i = 0; i < students.getLength(); i++)
              showStudent((Element) students.item(i));

          private void showStudent(Element student) {
              // Collect relevant sub-elements.
              // Note: getNextSibling() yields the next element only if the parser
              // has dropped the whitespace between elements; otherwise it returns
              // a whitespace Text node.
              Node given = student.getElementsByTagName("given").item(0);
              Node family = given.getNextSibling();
              Node bAndY = student.getElementsByTagName("branchAndYear").item(0);
              Node email = bAndY.getNextSibling();
              Node group = email.getNextSibling();

  24. Listing student records (2)

              // Method showStudent continues:
              // Print the numeric part of the ID (skip the "RDK" prefix):
              System.out.print(student.getAttribute("id").substring(3));
              System.out.print(": " + given.getFirstChild().getNodeValue());
              // or given.getTextContent() with DOM3
              // .. similarly access and display the
              // value of family, bAndY, email, and group
              // …
          } // showStudent

  25. Adding New Records
      • Example (the command a adds students):

            > a
            First name (or <return> to finish): Antti
            Last name: Ahkera
            Branch&year: tkt3
            email: antti@fake.addr.fi
            group: 2
            First name (or <return> to finish):
            Finished adding records
            > l
            …
            41: heli viinikainen, tkt5, heli@fake.addr.fi, 1
            42: Antti Ahkera, tkt3, antti@fake.addr.fi, 2

  26. Implementing addition of records (1)

          Element rootElem = doc.getDocumentElement();
          String lastID = rootElem.getAttribute("lastID");
          int lastIDnum = Integer.parseInt(lastID);
          System.out.print("First name (or <return> to finish): ");
          String firstName = terminalReader.readLine().trim();
          while (firstName.length() > 0) {
              // Get the next unused ID:
              String ID = "RDK" + Integer.toString(++lastIDnum);
              // … Read values lastName, bAndY, email,
              // and group from the terminal, and then ...

  27. Implementing addition of records (2)

              Element newStudent = newStudent(doc, ID, firstName,
                                              lastName, bAndY, email, group);
              rootElem.appendChild(newStudent);
              System.out.print("First name (or <return> to finish): ");
              firstName = terminalReader.readLine().trim();
          } // while firstName.length() > 0
          // Update the last ID used:
          String newLastID = Integer.toString(lastIDnum);
          rootElem.setAttribute("lastID", newLastID);
          System.out.println("Finished adding records");

  28. Creating new student records (1)

          private Element newStudent(Document doc, String ID, String fName,
                                     String lName, String bAndY,
                                     String email, String grp) {
              Element stu = doc.createElement("student");
              stu.setAttribute("id", ID);
              Element newName = doc.createElement("name");
              Element newGiven = doc.createElement("given");
              newGiven.appendChild(doc.createTextNode(fName));
              Element newFamily = doc.createElement("family");
              newFamily.appendChild(doc.createTextNode(lName));
              newName.appendChild(newGiven);
              newName.appendChild(newFamily);
              stu.appendChild(newName);

  29. Creating new student records (2)

              // method newStudent(…) continues:
              Element newBr = doc.createElement("branchAndYear");
              newBr.appendChild(doc.createTextNode(bAndY));
              stu.appendChild(newBr);
              Element newEmail = doc.createElement("email");
              newEmail.appendChild(doc.createTextNode(email));
              stu.appendChild(newEmail);
              Element newGrp = doc.createElement("group");
              // The parameter is named grp, not group:
              newGrp.appendChild(doc.createTextNode(grp));
              stu.appendChild(newGrp);
              return stu;
          } // newStudent

  30. Updates and Deletions
      • Updates and deletions implemented similarly, by manipulating the DOM structures
      • To be treated in the exercises
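One possible shape for such operations is sketched below. It is not the RegListMgr implementation: the class name `RegListEdit`, the helper names findStudent, deleteStudent and updateEmail, and the shortened student records are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RegListEdit {
    // Finds the student element with the given id attribute, or returns null.
    static Element findStudent(Document doc, String id) {
        NodeList students = doc.getElementsByTagName("student");
        for (int i = 0; i < students.getLength(); i++) {
            Element s = (Element) students.item(i);
            if (id.equals(s.getAttribute("id"))) {
                return s;
            }
        }
        return null;
    }

    // Deletion: detach the student element from its parent.
    static boolean deleteStudent(Document doc, String id) {
        Element s = findStudent(doc, id);
        if (s == null) {
            return false;
        }
        s.getParentNode().removeChild(s);
        return true;
    }

    // Update: replace the text content of the student's email element.
    static boolean updateEmail(Document doc, String id, String newEmail) {
        Element s = findStudent(doc, id);
        if (s == null) {
            return false;
        }
        Element email = (Element) s.getElementsByTagName("email").item(0);
        if (email == null) {
            return false;
        }
        email.setTextContent(newEmail); // DOM Level 3
        return true;
    }

    // Updates RDK1, deletes RDK2, and returns "<students left> <remaining text>".
    static String demo() {
        String xml = "<reglist>"
                + "<student id=\"RDK1\"><email>old@fake.addr.fi</email></student>"
                + "<student id=\"RDK2\"><email>x@fake.addr.fi</email></student>"
                + "</reglist>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            updateEmail(doc, "RDK1", "new@fake.addr.fi");
            deleteStudent(doc, "RDK2");
            return doc.getElementsByTagName("student").getLength() + " "
                    + doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse the sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: 1 new@fake.addr.fi
    }
}
```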

  31. Summary of XML APIs so far
      • Give applications access to the structure and contents of XML documents
      • Event-based APIs (e.g. SAX)
        • notify the application through parsing events
        • efficient
      • Object-model (or tree) based APIs (e.g. DOM)
        • provide a full parse tree
        • more convenient, but require substantial resources with large documents
      • Major parsers support both SAX and DOM
        • used through proprietary methods
        • used through JAXP (-> next)
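The contrast between the two API styles can be made concrete by counting elements both ways through JAXP. This is only a sketch; the class name `CountElements` and the sample input are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CountElements {
    // Event-based: the handler sees one startElement event per element; no tree is built.
    static int countWithSax(String xml, String name) {
        int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return count[0];
    }

    // Tree-based: the whole document is loaded into memory, then queried.
    static int countWithDom(String xml, String name) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName(name).getLength();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<reglist><student/><student/><other/></reglist>";
        System.out.println(countWithSax(xml, "student")); // prints: 2
        System.out.println(countWithDom(xml, "student")); // prints: 2
    }
}
```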
