Module pl.xml
XML LOM Utilities.
 This implements some useful things on LOM documents, such as returned by lxp.lom.parse.
 In particular, it can convert LOM back into XML text, with optional pretty-printing control.
 It is based on stanza.lua from Prosody.

    > d = xml.parse "<nodes><node id='1'>alice</node></nodes>"
    > = d
    <nodes><node id='1'>alice</node></nodes>
    > = xml.tostring(d,'',' ')
    <nodes>
     <node id='1'>alice</node>
    </nodes>

 Can be used as a lightweight one-stop-shop for simple XML processing; a simple XML parser is included
 but the default is to use lxp.lom if it can be found.
 
Prosody IM
Copyright (C) 2008-2010 Matthew Wild
Copyright (C) 2008-2010 Waqas Hussain

Classic Lua XML parser by Roberto Ierusalimschy, modified to output LOM format.
http://lua-users.org/wiki/LuaXml

See the Guide.
Dependencies: pl.utils
 Soft Dependencies: lxp.lom (fallback is to use basic Lua parser)
Functions
| new (tag[, attr={}]) | create a new document node. | 
| parse (text_or_filename, is_file, use_basic) | parse an XML document. | 
| elem (tag, items) | Create a Node with a set of children (text or Nodes) and attributes. | 
| tags (list) | given a list of names, return a number of element constructors. | 
| Doc:addtag (tag[, attrs={}]) | Adds a document Node, at current position. | 
| Doc:text (text) | Adds a text node, at current position. | 
| Doc:up () | Moves current position up one level. | 
| Doc:reset () | Resets current position to top level. | 
| Doc:add_direct_child (child) | Append a child to the current Node (ignoring current position). | 
| Doc:add_child (child) | Append a child at the current position (without changing position). | 
| Doc:set_attribs (t) | Set attributes of a document node. | 
| Doc:set_attrib (a, v) | Set a single attribute of a document node. | 
| Doc:get_attribs () | Gets the attributes of a document node. | 
| Doc.subst (template, data) | create a substituted copy of a document. | 
| Doc:child_with_name (tag) | Return the first child with a given tag name (non-recursive). | 
| Doc:get_elements_with_name (tag[, dont_recurse=false]) | Returns all elements in a document that have a given tag. | 
| Doc:children () | Iterator over all children of a document node, including text nodes. | 
| Doc:first_childtag () | Return the first child element of a node, if it exists. | 
| Doc:matching_tags ([tag=nil[, xmlns=nil]]) | Iterator that matches tag names, and a namespace (non-recursive). | 
| Doc:childtags () | Iterator over all child tags of a document node. | 
| Doc:maptags (callback) | Visit child Nodes of a node and call a function, possibly modifying the document. | 
| xml_escape (str) | Escapes a string for safe use in xml. | 
| xml_unescape (str) | Unescapes a string from xml. | 
| tostring (doc[, b_ind[, t_ind[, a_ind[, xml_preface]]]]) | Function to pretty-print an XML document. | 
| Doc:tostring ([b_ind[, t_ind[, a_ind[, xml_preface="<?xml version='1.0'?>"]]]]) | Method to pretty-print an XML document. | 
| Doc:get_text () | get the full text value of an element. | 
| clone (doc[, strsubst]) | Returns a copy of a document. | 
| Doc:filter ([strsubst]) | Returns a copy of a document. | 
| compare (t1, t2) | Compare two documents or elements. | 
| is_tag (d) | is this value a document element? | 
| walk (doc, depth_first, operation) | Calls a function recursively over Nodes in the document. | 
| parsehtml (s) | Parse a well-formed HTML file as a string. | 
| basic_parse (s, all_text, html) | Parse a simple XML document using a pure Lua parser based on Roberto Ierusalimschy's original version. | 
| Doc:match (pat) | does something... | 
Functions
- new (tag[, attr={}])
- 
    create a new document node.
    Parameters:
    - tag
    - attr (default {})
    Returns:
    - the Node object
    Usage:
        local doc = xml.new("main", { hello = "world", answer = "42" })
        print(doc)  --> <main hello='world' answer='42'/>
- parse (text_or_filename, is_file, use_basic)
- 
    parse an XML document. By default, this uses lxp.lom.parse, but
    falls back to basic_parse if lxp.lom cannot be loaded, or if use_basic is truthy.
    Parameters:
    - text_or_filename file or string representation
    - is_file whether text_or_filename is a file name or not
    - use_basic do a basic parse
    Returns:
    - a parsed LOM document with the document metatables set
    - nil, error the error can either be a file error or a parse error
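    A minimal sketch of parsing from a string. It passes a truthy use_basic so that the bundled pure-Lua parser is used and lxp.lom does not need to be installed; the sample XML and the variable names are illustrative only.

```lua
local xml = require "pl.xml"

-- Parse from a string (is_file = false) with the built-in parser (use_basic = true).
local doc, err = xml.parse("<nodes><node id='1'>alice</node></nodes>", false, true)
if not doc then
  error("parse failed: " .. tostring(err))
end

-- The result is a LOM table: tag name, attr table, children in the list part.
local first = doc[1]
print(doc.tag)        --> nodes
print(first.tag)      --> node
print(first.attr.id)  --> 1
print(first[1])       --> alice
```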
 
- elem (tag, items)
- 
    Create a Node with a set of children (text or Nodes) and attributes.
    Parameters:
    - tag string a tag name
    - items table or string either a single child (text or Node), or a table where the hash part is the attributes and the list part is the children (text or Nodes).
    Returns:
    - the new Node
    Usage:
        local doc = xml.elem("top", "hello world")                -- <top>hello world</top>
        local doc = xml.elem("main", xml.new("child"))            -- <main><child/></main>
        local doc = xml.elem("main", { "this ", "is ", "nice" })  -- <main>this is nice</main>
        local doc = xml.elem("main", { xml.new "this",
                                       xml.new "is",
                                       xml.new "nice" })          -- <main><this/><is/><nice/></main>
        local doc = xml.elem("main", { hello = "world" })         -- <main hello='world'/>
        local doc = xml.elem("main", {
          "prefix",
          xml.elem("child", { "this ", "is ", "nice"}),
          "postfix",
          attrib = "value"
        })  -- <main attrib='value'>prefix<child>this is nice</child>postfix</main>
- tags (list)
- 
    given a list of names, return a number of element constructors.
 If passing a comma-separated string, then whitespace surrounding the values
 will be stripped.
    The returned constructor functions are a shortcut to xml.elem where you no longer provide the tag name, but only the items table.
    Parameters:
    - list a list of names, or a comma-separated string
    Returns:
    - (multiple) constructor functions; function(items). For the items parameter see xml.elem.
    Usage:
        local new_parent, new_child = xml.tags 'mom, kid'
        doc = new_parent {new_child 'Bob', new_child 'Annie'}
        -- <mom><kid>Bob</kid><kid>Annie</kid></mom>
- Doc:addtag (tag[, attrs={}])
- 
    Adds a document Node, at current position.
 This updates the last inserted position to the new Node.
    Parameters:
    - tag
    - attrs (default {})
    Returns:
    - the current node (self)
    Usage:
        local doc = xml.new("main")
        doc:addtag("penlight", { hello = "world"})
        doc:addtag("expat")  -- added to 'penlight' since position moved
        print(doc)  --> <main><penlight hello='world'><expat/></penlight></main>
- Doc:text (text)
- 
    Adds a text node, at current position.
    Parameters:
    - text string a string
    Returns:
    - the current node (self)
    Usage:
        local doc = xml.new("main")
        doc:text("penlight")
        doc:text("expat")
        print(doc)  --> <main>penlightexpat</main>
- Doc:up ()
- 
    Moves current position up one level.
    Returns:
    - the current node (self)
- Doc:reset ()
- 
    Resets current position to top level.
    Resets to the self node.
    Returns:
    - the current node (self)
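    The current position used by addtag, up and reset can be illustrated with a short sketch. The tag names are made up for the example; it assumes only the behaviour described in the entries above.

```lua
local xml = require "pl.xml"

local doc = xml.new("main")
doc:addtag("a")    -- added to <main>; position moves into <a>
doc:addtag("a1")   -- added to <a>; position moves into <a1>
doc:up()           -- position moves back up to <a>
doc:addtag("a2")   -- added to <a>
doc:reset()        -- position moves back to <main>
doc:addtag("b")    -- added to <main>

local s = xml.tostring(doc)
print(s)  --> <main><a><a1/><a2/></a><b/></main>
```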
- Doc:add_direct_child (child)
- 
    Append a child to the current Node (ignoring current position).
    Parameters:
    - child a child node (either text or a document)
    Returns:
    - the current node (self)
    Usage:
        local doc = xml.new("main")
        doc:add_direct_child("dog")
        doc:add_direct_child(xml.new("child"))
        doc:add_direct_child("cat")
        print(doc)  --> <main>dog<child/>cat</main>
- Doc:add_child (child)
- 
    Append a child at the current position (without changing position).
    Parameters:
    - child a child node (either text or a document)
    Returns:
    - the current node (self)
    Usage:
        local doc = xml.new("main")
        doc:addtag("one")
        doc:add_child(xml.new("item1"))
        doc:add_child(xml.new("item2"))
        doc:add_child(xml.new("item3"))
        print(doc)  --> <main><one><item1/><item2/><item3/></one></main>
- Doc:set_attribs (t)
- 
    Set attributes of a document node.
 Will add/overwrite values, but will not remove existing ones.
 Operates on the Node itself, will not take position into account.
    Parameters:
    - t table a table containing attribute/value pairs
    Returns:
    - the current node (self)
- Doc:set_attrib (a, v)
- 
    Set a single attribute of a document node.
 Operates on the Node itself, will not take position into account.
    Parameters:
    - a attribute
    - v its value, pass in nil to delete the attribute
    Returns:
    - the current node (self)
- Doc:get_attribs ()
- 
    Gets the attributes of a document node.
 Operates on the Node itself, will not take position into account.
    Returns:
    - table with attributes (attribute/value pairs)
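    A small sketch of the three attribute accessors working together. The attribute names and values are illustrative; the calls are not chained, so the example does not depend on the return values.

```lua
local xml = require "pl.xml"

local node = xml.new("item", { id = "1" })

node:set_attribs({ color = "red" })  -- adds 'color', leaves 'id' in place
node:set_attrib("size", "10")        -- adds a single attribute
node:set_attrib("id", nil)           -- passing nil deletes the attribute

local attrs = node:get_attribs()
print(attrs.color, attrs.size, attrs.id)  --> red  10  nil
```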
    
 
- Doc.subst (template, data)
- 
    create a substituted copy of a document.
    Parameters:
    - template may be a document or a string representation which will be parsed and cached
    - data a table of name-value pairs or a list of such tables
    Returns:
    - an XML document
    
 
- Doc:child_with_name (tag)
- 
    Return the first child with a given tag name (non-recursive).
    Parameters:
    - tag the tag name
    Returns:
    - the child Node found or nil if not found
- Doc:get_elements_with_name (tag[, dont_recurse=false])
- 
    Returns all elements in a document that have a given tag.
    Parameters:
    - tag string a tag name
    - dont_recurse boolean optionally only return the immediate children with this tag name (default false)
    Returns:
    - a list of elements found; the list will be empty if none was found.
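    The difference between the non-recursive and recursive lookups can be sketched as follows. The document structure and tag names are invented for illustration.

```lua
local xml = require "pl.xml"

-- <library><book>Lua</book><shelf><book>XML</book></shelf></library>
local doc = xml.elem("library", {
  xml.elem("book", "Lua"),
  xml.elem("shelf", { xml.elem("book", "XML") }),
})

local shelf  = doc:child_with_name("shelf")              -- direct children only
local all    = doc:get_elements_with_name("book")        -- recursive by default
local direct = doc:get_elements_with_name("book", true)  -- dont_recurse = true

print(shelf.tag, #all, #direct)  --> shelf  2  1
```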
    
 
- Doc:children ()
- 
    Iterator over all children of a document node, including text nodes.
 This function is not recursive, so returns only direct child nodes.
    Returns:
    - iterator that returns a single Node per iteration.
    
 
- Doc:first_childtag ()
- 
    Return the first child element of a node, if it exists.
 This will skip text nodes.
    Returns:
    - first child Node or nil if there is none.
- Doc:matching_tags ([tag=nil[, xmlns=nil]])
- 
    Iterator that matches tag names, and a namespace (non-recursive).
    Parameters:
    - tag string tag names to return. Returns all tags if not provided. (default nil)
    - xmlns string the namespace value ('xmlns' attribute) to return. If not provided will match all namespaces. (default nil)
    Returns:
    - iterator that returns a single Node per iteration.
    
 
- Doc:childtags ()
- 
    Iterator over all child tags of a document node.  This will skip over
 text nodes.
    Returns:
    - iterator that returns a single Node per iteration.
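    To show how the child iterators differ, here is a brief sketch over a node with mixed text and element children. The content is illustrative only.

```lua
local xml = require "pl.xml"

-- <p>hello <b>big</b> world<br/></p>
local doc = xml.elem("p", { "hello ", xml.elem("b", "big"), " world", xml.new("br") })

-- children() yields text nodes (strings) as well as element Nodes.
local all = 0
for _ in doc:children() do
  all = all + 1
end

-- childtags() yields element Nodes only.
local tags = {}
for node in doc:childtags() do
  tags[#tags + 1] = node.tag
end

-- first_childtag() skips the leading text node.
local first = doc:first_childtag()

print(all, table.concat(tags, ","), first.tag)  --> 4  b,br  b
```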
    
 
- Doc:maptags (callback)
- 
    Visit child Nodes of a node and call a function, possibly modifying the document.
 Text elements will be skipped.
 This is not recursive, so only direct children will be passed.
    Parameters:
    - callback function a function with signature function(node), passed the node. The element will be updated with the returned value, or deleted if it returns nil.
- xml_escape (str)
- 
    Escapes a string for safe use in xml.
 Handles quotes(single+double), less-than, greater-than, and ampersand.
    Parameters:
    - str string string value to escape
    Returns:
    - escaped string
    Usage:
        local esc = xml.xml_escape([["'<>&]])  --> "&quot;&apos;&lt;&gt;&amp;"
- xml_unescape (str)
- 
    Unescapes a string from xml.
 Handles quotes(single+double), less-than, greater-than, and ampersand.
    Parameters:
    - str string string value to unescape
    Returns:
    - unescaped string
    Usage:
        local unesc = xml.xml_unescape("&quot;&apos;&lt;&gt;&amp;")  --> [["'<>&]]
- tostring (doc[, b_ind[, t_ind[, a_ind[, xml_preface]]]])
- 
    Function to pretty-print an XML document.
    Parameters:
    - doc an XML document
    - b_ind string or int an initial block-indent (required when t_ind is set) (optional)
    - t_ind string or int a tag-indent for each level (required when a_ind is set) (optional)
    - a_ind string or int if given, indent each attribute pair and put on a separate line (optional)
    - xml_preface string or bool force prefacing with a default or custom preface; if truthy then <?xml version='1.0'?> will be used as default. (optional)
    Returns:
    - a string representation
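    A hedged sketch of the main ways to call xml.tostring. The document is built with xml.elem so no parser is needed; the exact layout of the indented form is printed rather than asserted here.

```lua
local xml = require "pl.xml"

local doc = xml.elem("nodes", { xml.elem("node", { "alice", id = "1" }) })

-- No indent arguments: everything on one line.
local compact = xml.tostring(doc)
print(compact)  --> <nodes><node id='1'>alice</node></nodes>

-- Empty block-indent and two spaces per nesting level.
local pretty = xml.tostring(doc, "", "  ")
print(pretty)

-- A truthy, non-string xml_preface selects the default declaration.
local with_preface = xml.tostring(doc, nil, nil, nil, true)
print(with_preface)
```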
- Doc:tostring ([b_ind[, t_ind[, a_ind[, xml_preface="<?xml version='1.0'?>"]]]])
- 
    Method to pretty-print an XML document.
 Invokes xml.tostring.
    Parameters:
    - b_ind string or int an initial indent (required when t_ind is set) (optional)
    - t_ind string or int an indent for each level (required when a_ind is set) (optional)
    - a_ind string or int if given, indent each attribute pair and put on a separate line (optional)
    - xml_preface string force prefacing with default or custom (default "<?xml version='1.0'?>")
    Returns:
    - a string representation
- Doc:get_text ()
- 
    get the full text value of an element.
    Returns:
    - a single string with all text elements concatenated
    Usage:
        local doc = xml.new("main")
        doc:text("one")
        doc:add_child(xml.elem "two")
        doc:text("three")
        local t = doc:get_text()  --> "onethree"
- clone (doc[, strsubst])
- 
    Returns a copy of a document.
    The strsubst parameter is a callback with signature function(object, kind, parent).
    Param kind has the following values, and parameters:
    - "*TAG": object is the tag-name, parent is the Node object. Returns the new tag name.
    - "*TEXT": object is the text-element, parent is the Node object. Returns the new text value.
    - other strings not prefixed with *: kind is the attribute name, object is the attribute value, parent is the Node object. Returns the new attribute value.
    Parameters:
    - doc Node or string a Node object or string (text node)
    - strsubst function an optional function for handling string copying which could do substitution, etc. (optional)
    Returns:
    - copy of the document
- Doc:filter ([strsubst])
- 
    Returns a copy of a document.
 This is the method version of xml.clone.
    Parameters:
    - strsubst function an optional function for handling string copying (optional)
- compare (t1, t2)
- 
    Compare two documents or elements.
 Equality is based on tag, child nodes (text and tags), attributes and order
 of those (order only fails if both are given, and not equal).
    Parameters:
    - t1 Node or string a Node object or string (text node)
    - t2 Node or string a Node object or string (text node)
    Returns:
    - boolean true when the Nodes are equal.
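    As an illustration of xml.clone and xml.compare together, the sketch below copies a small document, checks that the copy compares equal, and then changes the copy. The tag names are hypothetical.

```lua
local xml = require "pl.xml"

local doc = xml.elem("main", { xml.elem("child", "text") })

-- Without a strsubst callback, clone makes a plain structural copy.
local copy = xml.clone(doc)
local same = xml.compare(doc, copy)
print(same)  --> true

-- The copy is independent: changing it does not touch the original.
copy:add_direct_child(xml.new("extra"))
local still_same = xml.compare(doc, copy)
print(still_same)  --> false
```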
- is_tag (d)
- 
    is this value a document element?
    Parameters:
    - d any value
    Returns:
    - boolean true if it is a table with property tag being a string value.
- walk (doc, depth_first, operation)
- 
    Calls a function recursively over Nodes in the document.
 Will only call on tags, it will skip text nodes.
    The function signature for operation is function(tag_name, Node).
    Parameters:
    - doc Node or string a Node object or string (text node)
    - depth_first boolean visit child nodes first, then the current node
    - operation function a function which will receive the current tag name and current node.
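    The effect of depth_first on visiting order can be sketched by recording tag names in both modes. The document shape is made up for the example.

```lua
local xml = require "pl.xml"

-- <a><b><c/></b><d/></a>
local doc = xml.elem("a", { xml.elem("b", { xml.new("c") }), xml.new("d") })

local pre, post = {}, {}
-- depth_first = false: a node is visited before its children.
xml.walk(doc, false, function(tag_name) pre[#pre + 1] = tag_name end)
-- depth_first = true: children are visited before their parent.
xml.walk(doc, true, function(tag_name) post[#post + 1] = tag_name end)

print(table.concat(pre, " "))   --> a b c d
print(table.concat(post, " "))  --> c b d a
```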
 
- parsehtml (s)
- 
    Parse a well-formed HTML file as a string.
 Tags are case-insensitive, DOCTYPE is ignored, and empty elements can be .. empty.
    Parameters:
    - s the HTML
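    A short sketch of the relaxed HTML rules, assuming that br is among the elements the parser treats as empty. The markup is illustrative.

```lua
local xml = require "pl.xml"

-- Upper-case tags and an unclosed <BR>, as often found in HTML.
local doc = xml.parsehtml("<HTML><BODY><P>hello<BR>world</P></BODY></HTML>")

local p = doc[1][1]          -- html -> body -> p
print(doc.tag, doc[1].tag)   --> html  body
print(p[1], p[2].tag, p[3])  --> hello  br  world
```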
 
- basic_parse (s, all_text, html)
- 
    Parse a simple XML document using a pure Lua parser based on Roberto Ierusalimschy's original version.
    Parameters:
    - s the XML document to be parsed.
    - all_text if true, preserves all whitespace. Otherwise only text containing non-whitespace is included.
    - html if true, uses relaxed HTML rules for parsing
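    The all_text flag can be illustrated by parsing the same input twice and counting the children of the root element. The input string is an arbitrary example.

```lua
local xml = require "pl.xml"

local src = "<a> <b>x</b> </a>"

-- Default: whitespace-only text between tags is dropped.
local trimmed = xml.basic_parse(src)
print(#trimmed)  --> 1  (only <b>)

-- all_text = true: the whitespace is kept as text nodes.
local full = xml.basic_parse(src, true)
print(#full)     --> 3  (" ", <b>, " ")
```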
 
- Doc:match (pat)
- 
    does something...
    Parameters:
    - pat