<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.13: http://docutils.sourceforge.net/" />
<title>lxml.objectify notes</title>
<meta name="author" content="Dave Kuhlman" />
<meta name="date" content="July 06, 2015" />
<style type="text/css">
/* css */
body {
font: 90% 'Lucida Grande', Verdana, Geneva, Lucida, Arial, Helvetica, sans-serif;
background: #ffffff;
color: black;
margin: 2em;
padding: 2em;
}
a[href] {
color: #436976;
background-color: transparent;
}
a.toc-backref {
text-decoration: none;
}
h1 a[href] {
text-decoration: none;
color: #fcb100;
background-color: transparent;
}
a.strong {
font-weight: bold;
}
img {
margin: 0;
border: 0;
}
p {
margin: 0.5em 0 1em 0;
line-height: 1.5em;
}
p a {
text-decoration: underline;
}
p a:visited {
color: purple;
background-color: transparent;
}
p a:active {
color: red;
background-color: transparent;
}
a:hover {
text-decoration: none;
}
p img {
border: 0;
margin: 0;
}
h1, h2, h3, h4, h5, h6 {
color: #003a6b;
background-color: transparent;
font: 100% 'Lucida Grande', Verdana, Geneva, Lucida, Arial, Helvetica, sans-serif;
margin: 0;
padding-top: 0.5em;
}
h1 {
font-size: 160%;
margin-bottom: 0.5em;
border-bottom: 1px solid #fcb100;
}
h2 {
font-size: 140%;
margin-bottom: 0.5em;
border-bottom: 1px solid #aaa;
}
h3 {
font-size: 130%;
margin-bottom: 0.5em;
text-decoration: underline;
}
h4 {
font-size: 110%;
font-weight: bold;
}
h5 {
font-size: 100%;
font-weight: bold;
}
h6 {
font-size: 80%;
font-weight: bold;
}
ul a, ol a {
text-decoration: underline;
}
dt {
font-weight: bold;
}
dt a {
text-decoration: none;
}
dd {
line-height: 1.5em;
margin-bottom: 1em;
}
legend {
background: #ffffff;
padding: 0.5em;
}
form {
margin: 0;
}
dl.form {
margin: 0;
padding: 1em;
}
dl.form dt {
width: 30%;
float: left;
margin: 0;
padding: 0 0.5em 0.5em 0;
text-align: right;
}
input {
font: 100% 'Lucida Grande', Verdana, Geneva, Lucida, Arial, Helvetica, sans-serif;
color: black;
background-color: white;
vertical-align: middle;
}
abbr, acronym, .explain {
color: black;
background-color: transparent;
}
q, blockquote {
}
code, pre {
font-family: monospace;
font-size: 1.2em;
display: block;
padding: 10px;
border: 1px solid #838183;
background-color: #eee;
color: #000;
overflow: auto;
margin: 0.5em 1em;
}
div.admonition, div.attention, div.caution, div.danger, div.error,
div.hint, div.important, div.note, div.tip, div.warning {
margin: 2em ;
border: medium outset ;
padding: 1em }
div.admonition p.admonition-title, div.hint p.admonition-title,
div.important p.admonition-title, div.note p.admonition-title,
div.tip p.admonition-title {
font-weight: bold ;
font-family: sans-serif }
div.attention p.admonition-title, div.caution p.admonition-title,
div.danger p.admonition-title, div.error p.admonition-title,
div.warning p.admonition-title {
color: red ;
font-weight: bold ;
font-family: sans-serif }
tt.docutils {
background-color: #dddddd;
}
ul.auto-toc {
list-style-type: none }
</style>
</head>
<body>
<div class="document" id="lxml-objectify-notes">
<h1 class="title">lxml.objectify notes</h1>
<table class="docinfo" frame="void" rules="none">
<col class="docinfo-name" />
<col class="docinfo-content" />
<tbody valign="top">
<tr><th class="docinfo-name">Author:</th>
<td>Dave Kuhlman</td></tr>
<tr><th class="docinfo-name">Address:</th>
<td><pre class="address">
dkuhlman (at) davekuhlman (dot) org
</pre>
</td></tr>
<tr><th class="docinfo-name">Revision:</th>
<td>1.1a</td></tr>
<tr><th class="docinfo-name">Date:</th>
<td>July 06, 2015</td></tr>
</tbody>
</table>
<!-- vim:ft=rst: -->
<table class="docutils field-list" frame="void" rules="none">
<col class="field-name" />
<col class="field-body" />
<tbody valign="top">
<tr class="field"><th class="field-name">Copyright:</th><td class="field-body">Copyright (c) 2015 Dave Kuhlman. All Rights Reserved.
This software is subject to the provisions of the MIT License
<a class="reference external" href="http://www.opensource.org/licenses/mit-license.php">http://www.opensource.org/licenses/mit-license.php</a>.</td>
</tr>
<tr class="field"><th class="field-name">Abstract:</th><td class="field-body">A document intended to help those getting started with
<tt class="docutils literal">lxml.objectify</tt> and, in particular, to help those
attempting to transition from <tt class="docutils literal">generateDS.py</tt>.</td>
</tr>
</tbody>
</table>
<div class="contents topic" id="contents">
<p class="topic-title first">Contents</p>
<ul class="auto-toc simple">
<li><a class="reference internal" href="#introduction" id="id1">1 Introduction</a></li>
<li><a class="reference internal" href="#migrating-from-generateds-py-to-lxml-objectify" id="id2">2 Migrating from generateDS.py to lxml.objectify</a><ul class="auto-toc">
<li><a class="reference internal" href="#parsing-an-xml-instance-document" id="id3">2.1 Parsing an XML instance document</a></li>
<li><a class="reference internal" href="#exporting-an-xml-document" id="id4">2.2 Exporting an XML document</a><ul class="auto-toc">
<li><a class="reference internal" href="#exporting-without-ignorable-whitespace" id="id5">2.2.1 Exporting without "ignorable whitespace"</a></li>
</ul>
</li>
<li><a class="reference internal" href="#the-lxml-objectify-api-access-to-children-and-attributes" id="id6">2.3 The lxml.objectify API -- access to children and attributes</a></li>
<li><a class="reference internal" href="#manipulating-and-modifying-the-element-tree" id="id7">2.4 Manipulating and modifying the element tree</a></li>
</ul>
</li>
<li><a class="reference internal" href="#useful-tips-and-hints" id="id8">3 Useful tips and hints</a><ul class="auto-toc">
<li><a class="reference internal" href="#a-mini-library-of-helpful-functions" id="id9">3.1 A mini-library of helpful functions</a></li>
<li><a class="reference internal" href="#printing-a-more-readable-representation" id="id10">3.2 Printing a (more) readable representation</a></li>
<li><a class="reference internal" href="#exploring-element-specific-api" id="id11">3.3 Exploring element-specific API</a></li>
<li><a class="reference internal" href="#searching-an-xml-document" id="id12">3.4 Searching an XML document</a></li>
</ul>
</li>
<li><a class="reference internal" href="#sample-applications-with-lxml-objectify" id="id13">4 Sample applications with lxml.objectify</a></li>
<li><a class="reference internal" href="#evaluation-and-comparison-lxml-objectify-vs-generateds-py" id="id14">5 Evaluation and comparison -- lxml.objectify vs. generateDS.py</a><ul class="auto-toc">
<li><a class="reference internal" href="#api-discovery" id="id15">5.1 API discovery</a></li>
<li><a class="reference internal" href="#namespaces" id="id16">5.2 Namespaces</a></li>
<li><a class="reference internal" href="#summary" id="id17">5.3 Summary</a></li>
</ul>
</li>
</ul>
</div>
<div class="section" id="introduction">
<h1><a class="toc-backref" href="#id1">1 Introduction</a></h1>
<p>This document offers some help to those starting
out with <tt class="docutils literal">lxml.objectify</tt>. It does not attempt to replace
the official documentation, which you can find here:
<a class="reference external" href="http://lxml.de/objectify.html">http://lxml.de/objectify.html</a>.</p>
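<p>Much of the code in this document assumes that you have done the
following in your Python code (several examples also use
<tt class="docutils literal">etree</tt> and <tt class="docutils literal">sys</tt>):</p>
<pre class="literal-block">
import sys
from lxml import etree, objectify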
</pre>
</div>
<div class="section" id="migrating-from-generateds-py-to-lxml-objectify">
<h1><a class="toc-backref" href="#id2">2 Migrating from generateDS.py to lxml.objectify</a></h1>
<p>With <tt class="docutils literal">lxml.objectify</tt>, unlike <tt class="docutils literal">generateDS.py</tt>, there is no need
to generate code before processing an XML instance document.</p>
<div class="section" id="parsing-an-xml-instance-document">
<h2><a class="toc-backref" href="#id3">2.1 Parsing an XML instance document</a></h2>
<p>Use something like the following:</p>
<pre class="literal-block">
def objectify_parse(infilename):
    doctree = objectify.parse(infilename)
    root = doctree.getroot()
    return doctree, root
</pre>
<p>Or, when you want to validate against a schema while parsing, use:</p>
<pre class="literal-block">
from lxml import etree

def objectify_parse_with_schema(schemaname, infilename):
    # Build a validating parser; parsing fails if the document is invalid.
    schema = etree.XMLSchema(file=schemaname)
    parser = objectify.makeparser(schema=schema)
    doctree = objectify.parse(infilename, parser=parser)
    root = doctree.getroot()
    return doctree, root
</pre>
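<p>As a hedged illustration of validation at parse time, the following
sketch parses from strings rather than files. The schema, the element
names and the sample documents are invented for this example, and it
assumes Python 3 with lxml installed:</p>

```python
from lxml import etree, objectify

# Hypothetical schema: a <people> element holding <person> children.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="people">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="person" type="xs:string" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))
parser = objectify.makeparser(schema=schema)

# A document that conforms to the schema parses normally.
root = objectify.fromstring(b"<people><person>Alice</person></people>", parser)
valid_tag = root.tag

# A document that violates the schema raises an error while parsing.
try:
    objectify.fromstring(b"<people><animal>cat</animal></people>", parser)
    rejected = False
except etree.XMLSyntaxError:
    rejected = True

print(valid_tag, rejected)
```

<p>Validating in the parser means an invalid document never produces a
tree, which may or may not be what you want; validating an already
parsed tree with the schema object is the alternative.</p>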
<p>And, if validation against a schema is one of your needs, don't
forget the <tt class="docutils literal">xmllint</tt> command line tool. For example:</p>
<pre class="literal-block">
$ xmllint --noout --schema my_schema.xsd my_instancedoc.xml
</pre>
</div>
<div class="section" id="exporting-an-xml-document">
<h2><a class="toc-backref" href="#id4">2.2 Exporting an XML document</a></h2>
<p>There are several ways:</p>
<pre class="literal-block">
>>> print etree.tostring(doctree)
>>> print etree.tostring(root)
>>> doctree.write(sys.stdout)
>>> doctree.write(sys.stdout, pretty_print=True)
</pre>
<p>You can also export a sub-tree:</p>
<pre class="literal-block">
In [163]: person = root.person[1]
In [164]: print etree.tostring(person, pretty_print=True)
</pre>
<p>And, with optional pretty printing (indenting) and an XML
declaration:</p>
<pre class="literal-block">
>>> doctree.write(my_output_file, pretty_print=True)
>>> doctree.write(my_output_file, xml_declaration=True)
>>> doctree.write(my_output_file, pretty_print=True, xml_declaration=True)
</pre>
<p>Yet more examples:</p>
<pre class="literal-block">
>>> a = objectify.fromstring('<aaa><bbb>111</bbb><bbb><ccc>222</ccc></bbb></aaa>')
>>> etree.tostring(a)
>>> print etree.tostring(a)
>>> print etree.tostring(a, pretty_print=True)
>>> print etree.tostring(a.bbb[1], pretty_print=True) # pretty print a subtree
</pre>
<div class="section" id="exporting-without-ignorable-whitespace">
<h3><a class="toc-backref" href="#id5">2.2.1 Exporting without "ignorable whitespace"</a></h3>
<p>The <tt class="docutils literal">export</tt> methods generated by <tt class="docutils literal">generateDS.py</tt> support an
optional argument (<tt class="docutils literal">pretty_print</tt>, which defaults to <tt class="docutils literal">True</tt>); passing
<tt class="docutils literal">pretty_print=False</tt> exports a document <em>without</em> ignorable
whitespace. <tt class="docutils literal">lxml.objectify</tt> has support for that also:</p>
<ol class="arabic">
<li><p class="first">Parse the document initially without the ignorable whitespace.
Example:</p>
<pre class="literal-block">
parser = etree.XMLParser(remove_blank_text=True)
doc = etree.parse(filename, parser)
root = doc.getroot()
</pre>
</li>
<li><p class="first">In some cases you might need to remove ignorable whitespace with
something like the following:</p>
<pre class="literal-block">
for element in root.iter():
    element.tail = None
</pre>
</li>
</ol>
<p>The above code examples and more information on ignorable whitespace
and formatting serialized output (also known as "export" in
<tt class="docutils literal">generateDS.py</tt>) can be found in the <tt class="docutils literal">lxml</tt> FAQ:
<a class="reference external" href="http://lxml.de/FAQ.html#why-doesn-t-the-pretty-print-option-reformat-my-xml-output">http://lxml.de/FAQ.html#why-doesn-t-the-pretty-print-option-reformat-my-xml-output</a></p>
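<p>To make the effect concrete, here is a small sketch that parses the
same text with and without <tt class="docutils literal">remove_blank_text=True</tt> and then
serializes it. The sample document is made up for illustration, and the
code assumes Python 3:</p>

```python
from lxml import etree

XML = "<aaa>\n  <bbb>111</bbb>\n\n  <bbb>222</bbb>\n</aaa>"

# Default parser: whitespace-only text between elements is kept.
kept = etree.fromstring(XML)

# remove_blank_text=True: whitespace-only text nodes are dropped while parsing.
stripped = etree.fromstring(XML, etree.XMLParser(remove_blank_text=True))

unchanged = etree.tostring(kept, encoding="unicode")
compact = etree.tostring(stripped, encoding="unicode")
pretty = etree.tostring(stripped, pretty_print=True, encoding="unicode")

print(compact)
print(pretty)
```

<p>Once the blank text is gone, the same tree can be written either
compactly or with fresh indentation, because the serializer no longer
has to preserve the original whitespace.</p>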
</div>
</div>
<div class="section" id="the-lxml-objectify-api-access-to-children-and-attributes">
<h2><a class="toc-backref" href="#id6">2.3 The lxml.objectify API -- access to children and attributes</a></h2>
<p><strong>Attributes</strong> -- The attributes of an <tt class="docutils literal">lxml.objectify</tt> XML element are
available in a dictionary-like object. But you can also access them
directly through the element. Examples:</p>
<pre class="literal-block">
In [81]: element.attrib
Out[81]: {'ratio': '3.2', 'id': '1', 'value': 'abcd'}
In [82]:
In [82]: element.get('ratio')
Out[82]: '3.2'
In [83]: print element.get('ratio')
3.2
In [84]: print element.get('ratioxxx')
None
</pre>
<p>And, use <tt class="docutils literal">element.set(key, value)</tt> to set an attribute's value.</p>
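<p>A short, self-contained sketch of reading and setting attributes
follows. The <tt class="docutils literal">Link</tt> element and its attribute values are
invented for the example:</p>

```python
from lxml import objectify

link = objectify.fromstring('<Link rel="down" type="text/xml"/>')

rel = link.get('rel')                 # value of an existing attribute
missing = link.get('href')            # None when the attribute is absent
fallback = link.get('href', 'n/a')    # or a default of your choice

# Add (or overwrite) an attribute.
link.set('href', 'https://example.com/api')

attrs = dict(link.attrib)             # plain dict copy of all attributes
names = sorted(link.attrib.keys())

print(rel, missing, fallback, names)
```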
<p>Iterate over the attributes using the standard dictionary API on the
element's <tt class="docutils literal">el.attrib</tt> attribute. Example:</p>
<pre class="literal-block">
In [48]: link = root.Link[2]
In [49]: for key, value in link.attrib.items():
   ....:     print 'key: {} value: {}'.format(key, value)
   ....:
key: rel value: down
key: type value: application/vnd.vmware.admin.vmwExtension+xml
key: href value: https://vcloud.example.com/api/admin/extension
</pre>
<p><strong>Children</strong> -- The children of an XML element are available by using
the child's tag as an attribute. For example, if the element
<tt class="docutils literal">people</tt> has one or more children whose tag is <tt class="docutils literal">person</tt>, then
those children can be accessed as follows:</p>
<pre class="literal-block">
In [87]: people.person # first person available without index
Out[87]: <Element person at 0x7fa0f1814ea8>
In [88]: people.person[0] # same as previous
Out[88]: <Element person at 0x7fa0f1814ea8>
In [89]: people.person[1]
Out[89]: <Element person at 0x7fa0f1814e60>
</pre>
<p>You can also use <tt class="docutils literal">getattr()</tt> to access child elements. You may
need to do this when there are children from different namespaces
within the same element. Examples:</p>
<pre class="literal-block">
In [50]: rootgroup = root.RootGroup
In [51]: rootgroup.Group
Out[51]: <Element {http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group at 0x7f8d34a05b48>
In [52]:
In [52]: getattr(rootgroup, 'Group')
Out[52]: <Element {http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group at 0x7f8d34a05b48>
In [53]:
In [53]: getattr(rootgroup, '{http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group')
Out[53]: <Element {http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group at 0x7f8d34a05b48>
In [54]:
In [54]: getattr(rootgroup, '{http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group')[1]
Out[54]: <Element {http://hdfgroup.org/HDF5/XML/schema/HDF5-File.xsd}Group at 0x7f8d34a05ab8>
</pre>
<p>Iterate over the children by using the element's
<tt class="docutils literal">el.iterchildren()</tt> method. Example:</p>
<pre class="literal-block">
In [47]: for child in root.iterchildren():
   ....:     print child.tag
   ....:
{http://www.vmware.com/vcloud/v1.5}Link
{http://www.vmware.com/vcloud/v1.5}Link
{http://www.vmware.com/vcloud/v1.5}Link
{http://www.vmware.com/vcloud/v1.5}Link
{http://www.vmware.com/vcloud/v1.5}Link
</pre>
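<p>The following sketch pulls the child-access rules together in one
runnable piece. Both sample documents and the namespace URIs
(<tt class="docutils literal">urn:a</tt>, <tt class="docutils literal">urn:b</tt>) are made up for illustration:</p>

```python
from lxml import objectify

people = objectify.fromstring(
    '<people><person>Alice</person><person>Bob</person>'
    '<city>Paris</city></people>')

first = people.person             # first <person>, no index needed
second = people.person[1]         # index selects among same-tag siblings
count = len(people.person)        # number of <person> siblings
names = [p.text for p in people.person]
tags = [child.tag for child in people.iterchildren()]
has_zip = hasattr(people, 'zip')  # a missing child raises AttributeError

# Children in another namespace need getattr() with a '{uri}tag' name;
# plain attribute access looks in the namespace of the parent.
doc = objectify.fromstring(
    '<r xmlns="urn:a" xmlns:b="urn:b"><x>1</x><b:x>2</b:x></r>')
default_ns = doc.x.pyval
other_ns = getattr(doc, '{urn:b}x').pyval

print(names, tags, default_ns, other_ns)
```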
</div>
<div class="section" id="manipulating-and-modifying-the-element-tree">
<h2><a class="toc-backref" href="#id7">2.4 Manipulating and modifying the element tree</a></h2>
<p>Modify text content -- You can assign to a leaf element to modify
its text content, for example:</p>
<pre class="literal-block">
>>> dataset.datanode = 'a simple string'
</pre>
<p>However, you may want to use <tt class="docutils literal">lxml.objectify</tt> data types. If you
do not, <tt class="docutils literal">lxml.objectify</tt> may put them in a different namespace.
Here are examples that preserve the existing data types:</p>
<pre class="literal-block">
>>> dataset.datanode = objectify.StringElement('a simple string')
>>> dataset.datanode = objectify.IntElement('200')
>>> dataset.datanode = objectify.FloatElement('300.5')
</pre>
<p>See the following for more on how to work with Python data types:
<a class="reference external" href="http://lxml.de/objectify.html#python-data-types">http://lxml.de/objectify.html#python-data-types</a></p>
<p>Creating new elements -- See this for information on how to add
elements to the XML element tree:
<a class="reference external" href="http://lxml.de/objectify.html#creating-objectify-trees">http://lxml.de/objectify.html#creating-objectify-trees</a></p>
<p>You can also copy existing elements or sub-trees of elements, for
example:</p>
<pre class="literal-block">
>>> import copy
>>> new_element = copy.deepcopy(old_element)
>>> parent_element.append(new_element)
</pre>
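<p>Here is a hedged end-to-end sketch of these modifications. The
element names (<tt class="docutils literal">dataset</tt>, <tt class="docutils literal">datanode</tt>, <tt class="docutils literal">total</tt>) are
hypothetical. The final <tt class="docutils literal">objectify.deannotate()</tt> call strips the
type annotations that <tt class="docutils literal">lxml.objectify</tt> adds when you assign
Python values:</p>

```python
import copy
from lxml import etree, objectify

dataset = objectify.fromstring('<dataset><datanode>old</datanode></dataset>')

# Assigning to an existing child replaces its content.
dataset.datanode = 'a simple string'

# Assigning to a name that does not exist yet creates a new child.
dataset.total = 200

# Copy an existing element and append the copy to the parent.
clone = copy.deepcopy(dataset.datanode)
dataset.append(clone)

# Remove the py:pytype / xsi:type annotations before exporting.
objectify.deannotate(dataset, cleanup_namespaces=True)
xml = etree.tostring(dataset, encoding='unicode')
print(xml)
```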
</div>
</div>
<div class="section" id="useful-tips-and-hints">
<h1><a class="toc-backref" href="#id8">3 Useful tips and hints</a></h1>
<div class="section" id="a-mini-library-of-helpful-functions">
<h2><a class="toc-backref" href="#id9">3.1 A mini-library of helpful functions</a></h2>
<p>Some of the helper functions mentioned below are available here:
<a class="reference external" href="Objectify_files/objectify_helpers.py">objectify_helpers.py</a>.</p>
</div>
<div class="section" id="printing-a-more-readable-representation">
<h2><a class="toc-backref" href="#id10">3.2 Printing a (more) readable representation</a></h2>
<p>In order to get a picture of the API available at various elements,
you can use <tt class="docutils literal">objectify.dump(element)</tt>. For example:</p>
<pre class="literal-block">
In [237]: print objectify.dump(root.programmer)
programmer = None [ObjectifiedElement]
  * id = '2'
  * language = 'python'
  * editor = 'xml'
    name = 'Charles Carlson' [StringElement]
    interest = 'programming' [StringElement]
    category = 2233 [IntElement]
    description = 'A very happy programmer' [StringElement]
    email = 'charles@happyprogrammers.com' [StringElement]
    elposint = 14 [IntElement]
    elnonposint = 0 [IntElement]
    elnegint = -12 [IntElement]
    elnonnegint = 4 [IntElement]
    eldate = '2005-04-26' [StringElement]
    eldatetime = '2005-04-26T10:11:12' [StringElement]
    eldatetime1 = '2006-05-27T10:11:12.40' [StringElement]
    eltoken = 'aa bb cc\tdd\n ee' [StringElement]
    elshort = 123 [IntElement]
    ellong = 1324123412 [IntElement]
    elparam = u'' [StringElement]
      * id = 'id001'
      * name = 'Davy'
      * semantic = 'a big semantic'
      * type = 'abc'
    elparam = u'' [StringElement]
      * id = 'id002'
      * name = 'Davy'
      * semantic = 'a big semantic'
      * type = 'int'
<p>A similar display can be obtained by using <tt class="docutils literal">str(element)</tt>. But,
in order to do so, you may need to call
<tt class="docutils literal">objectify.enable_recursive_str()</tt> first. For
example:</p>
<pre class="literal-block">
In [238]: print str(root.programmer)
programmer = None [ObjectifiedElement]
  * id = '2'
  * language = 'python'
  * editor = 'xml'
    name = 'Charles Carlson' [StringElement]
    interest = 'programming' [StringElement]
    category = 2233 [IntElement]
    description = 'A very happy programmer' [StringElement]
    email = 'charles@happyprogrammers.com' [StringElement]
    elposint = 14 [IntElement]
    elnonposint = 0 [IntElement]
    elnegint = -12 [IntElement]
    elnonnegint = 4 [IntElement]
    eldate = '2005-04-26' [StringElement]
    eldatetime = '2005-04-26T10:11:12' [StringElement]
    eldatetime1 = '2006-05-27T10:11:12.40' [StringElement]
    eltoken = 'aa bb cc\tdd\n ee' [StringElement]
    elshort = 123 [IntElement]
    ellong = 1324123412 [IntElement]
    elparam = u'' [StringElement]
      * id = 'id001'
      * name = 'Davy'
      * semantic = 'a big semantic'
      * type = 'abc'
    elparam = u'' [StringElement]
      * id = 'id002'
      * name = 'Davy'
      * semantic = 'a big semantic'
      * type = 'int'
</pre>
<p>This behavior of <tt class="docutils literal">str(o)</tt> can be turned on and off with the
following:</p>
<pre class="literal-block">
In [75]: objectify.enable_recursive_str(True)
In [76]: objectify.enable_recursive_str(False)
</pre>
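<p>As a rough check of how the two displays relate, the sketch below
dumps a tiny, made-up <tt class="docutils literal">programmer</tt> element, then compares the
result with <tt class="docutils literal">str()</tt> while the recursive mode is switched on. The
setting is global, so the example turns it off again afterwards:</p>

```python
from lxml import objectify

root = objectify.fromstring(
    '<programmer id="2"><name>Charles</name>'
    '<category>2233</category></programmer>')

dumped = objectify.dump(root)

# enable_recursive_str() changes str() for all objectified elements,
# so restore the default once the comparison is done.
objectify.enable_recursive_str(True)
try:
    as_str = str(root)
finally:
    objectify.enable_recursive_str(False)

print(dumped)
```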
<p>And, here is an implementation that mimics <tt class="docutils literal">objectify.dump(o)</tt> but has
several additional features:</p>
<ul class="simple">
<li>It enables you to limit the number of levels of nesting and
display of children and their children etc. Imagine displaying
the root node of a very large file containing many levels of
nested children.</li>
<li>It writes to a file rather than accumulating a string. For some
situations, this saves having to type <tt class="docutils literal">print</tt> in order to format
the output. And, again thinking about very large documents, it
might save us from building up a huge string.</li>
</ul>
<pre class="literal-block">
import sys

def swrite(element, maxlevels=None, outfile=sys.stdout):
    """Recursively write out a formatted, readable representation of element.

    Possibly do shallow recursion.
    Limit recursion to maxlevels (default is all levels).
    Write output to file outfile (default is sys.stdout).
    """
    wrt = outfile.write
    swrite_(element, 0, maxlevels, wrt)

def swrite_(element, indent, maxlevels, wrt):
    # Write this element, then its attributes, then recurse into children.
    indentstr = ' ' * indent
    wrt('{}{}: {}\n'.format(indentstr, element.tag, repr(element), ))
    for name, value in element.attrib.iteritems():
        wrt(' {}* {}: {}\n'.format(indentstr, name, value, ))
    indent += 1
    # Stop descending once the requested depth has been reached.
    if maxlevels is not None and indent > maxlevels:
        return
    for child in element.iterchildren():
        swrite_(child, indent, maxlevels, wrt)
</pre>
</div>
<div class="section" id="exploring-element-specific-api">
<h2><a class="toc-backref" href="#id11">3.3 Exploring element-specific API</a></h2>
<p>With <tt class="docutils literal">lxml.objectify</tt>, inspecting objects to determine the API for
that specific element type is a frequent task. You may find a
function something like the following helpful:</p>
<pre class="literal-block">
Standard_attrs = set([ '__dict__', '__getattr__', 'addattr',
'countchildren', 'descendantpaths', '__class__', '__contains__',
'__copy__', '__deepcopy__', '__delattr__', '__delitem__',
'__doc__', '__format__', '__getattribute__', '__getitem__',
'__hash__', '__init__', '__iter__', '__len__', '__new__',
'__nonzero__', '__reduce__', '__reduce_ex__', '__repr__',
'__reversed__', '__setattr__', '__setitem__', '__sizeof__',
'__str__', '__subclasshook__', '_init', 'addnext',
'addprevious', 'append', 'attrib', 'base', 'clear', 'extend',
'find', 'findall', 'findtext', 'get', 'getchildren',
'getiterator', 'getnext', 'getparent', 'getprevious',
'getroottree', 'index', 'insert', 'items', 'iter',
'iterancestors', 'iterchildren', 'iterdescendants', 'iterfind',
'itersiblings', 'itertext', 'keys', 'makeelement', 'nsmap',
'prefix', 'remove', 'replace', 'set', 'sourceline', 'tag',
'tail', 'text', 'values', 'xpath', ])
def members(element):
    names = [attr for attr in dir(element) if attr not in Standard_attrs]
    return names
</pre>
<p>I obtained that list of <tt class="docutils literal">Standard_attrs</tt> by doing <tt class="docutils literal">print
dir(element)</tt> on a standard element (and then modifying it a bit).</p>
<p>However, instead of calling that <tt class="docutils literal">members(o)</tt> function (above),
the following snippet is likely just as useful:</p>
<pre class="literal-block">
In [96]: [child.tag for child in element.iterchildren()]
Out[96]: ['example1', 'name', 'interest', 'interest', 'category', 'hot.agent']
In [97]:
In [97]: sorted([child.tag for child in element.iterchildren()])
Out[97]: ['category', 'example1', 'hot.agent', 'interest', 'interest', 'name']
</pre>
<p>And, to save typing, the following functions might be helpful:</p>
<pre class="literal-block">
def children(element, tag=None):
    """Return a list of children of an element.

    Optional argument tag can be a single string or list of strings
    to select only children with that tag name.
    """
    child_list = [child for child in element.iterchildren(tag=tag)]
    return child_list

def child_tags(element, tag=None):
    """Return a list of the tag names of the children of an element.

    Optional argument tag can be a single string or list of strings
    to select only children with that tag name.
    """
    tags = [child.tag for child in element.iterchildren(tag=tag)]
    return tags
</pre>
<p>Or, you may find this shallow dump function useful. It uses
<tt class="docutils literal">objectify.dump(o)</tt>, but attempts to <em>only</em> return the description
of the top level object:</p>
<pre class="literal-block">
def sdump(element):
    content = objectify.dump(element)
    content = content.splitlines()
    prefix = ' '
    content = [line for line in content if not line.startswith(prefix)]
    content = '\n'.join(content)
    return content
</pre>
</div>
<div class="section" id="searching-an-xml-document">
<h2><a class="toc-backref" href="#id12">3.4 Searching an XML document</a></h2>
<p><tt class="docutils literal">lxml.objectify</tt> has its own XPath-like search capability with a
(possibly) simpler form of the XPath/XQuery language. See this for
information about ObjectPath: <a class="reference external" href="http://lxml.de/objectify.html#objectpath">http://lxml.de/objectify.html#objectpath</a></p>
<p>You can also use the <tt class="docutils literal">lxml</tt> XPath support on <tt class="docutils literal">lxml.objectify</tt>
elements. Example:</p>
<pre class="literal-block">
In [68]: root.xpath('.//@Name')
Out[68]:
['dataset1-1',
'dataset1-2',
'subgroup01',
'dataset2-1',
'dataset2-2',
'subgroup02',
'dataset3-1',
'dataset3-2',
'subgroup03',
'dataset3-3']
</pre>
<p>See this for information about the <tt class="docutils literal">lxml</tt> support for
<tt class="docutils literal">xpath</tt>: <a class="reference external" href="http://lxml.de/xpathxslt.html">http://lxml.de/xpathxslt.html</a>. And, see this for
information about the XPath path language:</p>
<ul class="simple">
<li><a class="reference external" href="http://www.w3.org/TR/2014/REC-xpath-30-20140408/#unabbrev">http://www.w3.org/TR/2014/REC-xpath-30-20140408/#unabbrev</a></li>
<li><a class="reference external" href="http://www.w3.org/TR/2014/REC-xpath-30-20140408/#abbrev">http://www.w3.org/TR/2014/REC-xpath-30-20140408/#abbrev</a></li>
</ul>
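<p>For comparison, here is a small sketch that runs both search styles
against the same tree. The document and its <tt class="docutils literal">Name</tt> values are
invented for illustration:</p>

```python
from lxml import objectify

root = objectify.fromstring(
    '<root><group Name="g1"><dataset Name="d1">1</dataset>'
    '<dataset Name="d2">2</dataset></group></root>')

# XPath: all Name attributes, in document order.
names = root.xpath('.//@Name')

# XPath: select an element by attribute value.
second = root.xpath('.//dataset[@Name="d2"]')[0]

# ObjectPath: dotted path starting at the root tag, with an optional index.
path = objectify.ObjectPath('root.group.dataset[1]')
via_path = path(root)
exists = path.hasattr(root)
missing = objectify.ObjectPath('root.group.nothing').hasattr(root)

print(list(names), second.pyval, via_path.pyval, exists, missing)
```

<p>ObjectPath covers simple child navigation only. Anything that needs
predicates or attribute axes, as in the first two queries, still calls
for XPath.</p>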
</div>
</div>
<div class="section" id="sample-applications-with-lxml-objectify">
<h1><a class="toc-backref" href="#id13">4 Sample applications with lxml.objectify</a></h1>
<ol class="arabic">
<li><p class="first">Here is a sample application that parses and displays weather
information from an XML document: <a class="reference external" href="Objectify_files/weather_test.py">weather_test.py</a>.</p>
</li>
<li><p class="first">This sample application picks data out of an XML document that
was generated with <tt class="docutils literal">h5dump</tt>. For example:</p>
<pre class="literal-block">
$ h5dump -x my_data.hdf5 > my_data.hdf5.xml
</pre>
<p>This sample application attempts to create a new hdf5 data file from that XML
document. The code is here:
<a class="reference external" href="Objectify_files/obj_hdf_xml.py">obj_hdf_xml.py</a></p>
<p>Here is more information about HDF5:</p>
<ul class="simple">
<li><a class="reference external" href="http://www.hdfgroup.org/">http://www.hdfgroup.org/</a></li>
<li><a class="reference external" href="http://docs.h5py.org/en/latest/index.html">http://docs.h5py.org/en/latest/index.html</a> -- HDF5 for Python</li>
</ul>
</li>
<li><p class="first">Here are several small applications that pick data out of files