update docs and README.md
bicycle1885 committed Nov 12, 2016
1 parent b033293 commit df4ab73
Showing 3 changed files with 26 additions and 43 deletions.
41 changes: 8 additions & 33 deletions README.md
@@ -1,6 +1,7 @@
<p align="center"><img src="/docs/EzXML.jl.png" alt="EzXML.jl Logo" width="250" /></p>

# EzXML.jl - XML/HTML tools for primates
EzXML.jl - XML/HTML tools for primates
======================================

[![Docs Latest][docs-latest-img]][docs-latest-url]
[![Build Status][build-latest-img]][build-latest-url]
@@ -58,34 +59,8 @@ for species_name in content.(find(primates, "//species/text()"))
end
```


## Core concepts

The main types exported from this package are `Document` and `Node`. `Document`
represents an entire XML/HTML document, and `Node` objects are its components.
Everything in an XML/HTML tree is a `Node` object: document, element, text,
attribute, comment, and so on. An object of the `Document` type is a thin wrapper
around a document node of the `Node` type. This design keeps the interfaces simple
because tree-traversal functions always return `Node` objects. In addition, the
type stability of this design may enable the Julia compiler to generate faster code.

In this package, a `Node` object is regarded as a container of its child nodes.
This idea is reflected in function names; for example, the function returning the
first child node is named `firstnode` instead of `firstchildnode` because
it is apparent that we are interested in child nodes. If the user is interested
in a particular type of node, such as element nodes, functions like `firstelement`
are provided.

Internally, a `Node` object is a proxy object for a node-like struct allocated by
the libxml2 library. The node-like struct in turn holds a pointer to
Julia's `Node` object, which makes it possible to recover a unique proxy object
from the C struct. Therefore, two `Node` objects pointing to the same node in an
XML/HTML document are identical even if they were obtained in different ways. A
`Node` object also keeps a reference to an owner node that is responsible for
releasing the memory of its nodes.


## Quick reference
Quick reference
---------------

Types:
* `Document`: an XML/HTML document
@@ -181,16 +156,16 @@ Queries:
* `findfirst(doc|node, xpath)`: find the first matching node.
* `findlast(doc|node, xpath)`: find the last matching node.


## Examples
Examples
--------

* [primates.jl](/example/primates.jl): "primates" example shown above.
* [julia2xml.jl](/example/julia2xml.jl): convert a Julia expression to XML.
* [issues.jl](/example/issues.jl): list latest issues of the Julia repository.
* [graphml.jl](/example/graphml.jl): read a GraphML file with streaming reader.


## Other XML/HTML packages in Julia
Other XML/HTML packages in Julia
--------------------------------

* [LightXML.jl](https://github.com/JuliaIO/LightXML.jl)
* [LibExpat.jl](https://github.com/amitmurthy/LibExpat.jl)
11 changes: 5 additions & 6 deletions docs/src/index.md
@@ -2,12 +2,11 @@ Home
====

EzXML.jl is a package for handling XML and HTML documents. The APIs are simple
and support a range of functionalities including:
* Traversing XML/HTML documents with
[DOM](https://en.wikipedia.org/wiki/Document_Object_Model)-like interfaces.
* Searching elements using [XPath](https://en.wikipedia.org/wiki/XPath).
* Handling [XML namespaces](https://en.wikipedia.org/wiki/XML_namespace).
* Parsing with [streaming APIs](http://xmlsoft.org/xmlreader.html).
and consistent, and provide a range of functionalities including:
* Traversing XML/HTML documents with DOM-like interfaces.
* Properly handling XML namespaces.
* Searching elements using XPath.
* Parsing large files with streaming APIs.
* Automatic memory management.

Here is an example of parsing and traversing an XML document:
17 changes: 13 additions & 4 deletions docs/src/manual.md
@@ -147,14 +147,23 @@ julia> println(genus[1]) # The "genus" element has been updated.
```

In this package, a `Node` object is regarded as a container of its child nodes.
This idea is reflected in function names; for example, the function returning the
first child node is named `firstnode` instead of `firstchildnode`. All
functions provided by the `EzXML` module are named in this way, and tree
traversal functions work on child nodes by default. Functions with a
direction prefix work in that direction; for example, `nextnode` returns the
next sibling node and `parentnode` returns the parent node.
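
The naming convention can be illustrated with a minimal sketch. The XML snippet is made up for illustration, and `parsexml`, `root`, `istext`, and `iselement` are assumed to be available from EzXML.jl alongside the traversal functions named above. It also assumes the parser keeps whitespace-only text nodes, so the first child *node* of the root differs from its first child *element*:

```julia
using EzXML

# Hypothetical sample document; the indentation creates whitespace text nodes.
doc = parsexml("""
<primates>
    <genus name="Homo"/>
    <genus name="Pan"/>
</primates>
""")
primates = root(doc)

# No direction prefix: these work on the child nodes of `primates`.
first_child = firstnode(primates)     # any node type; here the whitespace text node
first_elem  = firstelement(primates)  # element nodes only; here the first <genus>

println(istext(first_child))
println(iselement(first_elem))

# Direction prefixes: `next` moves to a sibling, `parent` moves up.
println(nextnode(first_child) == first_elem)
println(parentnode(first_elem) == primates)
```

Under the stated assumptions, the text node returned by `firstnode` is immediately followed by the element returned by `firstelement`, which is why the two functions give different answers on indented documents.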

The distinction between nodes and elements is something every user should
understand before using DOM APIs. There are good explanations of this topic:
<http://www.w3schools.com/xml/dom_nodes.asp>,
<http://stackoverflow.com/questions/132564/whats-the-difference-between-an-element-and-a-node-in-xml>.
Once you understand it, the tree traversal functions of EzXML.jl should feel quite
natural. For example, `hasnode(<parent node>)` checks if a (parent) node has one or
more child *nodes* while `haselement(<parent node>)` checks if a (parent) node
has one or more child *elements*. All functions are also named in this way:
Some functions have a suffix like `node` or `element` that indicates the node
type the function is interested in. For example, `hasnode(<parent node>)` checks
if a (parent) node has one or more child *nodes* while `haselement(<parent
node>)` checks if a (parent) node has one or more child *elements*. All
functions are also named in this way:
```jlcon
julia> hasnode(primates) # `primates` contains child nodes?
true