C XML Parser and Query

A simple C program for parsing XML files (with partial validation - see below) and querying them.

This is still very much a work in progress: parsing is not 100% complete and querying is slowly getting underway. It also means the git history will most likely be messy (random commits to store progress, the occasional commit with broken code).

Setup

Nix

For Nix users, there is a flake.nix for setting up a dev-shell.

A plain nix develop should just work; alternatively, use direnv.

Others

GNU Make and a C compiler (clang, gcc, ...) should be enough; make sure $CC is set. If you use clang, set ASAN_SYMBOLIZER_PATH to the path printed by "which llvm-symbolizer" (llvm-symbolizer needs to be installed, usually via libllvm).

Building

Debug

$ make debug
$ ./debug/cxpq_debug ./examples/bookstore.xml

Release

$ make release
$ ./release/cxpq ./examples/bookstore.xml

Features/non-features

  • Parsing
    • Prolog
    • DTD
    • (Root) Node
      • Attributes
      • Namespaces
      • Content
      • CDATA (parsed as text content, with a flag)
    • Comments
    • Processing Instructions
  • Validation (should be reworked and put behind a flag)
    • Valid file (no unclosed tags, comments, etc.) - should be done?
    • Valid Prolog
    • Valid Root Node name
    • Valid Node names (partially; the ban on names starting with "xml" is missing)
    • Namespaces (currently they just get included in the node name)
    • Valid document (has prolog & a single root node)
  • Querying
    • XPath?
    • Custom query language (CSS selector like)
  • Tests
    • Individual node parsers
    • Entire file parser
    • Validator (after rework)
    • Query parsing
    • Query execution
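
The missing "xml" ban noted under Validation refers to the XML rule that names beginning with "xml" (in any letter case) are reserved. A minimal sketch of how such a check could look is given here; it is not the project's actual validator. The function names is_valid_node_name and has_reserved_xml_prefix are hypothetical, and the character classes are simplified to ASCII (the XML specification allows many more Unicode characters).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the name starts with "xml" in any letter case.
 * Short-circuit evaluation stops at the terminating '\0' for short names. */
static bool has_reserved_xml_prefix(const char *name) {
    return tolower((unsigned char)name[0]) == 'x'
        && tolower((unsigned char)name[1]) == 'm'
        && tolower((unsigned char)name[2]) == 'l';
}

/* ASCII-only simplification of the XML NameStartChar class. */
static bool is_name_start_char(unsigned char c) {
    return isalpha(c) || c == '_' || c == ':';
}

/* ASCII-only simplification of the XML NameChar class. */
static bool is_name_char(unsigned char c) {
    return is_name_start_char(c) || isdigit(c) || c == '-' || c == '.';
}

/* Hypothetical validator: rejects NULL, empty names, bad characters,
 * and names using the reserved "xml" prefix. */
bool is_valid_node_name(const char *name) {
    if (name == NULL || !is_name_start_char((unsigned char)name[0])) {
        return false;
    }
    for (size_t i = 1; name[i] != '\0'; i++) {
        if (!is_name_char((unsigned char)name[i])) {
            return false;
        }
    }
    return !has_reserved_xml_prefix(name);
}
```

Under these assumptions, "book" and "_a-1.b" pass, while "1book", "bad name" and "XmLthing" are rejected.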

Example usages

Parse an XML document and print it back:

$ cxpq ./examples/valid/bookstore.xml

Query all books anywhere in the xml document:

$ cxpq -Q xpath --query "//book" ./examples/valid/bookstore.xml
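
Conceptually, a query like "//book" is a depth-first walk that collects every element with a matching name. The sketch below illustrates that idea only; the Node type and find_descendants function are hypothetical and are not taken from the cxpq sources.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal tree node; the real parser's node type may differ. */
typedef struct Node {
    const char *name;
    struct Node **children;
    size_t child_count;
} Node;

/* Depth-first walk over `node` and all of its descendants.
 * Stores up to `cap` matches in `out` (in document order) and returns
 * the running total of matches, starting from `found`. */
size_t find_descendants(const Node *node, const char *name,
                        const Node **out, size_t cap, size_t found) {
    if (node == NULL) {
        return found;
    }
    if (strcmp(node->name, name) == 0) {
        if (found < cap) {
            out[found] = node;
        }
        found++;
    }
    for (size_t i = 0; i < node->child_count; i++) {
        found = find_descendants(node->children[i], name, out, cap, found);
    }
    return found;
}
```

Calling find_descendants(root, "book", out, cap, 0) on the root element would fill out with the matching nodes; a return value larger than cap signals that the buffer was too small.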

Select all tags anywhere in the xml document:

$ cxpq -Q xpath --query "*" ./examples/valid/bookstore.xml

Complex selector with a sub-query and function filtering: selects the titles of books that have more than one price tag and are inside a bookstore tag:

$ cxpq -Q xpath --query "//bookstore//book[price/position() > 1]/title" ./examples/valid/bookstore.xml
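
The predicate [price/position() > 1] holds when a book has a price child at position 2 or later, that is, at least two price children. A rough sketch of that filter, using a hypothetical Elem type and helper names that are not part of cxpq:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal element type, for illustration only. */
typedef struct Elem {
    const char *name;
    struct Elem **children;
    size_t child_count;
} Elem;

/* Counts the direct children of `e` whose name equals `name`. */
size_t count_children_named(const Elem *e, const char *name) {
    size_t count = 0;
    for (size_t i = 0; i < e->child_count; i++) {
        if (strcmp(e->children[i]->name, name) == 0) {
            count++;
        }
    }
    return count;
}

/* Mirrors the predicate: true when a second "price" child exists. */
bool has_multiple_prices(const Elem *book) {
    return count_children_named(book, "price") > 1;
}
```

A full evaluation of the query would apply this filter to every book found under bookstore and then select the title children of the books that pass.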