
Overview

bseddon edited this page Nov 10, 2016 · 5 revisions

There are three major components provided by this project. One provides the ability to read an existing taxonomy and access its content in a structured way. Another provides the ability to load an instance document, relate that document to its respective taxonomy and access the values it contains. These two components correspond to the two fundamental features of XBRL: the taxonomy and an instance document. The third component allows simple reports to be produced from an instance document and its related taxonomy. Other pages in this wiki show how to use all these components.

These three components are implemented as classes, each in its own file: XBRL, XBRL_Instance and XBRL_Report.

Performance

On one hand it's great that taxonomies are defined using XML. This allows taxonomy authors to define a great deal of semantic information in a way that is structured and which can be read by any application. The disadvantage is that taxonomies are often huge XML documents that take an appreciable amount of time to parse. The resulting in-memory document is normalized but not indexed. This means it is relatively slow to load, parse and use taxonomy information, which gets in the way of using XBRL in a [non regulatory] reporting scenario.

Compiling

The approach taken by this project to address the performance problems of using XBRL taxonomy documents is to 'compile' taxonomies. This process reads the taxonomy information then stores it in indexed arrays (dictionaries) so key information can always be retrieved quickly. Where appropriate, such as with presentation and definition arc relationships, nested arrays are used to build a tree of nodes.
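As a rough illustration of the idea, the sketch below turns a flat list of parent/child arcs into a nested array of nodes. It is not the project's actual compiler: the `buildTree` function, the arc array shape and the element ids are all hypothetical, and it assumes the arcs contain no cycles.

```php
<?php
// Hypothetical input: flat parent/child arcs such as might be read from a
// presentation linkbase. The ids and array shapes are illustrative only.
$arcs = [
    ['from' => 'BalanceSheet', 'to' => 'Assets'],
    ['from' => 'BalanceSheet', 'to' => 'Liabilities'],
    ['from' => 'Assets',       'to' => 'Cash'],
];

/**
 * Build a nested array of nodes rooted at $root (assumes no cycles).
 */
function buildTree(string $root, array $arcs): array
{
    // Index the arcs by parent once so finding children is a key lookup.
    $childrenOf = [];
    foreach ($arcs as $arc) {
        $childrenOf[$arc['from']][] = $arc['to'];
    }

    // Recursively expand one node into a nested array keyed by child id.
    $expand = function (string $id) use (&$expand, $childrenOf): array {
        $node = ['id' => $id, 'children' => []];
        foreach ($childrenOf[$id] ?? [] as $childId) {
            $node['children'][$childId] = $expand($childId);
        }
        return $node;
    };

    return $expand($root);
}

$tree = buildTree('BalanceSheet', $arcs);
echo implode(', ', array_keys($tree['children'])), PHP_EOL;
```

Keying each level by id means a known path such as `$tree['children']['Assets']['children']['Cash']` can be reached without scanning siblings.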

The generated arrays are often highly de-normalized to reduce the number of lookups required. For example, taxonomy element information is recorded with every node in the trees generated from a taxonomy's presentation linkbase. Nodes in the tree that represent primary items are identified and associated with their respective hypercube(s), even though this means replicating the hypercube information multiple times.
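A minimal sketch of this kind of de-normalization is shown below. The array names (`$elements`, `$hypercubesOf`, `$nodes`) and their contents are assumptions made for illustration and do not reflect the project's real data layout.

```php
<?php
// Hypothetical normalized data: element definitions and the hypercubes
// each primary item belongs to, stored separately from the tree nodes.
$elements = [
    'Assets' => ['type' => 'monetaryItemType', 'periodType' => 'instant'],
    'Cash'   => ['type' => 'monetaryItemType', 'periodType' => 'instant'],
];
$hypercubesOf = [
    'Cash' => ['BalanceSheetTable'],
];
$nodeIds = ['Assets', 'Cash'];

// De-normalize: copy the element details and hypercube list into each node
// so a consumer needs only one lookup per node.
$nodes = [];
foreach ($nodeIds as $id) {
    $nodes[$id] = [
        'id'         => $id,
        'element'    => $elements[$id],
        'hypercubes' => $hypercubesOf[$id] ?? [],
    ];
}

echo $nodes['Cash']['element']['periodType'], PHP_EOL;
```

The cost is duplicated data in memory and on disk; the benefit is that a report renderer never has to join a node back to the element or hypercube tables.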

Where appropriate, additional indexes are added to make lookups faster. For example, it may be desirable to place instance document values in a node of a presentation tree. Traversing the presentation linkbase tree to find specific nodes by label is not practical for real world cases because an unindexed search may have to visit every node, and the number of nodes can grow exponentially with the depth of the tree. By adding an index for each tree, the required node can be identified and the path to the node in the tree retrieved and used. Any one value may appear one or more times in the tree generated from the presentation linkbase arcs, and these indexes accommodate this possibility.
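One way such an index could look is sketched here: a single walk over the tree records, for each label, every path at which it occurs. The `buildPathIndex` function and the tree shape are hypothetical and only illustrate the technique.

```php
<?php
// Hypothetical presentation tree in which 'Cash' appears under two parents.
$tree = [
    'label' => 'BalanceSheet',
    'children' => [
        ['label' => 'Assets', 'children' => [
            ['label' => 'Cash', 'children' => []],
        ]],
        ['label' => 'Notes', 'children' => [
            ['label' => 'Cash', 'children' => []],
        ]],
    ],
];

/**
 * Walk the tree once and map each label to the list of paths where it occurs.
 */
function buildPathIndex(array $node, array $path = [], array &$index = []): array
{
    $path[] = $node['label'];
    $index[$node['label']][] = $path;   // a label may occur more than once
    foreach ($node['children'] as $child) {
        buildPathIndex($child, $path, $index);
    }
    return $index;
}

$index = buildPathIndex($tree);
foreach ($index['Cash'] as $path) {
    echo implode(' > ', $path), PHP_EOL;
}
```

After the one-off walk, finding every location of a label is a single array access, and each stored path can be followed directly to the node.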

The resulting set of in-memory arrays is then persisted to a file in JSON format. It turns out that PHP is very efficient at reconstructing the in-memory, indexed arrays (dictionaries) from the persisted JSON. On a 1.7GHz single core laptop the compiled representation of the US GAAP 2015 taxonomy takes only a few hundred milliseconds to load.
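The round trip can be sketched with PHP's built-in `json_encode` and `json_decode`. The `$compiled` array below is a tiny made-up stand-in for a compiled taxonomy, and the temporary file is an assumption for the example; the project's own file layout and loading code may differ.

```php
<?php
// Hypothetical compiled data; a real compiled taxonomy is far larger.
$compiled = [
    'elements'  => ['Cash' => ['periodType' => 'instant']],
    'pathIndex' => ['Cash' => [['BalanceSheet', 'Assets', 'Cash']]],
];

// Persist the arrays as JSON.
$file = tempnam(sys_get_temp_dir(), 'taxonomy_');
file_put_contents($file, json_encode($compiled));

// Reload: passing true makes json_decode return associative arrays,
// so the indexed structure is available again without any XML parsing.
$start = microtime(true);
$loaded = json_decode(file_get_contents($file), true);
$elapsedMs = (microtime(true) - $start) * 1000;
unlink($file);

printf("Loaded %d top-level keys in %.3f ms\n", count($loaded), $elapsedMs);
```

Timing the load this way on a real compiled taxonomy is a simple way to check the figure quoted above on your own hardware.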

You can find more information about how to compile a taxonomy and how to use a compiled taxonomy in the examples section.

XBRL class

This is an abstract class that exposes all the features of a taxonomy. For example, class properties allow a caller to access element definitions, and presentation and definition roles. The class transforms the presentation linkbase arcs and locators into a hierarchy that can be navigated quickly. It also transforms the definition linkbase arcs and locators into hypercubes, dimensions and primary items (elements linked to one or more hypercubes). Dimensions, their members and elements are made available as a hierarchy.

It also provides support for extension taxonomies which are routinely used by US corporations submitting returns to the SEC.

The source code includes concrete descendant implementations of the XBRL class to represent US GAAP (2014/2015), UK GAAP, UK Audit Exempt and UK FRC taxonomies. Among the things a descendant taxonomy implementation will typically define is the set of XML namespaces used by the taxonomy. This allows the code to associate a specific concrete XBRL implementation with a taxonomy schema document.
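The general pattern of matching a namespace to a concrete class can be sketched as follows. Everything here is hypothetical: `TaxonomyBase`, `ExampleGaapTaxonomy`, `classForNamespace` and the namespace URI are invented for illustration and are not the names or mechanism used by the XBRL class itself.

```php
<?php
// Hypothetical base class: each descendant declares the namespaces it handles.
abstract class TaxonomyBase
{
    /** @return string[] */
    abstract public static function namespaces(): array;
}

// Hypothetical concrete taxonomy with a made-up namespace URI.
final class ExampleGaapTaxonomy extends TaxonomyBase
{
    public static function namespaces(): array
    {
        return ['http://example.com/taxonomy/gaap/2015'];
    }
}

/**
 * Return the first class that claims the namespace, or null if none does.
 */
function classForNamespace(string $namespace, array $classes): ?string
{
    foreach ($classes as $class) {
        if (in_array($namespace, $class::namespaces(), true)) {
            return $class;
        }
    }
    return null;
}

$known = [ExampleGaapTaxonomy::class];
echo classForNamespace('http://example.com/taxonomy/gaap/2015', $known), PHP_EOL;
```

With a lookup like this, the target namespace read from a schema document is enough to decide which concrete implementation to instantiate.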

XBRL Instance

The XBRL_Instance class makes it easy to retrieve element values and contexts. It will also automatically associate the instance document with the appropriate taxonomy (XBRL class implementation) based on the 'schemaLocation' attribute value.

The XBRL_Instance class also implements two helper classes that can be used to filter and access context and element information. For example, it may be helpful to find the range of years covered by the contexts, or to retrieve only those elements that are assigned to contexts that do not include segment information.
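The two filtering tasks mentioned above can be illustrated on plain arrays. This sketch does not use the helper classes themselves, whose API is not shown on this page; the `$contexts` and `$facts` shapes are assumptions made for the example.

```php
<?php
// Hypothetical contexts keyed by id; 'segment' is null when none is present.
$contexts = [
    'c1' => ['end' => '2014-12-31', 'segment' => null],
    'c2' => ['end' => '2015-12-31', 'segment' => null],
    'c3' => ['end' => '2015-12-31', 'segment' => ['dimension' => 'RegionAxis', 'member' => 'EuropeMember']],
];
// Hypothetical element values, each referring to a context id.
$facts = [
    ['element' => 'Cash', 'contextRef' => 'c2', 'value' => 100],
    ['element' => 'Cash', 'contextRef' => 'c3', 'value' => 40],
];

// Range of years covered by the contexts.
$years = array_map(function (array $c): int {
    return (int) substr($c['end'], 0, 4);
}, $contexts);
$yearRange = [min($years), max($years)];

// Contexts that carry no segment information.
$withoutSegments = array_filter($contexts, function (array $c): bool {
    return $c['segment'] === null;
});

// Only the facts assigned to those contexts.
$plainFacts = array_values(array_filter($facts, function (array $f) use ($withoutSegments): bool {
    return isset($withoutSegments[$f['contextRef']]);
}));

echo $yearRange[0], '-', $yearRange[1], ': ', count($plainFacts), ' fact(s) without segments', PHP_EOL;
```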

XBRL_Report_Base

This abstract class provides features that make it simple to take an instance document and produce a basic HTML report of the values in that document, based on the sections defined by the roles in the presentation linkbase of the respective taxonomy.

XBRL_Report is a concrete implementation that creates a report for a single instance document. The report has columns for each financial year covered by the contexts. It will display segment information within the report as appropriate.

XBRL_Report_Compare is another concrete implementation that is able to take two or more instance documents, as long as each shares the same base taxonomy, and present a report comparing the values in those documents. For example, a report might be created from the instance documents of two companies.
