diff --git a/docs/wp/hdb-analysis/index.md b/docs/wp/hdb-analysis/index.md
new file mode 100644
index 000000000..95cc30121
--- /dev/null
+++ b/docs/wp/hdb-analysis/index.md
@@ -0,0 +1,828 @@
+# Historical database (HDB) analysis and maintenance
+
+The historical database (hdb) is the backbone of the vast majority of kdb+ systems. This is the location for data storage (typically at the end of the day, sometimes intraday) and for fast retrieval later on (handling system/user queries). Maintaining the integrity of the hdb is crucial to any system functioning efficiently. There are two main aspects of a hdb: the sym file and the data itself. Others have already covered the importance of the sym file (https://code.kx.com/q/wp/symfiles); this paper covers data quality and structural integrity.
+
+There are many ways that a hdb can be broken, some less obvious than others. Investigating and analysing a broken hdb is a skill that all kdb+ developers should have. This paper aims to make that task easier by providing a utility script to assist with some common scenarios. Familiarity with the hdb structure and its contents is assumed.
+
+The script itself is broken into 3 components - analysis, maintenance and warnings.
+1. Analysis interrogates and reports on issues found in the hdb.
+1. Maintenance attempts to fix any issues reported in the analysis section.
+1. Warnings reports issues that are beyond the scope of this paper to fix, or issues that might not break the hdb but will degrade performance if not rectified.
+
+The analysis and the corresponding maintenance will be covered together, followed by the warnings.
+
+Let's first create a somewhat realistic hdb using the script provided in the Kx repository: https://github.com/KxSystems/cookbook/blob/master/start/buildhdb.q.
+
+```bash
+$ curl https://raw.githubusercontent.com/KxSystems/cookbook/master/start/buildhdb.q > $QHOME/buildhdb.q
+$ q buildhdb.q
+$ cd start/db
+```
+
+and download the script itself:
+
+```bash
+$ curl https://raw.githubusercontent.com/cillianreilly/qtools/master/start/ham.q > $QHOME/ham.q
+```
+
+Let's examine the inputs to the script and some of its functionality before we get started:
+
+```q
+
+$ q ham.q -q
+
+usage: q ham.q -tables [tables] -par [partitions] -level [0-8] -dbmaint [01]
+path to hdb is mandatory. all other flags are optional and defaults are described below
+tables : tables to analyse. default is all partitioned tables in the hdb
+par    : partitions to analyse. default is all partitions in the hdb, otherwise restricts using .Q.view
+dbmaint: perform maintenance on the hdb. default is 0 (no maintenance)
+level  : level of analysis, least to most intensive. default is 6
+         0: check if specified tables exist in specified partitions
+         1: check if .d files exist in specified partitions
+         2: check if partition field (.Q.pf) exists in the .d file per partiton
+         3: check if partition field (.Q.pf) exists on disk per partition
+         4: check if all columns in the .d file exist in the same partition
+         5: check if all columns from the latest partition exist in each partition
+         6: check if the order of columns per partition matches that of the latest partition
+         7: check if the column types per partition match that of the latest partition
+         8: check if column counts are consistent across columns per partition
+wlevel : level of warnings, least to most intensive. default is 2
+         0: check if enumeration files exist in the hdb root e.g. sym
+         1: check if all columns on disk exist in the .d file of the same partition
+         2: check if all column attributes match those of the latest partition
+         2: check if all foreign keys match those of the latest partition
+```
+
+Some of the utility functions exposed in the script are described here.
Given a table name as input, `paths` and `dotd` return all paths to the table folder on disk and all paths to the `.d` files respectively. `lastpath` and `lastdotd` return the last of these values respectively. The last partition is important, as this is the partition that kdb+ uses to build metadata from. It is assumed that the last partition is correct.
+
+```q
+q)paths`trade
+`:./2013.05.01/trade`:./2013.05.02/trade`:./2013.05.03/trade`:./2013.05.06..
+q)dotd`trade
+`:./2013.05.01/trade/.d`:./2013.05.02/trade/.d`:./2013.05.03/trade/.d`:./2..
+q)lastpath`trade
+`:./2013.05.31/trade
+q)lastdotd`trade
+`:./2013.05.31/trade/.d
+```
+
+The `init` function runs the script. It calls several functions, in order:
+* `lh` - load hdb; attempts to load the specified hdb and prints some general information if successful. Exits on failure.
+* `pa` - parse arguments; parses command line arguments and supplies default values, which are documented above.
+* `rp` - restrict partitions; attempts to restrict the hdb view using `.Q.view`, but always includes the last partition found on disk. Exits on failure.
+* `ra` - run analysis; runs each of the `al?` functions found and populates the analysis results table (`ar`).
+* `rw` - run warnings; runs each of the `wl?` functions found and populates the warning results table (`wr`).
+* `ld` - load dbmaint; attempts to load `dbmaint.q`. Only called if the `dbmaint` flag is true. Exits on failure.
+* `rm` - run maintenance; details below.
+
+Most of the functions above are simple to follow, but `rm` deserves more detail. The `rm` function takes a single argument, the analysis results table `ar`. It first finds the lowest level of failure in the results table (exiting early if there is none), then re-runs the analysis at that level. This is done because maintenance at a lower level may already have resolved the issue at a higher level, in which case there is no work to be done.
If the issue persists, it runs the appropriate maintenance function. Finally, it re-runs the analysis at that level to confirm whether the maintenance was successful. We iterate `rm` over the analysis results table; the iteration halts when either:
+
+* there are no more failures to be rectified, or
+* maintenance has failed for a given level, and the analysis function returns the same results as in the initial investigation
+
+## Analysis and maintenance
+
+### Level 0
+
+Level 0 starts with the basics: does the table exist in the partition in question? A missing table manifests itself as a 'No such file or directory' error for the first column listed in the `.d` file of the latest partition.
+
+```q
+$ rm -r 2013.05.01/trade
+$ q .
+q)select from trade where date=2013.05.01
+'./2013.05.01/trade/sym. OS reports: No such file or directory
+  [0]  select from trade where date=2013.05.01
+       ^
+```
+
+We can investigate the existence of files and folders from within kdb+ using the `key` keyword: https://code.kx.com/q/ref/key. For a folder, `key` returns the names of its contents if the folder exists, or an empty general list otherwise. Counting the result is sufficient to tell us whether the folder exists. We wrap this into an `exists` helper function. `al0` (analysis level 0) takes a list of table names, and returns a dictionary of tables vs partitions that the table is missing from, if any. We index the variable `.Q.pv` to find the affected partitions: https://code.kx.com/q/ref/dotq/#pv-modified-partition-values
+```q
+
+exists:0