This is the Dilbert Strip Index (finder_dsi), used by the Dilbert Strip Finder at http://www.bfmartin.ca/finder.
It contains:
- The source data in JSON format, used to index the content of the Dilbert comic strips so that a full-text search can be performed.
- Some programs written in Ruby to load the data into a MariaDB database and test the validity of the data, plus some command-line utilities to report on and maintain the data.
You can reach me at http://www.bfmartin.ca/contact
This project is dedicated to all fans of Dilbert and to Scott Adams in particular for creating the comic strip.
The JSON data files (dsistrips.json and dsibooks.json) are copyright 1999 - 2017 by Byron F. Martin. They are licensed under the Creative Commons Attribution 2.5 Canada license. The short version is that you can do whatever you want with these files as long as you give me credit for creating them. A link to my site will be good, though it's not required.
The program files (Ruby and Rakefile files) that were written by me (which is everything except the bundled module lib/dsi/stem.rb) are donated to the public domain. Do whatever you want with them.
A bundled Ruby module for word stemming (lib/dsi/stem.rb) is copyright by its author(s). Please refer to its documentation for its license information. It was obtained from http://tartarus.org/~martin/PorterStemmer/ruby.txt.
All dates are in the format YYYY-MM-DD.
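As a minimal sketch of handling this date format in Ruby, the snippet below parses a YYYY-MM-DD string with the standard `date` library. The helper name `parse_dsi_date` is hypothetical and is not part of the bundled programs.

```ruby
require 'date'

# Hypothetical helper (not part of this package): parse a YYYY-MM-DD
# string into a Date. Date.strptime raises ArgumentError (or a subclass)
# when the text does not match the format.
def parse_dsi_date(text)
  Date.strptime(text, '%Y-%m-%d')
end

date = parse_dsi_date('2014-03-09')
puts date.to_s  # prints the date back in YYYY-MM-DD form
puts date.wday  # day of the week, where 0 means Sunday
```

Working with Date objects rather than raw strings makes it easy to check the day of the week or to step through a range of strips.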
This is the main data file (dsistrips.json), which describes each Dilbert strip. For each strip, the JSON file contains:
- Date of newspaper publication.
- Synopsis, a one or two sentence description of the strip.
- Subject (optional), the main topic.
- Keywords (optional), many words to describe the action and important concepts.
- Characters (optional), the characters with speaking parts.
- Saga (optional), to mark the beginning of three or more strips with a common theme.
- Notes (optional), to describe important items about the strip, like the first appearance of a character.
- Comment (optional), text about the strip for my own purposes. It is not meant to be displayed in search results.
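The fields above could be read from Ruby roughly as follows. This is only a sketch: the sample records, the key names (`date`, `synopsis`, `notes`) and the top-level array layout are assumptions made for illustration, so check the actual JSON file for its real structure.

```ruby
require 'json'

# ASSUMPTION: made-up sample data. The real file's layout and key names
# may differ from what is shown here.
sample_json = <<~JSON
  [
    {"date": "2001-01-01", "synopsis": "Example synopsis one.", "notes": "Example note."},
    {"date": "2001-01-02", "synopsis": "Example synopsis two."}
  ]
JSON

# Keep only the strips that carry the optional "notes" field.
def strips_with_notes(strips)
  strips.select { |strip| strip.key?('notes') }
end

strips = JSON.parse(sample_json)
strips_with_notes(strips).each do |strip|
  puts "#{strip['date']}: #{strip['notes']}"
end
```

Because most fields are optional, code that reads the data should test for a key's presence rather than assume every strip has it.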
The dialog data (dsidialog.json) is not included here but is generated from some text files available on the Internet. The programs format the data for easy searching.
This file is generated by running the command
rake dialog:prepare
This will download the necessary file(s) from their respective sites and generate dsidialog.json. See the Rakefile for details on how this happens.
NOTE: This file (dsibooks.json) is retired and is no longer maintained. You can find all strips online, so finding them in books is superfluous. The file remains, but it lists no books past 2014.
This describes each Dilbert comic collection and its contents. For each book, the file contains (all are required):
- A one-word code ID for the book.
- The book title.
- The layout of a week of strips, beginning with Sunday, and separated by commas. Most books are laid out as 1,3,3 which means:
  - Sunday on one page.
  - Three more strips on the next page.
  - Three more strips on the next page.
- A list containing the start and end dates and the page number that they start on. This is a list because sometimes strips are not in chronological order.
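To make the layout notation concrete, here is a small sketch that turns a layout string into a page offset for each day of the week. The method names are hypothetical and not taken from the bundled programs. It assumes each comma-separated number is the count of strips on one page, in order, starting with Sunday.

```ruby
# Hypothetical helper: turn a layout string such as "1,3,3" into a list
# of strips-per-page counts.
def parse_layout(layout)
  layout.split(',').map { |count| Integer(count.strip) }
end

# For each day of the week (index 0 is Sunday), return how many pages
# after the week's first page that day's strip appears.
def page_offsets(layout)
  parse_layout(layout).each_with_index.flat_map { |count, page| [page] * count }
end

p page_offsets('1,3,3')  # prints [0, 1, 1, 1, 2, 2, 2]
```

With the 1,3,3 layout, Sunday falls on the first page of the week, Monday through Wednesday on the second, and Thursday through Saturday on the third.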
If you like to play around with the Ruby programming language, there are some example programs included in this package. To make them work you will need to have the following software packages installed. Your operating system's package manager should be able to help with these.
- Ruby, at least version 2.3.
- Rake.
- The following Ruby gems:
  - json (to parse JSON files)
  - net/http (to fetch dialog file)
  - optimist (to parse command line options)
  - rake_notes (shows TODOs and FIXMEs in code)
Try:
gem install <gemname>
Some programs refer to a database. The SQL will work with MariaDB, and uses the following table definitions. The search feature relies on MariaDB's full text indexing.
There are several programs included to work with the DSI data.
If the program name begins with 'dsi', then its purpose is to maintain or report on the dsi data.
- dsi-generate-dialog.rb
  Reads raw dialog files (as downloaded from different web locations) and merges them into a standard-format JSON file.
- dsi-key.rb
  Reads the dsistrips JSON file and formats it for printing. It takes the keywords, characters, or subjects and prints a list of dates that contain each item.
- dsi-notes.rb
  Reads the dsistrips JSON file and prints all items with notes. Useful for browsing and debugging.
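The kind of report that dsi-key.rb produces (a list of dates for each keyword, character, or subject) can be sketched as an inverted index. This is not the actual implementation of dsi-key.rb. The method name `index_by`, the sample data and the key names are assumptions for illustration only.

```ruby
# ASSUMPTION: each strip is a Hash with a 'date' String and an optional
# Array under the given field name. The real data may be shaped differently.
def index_by(strips, field)
  index = Hash.new { |hash, key| hash[key] = [] }
  strips.each do |strip|
    # Array(nil) is [], so strips without the field are skipped.
    Array(strip[field]).each { |item| index[item] << strip['date'] }
  end
  index
end

# Made-up sample strips.
strips = [
  { 'date' => '2001-01-01', 'keywords' => %w[meeting coffee] },
  { 'date' => '2001-01-02', 'keywords' => %w[coffee] },
  { 'date' => '2001-01-03' }
]

index_by(strips, 'keywords').sort.each do |item, dates|
  puts "#{item}: #{dates.join(', ')}"
end
```

Passing the field name as a parameter lets the same method serve keywords, characters, or subjects.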
If the program name begins with 'finder', then it generates data to update the Finder web site.
- finder-load.rb
  Reads dsistrips and creates lines to be loaded into the bfmartin.ca database. See the database schema in the README. Also loads dialog if available; otherwise it inserts nulls.
- finder-reindex.sh
  A wrapper that does everything: it downloads all required files, reformats them, and loads the database.
If you have questions or feedback, you can reach me at http://www.bfmartin.ca/contact
You can visit the Dilbert Strip Finder web site at http://www.bfmartin.ca/finder