FreeDict HOWTO – FreeDict Build System
The build system is based on make and is used to build/convert, validate and distribute dictionaries. It is the common entry point for most of the tools used within FreeDict.
A dictionary is usually kept in our git repository. For a release, the build system
is instructed to convert the dictionaries into the available output formats and
to create compressed archives, which can then be moved to their final destination.
An exception to this procedure are auto-imported dictionaries. These are usually
not in a git repository, but at a different location without version control.
After these dictionaries have been generated, the make build system is used as described.
A strength of FreeDict is its support for different dictionary platforms. Once a dictionary is available in TEI format, it can be converted to many other formats: for use with dictionary applications or spell checkers (which only take the headwords or translation equivalents), for printing a book using XSL-FO, and so on.
This is enabled by two factors. First, XML is purposely very flexible. Second, the tools for converting the TEI files are kept in one place (the tools module) and are shared between the dictionary modules.
The subsequent sections explain the most relevant aspects of the build system, starting with the general structure, followed by the Makefiles involved and their usage, and ending with the API generation.
All paths are relative to the tools directory, if not stated otherwise.
You should have a local copy of the tools repository, for instance, by cloning it:
git clone https://github.com/freedict/tools
The path to this directory must not contain spaces; this is a restriction imposed by Make.
FreeDict's build system and its scripts need to find their files, which are located in the
tools directory. This is done with the FREEDICT_TOOLS environment variable, which
should be set and point to the tools directory.
On UNIX-like systems, it is enough to export the variable in the shell configuration:
export FREEDICT_TOOLS=/path/...
On Windows, the environment variable must be set in the system settings. Since the
approach changes over time, it is best to search for the exact steps on the internet.
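The two requirements on FREEDICT_TOOLS (it is set, and its path contains no spaces) can be checked before running make. The following is only a sketch: the function name check_tools_dir is hypothetical and not part of the FreeDict tools.

```python
import os
from pathlib import Path


def check_tools_dir(environ=None):
    """Return a list of problems with FREEDICT_TOOLS; an empty list means it looks usable."""
    if environ is None:
        environ = os.environ
    value = environ.get("FREEDICT_TOOLS")
    if not value:
        return ["FREEDICT_TOOLS is not set"]
    problems = []
    # Make splits on whitespace, so such a path would break the build.
    if any(ch.isspace() for ch in value):
        problems.append("path contains whitespace")
    if not Path(value).is_dir():
        problems.append("path is not an existing directory")
    return problems
```

Calling check_tools_dir() without arguments inspects the current environment; passing a dictionary makes the check easy to try out.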
Some steps for converting dictionaries (and for managing them) require Python. To make this process painless, our build system will assist you in setting up the environment. Before you start, you should make sure that the following packages are installed:
- Python >= 3.4
- libicu-dev
- python3-dev
(these are the package names on Debian and derived distributions).
Afterwards, you can execute the mk_venv rule from the root directory of the
tools repository. A virtual environment (venv) is Python's way of installing
libraries and programs locally without affecting the system-wide installation.
If you want to understand how this works and what this command does, run
make mk_venv-help
and read the excellent tutorial at
https://developer.akamai.com/blog/2017/06/21/building-virtual-python-environment/.
For brevity, only the command for installing the virtual environment to the
directory ../fd_venv is given below:
make mk_venv P=../fd_venv
Note: If the creation of your virtual environment fails with a Python
traceback ending in
FileNotFoundError: [Errno 2] No such file or directory: 'icu-config'
you need to install the libicu headers. On Debian/Ubuntu, execute:
sudo apt install libicu-dev
The tools directory contains, among other things:
- XSL style sheets for conversion into other formats
- the mk directory with the heart of the make-based build system
- importer scripts, which import dictionaries into FreeDict
- the API generator
- and much more
The file mk/dicts.mk provides all the rules for building a dictionary and is included by the dictionary Makefile. It works on exactly one dictionary and implements all the logic for the conversion process. A minimal Makefile for a dictionary usually looks like this:
# Fallback if the variable is unset. The comment sits on its own line because
# Make would otherwise keep the whitespace before it as part of the value.
FREEDICT_TOOLS ?= ../../tools
DISTFILES = AUTHORS ChangeLog COPYING lg1-lg2.tei \
    freedict-P5.xml freedict-P5.rng freedict-P5.dtd freedict-dictionary.css \
    INSTALL Makefile NEWS README
include $(FREEDICT_TOOLS)/mk/dicts.mk
The first assignment sets the fallback for the FREEDICT_TOOLS variable. As
said, it is better to have this variable set globally on the system.
The DISTFILES variable lists all the files which should be distributed when building a
release archive. The contents may vary. Most of the dictionaries follow GNU
conventions and ship files like COPYING, AUTHORS, etc. FreeDict only mandates a
ChangeLog, the Makefile, the dictionary (with licensing information) and some
XML schemas.
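Since a release archive can only contain files that exist, it can help to compare DISTFILES against the dictionary directory before building. The sketch below is illustrative only: parse_distfiles and missing_distfiles are hypothetical helpers, not part of the build system, and the parser handles only a plain DISTFILES = ... assignment with backslash continuations.

```python
import re
from pathlib import Path


def parse_distfiles(makefile_text):
    """Extract the file names from a simple 'DISTFILES = ...' assignment."""
    # Fold backslash-newline continuations into a single logical line.
    joined = re.sub(r"\\\n", " ", makefile_text)
    for line in joined.splitlines():
        match = re.match(r"\s*DISTFILES\s*=\s*(.*)", line)
        if match:
            # Drop a trailing comment, then split on whitespace.
            return match.group(1).split("#", 1)[0].split()
    return []


def missing_distfiles(dict_dir, distfiles):
    """Return the names from distfiles that do not exist in dict_dir."""
    base = Path(dict_dir)
    return [name for name in distfiles if not (base / name).is_file()]
```

A typical use would be missing_distfiles(".", parse_distfiles(Path("Makefile").read_text())) from within a dictionary directory.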
mk/dicts.mk provides support for the following targets (as well as some
more internally used targets). If you want a quick yet more extensive overview,
just type make help.
The default target converts the TEI XML source into the supported output
formats. Please run make list-platforms for a list of supported output formats.
Updating all the pieces of a TEI header for a new release can be tedious. The changelog rule assists by updating the date, edition, extent, copyright year and change information. For the changelog entry, an editor is opened. The edition has to be given on the command line, for instance as:
make E=1.8.2 changelog
Please note that this rule requires the value user_name and optionally
full_name from the FreeDict configuration. Please see the section on
how to create a FreeDict configuration for more details.
A help screen for this rule can be obtained using make changelog-help.
This removes all non-source files that were generated while building anything in the dictionary module.
This builds a release and deploys it to the place where releases should go,
which the make system determines for you. It requires a
FreeDict configuration. If you want to deploy a release again, use make FORCE=y deploy.
Note: After the deployment, you should use make api to generate a new API file.
This lists all supported output formats / platforms.
Install the dictionary to the local file system. The variables DESTDIR and
PREFIX can be used to control the destination.
List all available platforms.
This runs all quality assurance helpers of FreeDict. This is a strongly
advised step before a new release of a dictionary.
This puts a release file for the specified platform into the corresponding directory below $(BUILD_DIR), usually ../build.
Example: make release-dictd
This tries to find duplicated entries or empty XML nodes and removes them. Afterwards, a human-readable diff of the changes is presented to the user.
This target is used to check the TEI XML file against the FreeDict RNG schema. It is used to spot errors in the dictionary and should be used by each dictionary maintainer, to make sure that their dictionaries adhere to the rules.
Output the current version of the dictionary.
This file is included by the top-level Makefile of the FreeDict repository and
provides convenience functionality for all dictionaries at once. As for all
Makefiles, make help will explain most of the relevant targets.
The default target invokes a build of all dictionaries in the repository for all
available output formats. This is potentially a very time-consuming process, so
it can be parallelized. Try make -j8 if you have a system with eight CPU cores.
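If the build is started from a script, the job count can be derived from the machine instead of being hard-coded. This is a small sketch; make_command is a hypothetical helper, and only the -j option of make is assumed.

```python
import os


def make_command(jobs=None):
    """Build the argument list for a parallel make run."""
    if jobs is None:
        # os.cpu_count() may return None if the count cannot be determined.
        jobs = os.cpu_count() or 1
    return ["make", f"-j{jobs}"]
```

The resulting list can be passed to subprocess.run(make_command(), check=True) from the top-level directory of the dictionary repository.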
As for each dictionary, there is an install rule. Additionally, there is
an install-restart rule, which will also attempt to restart the dictd
daemon after a successful installation.
install-core will install all dictionaries, whereas install will also attempt
to restart the services involved.
The variables DESTDIR and PREFIX can be used to control the
destination of the installed dictionaries.
Within the tools directory, there is a Makefile which defines targets relevant for the management of the tools. These targets are mostly relevant for project administrators.
Some of the commands available in the tools directory require a configuration. This configuration sets paths and user credentials to access certain parts of FreeDict's release infrastructure or to automate the changelog creation.
A configuration has to be in %LOCALAPPDATA% on Windows and in
$HOME/.config/freedict/freedictrc on UNIX-like systems. An absolutely minimal
configuration could look like this:
[DEFAULT]
file_access_via = sshfs
api_output_path = ~/freedict/fd-dictionaries/build
[release]
user = humenda,freedict
local_path = ~/freedict/release
[generated]
user = humenda
local_path = ~/freedict/generated
[crafted]
local_path = ~/freedict/fd-dictionaries
The default section contains global options. The file_access_via option is used to
determine the method to access remote files of the project, including releases
and auto-imported dictionaries. At the moment, SSHFS and Unison are supported
(spelled in lower case in the configuration).
SSHFS will mount the files as a remote file system (UNIX only) and Unison will
synchronise these files with the server, so that you have a copy to work with.
The api_output_path option specifies the resulting directory name of the API files (AKA
freedict-database.xml and freedict-database.json). It is also advised to add
user_name and full_name to the DEFAULT section, set to your GitHub user name and
real name, respectively. They will be used, for instance, by the make changelog rule.
The subsequent sections describe different locations for dictionaries. The
crafted location is the repository with all hand-crafted dictionaries.
The section for generated dictionaries refers to a remote folder which contains all
automatically imported dictionaries. Since these dictionaries are generated, it doesn't make
sense to version-control them; only the script that generates them needs to be under version
control. To access these generated files, a local path and a user name are
mandatory. Other fields are server and remote_path, but these
should be set to the correct values by default. The sections generated and release
work the same way, whereas the section crafted only has the option to set a local path. For the crafted
section, it is assumed that the dictionaries are accessed using git and hence can be kept up to date by different procedures.
If you want to skip a section for testing, e.g. the generated section, you
can just write skip = yes as the first option in the section.
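To illustrate how such a configuration can be read, the sketch below parses the example with Python's standard configparser module and drops sections marked with skip = yes. It assumes plain INI semantics; the FreeDict tools may handle the file differently. The name active_sections and the SAMPLE text (the example above with skip = yes added to the generated section) are illustrative.

```python
import configparser
import os

SAMPLE = """\
[DEFAULT]
file_access_via = sshfs
api_output_path = ~/freedict/fd-dictionaries/build
[release]
user = humenda,freedict
local_path = ~/freedict/release
[generated]
skip = yes
user = humenda
local_path = ~/freedict/generated
[crafted]
local_path = ~/freedict/fd-dictionaries
"""


def active_sections(text):
    """Map each non-skipped section name to its expanded local path."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    result = {}
    # sections() does not include DEFAULT; its options are inherited instead.
    for name in parser.sections():
        section = parser[name]
        if section.getboolean("skip", fallback=False):
            continue
        result[name] = os.path.expanduser(section["local_path"])
    return result
```

With the sample text, only the release and crafted sections remain. Options from the DEFAULT section, such as file_access_via, are visible in every section through configparser's inheritance.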
As usual, make help gives an overview of all commands;
the following are used most frequently:
This generates the FreeDict API file with information about all available dictionaries and their release candidates. This target assumes that you have python3 and SSHFS or Unison and have set up a configuration file as explained in the previous section.
There is a Relax NG schema to validate the contents of the generated API. For
this to work, the configuration option api_output_path has to be set and a file has to exist at the specified location. This is the case if you have run
make api before.
Besides the XML structure, the validation step will also check whether the date
format specified is correct and whether the version adheres to the
version.major.minor versioning schema.
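The kind of check described here can be sketched as follows. The exact rules of the FreeDict validation are not stated in this chapter, so the sketch makes two assumptions: versions consist of two or three dot-separated numbers (as in 1.8.2), and dates are written as YYYY-MM-DD. Both function names are hypothetical.

```python
import datetime
import re

# Assumption: two or three dot-separated numeric components, e.g. 1.8 or 1.8.2.
VERSION_RE = re.compile(r"\d+(\.\d+){1,2}")
# Assumption: zero-padded ISO style dates, e.g. 2020-02-29.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_version(text):
    """Check the assumed version format."""
    return VERSION_RE.fullmatch(text) is not None


def is_valid_date(text):
    """Check the assumed date format and that the date exists in the calendar."""
    if DATE_RE.fullmatch(text) is None:
        return False
    try:
        datetime.datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

The regular expression guards the layout, while strptime rejects impossible dates such as a 29 February in a non-leap year.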
This will install the tools to $DESTDIR/$PREFIX/share/freedict. The default is
/usr/local/share/freedict.
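How the two variables combine into the final location can be shown with a few lines of Python. This is a sketch under the conventional assumption that DESTDIR is empty and PREFIX is /usr/local by default, which matches the default path given above; install_dir is a hypothetical name.

```python
def install_dir(destdir="", prefix="/usr/local"):
    """Compose $DESTDIR/$PREFIX/share/freedict as an absolute path."""
    parts = [p.strip("/") for p in (destdir, prefix, "share/freedict")]
    # Skip empty components so that no doubled slashes appear.
    return "/" + "/".join(p for p in parts if p)
```

For example, a staging directory for packaging could be passed as destdir while prefix stays at its default.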
This probes the current operating system and starts up the package manager to install the required dependencies for dictionary development and conversion. At the time of writing, Debian-based distributions and Arch GNU/Linux are supported.
Releases and generated dictionaries are on remote machines and need to be made accessible. This can be done with either SSHFS or Unison. SSHFS can mount remote volumes securely, but may be undesirable for slower internet connections. Unison downloads and synchronises remote files with a local copy and only needs to transfer data if files have changed. This rule will either mount or synchronise the remote data.
This will parse the source of all dictionaries and the list of released files to detect unreleased changes. It will present a table with dictionaries to release.
To execute this target, a configuration has to exist. Please see the corresponding section of this chapter.
This builds a release tarball for the tools directory.
Please see the section on mount for more details.
This rule unmounts remote shares, if they were mounted before.