Fairplay is a library built with the goal of enabling fully automated extraction of data from plots. One of the primary components is a plot simulator which can be used to generate training images for various tasks.

Clone the repository. Then, from within `fairplay/`:

    pip install -e ./

Dependencies will be installed automatically. Python >= 3.9 is recommended.

    python ./src/fairplay/gen/generate_random_scatter.py ./data/demo -n 20 -t 10

Arguments:

- `./data/demo`: directory in which to build the image dataset and corresponding labeled images
- `-n 20`: 20 training images
- `-t 10`: 10 test images

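
For readers who want to wrap or mirror this command line, the following is a minimal sketch of how such arguments could be parsed with `argparse`. It is not taken from `generate_random_scatter.py`: the function `build_parser` and the attribute names `out_dir`, `n_train`, and `n_test` are hypothetical, and only the positional directory and the `-n` and `-t` flags come from the usage shown above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a scatter plot dataset generator CLI."
    )
    # Positional argument: where the dataset and labeled images are written.
    parser.add_argument("out_dir", help="directory in which to build the dataset")
    # The destination names below are assumptions, not the script's real names.
    parser.add_argument("-n", dest="n_train", type=int, required=True,
                        help="number of training images")
    parser.add_argument("-t", dest="n_test", type=int, required=True,
                        help="number of test images")
    return parser


# Parse the same arguments used in the demo command.
args = build_parser().parse_args(["./data/demo", "-n", "20", "-t", "10"])
print(args.out_dir, args.n_train, args.n_test)
```

Both flags are marked required here only because the source does not state any default values.
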
Simulation parameters are configured in `data/plot_params`. There are two key files:

- `continuous.csv`: parameters defining a truncated normal distribution from which to sample continuously-defined plotting arguments, like `marker_alpha`. Attributes are:
  - `min`: lowest allowable value
  - `max`: highest allowable value
  - `mean`: if specified, places the mean of the truncated normal somewhere other than the midpoint of `min` and `max`, which is the default
  - `n_stds`: number of SDs of the normal distribution to "fit" between `min` and `max`. Default is 1, and if `mean` is also unspecified, this means the amplitude of the truncated normal PDF at `min` and `max` will be that of -1 and +1 SD. Higher values will result in more concentrated sampling near the mean and less at the edges.
- `discrete.csv`: literally-defined lists on which to perform uniform sampling of discretely-defined plotting arguments, such as `marker_style`. To weight a member more heavily, simply add more copies of that member to the list. Very rudimentary.

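
The two sampling schemes described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Fairplay's implementation: the function names `sample_truncated_normal` and `sample_discrete` are hypothetical, the truncation is done by simple rejection sampling, and the standard deviation is taken to be `(max - min) / (2 * n_stds)`, which matches the statement that `n_stds = 1` puts `min` and `max` at -1 and +1 SD when the mean is the midpoint.

```python
import random


def sample_truncated_normal(lo, hi, mean=None, n_stds=1.0, rng=random):
    """Draw one value from a normal distribution truncated to [lo, hi]."""
    if not lo < hi or n_stds <= 0:
        raise ValueError("require lo < hi and n_stds > 0")
    if mean is None:
        mean = (lo + hi) / 2.0  # default: midpoint of min and max
    # Assumption: n_stds SDs fit on each side of the midpoint, so the
    # full min-to-max range spans 2 * n_stds standard deviations.
    sigma = (hi - lo) / (2.0 * n_stds)
    # Rejection sampling: redraw until the value lands inside [lo, hi].
    while True:
        x = rng.gauss(mean, sigma)
        if lo <= x <= hi:
            return x


def sample_discrete(options, rng=random):
    """Pick one member uniformly; repeated members act as weights."""
    return rng.choice(options)


rng = random.Random(0)  # seeded for reproducibility
# Hypothetical parameter values, standing in for rows of the CSV files.
alphas = [sample_truncated_normal(0.2, 1.0, n_stds=2, rng=rng) for _ in range(1000)]
styles = [sample_discrete(["o", "o", "o", "s"], rng=rng) for _ in range(1000)]
print(min(alphas), max(alphas), styles.count("o"), styles.count("s"))
```

Listing `"o"` three times makes it roughly three times as likely as `"s"`, which is the duplicate-based weighting the `discrete.csv` description refers to.
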
RGB values for class labels (e.g. x ticks, markers, background) are defined as `label_colors` in `generate_random_scatter.py`.
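
To show how a color-coded label image might be consumed downstream, here is a small sketch that maps each RGB pixel back to a class name. The dictionary contents, the class names, and the helper `decode_labels` are hypothetical; the actual `label_colors` values should be read from `generate_random_scatter.py`.

```python
# Hypothetical mapping for illustration; the real values are defined
# in generate_random_scatter.py and may differ.
label_colors = {
    "background": (0, 0, 0),
    "markers": (255, 0, 0),
    "x_ticks": (0, 255, 0),
}


def decode_labels(pixels, label_colors):
    """Map each RGB pixel of a labeled image to its class name."""
    # Invert the mapping so a pixel color looks up its class directly.
    color_to_class = {rgb: name for name, rgb in label_colors.items()}
    return [
        [color_to_class.get(tuple(px), "unknown") for px in row]
        for row in pixels
    ]


# A tiny 2x2 "labeled image" given as rows of RGB tuples.
labeled = [
    [(0, 0, 0), (255, 0, 0)],
    [(0, 255, 0), (0, 0, 0)],
]
classes = decode_labels(labeled, label_colors)
print(classes)
```

Pixels whose color is not in the mapping are reported as `"unknown"` rather than raising, which makes anti-aliasing or compression artifacts in a labeled image easy to spot.
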
Simulated | Labeled |
---|---|