1. Getting started
First, clone the repository:
git clone git@github.com:jim-schwoebel/allie.git
cd allie
Set up a virtual environment (to ensure consistent behavior across operating systems).
python3 -m pip install --user virtualenv
python3 -m venv env
source env/bin/activate
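Before running the setup script, it can help to confirm that the virtual environment is actually active. The following is a minimal sketch using only the standard library; the function name `in_virtualenv` is illustrative and not part of Allie:

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```

If this prints `False` after `source env/bin/activate`, the shell is probably still picking up the system-wide `python3`.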
Now install the required dependencies and run the unit tests to make sure everything works:
python3 setup.py
Note that the installation process and unit tests above take roughly 10-15 minutes to complete. They make sure that you can featurize, model, and load model files (to make predictions) with your default featurizers and modeling techniques. It may be best to go grab lunch or coffee while waiting. :-)
After everything is done, you can use the Allie CLI by typing in:
python3 allie.py -h
This should output some ways you can use Allie:
Usage: allie.py [options]
Options:
  -h, --help            show this help message and exit
  --c=command, --command=command
                        the target command (annotate API = 'annotate',
                        augmentation API = 'augment', cleaning API = 'clean',
                        datasets API = 'data', features API = 'features',
                        model prediction API = 'predict', preprocessing API =
                        'transform', model training API = 'train', testing
                        API = 'test', visualize API = 'visualize',
                        list/change default settings = 'settings')
  --p=problemtype, --problemtype=problemtype
                        specify the problem type ('c' = classification or 'r'
                        = regression)
  --s=sampletype, --sampletype=sampletype
                        specify the type files that you'd like to operate on
                        (e.g. 'audio', 'text', 'image', 'video', 'csv')
  --n=common_name, --name=common_name
                        specify the common name for the model (e.g. 'gender'
                        for a male/female problem)
  --i=class_, --class=class_
                        specify the class that you wish to annotate (e.g.
                        'male')
  --d=dir, --dir=dir    an array of the target directory (or directories) that
                        contains sample files for the annotation API,
                        prediction API, features API, augmentation API,
                        cleaning API, and preprocessing API (e.g.
                        '/Users/jim/desktop/allie/train_dir/teens/')
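The options in the help output above can be combined into a single call. The sketch below assembles such a call as an argument list, using only the long option names and values shown in the help text. The helper `build_allie_command` is hypothetical (it is not part of Allie), it handles a single directory only, and which options each command actually requires is an assumption to check against the CLI tutorial:

```python
import shlex
from typing import List, Optional

# Command and problem-type values copied from the help output above.
COMMANDS = {
    "annotate", "augment", "clean", "data", "features", "predict",
    "transform", "train", "test", "visualize", "settings",
}
PROBLEM_TYPES = {"c", "r"}


def build_allie_command(
    command: str,
    sampletype: Optional[str] = None,
    problemtype: Optional[str] = None,
    name: Optional[str] = None,
    directory: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for allie.py (hypothetical helper)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    if problemtype is not None and problemtype not in PROBLEM_TYPES:
        raise ValueError("problemtype must be 'c' or 'r'")

    argv = ["python3", "allie.py", "--command", command]
    # Only append the options that were actually supplied.
    for flag, value in (
        ("--problemtype", problemtype),
        ("--sampletype", sampletype),
        ("--name", name),
        ("--dir", directory),
    ):
        if value is not None:
            argv += [flag, value]
    return argv


if __name__ == "__main__":
    cmd = build_allie_command(
        "features",
        sampletype="audio",
        directory="/Users/jim/desktop/allie/train_dir/teens/",
    )
    # Print the shell-quoted command; run it from the repository root,
    # e.g. with subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the call as a list rather than a single string avoids quoting problems when a directory path contains spaces.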
For more information on how to use the Allie CLI, check out the Allie CLI tutorial or any of the links below:
- annotating files
- augmenting files
- cleaning files
- collecting data
- featurizing files
- training models
- model predictions
- preprocessing / making transformers
- unit tests
- visualizing data
- new settings
You can also run Allie in a Docker container (the image is roughly 10-11GB and runs on top of Linux/Ubuntu):
git clone git@github.com:jim-schwoebel/allie.git
cd allie
docker build -t allie_image .
docker run -it --entrypoint=/bin/bash allie_image
cd ..
You will then have access to the Docker container and can use Allie's folder structure. You can then run the tests with:
cd tests
python3 test.py
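Because the image is quoted above at 10-11GB, it may be worth checking free disk space before running `docker build`. This is a rough sketch using the standard library's `shutil.disk_usage`; the helper name `has_room` and the choice to check the current directory are assumptions (Docker may store images on a different volume):

```python
import shutil

REQUIRED_GB = 11  # upper end of the 10-11GB image size quoted above


def has_room(free_bytes: int, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if free_bytes covers required_gb gigabytes (1 GB = 1024**3 bytes)."""
    return free_bytes >= required_gb * 1024**3


if __name__ == "__main__":
    # Assumption: the current directory sits on the volume Docker writes to.
    free = shutil.disk_usage(".").free
    print(f"free: {free / 1024**3:.1f} GB, enough for the image: {has_room(free)}")
```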
Note that you can quickly download datasets from AWS buckets and train machine learning models from there.
You can read more about how to use Allie and Docker here.
Note that many Python libraries are incompatible with Windows, so I encourage you to instead run Allie in a Docker container with Ubuntu or on Windows Subsystem for Linux.
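To see which of these situations applies to a given machine, a small check like the following can help. It is a sketch only: the function name `describe_platform` is illustrative, and detecting Windows Subsystem for Linux by looking for "microsoft" in the kernel release string is a common heuristic rather than an official API:

```python
import platform


def describe_platform(system: str, release: str) -> str:
    """Classify the host as 'windows', 'wsl', or 'unix-like' from uname-style fields."""
    if system == "Windows":
        return "windows"
    # WSL kernels usually carry "microsoft" in the release string; this is a
    # widely used heuristic, not an official API.
    if system == "Linux" and "microsoft" in release.lower():
        return "wsl"
    return "unix-like"


if __name__ == "__main__":
    info = platform.uname()
    print(describe_platform(info.system, info.release))
```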
If you still want to try to use Allie with Windows, you can do so below.
First, install various dependencies:
- Download Microsoft Visual C++ (https://www.visualstudio.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15).
- Download SWIG, compile it locally, and add it to your environment variables (http://www.swig.org/download.html).
- Follow the instructions to set up TensorFlow on Windows.
Now clone Allie and run the setup.py script:
git clone --recurse-submodules -j8 git@github.com:jim-schwoebel/allie.git
cd allie
git checkout windows
python3 -m pip install --user virtualenv
python3 -m venv env
python3 setup.py
Note that there are some functions that are limited (e.g. featurization / modeling scripts) due to lack of Windows compatibility.