# Java Developer's Guide Quick Start
- Introduction
- Quick Start - Train and Classify
- Reading/Writing Sounds
- Persisting a Classifier
- Javadoc
# Introduction
The goal of the platform is to provide easy access to and use of something called a classifier and the ecosystem of artifacts that surround it. A classifier can be thought of as a black box that can analyze a piece of data and identify or assign one or more classes (i.e. categories, labels, types, etc.) to it. This is useful in many domains including security, healthcare, finance, agriculture and retail, to name just a few. Classifiers may be implemented using a set of fixed, user-defined rules that define which category a piece of data should be assigned. For example, if an image has a high amount of red pixels, this might represent a fire situation. Other classifiers, and the ones we are more focused on in this library, automatically learn the set of rules from a set of exemplar (i.e. training) data. This training data is generally a large set of datum instances, with each datum being associated with one or more labels (we will use the term label instead of category as that is what is primarily used in this domain). For example, in the case of image recognition, each image would be labeled to identify the set of objects contained in the image and the classifier would learn the mapping of images to objects. After training, the classifier is then able to analyze new data it has not been trained on to develop the set of labels (e.g. objects in the image) that should be associated with the given data.
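To make the fixed-rule case concrete, the fire example above might look like the following sketch (plain Java; this class and its threshold are purely illustrative and not part of the library, and the red-pixel fraction is assumed to be computed elsewhere):

```java
// Hypothetical fixed-rule classifier: the rule is written by the developer,
// not learned from training data.
public class RedPixelRuleClassifier {
    private final double threshold;

    public RedPixelRuleClassifier(double threshold) {
        this.threshold = threshold;
    }

    /** @param redFraction fraction of red pixels in the image, in the range [0,1]. */
    public String classify(double redFraction) {
        return redFraction > threshold ? "fire" : "normal";
    }
}
```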
Data is a key concept when considering classifiers. The type (i.e. integer, double, complex, etc.) and shape (i.e. vector, matrix, etc.) of data and how it is labeled and stored are all key aspects that must be considered. The library is designed to allow a generality of type and shape, but implementations focus on arrays of double values (i.e. double[]). This allows for broad support of arbitrary data sets and enables us to focus on segments (i.e. time windows) of scalar (i.e. sound or other sensor) data. Storage and organization are particularly important as the training data sets are ideally very large. Given these large data sizes, streaming and/or distributed computation over the data during training is particularly important, if not required. The library is designed to enable and encourage streaming and distributed computing where possible.
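As an illustration of this segment-oriented view of scalar data, the following sketch (plain Java, independent of the library's own windowing support) splits a long array of samples into fixed-size windows, each of which could then be labeled and used as a training datum:

```java
import java.util.ArrayList;
import java.util.List;

public class Windowing {
    /** Splits scalar samples into consecutive, non-overlapping windows of the given size. */
    public static List<double[]> toWindows(double[] samples, int windowSize) {
        List<double[]> windows = new ArrayList<>();
        for (int start = 0; start + windowSize <= samples.length; start += windowSize) {
            double[] window = new double[windowSize];
            System.arraycopy(samples, start, window, 0, windowSize);
            windows.add(window);
        }
        return windows; // any trailing partial window is dropped
    }
}
```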
The next sections cover the details of the design and implementation of the library that enables the model and requirements described above.
# Quick Start - Train and Classify
First we show how to train a classifier and use it to generate classifications.
The first example shows how to train using sounds, the second trains on vectors of scalar data.
As a quick start for those wanting to work immediately with sounds, there are really just a few key classes with which you will be able to begin working fairly quickly. They are:
- `SoundClip` - holds a raw PCM sound sample of arbitrary length and allows reading and writing to/from disk as either wav or mp3 (mp3 is read-only).
- `SoundRecording` - groups labels with a SoundClip and allows reading/writing of labeled sounds to/from disk.
- `GMMClassifier` - an implementation of `IClassifier` that uses a Gaussian mixture model over features.
- `Classification` - returned by a classification request and encapsulates a label value and confidence for a given label name.
Next we show how to load sounds from the file system, train a model on them and then classify them. The two primary methods of interest available on an `IClassifier` instance are `train()` and `classify()`. The `train()` method accepts a set of labeled training data and builds a model that allows the `classify()` method to assign labels to unlabeled data. These are used in the example below.
```java
import java.io.IOException;
import java.util.Map;

public class TrainAndClassify {

    public static void main(String[] args) throws AISPException, IOException {
        // The location of the sounds in the aisp-core-samples project.
        String metadata = "sample-sounds/chiller";

        // Load the sounds and require a meta-data file providing labels & sensor ids.
        // The sounds acquire their status labels from the metadata.csv file.
        Iterable<SoundRecording> srList = SoundRecording.readMetaDataSounds(metadata);

        // Create and train the classifier on the "status" label attached to
        // the loaded sounds. The GMMClassifier is a fairly good shallow model.
        final String trainingLabel = "status";
        IClassifier<double[]> classifier = new GMMClassifier();
        classifier.train(trainingLabel, srList);

        // Now see if the classifier works on the training data.
        for (SoundRecording sr : srList) {
            String expectedValue = sr.getLabels().getProperty(trainingLabel);
            SoundClip clip = sr.getDataWindow(); // Raw data window w/o labels
            Map<String, Classification> cmap = classifier.classify(clip);
            Classification c = cmap.get(trainingLabel);
            String classifiedValue = c == null ? Classification.UndefinedLabelValue
                    : c.getLabelValue();
            System.out.println("Expected label= " + expectedValue
                    + ", classified label=" + classifiedValue);
        }
    }
}
```
# Reading/Writing Sounds
Sounds and their labels (i.e. SoundRecording instances) may be stored either in the file system or a database. We cover both cases in the next sections.
We have defined a very simple custom text file format to define the labeling of sounds. The file format is a 3-column comma-separated value text file.
Each line in the file corresponds to one sound file, with columns defined as follows:
- The first column names the location of the file,
- the second defines one or more label=value pairs separated by semi-colons,
- the third is a set of tag name=value pairs separated by semi-colons.

The following shows sounds with multiple labels and no tags:
```
wheeze/wheeze1.wav,event=cough;type=wheeze;smoker=true,
hack/hack1.wav,event=cough;type=hack;smoker=true,
...
```
The metadata file is named metadata.csv by convention and is usually located in the same directory as the sound files, although this is not required.
The file names are relative to the location of the metadata file.
The only restriction is that the basenames of the files are unique.
The following would be the directory structure for the metadata.csv file above:
```
+ mysounddir
    - metadata.csv
    + wheeze
        - wheeze1.wav
    + hack
        - hack1.wav
    - ...
```
A single-label model may then be trained on any one of the labels defined in the metadata file. For example:
```java
Iterable<SoundRecording> sounds = SoundRecording.readMetaDataSounds("mysounddir");
IClassifier<double[]> classifier = new GMMClassifier("event");
classifier.train(sounds);
```
Labeled sounds may be written to the file system along with the metadata.csv file as follows:
```java
Iterable<SoundRecording> sounds = ...;
int count = SoundRecording.writeMetaDataSounds("mysounddir", sounds);
```
After loading and/or manipulating sounds, you can save them to disk, along with the metadata.csv file, as follows:
```java
// Now write them out to the local directory along with the metadata file.
String destDir = ".";
int count = SoundRecording.writeMetaDataSounds(destDir, sounds);
System.out.println("Wrote " + count + " sounds and metadata.csv to " + destDir);
```
<a name="persistingclassifier"></a>
# Persisting Classifiers
Classifiers can be stored persistently in either the file system or a database (currently MongoDB). Models may also be trained and stored in the server using the [REST APIs](REST-apis).
## Persisting Classifiers in the File System
`IClassifier` is defined to be a Java `Serializable` and therefore can be serialized to the file system. You can write your own serialization code or you can use the methods of the `FixedClassifiers` class.
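For example, hand-rolled persistence with the standard `java.io` object streams might look like the following sketch (only JDK classes are used; the file name is illustrative):

```java
import java.io.*;

// Write the trained classifier using standard Java serialization.
try (ObjectOutputStream oos = new ObjectOutputStream(new FileOutputStream("mymodel.ser"))) {
    oos.writeObject(classifier);
}

// Read it back later (readObject can also throw ClassNotFoundException).
try (ObjectInputStream ois = new ObjectInputStream(new FileInputStream("mymodel.ser"))) {
    classifier = (IClassifier<double[]>) ois.readObject();
}
```

Alternatively, the library's `FixedClassifiers` helpers handle this in a single call: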
```java
// Create the classifier.
IClassifier<double[]> classifier = new GMMClassifier();
// ... train the model on your sounds.
// Store the classifier to disk.
FixedClassifiers.write("mymodel.ser", classifier);
```
Once you have a trained model, you can load it and perform classification as follows:
```java
// Load the trained classifier and classify a sound.
classifier = (IClassifier)FixedClassifiers.read("mymodel.ser");
SoundClip clip = SoundClip.readClip("somesound.wav");
Map<String,Classification> c = classifier.classify(clip);
```