Fossils
This is a design document for a distributed thinking project. Don't edit this page unless you're involved in this project.
A given rig session will yield
- A number of cameras with ~1000 images each. Images are hi-res (~2Kx3K pixels). Each image has a timestamp. File names may not be unique across cameras.
- A GPS log file with a sequence of timestamped waypoints.
We'll need to develop an image collector program (python/Linux?). This will first read in the GPS log file. Then the cameras are connected to USB one at a time, and the program downloads their images. This produces a "batch" of images, consisting of:
- A directory of the images, with filenames that are unique within the batch.
- A 'batch description' file, in JSON format, which includes:
- Batch name (descriptive)
- And for each image:
- time
- lat/long of center of image (estimated from image timestamp and GPS log)
- filename
- x/y size, pixels
- estimated size factor (meters/pixel)
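The lat/long estimate above can be computed by interpolating in the GPS log. Here is a minimal sketch, not the project's actual code: it assumes waypoints are (unix_time, lat, lon) tuples sorted by time, and linearly interpolates between the two waypoints that bracket the image's timestamp.

```python
from bisect import bisect_left

def estimate_position(waypoints, t):
    """Estimate (lat, lon) at time t from a sorted GPS log.

    waypoints: list of (unix_time, lat, lon), sorted by time.
    Timestamps outside the log's range clamp to the nearest endpoint.
    """
    times = [w[0] for w in waypoints]
    i = bisect_left(times, t)
    if i == 0:                       # at or before the first waypoint
        return waypoints[0][1], waypoints[0][2]
    if i == len(waypoints):          # after the last waypoint
        return waypoints[-1][1], waypoints[-1][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = waypoints[i - 1], waypoints[i]
    f = (t - t0) / (t1 - t0)         # fraction of the way between waypoints
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)
```

Linear interpolation should be adequate here, since the rig presumably moves slowly relative to the GPS logging rate.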
We'll develop a script load_images that takes the above info and does the following:
- Create a Bossa "batch" record
- Copy the images to a directory whose name is the batch ID
- Create medium-res (~1024x768) versions of images.
- Create a Bossa job record for each image. Each run of this script produces a batch of jobs that can be managed as a unit (see below). Initially the batch's state is "pending", meaning that it is not yet available to volunteers.
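The copy step above has to produce filenames that are unique within the batch even when names collide across cameras. A hedged sketch, assuming a hypothetical batch ID from Bossa and a camera-ID prefix as the uniqueness scheme:

```python
import shutil
from pathlib import Path

def copy_batch(images, batch_id, dest_root):
    """Copy images into a directory named after the batch ID.

    images: list of (camera_id, source_path) pairs. Each filename is
    prefixed with its camera ID, so names that collide across cameras
    stay unique within the batch. Returns the new paths.
    """
    batch_dir = Path(dest_root) / str(batch_id)
    batch_dir.mkdir(parents=True, exist_ok=True)
    new_paths = []
    for camera_id, src in images:
        src = Path(src)
        dest = batch_dir / f"{camera_id}_{src.name}"
        shutil.copy(src, dest)
        new_paths.append(dest)
    return new_paths
```

The medium-res versions could then be generated from these copies (e.g. with an image library such as Pillow); that step is omitted here.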
There will be three levels of participation:
- Beginning: Identify bones, no classification. Intended for elementary school kids.
- Intermediate: Distinguish Primate/Nonprimate, and Tooth/Skull/Other. Intended for adults from the general public.
- Advanced: Distinguish ~10 taxa and ~5 bone types. Intended for experts.
The level is stored both in user records and in job instances.
We'll need to develop training courses (probably using Bolt) for each volunteer level. A given course will contain:
- several examples (different ages, lighting conditions) of each type of object to be identified, some negatives, and some images that look like positives but are negative.
- a test consisting of some number of positive and negative examples
Each image will initially be shown in the medium-res (1024x768) form. It will have a control area that remains fixed at the top of the window, even if the image scrolls. The control area includes:
- a "Done" button
- a "Magnify" or "Shrink" button. This toggles between the medium-res and hi-res (3Kx2K) image.
- Menus for feature type, and a comment field.
Other ideas:
- a "Rotate" button that rotates the image 90 deg.
To identify a feature, the user selects a feature type and clicks on the image.
After every N images (N=1?) the volunteer is shown a feedback page (see below).
This page is intended to give volunteers feedback on their efforts, and to encourage them to continue. Possible contents:
- thumbnails of recent jobs, with links so that they can see how other people annotated the same image.
- links to message boards
- "who's online now", with ability to send instant messages
- this user's message inbox
We may want to use "calibration images", i.e. images for which the correct annotation is known, for two reasons:
- to increase the fraction of images that contain something
- to improve our assessment of volunteer error rates
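The second use is straightforward: a volunteer's answers on calibration jobs can be scored against the known annotations. A minimal sketch, assuming answers and known annotations are comparable values (e.g. feature-type labels):

```python
def error_rate(results):
    """Estimate a volunteer's error rate from calibration jobs.

    results: list of (volunteer_answer, known_answer) pairs.
    Returns the fraction of answers that disagree with the known
    annotation, or None if the volunteer has seen no calibration jobs.
    """
    if not results:
        return None
    wrong = sum(1 for given, correct in results if given != correct)
    return wrong / len(results)
```

A more refined version might weight recent jobs more heavily, or track separate rates per feature type, but the design doesn't specify this.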
Scientists will interact through a web-based interface. The top-level page will show a list of batches, with their times, names, spatial extent, and status (Pending, In Progress, Completed). For a given batch, the scientist can:
- Change the status of the batch
- View a map in which the images appear as small rectangles, color-coded to show which have been processed by volunteers and how many features were found. Clicking on a rectangle goes to a page showing the job's instances, where the scientist can view and compare the results, select a "best" answer, and add their own annotations.
- View a textual list of images, ordered by decreasing number of features found.
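The ordering for that textual list is simple to express; a sketch, assuming per-image feature counts have already been aggregated from the volunteers' results:

```python
def order_by_features(image_counts):
    """Return image names sorted by decreasing number of features found.

    image_counts: dict mapping image filename -> feature count.
    """
    return sorted(image_counts, key=image_counts.get, reverse=True)
```

In practice this would likely be a single ORDER BY in the database query, but the logic is the same.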