prototxt_generation_instructions.txt
If you want to finetune a model from an existing model, follow the
instructions in the flickr-style finetuning example:
https://github.com/BVLC/caffe/tree/master/examples/finetune_flickr_style
To avoid a loss with the value "nan", change the weight_filler types
from gaussian to xavier, and make sure the base learning rate is 0.001
or lower.
Also make sure you change the two blobs_lr fields of the layer you're
resetting to random weights to 10 and 20 (rather than 1 and 2).
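For example, the layer you reset might end up looking like this (a sketch
only, using the old-style layer syntax of that era of Caffe; the names
"fc8_flickr" and "fc7" are hypothetical placeholders):
```
layers {
  name: "fc8_flickr"  # renamed so Caffe re-initializes its weights
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8_flickr"
  blobs_lr: 10        # learning-rate multiplier for the weights
  blobs_lr: 20        # learning-rate multiplier for the biases
  inner_product_param {
    num_output: 2     # number of categories
    weight_filler {
      type: "xavier"  # xavier instead of gaussian, to avoid a nan loss
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
```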
If you want to train a model from scratch, follow these instructions:
After completing these instructions, you will have made these 3 files:
train_val.prototxt, solver.prototxt, and deploy.prototxt
Use absolute paths for simplicity. The `aux` dir is kept in the GitHub
repo, because it sucks to lose these prototxt files!
First, copy the model's train_val.prototxt to
video-object-detection/aux/<dataset name>/
Change the path on each "source:" and "mean_file:" line to the lmdb
databases and image mean created by prepare_data.py.
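The train data layer might then look something like this (a sketch; the
paths are hypothetical and depend on where prepare_data.py writes its
output):
```
layers {
  name: "data"
  type: DATA
  top: "data"
  top: "label"
  data_param {
    # hypothetical path to the lmdb created by prepare_data.py
    source: "/absolute/path/to/aux/<dataset name>/train_lmdb"
    backend: LMDB
    batch_size: 64
  }
  transform_param {
    # hypothetical path to the image mean created by prepare_data.py
    mean_file: "/absolute/path/to/aux/<dataset name>/image_mean.binaryproto"
    crop_size: 224
    mirror: true
  }
  include: { phase: TRAIN }
}
```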
Find the last layer that contains a "num_output" field and change it to
the number of categories (in my case, two: one positive, one negative).
I also change the train batch_size to 64, so that I don't run out of
memory on my MacBook.
Balance the test batch_size and the test_iter in solver.prototxt so
that their product is >= the number of test images.
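For example (hypothetical numbers): with 1,000 test images and a test
batch_size of 50, set test_iter to at least 20, since 20 * 50 = 1,000
covers the whole test set.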
Copy the model's solver.prototxt to
video-object-detection/aux/<dataset name>/
Change the "net:" path to the path to the train_val.prototxt you just created.
Change the "snapshot_prefix:" to
data/imagenet/<wnid_dir>/snapshots/images/<dataset>/snapshots/snapshot
The snapshots directory must exist, and the trailing "snapshot" becomes
the filename prefix, producing files named "snapshot_iter_100..." and so on.
Change the "test_iter" to a number such that the test batch_size (as defined
in train_val.prototxt) is just large enough so that
test_iter*batch_size >= count(test_images)
Change the "server_mode:" to CPU
Change the "snapshot:" to 1000. On my macbook, it takes 3 minutes per 20
iterations, so this produces one snapshot per 2.5 hours.
Don't bother reducing "max_iter", because you can interrupt training at any time.
Change "test_iter" to 1000. Testing is quick when there aren't many
images in the test set. The purpose of testing the net when a snapshot is
made is to help us identify when we begin to overfit, which helps us
select which snapshot to use to detect objects.
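Putting the solver changes together, the result might look like this (a
sketch; the fields not called out above are hypothetical leftovers from
the model's original solver.prototxt):
```
net: "/absolute/path/to/aux/<dataset name>/train_val.prototxt"
test_iter: 20        # so that test_iter * test_batch_size >= count(test_images)
test_interval: 1000  # test whenever a snapshot is made
base_lr: 0.001       # 0.001 or lower, to avoid a nan loss
lr_policy: "step"    # hypothetical leftover from the original solver
gamma: 0.1           # hypothetical leftover
stepsize: 20000      # hypothetical leftover
display: 20
max_iter: 450000     # left alone; interrupt training whenever you like
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000       # one snapshot per 2.5 hours on my MacBook
snapshot_prefix: "data/imagenet/<wnid_dir>/snapshots/images/<dataset>/snapshots/snapshot"
solver_mode: CPU
```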
Copy this to a new file called deploy.prototxt in
video-object-detection/aux/<dataset name>/
```
name: "<the name in the train_val.prototxt>"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 224 # or whatever the crop_size is in train_val.prototxt
input_dim: 224
```
Append to that all the layers from train_val.prototxt.
Delete the first few layers that don't have a "bottom" field.
Delete all parameters that have to do exclusively with learning,
e.g.:
- blobs_lr
- weight_decay
- weight_filler
- bias_filler
Change the value of the last "num_output" field to the number of
categories (in my case, 2).
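After stripping the learning-only parameters, the final layer might be
reduced to something like this (a sketch; the layer and blob names are
hypothetical, matching whatever is in your train_val.prototxt):
```
layers {
  name: "fc8_flickr"  # hypothetical name, copied from train_val.prototxt
  type: INNER_PRODUCT
  bottom: "fc7"
  top: "fc8_flickr"
  # blobs_lr, weight_decay, weight_filler, and bias_filler all deleted
  inner_product_param {
    num_output: 2     # number of categories
  }
}
```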
Then train the model until you see its test accuracy decrease significantly.
This decrease indicates the model is overfitting. Simply kill train.py and
use a snapshot of the caffemodel to detect objects.
References for generating deploy.prototxt:
https://github.com/BVLC/caffe/issues/1245
https://github.com/BVLC/caffe/issues/261