Training MNIST with Caffe

Environment: Ubuntu 12.04, Caffe

cd $CAFFE_ROOT/data/mnist
./

cd $CAFFE_ROOT/examples/mnist
vi lenet_solver.prototxt
Change solver_mode to CPU.

./train_lenet.sh
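The CPU switch can also be scripted instead of edited in vi. A minimal sketch, assuming the stock solver file contains a `solver_mode: GPU` line:

```shell
# Flip the solver to CPU mode in place (assumes the file currently
# says "solver_mode: GPU"), then confirm the change took effect.
sed -i 's/^solver_mode: GPU$/solver_mode: CPU/' lenet_solver.prototxt
grep '^solver_mode' lenet_solver.prototxt
```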

I0823 08:11:04.501404 15183 caffe.cpp:90] Starting Optimization
I0823 08:11:04.502498 15183 solver.cpp:32] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "lenet"
solver_mode: CPU
net: "lenet_train_test.prototxt"
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 08:11:04.762663 15183 common.cpp:91] Cannot create Cublas handle. Cublas won't be available.
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 08:11:04.982652 15183 common.cpp:98] Cannot create Curand generator. Curand won't be available.
I0823 08:11:04.982898 15183 solver.cpp:72] Creating training net from net file: lenet_train_test.prototxt
I0823 08:11:04.983438 15183 net.cpp:223] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
I0823 08:11:04.983516 15183 net.cpp:223] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0823 08:11:04.983629 15183 net.cpp:38] Initializing net from parameters:

name: "LeNet"
layers {
  top: "data"
  top: "label"
  name: "mnist"
  type: DATA
  data_param {
    source: "mnist-test-leveldb"
    scale: 0.00390625
    batch_size: 100
  }
  include {
    phase: TEST
  }
}
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv1"
  top: "pool1"
  name: "pool1"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool1"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv2"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "ip1"
  name: "ip1"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip1"
  top: "ip1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "ip1"
  top: "ip2"
  name: "ip2"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  name: "accuracy"
  type: ACCURACY
  include {
    phase: TEST
  }
}
layers {
  bottom: "ip2"
  bottom: "label"
  top: "loss"
  name: "loss"
  type: SOFTMAX_LOSS
}
state {
  phase: TEST
}
I0823 08:11:04.524307  2464 net.cpp:66] Creating Layer mnist
I0823 08:11:04.524438  2464 net.cpp:290] mnist -> data
I0823 08:11:04.524711  2464 net.cpp:290] mnist -> label
I0823 08:11:04.524833  2464 data_layer.cpp:179] Opening leveldb mnist-test-leveldb
I0823 08:11:04.617794  2464 data_layer.cpp:262] output data size: 100,1,28,28
I0823 08:11:04.618073  2464 net.cpp:83] Top shape: 100 1 28 28 (78400)
I0823 08:11:04.618237  2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.618285  2464 net.cpp:130] mnist does not need backward computation.
I0823 08:11:04.618414  2464 net.cpp:66] Creating Layer label_mnist_1_split
I0823 08:11:04.618479  2464 net.cpp:329] label_mnist_1_split <- label
I0823 08:11:04.618859  2464 net.cpp:280] label_mnist_1_split -> label (in-place)
I0823 08:11:04.618948  2464 net.cpp:290] label_mnist_1_split -> label_mnist_1_split_1
I0823 08:11:04.618999  2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.619735  2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.619850  2464 net.cpp:130] label_mnist_1_split does not need backward computation.
I0823 08:11:04.619900  2464 net.cpp:66] Creating Layer conv1
I0823 08:11:04.620210  2464 net.cpp:329] conv1 <- data
I0823 08:11:04.620262  2464 net.cpp:290] conv1 -> conv1
I0823 08:11:04.620434  2464 net.cpp:83] Top shape: 100 20 24 24 (1152000)
I0823 08:11:04.620515  2464 net.cpp:125] conv1 needs backward computation.
I0823 08:11:04.620580  2464 net.cpp:66] Creating Layer pool1
I0823 08:11:04.620620  2464 net.cpp:329] pool1 <- conv1
I0823 08:11:04.620663  2464 net.cpp:290] pool1 -> pool1
I0823 08:11:04.621214  2464 net.cpp:83] Top shape: 100 20 12 12 (288000)
I0823 08:11:04.621287  2464 net.cpp:125] pool1 needs backward computation.
I0823 08:11:04.621368  2464 net.cpp:66] Creating Layer conv2
I0823 08:11:04.621604  2464 net.cpp:329] conv2 <- pool1
I0823 08:11:04.621724  2464 net.cpp:290] conv2 -> conv2
I0823 08:11:04.622458  2464 net.cpp:83] Top shape: 100 50 8 8 (320000)
I0823 08:11:04.622563  2464 net.cpp:125] conv2 needs backward computation.
I0823 08:11:04.622607  2464 net.cpp:66] Creating Layer pool2
I0823 08:11:04.622648  2464 net.cpp:329] pool2 <- conv2
I0823 08:11:04.622691  2464 net.cpp:290] pool2 -> pool2
I0823 08:11:04.622730  2464 net.cpp:83] Top shape: 100 50 4 4 (80000)
I0823 08:11:04.623108  2464 net.cpp:125] pool2 needs backward computation.
I0823 08:11:04.623181  2464 net.cpp:66] Creating Layer ip1
I0823 08:11:04.623435  2464 net.cpp:329] ip1 <- pool2
I0823 08:11:04.623749  2464 net.cpp:290] ip1 -> ip1
I0823 08:11:04.628530  2464 net.cpp:83] Top shape: 100 500 1 1 (50000)
I0823 08:11:04.628690  2464 net.cpp:125] ip1 needs backward computation.
I0823 08:11:04.628726  2464 net.cpp:66] Creating Layer relu1
I0823 08:11:04.628751  2464 net.cpp:329] relu1 <- ip1
I0823 08:11:04.628779  2464 net.cpp:280] relu1 -> ip1 (in-place)
I0823 08:11:04.628809  2464 net.cpp:83] Top shape: 100 500 1 1 (50000)
I0823 08:11:04.628835  2464 net.cpp:125] relu1 needs backward computation.
I0823 08:11:04.629266  2464 net.cpp:66] Creating Layer ip2
I0823 08:11:04.629317  2464 net.cpp:329] ip2 <- ip1
I0823 08:11:04.629365  2464 net.cpp:290] ip2 -> ip2
I0823 08:11:04.629861  2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.629947  2464 net.cpp:125] ip2 needs backward computation.
I0823 08:11:04.629992  2464 net.cpp:66] Creating Layer ip2_ip2_0_split
I0823 08:11:04.630108  2464 net.cpp:329] ip2_ip2_0_split <- ip2
I0823 08:11:04.630190  2464 net.cpp:280] ip2_ip2_0_split -> ip2 (in-place)
I0823 08:11:04.630980  2464 net.cpp:290] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0823 08:11:04.631105  2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.631145  2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.631182  2464 net.cpp:125] ip2_ip2_0_split needs backward computation.
I0823 08:11:04.631342  2464 net.cpp:66] Creating Layer accuracy
I0823 08:11:04.631391  2464 net.cpp:329] accuracy <- ip2
I0823 08:11:04.631862  2464 net.cpp:329] accuracy <- label
I0823 08:11:04.631963  2464 net.cpp:290] accuracy -> accuracy
I0823 08:11:04.632132  2464 net.cpp:83] Top shape: 1 1 1 1 (1)
I0823 08:11:04.632175  2464 net.cpp:125] accuracy needs backward computation.
I0823 08:11:04.632494  2464 net.cpp:66] Creating Layer loss
I0823 08:11:04.632750  2464 net.cpp:329] loss <- ip2_ip2_0_split_1
I0823 08:11:04.632804  2464 net.cpp:329] loss <- label_mnist_1_split_1
I0823 08:11:04.632853  2464 net.cpp:290] loss -> loss
I0823 08:11:04.633280  2464 net.cpp:83] Top shape: 1 1 1 1 (1)
I0823 08:11:04.633471  2464 net.cpp:125] loss needs backward computation.
I0823 08:11:04.633826  2464 net.cpp:156] This network produces output accuracy
I0823 08:11:04.633872  2464 net.cpp:156] This network produces output loss
I0823 08:11:04.634106  2464 net.cpp:402] Collecting Learning Rate and Weight Decay.
I0823 08:11:04.634172  2464 net.cpp:167] Network initialization done.
I0823 08:11:04.634213  2464 net.cpp:168] Memory required for data: 0
I0823 08:11:04.634326  2464 solver.cpp:46] Solver scaffolding done.
I0823 08:11:04.634436  2464 solver.cpp:165] Solving LeNet
I0823 08:11:04.634881  2464 solver.cpp:232] Iteration 0, Testing net (#0)
I0823 08:11:19.170075  2464 solver.cpp:270] Test score #0: 0.1059
I0823 08:11:19.170248  2464 solver.cpp:270] Test score #1: 2.30245
I0823 08:11:19.417044  2464 solver.cpp:195] Iteration 0, loss = 2.30231
I0823 08:11:19.417177  2464 solver.cpp:365] Iteration 0, lr = 0.01
I0823 08:11:43.741911  2464 solver.cpp:195] Iteration 100, loss = 0.317127
I0823 08:11:43.742342  2464 solver.cpp:365] Iteration 100, lr = 0.00992565
I0823 08:12:07.532147  2464 solver.cpp:195] Iteration 200, loss = 0.173197
I0823 08:12:07.532258  2464 solver.cpp:365] Iteration 200, lr = 0.00985258
I0823 08:12:31.409700  2464 solver.cpp:195] Iteration 300, loss = 0.247124
I0823 08:12:31.410508  2464 solver.cpp:365] Iteration 300, lr = 0.00978075
I0823 08:12:54.552777  2464 solver.cpp:195] Iteration 400, loss = 0.102047
I0823 08:12:54.552903  2464 solver.cpp:365] Iteration 400, lr = 0.00971013
I0823 08:13:17.605888  2464 solver.cpp:232] Iteration 500, Testing net (#0)

...
I0823 09:10:29.736903  2464 solver.cpp:270] Test score #0: 0.9887
I0823 09:10:29.737015  2464 solver.cpp:270] Test score #1: 0.0369187
I0823 09:10:30.063771  2464 solver.cpp:195] Iteration 9500, loss = 0.00306773
I0823 09:10:30.063874  2464 solver.cpp:365] Iteration 9500, lr = 0.00606002
I0823 09:10:57.213291  2464 solver.cpp:195] Iteration 9600, loss = 0.00250475
I0823 09:10:57.213827  2464 solver.cpp:365] Iteration 9600, lr = 0.00603682
I0823 09:11:26.278821  2464 solver.cpp:195] Iteration 9700, loss = 0.00243088
I0823 09:11:26.279002  2464 solver.cpp:365] Iteration 9700, lr = 0.00601382
I0823 09:11:53.438747  2464 solver.cpp:195] Iteration 9800, loss = 0.0136355
I0823 09:11:53.439350  2464 solver.cpp:365] Iteration 9800, lr = 0.00599102
I0823 09:12:20.007823  2464 solver.cpp:195] Iteration 9900, loss = 0.00696897
I0823 09:12:20.008005  2464 solver.cpp:365] Iteration 9900, lr = 0.00596843
I0823 09:12:46.920634  2464 solver.cpp:287] Snapshotting to lenet_iter_10000
I0823 09:12:46.930307  2464 solver.cpp:294] Snapshotting solver state to lenet_iter_10000.solverstate
I0823 09:12:47.039417  2464 solver.cpp:213] Iteration 10000, loss = 0.00343354
I0823 09:12:47.039518  2464 solver.cpp:232] Iteration 10000, Testing net (#0)
I0823 09:13:02.146388  2464 solver.cpp:270] Test score #0: 0.9909
I0823 09:13:02.146509  2464 solver.cpp:270] Test score #1: 0.0288982
I0823 09:13:02.146543  2464 solver.cpp:218] Optimization Done.
I0823 09:13:02.146564  2464 caffe.cpp:113] Optimization Done.
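The lr values sprinkled through the log come straight from the solver's "inv" policy: lr(iter) = base_lr * (1 + gamma * iter) ^ (-power). A quick check in plain Python, using the constants from the solver dump above:

```python
# Reproduce the "inv" learning-rate policy from the solver parameters:
#   lr(iter) = base_lr * (1 + gamma * iter) ** (-power)
base_lr, gamma, power = 0.01, 0.0001, 0.75

def inv_lr(it):
    return base_lr * (1 + gamma * it) ** (-power)

# These match the training log:
print(round(inv_lr(0), 8))      # 0.01        (Iteration 0)
print(round(inv_lr(100), 8))    # 0.00992565  (Iteration 100)
print(round(inv_lr(9500), 8))   # 0.00606002  (Iteration 9500)
```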

The run ends by producing lenet_iter_10000, a binary protobuf file holding the trained weights. Inspect it from Python:

cd /u01/caffe/examples/mnist
jerry@hq:/u01/caffe/examples/mnist$ python
Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
>>> net = caffe.Net('lenet.prototxt', 'lenet_iter_10000')
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0823 10:41:06.040340 16020 common.cpp:91] Cannot create Cublas handle. Cublas won't be available.
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 10:41:06.242882 16020 common.cpp:98] Cannot create Curand generator. Curand won't be available.
I0823 10:41:06.243221 16020 net.cpp:38] Initializing net from parameters:
name: "LeNet"
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv1"
  top: "pool1"
  name: "pool1"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool1"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv2"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "ip1"
  name: "ip1"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip1"
  top: "ip1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "ip1"
  top: "ip2"
  name: "ip2"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip2"
  top: "prob"
  name: "prob"
  type: SOFTMAX
}
input: "data"
input_dim: 64
input_dim: 1
input_dim: 28
input_dim: 28
I0823 10:41:06.244067 16020 net.cpp:292] Input 0 -> data
I0823 10:41:06.244173 16020 net.cpp:66] Creating Layer conv1
I0823 10:41:06.244201 16020 net.cpp:329] conv1 <- data
I0823 10:41:06.244228 16020 net.cpp:290] conv1 -> conv1
I0823 10:41:06.245010 16020 net.cpp:83] Top shape: 64 20 24 24 (737280)
I0823 10:41:06.245100 16020 net.cpp:125] conv1 needs backward computation.
I0823 10:41:06.245172 16020 net.cpp:66] Creating Layer pool1
I0823 10:41:06.245210 16020 net.cpp:329] pool1 <- conv1
I0823 10:41:06.245276 16020 net.cpp:290] pool1 -> pool1
I0823 10:41:06.245338 16020 net.cpp:83] Top shape: 64 20 12 12 (184320)
I0823 10:41:06.245378 16020 net.cpp:125] pool1 needs backward computation.
I0823 10:41:06.245426 16020 net.cpp:66] Creating Layer conv2
I0823 10:41:06.245462 16020 net.cpp:329] conv2 <- pool1
I0823 10:41:06.245509 16020 net.cpp:290] conv2 -> conv2
I0823 10:41:06.245834 16020 net.cpp:83] Top shape: 64 50 8 8 (204800)
I0823 10:41:06.245893 16020 net.cpp:125] conv2 needs backward computation.
I0823 10:41:06.245918 16020 net.cpp:66] Creating Layer pool2
I0823 10:41:06.246021 16020 net.cpp:329] pool2 <- conv2
I0823 10:41:06.246088 16020 net.cpp:290] pool2 -> pool2
I0823 10:41:06.246136 16020 net.cpp:83] Top shape: 64 50 4 4 (51200)
I0823 10:41:06.246212 16020 net.cpp:125] pool2 needs backward computation.
I0823 10:41:06.246263 16020 net.cpp:66] Creating Layer ip1
I0823 10:41:06.246296 16020 net.cpp:329] ip1 <- pool2
I0823 10:41:06.246352 16020 net.cpp:290] ip1 -> ip1
I0823 10:41:06.250891 16020 net.cpp:83] Top shape: 64 500 1 1 (32000)
I0823 10:41:06.251027 16020 net.cpp:125] ip1 needs backward computation.
I0823 10:41:06.251073 16020 net.cpp:66] Creating Layer relu1
I0823 10:41:06.251111 16020 net.cpp:329] relu1 <- ip1
I0823 10:41:06.251149 16020 net.cpp:280] relu1 -> ip1 (in-place)
I0823 10:41:06.251196 16020 net.cpp:83] Top shape: 64 500 1 1 (32000)
I0823 10:41:06.251231 16020 net.cpp:125] relu1 needs backward computation.
I0823 10:41:06.251268 16020 net.cpp:66] Creating Layer ip2
I0823 10:41:06.251302 16020 net.cpp:329] ip2 <- ip1
I0823 10:41:06.251461 16020 net.cpp:290] ip2 -> ip2
I0823 10:41:06.251601 16020 net.cpp:83] Top shape: 64 10 1 1 (640)
I0823 10:41:06.251646 16020 net.cpp:125] ip2 needs backward computation.
I0823 10:41:06.251682 16020 net.cpp:66] Creating Layer prob
I0823 10:41:06.251716 16020 net.cpp:329] prob <- ip2
I0823 10:41:06.251757 16020 net.cpp:290] prob -> prob
I0823 10:41:06.252317 16020 net.cpp:83] Top shape: 64 10 1 1 (640)
I0823 10:41:06.252887 16020 net.cpp:125] prob needs backward computation.
I0823 10:41:06.252924 16020 net.cpp:156] This network produces output prob
I0823 10:41:06.252977 16020 net.cpp:402] Collecting Learning Rate and Weight Decay.
I0823 10:41:06.253016 16020 net.cpp:167] Network initialization done.
I0823 10:41:06.253099 16020 net.cpp:168] Memory required for data: 200704
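The Top shapes in both logs follow from standard convolution/pooling arithmetic. A small sketch tracing the spatial size from 28 down to 4, plus the 200704-byte figure, which happens to equal the 64x1x28x28 input blob stored as 4-byte floats:

```python
# Follow LeNet's spatial dimensions through the log above.
# Valid convolution: out = (in - kernel) // stride + 1
# Max pooling here:  out = in // 2   (kernel 2, stride 2, even input)
def conv(size, kernel, stride=1):
    return (size - kernel) // stride + 1

s = 28            # input: 64 x 1 x 28 x 28
s = conv(s, 5)    # conv1 -> Top shape: 64 20 24 24
assert s == 24
s = s // 2        # pool1 -> Top shape: 64 20 12 12
assert s == 12
s = conv(s, 5)    # conv2 -> Top shape: 64 50 8 8
assert s == 8
s = s // 2        # pool2 -> Top shape: 64 50 4 4
assert s == 4

# "Memory required for data: 200704" equals the input blob alone:
print(64 * 1 * 28 * 28 * 4)   # 200704
```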

View the network definition:
>>> print open('lenet.prototxt').read()
name: "LeNet"
input: "data"
input_dim: 64
input_dim: 1
input_dim: 28
input_dim: 28
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "pool1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  name: "ip1"
  type: INNER_PRODUCT
  bottom: "pool2"
  top: "ip1"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "relu1"
  type: RELU
  bottom: "ip1"
  top: "ip1"
}
layers {
  name: "ip2"
  type: INNER_PRODUCT
  bottom: "ip1"
  top: "ip2"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "prob"
  type: SOFTMAX
  bottom: "ip2"
  top: "prob"
}

Inspect the parameters of each layer:
>>> dir(net)
>>> net.params[“ip2”][0].data
array([[[[-0.00913269,  0.06095703, -0.09719526, ..., -0.01292357,
          -0.02721527, -0.04921406],
         [ 0.07316435, -0.10016691, -0.00194797, ..., -0.02357075,
          -0.03735601, -0.12467863],
         [ 0.11690015, -0.13771389, -0.04632974, ...,  0.02967362,
          -0.11868649,  0.01114164],
         ...,
         [-0.18345536, -0.01772851,  0.06773216, ..., -0.00851034,
          -0.02590596,  0.01125562],
         [-0.16715027,  0.03873322,  0.03800297, ...,  0.0236346 ,
          -0.01642762,  0.04072023],
         [ 0.10814335, -0.04631414,  0.09708735, ..., -0.0280726 ,
          -0.14074558,  0.14641024]]]], dtype=float32)
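The array above is the ip2 weight blob (net.params[layer][0] holds weights, [1] holds biases). A sketch of what the final two layers compute, using random stand-in weights rather than the trained values, which would come from net.params["ip2"]:

```python
import numpy as np

# Sketch of LeNet's last two layers with random stand-in weights.
# In the real net, W is net.params["ip2"][0].data (10 x 500) and
# b is net.params["ip2"][1].data (10); these values are made up.
rng = np.random.default_rng(0)
ip1_out = rng.standard_normal(500).astype(np.float32)       # relu1 output
W = (rng.standard_normal((10, 500)) * 0.01).astype(np.float32)
b = np.zeros(10, dtype=np.float32)

ip2_out = W @ ip1_out + b      # INNER_PRODUCT: y = Wx + b
# SOFTMAX: prob_i = exp(y_i) / sum_j exp(y_j), shifted for stability
z = np.exp(ip2_out - ip2_out.max())
prob = z / z.sum()

print(prob.shape)                      # (10,)
print(round(float(prob.sum()), 6))     # 1.0
```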

Further functionality will be explored in follow-up posts.

Author: hqiang1984

Quantified self, minimalism