Machine Learning - 强的部落格

Using h2o-sparkling

Environment: CentOS 6.2

h2o-sparkling combines H2O with Spark for machine learning: it makes the machine learning packages that H2O provides usable inside a Spark environment.

Installation:
git clone https://github.com/0xdata/h2o-sparkling.git
cd h2o-sparkling
sbt assembly

Run the test:
[cloudera@localhost h2o-sparkling]$ sbt -mem 500 "run --local"
[info] Loading project definition from /home/cloudera/h2o-sparkling/project
[info] Set current project to h2o-sparkling-demo (in build file:/home/cloudera/h2o-sparkling/)
[info] Running water.sparkling.demo.SparklingDemo --local
03:41:11.030 main INFO WATER: ----- H2O started -----
03:41:11.046 main INFO WATER: Build git branch: (unknown)
03:41:11.047 main INFO WATER: Build git hash: (unknown)
03:41:11.047 main INFO WATER: Build git describe: (unknown)
03:41:11.047 main INFO WATER: Build project version: (unknown)
03:41:11.047 main INFO WATER: Built by: '(unknown)'
03:41:11.047 main INFO WATER: Built on: '(unknown)'
03:41:11.048 main INFO WATER: Java availableProcessors: 1
03:41:11.077 main INFO WATER: Java heap totalMemory: 3.87 gb
03:41:11.077 main INFO WATER: Java heap maxMemory: 3.87 gb
03:41:11.078 main INFO WATER: Java version: Java 1.6.0_31 (from Sun Microsystems Inc.)
03:41:11.078 main INFO WATER: OS version: Linux 2.6.32-220.23.1.el6.x86_64 (amd64)
03:41:11.381 main INFO WATER: Machine physical memory: 4.83 gb
03:41:11.393 main INFO WATER: ICE root: '/tmp/h2o-cloudera'
03:41:11.438 main INFO WATER: Possible IP Address: eth1 (eth1), 192.168.56.101
03:41:11.439 main INFO WATER: Possible IP Address: eth0 (eth0), 10.0.2.15
03:41:11.439 main INFO WATER: Possible IP Address: lo (lo), 127.0.0.1
03:41:11.669 main WARN WATER: Multiple local IPs detected:
+ /192.168.56.101 /10.0.2.15
+ Attempting to determine correct address...
+ Using /10.0.2.15
03:41:11.929 main INFO WATER: Internal communication uses port: 54322
+ Listening for HTTP and REST traffic on http://10.0.2.15:54321/
03:41:12.912 main INFO WATER: H2O cloud name: 'cloudera'
03:41:12.913 main INFO WATER: (v(unknown)) 'cloudera' on /10.0.2.15:54321, discovery address /230.63.2.255:58943
03:41:12.913 main INFO WATER: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
+ 1. Open a terminal and run 'ssh -L 55555:localhost:54321 cloudera@10.0.2.15'
+ 2. Point your browser to http://localhost:55555
03:41:12.954 main INFO WATER: Cloud of size 1 formed [/10.0.2.15:54321 (00:00:00.000)]
03:41:12.954 main INFO WATER: Log dir: '/tmp/h2o-cloudera/h2ologs'
prostate
03:41:20.369 main INFO WATER: Running demo with following configuration: DemoConf(prostate,true,RDDExtractor@file,true)
03:41:20.409 main INFO WATER: Demo configuration: DemoConf(prostate,true,RDDExtractor@file,true)
03:41:21.830 main INFO WATER: Data : data/prostate.csv
03:41:21.831 main INFO WATER: Table: prostate_table
03:41:21.831 main INFO WATER: Query: SELECT * FROM prostate_table WHERE capsule=1
03:41:21.831 main INFO WATER: Spark: LOCAL
03:41:21.901 main INFO WATER: Creating LOCAL Spark context.
03:41:34.616 main INFO WATER: RDD result has: 153 rows
03:41:34.752 main INFO WATER: Going to write RDD into /tmp/rdd_null_6.csv
03:41:36.099 FJ-0-1 INFO WATER: Parse result for rdd_data_6 (153 rows):
03:41:36.136 FJ-0-1 INFO WATER: C1: numeric min(6.000000) max(378.000000)
03:41:36.140 FJ-0-1 INFO WATER: C2: numeric min(1.000000) max(1.000000) constant
03:41:36.146 FJ-0-1 INFO WATER: C3: numeric min(47.000000) max(79.000000)
03:41:36.152 FJ-0-1 INFO WATER: C4: numeric min(0.000000) max(2.000000)
03:41:36.158 FJ-0-1 INFO WATER: C5: numeric min(1.000000) max(4.000000)
03:41:36.161 FJ-0-1 INFO WATER: C6: numeric min(1.000000) max(2.000000)
03:41:36.165 FJ-0-1 INFO WATER: C7: numeric min(1.400000) max(139.700000)
03:41:36.169 FJ-0-1 INFO WATER: C8: numeric min(0.000000) max(73.400000)
03:41:36.176 FJ-0-1 INFO WATER: C9: numeric min(5.000000) max(9.000000)
03:41:37.457 main INFO WATER: Extracted frame from Spark:
03:41:37.474 main INFO WATER: {id,capsule,age,race,dpros,dcaps,psa,vol,gleason}, 2.8 KB
+ Chunk starts: {0,83,}
+ Rows: 153
03:41:37.482 #ti-UDP-R INFO WATER: Orderly shutdown command from /10.0.2.15:54321
[success] Total time: 44 s, completed Aug 4, 2014 3:41:37 AM
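
The query in the log keeps only the rows whose capsule column equals 1 (153 rows in the run above). As a rough illustration of that filtering step, here is a minimal Python sketch that runs the same SQL text against an in-memory SQLite table. This is not how h2o-sparkling executes the query (the demo goes through Spark). The four sample rows are made up for illustration; only the column names and the query string are taken from the log.

```python
import sqlite3

# Column names as printed in the "Extracted frame from Spark" log line.
COLUMNS = ["id", "capsule", "age", "race", "dpros", "dcaps", "psa", "vol", "gleason"]

# Hypothetical sample rows; the real data/prostate.csv is not reproduced here.
SAMPLE_ROWS = [
    (1, 0, 65, 1, 2, 1, 1.4, 0.0, 6),
    (2, 1, 72, 1, 3, 2, 6.7, 0.0, 7),
    (3, 1, 70, 1, 1, 2, 4.9, 0.0, 6),
    (4, 0, 76, 2, 2, 1, 51.2, 20.0, 7),
]


def select_capsule_rows(rows):
    """Load rows into an in-memory table and run the query from the log."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE prostate_table (%s)" % ", ".join(COLUMNS))
        conn.executemany(
            "INSERT INTO prostate_table VALUES (?,?,?,?,?,?,?,?,?)", rows
        )
        query = "SELECT * FROM prostate_table WHERE capsule=1"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


selected = select_capsule_rows(SAMPLE_ROWS)
print("rows kept:", len(selected))
```

On the real file the same filter would be expected to return the 153 rows reported by the "RDD result has" line.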

Run against the local cluster:
[cloudera@localhost h2o-sparkling]$ sbt -mem 100 "run --remote"
[info] Loading project definition from /home/cloudera/h2o-sparkling/project
[info] Set current project to h2o-sparkling-demo (in build file:/home/cloudera/h2o-sparkling/)
[info] Running water.sparkling.demo.SparklingDemo --remote
03:25:42.306 main INFO WATER: ----- H2O started -----
03:25:42.309 main INFO WATER: Build git branch: (unknown)
03:25:42.309 main INFO WATER: Build git hash: (unknown)
03:25:42.309 main INFO WATER: Build git describe: (unknown)
03:25:42.309 main INFO WATER: Build project version: (unknown)
03:25:42.309 main INFO WATER: Built by: '(unknown)'
03:25:42.309 main INFO WATER: Built on: '(unknown)'
03:25:42.310 main INFO WATER: Java availableProcessors: 4
03:25:42.316 main INFO WATER: Java heap totalMemory: 3.83 gb
03:25:42.316 main INFO WATER: Java heap maxMemory: 3.83 gb
03:25:42.316 main INFO WATER: Java version: Java 1.6.0_31 (from Sun Microsystems Inc.)
03:25:42.317 main INFO WATER: OS version: Linux 2.6.32-220.23.1.el6.x86_64 (amd64)
03:25:42.383 main INFO WATER: Machine physical memory: 4.95 gb
03:25:42.384 main INFO WATER: ICE root: '/tmp/h2o-cloudera'
03:25:42.389 main INFO WATER: Possible IP Address: eth1 (eth1), 192.168.56.101
03:25:42.389 main INFO WATER: Possible IP Address: eth0 (eth0), 10.0.2.15
03:25:42.389 main INFO WATER: Possible IP Address: lo (lo), 127.0.0.1
03:25:42.587 main WARN WATER: Multiple local IPs detected:
+ /192.168.56.101 /10.0.2.15
+ Attempting to determine correct address...
+ Using /10.0.2.15
03:25:42.650 main INFO WATER: Internal communication uses port: 54322
+ Listening for HTTP and REST traffic on http://10.0.2.15:54321/
03:25:43.906 main INFO WATER: H2O cloud name: 'cloudera'
03:25:43.906 main INFO WATER: (v(unknown)) 'cloudera' on /10.0.2.15:54321, discovery address /230.63.2.255:58943
03:25:43.907 main INFO WATER: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
+ 1. Open a terminal and run 'ssh -L 55555:localhost:54321 cloudera@10.0.2.15'
+ 2. Point your browser to http://localhost:55555
03:25:43.920 main INFO WATER: Cloud of size 1 formed [/10.0.2.15:54321 (00:00:00.000)]
03:25:43.921 main INFO WATER: Log dir: '/tmp/h2o-cloudera/h2ologs'
prostate
03:25:46.985 main INFO WATER: Running demo with following configuration: DemoConf(prostate,false,RDDExtractor@file,true)
03:25:46.991 main INFO WATER: Demo configuration: DemoConf(prostate,false,RDDExtractor@file,true)
03:25:48.000 main INFO WATER: Data : data/prostate.csv
03:25:48.000 main INFO WATER: Table: prostate_table
03:25:48.000 main INFO WATER: Query: SELECT * FROM prostate_table WHERE capsule=1
03:25:48.001 main INFO WATER: Spark: REMOTE
03:25:48.024 main INFO WATER: Creating REMOTE (spark://localhost:7077) Spark context.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1.0:1 failed 4 times, most recent failure: TID 7 on host 192.168.56.101 failed for unknown reason
Driver stacktrace:
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
03:26:07.151 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
03:26:07.152 main INFO WATER: at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
03:26:07.152 main INFO WATER: at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
03:26:07.152 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
03:26:07.153 main INFO WATER: at scala.Option.foreach(Option.scala:236)
03:26:07.153 main INFO WATER: at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
03:26:07.153 main INFO WATER: at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
03:26:07.153 main INFO WATER: at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
03:26:07.155 main INFO WATER: at akka.actor.ActorCell.invoke(ActorCell.scala:456)
03:26:07.155 main INFO WATER: at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
03:26:07.156 main INFO WATER: at akka.dispatch.Mailbox.run(Mailbox.scala:219)
03:26:07.156 main INFO WATER: at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
03:26:07.157 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
03:26:07.158 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
03:26:07.158 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
03:26:07.162 main INFO WATER: at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
03:26:07.172 #ti-UDP-R INFO WATER: Orderly shutdown command from /10.0.2.15:54321
[success] Total time: 27 s, completed Aug 4, 2014 3:26:07 PM

The run failed, and so far I have not been able to pin down the cause.
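
One detail that can be read off the log: H2O chose the address 10.0.2.15, while the failing Spark task was reported on host 192.168.56.101. Whether this difference has anything to do with the failure is not established. The sketch below only pulls the two addresses out of the log lines so they can be compared side by side. The helper name summarize_addresses is hypothetical, and the sample lines are copied from the log above.

```python
import re

# Lines copied from the remote-run log above.
LOG_LINES = [
    "03:25:42.389 main INFO WATER: Possible IP Address: eth1 (eth1), 192.168.56.101",
    "03:25:42.389 main INFO WATER: Possible IP Address: eth0 (eth0), 10.0.2.15",
    "+ Using /10.0.2.15",
    "org.apache.spark.SparkException: Job aborted due to stage failure: "
    "Task 1.0:1 failed 4 times, most recent failure: "
    "TID 7 on host 192.168.56.101 failed for unknown reason",
]

# Loose dotted-quad pattern; good enough for reading a log, not for validation.
IP = r"(\d{1,3}(?:\.\d{1,3}){3})"


def summarize_addresses(lines):
    """Return (address H2O bound to, list of hosts with failed Spark tasks)."""
    h2o_addr = None
    failed_hosts = []
    for line in lines:
        match = re.search(r"Using /" + IP, line)
        if match:
            h2o_addr = match.group(1)
        match = re.search(r"on host " + IP + r" failed", line)
        if match:
            failed_hosts.append(match.group(1))
    return h2o_addr, failed_hosts


h2o_addr, failed_hosts = summarize_addresses(LOG_LINES)
print("H2O bound to:", h2o_addr)
print("Failing Spark host(s):", failed_hosts)
print("Same address:", all(host == h2o_addr for host in failed_hosts))
```

If the two addresses differ, as they do here, checking which interface the Spark master and worker are bound to would be one place to start looking. That is only a guess at this point.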

Using Caffe from Matlab: an example

Environment: Ubuntu 12.04, Matlab 2013b

1. First, set the MATLAB_DIR entry in Makefile.config, as shown below:
MATLAB_DIR := /u01/MATLAB/R2013b

2. Build Caffe's Matlab interface:
make matcaffe

3. Change to the directory /u01/caffe/examples/imagenet and run ./get_caffe_reference_imagenet_model.sh to download the pretrained model.

4. Change to the directory /u01/caffe/matlab/caffe and run the demo that calls Caffe from Matlab:

matlab -nodisplay

>> run('matcaffe_demo.m')

……
layers {
  bottom: "conv4"
  top: "conv4"
  name: "relu4"
  type: RELU
}
layers {
  bottom: "conv4"
  top: "conv5"
  name: "conv5"
  type: CONVOLUTION
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layers {
  bottom: "conv5"
  top: "conv5"
  name: "relu5"
  type: RELU
}
layers {
  bottom: "conv5"
  top: "pool5"
  name: "pool5"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layers {
  bottom: "pool5"
  top: "fc6"
  name: "fc6"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 4096
  }
}
layers {
  bottom: "fc6"
  top: "fc6"
  name: "relu6"
  type: RELU
}
layers {
  bottom: "fc6"
  top: "fc6"
  name: "drop6"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "fc6"
  top: "fc7"
  name: "fc7"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 4096
  }
}
layers {
  bottom: "fc7"
  top: "fc7"
  name: "relu7"
  type: RELU
}
layers {
  bottom: "fc7"
  top: "fc7"
  name: "drop7"
  type: DROPOUT
  dropout_param {
    dropout_ratio: 0.5
  }
}
layers {
  bottom: "fc7"
  top: "fc8"
  name: "fc8"
  type: INNER_PRODUCT
  inner_product_param {
    num_output: 1000
  }
}
layers {
  bottom: "fc8"
  top: "prob"
  name: "prob"
  type: SOFTMAX
}
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
I0912 18:22:26.956653 11968 net.cpp:292] Input 0 -> data
I0912 18:22:26.956778 11968 net.cpp:66] Creating Layer conv1
I0912 18:22:26.956809 11968 net.cpp:329] conv1 <- data
I0912 18:22:26.956889 11968 net.cpp:290] conv1 -> conv1
I0912 18:22:26.957068 11968 net.cpp:83] Top shape: 10 96 55 55 (2904000)
I0912 18:22:26.957139 11968 net.cpp:125] conv1 needs backward computation.
I0912 18:22:26.957207 11968 net.cpp:66] Creating Layer relu1
I0912 18:22:26.957243 11968 net.cpp:329] relu1 <- conv1
I0912 18:22:26.957279 11968 net.cpp:280] relu1 -> conv1 (in-place)
I0912 18:22:26.957347 11968 net.cpp:83] Top shape: 10 96 55 55 (2904000)
I0912 18:22:26.957382 11968 net.cpp:125] relu1 needs backward computation.
I0912 18:22:26.957422 11968 net.cpp:66] Creating Layer pool1
I0912 18:22:26.957458 11968 net.cpp:329] pool1 <- conv1
I0912 18:22:26.957496 11968 net.cpp:290] pool1 -> pool1
I0912 18:22:26.957548 11968 net.cpp:83] Top shape: 10 96 27 27 (699840)
I0912 18:22:26.957583 11968 net.cpp:125] pool1 needs backward computation.
I0912 18:22:26.957619 11968 net.cpp:66] Creating Layer norm1
I0912 18:22:26.957681 11968 net.cpp:329] norm1 <- pool1
I0912 18:22:26.957728 11968 net.cpp:290] norm1 -> norm1
I0912 18:22:26.957774 11968 net.cpp:83] Top shape: 10 96 27 27 (699840)
I0912 18:22:26.957809 11968 net.cpp:125] norm1 needs backward computation.
I0912 18:22:26.958052 11968 net.cpp:66] Creating Layer conv2
I0912 18:22:26.958092 11968 net.cpp:329] conv2 <- norm1
I0912 18:22:26.960306 11968 net.cpp:290] conv2 -> conv2
I0912 18:22:26.961231 11968 net.cpp:83] Top shape: 10 256 27 27 (1866240)
I0912 18:22:26.961369 11968 net.cpp:125] conv2 needs backward computation.
I0912 18:22:26.961398 11968 net.cpp:66] Creating Layer relu2
I0912 18:22:26.961436 11968 net.cpp:329] relu2 <- conv2
I0912 18:22:26.961468 11968 net.cpp:280] relu2 -> conv2 (in-place)
I0912 18:22:26.961496 11968 net.cpp:83] Top shape: 10 256 27 27 (1866240)
I0912 18:22:26.961516 11968 net.cpp:125] relu2 needs backward computation.
I0912 18:22:26.961539 11968 net.cpp:66] Creating Layer pool2
I0912 18:22:26.961593 11968 net.cpp:329] pool2 <- conv2
I0912 18:22:26.961629 11968 net.cpp:290] pool2 -> pool2
I0912 18:22:26.961676 11968 net.cpp:83] Top shape: 10 256 13 13 (432640)
I0912 18:22:26.961710 11968 net.cpp:125] pool2 needs backward computation.
I0912 18:22:26.961805 11968 net.cpp:66] Creating Layer norm2
I0912 18:22:26.961841 11968 net.cpp:329] norm2 <- pool2
I0912 18:22:26.961875 11968 net.cpp:290] norm2 -> norm2
I0912 18:22:26.961913 11968 net.cpp:83] Top shape: 10 256 13 13 (432640)
I0912 18:22:26.961969 11968 net.cpp:125] norm2 needs backward computation.
I0912 18:22:26.962023 11968 net.cpp:66] Creating Layer conv3
I0912 18:22:26.962059 11968 net.cpp:329] conv3 <- norm2
I0912 18:22:26.962096 11968 net.cpp:290] conv3 -> conv3
I0912 18:22:26.965011 11968 net.cpp:83] Top shape: 10 384 13 13 (648960)
I0912 18:22:26.965140 11968 net.cpp:125] conv3 needs backward computation.
I0912 18:22:26.965181 11968 net.cpp:66] Creating Layer relu3
I0912 18:22:26.965258 11968 net.cpp:329] relu3 <- conv3
I0912 18:22:26.965299 11968 net.cpp:280] relu3 -> conv3 (in-place)
I0912 18:22:26.965338 11968 net.cpp:83] Top shape: 10 384 13 13 (648960)
I0912 18:22:26.965479 11968 net.cpp:125] relu3 needs backward computation.
I0912 18:22:26.965520 11968 net.cpp:66] Creating Layer conv4
I0912 18:22:26.965555 11968 net.cpp:329] conv4 <- conv3
I0912 18:22:26.965634 11968 net.cpp:290] conv4 -> conv4
I0912 18:22:26.968613 11968 net.cpp:83] Top shape: 10 384 13 13 (648960)
I0912 18:22:26.968745 11968 net.cpp:125] conv4 needs backward computation.
I0912 18:22:26.968781 11968 net.cpp:66] Creating Layer relu4
I0912 18:22:26.968819 11968 net.cpp:329] relu4 <- conv4
I0912 18:22:26.968873 11968 net.cpp:280] relu4 -> conv4 (in-place)
I0912 18:22:26.968919 11968 net.cpp:83] Top shape: 10 384 13 13 (648960)
I0912 18:22:26.968992 11968 net.cpp:125] relu4 needs backward computation.
I0912 18:22:26.969028 11968 net.cpp:66] Creating Layer conv5
I0912 18:22:26.969066 11968 net.cpp:329] conv5 <- conv4
I0912 18:22:26.969108 11968 net.cpp:290] conv5 -> conv5
I0912 18:22:26.970634 11968 net.cpp:83] Top shape: 10 256 13 13 (432640)
I0912 18:22:26.970749 11968 net.cpp:125] conv5 needs backward computation.
I0912 18:22:26.970780 11968 net.cpp:66] Creating Layer relu5
I0912 18:22:26.970803 11968 net.cpp:329] relu5 <- conv5
I0912 18:22:26.970827 11968 net.cpp:280] relu5 -> conv5 (in-place)
I0912 18:22:26.970918 11968 net.cpp:83] Top shape: 10 256 13 13 (432640)
I0912 18:22:26.970952 11968 net.cpp:125] relu5 needs backward computation.
I0912 18:22:26.970988 11968 net.cpp:66] Creating Layer pool5
I0912 18:22:26.971233 11968 net.cpp:329] pool5 <- conv5
I0912 18:22:26.971282 11968 net.cpp:290] pool5 -> pool5
I0912 18:22:26.971361 11968 net.cpp:83] Top shape: 10 256 6 6 (92160)
I0912 18:22:26.971397 11968 net.cpp:125] pool5 needs backward computation.
I0912 18:22:26.971434 11968 net.cpp:66] Creating Layer fc6
I0912 18:22:26.971470 11968 net.cpp:329] fc6 <- pool5
I0912 18:22:26.971559 11968 net.cpp:290] fc6 -> fc6
I0912 18:22:27.069502 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.069640 11968 net.cpp:125] fc6 needs backward computation.
I0912 18:22:27.069672 11968 net.cpp:66] Creating Layer relu6
I0912 18:22:27.069694 11968 net.cpp:329] relu6 <- fc6
I0912 18:22:27.069718 11968 net.cpp:280] relu6 -> fc6 (in-place)
I0912 18:22:27.069743 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.069763 11968 net.cpp:125] relu6 needs backward computation.
I0912 18:22:27.069792 11968 net.cpp:66] Creating Layer drop6
I0912 18:22:27.069824 11968 net.cpp:329] drop6 <- fc6
I0912 18:22:27.069875 11968 net.cpp:280] drop6 -> fc6 (in-place)
I0912 18:22:27.069954 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.069990 11968 net.cpp:125] drop6 needs backward computation.
I0912 18:22:27.070144 11968 net.cpp:66] Creating Layer fc7
I0912 18:22:27.070173 11968 net.cpp:329] fc7 <- fc6
I0912 18:22:27.070199 11968 net.cpp:290] fc7 -> fc7
I0912 18:22:27.111870 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.111963 11968 net.cpp:125] fc7 needs backward computation.
I0912 18:22:27.111991 11968 net.cpp:66] Creating Layer relu7
I0912 18:22:27.112015 11968 net.cpp:329] relu7 <- fc7
I0912 18:22:27.112040 11968 net.cpp:280] relu7 -> fc7 (in-place)
I0912 18:22:27.112068 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.112139 11968 net.cpp:125] relu7 needs backward computation.
I0912 18:22:27.112164 11968 net.cpp:66] Creating Layer drop7
I0912 18:22:27.112184 11968 net.cpp:329] drop7 <- fc7
I0912 18:22:27.112213 11968 net.cpp:280] drop7 -> fc7 (in-place)
I0912 18:22:27.112242 11968 net.cpp:83] Top shape: 10 4096 1 1 (40960)
I0912 18:22:27.112263 11968 net.cpp:125] drop7 needs backward computation.
I0912 18:22:27.112285 11968 net.cpp:66] Creating Layer fc8
I0912 18:22:27.112305 11968 net.cpp:329] fc8 <- fc7
I0912 18:22:27.112334 11968 net.cpp:290] fc8 -> fc8
I0912 18:22:27.122274 11968 net.cpp:83] Top shape: 10 1000 1 1 (10000)
I0912 18:22:27.122380 11968 net.cpp:125] fc8 needs backward computation.
I0912 18:22:27.122421 11968 net.cpp:66] Creating Layer prob
I0912 18:22:27.122503 11968 net.cpp:329] prob <- fc8
I0912 18:22:27.122547 11968 net.cpp:290] prob -> prob
I0912 18:22:27.122660 11968 net.cpp:83] Top shape: 10 1000 1 1 (10000)
I0912 18:22:27.122688 11968 net.cpp:125] prob needs backward computation.
I0912 18:22:27.122706 11968 net.cpp:156] This network produces output prob
I0912 18:22:27.122745 11968 net.cpp:402] Collecting Learning Rate and Weight Decay.
I0912 18:22:27.122769 11968 net.cpp:167] Network initialization done.
I0912 18:22:27.122788 11968 net.cpp:168] Memory required for data: 6183480
Done with init
Using CPU Mode
Done with set_mode
Done with set_phase_test
Elapsed time is 0.579487 seconds.
Elapsed time is 3.748376 seconds.

ans =

1 1 1000 10
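
The "Top shape" lines in the log can be cross-checked by hand. The sketch below assumes the usual Caffe output-size rules (convolution rounds down, pooling rounds up) and applies them to conv5 and pool5, whose parameters appear in the listing above. The function names are my own and are not part of Caffe.

```python
def conv_output_size(size, kernel, pad=0, stride=1):
    """Spatial output size of a convolution layer (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_output_size(size, kernel, pad=0, stride=1):
    """Spatial output size of a pooling layer (rounds up)."""
    span = size + 2 * pad - kernel
    return -(-span // stride) + 1  # ceiling division, then + 1


# conv4 produces 13 x 13 maps according to the log ("Top shape: 10 384 13 13").
conv5 = conv_output_size(13, kernel=3, pad=1)    # conv5: pad 1, kernel_size 3
pool5 = pool_output_size(conv5, kernel=3, stride=2)  # pool5: kernel_size 3, stride 2

batch, channels = 10, 256  # input_dim 10, conv5 num_output 256
pool5_count = batch * channels * pool5 * pool5

print("conv5 spatial size:", conv5)
print("pool5 spatial size:", pool5)
print("pool5 element count:", pool5_count)
```

The results can be compared with the log lines "Top shape: 10 256 13 13 (432640)" for conv5 and "Top shape: 10 256 6 6 (92160)" for pool5.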

Training on MNIST data with Caffe

Environment: Ubuntu 12.04, Caffe

cd $CAFFE_ROOT/data/mnist
./

cd $CAFFE_ROOT/examples/mnist
vi lenet_solver.prototxt
Change solver_mode to CPU.

./train_lenet.sh

I0823 08:11:04.501404 15183 caffe.cpp:90] Starting Optimization
I0823 08:11:04.502498 15183 solver.cpp:32] Initializing solver from parameters:
test_iter: 100
test_interval: 500
base_lr: 0.01
display: 100
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "lenet"
solver_mode: CPU
net: "lenet_train_test.prototxt"
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 08:11:04.762663 15183 common.cpp:91] Cannot create Cublas handle. Cublas won't be available.
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 08:11:04.982652 15183 common.cpp:98] Cannot create Curand generator. Curand won't be available.
I0823 08:11:04.982898 15183 solver.cpp:72] Creating training net from net file: lenet_train_test.prototxt
I0823 08:11:04.983438 15183 net.cpp:223] The NetState phase (0) differed from the phase (1) specified by a rule in layer mnist
I0823 08:11:04.983516 15183 net.cpp:223] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0823 08:11:04.983629 15183 net.cpp:38] Initializing net from parameters:

name: "LeNet"
layers {
  top: "data"
  top: "label"
  name: "mnist"
  type: DATA
  data_param {
    source: "mnist-test-leveldb"
    scale: 0.00390625
    batch_size: 100
  }
  include {
    phase: TEST
  }
}
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv1"
  top: "pool1"
  name: "pool1"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool1"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv2"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "ip1"
  name: "ip1"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip1"
  top: "ip1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "ip1"
  top: "ip2"
  name: "ip2"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  name: "accuracy"
  type: ACCURACY
  include {
    phase: TEST
  }
}
layers {
  bottom: "ip2"
  bottom: "label"
  top: "loss"
  name: "loss"
  type: SOFTMAX_LOSS
}
state {
  phase: TEST
}
I0823 08:11:04.524307 2464 net.cpp:66] Creating Layer mnist
I0823 08:11:04.524438 2464 net.cpp:290] mnist -> data
I0823 08:11:04.524711 2464 net.cpp:290] mnist -> label
I0823 08:11:04.524833 2464 data_layer.cpp:179] Opening leveldb mnist-test-leveldb
I0823 08:11:04.617794 2464 data_layer.cpp:262] output data size: 100,1,28,28
I0823 08:11:04.618073 2464 net.cpp:83] Top shape: 100 1 28 28 (78400)
I0823 08:11:04.618237 2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.618285 2464 net.cpp:130] mnist does not need backward computation.
I0823 08:11:04.618414 2464 net.cpp:66] Creating Layer label_mnist_1_split
I0823 08:11:04.618479 2464 net.cpp:329] label_mnist_1_split <- label
I0823 08:11:04.618859 2464 net.cpp:280] label_mnist_1_split -> label (in-place)
I0823 08:11:04.618948 2464 net.cpp:290] label_mnist_1_split -> label_mnist_1_split_1
I0823 08:11:04.618999 2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.619735 2464 net.cpp:83] Top shape: 100 1 1 1 (100)
I0823 08:11:04.619850 2464 net.cpp:130] label_mnist_1_split does not need backward computation.
I0823 08:11:04.619900 2464 net.cpp:66] Creating Layer conv1
I0823 08:11:04.620210 2464 net.cpp:329] conv1 <- data
I0823 08:11:04.620262 2464 net.cpp:290] conv1 -> conv1
I0823 08:11:04.620434 2464 net.cpp:83] Top shape: 100 20 24 24 (1152000)
I0823 08:11:04.620515 2464 net.cpp:125] conv1 needs backward computation.
I0823 08:11:04.620580 2464 net.cpp:66] Creating Layer pool1
I0823 08:11:04.620620 2464 net.cpp:329] pool1 <- conv1
I0823 08:11:04.620663 2464 net.cpp:290] pool1 -> pool1
I0823 08:11:04.621214 2464 net.cpp:83] Top shape: 100 20 12 12 (288000)
I0823 08:11:04.621287 2464 net.cpp:125] pool1 needs backward computation.
I0823 08:11:04.621368 2464 net.cpp:66] Creating Layer conv2
I0823 08:11:04.621604 2464 net.cpp:329] conv2 <- pool1
I0823 08:11:04.621724 2464 net.cpp:290] conv2 -> conv2
I0823 08:11:04.622458 2464 net.cpp:83] Top shape: 100 50 8 8 (320000)
I0823 08:11:04.622563 2464 net.cpp:125] conv2 needs backward computation.
I0823 08:11:04.622607 2464 net.cpp:66] Creating Layer pool2
I0823 08:11:04.622648 2464 net.cpp:329] pool2 <- conv2
I0823 08:11:04.622691 2464 net.cpp:290] pool2 -> pool2
I0823 08:11:04.622730 2464 net.cpp:83] Top shape: 100 50 4 4 (80000)
I0823 08:11:04.623108 2464 net.cpp:125] pool2 needs backward computation.
I0823 08:11:04.623181 2464 net.cpp:66] Creating Layer ip1
I0823 08:11:04.623435 2464 net.cpp:329] ip1 <- pool2
I0823 08:11:04.623749 2464 net.cpp:290] ip1 -> ip1
I0823 08:11:04.628530 2464 net.cpp:83] Top shape: 100 500 1 1 (50000)
I0823 08:11:04.628690 2464 net.cpp:125] ip1 needs backward computation.
I0823 08:11:04.628726 2464 net.cpp:66] Creating Layer relu1
I0823 08:11:04.628751 2464 net.cpp:329] relu1 <- ip1
I0823 08:11:04.628779 2464 net.cpp:280] relu1 -> ip1 (in-place)
I0823 08:11:04.628809 2464 net.cpp:83] Top shape: 100 500 1 1 (50000)
I0823 08:11:04.628835 2464 net.cpp:125] relu1 needs backward computation.
I0823 08:11:04.629266 2464 net.cpp:66] Creating Layer ip2
I0823 08:11:04.629317 2464 net.cpp:329] ip2 <- ip1
I0823 08:11:04.629365 2464 net.cpp:290] ip2 -> ip2
I0823 08:11:04.629861 2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.629947 2464 net.cpp:125] ip2 needs backward computation.
I0823 08:11:04.629992 2464 net.cpp:66] Creating Layer ip2_ip2_0_split
I0823 08:11:04.630108 2464 net.cpp:329] ip2_ip2_0_split <- ip2
I0823 08:11:04.630190 2464 net.cpp:280] ip2_ip2_0_split -> ip2 (in-place)
I0823 08:11:04.630980 2464 net.cpp:290] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0823 08:11:04.631105 2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.631145 2464 net.cpp:83] Top shape: 100 10 1 1 (1000)
I0823 08:11:04.631182 2464 net.cpp:125] ip2_ip2_0_split needs backward computation.
I0823 08:11:04.631342 2464 net.cpp:66] Creating Layer accuracy
I0823 08:11:04.631391 2464 net.cpp:329] accuracy <- ip2
I0823 08:11:04.631862 2464 net.cpp:329] accuracy <- label
I0823 08:11:04.631963 2464 net.cpp:290] accuracy -> accuracy
I0823 08:11:04.632132 2464 net.cpp:83] Top shape: 1 1 1 1 (1)
I0823 08:11:04.632175 2464 net.cpp:125] accuracy needs backward computation.
I0823 08:11:04.632494 2464 net.cpp:66] Creating Layer loss
I0823 08:11:04.632750 2464 net.cpp:329] loss <- ip2_ip2_0_split_1
I0823 08:11:04.632804 2464 net.cpp:329] loss <- label_mnist_1_split_1
I0823 08:11:04.632853 2464 net.cpp:290] loss -> loss
I0823 08:11:04.633280 2464 net.cpp:83] Top shape: 1 1 1 1 (1)
I0823 08:11:04.633471 2464 net.cpp:125] loss needs backward computation.
I0823 08:11:04.633826 2464 net.cpp:156] This network produces output accuracy
I0823 08:11:04.633872 2464 net.cpp:156] This network produces output loss
I0823 08:11:04.634106 2464 net.cpp:402] Collecting Learning Rate and Weight Decay.
I0823 08:11:04.634172 2464 net.cpp:167] Network initialization done.
I0823 08:11:04.634213 2464 net.cpp:168] Memory required for data: 0
I0823 08:11:04.634326 2464 solver.cpp:46] Solver scaffolding done.
I0823 08:11:04.634436 2464 solver.cpp:165] Solving LeNet
I0823 08:11:04.634881 2464 solver.cpp:232] Iteration 0, Testing net (#0)
I0823 08:11:19.170075 2464 solver.cpp:270] Test score #0: 0.1059
I0823 08:11:19.170248 2464 solver.cpp:270] Test score #1: 2.30245
I0823 08:11:19.417044 2464 solver.cpp:195] Iteration 0, loss = 2.30231
I0823 08:11:19.417177 2464 solver.cpp:365] Iteration 0, lr = 0.01
I0823 08:11:43.741911 2464 solver.cpp:195] Iteration 100, loss = 0.317127
I0823 08:11:43.742342 2464 solver.cpp:365] Iteration 100, lr = 0.00992565
I0823 08:12:07.532147 2464 solver.cpp:195] Iteration 200, loss = 0.173197
I0823 08:12:07.532258 2464 solver.cpp:365] Iteration 200, lr = 0.00985258
I0823 08:12:31.409700 2464 solver.cpp:195] Iteration 300, loss = 0.247124
I0823 08:12:31.410508 2464 solver.cpp:365] Iteration 300, lr = 0.00978075
I0823 08:12:54.552777 2464 solver.cpp:195] Iteration 400, loss = 0.102047
I0823 08:12:54.552903 2464 solver.cpp:365] Iteration 400, lr = 0.00971013
I0823 08:13:17.605888 2464 solver.cpp:232] Iteration 500, Testing net (#0)

……
I0823 09:10:29.736903 2464 solver.cpp:270] Test score #0: 0.9887
I0823 09:10:29.737015 2464 solver.cpp:270] Test score #1: 0.0369187
I0823 09:10:30.063771 2464 solver.cpp:195] Iteration 9500, loss = 0.00306773
I0823 09:10:30.063874 2464 solver.cpp:365] Iteration 9500, lr = 0.00606002
I0823 09:10:57.213291 2464 solver.cpp:195] Iteration 9600, loss = 0.00250475
I0823 09:10:57.213827 2464 solver.cpp:365] Iteration 9600, lr = 0.00603682
I0823 09:11:26.278821 2464 solver.cpp:195] Iteration 9700, loss = 0.00243088
I0823 09:11:26.279002 2464 solver.cpp:365] Iteration 9700, lr = 0.00601382
I0823 09:11:53.438747 2464 solver.cpp:195] Iteration 9800, loss = 0.0136355
I0823 09:11:53.439350 2464 solver.cpp:365] Iteration 9800, lr = 0.00599102
I0823 09:12:20.007823 2464 solver.cpp:195] Iteration 9900, loss = 0.00696897
I0823 09:12:20.008005 2464 solver.cpp:365] Iteration 9900, lr = 0.00596843
I0823 09:12:46.920634 2464 solver.cpp:287] Snapshotting to lenet_iter_10000
I0823 09:12:46.930307 2464 solver.cpp:294] Snapshotting solver state to lenet_iter_10000.solverstate
I0823 09:12:47.039417 2464 solver.cpp:213] Iteration 10000, loss = 0.00343354
I0823 09:12:47.039518 2464 solver.cpp:232] Iteration 10000, Testing net (#0)
I0823 09:13:02.146388 2464 solver.cpp:270] Test score #0: 0.9909
I0823 09:13:02.146509 2464 solver.cpp:270] Test score #1: 0.0288982
I0823 09:13:02.146543 2464 solver.cpp:218] Optimization Done.
I0823 09:13:02.146564 2464 caffe.cpp:113] Optimization Done.
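
The lr values printed in the log follow from the solver parameters. With lr_policy "inv", Caffe computes base_lr * (1 + gamma * iter) ^ (-power). The sketch below evaluates that formula with the values from the solver dump (base_lr 0.01, gamma 0.0001, power 0.75) so the result can be compared with the logged rates. The function name inv_lr is my own.

```python
def inv_lr(iteration, base_lr=0.01, gamma=0.0001, power=0.75):
    """Learning rate under the "inv" policy, using the solver values above."""
    return base_lr * (1.0 + gamma * iteration) ** (-power)


# Iterations for which the log prints an lr value.
for iteration in (0, 100, 200, 9500, 9900):
    print("Iteration %d, lr = %.8f" % (iteration, inv_lr(iteration)))
```

The printed values can be checked against the "Iteration N, lr = ..." lines in the log, for example 0.00992565 at iteration 100 and 0.00606002 at iteration 9500.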

The run finally produces lenet_iter_10000, a binary protobuf file. To inspect its contents:

cd /u01/caffe/examples/mnist
jerry@hq:/u01/caffe/examples/mnist$ python
Python 2.7.3 (default, Sep 26 2013, 20:03:06)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
>>> net = caffe.Net('lenet.prototxt', 'lenet_iter_10000')
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
WARNING: Logging before InitGoogleLogging() is written to STDERR
E0823 10:41:06.040340 16020 common.cpp:91] Cannot create Cublas handle. Cublas won't be available.
FATAL: Error inserting nvidia_331 (/lib/modules/3.2.0-57-generic/updates/dkms/nvidia_331.ko): No such device
E0823 10:41:06.242882 16020 common.cpp:98] Cannot create Curand generator. Curand won't be available.
I0823 10:41:06.243221 16020 net.cpp:38] Initializing net from parameters:
name: "LeNet"
layers {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv1"
  top: "pool1"
  name: "pool1"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool1"
  top: "conv2"
  name: "conv2"
  type: CONVOLUTION
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "conv2"
  top: "pool2"
  name: "pool2"
  type: POOLING
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  bottom: "pool2"
  top: "ip1"
  name: "ip1"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip1"
  top: "ip1"
  name: "relu1"
  type: RELU
}
layers {
  bottom: "ip1"
  top: "ip2"
  name: "ip2"
  type: INNER_PRODUCT
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  bottom: "ip2"
  top: "prob"
  name: "prob"
  type: SOFTMAX
}
input: "data"
input_dim: 64
input_dim: 1
input_dim: 28
input_dim: 28
I0823 10:41:06.244067 16020 net.cpp:292] Input 0 -> data
I0823 10:41:06.244173 16020 net.cpp:66] Creating Layer conv1
I0823 10:41:06.244201 16020 net.cpp:329] conv1 <- data
I0823 10:41:06.244228 16020 net.cpp:290] conv1 -> conv1
I0823 10:41:06.245010 16020 net.cpp:83] Top shape: 64 20 24 24 (737280)
I0823 10:41:06.245100 16020 net.cpp:125] conv1 needs backward computation.
I0823 10:41:06.245172 16020 net.cpp:66] Creating Layer pool1
I0823 10:41:06.245210 16020 net.cpp:329] pool1 <- conv1
I0823 10:41:06.245276 16020 net.cpp:290] pool1 -> pool1
I0823 10:41:06.245338 16020 net.cpp:83] Top shape: 64 20 12 12 (184320)
I0823 10:41:06.245378 16020 net.cpp:125] pool1 needs backward computation.
I0823 10:41:06.245426 16020 net.cpp:66] Creating Layer conv2
I0823 10:41:06.245462 16020 net.cpp:329] conv2 <- pool1
I0823 10:41:06.245509 16020 net.cpp:290] conv2 -> conv2
I0823 10:41:06.245834 16020 net.cpp:83] Top shape: 64 50 8 8 (204800)
I0823 10:41:06.245893 16020 net.cpp:125] conv2 needs backward computation.
I0823 10:41:06.245918 16020 net.cpp:66] Creating Layer pool2
I0823 10:41:06.246021 16020 net.cpp:329] pool2 <- conv2
I0823 10:41:06.246088 16020 net.cpp:290] pool2 -> pool2
I0823 10:41:06.246136 16020 net.cpp:83] Top shape: 64 50 4 4 (51200)
I0823 10:41:06.246212 16020 net.cpp:125] pool2 needs backward computation.
I0823 10:41:06.246263 16020 net.cpp:66] Creating Layer ip1
I0823 10:41:06.246296 16020 net.cpp:329] ip1 <- pool2
I0823 10:41:06.246352 16020 net.cpp:290] ip1 -> ip1
I0823 10:41:06.250891 16020 net.cpp:83] Top shape: 64 500 1 1 (32000)
I0823 10:41:06.251027 16020 net.cpp:125] ip1 needs backward computation.
I0823 10:41:06.251073 16020 net.cpp:66] Creating Layer relu1
I0823 10:41:06.251111 16020 net.cpp:329] relu1 <- ip1
I0823 10:41:06.251149 16020 net.cpp:280] relu1 -> ip1 (in-place)
I0823 10:41:06.251196 16020 net.cpp:83] Top shape: 64 500 1 1 (32000)
I0823 10:41:06.251231 16020 net.cpp:125] relu1 needs backward computation.
I0823 10:41:06.251268 16020 net.cpp:66] Creating Layer ip2
I0823 10:41:06.251302 16020 net.cpp:329] ip2 <- ip1
I0823 10:41:06.251461 16020 net.cpp:290] ip2 -> ip2
I0823 10:41:06.251601 16020 net.cpp:83] Top shape: 64 10 1 1 (640)
I0823 10:41:06.251646 16020 net.cpp:125] ip2 needs backward computation.
I0823 10:41:06.251682 16020 net.cpp:66] Creating Layer prob
I0823 10:41:06.251716 16020 net.cpp:329] prob <- ip2
I0823 10:41:06.251757 16020 net.cpp:290] prob -> prob
I0823 10:41:06.252317 16020 net.cpp:83] Top shape: 64 10 1 1 (640)
I0823 10:41:06.252887 16020 net.cpp:125] prob needs backward computation.
I0823 10:41:06.252924 16020 net.cpp:156] This network produces output prob
I0823 10:41:06.252977 16020 net.cpp:402] Collecting Learning Rate and Weight Decay.
I0823 10:41:06.253016 16020 net.cpp:167] Network initialization done.
I0823 10:41:06.253099 16020 net.cpp:168] Memory required for data: 200704
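
The "Top shape" lines in the log above can be cross-checked by hand. The following is a minimal Python 3 sketch, not part of Caffe: it assumes no padding and floor division (which gives the same result as the logged values for these sizes), and the helper names out_size, lenet_shapes and count are made up for illustration. The relu1 and prob layers are left out because they do not change the shape of their input.

```python
# Sketch: recompute the "Top shape" values from the layer parameters.
# Assumption: no padding, output size = (size - kernel) // stride + 1.

def out_size(size, kernel, stride):
    """Spatial output size of a conv/pool layer without padding."""
    return (size - kernel) // stride + 1

def lenet_shapes(batch=64, channels=1, height=28, width=28):
    # (name, kind, num_output, kernel_size, stride), copied from the net definition
    layers = [
        ("conv1", "conv", 20, 5, 1),
        ("pool1", "pool", None, 2, 2),
        ("conv2", "conv", 50, 5, 1),
        ("pool2", "pool", None, 2, 2),
        ("ip1", "ip", 500, None, None),
        ("ip2", "ip", 10, None, None),
    ]
    shape = (batch, channels, height, width)
    shapes = {"data": shape}
    for name, kind, num_output, kernel, stride in layers:
        n, c, h, w = shape
        if kind == "ip":
            # Inner product layers flatten their input to num_output values
            shape = (n, num_output, 1, 1)
        else:
            c = num_output if kind == "conv" else c
            shape = (n, c, out_size(h, kernel, stride), out_size(w, kernel, stride))
        shapes[name] = shape
    return shapes

def count(shape):
    """Number of elements in a blob of the given shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total

shapes = lenet_shapes()
for name, shape in shapes.items():
    print(name, shape, count(shape))

# Size of the input blob in bytes, assuming 4-byte floats
print("data bytes:", count(shapes["data"]) * 4)
```

The last value, 200704, matches the "Memory required for data" figure in the log, which suggests that figure covers the 64 x 1 x 28 x 28 input blob stored as 4-byte floats.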

View the network definition:
>>> print open('lenet.prototxt').read()
name: "LeNet"
input: "data"
input_dim: 64
input_dim: 1
input_dim: 28
input_dim: 28
layers {
  name: "conv1"
  type: CONVOLUTION
  bottom: "data"
  top: "conv1"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "pool1"
  type: POOLING
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  name: "conv2"
  type: CONVOLUTION
  bottom: "pool1"
  top: "conv2"
  blobs_lr: 1
  blobs_lr: 2
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "pool2"
  type: POOLING
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layers {
  name: "ip1"
  type: INNER_PRODUCT
  bottom: "pool2"
  top: "ip1"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "relu1"
  type: RELU
  bottom: "ip1"
  top: "ip1"
}
layers {
  name: "ip2"
  type: INNER_PRODUCT
  bottom: "ip1"
  top: "ip2"
  blobs_lr: 1
  blobs_lr: 2
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layers {
  name: "prob"
  type: SOFTMAX
  bottom: "ip2"
  top: "prob"
}

Inspect the parameters of each layer:
>>> dir(net)
>>> net.params["ip2"][0].data
array([[[[-0.00913269, 0.06095703, -0.09719526, …, -0.01292357,
-0.02721527, -0.04921406],
[ 0.07316435, -0.10016691, -0.00194797, …, -0.02357075,
-0.03735601, -0.12467863],
[ 0.11690015, -0.13771389, -0.04632974, …, 0.02967362,
-0.11868649, 0.01114164],
…,
[-0.18345536, -0.01772851, 0.06773216, …, -0.00851034,
-0.02590596, 0.01125562],
[-0.16715027, 0.03873322, 0.03800297, …, 0.0236346 ,
-0.01642762, 0.04072023],
[ 0.10814335, -0.04631414, 0.09708735, …, -0.0280726 ,
-0.14074558, 0.14641024]]]], dtype=float32)
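
As a sanity check on what net.params holds, the sizes of the weight and bias blobs can be worked out from the network definition. This is a hedged Python 3 sketch rather than pycaffe code: conv_params and ip_params are hypothetical helpers, and it assumes that index 0 of each params entry is the weight blob and index 1 the bias blob.

```python
# Sketch: expected number of weights and biases per layer of the LeNet definition.

def conv_params(in_channels, num_output, kernel_size):
    """(weights, biases) for a convolution layer with square kernels."""
    return num_output * in_channels * kernel_size * kernel_size, num_output

def ip_params(in_features, num_output):
    """(weights, biases) for an inner product (fully connected) layer."""
    return num_output * in_features, num_output

param_sizes = {
    "conv1": conv_params(1, 20, 5),
    "conv2": conv_params(20, 50, 5),
    "ip1": ip_params(50 * 4 * 4, 500),  # pool2 output is 50 x 4 x 4
    "ip2": ip_params(500, 10),
}

for name, (weights, biases) in param_sizes.items():
    print(name, "weights:", weights, "biases:", biases)

total = sum(weights + biases for weights, biases in param_sizes.values())
print("total parameters:", total)
```

Under these assumptions the ip2 weight array printed above would hold 10 x 500 = 5000 values.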

Further features will be explored in later posts.

Using BVLC Caffe

Environment: Ubuntu 12.04, CUDA 6.0

1. Install the prerequisites

pip install -r /u01/caffe/python/requirements.txt
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev

# gflags
wget https://github.com/schuhschuh/gflags/archive/master.zip
unzip master.zip
cd gflags-master
mkdir build && cd build
CXXFLAGS="-fPIC" cmake .. -DGFLAGS_NAMESPACE=google
make && make install

# glog
wget https://google-glog.googlecode.com/files/glog-0.3.3.tar.gz
tar zxvf glog-0.3.3.tar.gz
cd glog-0.3.3
./configure
make && make install

# lmdb
git clone git://gitorious.org/mdb/mdb.git
cd mdb/libraries/liblmdb
make && make install

2. Configure the build

cp Makefile.config.example Makefile.config
vi Makefile.config and uncomment the following line (the virtual machine has no GPU support):
CPU_ONLY := 1

3. Compile; the build fails with the following errors:

jerry@hq:/u01/caffe$ make
g++ .build_release/tools/convert_imageset.o .build_release/lib/libcaffe.a -o .build_release/tools/convert_imageset.bin -fPIC -DCPU_ONLY -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I.build_release/src -I./src -I./include -Wall -Wno-sign-compare -L/usr/lib -L/usr/local/lib -L/usr/lib -lglog -lgflags -lpthread -lprotobuf -lleveldb -lsnappy -llmdb -lboost_system -lhdf5_hl -lhdf5 -lopencv_core -lopencv_highgui -lopencv_imgproc -lcblas -latlas
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<float>::Update()’:
blob.cpp:(.text._ZN5caffe4BlobIfE6UpdateEv[_ZN5caffe4BlobIfE6UpdateEv]+0x43): undefined reference to `void caffe::caffe_gpu_axpy<float>(int, float, float const*, float*)’
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<float>::asum_data() const’:
blob.cpp:(.text._ZNK5caffe4BlobIfE9asum_dataEv[_ZNK5caffe4BlobIfE9asum_dataEv]+0x3f): undefined reference to `void caffe::caffe_gpu_asum<float>(int, float const*, float*)’
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<float>::asum_diff() const’:
blob.cpp:(.text._ZNK5caffe4BlobIfE9asum_diffEv[_ZNK5caffe4BlobIfE9asum_diffEv]+0x3f): undefined reference to `void caffe::caffe_gpu_asum<float>(int, float const*, float*)’
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<double>::Update()’:
blob.cpp:(.text._ZN5caffe4BlobIdE6UpdateEv[_ZN5caffe4BlobIdE6UpdateEv]+0x43): undefined reference to `void caffe::caffe_gpu_axpy<double>(int, double, double const*, double*)’
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<double>::asum_data() const’:
blob.cpp:(.text._ZNK5caffe4BlobIdE9asum_dataEv[_ZNK5caffe4BlobIdE9asum_dataEv]+0x3f): undefined reference to `void caffe::caffe_gpu_asum<double>(int, double const*, double*)’
.build_release/lib/libcaffe.a(blob.o): In function `caffe::Blob<double>::asum_diff() const’:
blob.cpp:(.text._ZNK5caffe4BlobIdE9asum_diffEv[_ZNK5caffe4BlobIdE9asum_diffEv]+0x3f): undefined reference to `void caffe::caffe_gpu_asum<double>(int, double const*, double*)’
.build_release/lib/libcaffe.a(common.o): In function `caffe::GlobalInit(int*, char***)’:
common.cpp:(.text+0x12a): undefined reference to `gflags::ParseCommandLineFlags(int*, char***, bool)’
.build_release/lib/libcaffe.a(common.o): In function `caffe::Caffe::Caffe()’:
common.cpp:(.text+0x179): undefined reference to `cublasCreate_v2′
common.cpp:(.text+0x1cb): undefined reference to `curandCreateGenerator’
common.cpp:(.text+0x22d): undefined reference to `curandSetPseudoRandomGeneratorSeed’
.build_release/lib/libcaffe.a(common.o): In function `caffe::Caffe::~Caffe()’:
common.cpp:(.text+0x434): undefined reference to `cublasDestroy_v2′
common.cpp:(.text+0x456): undefined reference to `curandDestroyGenerator’
.build_release/lib/libcaffe.a(common.o): In function `caffe::Caffe::DeviceQuery()’:
common.cpp:(.text+0x5f8): undefined reference to `cudaGetDevice’
common.cpp:(.text+0x616): undefined reference to `cudaGetDeviceProperties’
common.cpp:(.text+0xd22): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(common.o): In function `caffe::Caffe::SetDevice(int)’:
common.cpp:(.text+0x1222): undefined reference to `cudaGetDevice’
common.cpp:(.text+0x1247): undefined reference to `cudaSetDevice’
common.cpp:(.text+0x127b): undefined reference to `cublasDestroy_v2′
common.cpp:(.text+0x12a9): undefined reference to `curandDestroyGenerator’
common.cpp:(.text+0x12ce): undefined reference to `cublasCreate_v2′
common.cpp:(.text+0x12fc): undefined reference to `curandCreateGenerator’
common.cpp:(.text+0x1330): undefined reference to `curandSetPseudoRandomGeneratorSeed’
common.cpp:(.text+0x1729): undefined reference to `cudaGetErrorString’
common.cpp:(.text+0x1882): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(common.o): In function `caffe::Caffe::set_random_seed(unsigned int)’:
common.cpp:(.text+0x1aff): undefined reference to `curandDestroyGenerator’
common.cpp:(.text+0x1b2d): undefined reference to `curandCreateGenerator’
common.cpp:(.text+0x1b5c): undefined reference to `curandSetPseudoRandomGeneratorSeed’
.build_release/lib/libcaffe.a(math_functions.o): In function `void caffe::caffe_copy<double>(int, double const*, double*)’:
math_functions.cpp:(.text._ZN5caffe10caffe_copyIdEEviPKT_PS1_[_ZN5caffe10caffe_copyIdEEviPKT_PS1_]+0x6c): undefined reference to `cudaMemcpy’
math_functions.cpp:(.text._ZN5caffe10caffe_copyIdEEviPKT_PS1_[_ZN5caffe10caffe_copyIdEEviPKT_PS1_]+0x160): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(math_functions.o): In function `void caffe::caffe_copy<int>(int, int const*, int*)’:
math_functions.cpp:(.text._ZN5caffe10caffe_copyIiEEviPKT_PS1_[_ZN5caffe10caffe_copyIiEEviPKT_PS1_]+0x6c): undefined reference to `cudaMemcpy’
math_functions.cpp:(.text._ZN5caffe10caffe_copyIiEEviPKT_PS1_[_ZN5caffe10caffe_copyIiEEviPKT_PS1_]+0x160): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(math_functions.o): In function `void caffe::caffe_copy<unsigned int>(int, unsigned int const*, unsigned int*)’:
math_functions.cpp:(.text._ZN5caffe10caffe_copyIjEEviPKT_PS1_[_ZN5caffe10caffe_copyIjEEviPKT_PS1_]+0x6c): undefined reference to `cudaMemcpy’
math_functions.cpp:(.text._ZN5caffe10caffe_copyIjEEviPKT_PS1_[_ZN5caffe10caffe_copyIjEEviPKT_PS1_]+0x160): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(math_functions.o): In function `void caffe::caffe_copy<float>(int, float const*, float*)’:
math_functions.cpp:(.text._ZN5caffe10caffe_copyIfEEviPKT_PS1_[_ZN5caffe10caffe_copyIfEEviPKT_PS1_]+0x6c): undefined reference to `cudaMemcpy’
math_functions.cpp:(.text._ZN5caffe10caffe_copyIfEEviPKT_PS1_[_ZN5caffe10caffe_copyIfEEviPKT_PS1_]+0x160): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(syncedmem.o): In function `caffe::SyncedMemory::cpu_data()’:
syncedmem.cpp:(.text+0x26): undefined reference to `caffe::caffe_gpu_memcpy(unsigned long, void const*, void*)’
.build_release/lib/libcaffe.a(syncedmem.o): In function `caffe::SyncedMemory::mutable_cpu_data()’:
syncedmem.cpp:(.text+0x136): undefined reference to `caffe::caffe_gpu_memcpy(unsigned long, void const*, void*)’
.build_release/lib/libcaffe.a(syncedmem.o): In function `caffe::SyncedMemory::~SyncedMemory()’:
syncedmem.cpp:(.text+0x1c1): undefined reference to `cudaFree’
syncedmem.cpp:(.text+0x20f): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(syncedmem.o): In function `caffe::SyncedMemory::mutable_gpu_data()’:
syncedmem.cpp:(.text+0x29a): undefined reference to `caffe::caffe_gpu_memcpy(unsigned long, void const*, void*)’
syncedmem.cpp:(.text+0x2b9): undefined reference to `cudaMalloc’
syncedmem.cpp:(.text+0x2e5): undefined reference to `cudaMemset’
syncedmem.cpp:(.text+0x321): undefined reference to `cudaGetErrorString’
syncedmem.cpp:(.text+0x379): undefined reference to `cudaMalloc’
syncedmem.cpp:(.text+0x3c2): undefined reference to `cudaGetErrorString’
syncedmem.cpp:(.text+0x435): undefined reference to `cudaGetErrorString’
.build_release/lib/libcaffe.a(syncedmem.o): In function `caffe::SyncedMemory::gpu_data()’:
syncedmem.cpp:(.text+0x4ca): undefined reference to `caffe::caffe_gpu_memcpy(unsigned long, void const*, void*)’
syncedmem.cpp:(.text+0x4e9): undefined reference to `cudaMalloc’
syncedmem.cpp:(.text+0x515): undefined reference to `cudaMemset’
syncedmem.cpp:(.text+0x549): undefined reference to `cudaMalloc’
syncedmem.cpp:(.text+0x592): undefined reference to `cudaGetErrorString’
syncedmem.cpp:(.text+0x608): undefined reference to `cudaGetErrorString’
syncedmem.cpp:(.text+0x678): undefined reference to `cudaGetErrorString’
collect2: error: ld returned 1 exit status
make: *** [.build_release/tools/convert_imageset.bin] Error 1

Many of the undefined references are GPU (CUDA) symbols, yet the build fails even though the CPU-only option is enabled.
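
To see at a glance which symbols the linker is missing, the output can be grouped by symbol. The following is a rough Python 3 sketch under stated assumptions: summarize_undefined is a made-up helper, and the rule for deciding what counts as GPU-related (names containing cuda, cublas, curand or caffe_gpu_) is only a heuristic.

```python
import re
from collections import Counter

# Matches the symbol in "undefined reference to `symbol'". The closing quote may
# be a plain or a typographic character, depending on how the log was copied.
UNDEFINED_REF = re.compile(r"undefined reference to [`'‘](.+?)['’′]\s*$")

def summarize_undefined(log_text):
    """Count undefined symbols, split into GPU-related and everything else."""
    gpu, other = Counter(), Counter()
    for line in log_text.splitlines():
        match = UNDEFINED_REF.search(line)
        if not match:
            continue
        symbol = match.group(1)
        # Heuristic (assumption): CUDA runtime, cuBLAS, cuRAND and caffe_gpu_* names
        is_gpu = re.search(r"cuda|cublas|curand|caffe_gpu_", symbol) is not None
        (gpu if is_gpu else other)[symbol] += 1
    return gpu, other

# A few lines in the style of the linker output above
sample = """\
common.cpp:(.text+0x12a): undefined reference to `gflags::ParseCommandLineFlags(int*, char***, bool)'
common.cpp:(.text+0x179): undefined reference to `cublasCreate_v2'
syncedmem.cpp:(.text+0x1c1): undefined reference to `cudaFree’
common.cpp:(.text+0x12ce): undefined reference to `cublasCreate_v2'
"""

gpu, other = summarize_undefined(sample)
print("GPU-related:", dict(gpu))
print("Other:", dict(other))
```

In the full output above, the one symbol that does not look GPU-related is gflags::ParseCommandLineFlags. That is consistent with gflags having been built with -DGFLAGS_NAMESPACE=google in step 1.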

4. Edit Makefile.config: comment out CPU_ONLY := 1 and set CUSTOM_CXX := g++-4.6

sudo apt-get install gcc-4.6 g++-4.6 gcc-4.6-multilib g++-4.6-multilib

Edit these two files:
vi src/caffe/common.cpp
vi tools/caffe.cpp
In both, use the google namespace in place of gflags.

make clean

make

make pycaffe
g++-4.6 -shared -o python/caffe/_caffe.so python/caffe/_caffe.cpp \\
.build_release/lib/libcaffe.a -fPIC -DNDEBUG -O2 -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/local/include -I.build_release/src -I./src -I./include -I/usr/local/cuda/include -Wall -Wno-sign-compare -L/usr/lib -L/usr/local/lib -L/usr/lib -L/usr/local/cuda/lib64 -L/usr/local/cuda/lib -lcudart -lcublas -lcurand -lglog -lgflags -lpthread -lprotobuf -lleveldb -lsnappy -llmdb -lboost_system -lhdf5_hl -lhdf5 -lopencv_core -lopencv_highgui -lopencv_imgproc -lcblas -latlas -lboost_python -lpython2.7

touch python/caffe/proto/__init__.py
protoc --proto_path=src --python_out=python src/caffe/proto/caffe_pretty_print.proto

protoc --proto_path=src --python_out=python src/caffe/proto/caffe.proto

Then run: sudo cp /u01/caffe/python/caffe/ /usr/local/lib/python2.7/dist-packages/ -Rf

Running DeepLearnToolbox with Octave

Environment: Ubuntu 12.04, Octave 3.8.2

DeepLearnToolbox is a deep learning toolbox for MATLAB/Octave. It includes Deep Belief Nets, Stacked Autoencoders, Convolutional Neural Nets, Convolutional Autoencoders and vanilla Neural Nets.

The steps below run one of its bundled test programs:

1. Download DeepLearnToolbox
git clone https://github.com/rasmusbergpalm/DeepLearnToolbox.git

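2. Start Octave

3. Add DeepLearnToolbox to the load path
octave:1> addpath(genpath('DeepLearnToolbox'));

4. Paste the contents of ./DeepLearnToolbox/tests/test_example_SAE.m into the Octave prompt and add "end;" as the last line

5. Run the function that was just defined
octave:8> test_example_SAE;

6. The run displays the plotted figure and prints the results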
octave:11> test_example_SAE
warning: load: file found in load path
Training AE 1/1
epoch 1/1. Took 21.469 seconds. Mini-batch mean squared error on training set is 10.61; Full-batch train err = 11.049844
epoch 1/1. Took 8.4329 seconds. Mini-batch mean squared error on training set is 0.2197; Full-batch train err = 0.094171

Using cuda-convnet2

Environment: Ubuntu 12.04, cuda-convnet2, CUDA 6

Installation steps:

1. Install the required libraries

sudo apt-get install python-dev python-numpy python-scipy python-magic python-matplotlib libatlas-base-dev libjpeg-dev libopencv-dev git

2. Install CUDA 6.0
Download cuda-repo-ubuntu1204_6.5-14_amd64.deb from http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1204/x86_64/cuda-repo-ubuntu1204_6.5-14_amd64.deb
$ wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1204/x86_64/cuda-repo-ubuntu1204_6.5-14_amd64.deb
$ sudo dpkg -i cuda-repo-ubuntu1204_6.0-37_amd64.deb
$ sudo apt-get update
$ sudo apt-get install cuda

3. Set the CUDA environment variables

vi ~/.bashrc

export CUDA_HOME=/usr/local/cuda-6.0
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64

PATH=${CUDA_HOME}/bin:${PATH}
export PATH

4. Download the cuda-convnet2 source code

git clone https://code.google.com/p/cuda-convnet2/

5. Build the source

jerry@hq:/u01/cuda-convnet2$ sh build.sh
mkdir -p ./bin//src
g++ -O3 -c -fPIC -DNUMPY_INTERFACE -I./include -I/usr/include/python2.7 -I/usr/lib/python2.7/dist-packages/numpy/core/include/numpy/ src/matrix.cpp -o ./bin//src/matrix.o
In file included from /usr/include/python2.7/Python.h:8:0,
from src/../include/matrix.h:22,
from src/matrix.cpp:17:
/usr/include/python2.7/pyconfig.h:1161:0: warning: “_POSIX_C_SOURCE” redefined [enabled by default]
#define _POSIX_C_SOURCE 200112L
^
In file included from /usr/include/stdlib.h:25:0,
from src/../include/matrix_funcs.h:20,
from src/../include/matrix.h:20,
from src/matrix.cpp:17:
/usr/include/features.h:164:0: note: this is the location of the previous definition
# define _POSIX_C_SOURCE 200809L
^
In file included from /usr/include/python2.7/Python.h:8:0,
from src/../include/matrix.h:22,
from src/matrix.cpp:17:
/usr/include/python2.7/pyconfig.h:1183:0: warning: “_XOPEN_SOURCE” redefined [enabled by default]
#define _XOPEN_SOURCE 600
^
In file included from /usr/include/stdlib.h:25:0,
from src/../include/matrix_funcs.h:20,
from src/../include/matrix.h:20,
from src/matrix.cpp:17:
/usr/include/features.h:166:0: note: this is the location of the previous definition
# define _XOPEN_SOURCE 700
^
cd ./bin/ && g++ -O3 -DNUMPY_INTERFACE -shared -Wl,-no-undefined -o libutilpy.so src/matrix.o -L/usr/lib/atlas-base -latlas -lcblas -lpython2.7
ln -sf ./bin//libutilpy.so .
mkdir -p ./bin/release
mkdir -p ./obj/release/src
mkdir -p ./bin/release
mkdir -p ./obj/release/src
mkdir -p ./bin/release
mkdir -p ./obj/release/src
mkdir -p ./bin//src
g++ -O3 -c -fPIC -I./include -I/usr/include/python2.7 src/pyext.cpp -o ./bin//src/pyext.o
In file included from /usr/include/python2.7/Python.h:8:0,
from src/../include/pyext.h:23,
from src/pyext.cpp:17:
/usr/include/python2.7/pyconfig.h:1161:0: warning: “_POSIX_C_SOURCE” redefined [enabled by default]
#define _POSIX_C_SOURCE 200112L
^
In file included from /usr/include/stdio.h:28:0,
from src/../include/pyext.h:20,
from src/pyext.cpp:17:
/usr/include/features.h:164:0: note: this is the location of the previous definition
# define _POSIX_C_SOURCE 200809L
^
In file included from /usr/include/python2.7/Python.h:8:0,
from src/../include/pyext.h:23,
from src/pyext.cpp:17:
/usr/include/python2.7/pyconfig.h:1183:0: warning: “_XOPEN_SOURCE” redefined [enabled by default]
#define _XOPEN_SOURCE 600
^
In file included from /usr/include/stdio.h:28:0,
from src/../include/pyext.h:20,
from src/pyext.cpp:17:
/usr/include/features.h:166:0: note: this is the location of the previous definition
# define _XOPEN_SOURCE 700
^
cd ./bin/ && g++ -O3 -shared -Wl,-no-undefined -o _MakeDataPyExt.so src/pyext.o -L/usr/local/cuda/lib64 `pkg-config --libs python` `pkg-config --libs opencv` -lpthread
ln -sf ./bin//_MakeDataPyExt.so .

6. Run the training script
jerry@hq:/u01/cuda-convnet2$ python convnet.py --data-path=/u01/lisa/data/cifar10/cifar-10-batches-py --save-path=/u01/jerry/tmp --test-range=5 --train-range=1-4 --layer-def=./layers/layers-cifar10-11pct.cfg --layer-params=./layers/layer-params-cifar10-11pct.cfg --data-provider=cifar-cropped --test-freq=13 --epochs=100
Option --gpu (GPU override) not supplied
convnet.py usage:
Option Description Default
[--check-grads <0/1> ] - Check gradients and quit? [0]
[--color-noise <float> ] - Add PCA noise to color channels with given scale [0]
[--conserve-mem <0/1> ] - Conserve GPU memory (slower)? [0]
[--conv-to-local <string,...> ] - Convert given conv layers to unshared local []
[--epochs <int> ] - Number of epochs [50000]
[--feature-path <string> ] - Write test data features to this path (to be used with --write-features) []
[--force-save <0/1> ] - Force save before quitting [0]
[--inner-size <int> ] - Cropped DP: crop size (0 = don't crop) [0]
[--layer-path <string> ] - Layer file path prefix []
[--load-file <string> ] - Load file []
[--logreg-name <string> ] - Logreg cost layer name (for --test-out) []
[--mini <int> ] - Minibatch size [128]
[--multiview-test <0/1> ] - Cropped DP: test on multiple patches? [0]
[--scalar-mean <float> ] - Subtract this scalar from image (-1 = don't) [-1]
[--test-freq <int> ] - Testing frequency [57]
[--test-one <0/1> ] - Test on one batch at a time? [1]
[--test-only <0/1> ] - Test and quit? [0]
[--test-out <string> ] - Output test case predictions to given path []
[--unshare-weights <string,...>] - Unshare weight matrices in given layers []
[--write-features <string> ] - Write test data features from given layer []
--data-path <string> - Data path
--data-provider <string> - Data provider
--gpu <int,...> - GPU override
--layer-def <string> - Layer definition file
--layer-params <string> - Layer parameter file
--save-file <string> - Save file override
--save-path <string> - Save path
--test-range <int[-int]> - Data batch range: testing
--train-range <int[-int]> - Data batch range: training

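Because this Ubuntu system is a virtual machine hosted on Windows, it cannot use the host's built-in or external GPU, so the program cannot run. A bit of a pity.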

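Training a custom clothing detection classifier with OpenCV

Environment: Ubuntu 12.04, OpenCV, 光影魔术手 4, ObjectMarker

The steps are as follows:

1. Collect images of the object to detect (positive samples) and background images (negative samples), then use 光影魔术手 to batch-convert them to a fixed size in bmp format

2. Use ObjectMarker to mark the object in each positive image and generate info.txt, which looks like this: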
rawdata/136e2b8aef176609829e23e54081db6d.bmp 1 149 36 127 193
rawdata/203d730e146b11af60eef38b5ee280a1.bmp 1 52 92 175 201
rawdata/21b7e18952a3af2975fb407510af943e.bmp 1 82 102 135 182
rawdata/2f9ed3b7e3c87b0890bcdd94fae54415.bmp 1 45 28 190 230
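
Each info.txt line holds an image path, the number of marked objects, and then x, y, width and height for each object. The Python 3 sketch below parses that layout; parse_info_line and Box are hypothetical names, and it assumes image paths contain no spaces.

```python
from collections import namedtuple

# One marked object: top-left corner plus size, in pixels
Box = namedtuple("Box", "x y width height")

def parse_info_line(line):
    """Parse '<image path> <count> <x y w h> ...' into (path, [Box, ...])."""
    fields = line.split()
    path, count = fields[0], int(fields[1])
    numbers = [int(value) for value in fields[2:]]
    if len(numbers) != 4 * count:
        raise ValueError("expected %d box values, got %d" % (4 * count, len(numbers)))
    boxes = [Box(*numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return path, boxes

path, boxes = parse_info_line(
    "rawdata/136e2b8aef176609829e23e54081db6d.bmp 1 149 36 127 193")
print(path, boxes)
```

A check like this can catch lines where the object count and the number of coordinates disagree before the file is handed to opencv_createsamples.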

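3. Copy info.txt and the positive and negative images into the Ubuntu environment.

4. In the background image directory, generate a bg.txt file that lists the image file names: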
jerry@hq:~$ more neg_pic/bg3.txt
c15108780ef9f873650f3cbd0259fa6f.bmp
e4f1dce1ed15262848f0b9e0efb0ad56.bmp
6c7fc84c405201ba2a92c38e5f828966.bmp
a00f7dc3c16e8b31d16bb9122f34aa0c.bmp
cad04ea931db04381bb5fed9cb8dea61.bmp
056f0cd1d04d62dfa4ab9fdb0b6c052b.bmp
95f5c539316d9261a83b3a98d90eac58.bmp
08178909a71d88750fc35d5b87e838fd.bmp
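
The listing file can be produced by a small script rather than by hand. Below is a minimal Python 3 sketch; write_background_list is a made-up helper, and the demo runs on a throwaway directory with empty placeholder files rather than on real images.

```python
import os
import tempfile

def write_background_list(image_dir, list_name="bg.txt", extension=".bmp"):
    """Write the names of all images in image_dir to a list file, one per line."""
    names = sorted(
        name for name in os.listdir(image_dir)
        if name.lower().endswith(extension)
    )
    list_path = os.path.join(image_dir, list_name)
    with open(list_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return list_path, names

# Demo on a temporary directory with empty placeholder files
demo_dir = tempfile.mkdtemp()
for name in ("b.bmp", "a.bmp", "notes.txt"):
    open(os.path.join(demo_dir, name), "w").close()

list_path, names = write_background_list(demo_dir)
print(list_path, names)
```

For the layout used in this post, the call would be write_background_list("neg_pic").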

5. Prepare the training data
jerry@hq:~$ opencv_createsamples -info info.txt -vec pic.vec -w 30 -h 30 -b neg_pic/bg.txt -num 26

6. Train the Haar classifier
jerry@hq:~$ mkdir cascade_data
jerry@hq:~$ opencv_traincascade -data cascade_data -vec pic.vec -bg neg_pic/bg3.txt -w 30 -h 30 -numPos 26 -numNeg 50 -numStages 10

Pylearn2 Stacked Autoencoders example

Environment: Ubuntu 12.04

1. First, download the training data

cd /u01/lisa/data/mnist
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

gunzip train-images-idx3-ubyte.gz

wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz

gunzip train-labels-idx1-ubyte.gz

wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz

gunzip t10k-images-idx3-ubyte.gz

wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz

gunzip t10k-labels-idx1-ubyte.gz
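
After unpacking, a quick look at the file header confirms that a download is intact. The files use the IDX layout: a big-endian 32-bit magic number whose low byte gives the number of dimensions, followed by one 32-bit size per dimension. The Python 3 sketch below reads such a header; read_idx_header is a hypothetical helper, and the demo uses a synthetic header instead of a real file.

```python
import struct

def read_idx_header(raw):
    """Return (magic, dims) from the leading bytes of an IDX file."""
    magic, = struct.unpack(">i", raw[:4])
    ndim = magic & 0xFF  # low byte of the magic number = number of dimensions
    dims = struct.unpack(">" + "i" * ndim, raw[4:4 + 4 * ndim])
    return magic, dims

# Synthetic header shaped like train-images-idx3-ubyte (no real data is read here)
fake_header = struct.pack(">iiii", 2051, 60000, 28, 28)
magic, dims = read_idx_header(fake_header)
print(magic, dims)
```

For a real file, the first 16 bytes can be passed in, for example read_idx_header(open('train-images-idx3-ubyte', 'rb').read(16)).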

2. Fill in the dae_l1.yaml file

Start an interactive Python session:

layer1_yaml = open('dae_l1.yaml', 'r').read()

hyper_params_l1 = {'train_stop' : 50000, 'batch_size' : 100, 'monitoring_batches' : 5, 'nhid' : 500, 'max_epochs' : 10, 'save_path' : '.'}

layer1_yaml = layer1_yaml % (hyper_params_l1)

print layer1_yaml

Overwrite the contents of dae_l1.yaml with the printed output.
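
The substitution in this step is plain Python %-formatting with a dictionary: every %(name)i or %(name)s placeholder in the yaml text is replaced by the value stored under that key. The Python 3 sketch below shows the mechanism on a short made-up template, not the real contents of dae_l1.yaml.

```python
# Hypothetical three-line template; the real dae_l1.yaml is longer
template = (
    "nhid: %(nhid)i\n"
    "batch_size: %(batch_size)i\n"
    "save_path: '%(save_path)s/dae_l1.pkl'\n"
)

# Keys the template does not use (max_epochs here) are simply ignored
hyper_params = {"nhid": 500, "batch_size": 100, "save_path": ".", "max_epochs": 10}

filled = template % hyper_params
print(filled)
```

A placeholder whose key is missing from the dictionary raises KeyError, which is a quick way to spot a hyperparameter that was forgotten.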

3. Change to the example script directory
cd ~/pylearn2/pylearn2/scripts/tutorials/stacked_autoencoders
Run the training script:
python ~/pylearn2/pylearn2/scripts/train.py dae_l1.yaml
The output log is as follows:
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST’s iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
/home/jerry/.local/lib/python2.7/site-packages/theano/sandbox/rng_mrg.py:1183: UserWarning: MRG_RandomStreams Can’t determine #streams from size (Shape.0), guessing 60*256
nstreams = self.n_streams(size)
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
/home/jerry/pylearn2/pylearn2/models/model.py:71: UserWarning: The <class ‘pylearn2.models.autoencoder.DenoisingAutoencoder’> Model subclass seems not to call the Model constructor. This behavior may be considered an error on or after 2014-11-01.
warnings.warn(“The ” + str(type(self)) + ” Model subclass ”
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 7.379370 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.103046 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum…
graph size: 19
Compiling accum done. Time elapsed: 0.876798 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 89.1907964264
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 19.928861 seconds
……
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 11.9511445315
total_seconds_last_epoch: 35.828732
training_seconds_this_epoch: 22.296131
Saving to ./dae_l1.pkl…
Saving to ./dae_l1.pkl done. Time elapsed: 0.936124 seconds
Saving to ./dae_l1.pkl…
Saving to ./dae_l1.pkl done. Time elapsed: 0.886536 seconds
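
The "Batches seen" and "Examples seen" counters in the final monitoring step follow directly from the hyperparameters. The Python 3 sketch below reproduces them; training_totals is a made-up helper, and it assumes the training set covers examples 0 up to train_stop with no partial batch.

```python
def training_totals(train_stop, batch_size, max_epochs):
    """Return (batches seen, examples seen) after max_epochs full epochs."""
    batches_per_epoch = train_stop // batch_size
    batches = batches_per_epoch * max_epochs
    examples = batches * batch_size
    return batches, examples

# Values used for dae_l1.yaml in step 2
print(training_totals(train_stop=50000, batch_size=100, max_epochs=10))
```

With train_stop = 50000, batch_size = 100 and max_epochs = 10 this gives 5000 batches and 500000 examples, matching the log.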

4. Inspect the saved model
>>> from pylearn2.utils import serial
>>> serial.load('dae_l1.pkl')
<pylearn2.models.autoencoder.DenoisingAutoencoder object at 0x46855d0>
>>>
>>> model = serial.load('dae_l1.pkl')
>>>
>>> dir(model)
[‘__call__’, ‘__class__’, ‘__delattr__’, ‘__dict__’, ‘__doc__’, ‘__format__’, ‘__getattribute__’, ‘__getstate__’, ‘__hash__’, ‘__init__’, ‘__metaclass__’, ‘__module__’, ‘__new__’, ‘__reduce__’, ‘__reduce_ex__’, ‘__repr__’, ‘__setattr__’, ‘__setstate__’, ‘__sizeof__’, ‘__str__’, ‘__subclasshook__’, ‘__weakref__’, ‘_disallow_censor_updates’, ‘_ensure_extensions’, ‘_hidden_activation’, ‘_hidden_input’, ‘_initialize_hidbias’, ‘_initialize_visbias’, ‘_initialize_w_prime’, ‘_initialize_weights’, ‘_modify_updates’, ‘_overrides_censor_updates’, ‘_params’, ‘_test_batch_size’, ‘act_dec’, ‘act_enc’, ‘censor_updates’, ‘continue_learning’, ‘corruptor’, ‘cpu_only’, ‘dataset_yaml_src’, ‘decode’, ‘encode’, ‘enforce_constraints’, ‘extensions’, ‘fn’, ‘free_energy’, ‘function’, ‘get_default_cost’, ‘get_input_dim’, ‘get_input_source’, ‘get_input_space’, ‘get_lr_scalers’, ‘get_monitoring_channels’, ‘get_monitoring_data_specs’, ‘get_output_dim’, ‘get_output_space’, ‘get_param_values’, ‘get_param_vector’, ‘get_params’, ‘get_target_source’, ‘get_target_space’, ‘get_test_batch_size’, ‘get_weights’, ‘get_weights_format’, ‘get_weights_topo’, ‘get_weights_view_shape’, ‘hidbias’, ‘input_space’, ‘inverse’, ‘irange’, ‘libv’, ‘modify_updates’, ‘monitor’, ‘nhid’, ‘output_space’, ‘perform’, ‘print_versions’, ‘reconstruct’, ‘redo_theano’, ‘register_names_to_del’, ‘rng’, ‘s_rng’, ‘score’, ‘set_batch_size’, ‘set_input_space’, ‘set_param_values’, ‘set_param_vector’, ‘set_visible_size’, ‘tag’, ‘tied_weights’, ‘train_all’, ‘train_batch’, ‘upward_pass’, ‘visbias’, ‘w_prime’, ‘weights’, ‘yaml_src’]
>>>

5. Fill in dae_l2.yaml in the same way as in step 2

layer2_yaml = open('dae_l2.yaml', 'r').read()

hyper_params_l2 = {'train_stop' : 50000, 'batch_size' : 100, 'monitoring_batches' : 5, 'nvis' : 500, 'nhid' : 500, 'max_epochs' : 10, 'save_path' : '.'}

layer2_yaml = layer2_yaml % (hyper_params_l2)

print layer2_yaml

6. Run dae_l2.yaml to train the second layer
python ~/pylearn2/pylearn2/scripts/train.py dae_l2.yaml
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST’s iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
/home/jerry/.local/lib/python2.7/site-packages/theano/sandbox/rng_mrg.py:1183: UserWarning: MRG_RandomStreams Can’t determine #streams from size (Shape.0), guessing 60*256
nstreams = self.n_streams(size)
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
/home/jerry/pylearn2/pylearn2/models/model.py:71: UserWarning: The <class ‘pylearn2.models.autoencoder.DenoisingAutoencoder’> Model subclass seems not to call the Model constructor. This behavior may be considered an error on or after 2014-11-01.
warnings.warn(“The ” + str(type(self)) + ” Model subclass ”
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 0.339660 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.023657 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum…
graph size: 19
Compiling accum done. Time elapsed: 0.189965 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 52.2956323286
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 17.452593 seconds
……
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 4.33433924602
total_seconds_last_epoch: 30.433518
training_seconds_this_epoch: 19.303109
Saving to ./dae_l2.pkl…
Saving to ./dae_l2.pkl done. Time elapsed: 0.607150 seconds
Saving to ./dae_l2.pkl…
Saving to ./dae_l2.pkl done. Time elapsed: 0.588375 seconds

7. Fill in dae_mlp.yaml in the same way as in step 2

mlp_yaml = open('dae_mlp.yaml', 'r').read()

hyper_params_mlp = {'train_stop' : 50000, 'valid_stop' : 60000, 'batch_size' : 100, 'max_epochs' : 50, 'save_path' : '.'}

mlp_yaml = mlp_yaml % (hyper_params_mlp)

print mlp_yaml

(Note: the original dae_mlp.yaml has no save_path or save_freq entries, so the trained parameters are not saved. Add both entries, as follows:
save_path : './dae_mlp.pkl',
save_freq : 1
)

8. Run supervised fine-tuning
python ~/pylearn2/pylearn2/scripts/train.py dae_mlp.yaml
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST’s iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
Parameter and initial learning rate summary:
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
softmax_b: 0.05
softmax_W: 0.05
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 17.156073 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.056943 seconds
Monitored channels:
learning_rate
momentum
total_seconds_last_epoch
training_seconds_this_epoch
valid_objective
valid_y_col_norms_max
valid_y_col_norms_mean
valid_y_col_norms_min
valid_y_max_max_class
valid_y_mean_max_class
valid_y_min_max_class
valid_y_misclass
valid_y_nll
valid_y_row_norms_max
valid_y_row_norms_mean
valid_y_row_norms_min
Compiling accum…
graph size: 63
Compiling accum done. Time elapsed: 8.821601 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.05
momentum: 0.5
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
valid_objective: 2.30245763578
valid_y_col_norms_max: 0.0650026130651
valid_y_col_norms_mean: 0.0641744853852
valid_y_col_norms_min: 0.0624679393698
valid_y_max_max_class: 0.105532125739
valid_y_mean_max_class: 0.102753872501
valid_y_min_max_class: 0.101059172742
valid_y_misclass: 0.9031
valid_y_nll: 2.30245763578
valid_y_row_norms_max: 0.0125483545665
valid_y_row_norms_mean: 0.00897718040255
valid_y_row_norms_min: 0.00411555936503
Time this epoch: 18.159817 seconds
……
Monitoring step:
Epochs seen: 50
Batches seen: 25000
Examples seen: 2500000
learning_rate: 0.0183943399319
momentum: 0.539357429719
total_seconds_last_epoch: 21.789649
training_seconds_this_epoch: 19.881821
valid_objective: 0.0667691463031
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93614117524
valid_y_col_norms_min: 1.93520053981
valid_y_max_max_class: 0.999997756761
valid_y_mean_max_class: 0.980073621031
valid_y_min_max_class: 0.548149309784
valid_y_misclass: 0.02
valid_y_nll: 0.0667691463031
valid_y_row_norms_max: 0.546499525611
valid_y_row_norms_mean: 0.264354016013
valid_y_row_norms_min: 0.101427414171
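
Two of the logged numbers are easy to interpret with a short calculation. Before training, valid_y_nll is close to the negative log-likelihood of guessing uniformly over 10 classes, and the misclassification rate can be turned into an error count. This Python 3 sketch assumes a validation set of 10000 examples (train_stop = 50000, valid_stop = 60000); that size is an assumption, not something the log states.

```python
import math

def uniform_nll(num_classes):
    """Negative log-likelihood of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)

def error_count(misclass_rate, num_examples):
    """Number of misclassified examples implied by a misclassification rate."""
    return int(round(misclass_rate * num_examples))

VALID_EXAMPLES = 10000  # assumption: examples 50000..60000 form the validation set

print("uniform 10-class nll:", uniform_nll(10))
print("errors before training:", error_count(0.9031, VALID_EXAMPLES))
print("errors after 50 epochs:", error_count(0.02, VALID_EXAMPLES))
```

Under that assumption, the final valid_y_misclass of 0.02 corresponds to a validation accuracy of 98%.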

9. This completes the whole training process
To tune the parameters, edit the yaml files. The trained parameters are stored in three files: dae_l1.pkl, dae_l2.pkl and dae_mlp.pkl

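A brief introduction to Pylearn2

Environment: Ubuntu 12.04

Pylearn2 is a deep learning package built on top of Theano. It implements a number of common models (see http://deeplearning.net/software/pylearn2/library/index.html#libdoc) and saves time on practical projects compared with using Theano directly, because training a model only requires configuring some parameters.
The sections below cover installation and basic usage: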

1. Install Theano (bleeding-edge install instructions)

jerry@hq:~$ sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user

2. Download Pylearn2
jerry@hq:~$ git clone git://github.com/lisa-lab/pylearn2.git

3. Install Pylearn2
jerry@hq:~$ cd pylearn2
jerry@hq:~/pylearn2$ sudo python setup.py develop --user

4. Verify the installation
jerry@hq:~$ python
>>> import pylearn2
If the package imports without an error, the installation is OK

5. Set PYLEARN2_DATA_PATH and PYLEARN2_VIEWER_COMMAND
vi ~/.bashrc
Add:
export PYLEARN2_DATA_PATH=/u01/lisa/data
export PYLEARN2_VIEWER_COMMAND=/usr/bin/eog
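
The two variables only take effect in shells started after the edit, or after running `source ~/.bashrc`. The following is a small sketch for checking them from Python before launching a training run. `check_pylearn2_env` is a hypothetical helper written for this note, not a Pylearn2 function.

```python
import os


def check_pylearn2_env(env=None):
    """Return a list of problems found; an empty list means both variables look fine."""
    env = os.environ if env is None else env
    problems = []

    data_path = env.get("PYLEARN2_DATA_PATH")
    if not data_path:
        problems.append("PYLEARN2_DATA_PATH is not set")
    elif not os.path.isdir(data_path):
        problems.append("PYLEARN2_DATA_PATH is not a directory: " + data_path)

    if not env.get("PYLEARN2_VIEWER_COMMAND"):
        problems.append("PYLEARN2_VIEWER_COMMAND is not set")

    return problems


if __name__ == "__main__":
    for problem in check_pylearn2_env():
        print(problem)
```

Passing a plain dict instead of `os.environ` makes the check easy to try without touching the real environment.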

How to run an example

1. Download the data
cd /u01/lisa/data/cifar10
wget http://www.cs.utoronto.ca/~kriz/cifar-10-python.tar.gz
tar xvf cifar-10-python.tar.gz
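
Before preprocessing, it can help to confirm that the archive unpacked where the dataset loader will look. The sketch below assumes the archive extracts into a `cifar-10-batches-py` directory under `$PYLEARN2_DATA_PATH/cifar10` and contains the usual batch files. `missing_cifar10_files` is a hypothetical helper, not part of Pylearn2.

```python
import os

# File names expected inside the CIFAR-10 "python version" archive (assumed layout).
EXPECTED = ["batches.meta", "test_batch"] + ["data_batch_%d" % i for i in range(1, 6)]


def missing_cifar10_files(data_root):
    """Return the expected files absent from <data_root>/cifar10/cifar-10-batches-py."""
    batch_dir = os.path.join(data_root, "cifar10", "cifar-10-batches-py")
    return [name for name in EXPECTED
            if not os.path.isfile(os.path.join(batch_dir, name))]


if __name__ == "__main__":
    # Fall back to the path used in this post when the variable is not set.
    root = os.environ.get("PYLEARN2_DATA_PATH", "/u01/lisa/data")
    print(missing_cifar10_files(root))
```

An empty list means every expected file was found.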

2. Edit make_dataset.py so that it writes to /u01/lisa/data/ (the / partition on this machine is short of space, so the data has to be kept on another path)
jerry@hq:~$ vi /home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/make_dataset.py
Change it to the following:
"""
path = pylearn2.__path__[0]
train_example_path = os.path.join(path, 'scripts', 'tutorials', 'grbm_smd')
train.use_design_loc(os.path.join(train_example_path, 'cifar10_preprocessed_train_design.npy'))
train_pkl_path = os.path.join(train_example_path, 'cifar10_preprocessed_train.pkl')
"""
train_pkl_path = os.path.join('/u01/lisa/data/', 'cifar10_preprocessed_train.pkl')
serial.save(train_pkl_path, train)

3. Preprocess the downloaded data
python /home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/make_dataset.py
When it finishes, the directory /u01/lisa/data contains a file named cifar10_preprocessed_train.pkl, roughly 652 MB in size

4. Train on the data
cd /u01/lisa/data
python ~/pylearn2/pylearn2/scripts/train.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.yaml

5. View the results
python ~/pylearn2/pylearn2/scripts/show_weights.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/plot_monitor.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/print_monitor.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/summarize_model.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

6. Inspect the generated parameter file cifar_grbm_smd.pkl directly

Load the model file
>>> from pylearn2.utils import serial
>>> model = serial.load('/home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl')
Look at the structure of the loaded object
>>> dir(model)
Get the weights
>>> model.get_weights()
Get the parameter names
>>> model.get_params()
Get the parameter values
>>> model.get_param_values()

Installing CXXNET

Environment: Ubuntu 14.04, CUDA 6.5

First install these four packages: cuda-toolkit, cuda-cublas, cuda-cudart, and cuda-curand

cuda_6.5.14_linux_64.run

cuda-cublas-6-5_6.5-14_amd64.deb
cuda-cudart-6-5_6.5-14_amd64.deb
cuda-curand-6-5_6.5-14_amd64.deb

Download location: http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/

Install OpenCV

sudo apt-get install libopencv-2.4

Configure the environment variables

vi ~/.bashrc

export CUDA_HOME=/usr/local/cuda-6.5
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:/usr/local/lib:$LD_LIBRARY_PATH
export CPLUS_INCLUDE_PATH=/usr/local/cuda/include

Download a copy of cxxnet

git clone https://github.com/dmlc/cxxnet.git

Change into the directory: cd cxxnet

Copy the configuration file into the current directory: cp make/config.mk .

Edit it: vi config.mk

USE_CUDA = 1

USE_BLAS = blas

USE_DIST_PS = 1
USE_OPENMP_ITER = 1
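
If the same settings have to be applied on several machines, the edit can be scripted instead of done by hand in vi. The following is a minimal sketch, assuming config.mk uses plain `NAME = value` assignments. `set_make_vars` is a hypothetical helper, and the starting values in `sample` are made up for illustration rather than taken from the real config.mk.

```python
import re


def set_make_vars(text, overrides):
    """Return the config text with each `NAME = value` line replaced per `overrides`.

    Commented-out lines are left alone; variables that are never assigned
    in the text are appended at the end.
    """
    seen = set()
    out = []
    for line in text.splitlines():
        match = re.match(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*[:?]?=", line)
        if match and match.group(1) in overrides:
            name = match.group(1)
            line = "%s = %s" % (name, overrides[name])
            seen.add(name)
        out.append(line)
    for name, value in overrides.items():
        if name not in seen:
            out.append("%s = %s" % (name, value))
    return "\n".join(out) + "\n"


# Hypothetical starting content; the real config.mk may differ.
sample = "USE_CUDA = 0\nUSE_BLAS = mkl\n# a comment\nUSE_DIST_PS = 0\n"
wanted = {"USE_CUDA": "1", "USE_BLAS": "blas",
          "USE_DIST_PS": "1", "USE_OPENMP_ITER": "1"}
result = set_make_vars(sample, wanted)
print(result)
```

To apply it for real, read config.mk into a string, pass it through `set_make_vars`, and write the result back.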

Edit the Makefile (vi Makefile) and change it as follows:

CFLAGS += -g -O3 -I./mshadow/ -fPIC $(MSHADOW_CFLAGS) -fopenmp -I/usr/local/cuda/include
LDFLAGS = -pthread $(MSHADOW_LDFLAGS) -L/usr/local/cuda/lib64

Finally, build the project

./build.sh