Training Your Own Clothing Detector with OpenCV

Environment: Ubuntu 12.04, OpenCV, 光影魔术手 4 (nEO iMAGING, an image batch-processing tool), ObjectMarker

The steps are as follows:

1. Collect images of the object to be detected (positive samples) and background images (negative samples), then use 光影魔术手 to batch-process them to a fixed size and BMP format.

2. Use ObjectMarker to mark the object regions in the positive samples, producing info.txt with contents like:
rawdata/136e2b8aef176609829e23e54081db6d.bmp 1 149 36 127 193
rawdata/203d730e146b11af60eef38b5ee280a1.bmp 1 52 92 175 201
rawdata/21b7e18952a3af2975fb407510af943e.bmp 1 82 102 135 182
rawdata/2f9ed3b7e3c87b0890bcdd94fae54415.bmp 1 45 28 190 230
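Each info.txt line has the form `path count x y w h [x y w h ...]`: the image path, the number of marked objects, then one bounding box per object. A minimal parser sketch (the helper name is mine, not part of ObjectMarker or OpenCV):

```python
def parse_info_line(line):
    # "path count x y w h [x y w h ...]" -> (path, list of (x, y, w, h))
    parts = line.split()
    path, count = parts[0], int(parts[1])
    boxes = [tuple(int(v) for v in parts[2 + 4 * i: 6 + 4 * i])
             for i in range(count)]
    return path, boxes

path, boxes = parse_info_line(
    "rawdata/136e2b8aef176609829e23e54081db6d.bmp 1 149 36 127 193")
# boxes -> [(149, 36, 127, 193)]
```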

3. Copy info.txt and the positive and negative images into the Ubuntu environment.

4. In the background image directory, create a bg.txt file listing the corresponding filenames:
jerry@hq:~$ more neg_pic/bg3.txt
c15108780ef9f873650f3cbd0259fa6f.bmp
e4f1dce1ed15262848f0b9e0efb0ad56.bmp
6c7fc84c405201ba2a92c38e5f828966.bmp
a00f7dc3c16e8b31d16bb9122f34aa0c.bmp
cad04ea931db04381bb5fed9cb8dea61.bmp
056f0cd1d04d62dfa4ab9fdb0b6c052b.bmp
95f5c539316d9261a83b3a98d90eac58.bmp
08178909a71d88750fc35d5b87e838fd.bmp
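Writing bg.txt by hand is tedious; a small sketch that lists every .bmp in the negative-image directory and writes one filename per line (the function name and directory layout are my assumptions):

```python
import os

def write_bg_txt(neg_dir, out_path):
    # One negative-image filename per line, sorted for reproducibility.
    # Note: opencv_traincascade resolves these paths relative to its
    # working directory, so prefix them if you run it from elsewhere.
    names = sorted(n for n in os.listdir(neg_dir) if n.endswith('.bmp'))
    with open(out_path, 'w') as f:
        for name in names:
            f.write(name + '\n')
    return names
```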

5. Prepare the training data
jerry@hq:~$ opencv_createsamples -info info.txt -vec pic.vec  -w 30 -h 30 -b neg_pic/bg.txt -num 26

6. Train the Haar classifier
jerry@hq:~$ mkdir cascade_data
jerry@hq:~$ opencv_traincascade -data cascade_data -vec pic.vec -bg neg_pic/bg3.txt -w 30 -h 30 -numPos 26 -numNeg 50 -numStages 10
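The `-num` and `-numPos` values above must not exceed the number of annotated positives in info.txt (26 here). A quick sanity-check sketch (the helper name is mine):

```python
def count_positives(info_path):
    # Sum the object-count field (second column) over all info.txt lines,
    # to check against the -num / -numPos training parameters.
    total = 0
    with open(info_path) as f:
        for line in f:
            parts = line.split()
            if parts:
                total += int(parts[1])
    return total
```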

Pylearn2 Stacked Autoencoders Example

Environment: Ubuntu 12.04

1. Download the training data

cd /u01/lisa/data/mnist
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz

gunzip train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
gunzip train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
gunzip t10k-images-idx3-ubyte.gz

wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
gunzip t10k-labels-idx1-ubyte.gz
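After gunzip, these files are in the raw MNIST IDX format: a big-endian header (magic number 2051 for image files, 2049 for label files, followed by the dimensions) and then one unsigned byte per value. A small reader sketch for the image files (the helper name is mine):

```python
import struct

def read_idx_images(path):
    # IDX image header: four big-endian uint32s: magic (2051), image count,
    # rows, cols; the pixel bytes follow immediately after.
    with open(path, 'rb') as f:
        magic, n, rows, cols = struct.unpack('>IIII', f.read(16))
        assert magic == 2051, 'not an IDX image file'
        pixels = f.read(n * rows * cols)
    return n, rows, cols, pixels
```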

2. Edit dae_l1.yaml

Start a Python interactive session:

layer1_yaml = open('dae_l1.yaml', 'r').read()
hyper_params_l1 = {'train_stop' : 50000, 'batch_size' : 100, 'monitoring_batches' : 5, 'nhid' : 500, 'max_epochs' : 10, 'save_path' : '.'}
layer1_yaml = layer1_yaml % (hyper_params_l1)
print layer1_yaml

Overwrite the contents of dae_l1.yaml with the printed output.
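The `%(name)s`-style placeholders in the tutorial yaml files are filled by ordinary Python %-formatting, which is all the snippet above does. A toy demonstration with a made-up two-line template:

```python
# Made-up miniature template using the same placeholder style as dae_l1.yaml
template = "nhid: %(nhid)i\nbatch_size: %(batch_size)i\n"
filled = template % {'nhid': 500, 'batch_size': 100}
# To overwrite the real file afterwards:
# open('dae_l1.yaml', 'w').write(filled)
```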

3. Go to the example script directory
cd ~/pylearn2/pylearn2/scripts/tutorials/stacked_autoencoders
and run the script:
python ~/pylearn2/pylearn2/scripts/train.py  dae_l1.yaml
The output log looks like this:
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST's iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
/home/jerry/.local/lib/python2.7/site-packages/theano/sandbox/rng_mrg.py:1183: UserWarning: MRG_RandomStreams Can't determine #streams from size (Shape.0), guessing 60*256
nstreams = self.n_streams(size)
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
/home/jerry/pylearn2/pylearn2/models/model.py:71: UserWarning: The <class 'pylearn2.models.autoencoder.DenoisingAutoencoder'> Model subclass seems not to call the Model constructor. This behavior may be considered an error on or after 2014-11-01.
warnings.warn("The " + str(type(self)) + " Model subclass "
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 7.379370 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.103046 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum…
graph size: 19
Compiling accum done. Time elapsed: 0.876798 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 89.1907964264
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 19.928861 seconds
……
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 11.9511445315
total_seconds_last_epoch: 35.828732
training_seconds_this_epoch: 22.296131
Saving to ./dae_l1.pkl…
Saving to ./dae_l1.pkl done. Time elapsed: 0.936124 seconds
Saving to ./dae_l1.pkl…
Saving to ./dae_l1.pkl done. Time elapsed: 0.886536 seconds

4. Inspect the parameters
>>> from pylearn2.utils import serial
>>> serial.load('dae_l1.pkl')
<pylearn2.models.autoencoder.DenoisingAutoencoder object at 0x46855d0>
>>>
>>> model = serial.load('dae_l1.pkl')
>>>
>>> dir(model)
['__call__', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getstate__', '__hash__', '__init__', '__metaclass__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_disallow_censor_updates', '_ensure_extensions', '_hidden_activation', '_hidden_input', '_initialize_hidbias', '_initialize_visbias', '_initialize_w_prime', '_initialize_weights', '_modify_updates', '_overrides_censor_updates', '_params', '_test_batch_size', 'act_dec', 'act_enc', 'censor_updates', 'continue_learning', 'corruptor', 'cpu_only', 'dataset_yaml_src', 'decode', 'encode', 'enforce_constraints', 'extensions', 'fn', 'free_energy', 'function', 'get_default_cost', 'get_input_dim', 'get_input_source', 'get_input_space', 'get_lr_scalers', 'get_monitoring_channels', 'get_monitoring_data_specs', 'get_output_dim', 'get_output_space', 'get_param_values', 'get_param_vector', 'get_params', 'get_target_source', 'get_target_space', 'get_test_batch_size', 'get_weights', 'get_weights_format', 'get_weights_topo', 'get_weights_view_shape', 'hidbias', 'input_space', 'inverse', 'irange', 'libv', 'modify_updates', 'monitor', 'nhid', 'output_space', 'perform', 'print_versions', 'reconstruct', 'redo_theano', 'register_names_to_del', 'rng', 's_rng', 'score', 'set_batch_size', 'set_input_space', 'set_param_values', 'set_param_vector', 'set_visible_size', 'tag', 'tied_weights', 'train_all', 'train_batch', 'upward_pass', 'visbias', 'w_prime', 'weights', 'yaml_src']
>>>

5. As in step 2, edit dae_l2.yaml

layer2_yaml = open('dae_l2.yaml', 'r').read()
hyper_params_l2 = {'train_stop' : 50000, 'batch_size' : 100, 'monitoring_batches' : 5, 'nvis' : 500, 'nhid' : 500, 'max_epochs' : 10, 'save_path' : '.'}
layer2_yaml = layer2_yaml % (hyper_params_l2)
print layer2_yaml

6. Run dae_l2.yaml to train the second-layer model
python ~/pylearn2/pylearn2/scripts/train.py dae_l2.yaml
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST's iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
/home/jerry/.local/lib/python2.7/site-packages/theano/sandbox/rng_mrg.py:1183: UserWarning: MRG_RandomStreams Can't determine #streams from size (Shape.0), guessing 60*256
nstreams = self.n_streams(size)
Parameter and initial learning rate summary:
vb: 0.001
hb: 0.001
W: 0.001
Wprime: 0.001
/home/jerry/pylearn2/pylearn2/models/model.py:71: UserWarning: The <class 'pylearn2.models.autoencoder.DenoisingAutoencoder'> Model subclass seems not to call the Model constructor. This behavior may be considered an error on or after 2014-11-01.
warnings.warn("The " + str(type(self)) + " Model subclass "
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 0.339660 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.023657 seconds
Monitored channels:
learning_rate
objective
total_seconds_last_epoch
training_seconds_this_epoch
Compiling accum…
graph size: 19
Compiling accum done. Time elapsed: 0.189965 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.001
objective: 52.2956323286
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
Time this epoch: 17.452593 seconds
……
Monitoring step:
Epochs seen: 10
Batches seen: 5000
Examples seen: 500000
learning_rate: 0.001
objective: 4.33433924602
total_seconds_last_epoch: 30.433518
training_seconds_this_epoch: 19.303109
Saving to ./dae_l2.pkl…
Saving to ./dae_l2.pkl done. Time elapsed: 0.607150 seconds
Saving to ./dae_l2.pkl…
Saving to ./dae_l2.pkl done. Time elapsed: 0.588375 seconds

7. As in step 2, edit dae_mlp.yaml

mlp_yaml = open('dae_mlp.yaml', 'r').read()
hyper_params_mlp = {'train_stop' : 50000, 'valid_stop' : 60000, 'batch_size' : 100, 'max_epochs' : 50, 'save_path' : '.'}
mlp_yaml = mlp_yaml % (hyper_params_mlp)
print mlp_yaml

(Note: the original dae_mlp.yaml has no save_path or save_freq entries, so the trained parameters were not being saved; add these two lines:
save_path: './dae_mlp.pkl',
save_freq: 1
)

8. Run supervised fine-tuning
python ~/pylearn2/pylearn2/scripts/train.py dae_mlp.yaml
/home/jerry/pylearn2/pylearn2/utils/call_check.py:98: UserWarning: the `one_hot` parameter is deprecated. To get one-hot encoded targets, request that they live in `VectorSpace` through the `data_specs` parameter of MNIST's iterator method. `one_hot` will be removed on or after September 20, 2014.
return to_call(**kwargs)
Parameter and initial learning rate summary:
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
vb: 0.05
hb: 0.05
W: 0.05
Wprime: 0.05
softmax_b: 0.05
softmax_W: 0.05
Compiling sgd_update…
Compiling sgd_update done. Time elapsed: 17.156073 seconds
compiling begin_record_entry…
compiling begin_record_entry done. Time elapsed: 0.056943 seconds
Monitored channels:
learning_rate
momentum
total_seconds_last_epoch
training_seconds_this_epoch
valid_objective
valid_y_col_norms_max
valid_y_col_norms_mean
valid_y_col_norms_min
valid_y_max_max_class
valid_y_mean_max_class
valid_y_min_max_class
valid_y_misclass
valid_y_nll
valid_y_row_norms_max
valid_y_row_norms_mean
valid_y_row_norms_min
Compiling accum…
graph size: 63
Compiling accum done. Time elapsed: 8.821601 seconds
Monitoring step:
Epochs seen: 0
Batches seen: 0
Examples seen: 0
learning_rate: 0.05
momentum: 0.5
total_seconds_last_epoch: 0.0
training_seconds_this_epoch: 0.0
valid_objective: 2.30245763578
valid_y_col_norms_max: 0.0650026130651
valid_y_col_norms_mean: 0.0641744853852
valid_y_col_norms_min: 0.0624679393698
valid_y_max_max_class: 0.105532125739
valid_y_mean_max_class: 0.102753872501
valid_y_min_max_class: 0.101059172742
valid_y_misclass: 0.9031
valid_y_nll: 2.30245763578
valid_y_row_norms_max: 0.0125483545665
valid_y_row_norms_mean: 0.00897718040255
valid_y_row_norms_min: 0.00411555936503
Time this epoch: 18.159817 seconds
……
Monitoring step:
Epochs seen: 50
Batches seen: 25000
Examples seen: 2500000
learning_rate: 0.0183943399319
momentum: 0.539357429719
total_seconds_last_epoch: 21.789649
training_seconds_this_epoch: 19.881821
valid_objective: 0.0667691463031
valid_y_col_norms_max: 1.93649990002
valid_y_col_norms_mean: 1.93614117524
valid_y_col_norms_min: 1.93520053981
valid_y_max_max_class: 0.999997756761
valid_y_mean_max_class: 0.980073621031
valid_y_min_max_class: 0.548149309784
valid_y_misclass: 0.02
valid_y_nll: 0.0667691463031
valid_y_row_norms_max: 0.546499525611
valid_y_row_norms_mean: 0.264354016013
valid_y_row_norms_min: 0.101427414171

9. The whole training process is now complete.
Hyperparameters can be tuned in the yaml files; the learned parameters are stored in three files: dae_l1.pkl, dae_l2.pkl, and dae_mlp.pkl.

A Brief Introduction to Using Pylearn2

Environment: Ubuntu 12.04

Pylearn2 is a deep learning package built on top of Theano. It implements a number of common models (see http://deeplearning.net/software/pylearn2/library/index.html#libdoc for details), and compared with raw Theano it saves time on real projects, since training a model only requires configuring some parameters.
Installation and usage:

1. Install Theano (bleeding-edge install)

jerry@hq:~$sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git --user

2. Download Pylearn2
jerry@hq:~$git clone git://github.com/lisa-lab/pylearn2.git

3. Install pylearn2
jerry@hq:~$cd pylearn2
jerry@hq:~$sudo python setup.py develop --user

4. Verify the installation
jerry@hq:~$python
import pylearn2
If the package imports without error, the installation is OK.

5. Set PYLEARN2_DATA_PATH and PYLEARN2_VIEWER_COMMAND
vi ~/.bashrc
and add:
export PYLEARN2_DATA_PATH=/u01/lisa/data
export PYLEARN2_VIEWER_COMMAND=/usr/bin/eog
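A small sketch to verify from Python that both variables actually reached the environment (the helper name is mine; pass it `os.environ`):

```python
def missing_pylearn2_vars(env):
    # Return the names of the required variables that are absent or empty.
    required = ('PYLEARN2_DATA_PATH', 'PYLEARN2_VIEWER_COMMAND')
    return [name for name in required if not env.get(name)]

# Typical use after reloading the shell config:
#   import os
#   missing_pylearn2_vars(os.environ)  # should be an empty list
```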

Running an Example

1. Download the data
cd /u01/lisa/data/cifar10
wget http://www.cs.utoronto.ca/~kriz/cifar-10-python.tar.gz
tar xvf cifar-10-python.tar.gz

2. Edit make_dataset.py to point at /u01/lisa/data/ (the root filesystem on this machine was short on space, so the data lives on another path)
jerry@hq:~$vi /home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/make_dataset.py
Change it to:
"""
path = pylearn2.__path__[0]
train_example_path = os.path.join(path, 'scripts', 'tutorials', 'grbm_smd')
train.use_design_loc(os.path.join(train_example_path, 'cifar10_preprocessed_train_design.npy'))
train_pkl_path = os.path.join(train_example_path, 'cifar10_preprocessed_train.pkl')
"""
train_pkl_path = os.path.join('/u01/lisa/data/', 'cifar10_preprocessed_train.pkl')
serial.save(train_pkl_path, train)

3. Preprocess the downloaded data
python /home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/make_dataset.py
When it finishes, the directory /u01/lisa/data contains a file cifar10_preprocessed_train.pkl of roughly 652 MB.

4. Train on the data
cd /u01/lisa/data
python ~/pylearn2/pylearn2/scripts/train.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.yaml

5. View the results
python ~/pylearn2/pylearn2/scripts/show_weights.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/plot_monitor.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/print_monitor.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl

python ~/pylearn2/pylearn2/scripts/summarize_model.py ~/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl


6. Inspect the generated parameter file cifar_grbm_smd.pkl directly

Load the model file
>>> from pylearn2.utils import serial
>>> model = serial.load('/home/jerry/pylearn2/pylearn2/scripts/tutorials/grbm_smd/cifar_grbm_smd.pkl')
Inspect its structure
>>> dir(model)
Get the weights
>>> model.get_weights()
Get the parameter names
>>> model.get_params()
Get the parameter values
>>> model.get_param_values()

Sending Email from Python

Python 2.7 code:

#coding: utf-8
import smtplib
from email.mime.text import MIMEText

# build the message, connect to the SMTP server, and send
msg = MIMEText('Hello', 'plain', 'utf-8')
msg['Subject'] = 'Load data success!'
#msg['Date'] = formatdate(localtime=True)
smtp = smtplib.SMTP()
smtp.connect('proxy-in.xxx.com')
smtp.sendmail('bidev@xxx.com', 'swer@xxx.com', msg.as_string())
smtp.quit()


Python 2.4.3 code:

#coding: utf-8
import smtplib
from email.MIMEText import MIMEText

# build the message, connect to the SMTP server, and send
msg = MIMEText('Hello', 'plain', 'utf-8')
msg['Subject'] = 'Load data success!'
smtp = smtplib.SMTP()
smtp.connect('proxy-in.xxx.com')
smtp.sendmail('bidev@xxx.com', 'hsdf@xxx.com', msg.as_string())
smtp.quit()


Installing CXXNET

Environment: Ubuntu 14.04, CUDA 6.5

First install these four packages: cuda-toolkit, cuda-cublas, cudart, and cuda-curand

cuda_6.5.14_linux_64.run

cuda-cublas-6-5_6.5-14_amd64.deb
cuda-cudart-6-5_6.5-14_amd64.deb
cuda-curand-6-5_6.5-14_amd64.deb

Download location: http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/x86_64/

Install OpenCV

sudo apt-get install libopencv-2.4


Configure environment variables

vi ~/.bashrc

export CUDA_HOME=/usr/local/cuda-6.5
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:/usr/local/lib:$LD_LIBRARY_PATH
export CPLUS_INCLUDE_PATH=/usr/local/cuda/include


Clone cxxnet:

git clone https://github.com/dmlc/cxxnet.git

Enter the directory: cd cxxnet

Copy a config file into the current directory: cp make/config.mk .

Edit config.mk: vi config.mk

USE_CUDA = 1

USE_BLAS = blas

USE_DIST_PS = 1
USE_OPENMP_ITER = 1

Edit the Makefile (vi Makefile), changing the following:

CFLAGS += -g -O3 -I./mshadow/  -fPIC $(MSHADOW_CFLAGS) -fopenmp -I/usr/local/cuda/include
LDFLAGS = -pthread $(MSHADOW_LDFLAGS) -L/usr/local/cuda/lib64


Finally, build:

./build.sh


Fixing "unable to correct problems, you have held broken packages"

OS:  Ubuntu 14.04

While installing the Ubuntu KDE desktop, an error like "unable to correct problems, you have held broken packages" appeared. The problem turned out to be that the python3-software-properties package was too new and incompatible with the KDE packages. The solution is the following:

sudo apt-get remove python3-software-properties

sudo apt-get install python3-software-properties=0.92.36

My Last Day at Baidu

Today (2015-03-13) is the last day of my resignation process: returning company property, settling my final salary with finance, and collecting my separation certificate. It all went fairly smoothly. I feel somewhat relieved and somewhat sad. From joining in September 2012 to leaving now, time has really flown. There was growth, frustration, and hardship, and right now it leaves a strange taste. Perhaps I will come back one day; I don't know. How many three-year stretches does a life have to toss around with? I'm getting a bit older, and maybe I lack grand ambitions, but the road ahead should be walked steadily. I hope things go well for me at the new company!