Installing Caffe + CUDA 7.5 on Ubuntu 14.04

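Contents
  1. Caffe dependencies
  2. Installing CUDA 7.5
    2.1 Disabling the nouveau driver
    2.2 Installing CUDA
    2.3 Configuring environment variables
    2.4 Building the samples
  3. Installing cuDNN
  4. Installing BLAS
  5. Installing Anaconda
    5.1 Installing pyenv
    5.2 Installing Anaconda with pyenv
  6. Installing OpenCV
  7. Installing Caffe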

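The installation instructions on the official Caffe site are rather terse, so I mainly followed two articles: "Caffe + Ubuntu 14.04 64bit + CUDA 6.5 配置说明" and "Ubuntu14.04下安装Caffe总结". My system is Ubuntu 14.04 64-bit with a GTX 950M graphics card.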

1. Caffe dependencies

sudo apt-get install build-essential  # basic requirement
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler #required by caffe

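2. Installing CUDA 7.5

2.1 Disabling the nouveau driver

I foolishly trusted the second article above and did not disable the nouveau driver. The first reboot miraculously worked, but after the second reboot I could no longer log in. I ended up disabling nouveau from recovery mode. So be sure to disable nouveau:

Create the file /etc/modprobe.d/blacklist-nouveau.conf with the following content: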

blacklist nouveau
options nouveau modeset=0
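
The same file can be created from the command line. The sketch below is only an illustration: write_nouveau_blacklist is a hypothetical helper name, and the update-initramfs step is my addition (on Ubuntu the initramfs usually has to be regenerated before the blacklist applies at boot), not something the original steps mention.

```shell
# Sketch: write the two blacklist lines to the file given as $1.
# write_nouveau_blacklist is a hypothetical helper, not a system command.
write_nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$1"
}

# Intended use, as root, followed by an initramfs rebuild and a reboot:
#   write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
#   update-initramfs -u
# After rebooting, `lsmod | grep nouveau` should print nothing.
```

Keeping the destination as a parameter makes it possible to try the helper on a scratch file before touching /etc.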

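2.2 Installing CUDA

Download CUDA 7.5 from the NVIDIA developer site. I used the deb package because it seemed the most convenient. You can double-click it to install, or use the command line: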

sudo dpkg -i cuda-repo-<distro>_<version>_<architecture>.deb
sudo apt-get update
sudo apt-get install cuda

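Everyone says you have to stop lightdm before installing CUDA, but with the deb package I left lightdm running and had no problems. Once CUDA is installed, the system uses the proprietary NVIDIA driver.

2.3 Configuring environment variables

Add the following to /etc/profile: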

export PATH=$PATH:/usr/local/cuda/bin

Create the file /etc/ld.so.conf.d/cuda.conf containing:

/usr/local/cuda/lib64

Make the configuration take effect immediately:

source /etc/profile
sudo ldconfig
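
To confirm that the settings took effect, a quick check along these lines may help. It is a sketch, not part of the original procedure; path_has_dir is a hypothetical helper that tests whether a directory is an entry of a colon-separated list such as PATH.

```shell
# path_has_dir (hypothetical helper): succeeds if directory $2 is an
# entry of the colon-separated list $1.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_has_dir "$PATH" /usr/local/cuda/bin; then
    echo "cuda bin is on PATH"
else
    echo "cuda bin is NOT on PATH"
fi

# On the real machine one would additionally run:
#   nvcc --version                  # the CUDA compiler should be found
#   ldconfig -p | grep libcudart    # the CUDA runtime should be in the cache
```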

2.4 Building the samples

Go to /usr/local/cuda/samples:

sudo make all -j4

When the build finishes, go to samples/bin/x86_64/linux/release:

./deviceQuery

This prints your graphics card information (not that it is of much use).
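
One practical use of the output is as a pass/fail check of the driver and toolkit, since deviceQuery ends with a "Result = PASS" line on success. A minimal sketch, where device_query_passed is a hypothetical helper name:

```shell
# device_query_passed (hypothetical helper): reads deviceQuery output on
# stdin and succeeds only if it contains the "Result = PASS" line.
device_query_passed() {
    grep -q 'Result = PASS'
}

# Usage on the real machine, from the release directory:
#   ./deviceQuery | device_query_passed && echo "GPU and driver look OK"
```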

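3. Installing cuDNN

Downloading cuDNN requires an NVIDIA developer account. I registered and received the download email quickly, whereas my annoying Intel MKL request still has not been answered, so I cannot download that one.

This step is simple: the download contains prebuilt lib and include files, so just drop them into /usr/local/cuda: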

cd cudnn-*  # the directory extracted from the download
sudo cp lib64/* /usr/local/cuda/lib64/
sudo cp include/* /usr/local/cuda/include/
sudo ldconfig

If you run into an error such as /sbin/ldconfig.real: /usr/local/cuda/lib64/libcudnn.so.5 is not a symbolic link, recreate the links (adjust the version numbers to match yours):

cd /usr/local/cuda/lib64
sudo ln -sf libcudnn.so.5.1.3 libcudnn.so.5
sudo ln -sf libcudnn.so.5 libcudnn.so
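
To find out which version numbers to use in those links, the installed header can be inspected. The sketch below assumes cudnn.h defines the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros, as cuDNN 5 does; cudnn_version is a hypothetical helper name.

```shell
# cudnn_version (hypothetical helper): print MAJOR.MINOR.PATCHLEVEL as
# read from the cudnn.h header given as $1.
cudnn_version() {
    awk '/^#define CUDNN_MAJOR/      {maj = $3}
         /^#define CUDNN_MINOR/      {min = $3}
         /^#define CUDNN_PATCHLEVEL/ {pat = $3}
         END {print maj "." min "." pat}' "$1"
}

# Usage on the real machine:
#   cudnn_version /usr/local/cuda/include/cudnn.h
```

The printed version should match the suffix of the real library file in /usr/local/cuda/lib64 that the symlinks point to.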

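4. Installing BLAS

Because my request for the student edition on Intel's site never got a reply, I initially had to use OpenBLAS.
Intel MKL can also be used once you register for it. It comes with a graphical installer. After installation, create the file /etc/ld.so.conf.d/intel_mkl.conf containing: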

/opt/intel/lib/intel64
/opt/intel/mkl/lib/intel64

Then run sudo ldconfig.

OpenBLAS is installed as follows:

git clone https://github.com/xianyi/OpenBLAS.git
cd OpenBLAS
make -j4
sudo make install PREFIX=/usr/local/openblas

Create the file /etc/ld.so.conf.d/openblas.conf with the following content, then run sudo ldconfig:

/usr/local/openblas/lib

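5. Installing Anaconda

I installed it through pyenv because I need to switch between several Python environments. At first I had actually installed Anaconda directly, but after discovering pyenv I removed it…

5.1 Installing pyenv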

git clone https://github.com/yyuu/pyenv.git ~/.pyenv
echo 'export PYENV_ROOT="$HOME/.pyenv"' >> ~/.bashrc
echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc
echo 'eval "$(pyenv init -)"' >> ~/.bashrc
exec $SHELL -l

5.2 Installing Anaconda with pyenv

If your network connection is poor, you can download the Anaconda .sh installer yourself and place it in ~/.pyenv/cache.

pyenv install anaconda2-4.1.0
pyenv global anaconda2-4.1.0
pyenv versions

You should now see that you have switched to anaconda2-4.1.0. pyenv-virtualenv is also an option, but I did not bother with it.

6. Installing OpenCV

Option 1:

sudo add-apt-repository --yes ppa:xqms/opencv-nonfree
sudo apt-get update
sudo apt-get install libopencv-nonfree-dev libopencv-nonfree2.4

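Option 2:

There is an install script, but relying on a script to install OpenCV is not very dependable, so I read through it and typed the commands in by hand, step by step. Here is the script as is: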

arch=$(uname -m)
if [ "$arch" == "i686" -o "$arch" == "i386" -o "$arch" == "i486" -o "$arch" == "i586" ]; then
flag=1
else
flag=0
fi
echo "Installing OpenCV 2.4.10"
mkdir OpenCV
cd OpenCV
echo "Removing any pre-installed ffmpeg and x264"
sudo apt-get -y remove ffmpeg x264 libx264-dev
echo "Installing Dependencies"
sudo apt-get -y install libopencv-dev
sudo apt-get -y install build-essential checkinstall cmake pkg-config yasm
sudo apt-get -y install libtiff4-dev libjpeg-dev libjasper-dev
sudo apt-get -y install libavcodec-dev libavformat-dev libswscale-dev libdc1394-22-dev libxine-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libv4l-dev
sudo apt-get -y install python-dev python-numpy
sudo apt-get -y install libtbb-dev libeigen3-dev
sudo apt-get -y install libqt4-dev libgtk2.0-dev
sudo apt-get -y install libfaac-dev libmp3lame-dev libopencore-amrnb-dev libopencore-amrwb-dev libtheora-dev libvorbis-dev libxvidcore-dev
sudo apt-get -y install x264 v4l-utils ffmpeg
sudo apt-get -y install libgtk2.0-dev
echo "Downloading OpenCV 2.4.10"
if ! [ -f "OpenCV-2.4.10.zip" ]; then
wget -O OpenCV-2.4.10.zip http://sourceforge.net/projects/opencvlibrary/files/opencv-unix/2.4.10/opencv-2.4.10.zip/download
fi
echo "Installing OpenCV 2.4.10"
if ! [ -d "opencv-2.4.10" ]; then
unzip OpenCV-2.4.10.zip
fi
rm OpenCV-2.4.10.zip
cd opencv-2.4.10
rm -rf build
mkdir build
cd build
cmake -D CUDA_ARCH_BIN=3.2 -D CUDA_ARCH_PTX=3.2 -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D BUILD_TIFF=ON -D WITH_QT=ON -D WITH_OPENGL=ON ..
make -j
sudo make install
sudo sh -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig
echo "OpenCV 2.4.10 ready to be used"

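Installing the dependencies at the start went fine, but at line 25 of the script (the wget that fetches OpenCV) the SourceForge download link had changed and the download failed. I downloaded OpenCV 2.4.10 manually from sourceforge.net and renamed it to OpenCV-2.4.10.zip, the name the script expects (mind the capitalization if you want to follow the script).

At this point the script is basically no longer needed, so just type the remaining commands by hand: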

cd opencv-2.4.10
mkdir build
cd build
cmake -D CUDA_ARCH_BIN=3.2 -D CUDA_ARCH_PTX=3.2 -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D BUILD_TIFF=ON -D WITH_QT=ON -D WITH_OPENGL=ON ..
make -j4
sudo make install
sudo sh -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig

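7. Installing Caffe

Finally we can install Caffe, but the Python dependencies have to be installed first. Go to the python folder inside the Caffe source directory: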

for req in $(cat requirements.txt); do pip install $req; done
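
If requirements.txt ever contains blank lines or comment lines, a slightly more defensive variant of the loop can filter them out first. This is only a sketch; list_requirements is a hypothetical helper name.

```shell
# list_requirements (hypothetical helper): print one requirement per line
# from the file given as $1, skipping blank lines and comment lines.
list_requirements() {
    grep -v -e '^[[:space:]]*$' -e '^[[:space:]]*#' "$1"
}

# Usage from the caffe/python directory:
#   list_requirements requirements.txt | while read -r req; do
#       pip install "$req"
#   done
```

Installing one requirement at a time, as the original loop does, means a single failing package does not stop the others from being installed.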

Edit Makefile.config. Only a few entries need attention:

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := mkl

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
MATLAB_DIR := /usr/local/MATLAB/R2016b

# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/.pyenv/versions/anaconda2-4.1.0
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
    $(ANACONDA_HOME)/include/python2.7 \
    $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include
# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
WITH_PYTHON_LAYER := 1

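The above is not the complete file, only the entries that need changing. If you use MATLAB, remember to add it to the PATH environment variable, for example by adding export PATH=$PATH:/usr/local/MATLAB/R2016b/bin to /etc/profile. If you chose OpenBLAS as your BLAS, set BLAS := open and add the following after it: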

# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
BLAS_INCLUDE := /usr/local/openblas/include
BLAS_LIB := /usr/local/openblas/lib

Finally, build Caffe:

make all -j4
make test -j4
make runtest

If you also need the Python or MATLAB interface (make sure the Python and MATLAB paths in Makefile.config are correct):

make pycaffe
make matcaffe

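If make runtest fails with an error like error while loading shared libraries: libhdf5_hl.so.10: cannot open shared object file: No such file or directory, the loader cannot find the HDF5 version that Caffe was linked against. /usr/lib/x86_64-linux-gnu may only contain libhdf5_hl.so.7, which is too old. To avoid conflicts, try not to add Anaconda's lib path (the ~/.pyenv/versions/anaconda2-4.1.0/lib mentioned above) to LD_LIBRARY_PATH, because other software may need the system Python rather than Anaconda's.

Here is what I did: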

cd /usr/lib/x86_64-linux-gnu/
sudo cp ~/.pyenv/versions/anaconda2-4.1.0/lib/libhdf5.so.10* .
sudo cp ~/.pyenv/versions/anaconda2-4.1.0/lib/libhdf5_hl.so.10* .
sudo ln -sf libhdf5.so.10.1.0 libhdf5.so.10
sudo ln -sf libhdf5_hl.so.10.0.2 libhdf5_hl.so.10
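
To see which shared libraries a binary still cannot resolve, both before and after copying, the output of ldd can be filtered. This is a sketch: missing_libs is a hypothetical helper name, and the build/tools/caffe path in the usage comment is an assumption about where the Caffe binary ends up.

```shell
# missing_libs (hypothetical helper): reads `ldd` output on stdin and
# prints the name of every library reported as "not found".
missing_libs() {
    awk '/=> not found/ {print $1}'
}

# Usage from the Caffe source directory (binary path is an assumption):
#   ldd build/tools/caffe | missing_libs
```

An empty result means the loader can resolve every dependency of that binary.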

Or perhaps the more direct fix is to modify Caffe's Makefile?

If some runtest cases still fail:

[  FAILED  ] SGDSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] AdaGradSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] NesterovSolverTest/0.TestSnapshot, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] NesterovSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] AdaDeltaSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] AdamSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>
[  FAILED  ] RMSPropSolverTest/0.TestSnapshotShare, where TypeParam = caffe::CPUDevice<float>

add the following to /etc/profile:

export CUDA_VISIBLE_DEVICES=0
export MKL_CBWR=AUTO

Don't forget to run source /etc/profile.