2017년 6월 14일 수요일

ppc64le 환경에서의 anaconda python2&3 환경 동시 설정 - XGBoost, OpenCV, KoNLPy, MeCab 등

먼저, ppc64le 환경에서의 miniconda 설치에 대해서는 아래 post를 참조하십시요.

http://hwengineer.blogspot.kr/2017/05/minsky-continuum-anaconda.html

여기서는 miniconda3 (python3.6)이 이미 설치된 환경에서 시작하며, ppc64le 환경에서 anaconda를 사용하실 때 흔히 있을 수 있는 질문과 응답 형식으로 정리했습니다.


Q1.  Anaconda 대신 Miniconda3로 가야하는 경우에는 다음 패키지도 conda install로 설치 가능한지 ? 

numpy, pandas, sklearn(scikit-learn), joblib, flask, bs4(beautiful soup), MKL(math kernel library), 

A1.  예, 아래와 같이 다 conda로 install 잘 됩니다.  단 하나, MKL은 ppc64le에는 없습니다만, 그와 수반되어 사용되는 numpy 및 scipy는 아래와 같이 conda로 잘 설치되므로 꼭 MKL이 있어야 할 필요는 없습니다.

u0017496@sys-87576:~$ conda install joblib
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    joblib: 0.11-py36_0

Proceed ([y]/n)? y

joblib-0.11-py 100% |#########################################################| Time: 0:00:00   7.17 MB/s

u0017496@sys-87576:~$ conda install scikit-learn
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    scikit-learn: 0.18.1-np112py36_1

Proceed ([y]/n)? y

scikit-learn-0 100% |#########################################################| Time: 0:00:00  14.48 MB/s

u0017496@sys-87576:~$ conda install pandas
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    pandas:          0.20.1-np112py36_0
    python-dateutil: 2.6.0-py36_0
    pytz:            2017.2-py36_0

Proceed ([y]/n)? y

pytz-2017.2-py 100% |#########################################################| Time: 0:00:00   7.85 MB/s
python-dateuti 100% |#########################################################| Time: 0:00:00   8.43 MB/s
pandas-0.20.1- 100% |#########################################################| Time: 0:00:02  10.59 MB/s

u0017496@sys-87576:~$ conda install flask
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    click:        6.7-py36_0
    flask:        0.12.2-py36_0
    itsdangerous: 0.24-py36_0
    jinja2:       2.9.6-py36_0
    markupsafe:   0.23-py36_2
    werkzeug:     0.12.2-py36_0

Proceed ([y]/n)? y

click-6.7-py36 100% |#########################################################| Time: 0:00:00   7.14 MB/s
itsdangerous-0 100% |#########################################################| Time: 0:00:00  11.12 MB/s
markupsafe-0.2 100% |#########################################################| Time: 0:00:00  15.42 MB/s
werkzeug-0.12. 100% |#########################################################| Time: 0:00:00   5.71 MB/s
jinja2-2.9.6-p 100% |#########################################################| Time: 0:00:00  14.30 MB/s
flask-0.12.2-p 100% |#########################################################| Time: 0:00:00   7.08 MB/s

u0017496@sys-87576:~$ conda install beautifulsoup4
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3:

The following NEW packages will be INSTALLED:

    beautifulsoup4: 4.6.0-py36_0

Proceed ([y]/n)? y

beautifulsoup4 100% |#########################################################| Time: 0:00:00   7.12 MB/s


u0017496@sys-87576:~$ conda install numpy scipy
Fetching package metadata .........
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /home/u0017496/miniconda3:
#
numpy                     1.12.1                   py36_0
scipy                     0.19.0              np112py36_0


Q2.  Python2 & 3을 위해서 Anaconda2, Anaconda3을 각각 설치하는 대신에 아래 URL에 나오는 것처럼 Anaconda3 하나만 설치하고 그 내부에 python2 가상환경 구축 가능한지?

www.continuum.io/blog/developer-blog/python-3-support-anaconda

A2.  예, 잘 됩니다.  

u0017496@sys-87576:~$ conda create -n py2k python=2 anaconda
Fetching package metadata .........
Solving package specifications: .

Package plan for installation in environment /home/u0017496/miniconda3/envs/py2k:

The following NEW packages will be INSTALLED:

    alabaster:          0.7.10-py27_0
    anaconda:           4.4.0-np112py27_0
    anaconda-client:    1.6.3-py27_0
....
...
jupyter-1.0.0- 100% |#########################################################| Time: 0:00:00  10.50 MB/s
anaconda-4.4.0 100% |#########################################################| Time: 0:00:00   5.53 MB/s
#
# To activate this environment, use:
# > source activate py2k
#
# To deactivate this environment, use:
# > source deactivate py2k
#

py2k, 즉 python2 가상환경에 들어가려면 다음과 같이 하시면 됩니다.

u0017496@sys-87576:~$ source activate py2k

python이 이제 2.7.13 버전이 구동되는 것을 보실 수 있습니다.

(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>>


Q3. XGBoost의 경우, pip install을 통한 설치는 CPU 버전 패키지만 지원합니다.  
GPU 사용가능한 XGBoost는 아래 URL의 Github 소스코드 + CUB 관련 파일 컴파일이 필요한데 ppc64le에서도 가능한지?

github.com/dmlc/xgboost/tree/master/plugin/updater_gpu

A3.  예, 잘 됩니다.  git clone으로 위의 package를 download 받아 그 안의 다음 4개의 Makefile에서 intel-specifc한 option인 -msse2만 제거하면 잘 됩니다.

u0017496@sys-87576:~/$ git clone --recursive https://github.com/dmlc/xgboost.git

u0017496@sys-87576:~/$ cd xgboost

u0017496@sys-87576:~/xgboost$ vi ./rabit/guide/Makefile ./rabit/Makefile ./dmlc-core/Makefile ./Makefile

u0017496@sys-87576:~/xgboost$ ./build.sh
...
g++ -std=c++11 -Wall -Wno-unknown-pragmas -Iinclude   -Idmlc-core/include -Irabit/include -I/include -O3 -funroll-loops -mcpu=power8 -fPIC -fopenmp -o xgboost  build/cli_main.o build/logging.o build/learner.o build/common/common.o build/common/hist_util.o build/metric/metric.o build/metric/rank_metric.o build/metric/elementwise_metric.o build/metric/multiclass_metric.o build/objective/multiclass_obj.o build/objective/objective.o build/objective/rank_obj.o build/objective/regression_obj.o build/data/sparse_page_dmatrix.o build/data/sparse_page_source.o build/data/sparse_page_writer.o build/data/simple_csr_source.o build/data/data.o build/data/sparse_page_raw_format.o build/data/simple_dmatrix.o build/tree/updater_prune.o build/tree/tree_updater.o build/tree/updater_histmaker.o build/tree/updater_refresh.o build/tree/updater_sync.o build/tree/updater_colmaker.o build/tree/updater_skmaker.o build/tree/tree_model.o build/tree/updater_fast_hist.o build/gbm/gbtree.o build/gbm/gblinear.o build/gbm/gbm.o build/c_api/c_api.o build/c_api/c_api_error.o dmlc-core/libdmlc.a rabit/lib/librabit.a  -pthread -lm  -fopenmp -lrt  -lrt
Successfully build multi-thread xgboost


Q4. konlpy는 JDK 및 Jpype1 관련 설정 및 PATH, JAVA_HOME 등등 경로 지정 필요합니다. python2, python3에서 모두 잘 작동하는지 ?

A4.  잘 됩니다.

먼저 기본 환경인 python3에서 다음과 같이 잘 됩니다.

u0017496@sys-87576:~/xgboost$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-ppc64el
u0017496@sys-87576:~/xgboost$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import *
>>> Kkma()
<konlpy.tag._kkma.Kkma instance at 0x3fff9561be60>

이어서 위에서 설치했던 가상환경의 python2에서도 잘 됩니다.  다만 환경 변수 등은 가상환경 속에서 다시 또 해줘야 합니다.

u0017496@sys-87576:~$ source activate py2k
(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Kkma()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Kkma instance has no __call__ method

위에서 error가 난 이유는 JAVA_HOME 설정이 이 가상환경 속에서는 안 되어 있기 때문입니다.  그걸 해주면 error는 없어집니다.

(py2k) u0017496@sys-87576:~$ export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-ppc64el
(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Kkma()
<konlpy.tag._kkma.Kkma instance at 0x3fff9561be60>


Q5. KoNLPy의 mecab 추가 설치 및 작동이 잘 되는지 ?  가령 아래 code가 python2, python3에서 모두 잘 되는지 ?

from konlpy.tag import *
Mecab()

A5.  다음과 같이 잘 됩니다.  처음에는 error가 나서 당황했는데, 보니 mecab-ko-dic를 따로 설치하면 해결되는 문제입니다.  

먼저 mecab-python3을 pip로 설치하고, 그 뒤에 mecab-ko-dic을 download 받아 설치합니다.

u0017496@sys-87576:~$ pip install mecab-python3

u0017496@sys-87576:~$ wget https://bitbucket.org/eunjeon/mecab-ko-dic/downloads/mecab-ko-dic-1.6.1-20140814.tar.gz

u0017496@sys-87576:~$ tar -zxvf mecab-ko-dic-1.6.1-20140814.tar.gz
u0017496@sys-87576:~$ cd mecab-ko-dic-1.6.1-20140814
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ ./autogen.sh
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ ./configure
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ make
u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ sudo make install

u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from konlpy.tag import *
>>> Mecab('/usr/lib/mecab/dic/mecab-ko-dic')
<konlpy.tag._mecab.Mecab object at 0x3fffa01773c8>

python2의 가상환경에서도 물론 동일하게 잘 됩니다.

u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ source activate py2k
(py2k) u0017496@sys-87576:~/mecab-ko-dic-1.6.1-20140814$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from konlpy.tag import *
>>> Mecab('/usr/lib/mecab/dic/mecab-ko-dic')
<konlpy.tag._mecab.Mecab instance at 0x3fff754e7cb0>
>>>


Q5.  OpenCV & GraphViz의 경우, conda install 을 사용하지 않고 설치가 가능하다면, python2&3 안에서 호출 가능한 형태로 설치되는 것인지?

import cv2
import graphviz

A6. 예, 둘다 잘 됩니다.   https://repo.continuum.io/pkgs/free/linux-ppc64le에서 opencv-3.1.0-np112py27_2.tar.bz2 등의 opencv package를 받아와서 설치하면 됩니다.

먼저 python3를 위한 opencv package를 설치해서 테스트합니다.

u0017496@sys-87576:~$ wget https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py36_2.tar.bz2
--2017-06-14 03:35:25--  https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py36_2.tar.bz2
Resolving repo.continuum.io (repo.continuum.io)... 104.16.19.10, 104.16.18.10, 2400:cb00:2048:1::6810:120a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.19.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13145265 (13M) [application/x-tar]
Saving to: ‘opencv-3.1.0-np112py36_2.tar.bz2’

opencv-3.1.0-np112py36_2.t 100%[=====================================>]  12.54M  9.02MB/s    in 1.4s

2017-06-14 03:35:26 (9.02 MB/s) - ‘opencv-3.1.0-np112py36_2.tar.bz2’ saved [13145265/13145265]


u0017496@sys-87576:~$ mkdir opencv-3.1.0-py36
u0017496@sys-87576:~$ cd opencv-3.1.0-py36
u0017496@sys-87576:~/opencv-3.1.0-py36$ tar -jxvf ../opencv-3.1.0-np112py36_2.tar.bz2

u0017496@sys-87576:~$ export PYTHONPATH=$PYTHONPATH:/home/u0017496/opencv-3.1.0-py36/lib/python3.6/site-packages

u0017496@sys-87576:~$ python
Python 3.6.0 |Continuum Analytics, Inc.| (default, Mar 16 2017, 19:36:14)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> import graphviz
>>>

다음으로 python2를 위한 opencv package를 설치해서 테스트합니다.

u0017496@sys-87576:~$ wget https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py27_2.tar.bz2
--2017-06-14 03:35:38--  https://repo.continuum.io/pkgs/free/linux-ppc64le/opencv-3.1.0-np112py27_2.tar.bz2
Resolving repo.continuum.io (repo.continuum.io)... 104.16.18.10, 104.16.19.10, 2400:cb00:2048:1::6810:130a, ...
Connecting to repo.continuum.io (repo.continuum.io)|104.16.18.10|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13147104 (13M) [application/x-tar]
Saving to: ‘opencv-3.1.0-np112py27_2.tar.bz2’

opencv-3.1.0-np112py27_2.t 100%[=====================================>]  12.54M  9.76MB/s    in 1.3s

2017-06-14 03:35:39 (9.76 MB/s) - ‘opencv-3.1.0-np112py27_2.tar.bz2’ saved [13147104/13147104]

u0017496@sys-87576:~$ mkdir opencv-3.1.0-py27
u0017496@sys-87576:~$ cd opencv-3.1.0-py27
u0017496@sys-87576:~/opencv-3.1.0-py27$ tar -jxvf ../opencv-3.1.0-np112py27_2.tar.bz2

u0017496@sys-87576:~$ source activate py2k

(py2k) u0017496@sys-87576:~$ export PYTHONPATH=$PYTHONPATH:/home/u0017496/opencv-3.1.0-py27/lib/python2.7/site-packages

(py2k) u0017496@sys-87576:~$ python
Python 2.7.13 |Anaconda 4.4.0 (64-bit)| (default, Mar 16 2017, 18:34:18)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> import cv2
>>> import graphviz
>>>

댓글 없음:

댓글 쓰기