multithreading - Parallelized methods in Python


I am working on a scientific cluster that has recently been upgraded by the administrator, and now my code is super slow, whereas it used to be decent. I am using Python 3.4.

The way this kind of thing works is the following: I have to guess what the administrator may have changed and then ask him to make the appropriate changes, because if I ask him a direct question we will not conclude anything.

So, I have run my code with a profiler, and I have found that there are some routines that are called many times. These routines are:

  1. built-in method array (called ~10^5 times, execution time 0.003 s)
  2. sort of numpy.ndarray (~5000 calls, 0.03 s)
  3. uniform of mtrand.RandomState (~2000 calls, 0.03 s)
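A call-count listing like the one above can be reproduced with the standard `cProfile` and `pstats` modules. The following is only a minimal sketch: `workload` is a hypothetical stand-in for the real code (which is not shown in the question), and the iteration counts are arbitrary.

```python
import cProfile
import io
import pstats

import numpy as np


def workload(n_iter=200, size=1000):
    """Hypothetical stand-in for the real code: random draws, array creation, sorting."""
    rng = np.random.RandomState(0)
    total = 0.0
    for _ in range(n_iter):
        values = rng.uniform(size=size)   # shows up as 'uniform' of RandomState
        arr = np.array(values.tolist())   # shows up as the built-in array method
        arr.sort()                        # shows up as 'sort' of numpy.ndarray
        total += arr[0]
    return total


def profile_report(func, top=10):
    """Run func under cProfile and return a text report sorted by call count."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    pstats.Stats(profiler, stream=stream).sort_stats("calls").print_stats(top)
    return stream.getvalue()


report = profile_report(workload)
print(report)
```

Sorting by `"tottime"` instead of `"calls"` would rank the routines by the time spent inside them, which is closer to the percentages quoted further down.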

My guess is that some of these libraries were parallelized in the previously installed version of Python, for example by being linked against MPI-parallelized or multithreaded math kernel libraries.

I would like to know whether my guess is correct or whether I have to think of something else, because my code has not changed.


The routines I have quoted here are the relevant ones, because they account for 85% of the total time. In particular, array takes 55% of the total time. The efficiency of my code has degraded by a factor of 10. Before talking to the system manager, I would like some confirmation that these routines have a parallel version.


Of course I cannot test my code on both the new and the old configuration of the cluster, because the old configuration is gone. But I can see that on this cluster numpy.array takes 8 minutes, while on another cluster I have access to it takes 2 seconds. From top I can see that the memory used is low (~0.1%) while a single CPU is used at 100%.
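Since the old configuration cannot be restored, one way to compare machines is to time `numpy.array` in isolation on each of them. This is a minimal sketch using the standard `timeit` module. The input size and repeat counts are arbitrary assumptions, not values taken from the real code.

```python
import timeit

import numpy as np


def time_array_creation(n_elements=1000, repeats=5, number=100):
    """Return the best per-call time in seconds of numpy.array on a Python list."""
    data = list(range(n_elements))
    timings = timeit.repeat(lambda: np.array(data), repeat=repeats, number=number)
    return min(timings) / number


per_call = time_array_creation()
print("numpy %s: array() best per-call time %.3e s" % (np.__version__, per_call))
```

Running the same script on both clusters gives a number that does not depend on the rest of the application, which makes it easier to hand to an administrator.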


    In [3]: numpy.__config__.show()
    lapack_info:
        libraries = ['lapack']
        library_dirs = ['/usr/lib64']
        language = f77
    atlas_threads_info:
        libraries = ['satlas']
        library_dirs = ['/usr/lib64/atlas']
        define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
        language = c
        include_dirs = ['/usr/include']
    blas_opt_info:
        libraries = ['satlas']
        library_dirs = ['/usr/lib64/atlas']
        define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
        language = c
        include_dirs = ['/usr/include']
    atlas_blas_threads_info:
        libraries = ['satlas']
        library_dirs = ['/usr/lib64/atlas']
        define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
        language = c
        include_dirs = ['/usr/include']
    openblas_info:
      NOT AVAILABLE
    lapack_opt_info:
        libraries = ['satlas', 'lapack']
        library_dirs = ['/usr/lib64/atlas', '/usr/lib64']
        define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
        language = f77
        include_dirs = ['/usr/include']
    lapack_mkl_info:
      NOT AVAILABLE
    blas_mkl_info:
      NOT AVAILABLE
    mkl_info:
      NOT AVAILABLE

    ldd /usr/lib64/python3.4/site-packages/numpy/core/_dotblas.cpython-34m.so
        linux-vdso.so.1 =>  (0x00007fff46172000)
        libsatlas.so.3 => /usr/lib64/atlas/libsatlas.so.3 (0x00007f0d941a0000)
        libpython3.4m.so.1.0 => /lib64/libpython3.4m.so.1.0 (0x00007f0d93d08000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0d93ae8000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f0d93728000)
        libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00007f0d93400000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f0d930f8000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f0d92ef0000)
        libutil.so.1 => /lib64/libutil.so.1 (0x00007f0d92ce8000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f0d950e0000)
        libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f0d92aa8000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0d92890000)

numpy is linked against ATLAS, and I see a link to libpthread.so (so I assume it is multithreaded, is it?).

On the other hand, I updated numpy from version 1.8.2 to 1.9.2, and now the array method takes 5 s instead of 300 s. I think this is probably the reason for my code slowing down (maybe the system administrator downgraded the numpy version? Who knows!).
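Because the installed numpy version turned out to matter, it can help to record which interpreter and which numpy installation a job actually picks up. This is a minimal sketch. The helper name `describe_environment` is made up for illustration.

```python
import sys

import numpy as np


def describe_environment():
    """Collect the interpreter version plus the numpy version and install path."""
    return {
        "python": sys.version.split()[0],
        "numpy": np.__version__,
        "numpy_path": np.__file__,
    }


info = describe_environment()
for key in sorted(info):
    print("%-10s %s" % (key, info[key]))
```

Logging this at the start of each job makes a silent upgrade or downgrade visible the next time the timings change.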

A parallelized BLAS helps only a limited number of numpy/scipy functions (see these test scripts):

  • numpy.dot
  • scipy.linalg.cholesky
  • scipy.linalg.svd

If you can run

    import numpy.core._dotblas

without getting an ImportError, you have the optimized numpy.dot available.

Array creation speed should not be influenced by this, however.
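The check above, together with the point that BLAS affects `numpy.dot` but not array creation, can be sketched as follows. Note the assumption: `numpy.core._dotblas` exists only in older NumPy releases such as the 1.8 and 1.9 versions discussed here, and, as far as I know, later releases no longer ship it, so a `False` result on a modern installation does not mean that `numpy.dot` is unoptimized. The matrix size and repeat counts are arbitrary.

```python
import importlib
import timeit

import numpy as np


def has_dotblas():
    """Return True if the old BLAS-backed dot module can be imported."""
    try:
        importlib.import_module("numpy.core._dotblas")
    except ImportError:
        return False
    return True


def best_time(func, repeats=3, number=5):
    """Best per-call wall time in seconds over several repeats."""
    return min(timeit.repeat(func, repeat=repeats, number=number)) / number


matrix = np.random.RandomState(0).uniform(size=(300, 300))
rows = matrix.tolist()

dot_time = best_time(lambda: np.dot(matrix, matrix))   # may use BLAS
array_time = best_time(lambda: np.array(rows))         # does not use BLAS

print("_dotblas importable:", has_dotblas())
print("numpy.dot   : %.3e s per call" % dot_time)
print("numpy.array : %.3e s per call" % array_time)
```

If only the `numpy.array` timing differs between two machines while `numpy.dot` stays comparable, the BLAS linkage is unlikely to be the cause of the slowdown.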

Can you post your code and how you use it? Or else a minimal example that has the problem? How is the code run on the cluster?

