multithreading - Parallelized methods in Python
I am working on a scientific cluster that has recently been upgraded by the administrator, and now my code is super slow, whereas it used to be decent. I am using Python 3.4.
The way this kind of thing works is the following: I have to guess what the administrator may have changed and ask him to make the appropriate changes, because if I ask him a direct question we will not conclude anything.
So, I have run my code with a profiler, and I have found that some routines are called many times. These routines are:
- built-in method array (called ~10^5 times, execution time 0.003 s)
- sort of numpy.ndarray (~5000 calls, 0.03 s)
- uniform of mtrand.RandomState (~2000 calls, 0.03 s)
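For reference, a per-routine breakdown like the one above can be collected with the standard-library profiler. This is only a minimal sketch: `workload` is a hypothetical stand-in for the real code (which is not shown in the question), and `profile_report` is a helper name made up for this example.

```python
import cProfile
import io
import pstats

import numpy as np


def workload(n_calls=1000):
    """Hypothetical stand-in for the real code: many small NumPy calls."""
    rng = np.random.RandomState(0)
    for _ in range(n_calls):
        values = rng.uniform(size=100)    # 'uniform' of mtrand.RandomState
        arr = np.array(values.tolist())   # built-in method array
        arr.sort()                        # 'sort' of numpy.ndarray


def profile_report(func, top=10):
    """Run func under cProfile and return the top entries sorted by total time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("tottime").print_stats(top)
    return stream.getvalue()


report = profile_report(workload)
print(report)
```

The `ncalls` and `tottime` columns of the report correspond to the call counts and execution times quoted in the list.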
My guess is that some of these libraries were parallelized in the previously installed version of Python, for example by being linked to MPI-parallelized or multi-threaded math kernel libraries.
I would like to know whether my guess is correct or whether I have to think of something else, because my code itself has not changed.
The routines I have quoted here are the relevant ones, because they account for 85% of the total time. In particular, array takes 55% of the total time. The efficiency of my code has degraded by a factor of 10. Before talking to the system manager, I would like confirmation that these routines have a parallel version.
Of course I cannot test my code on both the new and the old configuration of the cluster, because the old configuration is gone. But I can see that on this cluster numpy.array takes 8 minutes, while on another cluster I have access to it takes 2 seconds. With top I can see that the memory used is low (~0.1%) while a single CPU is used at 100%.
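To make the comparison between the two clusters reproducible, the same small timing script can be run on both machines. The sketch below is an assumption about what such a script could look like: the array size, the number of repetitions and the function name `benchmark` are arbitrary choices for illustration, not taken from the original code.

```python
import sys
import timeit

import numpy as np


def benchmark(repeat=3, number=100, size=10000):
    """Return best-of-`repeat` seconds per call for three small NumPy operations."""
    rng = np.random.RandomState(0)
    data = rng.uniform(size=size).tolist()   # plain Python list as input
    arr = np.array(data)
    cases = {
        "numpy.array(list)": lambda: np.array(data),
        "ndarray.sort": lambda: arr.copy().sort(),
        "RandomState.uniform": lambda: rng.uniform(size=size),
    }
    results = {}
    for name, func in cases.items():
        best = min(timeit.repeat(func, repeat=repeat, number=number))
        results[name] = best / number
    return results


# Report the interpreter and NumPy versions next to the timings,
# so that results from different machines can be compared directly.
print("Python", sys.version.split()[0], "NumPy", np.__version__)
timings = benchmark()
for name, seconds in timings.items():
    print("{:<22s} {:.3e} s per call".format(name, seconds))
```

If the timings differ by orders of magnitude between machines for the same input, the printed versions are the first thing worth comparing.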
In [3]: numpy.__config__.show()
lapack_info:
    libraries = ['lapack']
    library_dirs = ['/usr/lib64']
    language = f77
atlas_threads_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = c
    include_dirs = ['/usr/include']
blas_opt_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
    language = c
    include_dirs = ['/usr/include']
atlas_blas_threads_info:
    libraries = ['satlas']
    library_dirs = ['/usr/lib64/atlas']
    define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')]
    language = c
    include_dirs = ['/usr/include']
openblas_info:
  NOT AVAILABLE
lapack_opt_info:
    libraries = ['satlas', 'lapack']
    library_dirs = ['/usr/lib64/atlas', '/usr/lib64']
    define_macros = [('ATLAS_WITHOUT_LAPACK', None)]
    language = f77
    include_dirs = ['/usr/include']
lapack_mkl_info:
  NOT AVAILABLE
blas_mkl_info:
  NOT AVAILABLE
mkl_info:
  NOT AVAILABLE

ldd /usr/lib64/python3.4/site-packages/numpy/core/_dotblas.cpython-34m.so
    linux-vdso.so.1 => (0x00007fff46172000)
    libsatlas.so.3 => /usr/lib64/atlas/libsatlas.so.3 (0x00007f0d941a0000)
    libpython3.4m.so.1.0 => /lib64/libpython3.4m.so.1.0 (0x00007f0d93d08000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0d93ae8000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f0d93728000)
    libgfortran.so.3 => /lib64/libgfortran.so.3 (0x00007f0d93400000)
    libm.so.6 => /lib64/libm.so.6 (0x00007f0d930f8000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f0d92ef0000)
    libutil.so.1 => /lib64/libutil.so.1 (0x00007f0d92ce8000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f0d950e0000)
    libquadmath.so.0 => /lib64/libquadmath.so.0 (0x00007f0d92aa8000)
    libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0d92890000)

So NumPy is linked against ATLAS, and I see a link to libpthread.so (so I assume it is multithreaded, isn't it?).
On the other hand, I updated NumPy from version 1.8.2 to 1.9.2, and now the array method takes 5 s instead of 300 s. I think this is probably the reason my code slowed down (maybe the system administrator downgraded the NumPy version? Who knows!).
A parallelized BLAS only helps a limited number of NumPy/SciPy functions (see these test scripts):
- numpy.dot
- scipy.linalg.cholesky
- scipy.linalg.svd
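One rough way to see whether numpy.dot actually runs on several threads is to compare the CPU time consumed by the process with the elapsed wall-clock time. This is a sketch under assumptions: the matrix size and the function name `cpu_to_wall_ratio` are arbitrary, and a ratio close to 1 only suggests, rather than proves, that a single thread is doing the work.

```python
import time

import numpy as np


def cpu_to_wall_ratio(n=500, repeats=5):
    """Multiply two n x n matrices; return (wall seconds, CPU seconds / wall seconds)."""
    rng = np.random.RandomState(0)
    a = rng.uniform(size=(n, n))
    b = rng.uniform(size=(n, n))
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time summed over all threads of the process
    for _ in range(repeats):
        np.dot(a, b)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu / wall


wall, ratio = cpu_to_wall_ratio()
print("wall time: {:.3f} s, CPU/wall ratio: {:.2f}".format(wall, ratio))
```

A ratio noticeably above 1 would indicate that the linked BLAS is using more than one core for the matrix product.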
If you can run
import numpy.core._dotblas
without getting an ImportError, you have the optimized numpy.dot available.
Array creation speed should not be influenced by this, however.
Can you post your code and how you use it? Or else a minimal example that has the problem? How is the code run on the cluster?
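The import check for numpy.core._dotblas can be wrapped in a small helper so that it reports a result instead of raising. This is a minimal sketch; `has_dotblas` is a name invented here. Note that the module belongs to older NumPy releases such as the 1.8/1.9 versions discussed in this thread. On newer releases it may no longer exist, in which case a result of False says nothing about the BLAS in use.

```python
import importlib


def has_dotblas():
    """Return True if numpy.core._dotblas can be imported, False otherwise."""
    try:
        importlib.import_module("numpy.core._dotblas")
    except ImportError:
        # Covers both a missing NumPy and a NumPy without this module.
        return False
    return True


print("numpy.core._dotblas importable:", has_dotblas())
```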