cameron.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< billLiu> I know what you mean, naywhayare. In my case, it needs a lot of runs to finish the k-means step, and that runs on just a single core. When the training moves on to the EM steps, the matrix products benefit from OpenBLAS and use 32 cores. So k-means becomes the bottleneck of the algorithm
< billLiu> Of course, normally the main EM steps take the longest time; I am just talking about my case :) I am using OpenMP to parallelize the k-means update to see whether it gets faster. Thanks for your advice
< billLiu> I found that the comment "We require OpenMP now." only appears in mlpack 1.1.0, not in my version 1.0.11, so I copied that part from 1.1.0 into my version. When I rebuild it, I get an error. If I comment out the line "set(CMAKE_EXE_LINKER_FLAGS ...", the build passes. I also checked FindOpenMP in my CMake modules and could not find OpenMP_EXE_LINKER_FLAGS; is that correct? My CMake version is 3.1.0.
jbc_ has quit [Quit: jbc_]
jbc_ has joined #mlpack
jbc_ has quit [Quit: jbc_]
govg has joined #mlpack
govg is now known as Guest71641
< naywhayare> billLiu: sorry for the slow response... it is a holiday week so I am not spending much time checking IRC :)
< naywhayare> I guess the OpenMP_EXE_LINKER_FLAGS line is wrong and doesn't work with newer CMake
< naywhayare> I followed this guide when I did it with CMake 2.8: http://berenger.eu/blog/cmake-openmp-with-cmake/
< naywhayare> but it seems like commenting out the line like you did is just fine
< billLiu> Never mind. Enjoying the holiday is the most important thing, I think :) Just respond whenever you have free time
< billLiu> yes, you are right, commenting it out works fine for me
< naywhayare> another idea to accelerate the k-means step is to use a different LloydStepType parameter
< naywhayare> I recently implemented the ElkanKMeans class which might work well
< naywhayare> you would use it like this... KMeans<EuclideanDistance, RandomPartition, AllowEmptyClusters, ElkanKMeans>
< naywhayare> actually, using AllowEmptyClusters instead of the default MaxVarianceNewCluster might cause some speedup too...
< naywhayare> so your GMM object would look like this (maybe a bit ugly...):
< naywhayare> GMM<EMFit<KMeans<EuclideanDistance, RandomPartition, AllowEmptyClusters, ElkanKMeans> > >
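A minimal sketch of what that instantiation could look like in a program, assuming the mlpack 1.x-era include paths, namespaces, and template parameter order that naywhayare describes (all of these may differ between mlpack versions):

    // Sketch only: header paths, namespaces, and template parameter order are
    // assumptions based on the mlpack 1.x layout and may vary by version.
    #include <mlpack/core.hpp>
    #include <mlpack/methods/kmeans/kmeans.hpp>
    #include <mlpack/methods/kmeans/elkan_kmeans.hpp>
    #include <mlpack/methods/kmeans/allow_empty_clusters.hpp>
    #include <mlpack/methods/gmm/gmm.hpp>
    #include <mlpack/methods/gmm/em_fit.hpp>

    using namespace mlpack;

    // K-means variant from the discussion: Euclidean distance, random initial
    // partitioning, empty clusters allowed, Elkan's algorithm for the Lloyd step.
    typedef kmeans::KMeans<metric::EuclideanDistance,
                           kmeans::RandomPartition,
                           kmeans::AllowEmptyClusters,
                           kmeans::ElkanKMeans> FastKMeans;

    // GMM fitted with EM, where EM uses the k-means type above for initialization.
    typedef gmm::GMM<gmm::EMFit<FastKMeans> > FastGMM;

    int main()
    {
      arma::mat data;
      data.randu(5, 1000); // 5-dimensional toy data, 1000 points.

      FastGMM g(3, data.n_rows); // 3 Gaussians in 5 dimensions.
      g.Estimate(data);          // EM training; k-means provides the initial clustering.

      return 0;
    }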
< billLiu> Hmm, I see. I will do that as the next step. I have implemented some other algorithms for my own requirements, so if I switch to a newer mlpack version, I would have to rewrite some of my code...
< billLiu> If I just put the OpenMP flag in CMAKE_C_FLAGS and CMAKE_CXX_FLAGS, is that enough for OpenMP to be linked? It seems to me that my compiled code still runs as serial code, and the number of threads is 1 (checked with omp_get_num_threads() in a parallel block)
< naywhayare> hm, do you have a #pragma omp parallel? (silly question maybe)
< naywhayare> you might also specify OMP_NUM_THREADS as an environment variable on the command-line
< billLiu> actually I did this explicitly to check my suspicion: #pragma omp parallel num_threads(16) { Log::Info << omp_get_num_threads() << std::endl; }
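A self-contained version of that check, using plain OpenMP with no mlpack dependency; if it is built without -fopenmp, the _OPENMP macro is undefined and the program says so, which matches the single-thread behavior billLiu describes:

    // Minimal OpenMP sanity check. Build with:  g++ -fopenmp omp_check.cpp -o omp_check
    #include <cstdio>
    #ifdef _OPENMP
      #include <omp.h>
    #endif

    int main()
    {
    #ifdef _OPENMP
      #pragma omp parallel num_threads(16)
      {
        // Only the master thread prints, to keep the output readable.
        #pragma omp master
        std::printf("threads in parallel region: %d\n", omp_get_num_threads());
      }
    #else
      std::printf("compiled without OpenMP support (-fopenmp missing)\n");
    #endif
      return 0;
    }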
< naywhayare> I thought you didn't need to specify num_threads explicitly; if you don't, the system decides automatically or uses the OMP_NUM_THREADS environment variable
< naywhayare> but honestly I am not an OpenMP expert (yet)... :(
< billLiu> I have already tried letting the system choose the number of threads automatically, but it fails in the same way
< billLiu> OK, I will try some other methods to figure it out. If I find a solution, I will tell you
< naywhayare> yes, please do -- I would be interested to know what the solution is :)
Guest71641 has quit [Quit: leaving]
< billLiu> I see my problem... I dropped -fopenmp when compiling my own program. I thought -lmlpack would link against the mlpack library, which is compiled with -fopenmp, so there would be no need to pass -fopenmp again, since my own program does not use OpenMP directly.
< billLiu> After adding -fopenmp when compiling my program, the KMeans training becomes parallel.
< billLiu> For the mlpack CMake build, setting CMAKE_C_FLAGS and CMAKE_CXX_FLAGS is enough for me
ajkl has joined #mlpack
ajkl has quit [Client Quit]
govg has joined #mlpack
govg has quit [Ping timeout: 265 seconds]
govg has joined #mlpack
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
< naywhayare> govg: I meant to update you on this a while ago -- 1.0.11 is released and has the testing bugs worked out, so you should be able to easily push it to AUR, I think :)
< naywhayare> billLiu: ah, yeah, you'll need -fopenmp for your program, because GMM<...> is templated and will be instantiated not in libmlpack.so but instead in the program you are compiling
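A hypothetical illustration of that point; the file name and build line below are made up, and the exact link flags depend on the local setup:

    // my_program.cpp -- hypothetical file name, for illustration only.
    //
    // GMM<...> is a class template, so its code (including any OpenMP-parallelized
    // k-means pulled in from the mlpack headers) is instantiated and compiled in
    // THIS translation unit, not inside libmlpack.so.  That is why this file also
    // has to be compiled with -fopenmp, e.g.:
    //
    //   g++ -fopenmp my_program.cpp -o my_program -lmlpack -larmadillo
    //
    #include <mlpack/methods/gmm/gmm.hpp>

    int main()
    {
      // Instantiating the template here forces the compiler to generate its code
      // in this object file; -fopenmp decides whether the pragmas in that code
      // become real parallel regions or are silently ignored.
      mlpack::gmm::GMM<> g(3, 5); // 3 Gaussians, 5-dimensional.
      return 0;
    }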
govg has quit [Ping timeout: 258 seconds]
govg has joined #mlpack
krsna has joined #mlpack
< krsna> Hi
< krsna> can someone please share any helpful resources (ebooks/links) on ML, C++ memory management, etc.? I'm a newbie to ML and interested in contributing to mlpack.
< krsna> trying to get up to speed. any help?
< zoq> krsna: Hello, there is a neat blog post which tries to answer your question: http://metacademy.org/roadmaps/cjrd/level-up-your-ml
< zoq> I think another or additional way to get started in machine learning is to watch video courses (e.g. https://www.coursera.org/course/ml).
< zoq> Once you are familiar with the basic methods, download and compile mlpack, and go through the tutorials; especially the parts of them that involve the C++ interface. That will help give you an idea of how the code is laid out and the code standards that are used.
< krsna> hey @zoq thanks a lot for the information.
< zoq> krsna: Sure, if you have any further questions, please do not hesitate to ask, either on the mailing list or IRC.
< krsna> one more: are there any C++ resources specific to memory management?
< krsna> which you think would be helpful for beginners
< zoq> hm, right now I can't recall any but maybe someone on the channel can recommend something?
krsna has quit [Quit: Page closed]
jbc_ has joined #mlpack
jbc_ has quit [Quit: jbc_]
govg has quit [Ping timeout: 240 seconds]