verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< rcurtin>
lozhnikov: okay, sounds good. I still think maybe AuxiliaryInformationType is the better place for normalModeMaxNumChildren even though SplitType uses it, but I'll think about it
< rcurtin>
if it is, I can do the refactoring
nilay_ has joined #mlpack
nilay has quit [Quit: Page closed]
nilay_ has quit [Client Quit]
nilay has joined #mlpack
< nilay>
in armadillo, we can slice the cube(x,y,z) into matrices by the third dimension, z, by using cube.slice(0), cube.slice(1), and so on... is there a way to slice the cube(x,y,z) into matrices by the first dimension, x?
< lozhnikov>
rcurtin: You're right. I thought about it and I decided that normalModeMaxNumChildren should be moved to AuxiliaryInformationType. I'll do the refactoring.
tham has joined #mlpack
< tham>
nilay: You said you want to slice through the x axis?
< tham>
What do you mean?
< tham>
Cube is a three-dimensional "matrix"
< tham>
You can treat it as a container which stores a lot of two-dimensional matrices, just like std::vector<arma::Mat<T>>
< tham>
my_cube.slice(0).col(0); //access column 0 of matrix 0 in my_cube
< tham>
You can use this solution to access a matrix of the cube too
< tham>
auto &b = a.slice(0);
< tham>
in most cases, you should be able to treat b as a reference to the matrix
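The layout tham describes can be sketched in plain C++ (no Armadillo needed to run this; all names here are illustrative, not Armadillo's API): a cube is stored column-major within each slice, slice after slice, so cube.slice(z) is a ready-made contiguous matrix, while a "slice along x" has to be gathered element by element.

```cpp
#include <cstddef>
#include <vector>

// Linear offset of element (row x, col y, slice z) in an
// nRows x nCols x nSlices cube stored slice-by-slice, column-major
// within each slice (the layout arma::cube uses).
std::size_t Offset(std::size_t x, std::size_t y, std::size_t z,
                   std::size_t nRows, std::size_t nCols) {
  return z * nRows * nCols + y * nRows + x;
}

// There is no built-in "slice along x", so gather row x of every slice:
// the result is an nCols x nSlices matrix, returned here flattened.
std::vector<double> XSlice(const std::vector<double>& cube, std::size_t x,
                           std::size_t nRows, std::size_t nCols,
                           std::size_t nSlices) {
  std::vector<double> out;
  for (std::size_t z = 0; z < nSlices; ++z)
    for (std::size_t y = 0; y < nCols; ++y)
      out.push_back(cube[Offset(x, y, z, nRows, nCols)]);
  return out;
}
```

In Armadillo itself, depending on the version, subcube or tube views can express the same gather without a hand-written loop.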
< tham>
About copyMakeBorder: what kind of mode do you want to implement? I do not think you need to implement all of the modes; you can add them step by step if you like
< tham>
Forgot to say: I was quite busy the last few days, but now I have more time to participate in GSoC
< tham>
May I study your code for copyMakeBorder?
< tham>
I am not good at studying that complicated theory (I am glad zoq is very good at this), but I think I am able to help you debug and design the API
< rcurtin>
tham: keonkim: if we remove normalize_labels, what will we replace it with? that function is used for a few algorithms (like NCA I think)
< rcurtin>
I have no problems removing it as long as we think through some alternative or something
< keonkim>
rcurtin: I found that normalization functions are implemented separately in other methods, so I thought gathering them all into one class would be great.
reshabh has joined #mlpack
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#828 (master - c15541b : Marcus Edel): The build passed.
< rcurtin>
marcosirc: I changed my mind, I think B_2 is right, I am writing up a proof now
< rcurtin>
you will have to try to find an error in the proof :)
< marcosirc>
Mmm. Do you think so? It will be great to see that proof.
< rcurtin>
yeah... almost done
nilay has joined #mlpack
sumedhghaisas has joined #mlpack
< nilay>
tham: sorry for the late reply... I implemented the reflection border type. Here is the code: http://pastebin.com/hJc0gWB8
< nilay>
we only pad the image on the right and bottom, as done in the Python code.
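For reference, here is a minimal sketch of right-and-bottom-only reflection padding. This is a hypothetical stand-in, not the pastebin code: it mirrors indices about the last row/column (OpenCV's BORDER_REFLECT_101 style), and assumes the pad width is smaller than the image dimensions.

```cpp
#include <cstddef>
#include <vector>

// Pad a h x w matrix by `pad` reflected rows at the bottom and `pad`
// reflected columns at the right. Requires pad <= h - 1 and pad <= w - 1.
std::vector<std::vector<int>> ReflectPad(
    const std::vector<std::vector<int>>& m, std::size_t pad) {
  const std::size_t h = m.size(), w = m[0].size();
  std::vector<std::vector<int>> out(h + pad, std::vector<int>(w + pad));
  for (std::size_t i = 0; i < h + pad; ++i) {
    // Reflect about the last row: index h maps back to h - 2, etc.
    const std::size_t si = (i < h) ? i : 2 * h - 2 - i;
    for (std::size_t j = 0; j < w + pad; ++j) {
      const std::size_t sj = (j < w) ? j : 2 * w - 2 - j;
      out[i][j] = m[si][sj];
    }
  }
  return out;
}
```

For example, padding {{1,2},{3,4}} by 1 reflects the first row/column back in, yielding {{1,2,1},{3,4,3},{1,2,1}}.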
< nilay>
to multiply (2 cubes) or (a mat and a cube), I have to write a for loop and use the dot product?
< zoq>
nilay: If you use cubes, that's right.
< nilay>
zoq: cubes should be used for images?
< zoq>
nilay: If you like, you can internally use cubes, but since the rest of mlpack works with matrices, we should design the interface accordingly.
< nilay>
zoq: so the public functions will take matrices as input and the private ones can take cubes? Cubes give neater code I guess... if you want, I can do it all with matrices only.
< zoq>
nilay: It's fine for me to use cubes in private functions.
< nilay>
zoq: ok
< rcurtin>
be careful with cubes, note that slices are not contiguous in memory
< rcurtin>
ack, sorry, I think I am incorrect
< rcurtin>
yeah, the thing to be aware of is that cube.slice(0).col(0) is not directly next to cube.slice(1).col(0) in memory
< rcurtin>
that doesn't mean you shouldn't use it, it's just a thing to be aware of when thinking about memory access patterns
< nilay>
rcurtin: columns need to be copied this way, so I don't have another option
< rcurtin>
yeah, I do not know the details of what you are doing, I'm just pointing that out
< rcurtin>
you want to access the columns of a slice sequentially, instead of accessing a given column sequentially across slices
< rcurtin>
i.e. slice(i).col(0), slice(i).col(1), ...
< rcurtin>
not slice(0).col(i), slice(1).col(i), ...
< rcurtin>
that's all I meant :)
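The stride arithmetic behind rcurtin's point can be made concrete with a small sketch (function names are illustrative): a cube is stored slice by slice, column-major within each slice, so stepping between columns of one slice is a much shorter jump than stepping between slices.

```cpp
#include <cstddef>

// Element distance from col c to col c+1 within the same slice.
std::size_t ColStride(std::size_t nRows) { return nRows; }

// Element distance from slice s to slice s+1 for the same column.
std::size_t SliceStride(std::size_t nRows, std::size_t nCols) {
  return nRows * nCols;
}
// For a 100x100x100 cube of doubles, adjacent columns in one slice are
// 100 elements (800 bytes) apart, but the same column in the next slice
// is 10000 elements (80 KB) apart -- far outside any cache line, which is
// why slice(i).col(0), slice(i).col(1), ... is the friendlier order.
```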
< nilay>
rcurtin: ok thanks. i'll keep that in mind. :)
< mentekid>
rcurtin: I have run into somewhat of a dilemma regarding my implementation... Can I bother you for a sec?
< rcurtin>
mentekid: sure. I am in a meeting, but I am not 100% paying attention to it :)
< mentekid>
Cool. The problem is this: I have to create a min-heap that holds some scores (doubles). I can do that with a priority queue from stl, no problem
< mentekid>
thing is, these scores correspond to some vectors (so if we have 8 scores we have 8 vectors)
< mentekid>
but I can't push the vector on the heap because it'
< mentekid>
(sorry, mispressed return) it's not comparable to stuff... I have thought of the complex work-around of creating a dummy class that has a value and a vector and implementing a friend function that does comparisons... I just wanted to ask if you can think of (or have seen) anything simpler
< rcurtin>
I think you could use std::pair<double, size_t> for this
< rcurtin>
I forget what the default comparison is for std::pair, but I think it compares the first value first, and upon equality it will compare the second value
< rcurtin>
so I think that might work in your case
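A minimal sketch of the (score, index) min-heap idea (names are illustrative): std::pair compares lexicographically, so the score decides the order, and std::greater flips std::priority_queue's default max-heap into a min-heap. The vectors themselves stay in a separate container and are never copied onto the heap.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using ScoredIndex = std::pair<double, std::size_t>;

// Push (score, index) pairs onto a min-heap and return the best one;
// the index then looks up the corresponding vector elsewhere.
ScoredIndex LowestScore(const std::vector<double>& scores) {
  std::priority_queue<ScoredIndex, std::vector<ScoredIndex>,
                      std::greater<ScoredIndex>> heap;
  for (std::size_t i = 0; i < scores.size(); ++i)
    heap.push({scores[i], i});
  return heap.top();  // smallest score, with the index of its vector
}
```

For example, LowestScore({0.7, 0.2, 0.9}) yields {0.2, 1}, and index 1 identifies the matching vector in whatever container holds them.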
< mentekid>
ah
< mentekid>
I knew I had to ask you, I don't know the STL yet
< rcurtin>
I have found, in the past, that std::pair can be slow, but that was 6 years ago and in a different situation
< rcurtin>
here I think there may be no better alternative
< mentekid>
so I could also have a pair<double, vector<size_t>> right?
< rcurtin>
yeah, you can also do that, but the comparison might be a little bit more trouble there
< rcurtin>
but you can still write a custom comparator and I think pass it as a template argument to the priority_queue
< mentekid>
ah, yes I can do that
< rcurtin>
I might consider holding the vector<size_t> separately and just holding an index to the vector in the std::pair
< rcurtin>
this could avoid copies/moves (depending on the underlying implementation)
< mentekid>
hmm so that way I would have the second item of the pair pointing to the vector
< mentekid>
I think that would work faster yeah
< rcurtin>
might be worth trying both, but my intuition suggests holding an index to the vector would be faster
< rcurtin>
but it is pretty clear my intuition is not always right :)
< mentekid>
cool I'll try each and see how it goes
< mentekid>
at least I avoided creating a whole class from scratch just for one minheap
< rcurtin>
:)
< mentekid>
thanks :)
< rcurtin>
sure, let me know if I can help with anything else
< rcurtin>
it is my job after all :)
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
sumedhghaisas has quit [Ping timeout: 244 seconds]
sumedhghaisas has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
nilay has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
nilay has joined #mlpack
< mentekid>
I think the multiprobe LSH was designed explicitly so it would be impossible to be written in C++ :P
< mentekid>
my head hurts
< mentekid>
I think after I write it I'll find out there was a much simpler way which I didn't see
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#831 (master - 5d1723d : Ryan Curtin): The build was fixed.
< rcurtin>
mentekid: I think that has more to do with C++'s design than multiprobe LSH :)
dnm_ has joined #mlpack
< dnm_>
hi
< dnm_>
I am trying to run a multithreaded program with C++ code that uses mlpack, but I cannot get any parallelization.
< dnm_>
Only one core is being used (there is no cpu usage on the other cores) and omp_get_max_threads always returns 1.
< dnm_>
the same code runs fine if I don't include mlpack.
< dnm_>
Did anyone have the same problem before?
< dnm_>
Thanks
< rcurtin>
dnm_: hi there, so do you mean that including mlpack causes your OpenMP-ized code to no longer be parallel?
< dnm_>
yes
< rcurtin>
there should not be anything in mlpack that changes your OpenMP configuration
< dnm_>
And it is happening only in one of the compute nodes of the server that I use
< rcurtin>
some of mlpack is OpenMP-ized (specifically, the density estimation tree code), but this should not be affecting your code
< rcurtin>
so it does not happen in every compute node on the server you are using either, only sometimes?
< dnm_>
I tried to use C++11 threads for parallelization but the same thing happened
< dnm_>
Only one cpu usage
< dnm_>
there is no problem with forking, but I cannot get more than one CPU
< rcurtin>
if you can construct a minimal working example that demonstrates this behavior (1 CPU when including mlpack, n CPUs when not including mlpack), I can look into this further
< rcurtin>
but I am really dubious that mlpack is the problem here
< dnm_>
I tried it with a couple of examples, I will try to send one of them
< rcurtin>
okay, if you send it I will take a look
< rcurtin>
there is nothing in the mlpack code that modifies the OpenMP configuration
< rcurtin>
this is why I think maybe the issue is something else
< rcurtin>
but still, if I can reproduce it, I can look into it
< dnm_>
thanks, I am trying to write a small example that shows the problem
< dnm_>
but it is not just an OpenMP issue, I guess.
< dnm_>
because I could not get parallelization when trying to fork with C++11 threads
< dnm_>
forking was ok
< dnm_>
but there is cpu binding
< dnm_>
they are all in the same cpu
< dnm_>
And the weird thing is I have this problem in only one of the nodes of the server that I use
< dnm_>
the other ones run the code fine
< dnm_>
anyway I am writing a small example
< rcurtin>
yes so I am saying, if your problem is only in one of the nodes on your server, I suspect the problem may be with the server, not mlpack
< rcurtin>
but if you can get me a small example, like I said, I will try it and we will see what happens :)
< dnm_>
#include <cmath>
< dnm_>
#include <iostream>
< dnm_>
I will send all of it in one line
< dnm_>
formatting will be weird
< dnm_>
#include <cmath> #include <iostream> #include <fstream> // std::ifstream #include <ostream> // std::ifstream #include <omp.h> #include <mlpack/methods/linear_regression/linear_regression.hpp> using namespace std; using namespace arma; using namespace mlpack::regression;
< rcurtin>
and on your system what is the output when you run the program?
< rcurtin>
on my system it reports:
< rcurtin>
max thread 32
< rcurtin>
num of procs 32
< dnm_>
max thread 1 num of procs 1
< dnm_>
only one
< rcurtin>
and what you are saying is
< dnm_>
but if I do not include mlpack and the function, it shows 24 threads
< rcurtin>
that if you comment out the line '#include <mlpack/methods/linear_regression/linear_regression.hpp>', the output is different
< dnm_>
and the functions that use mlpack
< dnm_>
yes
< rcurtin>
I see no difference on my system when I comment out the mlpack code
< rcurtin>
and you also say that this only happens on one system, right?
< dnm_>
this is happening in only one of the machines
< dnm_>
in the server I use
< dnm_>
but due to the memory limits on the machines I need to use that certain node
< dnm_>
and I am actually not sure why this might be happening
< dnm_>
somehow mlpack is binding CPU usage
< rcurtin>
I am not convinced of that
< dnm_>
and it is not just an openmp issue
< rcurtin>
I think that the symptom you are seeing is that when you include mlpack, the problem occurs
< rcurtin>
but I do not think it is because of mlpack
< rcurtin>
I think it is because of other system configuration somewhere, or possibly slight differences in how you are compiling with and without mlpack, and other reasons
< rcurtin>
either way, whatever the issue is, I unfortunately can't debug it unless I can reproduce it, and I am not able to
< dnm_>
but it is blocking the use of more than one CPU with C++11 threads as well...
< dnm_>
ok, thanks anyway..
< rcurtin>
yes, you said this, but there is not any reason why mlpack would do this
< rcurtin>
I mean, I can try and help, but I can't really dig in here if I can't reproduce it
< rcurtin>
is there anything special about the one particular node you are running on?
< dnm_>
not really
< dnm_>
other than having a very large memory
< rcurtin>
I think that probably the best thing you can do to figure out what is going on here is to find some auxiliary OpenMP or C++11 thread functions that will tell you something about the system in question and why you are only getting one processor from omp_get_max_threads()
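One such check, as a Linux-specific sketch (sched_getaffinity is a glibc/Linux call; the function name here is illustrative): the symptom described (threads all pinned to one CPU, regardless of threading mechanism) is consistent with a restricted CPU-affinity mask, and OpenMP runtimes typically size their default thread team from that mask, so a mask with one bit set would make omp_get_max_threads() return 1.

```cpp
#include <sched.h>

// Return how many CPUs this process is actually allowed to run on,
// or -1 if the affinity query fails. A result of 1 on a many-core node
// points at a scheduler/cgroup/taskset restriction, not at mlpack.
int AvailableCpus() {
  cpu_set_t mask;
  CPU_ZERO(&mask);
  if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
    return -1;
  return CPU_COUNT(&mask);
}
```

Comparing this value between the misbehaving node and a healthy one (with and without mlpack linked in) would show whether the affinity mask, rather than the library, is the difference.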
< rcurtin>
but I am not sure of exactly what would be needed to debug at that level, as I've never seen this problem or anything resembling it
< dnm_>
ok, thanks. I will try..
< rcurtin>
yes, let me know what you find out
< dnm_>
sure, thanks again
< rcurtin>
if we can actually pinpoint the problem to some code in mlpack, I will fix it, but like I said earlier, I really can't think of any reasonable theory for how mlpack would be affecting your setup
< rcurtin>
sorry that I can't be more helpful at this time
< dnm_>
ok, thanks again for your time.
< rcurtin>
sure, no worries, that is what we are here for :)