verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
TD has quit [Ping timeout: 250 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1027 (master - 2da9c5b : Tham): The build was fixed.
travis-ci has left #mlpack []
Mathnerd314 has quit [Ping timeout: 264 seconds]
< mentekid> rcurtin: I took another shot at the vectorization, but still couldn't get it to be faster than the master branch. Maybe this wasn't such a good idea after all
< mentekid> I started with OpenMP yesterday; I've finished the parallel query processing part and the corresponding unit tests. I can see good results, a 2x-3x speedup on my machine...
< mentekid> I'll open a PR so you can take a look at the code. I've tried to keep it as simple as I can - just some code to track the number of threads, code to control the maximum number of threads LSHSearch is allowed to use, and of course the loop parallelization
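A minimal sketch (hypothetical names, not the actual LSHSearch code) of the loop parallelization mentekid describes: the queries are independent, so each iteration can run on its own thread, with maxThreads standing in for the user-settable cap on how many threads LSHSearch may use:

    #include <cstddef>

    void SearchAllQueries(const std::size_t numQueries, const int maxThreads)
    {
      // Each query is independent, so the iterations can safely run
      // concurrently; the num_threads clause caps how many threads OpenMP
      // may use for this loop.
      #pragma omp parallel for num_threads(maxThreads)
      for (long i = 0; i < (long) numQueries; ++i)
      {
        // SearchSingleQuery(i);  // hypothetical per-query search
      }
    }

Compiled without -fopenmp, the pragma is ignored and the loop simply runs serially, which is what keeps this kind of change simple.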
nilay has joined #mlpack
Mathnerd314 has joined #mlpack
< rcurtin> mentekid: sounds good, I will take a look tomorrow
< rcurtin> I should also say that I am traveling from Tuesday night until Monday morning; I will find time each day to be responsive and answer questions, but the hours will be weird
< keonkim> Hello, I have military training (commuting) on the 21st, 22nd, and 23rd. I can still be online and work on the project, but my responses may be delayed.
< nilay> zoq: Hello, I have a question about the discretize function: after applying PCA, why do we have to do this: lbls += (zs[:, i] < 0).astype(N.int32) * 2 ** i
< nilay> From the paper: we quantize z based on the top log2(k) PCA dimensions, assigning z a discrete label c according to the orthant (generalization of quadrant) into which z falls.
< nilay> so here k = 2, which means we take only the top PCA dimension.
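For reference, the label that the quoted line builds is the orthant index: with d = log2(k) kept PCA components, c = sum_{i=0}^{d-1} 1[z_i < 0] * 2^i, i.e. bit i of c records the sign of component i, so every sign pattern (orthant) gets a distinct integer. With k = 2 (d = 1) this collapses to c = 1[z_0 < 0], a plain binary label.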
nilay has quit [Ping timeout: 250 seconds]
< mentekid> rcurtin: It's ok - I'll be working on the parallelization so I'll notify you if I have any questions/doubts
nilay has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1031 (master - a7e8d3b : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
< zoq> nilay: hmm, good point; if you have more than 2 classes, you shift everything into the same quadrant so that you don't end up with the same binary label ... however, I can't think of a situation where we'd end up with more than 2 classes, so why should we implement that transformation? Can you think of any situation?
< nilay> no, I can't see why more than 2 classes would be needed... we only check, for a pair of pixels, whether their segments are the same or not.
< zoq> okay, I guess if we can't think of any situation, we don't need to implement that transformation.
< nilay> but does that line of code make sense then? what does it actually do?
< zoq> it shifts every value into the same quadrant
< zoq> but only if k > 2
< nilay> I am talking about lbls += (zs[:, i] < 0).astype(N.int32) * 2 ** i
< zoq> yes
< nilay> so if zs < 0 we add 2 ** i to the label, and otherwise we leave it at 0?
< mentekid> rcurtin: does PR #701 affect the entire project or just that one class? Should I change core.hpp to include OpenMP if it is present?
< nilay> and then we find the unique lbls? the entries with zs > 0, which all have label 0, would all be gone
< rcurtin> mentekid: that affects the whole project, so yeah, I guess it is relevant to you too :)
< rcurtin> I think that you should be able to do most to all of the parallelization just using OpenMP #pragmas, for which omp.h should not be necessary
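A small illustration of rcurtin's point, using the standard _OPENMP feature macro: #pragma lines are silently ignored by a compiler without OpenMP enabled, so they need no header; only calls into the OpenMP runtime require omp.h, and those can be guarded:

    #ifdef _OPENMP
      #include <omp.h>
    #endif

    int AvailableThreads()
    {
    #ifdef _OPENMP
      return omp_get_max_threads();  // runtime API call: needs omp.h
    #else
      return 1;                      // serial build: pragmas compile to nothing
    #endif
    }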
< zoq> nilay: That's another line we don't need because we use k = 2; if we had more than two labels, we would take the most representative label over all dimensions (unique).
< zoq> nilay: Nice side effect, we could speed things up, good catch :)
< zoq> nilay: So I would propose to avoid the for loop and the unique operation.
< nilay> zoq: so after applying PCA, we only have 2 classes, one with val > 0 and one with val < 0?
< nilay> can you explain the case where, say, we had 4 classes? how would we have divided them into quadrants using that line of code?
< zoq> nilay: that's right
< zoq> I guess it would become clearer with a simple example and a plot; I can (probably) write that tonight and send it to you.
< nilay> ok no, I think I get the quadrant thing now
< nilay> but why would we want to do N.unique then?
< nilay> the labels can only have 4 possible values (one per quadrant), right?
< zoq> let's say we have k = 3, so we use the first 2 components (PCA); each component is a potential label and could look something like [0.2 0.4 -0.1 ...]; after we convert each component into a binary representation, we end up with e.g. [0 0 1], [0 0 1], [0 1 1]; since we want a single label over all components, we use unique
< zoq> The output data has nothing to do with the number of quadrants: http://mccormickml.com/2014/06/03/deep-learning-tutorial-pca-and-whitening/
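An Armadillo sketch of the labeling zoq describes (hypothetical code, not part of mlpack), assuming zs holds one PCA-projected sample per row and one kept component per column:

    #include <armadillo>

    arma::uvec OrthantLabels(const arma::mat& zs)
    {
      // Bit i of the label is set iff component i is negative, so each
      // sign pattern (orthant) maps to a distinct integer label.
      arma::uvec lbls(zs.n_rows, arma::fill::zeros);
      for (arma::uword i = 0; i < zs.n_cols; ++i)
        lbls += arma::conv_to<arma::uvec>::from(zs.col(i) < 0.0) * (1u << i);
      return lbls;
    }

With k = 2 only one column is kept, the loop body runs once, and the labels are already just 0/1, so both the loop and the unique step can be dropped.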
< nilay> zoq: thanks, I think I understand now, after so long :)
< nilay> we don't need unique because we only use one component
< zoq> yes, right
mentekid has quit [Ping timeout: 260 seconds]
nilay has quit [Ping timeout: 250 seconds]