verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
TD has quit [Ping timeout: 250 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1027 (master - 2da9c5b : Tham): The build was fixed.
travis-ci has left #mlpack []
Mathnerd314 has quit [Ping timeout: 264 seconds]
< mentekid> rcurtin: I took another shot at the vectorization, but still couldn't get it to be faster than the master branch. Maybe this wasn't such a good idea after all
< mentekid> I started with OpenMP yesterday; I've finished the parallel query processing part and the corresponding unit tests. I can see good results, a 2x-3x speedup on my machine...
< mentekid> I'll open a PR so you can take a look at the code. I've tried to keep it as simple as I can - just some code to track the number of threads, code to control the maximum number of threads LSHSearch is allowed to use, and of course the loop parallelization
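A minimal sketch (hypothetical names, not the actual LSHSearch code) of the loop parallelization mentekid describes: the queries are independent, so each iteration can run on its own thread, with maxThreads standing in for the user-settable cap on how many threads LSHSearch may use:

    #include <cstddef>

    void SearchAllQueries(const std::size_t numQueries, const int maxThreads)
    {
      // Each query is independent, so the iterations can safely run
      // concurrently; the num_threads clause caps how many threads OpenMP
      // may use for this loop.
      #pragma omp parallel for num_threads(maxThreads)
      for (long i = 0; i < (long) numQueries; ++i)
      {
        // SearchSingleQuery(i);  // hypothetical per-query search
      }
    }

Compiled without -fopenmp, the pragma is ignored and the loop simply runs serially, which is what keeps this kind of change simple.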
nilay has joined #mlpack
Mathnerd314 has joined #mlpack
< rcurtin> mentekid: sounds good, I will take a look tomorrow
< rcurtin> I should also say that I am traveling from Tuesday night until Monday morning; I will find time each day to be responsive and answer questions, but the hours will be weird
< keonkim> Hello, I have military training (commuting) on the 21st, 22nd, and 23rd. I can still be online and work on the project, but my responses may be delayed.
< nilay> zoq: Hello, I have a question about the discretize function: after applying PCA, why do we have to do this: lbls += (zs[:, i] < 0).astype(N.int32) * 2 ** i
< nilay> From the paper: we quantize z based on the top log2(k) PCA dimensions, assigning z a discrete label c according to the orthant (generalization of quadrant) into which z falls.
< nilay> so here k = 2, which means we take only the top PCA dimension.
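For reference, the label that the quoted line builds is the orthant index: with d = log2(k) kept PCA components, c = sum_{i=0}^{d-1} 1[z_i < 0] * 2^i, i.e. bit i of c records the sign of component i, so every sign pattern (orthant) gets a distinct integer. With k = 2 (d = 1) this collapses to c = 1[z_0 < 0], a plain binary label.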
nilay has quit [Ping timeout: 250 seconds]
< mentekid> rcurtin: It's ok - I'll be working on the parallelization so I'll notify you if I have any questions/doubts
nilay has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1031 (master - a7e8d3b : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
< zoq> nilay: hmm, good point; if you have more than 2 classes, you shift everything into the same quadrant so that you don't end up with the same binary label ... however, I can't think of a situation where we'd end up with more than 2 classes, so why should we implement that transformation? Can you think of any situation?
< nilay> no, I can't see why more than 2 classes would be needed... we only check, for a pair of pixels, whether their segments are the same or not.
< zoq> okay, I guess if we can't think of any situation, we don't need to implement that transformation.
< nilay> but does that line of code make sense then? what does it actually do?
< zoq> it shifts every value into the same quadrant
< zoq> but only if k > 2
< nilay> I am talking about lbls += (zs[:, i] < 0).astype(N.int32) * 2 ** i
< zoq> yes
< nilay> so if zs < 0 we add 2 ** i to the label, and otherwise we leave it at 0?
< mentekid> rcurtin: does PR #701 affect the entire project or just that one class? Should I change core.hpp to include OpenMP if it is present?
< nilay> and then we find the unique lbls? the entries with zs > 0, which all have label 0, would all be gone
< rcurtin> mentekid: that affects the whole project, so yeah, I guess it is relevant to you too :)
< rcurtin> I think that you should be able to do most to all of the parallelization just using OpenMP #pragmas, for which omp.h should not be necessary
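A small illustration of rcurtin's point, using the standard _OPENMP feature macro: #pragma lines are silently ignored by a compiler without OpenMP enabled, so they need no header; only calls into the OpenMP runtime require omp.h, and those can be guarded:

    #ifdef _OPENMP
      #include <omp.h>
    #endif

    int AvailableThreads()
    {
    #ifdef _OPENMP
      return omp_get_max_threads();  // runtime API call: needs omp.h
    #else
      return 1;                      // serial build: pragmas compile to nothing
    #endif
    }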
< zoq> nilay: That's another line we don't need because we use k = 2; if we had more than two labels, we would take the most representative label over all dimensions (unique).
< zoq> nilay: Nice side effect, we could speed things up, good catch :)
< zoq> nilay: So I would propose to avoid the for loop and the unique operation.
< nilay> zoq: so after applying PCA, we only have 2 classes, one with val > 0 and one with val < 0?
< nilay> can you explain the case where, say, we had 4 classes? how would we have divided them into quadrants using that line of code?
< zoq> nilay: that's right
< zoq> I guess it would become clearer with a simple example and a plot; I can (probably) write that tonight and send it to you.
< nilay> ok no, I think I get the quadrant thing now
< nilay> but why would we want to do N.unique then?
< nilay> the labels can only have 4 possible values (one per quadrant), right?
< zoq> let's say we have k = 3, so we use the first 2 components (PCA); each component is a potential label and could look something like [0.2 0.4 -0.1 ...]; after we convert each component into a binary representation, we end up with e.g. [0 0 1], [0 0 1], [0 1 1]; since we want a single label over all components, we use unique
< zoq> The output data has nothing to do with the number of quadrants: http://mccormickml.com/2014/06/03/deep-learning-tutorial-pca-and-whitening/
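An Armadillo sketch of the labeling zoq describes (hypothetical code, not part of mlpack), assuming zs holds one PCA-projected sample per row and one kept component per column:

    #include <armadillo>

    arma::uvec OrthantLabels(const arma::mat& zs)
    {
      // Bit i of the label is set iff component i is negative, so each
      // sign pattern (orthant) maps to a distinct integer label.
      arma::uvec lbls(zs.n_rows, arma::fill::zeros);
      for (arma::uword i = 0; i < zs.n_cols; ++i)
        lbls += arma::conv_to<arma::uvec>::from(zs.col(i) < 0.0) * (1u << i);
      return lbls;
    }

With k = 2 only one column is kept, the loop body runs once, and the labels are already just 0/1, so both the loop and the unique step can be dropped.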
< nilay> zoq: thanks, I think I understand now, after so long :)
< nilay> we don't need unique because we only use one component
< zoq> yes, right
mentekid has quit [Ping timeout: 260 seconds]
nilay has quit [Ping timeout: 250 seconds]