verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
tham has joined #mlpack
Mathnerd314_ is now known as Mathnerd314
kwikadi has quit [Remote host closed the connection]
kwikadi has joined #mlpack
< lozhnikov> marcosirc: You're right, thanks. I'll try to do that.
nilay has joined #mlpack
Mathnerd314 has quit [Ping timeout: 264 seconds]
mentekid has joined #mlpack
< mentekid> rcurtin: So everything got hashed to bucket 0. I would have never seen that... Cool, thanks :)
mentekid has quit [Ping timeout: 244 seconds]
mentekid has joined #mlpack
< mentekid> rcurtin: I think we should do the same thing in returnIndicesFromTables, right? There's a similar problem there I think
< mentekid> let me look at the code
tham has quit [Quit: Page closed]
< mentekid> rcurtin: I think there's still some bug in the LSH code, my tests crash... Here's a backtrace: http://pastebin.com/Psvadsgp
< mentekid> (I've added some markers every few lines of code to isolate the error)
< Karl_> rcurtin_: sorry for not getting back. I got stuck with other things yesterday. I think my kernel isn't proper... I get negative eigenvalues
< Karl_> zoq: if you want a beta tester let me know how to get the svd-pca code...
< Karl_> zoq: or was it just the normal pca method?
< zoq> Karl_: Thanks, I'll get back to you once it is finished.
< lozhnikov> marcosirc: rcurtin: I opened a PR that contains some changes proposed by Marcos Pividori (RectangleTree::NumDescendants() optimization).
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
< lozhnikov> mentekid: Hi, there is a segfault in LSHTest/NumTablesTest. Are you sure that you should use secondHashVectors[j] instead of secondHashVectors(i, j)? (lsh_search_impl.hpp:200 and 202)
mentekid has quit [Ping timeout: 246 seconds]
< lozhnikov> rcurtin: The error appears in e6bc4b4.
mentekid has joined #mlpack
Mathnerd314 has joined #mlpack
marcosirc has joined #mlpack
< marcosirc> lozhnikov: great, thanks.
< rcurtin> lozhnikov: marcosirc: odd, I tested it on my system, I guess I did not run valgrind and now I pay the price :)
nilay_ has joined #mlpack
< mentekid> rcurtin: I fixed what lozhnikov said, but I still get a segmentation fault at LSHTrainTest
< mentekid> the other tests seem to run fine :/
< mentekid> actually... In Train(), shouldn't secondHashTable be cleared when Train is called?
< rcurtin> mentekid: I'm an idiot, I have the fix, hang on
nilay_ has quit [Ping timeout: 250 seconds]
< rcurtin> actually, I don't quite have the fix, this is more complex than I thought
< rcurtin> okay, fixed in eea2aa4, sorry for the issue
< mentekid> ah thanks :) I'll finish the style changes and push the final multiprobe tests
< mentekid> sorry multiprobe changes*
< rcurtin> sounds good
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1112 (master - eea2aa4 : Ryan Curtin): The build is still failing.
travis-ci has left #mlpack []
< rcurtin> Karl_: no worries, if you can show the code for the kernel, I can take a glance and see if I see anything wrong
mentekid has quit [Ping timeout: 264 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1115 (master - eaa7182 : Ryan Curtin): The build was broken.
travis-ci has left #mlpack []
nilay_ has joined #mlpack
mentekid has joined #mlpack
sumedhghaisas has joined #mlpack
< sumedhghaisas> @marcosirc: Hey marcos...
< marcosirc> sumedhghaisas: Hi!
< sumedhghaisas> sorry about the delay...
< sumedhghaisas> I read through the paper... you are right
< sumedhghaisas> With spill trees we cannot guarantee the error...
< sumedhghaisas> But I guess Ryan is right...
< sumedhghaisas> Considering the popularity of Spill trees... I think we should implement it...
< sumedhghaisas> we need to decide on the implementation...
< marcosirc> Yeah, I agree.
< sumedhghaisas> do you think we should implement a separate command line for defeatist search??
< marcosirc> Ok, I have been thinking on the implementation.
< marcosirc> Mm I don't think we should implement it as a separate command line program.
< marcosirc> Maybe we can include it as a flag to the main mlpack_knn program...
< marcosirc> It would be clearer this way, I think.. For benchmarks, etc.
< marcosirc> we could print an error if epsilon value is specified for spill trees...
< sumedhghaisas> hmmm...
< marcosirc> But I don't have a strong preference... maybe we can start working implementing spill trees
< marcosirc> and once it is ready, we decide.
< sumedhghaisas> flag does sound a viable option to me...
< sumedhghaisas> yeah I agree...
< marcosirc> yeah, maybe it will be confusing...
< sumedhghaisas> We can also decide when spill tree implementation is ready...
< marcosirc> Ok.
< marcosirc> Regarding spill trees implementation.
< marcosirc> I think it will be similar to binary space trees.
< marcosirc> However, we need to manage the list of points differently. We are going to have overlapping nodes, so we can not use range of indexes of the main dataset's matrix as we do with binary trees.
< marcosirc> I am thinking of having a general dataset instance (as we do with binary trees), and leaf nodes will hold a vector of indexes pointing to columns of that matrix.
< marcosirc> (This is what I mentioned in the last email)
< marcosirc> I think this will be the simplest/most efficient approach.
< sumedhghaisas> yes... it does look simple...
< sumedhghaisas> give me some time to think on it...
< marcosirc> ok, sure!
< rcurtin> marcosirc: I agree, I think vector of indices is the easiest way to go here
sumedhghaisas has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
< marcosirc> rcurtin: ok, thanks!
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
sumedhghaisas has joined #mlpack
mentekid has quit [Ping timeout: 258 seconds]
mentekid has joined #mlpack
< nilay_> zoq: Hello, in the forward pass of bias unit, why do we add input?
sumedhghaisas has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
< zoq> nilay_: Do you mean the forward pass of the BiasLayer?
< nilay_> yes, bias unit is not connected to any input?
< zoq> It is connected to all units in the next layer.
< zoq> So, first we call the forward pass of e.g. the LinearLayer and afterwards we use the input from the LinearLayer and add the bias term.
< zoq> The input in the Forward function is the output of the e.g. LinearLayer
< zoq> or in general the layer before the bias layer
< zoq> ah, the bias unit isn't connected with any input units, it's only connected with the units in the following layer
< nilay_> yes that's why i asked, or is it some convention
< zoq> That the bias isn't connected with the input?
< zoq> you can also integrate the bias term into the linear layer
< nilay_> ok now i understand it is the total output
< nilay_> bias layer is wrapped over the linearlayer
< zoq> yes or any other layer
< nilay_> yeah ok, thanks.
< zoq> It's uncommon to use a bias term in combination with a convolution layer.
< nilay_> it is used in the vanilla network though
< nilay_> convolutional_network_test
< zoq> yeah, maybe I should say it's uncommon for very deep network :)
< zoq> *networks
< nilay_> so i should not put it in the inception layer?
< zoq> You can do that, but e.g. if the user sets the bias to 0 you can avoid the bias term operation?
< nilay_> if user sets to zero then output = input.
< zoq> yes
< nilay_> is there a reason to not use bias? because bias are useful
< nilay_> in deep networks
< zoq> Performance reasons; it's always challenging to figure out how the network should look for a certain task.
< nilay_> ok.
< zoq> Btw. I tested another approach (quic svd) for the pca method, in some cases it looks promising.
< nilay_> i tried understanding the math of randomized svd, i read a blog but then it referred to a paper of 74 pages :P
< zoq> Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions
< zoq> yeah, right :)
< nilay_> so did you read it, before implementing this thing, it's a lot
< zoq> I skipped the proofs :)
< nilay_> so did you get the idea of the error this (randomized svd) technique has compared to normal svd
< zoq> By reading some other related papers. Right now I'm not sure if I'm doing something wrong; it looks like the QUIC-SVD method doesn't work if m=n
< nilay_> so do we need to integrate r-svd with PCA::Apply or replace it. (if the error is less we might as well replace it?)
< nilay_> or we still take components according to eigVal so it is correct always
< zoq> I think what we could do here is to change the PCA method and let the user define which method he likes to use, right now we use exact svd, randomized svd is just an approximation. In case of edge boxes an approximation is totally fine.
< nilay_> yes what i don't get is what do we lose by doing randomized svd as compared to when we do normal svd.
< zoq> precision, in case of randomized svd, we just use parts of the full data matrix.
< zoq> Probably I can work out a proof of concept ... perhaps in the next hours
< zoq> I think in that case I'll have to figure out why the quic svd method only works when m < n.
< zoq> maybe rcurtin can provide any insight?
< rcurtin> hm, it has been a while since I thought about it
< rcurtin> in this case, m is the number of returned eigenvectors, and n is the number of dimensions in the dataset?
< rcurtin> ah I guess the matrix being decomposed is m x n
< zoq> yeah, right
< rcurtin> but the first paragraph of the paper says quic-svd works for m >= n, but not m < n
< zoq> right
< rcurtin> when m < n, we can just transpose the matrix and then once the SVD is done, we switch V and U
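[editor's note] Spelled out, the transpose trick rcurtin mentions: if A is m x n with m < n, decompose the (n x m) transpose instead, then transpose the result back, which swaps the roles of the singular vector matrices:

```latex
A^{\top} = U' \Sigma V'^{\top}
\quad\Longrightarrow\quad
A = \left(U' \Sigma V'^{\top}\right)^{\top} = V' \Sigma U'^{\top}
```

Sigma is unchanged, so the SVD of A has U = V' and V = U'.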
< rcurtin> so I think maybe I don't understand what the issue is
< rcurtin> maybe I am looking at the wrong part of the paper
< zoq> maybe I used the wrong dimension, not sure right now, but I used m=n and it didn't work
< rcurtin> hm, hang on, let me take a look at the code
< rcurtin> what happens if you change quic_svd_impl.hpp:29 to be >= instead of just > ?
< zoq> it's not urgent, there is another bug in my randomized svd implementation ...
< zoq> I think I already tested >=, let's check again
< rcurtin> yeah, if that does not work, can you open a bug on github?
< rcurtin> if you want you could assign it to siddharth, but I don't know if he will be able to do anything, I am not sure how much time he has
< rcurtin> I dunno if he'll even see an email, I haven't heard from him in a while :)
< rcurtin> but I can take a look into it when I have some time (maybe a week or two, maybe more?)
< zoq> :) I'll open a bug if I get too frustrated with the code.
< rcurtin> yeah; the primary quic-svd code is in core/tree/cosine_tree/, not in methods/quic_svd/
nilay_ has quit [Ping timeout: 250 seconds]
marcosirc has quit [Quit: WeeChat 1.4]
mentekid has quit [Ping timeout: 272 seconds]