verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Mathnerd314 has quit [Ping timeout: 250 seconds]
govg has quit [Ping timeout: 258 seconds]
govg has joined #mlpack
kwikadi has quit [Ping timeout: 258 seconds]
mentekid has joined #mlpack
kwikadi has joined #mlpack
mentekid has quit [Ping timeout: 264 seconds]
Mathnerd314 has joined #mlpack
mentekid has joined #mlpack
mentekid has quit [Ping timeout: 250 seconds]
mentekid has joined #mlpack
mentekid has quit [Ping timeout: 276 seconds]
nilay has joined #mlpack
mentekid has joined #mlpack
gtank has quit [Ping timeout: 272 seconds]
mentekid has quit [Ping timeout: 276 seconds]
gtank has joined #mlpack
< nilay> zoq: hi, i have a doubt, can you run this test and tell me why an error comes.
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
< nilay> zoq: to initialize the weights i will have to change code in the conv_layer, but then that test wouldn't be one which stands, it'll only be for evaluation
pantsforbirds has joined #mlpack
< pantsforbirds> if im interested in contributing is there some documentation i can read? I've found the google summer of code projects, but i cant find any other contribution documents
< rcurtin> pantsforbirds: have you seen http://www.mlpack.org/involved.html ?
< rcurtin> I like the nick, by the way :)
< pantsforbirds> rcurtin, ah thats exactly what i was looking for!
< pantsforbirds> and thanks!
< rcurtin> sure, please feel free to ask more questions if you like :)
nilay has quit [Ping timeout: 250 seconds]
< pantsforbirds> so if i wanted to help with some optimizer algorithms that would be possible?
sumedhghaisas_ has joined #mlpack
< sumedhghaisas_> marcosirc: Hey Marcos...
< marcosirc> sumedhghaisas: Hi! how are you?!
< sumedhghaisas_> great... had a great trip.
< sumedhghaisas_> was exhausted the whole day...
< sumedhghaisas_> involved too much driving around
< sumedhghaisas_> so I looked at your mail...
< marcosirc> Nice! I can imagine!
< marcosirc> Ok.
< sumedhghaisas_> So the fewer-than-k-neighbours problem...
< marcosirc> I was writing a new mail in response to ryan comments.
< sumedhghaisas_> So my best bet would be the first solution... given that its properly documented...
< sumedhghaisas_> but it would be fair to also consider how other libraries handle this case...
< sumedhghaisas_> like in the defeatist search if less than k neighbours are found...
< marcosirc> Yeah, I understand. I have implemented the 2nd solution because it was very simple to do and I thought it would be more useful for future users.
< marcosirc> I couldn't find many libraries implementing defeatist search.
< marcosirc> I have searched in google for a while, and found some libraries with different approaches.
< sumedhghaisas_> I am not sure I understand the second option correctly...
< marcosirc> I was trying to understand how they consider the tau value. I didn't analyse how they work with different values of k, so I will review this!
< marcosirc> Sorry, maybe I didn't explain it well.
< marcosirc> I am just writing a new email with more info.
< sumedhghaisas_> So the third option checks for less than k candidates...
< sumedhghaisas_> and if not... converts the overlapping node to normal node...
< sumedhghaisas_> is that right?
< sumedhghaisas_> I agree with you that this will add a lot of complexity... checking if points are revisited or not...
< marcosirc> Sorry, do you mean the second option?
< marcosirc> yeah.
< marcosirc> If there are fewer than k candidates, it treats the node as a non-overlapping node and does backtracking
< marcosirc> In the end it didn't add much complexity. Only 3 lines of code :) I have implemented that in the spill-trees branch.
< sumedhghaisas_> ahh yes sorry...
< sumedhghaisas_> I meant runtime complexity... but this can be a valid option...
< sumedhghaisas_> if user wants all k neighbours...
< marcosirc> yeah. I implemented a new tree trait
< marcosirc> to know if the tree has duplicated points
< marcosirc> it only checks for duplicated candidates when the tree has duplicated points
< sumedhghaisas_> if switching between them does not involve a lot of code... I would prefer keeping both ... and passing flags to switch
< marcosirc> so it won't modify the behaviour on other tree types.
< rcurtin> pantsforbirds: sorry for the slow response, I was in a meeting. you are absolutely welcome to help with optimizer algorithms!
< marcosirc> I also think it doesn't add significant runtime complexity.
< marcosirc> because I implemented it this way:
< sumedhghaisas_> So without flag it would be the straightforward hybrid search.... with flag it will guarantee k neighbours...
< marcosirc> - you calculate the position in the sorted list of candidates where you want to insert the new point.
< marcosirc> let's call it "i".
< sumedhghaisas_> hmmm... okay
< marcosirc> then you examine all the positions greater than or equal to "i" that have the same distance as the candidate you want to include.
< marcosirc> if the candidate was inserted before, you will find it there, and the probability of having another candidate with the same distance is really really low.
< marcosirc> so it won't require many operations...
< marcosirc> Ok, I will consider the flag, but I think it could involve many changes to the current implementation...
< rcurtin> marcosirc: the probability of having another candidate with the same distance is exceedingly low if the data is uniformly distributed, but if instead it comes from a discrete distribution (like the cloud dataset, or possibly even MNIST), neighbors with identical distances are very possible
< marcosirc> rcurtin: Ok, I understand. Anyway, I don't think it will require too many operations.
< rcurtin> yeah, you can simply check the neighbor index also
< marcosirc> Yeah, that is what I mean.
< marcosirc> I check that index "i" is not present among the candidates with the same distance as candidate "i".
< rcurtin> ah, okay, I see what you mean now, sorry for the misunderstanding
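A minimal sketch of the duplicate check described above, written in Python rather than mlpack's C++. It assumes the candidates are kept as two parallel lists sorted by distance; the function name `insert_candidate` and this list layout are hypothetical and are not taken from the spill-trees branch.

```python
import bisect


def insert_candidate(dists, idxs, distance, index, k):
    """Insert (distance, index) into the sorted candidate lists, skipping duplicates.

    dists and idxs are parallel lists, sorted by distance, holding at most k
    candidates. Returns True if the candidate was inserted.
    """
    # Position "i": the first slot whose distance is >= the new distance.
    i = bisect.bisect_left(dists, distance)
    if i >= k:
        return False  # the list is full and every candidate is strictly closer

    # Only candidates with exactly the same distance can be the same point,
    # so scan that (usually very short) run and compare point indices.
    j = i
    while j < len(dists) and dists[j] == distance:
        if idxs[j] == index:
            return False  # this point was already inserted from another node
        j += 1

    dists.insert(i, distance)
    idxs.insert(i, index)
    if len(dists) > k:
        # Drop the worst candidate so that at most k remain.
        dists.pop()
        idxs.pop()
    return True
```

With tied distances (as in discrete data) the scan visits a few more slots, but comparing the point index keeps the check correct.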
< sumedhghaisas_> but still... won't it be extra effort for the user who wants hybrid search?
< marcosirc> Sure, sorry if I don't explain myself properly.
< sumedhghaisas_> thats why I was suggesting maybe like a 'force k neighbours' flag :)
< marcosirc> Mmmm, ok. But if you specify a given k, it is because you want k neighbors, not fewer...
< marcosirc> If you think this would be more useful, I can modify the current implementation to consider a new flag.
< marcosirc> If you agree, I can review what the approach of other libraries is.
< sumedhghaisas_> yes you are right... but should we alter the algorithm for it? Thats what is hard to decide...
< sumedhghaisas_> rcurtin: What do you think about the flag option?
< marcosirc> I sent a new email with the last information :)
< sumedhghaisas_> marcosirc: And yes I agree that we should look into the approach by other libraries ...
< sumedhghaisas_> sorry slipped out of my mind...
< marcosirc> sumedhghaisas_: ok, I will do it now.
< rcurtin> sumedhghaisas_: I am not totally sure it is necessary; I don't have much of an opinion either way
< rcurtin> one of the things to consider is, if we do add a flag that will force the program to return k neighbors, then we should probably make the same option available for LSH and other techniques, but it is not always clear the best way to do that
< sumedhghaisas_> rcurtin: I understand, for consistency, but in this specific case as the overhead of checking the duplicate point is not much, we will be able to provide user with more control
< sumedhghaisas_> rcurtin: Also I installed ubuntu 16.04 ... and the default compiler is g++ 5.4.0 ... :)
< sumedhghaisas_> I will try to solve all those issues...
< rcurtin> (sorry, I am in a meeting... too many meetings... !)
< rcurtin> (I'll respond when I have a chance)
pantsforbirds has quit [Ping timeout: 260 seconds]
< sumedhghaisas_> rcurtin: ahh tell me about it :) I think my team wastes more time following agile terminology than optimizing the code
< marcosirc> haha
< sumedhghaisas_> btw I installed touchegg on my new installation.. it is amazing. now I can do macbook-like trackpad gestures on ubuntu...
< sumedhghaisas_> took me a night to set it up so if someone else wants help I can provide the prebuilt scripts :)
nilay has joined #mlpack
< nilay> zoq: does this test look good? https://gist.github.com/nilayjain/e2ec2fbb02955508b64812b1b996d1aa ? i know there are a few tweaks to be made, right now i am just printing values, but does this establish correctness for the forward and backward pass or do we need stricter tests? let me know
nilay has quit [Quit: Page closed]
< marcosirc> sumedhghaisas_: rcurtin: In this implementation:
< marcosirc> They do defeatist search while the node has at least k descendant points.
< marcosirc> Then, they do normal dfs search with prune rules. So, they can guarantee they will always return k points.
< marcosirc> It is a similar approach to what we do.
< marcosirc> But, it looks like they don't care about repeated points... I can't find more documentation than the code itself!
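The approach described for that implementation can be sketched roughly as follows. This is a toy 1-D spill tree in Python; `build`, `leaves`, `hybrid_knn`, the dict layout, and the `leaf_size` and `overlap` parameters are all assumptions made for illustration. The second phase here simply scans every leaf under the node instead of doing a DFS with prune rules, and it removes repeated points with a set, which the implementation discussed above may not do.

```python
def build(data, idxs, leaf_size=2, overlap=0.5):
    """Build a toy 1-D spill tree over the points data[i] for i in idxs."""
    if len(idxs) <= leaf_size:
        return {"idxs": idxs, "split": None}
    idxs = sorted(idxs, key=lambda i: data[i])
    split = data[idxs[len(idxs) // 2]]
    # Points within `overlap` of the split value go to BOTH children.
    left = [i for i in idxs if data[i] < split + overlap]
    right = [i for i in idxs if data[i] >= split - overlap]
    if len(left) == len(idxs) or len(right) == len(idxs):
        return {"idxs": idxs, "split": None}  # the split would not shrink the node
    return {"idxs": idxs, "split": split,
            "left": build(data, left, leaf_size, overlap),
            "right": build(data, right, leaf_size, overlap)}


def leaves(node):
    """Yield every leaf below node."""
    if node["split"] is None:
        yield node
    else:
        yield from leaves(node["left"])
        yield from leaves(node["right"])


def hybrid_knn(data, root, query, k):
    """Return the indices of up to k neighbours of query, nearest first."""
    node = root
    # Phase 1: defeatist descent, as long as the chosen child still has
    # at least k descendant points.
    while node["split"] is not None:
        child = node["left"] if query < node["split"] else node["right"]
        if len(child["idxs"]) < k:
            break
        node = child
    # Phase 2: exhaustive scan of the remaining subtree. Overlapping leaves
    # repeat points, so collect indices in a set before ranking them.
    seen = set()
    for leaf in leaves(node):
        seen.update(leaf["idxs"])
    return sorted(seen, key=lambda i: abs(data[i] - query))[:k]
```

Because the descent stops before entering a child with fewer than k descendants, the scan always has at least k points to choose from whenever the whole tree does.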
< zoq> nilay: nice way to initialize the weights, one last step you should do is to compare the output with a reference output, similar to the convolution test: https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/convolution_test.cpp
< zoq> nilay: Does this mean you fixed the error?
sumedhghaisas_ has quit [Ping timeout: 260 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1226 (master - 8900b8c : Ryan Curtin): The build is still failing.
travis-ci has left #mlpack []
< marcosirc> sumedhghaisas_: rcurtin: after searching for some time on google and github, I couldn't find many popular libraries implementing spill trees.
< marcosirc> The most relevant code that I found is an old opencv implementation (the same that I mentioned before):
< marcosirc> I couldn't find more documentation than the code itself. After reading it, it looks like they guarantee k candidates, but I don't think they check for repeated points..