verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Mathnerd314 has quit [Ping timeout: 250 seconds]
govg has quit [Ping timeout: 258 seconds]
govg has joined #mlpack
kwikadi has quit [Ping timeout: 258 seconds]
mentekid has joined #mlpack
kwikadi has joined #mlpack
mentekid has quit [Ping timeout: 264 seconds]
Mathnerd314 has joined #mlpack
mentekid has joined #mlpack
mentekid has quit [Ping timeout: 250 seconds]
mentekid has joined #mlpack
mentekid has quit [Ping timeout: 276 seconds]
nilay has joined #mlpack
mentekid has joined #mlpack
gtank has quit [Ping timeout: 272 seconds]
mentekid has quit [Ping timeout: 276 seconds]
gtank has joined #mlpack
< nilay>
zoq: hi, I have a question: can you run this test and tell me why it gives an error?
< nilay>
zoq: to initialize the weights i will have to change code in the conv_layer, but then that test wouldn't be one which stands, it'll only be for evaluation
pantsforbirds has joined #mlpack
< pantsforbirds>
if I'm interested in contributing, is there some documentation I can read? I've found the Google Summer of Code projects, but I can't find any other contribution documents
< pantsforbirds>
rcurtin, ah thats exactly what i was looking for!
< pantsforbirds>
and thanks!
< rcurtin>
sure, please feel free to ask more questions if you like :)
nilay has quit [Ping timeout: 250 seconds]
< pantsforbirds>
so if i wanted to help with some optimizer algorithms that would be possible?
sumedhghaisas_ has joined #mlpack
< sumedhghaisas_>
marcosirc: Hey Marcos...
< marcosirc>
sumedhghaisas: Hi! how are you?!
< sumedhghaisas_>
great... had a great trip.
< sumedhghaisas_>
was exhausted the whole day...
< sumedhghaisas_>
involved too much driving around
< sumedhghaisas_>
so I looked at your mail...
< marcosirc>
Nice! I can imagine!
< marcosirc>
Ok.
< sumedhghaisas_>
So the less than k neighbours problem...
< marcosirc>
I was writing a new mail in response to Ryan's comments.
< sumedhghaisas_>
So my best bet would be the first solution... given that it's properly documented...
< sumedhghaisas_>
but it would be fair to also consider how other libraries handle this case...
< sumedhghaisas_>
like in the defeatist search if less than k neighbours are found...
< marcosirc>
Yeah, I understand. I have implemented the 2nd solution because it was very simple to do and I thought it would be more useful for future users.
< marcosirc>
I couldn't find many libraries implementing defeatist search.
< marcosirc>
I have searched in google for a while, and found some libraries with different approaches.
< sumedhghaisas_>
I am not sure I understand the second option correctly...
< marcosirc>
I was trying to understand how they handle the tau value. I didn't analyse how they work with different values of k, so I will review this!
< marcosirc>
Sorry, maybe I didn't explain it well.
< marcosirc>
I am just writing a new email with more info.
< sumedhghaisas_>
So the third option checks for less than k candidates...
< sumedhghaisas_>
and if not... converts the overlapping node to normal node...
< sumedhghaisas_>
is that right?
< sumedhghaisas_>
I agree with you that this will add a lot of complexity... checking if points are revisited or not...
< marcosirc>
Sorry, do you mean the second option?
< marcosirc>
yeah.
< marcosirc>
If there are less than k candidates, it treats the node as a non-overlapping node and does backtracking.
< marcosirc>
In the end it did not add much complexity. Only 3 lines of code :) I have implemented that in the spill-trees branch.
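The fallback marcosirc describes can be sketched roughly as follows. This is a toy 1-D illustration, not the code in the spill-trees branch: the `Node` struct and the `Search` function are hypothetical names, distance-based pruning is left out, and candidates are simply collected in a map keyed on point index so that revisited points are harmless.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <memory>
#include <vector>

// Toy 1-D spill tree node (hypothetical; not mlpack's tree class).
struct Node
{
  bool overlapping = false;         // True if the two children share points.
  double split = 0.0;               // Queries below `split` descend left.
  std::vector<std::size_t> points;  // Point indices; only used by leaves.
  std::unique_ptr<Node> left, right;  // Both null for a leaf, both set otherwise.
};

// Collects candidates for `query` into `found` (point index -> distance).
// Assumption: no pruning, so non-overlapping nodes always visit both children.
void Search(const Node& node,
            const std::vector<double>& data,
            const double query,
            const std::size_t k,
            std::map<std::size_t, double>& found)
{
  assert(k > 0);

  if (!node.left)  // Leaf: every held point becomes a candidate.
  {
    for (const std::size_t idx : node.points)
      found[idx] = std::abs(data[idx] - query);
    return;
  }

  const bool goLeft = (query < node.split);
  const Node& closer = goLeft ? *node.left : *node.right;
  const Node& other = goLeft ? *node.right : *node.left;

  Search(closer, data, query, k, found);

  // Defeatist rule: an overlapping node would normally stop here.
  // Fallback: with fewer than k candidates, treat the node as
  // non-overlapping and backtrack into the other child.
  if (!node.overlapping || found.size() < k)
    Search(other, data, query, k, found);
}
```

With an overlapping root whose children share some points, a query that finds k candidates in the closer child never touches the other child; asking for more than the closer child holds triggers the backtracking step.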
< sumedhghaisas_>
ahh yes sorry...
< sumedhghaisas_>
I meant runtime complexity... but this can be a valid option...
< sumedhghaisas_>
if user wants all k neighbours...
< marcosirc>
yeah. I implemented a new tree trait
< marcosirc>
to know if the tree has duplicated points
< marcosirc>
it only checks for duplicated candidates when the tree has duplicated points
< sumedhghaisas_>
if switching between them does not involve a lot of code... I would prefer keeping both ... and passing flags to switch
< marcosirc>
so it won't modify the behaviour on other tree types.
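A compile-time trait along these lines could gate the duplicate check. The names below (`DuplicateTraits`, `ToyKDTree`, `ToySpillTree`, `CountKept`) are made up for illustration and are not taken from the branch being discussed; the point is only that tree types without duplicated points skip the check entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical trait: by default a tree type holds each point exactly once.
template<typename TreeType>
struct DuplicateTraits
{
  static constexpr bool HasDuplicatedPoints = false;
};

// Two stand-in tree types, used only to select a trait specialization.
struct ToyKDTree {};
struct ToySpillTree {};

// Overlapping children mean the same point can be reached more than once.
template<>
struct DuplicateTraits<ToySpillTree>
{
  static constexpr bool HasDuplicatedPoints = true;
};

// Returns how many of the visited point indices would be kept as candidates.
template<typename TreeType>
std::size_t CountKept(const std::vector<std::size_t>& visited)
{
  assert(!visited.empty());

  // Trees without duplicated points pay nothing for the check.
  if (!DuplicateTraits<TreeType>::HasDuplicatedPoints)
    return visited.size();

  // Trees with duplicated points keep each index only once.
  const std::set<std::size_t> seen(visited.begin(), visited.end());
  return seen.size();
}
```

Because the trait is a compile-time constant, the branch for other tree types can be optimized away, which matches the goal of not changing their behaviour.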
< rcurtin>
pantsforbirds: sorry for the slow response, I was in a meeting. you are absolutely welcome to help with optimizer algorithms!
< marcosirc>
I also think it doesn't add significant runtime complexity.
< marcosirc>
because I implemented it this way:
< sumedhghaisas_>
So without flag it would be the straightforward hybrid search.... with flag it will guarantee k neighbours...
< marcosirc>
- you calculate the position in the sorted list of candidates where you want to insert the new point.
< marcosirc>
let's call it "i".
< sumedhghaisas_>
hmmm... okay
< marcosirc>
then you examine all the positions greater than or equal to "i" that have the same distance as the candidate you want to include.
< marcosirc>
if the candidate was inserted before, you will find it there, and the probability of having another candidate with the same distance is really really low.
< marcosirc>
so it won't require many operations...
< marcosirc>
Ok, I will consider the flag, but I think it could involve many changes to the current implementation...
< rcurtin>
marcosirc: the probability of having another candidate with the same distance is exceedingly low if the data is uniformly distributed, but if instead it comes from a discrete distribution (like the cloud dataset, or possibly even MNIST), neighbors with identical distances are very possible
< marcosirc>
rcurtin: Ok, I understand. Anyway, I don't think it will require too many operations.
< rcurtin>
yeah, you can simply check the neighbor index also
< marcosirc>
Yeah, that is what I mean.
< marcosirc>
I check that the index of the new point is not already present among the candidates that have the same distance as the new candidate.
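The insertion step discussed above can be sketched like this. It is a minimal illustration under assumed names (`Candidate`, `InsertCandidate`), not the code from the spill-trees branch: the candidate list is kept sorted by distance, and only the run of entries at the same distance is scanned for a matching point index, which also covers the case of discrete data with many tied distances.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A candidate neighbor: (distance to the query, index of the reference point).
using Candidate = std::pair<double, std::size_t>;

// Hypothetical helper. `candidates` is sorted by ascending distance and holds
// at most k entries. Returns false if the point is already a candidate or is
// no closer than the k candidates currently held.
bool InsertCandidate(std::vector<Candidate>& candidates,
                     const std::size_t k,
                     const double distance,
                     const std::size_t index)
{
  assert(k > 0);

  // Position "i": the first entry whose distance is not smaller than the new one.
  const auto pos = std::lower_bound(
      candidates.begin(), candidates.end(), distance,
      [](const Candidate& c, const double d) { return c.first < d; });

  // Scan only the entries with exactly the same distance, comparing indices.
  for (auto it = pos; it != candidates.end() && it->first == distance; ++it)
    if (it->second == index)
      return false;  // Already inserted, e.g. via another overlapping node.

  // A full list whose entries are all closer leaves no room for the new point.
  if (static_cast<std::size_t>(pos - candidates.begin()) >= k)
    return false;

  candidates.insert(pos, Candidate(distance, index));
  if (candidates.size() > k)
    candidates.pop_back();  // Drop the farthest candidate.
  return true;
}
```

The extra cost per insertion is proportional to the number of candidates tied at that distance, so it stays small unless the data produces many identical distances.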
< rcurtin>
ah, okay, I see what you mean now, sorry for the misunderstanding
< sumedhghaisas_>
but still... won't it be extra effort for the user who wants hybrid search?
< marcosirc>
Sure, sorry if I don't explain myself properly.
< sumedhghaisas_>
that's why I was suggesting maybe like a 'force k neighbours' flag :)
< marcosirc>
Mmmm, ok. But if you specify a given k, it is because you want k neighbors, not fewer...
< marcosirc>
If you think this would be more useful, I can modify the current implementation to consider a new flag.
< marcosirc>
If you agree, I can review the approach that other libraries take.
< sumedhghaisas_>
yes you are right... but should we alter the algorithm for it? That's what is hard to decide...
< sumedhghaisas_>
rcurtin: What do you think about the flag option?
< marcosirc>
I sent a new email with the last information :)
< sumedhghaisas_>
marcosirc: And yes I agree that we should look into the approach by other libraries ...
< sumedhghaisas_>
sorry, it slipped my mind...
< marcosirc>
sumedhghaisas_: ok, I will do it now.
< rcurtin>
sumedhghaisas_: I am not totally sure it is necessary; I don't have much of an opinion either way
< rcurtin>
one of the things to consider is, if we do add a flag that will force the program to return k neighbors, then we should probably make the same option available for LSH and other techniques, but it is not always clear the best way to do that
< sumedhghaisas_>
rcurtin: I understand, for consistency, but in this specific case, as the overhead of checking for duplicate points is not much, we will be able to provide the user with more control
< sumedhghaisas_>
rcurtin: Also I installed ubuntu 16.04 ... and the default compiler is g++ 5.4.0 ... :)
< sumedhghaisas_>
I will try to solve all those issues...
< rcurtin>
(sorry, I am in a meeting... too many meetings... !)
< rcurtin>
(I'll respond when I have a chance)
pantsforbirds has quit [Ping timeout: 260 seconds]
< sumedhghaisas_>
rcurtin: ahh tell me about it :) I think my team wastes more time following agile terminology than optimizing the code
< marcosirc>
haha
< sumedhghaisas_>
btw I installed touchegg on my new installation.. it is amazing. now I can do macbook-like trackpad gestures on ubuntu...
< sumedhghaisas_>
took me a night to set it up so if someone else wants help I can provide the prebuilt scripts :)
nilay has joined #mlpack
< nilay>
zoq: does this test look good? https://gist.github.com/nilayjain/e2ec2fbb02955508b64812b1b996d1aa I know there are a few tweaks to be made, right now I am just printing values, but does this establish correctness for the forward and backward pass or do we need stricter tests? let me know
nilay has quit [Quit: Page closed]
< marcosirc>
sumedhghaisas_: rcurtin: In this implementation:
< marcosirc>
I couldn't find any documentation beyond the code itself. After reading it, it looks like they guarantee k candidates, but I don't think they check for repeated points.