verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
slardar has quit [Ping timeout: 276 seconds]
< marcosirc>
quit
marcosirc has quit [Quit: WeeChat 1.4]
tham has joined #mlpack
< tham>
nilay : as rcurtin mentioned, you can use arma::shuffle
< tham>
or use std::random_device to generate random seed
< tham>
std::random_shuffle is deprecated as of C++14 and removed in C++17, so it is not recommended to use it
< tham>
random library of c++11 is very powerful
< tham>
you can assign your own seed to std::default_random_engine too
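The advice above can be sketched with the standard library alone. This is a minimal illustration, not code from mlpack: `ShuffledIndices` is a hypothetical helper name, and it uses `std::shuffle` (the standard replacement for `std::random_shuffle`) rather than `arma::shuffle`. One overload takes an explicit seed for reproducible runs, the other draws a seed from `std::random_device`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: return the indices 0..n-1 in a random order that is
// fully determined by `seed` (for a given standard library implementation).
std::vector<std::size_t> ShuffledIndices(std::size_t n, unsigned int seed)
{
  std::vector<std::size_t> indices(n);
  std::iota(indices.begin(), indices.end(), std::size_t(0));

  // Seed the engine explicitly so the shuffle can be reproduced.
  std::default_random_engine engine(seed);
  std::shuffle(indices.begin(), indices.end(), engine);
  return indices;
}

// Same as above, but the seed comes from std::random_device, so each call
// will generally produce a different order.
std::vector<std::size_t> ShuffledIndices(std::size_t n)
{
  std::random_device rd;
  return ShuffledIndices(n, rd());
}
```

Note that `std::default_random_engine` is implementation-defined, so the same seed may give a different order on a different compiler or standard library; a named engine such as `std::mt19937` avoids that if cross-platform reproducibility matters.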
mentekid has joined #mlpack
nilay has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
nilay has joined #mlpack
< nilay>
tham: thanks, i used arma::shuffle only.
nilay has quit [Ping timeout: 250 seconds]
tham has quit [Ping timeout: 250 seconds]
Mathnerd314 has quit [Ping timeout: 272 seconds]
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
< mentekid>
rcurtin: I completed the changes for #663 (LSH Table Access). I added a new constructor that creates the object directly from an arma::cube (and some other optional parameters) and merged Train() and BuildHash()
< mentekid>
I also described the changes that affect users in History.md, so please check that too :)
< mentekid>
tests pass on my machine, but some merge conflicts remain, I'm not sure what to do about those but they occurred some time after my first commits
anshu has joined #mlpack
< mentekid>
I think the next thing I should do is maybe remove some of the probabilistic tests we had and add a small one we've solved "by hand" - for example 10 points projected onto 1-2 tables we have defined
< mentekid>
I need to think how we can account for the "bias" vector though
< mentekid>
which as it is we don't have access to
< anshu>
Hi guys! I am interested in machine learning and want to contribute to mlpack. How do I start with it?
< zoq>
anshu: Hello, see http://mlpack.org/involved.html for more information ... that page has a lot of other good links for getting involved too.
< anshu>
how much C++ should i know in order to contribute to mlpack?
< zoq>
anshu: That depends on the things you like to do; I'd like to cite the mlpack wiki here:
< zoq>
"The "necessary knowledge" sections can often be replaced with "willing to learn" for the easier projects, and for some of the more difficult problems, a full understanding of the description statement and some coding knowledge is sufficient."
< anshu>
actually I am very new to all this; I am doing the Andrew Ng course on machine learning and I'd like to learn more so that I could contribute to mlpack. Can any of you tell me how I should go about doing it?
< zoq>
anshu: So, I think you could take a look at the issues list and see if you find something interesting. Another way is to contribute some new method e.g. if you are interested in neural networks you could implement an interesting network layer.
< zoq>
anshu: If you are interested in trees, you can implement an interesting tree type.
< keonkim>
rcurtin: hmm... regarding the absolute path problem, should we be just using train.csv and test.csv?
< rcurtin>
keonkim: that's a solution, I think that is reasonable
< rcurtin>
I think it would also be reasonable to remove the warnings if the user doesn't pass --training_file or --test_file, but I don't have a problem leaving them there either
< rcurtin>
since if they type '-h' it will show them what the defaults are
< rcurtin>
(that's why I removed the "default is 0.2" from the documentation of --test_ratio)
< keonkim>
wow you type very fast
< rcurtin>
hah, maybe those typing classes in high school paid off :)
< keonkim>
I think leaving the warnings is ok because people might use this in a crowded directory without carefully reading -h (thats what I do usually..)
< rcurtin>
yep, fine by me
< marcosirc>
rcurtin: sorry, I am not sure I understood properly. Do you want to merge the modifications to use b_aux, or just forget about that?
< rcurtin>
let's just go ahead and merge both, since you made NeighborSearchStat smaller overall (well, probably, depends on compiler and alignment and other things like that... but it's at least no larger)
< rcurtin>
b_aux would be good to merge in, since maybe lozhnikov might be creating a tree type that causes the B_2 bug to surface
< rcurtin>
I think when you write spill trees, that will not expose the issue
< rcurtin>
since spill trees will basically be like overlapping kd-trees and you can do it with all the points in the leaves
< marcosirc>
Yes, I think so.
< marcosirc>
Thanks, ok I will make the pull request!
marcosirc has quit [Quit: WeeChat 1.4]
< zoq>
rcurtin keonkim: I'm not sure it is a good idea to set a default training and test set. We would probably overwrite an already existing file, right?
< keonkim>
zoq: yes, it overwrites the existing file. I remember now, that's why I decided to prepend.
< rcurtin>
zoq: keonkim: but if we choose a default file regardless it will overwrite something, so maybe we should make the training and test file PARAM_STRING_REQ() parameters, and then force the user to specify --training_labels_file and --test_labels_file if --labels_file is specified
< rcurtin>
I think there are a couple other places in the code where default filenames are given, maybe we should go through and remove all those situations
< rcurtin>
lozhnikov: I am still looking through your PR, there is a lot of code to understand so it may be a few hours before I am able to add more comments :)
< zoq>
I agree, using PARAM_STRING_REQ would save us a lot of trouble.
< zoq>
Also, I think deleting line 21 in serialization_template_version.hpp, would fix the windows build, I'm on the phone, so I can't test it right now.
< keonkim>
zoq rcurtin: I can fix (or find) those default filename issues on the other files while fixing this one.
benchmark has joined #mlpack
benchmark has quit [Client Quit]
< rcurtin>
keonkim: sure, please do, basically all you'll need to do is look for PARAM_STRING()s that have default filenames, and then change the code so that the output is not saved if the file is not specified
< rcurtin>
this is done in src/mlpack/methods/neighbor_search/knn_main.cpp, take a look at how --neighbors_file is handled if it isn't specified to see what I mean
< keonkim>
yep, thanks for the tip
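The pattern rcurtin describes (no default filename, and no output written when the file option is left unspecified) can be sketched as follows. This is a standalone illustration under stated assumptions, not the code in knn_main.cpp: `SaveIfRequested` is a hypothetical helper, and an empty string stands in for "option not specified".

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: write `contents` to `filename` only when the user
// actually supplied a filename. An empty filename means "not specified",
// so nothing is written and no existing file can be overwritten by accident.
// Returns true only if the file was written successfully.
bool SaveIfRequested(const std::string& filename, const std::string& contents)
{
  if (filename.empty())
    return false; // Option not given: skip saving entirely.

  std::ofstream out(filename);
  if (!out.is_open())
    return false; // Could not open the requested file.

  out << contents;
  return out.good();
}
```

With this shape, the option itself would be declared with an empty default, so that '-h' no longer advertises a filename that silently gets overwritten.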
sumedhghaisas has quit [Ping timeout: 240 seconds]
< zoq>
lpack_allkrann: option '-s' is ambiguous and matches '--seed', and '--single_mode' ... 'S' is already taken
< rcurtin>
I think maybe we should change --single_mode to -S and change --single_sample_limit to -L
< rcurtin>
ah but -L is taken! bah
< rcurtin>
I guess -m is not taken
< zoq>
:)
< rcurtin>
I think we need some kind of static checking utility to ensure the same option is not registered twice
< rcurtin>
I guess maybe that wouldn't be static, we could do it when the Option object is added to the CLI object...
< rcurtin>
I'll go ahead and open a ticket, maybe some beginner might be interested in that
< zoq>
sounds good, I just noticed it because of the failed benchmark
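The registration-time check rcurtin suggests could look roughly like the sketch below. It is not mlpack's CLI class: `OptionRegistry` and its `Add()` method are hypothetical names used only for illustration. The idea is to reject a long name or single-character alias at the moment it is registered a second time, instead of discovering the clash when the program is run.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical registry of command-line options. Add() throws as soon as a
// long name or a single-character alias is registered twice.
class OptionRegistry
{
 public:
  // Register the long option `name` with an optional one-character `alias`
  // ('\0' means the option has no alias).
  void Add(const std::string& name, char alias = '\0')
  {
    if (names.count(name) > 0)
      throw std::invalid_argument("option --" + name + " registered twice");

    if (alias != '\0')
    {
      const auto it = aliases.find(alias);
      if (it != aliases.end())
      {
        throw std::invalid_argument(std::string("alias -") + alias +
            " for --" + name + " is already used by --" + it->second);
      }
    }

    // Both checks passed, so it is safe to record the option.
    names.insert(name);
    if (alias != '\0')
      aliases[alias] = name;
  }

 private:
  std::set<std::string> names;          // Long names seen so far.
  std::map<char, std::string> aliases;  // Alias -> long name that owns it.
};
```

In this sketch, registering `seed` with alias `s` and then `single_mode` with alias `s` would throw on the second call, which is the collision reported above.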