verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
manish7294 has joined #mlpack
< manish7294> rcurtin: I am trying to load the covertype dataset as a csv file, but I keep getting a "file extension not recognised" error. Do you have any idea what could be going wrong here?
< manish7294> I am using kaggle's covertype.csv
< rcurtin> manish7294: how are you loading it?
< manish7294> rcurtin: using command line -i covertype.csv
< rcurtin> there should not be a problem with that, but I don't really have enough information to help you debug here
< rcurtin> what is the full command line that you are using?
< manish7294> rcurtin: bin/mlpack_lmnn -i covertype.csv -k 5 -a 0.01 -o output.csv --verbose
< manish7294> rcurtin: I will try to debug more.
< rcurtin> I think you'll have to use a debugger, I don't easily see any reason why that should give you problems with file extensions
< rcurtin> you could catch when the exception is thrown and step through the backtrace from there, I imagine that will help you figure out where the issue is
< rcurtin> if it's an issue in the mlpack loading code, let's definitely fix it, but that's not one I've seen before
< rcurtin> I'm going to head to bed now---good night! :)
< manish7294> good night :)
< manish7294> converting dataset to .txt worked.
< jenkins-mlpack> Project docker mlpack weekly build build #44: STILL UNSTABLE in 3 hr 11 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20weekly%20build/44/
ImQ009 has joined #mlpack
ShikharJ has joined #mlpack
< ShikharJ> zoq: Apart from the epochs and the pre-training mentioned, the parameters were unchanged. So, at least we have some meaningful defaults.
< ShikharJ> zoq: Thanks for the link, but I figured out a way to plot using matplotlib :)
< ShikharJ> zoq: I got some results, I'll comment on the PR.
< ShikharJ> This was really fast :)
mikeling has joined #mlpack
ShikharJ has quit [Quit: Page closed]
ImQ009 has quit [Quit: Leaving]
ShikharJ has joined #mlpack
< ShikharJ> zoq: Ack, I set the sampling size to 400 instead of 10, so the output ended up being incomprehensible. I have tmux'd it again with the changes, let's see.
< ShikharJ> zoq: Sorry, if I missed any messages after I said that I'll comment on the PR. EliteBNC is supposedly down, so I'm back on freenode webchat.
< Atharva> ShikharJ: You can always check the irc logs :)
< ShikharJ> Atharva: Yeah, did that. Thankfully, I didn't miss anything :)
< Atharva> Good to know.
< ShikharJ> zoq: If you have some hyper-parameter suggestions, we can also explore them. It takes less than 12 hours on the full dataset with the maxed out O'Reilly test parameters. We should really test our output for the single optimizer case, and then contrast them with the dual optimizer case output.
ImQ009 has joined #mlpack
< ShikharJ> zoq: The optimization support, I feel, would be better done along with support for batch-sizes, as I really can't think of much benefit in having two separate optimizers that iterate on singular inputs (apart from the check reduction that you mentioned). Let's push in the support for GAN and DCGAN till Phase 1, so that we can have the basic infrastructure ready, and then we can focus on this task.
< ShikharJ> In the meantime, we can also collect a lot of output data on different parameters, so in case the separation leads to a worse output than before, we wouldn't have to worry, as the GAN infrastructure would be already incorporated into mlpack.
ShikharJ is now known as 43UAB3PW7
ShikharJ has joined #mlpack
petris has joined #mlpack
43UAB3PW7 has quit [Quit: Page closed]
mikeling has quit [Quit: Connection closed for inactivity]
ImQ009 has quit [Quit: Leaving]
< jenkins-mlpack> Yippee, build fixed!
< jenkins-mlpack> Project docker mlpack nightly build build #336: FIXED in 2 hr 35 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/336/
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< Atharva> rcurtin: armadillo doesn't overload the ? : operators, does it?
< Atharva> In the softplus activation function, for vectors it has been done by looping over all the elements, and it doesn't support matrices.
< manish7294> Atharva: You may use for_each or transform for this.
< Atharva> manish7294: Performance wise, is for_each comparable to for loops or operator overloading (like we have for + - / ...)?
< manish7294> Atharva: I am not sure about this, but I think it may come down to a for loop here. I think Ryan can tell more about this. You can also have a quick look over the armadillo code, it may help.
< Atharva> Anyway, I think I will go with for_each. Thanks for that! Yeah, I will have a look over the armadillo code.
< manish7294> great :)
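A minimal sketch of the element-wise approach discussed above. C++ does not allow overloading `?:` at all, so a per-element function applied over the whole buffer is the usual route. This version uses only the standard library; the names `Softplus` and `SoftplusInPlace` are hypothetical, and the numerically stable formulation is swapped in for illustration and is not taken from the mlpack source. With Armadillo, the same lambda-style function could be passed to the `.transform()` member of a vector or matrix.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softplus:
//   log(1 + exp(x)) == max(x, 0) + log1p(exp(-|x|)).
// Hypothetical helper, not the mlpack implementation.
inline double Softplus(const double x)
{
  return std::max(x, 0.0) + std::log1p(std::exp(-std::abs(x)));
}

// Apply softplus to every element in place. A matrix stored in one
// contiguous buffer can be handled exactly like a vector, because the
// function is purely element-wise.
inline void SoftplusInPlace(std::vector<double>& values)
{
  std::transform(values.begin(), values.end(), values.begin(), Softplus);
}
```

Because the operation never looks at neighbouring elements, the same code path serves vectors and matrices, which is the limitation Atharva raised about the existing loop.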
< ShikharJ> zoq: It turns out that on the benchmark systems, the training runs a lot faster than I expected (probably less than 5 hours with maxed out hyperparameters).
ImQ009 has joined #mlpack
< ShikharJ> zoq: Looks like I messed up by providing the wrong input dataset. Gosh!
< ShikharJ> rcurtin: Do we have a ready to deploy mnist dataset available for use in mlpack? Would the one in mlpack/models work?
< rcurtin> yeah, that one could work. a subset of mnist is in src/mlpack/tests/data/, but it's not very large
< ShikharJ> rcurtin: Yeah, I worked with that and got some promising results which I've put up on the PR. But I'm not aware if the format of individual pixels (along rows or along columns) is the same in both?
< rcurtin> hmm, I think the subset may actually be transposed, but I'm not sure. it's stored as a binary Armadillo matrix (.arm), so you could load it and see
< rcurtin> it should be 784 rows, N columns (where N is the number of points)
< ShikharJ> rcurtin: In mlpack/models, the test dataset is 784 columns, so definitely transposed.
< rcurtin> ah, ok... not sure why it is transposed
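A small sketch of the orientation check described above, assuming the expected layout is 784 rows by N columns (one point per column). It uses a plain nested `std::vector` rather than an Armadillo matrix, and `Matrix`, `Transpose`, and `EnsurePointsAsColumns` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Row-major matrix; assumed rectangular (all rows have the same length).
using Matrix = std::vector<std::vector<double>>;

// Return the transpose of a rectangular matrix.
Matrix Transpose(const Matrix& in)
{
  if (in.empty())
    return {};
  const std::size_t rows = in.size();
  const std::size_t cols = in[0].size();
  Matrix out(cols, std::vector<double>(rows));
  for (std::size_t r = 0; r < rows; ++r)
    for (std::size_t c = 0; c < cols; ++c)
      out[c][r] = in[r][c];
  return out;
}

// Return the data with `dims` rows and one point per column, transposing
// when the input has `dims` columns instead. Throws if neither side matches.
Matrix EnsurePointsAsColumns(const Matrix& in, const std::size_t dims)
{
  if (in.size() == dims)
    return in;
  if (!in.empty() && in[0].size() == dims)
    return Transpose(in);
  throw std::invalid_argument("neither dimension matches the expected size");
}
```

The check is ambiguous when the number of points happens to equal `dims`; in that case the input is returned unchanged and the orientation has to be confirmed by other means.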
< ShikharJ> zoq: It seems it'll take me some time to prepare the dataset and upload the results. Nevertheless, I have begun the work on DCGAN.
manish7294 has quit [Ping timeout: 245 seconds]
wenhao has joined #mlpack
< Atharva> rcurint: zoq: Do take a look at #1414
< rcurtin> Atharva: yeah, I saw the email, I'll try to get to it today
< Atharva> rcurtin: sure, whenever you get time
Trion has joined #mlpack
< wenhao> lozhnikov: Hi Mikhail. I am trying to do neighbor search with cosine distance but NeighborSearch with KDTree may not work with cosine distance. I guess one way to do that is to first normalize all query vectors and reference set vectors to unit length, and then perform NeighborSearch with Euclidean distance.
< wenhao> That's because, with normalized vectors, neighbor search with Cosine Distance is equivalent to neighbor search with Euclidean Distance. But I am not sure whether my proof/calculation is correct. What do you think?
< rcurtin> wenhao: that's correct
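The equivalence wenhao describes can be written out in one line, assuming cosine distance is defined as one minus cosine similarity (the symbol d_cos is introduced here only for illustration). For unit-length x and y:

```latex
\begin{aligned}
\|x - y\|^2 &= \|x\|^2 + \|y\|^2 - 2\, x^\top y \\
            &= 2 - 2\cos(x, y) && \text{since } \|x\| = \|y\| = 1 \\
            &= 2\, d_{\cos}(x, y), && d_{\cos}(x, y) = 1 - \cos(x, y).
\end{aligned}
```

Squared Euclidean distance is therefore a monotonically increasing function of cosine distance on normalized vectors, so both orderings return the same k nearest neighbors.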
< rcurtin> if you don't want to normalize, another option would be to use FastMKS (fast max-kernel search)
< rcurtin> however, the bounds are tighter for pruning with nearest neighbor search, so unless there is a good reason to avoid normalization I think that may be the better strategy
< rcurtin> if you're interested in reading more, the paper http://www.ratml.org/pub/pdf/2014fastmks.pdf has a description of the max-kernel search problem and when it reduces to nearest neighbor search
< rcurtin> but I am not sure how interesting the paper is. my perspective is biased :)
Trion has quit [Quit: Entering a wormhole]
< wenhao> rcurtin: Thanks! The mks problem sounds interesting. What's the time complexity if I use fastmks to search for k neighbors (k > 1) instead of only one neighbor with maximal kernel?
mikeling has joined #mlpack
< wenhao> I am comparing which solution could be faster
< rcurtin> hm, so the asymptotic time complexity depends on all kinds of strange constants that we probably won't know
< rcurtin> but in reality the FastMKS algorithm requires building a cover tree in kernel space, and usually that takes a lot longer than building a kd-tree in the original space
< rcurtin> hang on, I have a meeting, back later...
< Atharva> rcurtin: Sorry, I just realized I misspelled your nick in my second-to-last message.
< rcurtin> Atharva: no worries, I didn't notice :)
mikeling has quit [Quit: Connection closed for inactivity]
ImQ009 has quit [Read error: Connection reset by peer]
ImQ009 has joined #mlpack
vivekp has quit [Ping timeout: 244 seconds]
travis-ci has joined #mlpack
< travis-ci> ShikharJ/mlpack#168 (DCGAN - 1f6ee54 : Shikhar Jaiswal): The build has errored.
< travis-ci> Change view : https://github.com/ShikharJ/mlpack/compare/38e691ba9b9a^...1f6ee54a2196
travis-ci has left #mlpack []
ImQ009 has quit [Quit: Leaving]