verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
yaswagner has quit [Ping timeout: 260 seconds]
travis-ci has joined #mlpack
< travis-ci> manish7294/mlpack#28 (master - 0128ef7 : Marcus Edel): The build passed.
travis-ci has left #mlpack []
vivekp has quit [Ping timeout: 245 seconds]
vivekp has joined #mlpack
< ShikharJ> zoq: I'm sorry I couldn't reach out at that time; could you tell me what timezone you are in so that I can plan accordingly?
< ShikharJ> zoq: Pardon me if I'm misunderstanding, but isn't inSize parameter intended to serve as the parameter for the number of channels in an input image?
< zoq> ShikharJ: No worries, I got distracted so I couldn't take a look at the issue at the time you posted it, I'm in UTC + 2.
< zoq> ShikharJ: About inSize you are right, and that would affect the number of weights, so we have to modify the conv layer; take a look at the inputTemp parameter, currently it only accounts for the first sample.
< ShikharJ> zoq: I'll let you know what I find by today evening.
< zoq> ShikharJ: Okay, hopefully the changes are minor.
< jenkins-mlpack> Project docker mlpack nightly build build #349: STILL UNSTABLE in 2 hr 34 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/349/
sulan_ has joined #mlpack
sulan_ has quit [Quit: Leaving]
manish7294 has joined #mlpack
< manish7294> zoq: How do we make some new dataset additions to mlpack.org/datasets?
< manish7294> And, can we add categorical datasets, mainly having categorical labels?
< zoq> You can open an issue with the link to the dataset; I'll upload the dataset afterwards or just add the dataset to the list; or you can post the link here.
< zoq> sure
< manish7294> These two additions will be good enough.
< manish7294> zoq: Can the benchmarking system automatically handle categorical datasets?
< zoq> mlpack.org/datasets/balance_scale.tar.gz
< zoq> mlpack.org/datasets/letter_recognition.tar.gz
< manish7294> zoq: great :)
< zoq> manish7294: That depends on the lib; in the case of mlpack, all we do is forward the dataset (filename + path).
< zoq> manish7294: I guess if you like to benchmark against matlab you might have to adjust the benchmark script since it's probably using dlmread.
< zoq> manish7294: If mlpack can't handle the dataset right away we can write a simple preprocess step in python and pass the modified dataset.
< manish7294> zoq: That sounds about right. Can we modify the method too? I remember Haritha doing something like that for the decision tree to support categorical data, though I am not sure.
< zoq> manish7294: To handle categorical data?
< manish7294> right
< manish7294> I am not sure :)
< manish7294> It was a long time ago
< manish7294> zoq: https://github.com/mlpack/mlpack/pull/1195 maybe this is what I was thinking about.
< zoq> manish7294: yes, that's the one.
< zoq> manish7294: Should be straightforward, if you follow the PR.
< manish7294> zoq: I will follow it then :)
< manish7294> rcurtin: Here are some results on the letters dataset (20000 instances, 16 attributes) - lbfgs, k = 3, total time - 3 mins 49.5 secs, initial accuracy - 96.285, final - 96.905; amsgrad, total time - 4 mins 53.6 secs, final - 97.335
< Atharva> sumedhghaisas: You there?
< sumedhghaisas> Atharva: Hi Atharva
< sumedhghaisas> here now :)
< Atharva> Will you be available in 2 hours?
ImQ009 has joined #mlpack
< manish7294> zoq: rcurtin: Currently I am able to successfully execute LMNN benchmarks on my local system by copying mlpack_lmnn to libraries/bin/mlpack_lmnn. Can you please guide me on how I can run the same on the benchmarks system using my lmnn branch, since lmnn is not yet merged?
< rcurtin> manish7294: hey there... I slept in a little this morning
< rcurtin> so, there's a script in libraries/ called 'mlpack_install.sh'
< manish7294> rcurtin: :)
< rcurtin> basically, all that does is build mlpack in the libraries/mlpack/ directory
< sumedhghaisas> Atharva: Ahh sorry missed your msg
< rcurtin> so, you could, after setting up all the other libraries, manually build your branch of mlpack in libraries/mlpack/
< rcurtin> the CMake configuration used is:
< rcurtin> cmake -DCMAKE_INSTALL_PREFIX=../../ -DBUILD_TESTS=OFF ../
< rcurtin> and you would also 'make install'
< sumedhghaisas> What time are you referring to?
< rcurtin> does that make sense?
< manish7294> Right, I got it.
< Atharva> sumedhghaisas: 1 hour 40 minutes from now
< Atharva> 9:45 ist
< manish7294> All I have to do is the same thing: copy the lmnn branch's bin folder to libraries/bin
< rcurtin> it might also be a good idea to, in your local benchmarks repo, comment out the mlpack_install.sh call from the install_all.sh script, in case you accidentally type 'make setup'
< rcurtin> manish7294: no, I don't think that will work, because that may depend on parts of libmlpack.so that aren't there
< rcurtin> so I think it's better to build the whole library in libraries/mlpack/ with the CMake configuration above, then 'make install' to put it correctly in libraries/bin and libraries/lib
< manish7294> Okay, I will take care
< manish7294> And should I do this on slake itself
< rcurtin> yeah, I think it's fine to do it on slake
< rcurtin> if you do all the 'make run' runs there, they'll be comparable to each other
< manish7294> I will be taking covtype as the limiting (in terms of maximum size) dataset
< rcurtin> right, that sounds good
< manish7294> rcurtin: And should I create a pull request on benchmarks repo or should I wait till lmnn merge?
< rcurtin> if it is taking like 13 hours to run, you might want to start with a 5k or 50k subset
< rcurtin> you can open the PR now if you like, but we should wait to merge it until LMNN is merged
< rcurtin> (which should be fairly soon I think)
< manish7294> rcurtin: Are you randomly selecting 5k points of covertype to make the 5k covertype?
< manish7294> or are they the first 5k points?
< rcurtin> I took them randomly
< manish7294> Can you share it with me, if it is an independent file? Or I will make a more substantial one myself, as I was earlier using a subset of covertype-small which was missing some classes.
< sumedhghaisas> Atharva: Sure. I think I will be free.
< manish7294> rcurtin: Hoping I am not ruining your vacation (it's a once-in-a-while chance) :)
< rcurtin> no, it's no problem at all, it is a vacation from Symantec, not from everything :)
< rcurtin> I am just staying at home this week anyway
< rcurtin> soon I will go see if my new brakes work well (hopefully they do so I will come back)
< rcurtin> which I guess means it is important that I get you the datasets now, it could be the last chance :)
< manish7294> rcurtin: great :)
< rcurtin> ok, I'll be back later. in the worst case I might have to use the emergency brake but I think everything will be fine :)
< manish7294> Thanks!
sulan_ has joined #mlpack
< sumedhghaisas> Atharva: Hi Atharva, I have to sync up with someone at work at 17:00 BST. Could we discuss at 18:00 BST?
< rcurtin> perfect, brakes work great :)
< manish7294> rcurtin: That was a quick test drive. So, the emergency brakes didn't get a chance, haha :)
manish72942 has joined #mlpack
manish7294 has quit [Ping timeout: 260 seconds]
< Atharva> sumedhghaisas: Yes sure!
< Atharva> sumedhghaisas: You there?
manish72942 has quit [Ping timeout: 255 seconds]
vivekp has quit [Ping timeout: 255 seconds]
< sumedhghaisas> Atharva: So sorry. The meeting stretched on for too long.
< sumedhghaisas> I am here now :)
< Atharva> sumedhghaisas: Meetings always do :)
< sumedhghaisas> Atharva: So true :)
< sumedhghaisas> so whats up?
< Atharva> I had some questions about the normaldist class
< Atharva> Do we need the Train functions which the GaussianDistribution class has?
< sumedhghaisas> Ohh... ummm not really. Let me think.
< sumedhghaisas> In any case we can implement it later
< Atharva> Yeah, so not now.
< sumedhghaisas> Sure.
< Atharva> Another question is whether we should allow vectors and cubes, or just matrices?
< Atharva> I think we should think about RNN support as well, so cubes should be allowed I guess
< sumedhghaisas> ohh cubes are necessary as the output might be conv
< Atharva> Yup
< sumedhghaisas> so I would say vector, matrices and cubes
< Atharva> So, I will define multiple constructors
< sumedhghaisas> or maybe just templatize it?
< Atharva> Yes, but still, we need to take different number of parameters for each data type
< Atharva> in the constructor
< Atharva> for the size
< sumedhghaisas> I am not sure I understand that
< sumedhghaisas> why exactly you need size?
< sumedhghaisas> also you could infer size from the matrix itself
< Atharva> Hmm, in the constructor NormalDistribution(), when we are making a new set of distributions, if it's a matrix, then we need something like NormalDistribution(n_rows, n_cols)
< Atharva> Because in Gaussian, it only supports vector and it's multivariate, so they take GaussianDistribution(dimension)
< sumedhghaisas> What about NormalDistribution(arma::mat mean, arma::mat variance)?
< Atharva> This creates standard normals
< sumedhghaisas> ahh I mean templatize it properly
< Atharva> That will be there, this one is for standard normal of a given size
< Atharva> We don't need it, but it's good to have that I think
< sumedhghaisas> hmm... I am contemplating whether we should define these distributions inside the ANN framework or not
< sumedhghaisas> although I would say let's not have the size constructor
< Atharva> Okayy
< sumedhghaisas> in case the user wants, he can create a dist by generating a constant matrix
< sumedhghaisas> of the required size
< Atharva> Yes that's easy
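For reference, a rough sketch of the kind of matrix-parameterized constructor being discussed; the class name, members, and Sample() method here are assumptions for illustration, not mlpack's existing API:

    // Minimal sketch: an element-wise normal distribution parameterized by
    // matrices of means and variances (hypothetical, for illustration only).
    #include <armadillo>

    class NormalDistribution
    {
     public:
      NormalDistribution(const arma::mat& mean, const arma::mat& variance) :
          mean(mean), variance(variance) { }

      // Draw one sample per element of the mean/variance matrices.
      arma::mat Sample() const
      {
        return mean + arma::sqrt(variance) %
            arma::randn(mean.n_rows, mean.n_cols);
      }

     private:
      arma::mat mean;
      arma::mat variance;
    };

    // A "standard normal of a given size" then needs no extra constructor:
    // NormalDistribution d(arma::zeros(rows, cols), arma::ones(rows, cols));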
< Atharva> So, as Ryan said, we shouldn't make layers output distributions if layers after them expect matrices
< Atharva> So, this is just for the final layer, right?
< sumedhghaisas> Ahh no, so as Ryan suggested, what we do instead is accept the distribution in the layer it's being used in, rather than having the layer output it
< sumedhghaisas> So the VAE would output a matrix, and the loss layer will define a dist over it and use it
< Atharva> Got it, so the network isn't affected, everything happens in one layer
< sumedhghaisas> everything happens in one layer? sorry, didn't get that
< Atharva> Sorry, I meant that the dist objects are just used in the layer which needs it as input, rest of the network operates normally on matrices
< sumedhghaisas> ahh yes. :)
< Atharva> You said that the logprob will give us the reconstruction loss, but what if the user wants to use some other loss for reconstruction?
< sumedhghaisas> Usually in VAEs, reconstruction loss is defined over some distribution only
< sumedhghaisas> in any case, if the user wants some other loss, he could replace the ReconstructionLoss layer and use his own
< Atharva> Okay, so after the distribution, the next task is to implement a ReconstructionLoss layer, right?
< sumedhghaisas> yes
< sumedhghaisas> ReconstructionLoss will take a distribution to define the loss
< Atharva> So, you are saying that in VAEs we shouldn't sample from the output distribution, say an image, and then use some simple loss such as mean squared error between the output image and the training image
< Atharva> Will the loss always be taken with the output distribution?
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#5079 (master - e08e761 : Marcus Edel): The build has errored.
travis-ci has left #mlpack []
< Atharva> Also, I think we should discuss implementation details of the ReconstructionLoss layer now, because the dist won't take long now. I will try to complete the layer and test it before the week ends
< sumedhghaisas> Atharva: got distracted with something
< sumedhghaisas> Sure.
< sumedhghaisas> I still need to merge the Repar layer
< Atharva> Can you please read my message before the last one?
< sumedhghaisas> Sorry for being pedantic, but could you rebase the PR on master rather than merging? Merging just creates a complicated history
< sumedhghaisas> Mean squared error loss is basically a log_prob loss with a normal distribution
< Atharva> Okay, if that makes things easier
< sumedhghaisas> and that's what we will provide as default, although with binary MNIST we will have to use the bernoulli dist log_prob, thus distributions will make this easier
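To make that connection explicit: for a normal distribution with fixed variance \sigma^2, the negative log-probability of a target x given a predicted mean \mu is

    -\log p(x \mid \mu, \sigma^2) = \frac{(x - \mu)^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2),

which is the squared error up to a constant offset and the 1/(2\sigma^2) scale; the same log_prob construction with a Bernoulli distribution yields binary cross-entropy, which is what the binary MNIST case needs.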
< sumedhghaisas> Now for sampling, what we can do is define a distribution over the output of the FFN.Predict function
< sumedhghaisas> and sample from it
< Atharva> Yeah
< Atharva> The dists will help, we just need to make some adjustments in FFN class
< sumedhghaisas> Not really, now that we no longer output a dist, the current framework will work; we will just need to implement ReconstructionLoss, that's it
< Atharva> Okay, but we do need to change the Predict function for dists, right?
< sumedhghaisas> umm... No. Predict will output a matrix.
< sumedhghaisas> While sampling, we will define a Dist over the Predict output
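As a rough illustration of this sampling scheme: SampleFromModel, the fixed variance, and its value are assumptions made for the sketch, while Predict(input, output) is the existing FFN interface being referred to.

    #include <cmath>
    #include <armadillo>

    // Sketch: draw samples by treating Predict()'s output as the mean of a
    // fixed-variance normal distribution, instead of having the network
    // itself output a distribution.
    template<typename ModelType>
    arma::mat SampleFromModel(ModelType& model,
                              const arma::mat& latent,
                              const double variance = 0.1)
    {
      arma::mat mean;
      model.Predict(latent, mean);

      // Element-wise: mean + sqrt(variance) * standard normal noise.
      return mean + std::sqrt(variance) *
          arma::randn(mean.n_rows, mean.n_cols);
    }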
< Atharva> I think, we will need to make one detailed tutorial because we will be doing a lot of things externally and not in some class
< sumedhghaisas> A tutorial and a simple MNIST model in models :)
< Atharva> Yes, after that I hope we get some time for RNNs :)
< sumedhghaisas> Although this style of sampling is common across other frameworks as well
< Atharva> According to the planned timeline, the next week was for testing VAE class, but we don't have that now, so by then everything else should be ready so that we can play with some VAE models after that
< sumedhghaisas> For RNNs we will need to find a dataset as well
< Atharva> Yeah, what do you say about a music dataset?
< sumedhghaisas> for generation?
< Atharva> Yeah, in RNNs
< sumedhghaisas> Thats a very hard task for VAEs
< sumedhghaisas> to be honest
< Atharva> Oh, okay, maybe something else then, we will decide later
< sumedhghaisas> yeah, maybe we could play around with the Reber grammar that exists currently
< sumedhghaisas> Let's see if we could generate a correct grammar with VAEs
< sumedhghaisas> that's a very interesting experiment...
< Atharva> That's interesting, working with models is going to be so much fun!
< sumedhghaisas> Although shouldn't be difficult
< Atharva> We also have to reproduce results from the papers
< sumedhghaisas> That would be the first task as soon as we get MNIST working
< sumedhghaisas> does the paper mention MNIST or Binary MNIST?
< Atharva> Yeah, about the ReconstructionLoss layer, do you have something specific that I should keep in mind while implementing it?
< Atharva> I will check
< sumedhghaisas> Not really. Have you understood the role that it plays?
< Atharva> Yes I have
< Atharva> Its Forward function will return a double, just like the other loss layers that we have
< Atharva> It will take in a matrix and then use the dist to define a distribution over it
< Atharva> The dist object will then have logprob and logprob backward, which the layer will use for its Forward and Backward functions
< sumedhghaisas> Yup, yup and yup
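For concreteness, a rough sketch of how such a layer could look; the simplified signatures and the LogProb/LogProbBackward method names are assumptions, not mlpack's exact layer or distribution interface:

    #include <armadillo>

    // Sketch: a loss layer that defines a distribution (DistType) over its
    // input and scores the target by negative log-probability.
    template<typename DistType>
    class ReconstructionLoss
    {
     public:
      // Forward: build the distribution from the predicted parameters and
      // return the loss as a double, like the other loss layers.
      double Forward(const arma::mat& input, const arma::mat& target)
      {
        dist = DistType(input);
        return -dist.LogProb(target);
      }

      // Backward: gradient of the loss w.r.t. the input, delegated to the
      // distribution's backward pass.
      void Backward(const arma::mat& /* input */, const arma::mat& target,
                    arma::mat& output)
      {
        dist.LogProbBackward(target, output);
        output *= -1;  // The loss is the *negative* log-probability.
      }

     private:
      DistType dist;
    };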
< Atharva> We also need support for Bernoulli dist
< sumedhghaisas> thats for later :)
< Atharva> Okay
< Atharva> I will get on this then
< sumedhghaisas> Lets go with pure MNIST now
< Atharva> Yeah
ImQ009 has quit [Quit: Leaving]
< ShikharJ> zoq: Are you there?
< zoq> ShikharJ: yes
sulan_ has quit [Quit: Leaving]
< ShikharJ> zoq: I had a theoretical doubt. Let's say we have 4 3x3 input points, so the shape becomes 3x3x4, and let it convolve with a 3x3x1 kernel, so that the output is now 1x1x4, for the 4 inputs.
< ShikharJ> zoq: Now when I'm computing the gradients for the above operation in the Gradient method, should I compute 3x3x4 gradients pertaining to the 4 inputs, or does something else have to be done?
< ShikharJ> zoq: Because our kernel size is just 3x3x1?
< zoq> The gradient has to be calculated for each input separately and you take the sum over the gradients at the end, so theoretically you could just write a for loop around everything and take the sum at the end; but since this is slow we should see if we can vectorize the operation.
< ShikharJ> zoq: Ah, so I calculate the 3x3x4 sized gradients (for 4 inputs) and then reduce it to 3x3x1 by summing, is that right?
< zoq> correct
< ShikharJ> zoq: Summing along slices?
< zoq> yeah, if we use the cube representation
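A minimal Armadillo illustration of the sum-over-slices reduction described above; the variable names are made up, and mlpack's actual convolution layer organizes this differently:

    #include <armadillo>

    int main()
    {
      // One 3x3 gradient slice per input sample in a batch of 4.
      const arma::uword batchSize = 4;
      arma::cube perSampleGrads(3, 3, batchSize, arma::fill::randn);

      // Sum over slices (dimension 2): the result is a 3x3x1 cube, i.e. one
      // kernel-sized gradient accumulated over the whole batch.
      arma::cube kernelGrad = arma::sum(perSampleGrads, 2);

      kernelGrad.slice(0).print("accumulated kernel gradient:");
      return 0;
    }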
< ShikharJ> zoq: Ah, that cleared a lot on how convolutions actually work, for me. Thanks for the help!
< zoq> ShikharJ: Here to help.
< zoq> A simple test could check if the output is the same for two separate runs (two inputs) and a single run with the two inputs combined.
< ShikharJ> zoq: Ah, yes, we can try that out as well.
< ShikharJ> zoq: The implementation for Batch Support on Convolutional Layers is nearly complete, we can test after I push the code.
< zoq> ShikharJ: awesome
witness_ has quit [Quit: Connection closed for inactivity]
< ShikharJ> zoq: I think I also found a bug in the Gradients method of convolution_impl.cpp, though I'll need you to review it; pushing the code for now.
< ShikharJ> zoq: I also posted the results for DCGAN MNIST test on the full dataset on the PR!
< zoq> ShikharJ: Are you going to open another PR or do we use the DCGAN PR?
< ShikharJ> zoq: For the bug, I'll push to the BatchSupport PR. DCGAN should be good to go for MNIST, but I still need to get it running for CelebA, which I think should benefit from the BatchSupport PR improvements.
< zoq> ShikharJ: Okay, sounds fine for me.
< ShikharJ> zoq: Pushed in the changes.
< zoq> ShikharJ: Nice, does this one incorporate the fix for the gradient function? Not sure I see the issue.