verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sumedhghaisas has quit [Ping timeout: 268 seconds]
vivekp has joined #mlpack
sumedhghaisas has joined #mlpack
partobs-mdp has quit [Remote host closed the connection]
sumedhghaisas has quit [Ping timeout: 268 seconds]
kris1 has joined #mlpack
PhilippeC has joined #mlpack
PhilippeC has quit [Client Quit]
< lozhnikov>
kris1: I pointed out the error that leads to segfault at github.
< kris1>
Ahhh, I will have a look.
< kris1>
Thanks
sumedhghaisas has joined #mlpack
aashay has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris1>
lozhnikov: Regarding the weight shape comment you gave in ssRBM: I was just following the convention of outsize * insize.
< kris1>
Do you want me to change that?
< lozhnikov>
kris1: I think that's not important. But I prefer the same notation as the paper suggests.
< kris1>
Hmmm, okay, then I think we should change it.
< kris1>
Also, I don't understand this.
< kris1>
I think we have to restrict the number of iterations here.
< kris1>
this is for rejection sampling.
< lozhnikov>
I think we shouldn't do more than N iterations
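The capped rejection-sampling loop being discussed can be sketched as follows. This is a minimal plain-C++ sketch, not mlpack's implementation; the function name, the bounds, and the clamp-after-N-rejections fallback are assumptions for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Sketch: draw from a Gaussian via rejection sampling against a bounded
// domain [lo, hi], but give up after maxIterations draws and clamp the
// last sample into the domain instead of looping forever.
double SampleTruncatedGaussian(const double mean, const double stddev,
                               const double lo, const double hi,
                               const size_t maxIterations,
                               std::mt19937& rng)
{
  std::normal_distribution<double> dist(mean, stddev);
  double sample = mean;
  for (size_t i = 0; i < maxIterations; ++i)
  {
    sample = dist(rng);
    if (sample >= lo && sample <= hi)
      return sample; // Accepted: inside the bounded domain.
  }
  // Fallback after N rejections: clamp into the domain, so the sampler
  // always terminates and the returned value stays feasible.
  return std::min(std::max(sample, lo), hi);
}
```

With this fallback the sampler never returns a value outside the domain, which is one way to address the "what if the sample is still not feasible after N tries" concern raised below.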
< kris1>
But what if the sample is still not in the feasible region?
< lozhnikov>
Why do you think that is so critical?
< lozhnikov>
I think this is not a problem
< kris1>
Well, the paper mentions that if we don't do this, the variance matrix would not remain PSD.
< lozhnikov>
Could you show me the page and the line? I don't see that
< lozhnikov>
The paper states only "This restriction to a finite domain guarantees that the partition function Z remains finite"
< kris1>
sampling from this Gaussian is straightforward, when using rejection sampling to exclude v outside the bounded domain.
< kris1>
on page 235, column 2
sumedhghaisas has quit [Ping timeout: 268 seconds]
< lozhnikov>
hm... This doesn't influence the matrix properties
< kris1>
Also page 237 column1 starting paragraph
< lozhnikov>
okay, I agree with the arguments on page 237. But I still think it is not critical if samples sometimes occur outside of the region.
aashay has quit [Quit: Connection closed for inactivity]
< kris1>
lozhnikov: I think input * input.t() is correct there, as you would get a d*d matrix. But the size of visibleBias/LambdaBias is also d*1. We obviously have to make the visibleBias positive gradient d*1.
< kris1>
You said we should use input.t() * input, which would be a scalar. Remember that input is a d*1 vector.
< kris1>
visibleBiasPositiveGrad.diag() extracts the diagonal elements from the matrix and hence makes visibleBiasPositiveGradient d*1
< kris1>
so I think it is correct that way
< kris1>
Let me know what you think
< lozhnikov>
Lambda is a diagonal matrix. Hence input * input.t() is definitely incorrect
< lozhnikov>
Moreover the paper suggests 0.5 v^T * v (see page 236, column 2)
< kris1>
v.t() * v would be a scalar, right?
< kris1>
So are you suggesting that we fill the D*1 vector using this scalar?
< lozhnikov>
yeah, all diagonal elements are equal
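The convention being agreed on here can be sketched in plain C++ (no Armadillo, names invented for illustration): since Lambda is diagonal and the paper's term is 0.5 * v^T * v, every diagonal element of the gradient is the same scalar, so the d*1 gradient vector is that scalar repeated d times.

```cpp
#include <cassert>
#include <vector>

// Sketch: compute the visible bias gradient as the scalar 0.5 * v^T v
// replicated into a d*1 vector (all diagonal elements are equal).
std::vector<double> VisibleBiasGradient(const std::vector<double>& input)
{
  double vTv = 0.0;
  for (const double v : input)
    vTv += v * v; // v^T * v is a scalar.

  // Fill the d*1 vector with the single value 0.5 * v^T v.
  return std::vector<double>(input.size(), 0.5 * vTv);
}
```

For example, for input (1, 2, 2), v^T v = 9 and every entry of the returned 3*1 gradient is 4.5.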
< kris1>
ok, I will change it accordingly
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris1>
lozhnikov: I made the changes you suggested, but the SIGKILL problem still persists. Mainly, std::memcpy is unsuccessful.
< kris1>
the program stops at ssRBM.hpp:22
< lozhnikov>
kris1: push the changes, then I'll take a look
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< lozhnikov>
kris1: The issue with ssRBM looks like the issue with GAN
< lozhnikov>
the implementation of the Reset() function is incorrect
< kris1>
The Reset() function of ssrbm.hpp, or the Reset() function of rbm.hpp?
< lozhnikov>
SpikeSlabLayer::Reset()
vivekp has quit [Ping timeout: 240 seconds]
< lozhnikov>
I mean that you cannot use this function if you didn't initialize the weights variable
< lozhnikov>
But you do that at temp.cpp:47 and temp.cpp:48
< kris1>
Can you send the temp.cpp file? I think I changed mine for a different test.
< kris1>
But why does the program work for smaller-sized inputs, i.e. why does ssRBMNetworkTest pass?
< kris1>
With GANs that was not the case
vivekp has joined #mlpack
< lozhnikov>
oh, I thought you asked about the test from your gist
< kris1>
Wait, I will send you the new test file that I am testing on.
< kris1>
Since the weight is d*k*n, the spike bias 1*n, the Lambda bias d*1, and the slab bias k*n, I think I can reduce the parameters. I am actually storing the full matrix right now.
< kris1>
So I guess I can do that.
partobs-mdp has joined #mlpack
< partobs-mdp>
zoq: rcurtin: implemented the changes from the review. Also, I moved gradient clipping and CrossEntropyError to a separate PR - so I did manage to get that code out of the way :)
< rcurtin>
sounds good---I guess you can just quote and respond to the comments in that PR, we can do it like that :)
< zoq>
Agreed! sounds good :)
< partobs-mdp>
Implemented the GradientClipping API as proposed by rcurtin. I'll try to test it now, but I'm starting to think that I just need a good sleep :)
< partobs-mdp>
(disclaimer: I haven't added docs so far, to limit the scope of the discussion to the API & code)
< rcurtin>
partobs-mdp: fair enough, but without docs you risk me getting confused when I review it later, so you may have to answer extra clarification questions :)
< partobs-mdp>
rcurtin: I want to clarify: do *you* feel OK about it? (I got so used to it during my high school years that it's not a problem for me ^_^)
< partobs-mdp>
If not, I will add docs before I fall asleep ;)
< partobs-mdp>
What do you think if I write a unit test that is identical to SimpleSGDTestFunction, except that it uses clipped gradients instead of StandardSGD?
< rcurtin>
partobs-mdp: that's fine with me, I'll review it either way
< rcurtin>
but I review a lot of things so typically when I review, there is little memory of what I previously looked at (although in this case we've discussed it enough now that I have it mostly cached in)
< rcurtin>
I'm happy to review it without documentation, it just increases P(misguided comment) :)
< rcurtin>
the unit test idea sounds fine---but it sounds like you should get some sleep too :)
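The gradient clipping being discussed can be sketched in a few lines of plain C++. This is only an illustrative sketch of the technique (clamp each gradient coordinate into a range before the SGD step); the actual GradientClipping API proposed in the PR may be shaped differently:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch: element-wise gradient clipping. Each coordinate of the
// gradient is clamped into [minValue, maxValue] before it is handed
// to the underlying optimizer update.
std::vector<double> ClipGradient(std::vector<double> gradient,
                                 const double minValue,
                                 const double maxValue)
{
  for (double& g : gradient)
    g = std::min(std::max(g, minValue), maxValue);
  return gradient;
}
```

A unit test in the spirit of the one proposed above would run an SGD-style loop with clipped gradients and check that it still converges on a simple test function.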
< partobs-mdp>
rcurtin: (Compiling unit test) During that time, I would like to ask some off-topic questions. Do you mind me doing it?
< partobs-mdp>
rcurtin: zoq: Meanwhile the unit test has compiled and run successfully. Pushed everything to the PR. I think I can safely go to sleep now :)
partobs-mdp has quit [Remote host closed the connection]
< rcurtin>
partobs-mdp: sure, you can always ask off-topic questions :)
< rcurtin>
unfortunately I couldn't respond because I'd stepped out!
< zoq>
ironstark: okay, the code looks good to me, I'll have to take a closer look into the issue
< zoq>
ironstark: I'll do this later today and get back to you.
< ironstark>
sure. Thanks a lot :)
mentekid has joined #mlpack
< kris1>
lozhnikov: I was able to remove the errors from the ssRBM. I added the classification test, but the problem seems to be that msgd runs into NaNs if we keep the number of iterations very high.
< kris1>
I will try to fix it if I can. Could you also have a look?
< zoq>
ironstark: Turns out, Weka expects that the unlabeled data set (test set) has a class; for me, this makes sense if someone wants to use the Weka Evaluation class, e.g. to get the MSE.
< zoq>
ironstark: However, for someone just interested in the prediction ... Anyway, let's add a pseudo class if there is no class; here is the updated code: https://gist.github.com/zoq/e387fd24b117890a141ebe7cff9c2abb - the interesting part is lines 52-61.
< zoq>
ironstark: Also, make sure to test on arff files, e.g. iris_train.arff, iris_test.arff, or you could use the convert function to convert csv to arff.