verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sumedhghaisas has quit [Ping timeout: 268 seconds]
vivekp has joined #mlpack
sumedhghaisas has joined #mlpack
partobs-mdp has quit [Remote host closed the connection]
sumedhghaisas has quit [Ping timeout: 268 seconds]
kris1 has joined #mlpack
PhilippeC has joined #mlpack
PhilippeC has quit [Client Quit]
< lozhnikov>
kris1: I pointed out the error that leads to segfault at github.
< kris1>
Ahhh, I will have a look.
< kris1>
Thanks
sumedhghaisas has joined #mlpack
aashay has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris1>
lozhnikov: Regarding the weight shape comment you gave in ssRBM: I was just following the convention of outsize * insize.
< kris1>
Do you want me to change that?
< lozhnikov>
kris1: I think that's not important. But I prefer the same notation as the paper suggests.
< kris1>
Hmmm, okay, then I think we should change it.
< kris1>
Also, I don't understand this.
< kris1>
I think we have to restrict the number of iterations here.
< kris1>
this is for rejection sampling.
< lozhnikov>
I think we shouldn't do more than N iterations
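The capped rejection-sampling loop being discussed can be sketched as follows. This is a minimal plain-C++ sketch, not mlpack's implementation; the function name, the bounds, and the clamp-after-N-rejections fallback are assumptions for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <random>

// Sketch: draw from a Gaussian via rejection sampling against a bounded
// domain [lo, hi], but give up after maxIterations draws and clamp the
// last sample into the domain instead of looping forever.
double SampleTruncatedGaussian(const double mean, const double stddev,
                               const double lo, const double hi,
                               const size_t maxIterations,
                               std::mt19937& rng)
{
  std::normal_distribution<double> dist(mean, stddev);
  double sample = mean;
  for (size_t i = 0; i < maxIterations; ++i)
  {
    sample = dist(rng);
    if (sample >= lo && sample <= hi)
      return sample; // Accepted: inside the bounded domain.
  }
  // Fallback after N rejections: clamp into the domain, so the sampler
  // always terminates and the returned value stays feasible.
  return std::min(std::max(sample, lo), hi);
}
```

With this fallback the sampler never returns a value outside the domain, which is one way to address the "what if the sample is still not feasible after N tries" concern raised below.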
< kris1>
But what if the sample is still not in the feasible region?
< lozhnikov>
Why do you think that is so critical?
< lozhnikov>
I think this is not a problem
< kris1>
Well, the paper mentions that if we don't do this, the variance matrix would not remain PSD.
< lozhnikov>
Could you show me the page and the line? I don't see that
< lozhnikov>
The paper states only "This restriction to a finite domain guarantees that the partition function Z remains finite"
< kris1>
sampling from this Gaussian is straightforward, when using rejection sampling to exclude v outside the bounded domain.
< kris1>
on page 235, column 2
sumedhghaisas has quit [Ping timeout: 268 seconds]
< lozhnikov>
hm... This doesn't influence the matrix properties
< kris1>
Also page 237 column1 starting paragraph
< lozhnikov>
okay, I agree with the arguments on page 237. But I still think it is not critical if samples sometimes occur outside of the region.
aashay has quit [Quit: Connection closed for inactivity]
< kris1>
lozhnikov: I think input * input.t() is correct there, as you would get a d*d matrix. But the size of visibleBias/LambdaBias is also d*1. We obviously have to make the visibleBias positive gradient d*1.
< kris1>
You said we should use input.t() * input, which would be a scalar. Remember that input is a d*1 vector.
< kris1>
visibleBiasPositiveGrad.diag() extracts the diagonal elements from the matrix and hence makes visibleBiasPositiveGradient d*1
< kris1>
so I think it is correct that way
< kris1>
Let me know what you think
< lozhnikov>
Lambda is a diagonal matrix. Hence input * input.t() is definitely incorrect
< lozhnikov>
Moreover the paper suggests 0.5 v^T * v (see page 236, column 2)
< kris1>
v.t() * v would be a scalar, right?
< kris1>
So are you suggesting that we fill the D*1 vector using this scalar?
< lozhnikov>
yeah, all diagonal elements are equal
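The convention being agreed on here can be sketched in plain C++ (no Armadillo, names invented for illustration): since Lambda is diagonal and the paper's term is 0.5 * v^T * v, every diagonal element of the gradient is the same scalar, so the d*1 gradient vector is that scalar repeated d times.

```cpp
#include <cassert>
#include <vector>

// Sketch: compute the visible bias gradient as the scalar 0.5 * v^T v
// replicated into a d*1 vector (all diagonal elements are equal).
std::vector<double> VisibleBiasGradient(const std::vector<double>& input)
{
  double vTv = 0.0;
  for (const double v : input)
    vTv += v * v; // v^T * v is a scalar.

  // Fill the d*1 vector with the single value 0.5 * v^T v.
  return std::vector<double>(input.size(), 0.5 * vTv);
}
```

For example, for input (1, 2, 2), v^T v = 9 and every entry of the returned 3*1 gradient is 4.5.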
< kris1>
ok, I will change it accordingly
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris1>
lozhnikov: I made the changes you suggested, but the SIGKILL problem still persists. Mainly, std::memcpy is unsuccessful.
< kris1>
the program stops at ssRBM.hpp:22
< lozhnikov>
kris1: push the changes, then I'll take a look
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< lozhnikov>
kris1: The issue with ssRBM looks like the issue with GAN
< lozhnikov>
the implementation of the Reset() function is incorrect
< kris1>
The Reset() function of ssrbm.hpp, or the Reset() function of rbm.hpp?
< lozhnikov>
SpikeSlabLayer::Reset()
vivekp has quit [Ping timeout: 240 seconds]
< lozhnikov>
I mean that you cannot use this function if you didn't initialize the weights variable
< lozhnikov>
But you do that at temp.cpp:47 and temp.cpp:48
< kris1>
Can you send the temp.cpp file? I think I changed mine for a different test.
< kris1>
But why does the program work for smaller-sized inputs, i.e. why does ssRBMNetworkTest pass?
< kris1>
With GANs that was not the case
vivekp has joined #mlpack
< lozhnikov>
oh, I thought you asked about the test from your gist
< kris1>
Wait, I will send you the new test file that I am testing on.
< kris1>
Since the weight is d*k*n, the spike bias 1*n, the Lambda bias d*1, and the slab bias k*n, I think I can reduce the parameters. I am actually storing the full matrix right now.
< kris1>
So I guess I can do that.
partobs-mdp has joined #mlpack
< partobs-mdp>
zoq: rcurtin: implemented the changes from the review. Also, I moved gradient clipping and CrossEntropyError to a separate PR - so I did manage to get that code out of the way :)
< rcurtin>
sounds good---I guess you can just quote and respond to the comments in that PR, we can do it like that :)
< zoq>
Agreed! sounds good :)
< partobs-mdp>
Implemented the GradientClipping API as proposed by rcurtin. I'll try to test it now, but I'm starting to think that I just need a good sleep :)
< partobs-mdp>
(disclaimer: I haven't added docs so far, to limit the scope of the discussion to the API & code)
< rcurtin>
partobs-mdp: fair enough, but without docs you risk me getting confused when I review it later, so you may have to answer extra clarification questions :)
< partobs-mdp>
rcurtin: I want to clarify: do *you* feel OK about it? (I got so used to it during my high school years that it's not a problem for me ^_^)
< partobs-mdp>
If not, I will add docs before I fall asleep ;)
< partobs-mdp>
What do you think if I write a unit test that is identical to SimpleSGDTestFunction, except that it uses clipped gradients instead of StandardSGD?
< rcurtin>
partobs-mdp: that's fine with me, I'll review it either way
< rcurtin>
but I review a lot of things so typically when I review, there is little memory of what I previously looked at (although in this case we've discussed it enough now that I have it mostly cached in)
< rcurtin>
I'm happy to review it without documentation, it just increases P(misguided comment) :)
< rcurtin>
the unit test idea sounds fine---but it sounds like you should get some sleep too :)
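The gradient clipping being discussed can be sketched in a few lines of plain C++. This is only an illustrative sketch of the technique (clamp each gradient coordinate into a range before the SGD step); the actual GradientClipping API proposed in the PR may be shaped differently:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch: element-wise gradient clipping. Each coordinate of the
// gradient is clamped into [minValue, maxValue] before it is handed
// to the underlying optimizer update.
std::vector<double> ClipGradient(std::vector<double> gradient,
                                 const double minValue,
                                 const double maxValue)
{
  for (double& g : gradient)
    g = std::min(std::max(g, minValue), maxValue);
  return gradient;
}
```

A unit test in the spirit of the one proposed above would run an SGD-style loop with clipped gradients and check that it still converges on a simple test function.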
< partobs-mdp>
rcurtin: (Compiling unit test) During that time, I would like to ask some off-topic questions. Do you mind me doing it?
< partobs-mdp>
rcurtin: zoq: Meanwhile the unit test has compiled and run successfully. Pushed everything to the PR. I think I can safely go to sleep now :)
partobs-mdp has quit [Remote host closed the connection]
< rcurtin>
partobs-mdp: sure, you can always ask off-topic questions :)
< rcurtin>
unfortunately I couldn't respond because I'd stepped out!
< zoq>
ironstark: okay, the code looks good to me, I'll have to take a closer look into the issue
< zoq>
ironstark: I'll do this later today and get back to you.
< ironstark>
sure. Thanks a lot :)
mentekid has joined #mlpack
< kris1>
lozhnikov: I was able to remove the errors from the ssRBM. I added the classification test, but the problem seems to be that msgd runs into NaNs if we keep the number of iterations very high.
< kris1>
I will try to fix it if I can. Could you also have a look?
< zoq>
ironstark: Turns out, Weka expects that the unlabeled data set (test set) has a class; for me, this makes sense if someone wants to use the Weka Evaluation class, e.g. to get the MSE.
< zoq>
ironstark: However, for someone just interested in the prediction ... Anyway, let's add a pseudo class if there is no class; here is the updated code: https://gist.github.com/zoq/e387fd24b117890a141ebe7cff9c2abb - the interesting part is lines 52-61.
< zoq>
ironstark: Also, make sure to test on arff files, e.g. iris_train.arff, iris_test.arff, or you could use the convert function to convert csv to arff.