verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
kris1 has quit [Quit: kris1]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
kris__ has quit [Quit: Connection closed for inactivity]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#3244 (mlpack-2.2.x - 4f6f2e5 : Ryan Curtin): The build has errored.
travis-ci has left #mlpack []
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
kris1 has joined #mlpack
kris__ has joined #mlpack
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
vivekp has quit [Ping timeout: 255 seconds]
vivekp has joined #mlpack
< kris__> Hi lozhnikov
< kris__> I was able to fix the oreilly test.
< kris__> The problem is that it's very slow,
< kris__> so I don't think I'll be able to get the optimal arguments for the test in time.
< kris__> You can try it with these arguments: -i train7.txt -o output.txt -e 20 -m 2000 -x 3 -N 100 -r 0.003 -v
< kris__> Other than that, I was able to fix the ssRBM test last night, i.e., I found parameters that give good accuracy in less time. I updated the PR accordingly.
< kris__> I also added the test for GAN, as we discussed, using a pre-trained GAN on the Gaussian dataset.
< kris__> It would be great if you could give the GAN PR a final review. I would like to fix the errors and then write the final blog post. Meanwhile, the test for the oreilly example is running on my system.
< lozhnikov> kris__: I suggested other parameters at github (https://github.com/mlpack/mlpack/pull/1046#issuecomment-325116460).
< lozhnikov> I'll look through the GAN PR today
< kris__> Great... on my system the oreilly example takes around 15 min for one batch.
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< lozhnikov> I'll start the test soon
sumedhghaisas has quit [Ping timeout: 246 seconds]
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
miagar has joined #mlpack
miagar has quit [Client Quit]
sumedhghaisas has joined #mlpack
maigar has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
< lozhnikov> kris__: Looks like you made an error somewhere. I got the following output:
< lozhnikov> [INFO ] gradientDiscriminator = 0.000000e+00
< lozhnikov> [INFO ] gradientGenerator = 0.000000e+00
sumedhghaisas has joined #mlpack
< kris__> Hmmm, yes, the first gradient is zero in my case also...
< kris__> but after that it starts "converging"
< lozhnikov> 5e-290 is too small
< kris1> Well, it actually alternates: it goes to 0.39 too, and at one training iteration it was 2000.
< kris__> Also, I did this with a training size of 200. I will try training size = 2000.
< lozhnikov> kris__ : I can't reproduce that; I get zeros each time. I tried reducing the size of the dataset and got the same result. I guess there is an error in the layer structure.
maigar has quit [Ping timeout: 260 seconds]
< kris__> Well, I actually tested both the discriminator and the generator separately. I will send the code; try running it and see if it works for you.
< lozhnikov> kris__: I found the issue. The discriminator network shouldn't contain a SigmoidLayer. But without it the Evaluate() function returns NaN. I wrote about that yesterday.
< kris__> The discriminator by itself trains fine...
< kris__> The output is in the comments...
< kris__> Yes, I found that not having a sigmoid layer was causing NaNs; that's why I added it...
< lozhnikov> again, the discriminator shouldn't contain the sigmoid layer. Look at the oreilly example
< lozhnikov> and that's why the gradients are too small
< kris__> The generator network alone also trains fine: https://gist.github.com/kris-singh/55f84f603aa1e84555a8f0ab1812a34d
< kris__> Yes, but without the sigmoid I get NaNs in the output.
< lozhnikov> Probably changing the network structure is not a good idea. In my opinion, it's better to figure out why the Evaluate() function returns NaN.
< kris__> sigmoid_cross_entropy_with_logits operates on unscaled values rather than probability values from 0 to 1. Take a look at the last line of our discriminator: there's no softmax or sigmoid layer at the end. GANs can fail if their discriminators "saturate," or become confident enough to return exactly 0 when they're given a generated image; that leaves the discriminator without a useful gradient to descend.
< kris__> This is from the tutorial...
< kris__> I mean the oreilly example...
< kris__> I don't know if the cross entropy implemented in mlpack works the same way; I will take a look.
< kris__> Okay, so the softmax is applied at the loss-function level in the oreilly example, while we apply it at the architecture level.
< lozhnikov> hmm, they added a workaround in order to avoid overflow. So this implementation differs from ours.
< kris__> Where? Could you point me to it?
< kris__> Is it the reduce_mean function?
< lozhnikov> sigmoid_cross_entropy_with_logits differs from SoftMax+CrossEntropy
< kris__> Well, softmax for one class is the same as a sigmoid. I do agree about the overflow part, though.
< lozhnikov> I think that could be the reason for the small gradients
< kris__> I don't understand the eps part in mlpack's cross entropy implementation.
< lozhnikov> I think epsilon is added in order to avoid NaNs
< kris__> I can edit the cross entropy layer to take care of overflow, but I am not sure about the backprop.
< kris__> Ahhh, no, the backprop is easy too.
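(For reference: the gradient of sigmoid cross entropy with logits with respect to the logit x simplifies to sigmoid(x) - z, where z is the label, which is why the backward pass is simple.)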
< lozhnikov> no, I don't think so, since the overflow happens in exp(-x), i.e. in the sigmoid layer
< kris__> The logistic function is implemented to avoid overflows like that.
< kris__> lines 41-47 in logistic_function.hpp
< lozhnikov> actually, no
< lozhnikov> if (x < arma::Datum<eT>::log_max)
< lozhnikov> {
< lozhnikov> if (x > -arma::Datum<eT>::log_max)
< lozhnikov> return 1.0 / (1.0 + std::exp(-x));
< lozhnikov> return 0.0;
< lozhnikov> }
< lozhnikov> return 1.0;
< lozhnikov> it handles overflows differently
< lozhnikov> the present implementation just clamps extreme values to exactly 0 or 1
< kris__> Hmmm, should I just implement sigmoid_cross_entropy_with_logits?
< lozhnikov> looks reasonable to me, it shouldn't take a lot of time
< kris__> Okay, I will do that straight away. I will use the same branch, though; otherwise I have to switch, and rebuilding takes around 20-25 minutes. Is that okay?
< lozhnikov> okay. On the other hand you can clone the repo into a separate directory in order to avoid rebuilding
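A possible workflow for the separate clone (directory and branch names here are only examples):

    git clone https://github.com/mlpack/mlpack.git mlpack-sigmoid-ce
    cd mlpack-sigmoid-ce
    git checkout -b sigmoid-cross-entropy-layer
    mkdir build && cd build && cmake .. && make -j4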
< kris__> Just one question: in the equation max(x, 0) - x * z + log(1 + exp(-abs(x))), the x's are scalars, right?
< kris__> So in the case of arma::mat we have to do every operation element-wise.
< lozhnikov> sure
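A minimal element-wise sketch of this stable formulation in Armadillo follows. The free functions Forward()/Backward() and their signatures are illustrative only, not the actual mlpack layer API; the backward pass uses the simplification sigmoid(x) - z noted earlier.

    #include <armadillo>
    #include <iostream>

    // Stable per-element loss: max(x, 0) - x * z + log(1 + exp(-|x|)).
    double Forward(const arma::mat& input, const arma::mat& target)
    {
      arma::mat losses =
          arma::max(input, arma::zeros<arma::mat>(arma::size(input)))
          - input % target
          + arma::log(1.0 + arma::exp(-arma::abs(input)));
      // arma::accu sums all elements; dividing by losses.n_elem would
      // mimic tf.reduce_mean instead.
      return arma::accu(losses);
    }

    // Gradient with respect to the logits: sigmoid(x) - z, element-wise.
    void Backward(const arma::mat& input, const arma::mat& target,
                  arma::mat& output)
    {
      output = 1.0 / (1.0 + arma::exp(-input)) - target;
    }

    int main()
    {
      // Sanity check: logit x = 0.5, label z = 0 should give ~0.97407699.
      arma::mat x(1, 1), z(1, 1);
      x(0) = 0.5; z(0) = 0.0;
      std::cout << Forward(x, z) << std::endl;
    }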
sumedhghaisas has quit [Read error: Connection reset by peer]
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris__> lozhnikov: should I use arma::accu or arma::mean in the forward pass of the loss function?
< lozhnikov> kris__: yeah, I think you should apply arma::accu
< kris__> Most of the examples I saw were actually using tf.reduce_mean.
manjuransari has joined #mlpack
manjuransari has quit [Quit: Page closed]
< kris__> lozhnikov: I have implemented the layer,
< kris__> but some tests are failing.
< kris__> E.g., when the label is 0 and the input is 0.5, my output is 0.29...
< kris__> but the tf output is 0.97407699
< kris__> Okay, I figured it out.
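(Checking the stable formula for that case: with x = 0.5 and z = 0, max(0.5, 0) - 0.5 * 0 + log(1 + exp(-0.5)) = 0.5 + 0.474077 = 0.974077, which matches the TF value 0.97407699.)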
mikeling has quit [Quit: Connection closed for inactivity]
< lozhnikov> kris__: Great! Could you share the code? I'll look through that tomorrow
< kris__> Sure, I will create a new PR for it. I think that's better.
< lozhnikov> could you cherry-pick commit "Fix depth for bilinear function" (07972dd26e362f442b3a2a5b746a098ccee220fd) to the ResizeLayer branch?
< kris__> I don't understand; cherry-pick it where? Which PR are you talking about?
< lozhnikov> you changed the ResizeLayer implementation inside the GAN branch
< kris__> Okay, you want me to move the commit "Fix depth for bilinear function" onto the ResizeLayer branch.
< lozhnikov> yeah
< kris__> Okay, I will do that.
< lozhnikov> ok, thanks
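For reference, the cherry-pick would look something like this (the local branch name is only an example; the hash is the one mentioned above):

    git checkout resize-layer
    git cherry-pick 07972dd26e362f442b3a2a5b746a098ccee220fd
    git push origin resize-layer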
< kris__> I just wanted to ask: can we merge RBM and ssRBM after I add the parameters that you sent in the patch?
< lozhnikov> I have to look through the whole PR again
< kris__> Sure... it would be good if we could merge something before Tuesday.
< lozhnikov> And I think we should ask Marcus. Maybe he wants to add something
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
< kris__> cstdin not found with mlpack 5.5,
< kris__> using clang. Any help?
< kris__> I found that using -stdlib=libc++ works.
< kris__> I see that this is already done in the CMake file, but it does not work for me.
< kris__> I am using clang, btw...
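One way to pass the flag explicitly when configuring, in case the CMake default isn't picked up (the flags are standard CMake options; whether this resolves the header issue is an assumption):

    cd build
    cmake -DCMAKE_CXX_FLAGS="-stdlib=libc++" \
          -DCMAKE_EXE_LINKER_FLAGS="-stdlib=libc++" ..
    make -j4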