verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< haritha1313>
Here I suppose the embedding layer is supposed to give an output of 20 x 10, where 20 is the embedding size and 10 is the input vector's size.
< zoq>
haritha1313: right
< haritha1313>
I think the network is giving the right dimensions then. Sorry, I'm not able to point out what is wrong.
< haritha1313>
The output after using the merge model is of dimension 100 x 2, and this is because of the flattening of the input by the concat layer, as we discussed yesterday.
< haritha1313>
Is there anything I am missing?
< zoq>
haritha1313: The issue I see is that the concat layer returns 200 x 2, but the linear layer after the concat one expects 20 inputs: network.Add<Linear<> >(20, 5);
< zoq>
This could be a failure on my side.
< zoq>
I would expect a single sample as output (..., 1)
< zoq>
So, there might be a need for a flatten layer, or an option to flatten the output of the concat layer.
< haritha1313>
Sorry for the delay. I had gone for dinner.
< haritha1313>
Yes, that (20, 5) was written expecting the concat layer to give a 20 x 10 output.
< zoq>
haritha1313: Ahh, I see, that makes sense :)
< haritha1313>
After yesterday's discussion I worked on it so that the 200 x 2 output goes through the subview layer for flattening.
< haritha1313>
As we discussed earlier, subview will convert each batch into a single vector, so I thought that could just flatten it for us.
< zoq>
Right, using subview should work
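(A minimal sketch of the dimension bookkeeping discussed above, using plain Armadillo, which mlpack builds on. The sizes, embedding size 20, input length 10, batch size 2, are taken from the conversation; the flattening here only illustrates what the subview/flatten step has to produce, it is not mlpack's actual Subview implementation.)

    #include <armadillo>
    #include <iostream>

    int main()
    {
      // One embedded sample: 20 (embedding size) x 10 (input vector length).
      arma::mat embedded(20, 10, arma::fill::randu);

      // Flattening it gives a 200 x 1 column; with a batch of two samples the
      // concat layer ends up passing a 200 x 2 matrix to the next layer.
      arma::mat batch(200, 2);
      batch.col(0) = arma::vectorise(embedded);
      batch.col(1) = arma::vectorise(arma::mat(20, 10, arma::fill::randu));

      // So the linear layer that follows has to expect 200 inputs per sample,
      // e.g. Linear<>(200, 5) rather than Linear<>(20, 5).
      std::cout << batch.n_rows << " x " << batch.n_cols << std::endl;
      return 0;
    }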
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#5153 (master - a2abf9d : Marcus Edel): The build has errored.
< ShikharJ>
zoq: Sorry for reaching out late. Are you there?
< zoq>
ShikharJ: I'm here.
< ShikharJ>
zoq: I was re-thinking whether there's a need for implementing two separate modules for Weight Clipping and Gradient Penalty methods for WGAN.
< ShikharJ>
zoq: Both would probably require us to make certain changes to the Evaluate function and the Gradient routine of the WGAN.
< ShikharJ>
zoq: Their pseudocode implementations are pretty different.
< ShikharJ>
zoq: I'm not sure if the existing gradient_clipping class can be re-used. I'll have to investigate.
< zoq>
I think it's not the same; I was just thinking about the idea of implementing this as an update policy for the optimizer class.
< zoq>
We could even combine multiple methods using a parameter pack.
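(A rough sketch of what such an update policy could look like, mirroring the Initialize()/Update() shape of mlpack's SGD update policies. The class names, the clipping constant, and the parameter-pack composite are made up here for illustration; this is not existing mlpack code, and it assumes C++17 for std::apply and fold expressions.)

    #include <armadillo>
    #include <cstddef>
    #include <tuple>

    // Plain SGD step, mirroring the shape of a vanilla update policy.
    class PlainUpdate
    {
     public:
      void Initialize(const size_t /* rows */, const size_t /* cols */) { }

      void Update(arma::mat& iterate, const double stepSize,
                  const arma::mat& gradient)
      {
        iterate -= stepSize * gradient;
      }
    };

    // Hypothetical WGAN-style weight clipping: clamp the weights themselves
    // (not the gradient) to [-clip, clip].
    class WeightClipping
    {
     public:
      explicit WeightClipping(const double clip = 0.01) : clip(clip) { }

      void Initialize(const size_t /* rows */, const size_t /* cols */) { }

      void Update(arma::mat& iterate, const double /* stepSize */,
                  const arma::mat& /* gradient */)
      {
        iterate = arma::clamp(iterate, -clip, clip);
      }

     private:
      double clip;
    };

    // Hypothetical composite that chains several policies via a parameter
    // pack; e.g. CombinedUpdate<PlainUpdate, WeightClipping> first takes the
    // step, then clips the weights.
    template<typename... Policies>
    class CombinedUpdate
    {
     public:
      void Initialize(const size_t rows, const size_t cols)
      {
        std::apply([&](auto&... p) { (p.Initialize(rows, cols), ...); },
                   policies);
      }

      void Update(arma::mat& iterate, const double stepSize,
                  const arma::mat& gradient)
      {
        std::apply([&](auto&... p) { (p.Update(iterate, stepSize, gradient), ...); },
                   policies);
      }

     private:
      std::tuple<Policies...> policies;
    };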
< ShikharJ>
zoq: I see, in gradient_clipping, first the clipping is done and then the update is done. This is not the same as the original WGAN algorithm.
< zoq>
I don't mind implementing this explicitly for the GAN class.
< ShikharJ>
zoq: Exactly, and don't forget, clipping is done only in the discriminator, so this has to be done inside Gradient / Evaluate (wherever we calculate the gradients for the discriminator).
< zoq>
It might be too specific for the optimizer class policy.
< zoq>
You are right.
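(To make the difference concrete, a tiny Armadillo sketch of the two orderings. The matrices and the constants are placeholders; it only shows that gradient_clipping clamps the gradient before the step, while the WGAN scheme steps first and then clamps the weights, and in the GAN setting only the discriminator parameters would be clamped.)

    #include <armadillo>

    int main()
    {
      const double stepSize = 0.0001, c = 0.01, maxClip = 1.0;

      arma::mat weights(10, 10, arma::fill::randu);
      arma::mat gradient(10, 10, arma::fill::randu);

      // gradient_clipping style: clamp the *gradient* first, then take the step.
      arma::mat clippedGrad = arma::clamp(gradient, -maxClip, maxClip);
      arma::mat weightsA = weights - stepSize * clippedGrad;

      // WGAN weight clipping: take the step first, then clamp the *weights*
      // to [-c, c]. In the GAN this would apply to the discriminator
      // parameters only; the generator update stays an ordinary step.
      arma::mat stepped = weights - stepSize * gradient;
      arma::mat weightsB = arma::clamp(stepped, -c, c);

      (void) weightsA; (void) weightsB;
      return 0;
    }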
< ShikharJ>
zoq: I'll formulate a basic API over the next couple of days, and maybe we can discuss further then.
< zoq>
This sounds like a great idea to me. I guess we could reuse some ideas from the optimizer class (the policy design); this might be useful for enabling/disabling certain features.