verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
kris1 has quit [Quit: kris1]
< lozhnikov>
kris__: you didn't take care of the depth. Each convolution layer produces an output of size 28*28*depth. So you should resize each slice of the image.
< lozhnikov>
I see two possible ways:
< lozhnikov>
1. Add the third dimension to BiLinearFunction in such a way that the input has the shape (x, y, z) and the output has the shape (nx, ny, nz).
< lozhnikov>
2. Add the number of slices to BiLinearFunction in such a way that the input has the shape (x, y, depth) and the output has the shape (nx, ny, depth), i.e. the depth stays constant.
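To make option 2 concrete, here is a rough standalone sketch (the function name ResizeWithDepth and the simple bilinear blend are illustrative only, not the actual BiLinearFunction code) of resizing a flattened inRowSize x inColSize x depth column slice by slice, keeping the depth constant:

    #include <algorithm>
    #include <armadillo>

    // Sketch only: resize a flattened inRowSize x inColSize x depth input to
    // outRowSize x outColSize x depth, interpolating each slice independently.
    arma::vec ResizeWithDepth(const arma::vec& input,
                              const size_t inRowSize, const size_t inColSize,
                              const size_t outRowSize, const size_t outColSize,
                              const size_t depth)
    {
      arma::cube out(outRowSize, outColSize, depth);
      const double scaleR = (double) inRowSize / outRowSize;
      const double scaleC = (double) inColSize / outColSize;

      for (size_t k = 0; k < depth; ++k)          // the depth stays constant
      {
        // Offset of slice k in the flat input column.
        const size_t sliceOffset = k * inRowSize * inColSize;
        for (size_t j = 0; j < outColSize; ++j)
        {
          for (size_t i = 0; i < outRowSize; ++i)
          {
            // Nearest lower source pixel, clamped so the 2x2 patch stays inside.
            const size_t r = std::min((size_t)(i * scaleR), inRowSize - 2);
            const size_t c = std::min((size_t)(j * scaleC), inColSize - 2);
            const double dr = i * scaleR - r;
            const double dc = j * scaleC - c;

            // Column-major flat index of (row, col) inside slice k.
            auto at = [&](size_t row, size_t col)
            { return input[sliceOffset + col * inRowSize + row]; };

            // Standard bilinear blend of the four neighbouring pixels.
            out(i, j, k) = at(r, c) * (1 - dr) * (1 - dc) +
                           at(r + 1, c) * dr * (1 - dc) +
                           at(r, c + 1) * (1 - dr) * dc +
                           at(r + 1, c + 1) * dr * dc;
          }
        }
      }

      return arma::vec(out.memptr(), out.n_elem);
    }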
kris__ has quit [Quit: Connection closed for inactivity]
sumedhghaisas has joined #mlpack
kris__ has joined #mlpack
kris1 has joined #mlpack
< kris__>
Hey lozhnikov,
< kris__>
I don't think that's entirely correct. The linear layer doesn't support a depth parameter if you look at it, and it can still be used successfully with the convolution layer.
< kris__>
I will just check why the conv --> linear works...
< kris__>
and then see.
< lozhnikov>
there's no problem using the linear layer with the convolution one. But the resize layer accepts an input of size 28*28, whereas the convolution layer provides an output of size 28 * 28 * depth
< kris__>
Hmmm, in that case a quick solution would be to just overload the constructor of the bilinear function with (inRowSize, inColSize, outRowSize, outColSize, depth).
< lozhnikov>
I agree
< kris__>
I'm just thinking we would have to change the indexing to input(i, j, d) = d * depth + j * colSize + i.
< kris__>
That would require using 3 loops. That makes this pretty expensive.
< lozhnikov>
no, that's incorrect. I think (input(i, j, k) = k * colSize * rowSize + j * colSize + i) is correct
< kris__>
yes sure that still requires using 3 loops though.
< lozhnikov>
sure, the input is 3-dimensional. therefore it requires 3 loops
< kris__>
the input can also have 32 channels (depth = 32), so I think this would be pretty slow. But I think that's the quick solution. Let me implement that...
< lozhnikov>
I don't see any problems here. The O'Reilly example does the same and works very well
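For reference, a tiny standalone check of the layout being assumed above, i.e. how Armadillo stores a rows x cols x depth cube slice by slice in column-major order (plain Armadillo, not mlpack code):

    #include <armadillo>
    #include <cassert>

    int main()
    {
      const size_t rows = 4, cols = 3, depth = 2;
      arma::cube c(rows, cols, depth);
      c.randu();

      // Element (i, j, k) lives at flat offset k * rows * cols + j * rows + i.
      for (size_t k = 0; k < depth; ++k)
        for (size_t j = 0; j < cols; ++j)
          for (size_t i = 0; i < rows; ++i)
            assert(c(i, j, k) == c.memptr()[k * rows * cols + j * rows + i]);

      return 0;
    }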
< kris__>
Okay, I updated the code directly in the GAN PR.
< kris__>
You could have a look.
< lozhnikov>
looks like the changes are correct
sumedhghaisas has quit [Ping timeout: 240 seconds]
partobs-mdp has joined #mlpack
< partobs-mdp>
zoq: Can't figure out the issue with zero_init.hpp not found
< partobs-mdp>
On my computer everything compiles correctly even after make clean
< partobs-mdp>
In CMakeLists it is included, so no idea why Travis doesn't see it
< partobs-mdp>
Could you take a look? (My plane is leaving in 4 hours, so I would appreciate it if someone could respond ASAP)
< zoq>
partobs-mdp: Looks like zero_init was renamed to const_init, so if you swap zero_init with const_init it should work.
< zoq>
partobs-mdp: You might also have to add a default constructor to the ConstInitialization class, that uses 0 as initVal.
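A minimal sketch of the suggested default constructor, assuming ConstInitialization keeps the constant in a member named initVal, as its existing constructor suggests:

    // Sketch of the suggested addition (member name initVal assumed from the
    // existing ConstInitialization constructor).
    class ConstInitialization
    {
     public:
      // Default-construct with 0 so the class can stand in for the old
      // ZeroInitialization.
      ConstInitialization() : initVal(0) { }

      ConstInitialization(const double initVal) : initVal(initVal) { }

     private:
      double initVal;
    };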
kris1 has quit [Quit: kris1]
< kris__>
lozhnikov: the code still behaves pretty weirdly
< kris__>
Different errors on different runs, and sometimes it runs successfully
< partobs-mdp>
(instead of NetworkInitialization<> networkInit();)
< zoq>
Can you push the code?
< partobs-mdp>
Pushed
< partobs-mdp>
Looked at the code more carefully - I found that someone has removed the offset parameter
< partobs-mdp>
Even though it would cause an error otherwise, that is still not the true reason for the error message - it still crashed even after I restored the old implementation of the method
< partobs-mdp>
zoq: Any ideas how to fix that?
< partobs-mdp>
Fixed the error by removing parentheses: NetworkInitialization<ConstInitialization> networkInit;
< partobs-mdp>
Why would putting empty parentheses matter? (provided I have a constructor that takes no parameters)
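For the record, the empty parentheses matter because of standard C++ parsing rules (the "most vexing parse"), independent of mlpack: T x(); declares a function named x returning T, not a default-constructed object. A small illustration:

    // Standard C++ behaviour, unrelated to mlpack: empty parentheses turn the
    // intended object definition into a function declaration.
    struct Init { void Run() { } };

    Init a;     // object, default-constructed
    Init b();   // "most vexing parse": declares a function b() returning Init
    Init c{};   // C++11 brace form: object, default-constructed

    int main()
    {
      a.Run();     // fine
      // b.Run();  // error: b is a function, not an object
      c.Run();     // fine
      return 0;
    }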
< lozhnikov>
kris__: Moreover, I pointed out a few issues on GitHub. There are still some errors: the Evaluate() function returns NaN. Try to figure out which layer produces NaNs and why
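One way to narrow down where the NaNs first appear, sketched with placeholder names (LayerType and its Forward() signature stand in for whatever the GAN code actually uses); Armadillo's has_nan() does the checking:

    #include <armadillo>
    #include <iostream>
    #include <vector>

    // Illustrative NaN hunt: run the forward pass layer by layer and report
    // the first layer whose output contains a NaN.
    template<typename LayerType>
    void FindNaNLayer(std::vector<LayerType*>& layers, const arma::mat& input)
    {
      arma::mat current = input;
      for (size_t l = 0; l < layers.size(); ++l)
      {
        arma::mat next;
        layers[l]->Forward(std::move(current), std::move(next));
        if (next.has_nan())
        {
          std::cout << "Layer " << l << " produced NaNs." << std::endl;
          return;
        }
        current = std::move(next);
      }
      std::cout << "No NaNs in the forward pass." << std::endl;
    }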
sumedhghaisas has joined #mlpack
kris__ has quit [Quit: Connection closed for inactivity]
kris1 has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
johnlennon has joined #mlpack
johnlennon has quit [Client Quit]
kris__ has joined #mlpack
sumedhghaisas has joined #mlpack
< sumedhghaisas>
zoq: Hey Marcus... there?
< zoq>
sumedhghais: yes about to step out.
< sumedhghaisas>
zoq: Will catch up with you later then. Wanted to talk to you about that Windows problem
< sumedhghaisas>
Have fixed all the other comments
< sumedhghaisas>
Also fixing the batch norm
< zoq>
sumedhghais: Commented on the PR, just a few seconds ago.
< sumedhghaisas>
ahh... okay. I think that should work for now. I am not sure about the performance comparison
< sumedhghaisas>
And the batch norm implementation is not passing the gradients tests
< zoq>
hm, okay for the linear layer, I guess
< sumedhghaisas>
zoq: sorry, didn't get that. Okay for the linear layer?
< zoq>
You tested the batchnorm with the linear layer and not with the conv layer?
< sumedhghaisas>
ahh I know the problem... the output is always zero, because there is only a single entry in the batch
< sumedhghaisas>
how do I use multiple entries in a batch?
< sumedhghaisas>
yeah, with a linear layer. The gradient tests should pass, right? Or am I missing something?
< zoq>
I'm about to refactor the FFN class so that it works with real batches; for now you have to manually pass a batch to the layer with n_cols > 1.
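The single-entry effect is easy to see outside the layer: batch norm subtracts the per-feature mean taken across the columns of the batch, and with one column the mean is the sample itself, so the normalised output is identically zero. A plain Armadillo sketch, independent of the BatchNorm implementation in the PR:

    #include <armadillo>
    #include <iostream>

    int main()
    {
      arma::mat single(5, 1, arma::fill::randn);   // one sample in the batch
      arma::mat batch(5, 32, arma::fill::randn);   // 32 samples in the batch

      // Normalise each feature (row) across the batch (columns).
      auto normalise = [](const arma::mat& x)
      {
        arma::mat mean = arma::repmat(arma::mean(x, 1), 1, x.n_cols);
        arma::mat var  = arma::repmat(arma::var(x, 1, 1), 1, x.n_cols);
        return arma::mat((x - mean) / arma::sqrt(var + 1e-8));
      };

      std::cout << normalise(single) << std::endl;  // all zeros
      std::cout << normalise(batch)  << std::endl;  // non-trivial values
      return 0;
    }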
< sumedhghaisas>
ooohhh... okay. What is the refactoring? maybe I can help a little? if we finish it ... then I can properly test batch norm...
< zoq>
Currently the FFN class splits the input using .col, so basically all we have to do is to use cols or submat. Rajiv already put a lot of work into supporting batches: https://github.com/mlpack/mlpack/pull/1073
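In Armadillo terms, the difference is just which part of the predictor matrix reaches the layer; a generic illustration (not the actual FFN code, and the batch size of 32 is arbitrary):

    #include <algorithm>
    #include <armadillo>

    int main()
    {
      arma::mat predictors(10, 256, arma::fill::randn);
      const size_t batchSize = 32;

      for (size_t begin = 0; begin < predictors.n_cols; begin += batchSize)
      {
        // .col(begin) would hand the layer one sample;
        // .cols(begin, end) hands it the whole mini-batch (n_cols > 1).
        const size_t end =
            std::min(begin + batchSize, (size_t) predictors.n_cols) - 1;
        arma::mat batch = predictors.cols(begin, end);
        // layer.Forward(std::move(batch), ...);
      }
      return 0;
    }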
< sumedhghaisas>
Okay I will try to take a look at his code
sumedhghaisas has quit [Ping timeout: 246 seconds]