verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
deep-book-gk_ has joined #mlpack
deep-book-gk_ has left #mlpack []
govg has joined #mlpack
kris1 has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
andrzejku has joined #mlpack
andrzejku has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
shikhar has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
andrzejku has joined #mlpack
andrzejku has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
andrzejku has joined #mlpack
andrzejku has quit [Read error: Connection reset by peer]
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
partobs-mdp has joined #mlpack
andrzejku has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
govg has quit [Ping timeout: 260 seconds]
sumedhghaisas_ has joined #mlpack
< sumedhghaisas_> zoq: Hey Marcus... I am a little confused about the whole batch thing. How does it work?
shikhar has quit [Quit: WeeChat 1.7]
< sumedhghaisas_> I mean right now while training RNN we train each sequence separately... so how are we providing batches?
andrzejku has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
< zoq> sumedhghais: Hello, right now the input holds a single sequence, so input.n_cols is 1. To support batches, each layer should handle input.n_cols > 1, where each column holds another sequence. Right now the RNN class doesn't support batches, but it will in the near future.
< zoq> Shangtong already refactored most of the layers, e.g. for the linear layer, instead of using 'output = (weight * input) + bias;' we do
< zoq> output = weight * input;
< zoq> output.each_col() += bias;
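A minimal sketch of that batch-aware linear forward pass, assuming a per-column batch layout and hypothetical member names (not the exact mlpack implementation):

    #include <armadillo>

    // Sketch of a batch-aware linear forward pass: each column of `input`
    // is one sample, so the bias is added once per column.
    class LinearSketch
    {
     public:
      LinearSketch(const size_t inSize, const size_t outSize) :
          weight(arma::randn<arma::mat>(outSize, inSize)),
          bias(arma::randn<arma::vec>(outSize))
      { }

      void Forward(const arma::mat& input, arma::mat& output)
      {
        // Works for any batch size: input is (inSize x batchSize).
        output = weight * input;
        output.each_col() += bias;
      }

     private:
      arma::mat weight;
      arma::vec bias;
    };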
andrzejku has joined #mlpack
< zoq> I guess for the GRU class we just have to adjust the prevError, which can only handle batch size = 1; the rest should work since we just use already existing layers like linear
< zoq> About the NTM PR, it looks like it's ready for a first review?
< sumedhghaisas_> zoq: For NTM I am trying to add that shift function to arma_extend... stuck there
< sumedhghaisas_> I will try to fix that today so that we have tests results.
andrzejku has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
< sumedhghaisas_> another thing remaining is the controller design. have some questions about that as well
< zoq> okay, I'm about to get something to eat, can we talk about the design once I get back or tomorrow?
< sumedhghaisas_> Sure. Ping me once you get back... I think I will be here.
< sumedhghaisas_> I will try to fix the shift thing by then
< zoq> Previously we backported the ind2sub function, so maybe this is helpful: https://github.com/mlpack/mlpack/commit/8f8cc458201cab8136addfecb420b4d8bfb09145#diff-b3e69ecff96ebaae403c243f71c038db
< zoq> just search for fn_ind2sub.hpp
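For illustration, a circular shift can be written with plain Armadillo submatrix copies; an actual arma_extend backport would follow the fn_ind2sub.hpp structure instead, so the free function below is only a standalone sketch:

    #include <armadillo>

    // Standalone sketch of a circular shift along rows (positive n shifts
    // the rows down); a real arma_extend backport would mirror the
    // fn_ind2sub.hpp layout rather than use a free function like this.
    arma::mat CircularShiftRows(const arma::mat& x, int n)
    {
      const int rows = static_cast<int>(x.n_rows);
      n = ((n % rows) + rows) % rows;  // Normalize the shift into [0, rows).
      if (n == 0)
        return x;

      arma::mat out(x.n_rows, x.n_cols);
      out.rows(n, rows - 1) = x.rows(0, rows - 1 - n);
      out.rows(0, n - 1) = x.rows(rows - n, rows - 1);
      return out;
    }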
< sumedhghaisas_> okay I will take a look at that...
partobs-mdp has quit [Remote host closed the connection]
andrzejku has joined #mlpack
< zoq> sumedhghais: ping
< sumedhghaisas_> zoq: hey Marcus...
< sumedhghaisas_> So I backported the shift function
< zoq> sounds good
< sumedhghaisas_> but somehow the gradient tests are failing online... by a very small margin
< sumedhghaisas_> on my computer they pass...
< sumedhghaisas_> the NTM gradient check is passing...
< sumedhghaisas_> but the individual read memory and write memory checks are failing... so weird
< zoq> hm, strange indeed
< zoq> I'll have to take a look at the issue; the initialization shouldn't really matter
< sumedhghaisas_> I am just making ResetCellVisitor recurse into the model inside the layer, so the tests will run again. Let's see if the results change
< sumedhghaisas_> okay now for the Controller thing... how do you want to do it?
< sumedhghaisas_> My idea was to create a controller layer
< sumedhghaisas_> so the user will have to initialize that layer... add layers to it to make the controller network
< sumedhghaisas_> and pass this object to NTM constructor...
< zoq> Yeah, some generic layer that implements all the necessary functions. I was wondering whether it would be possible to pass an FFN or RNN instance, because that would allow us to test a feedforward or recurrent controller
< sumedhghaisas_> hmmm... actually I like your idea better.
< sumedhghaisas_> so we have to make sure that an entire RNN can be used as a layer...
< sumedhghaisas_> all we have to do is implement Forward and Backward signatures ... and set currentInput and call the internal forward and backward ... is that right?
< zoq> If that's all you expect from a Controller, I think yes.
< sumedhghaisas_> yes... but also Model() function... to store and restore parameters
< zoq> We could also write a wrapper layer that takes an RNN or FFN; not sure which approach is easier or better.
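As a rough sketch of what such a wrapper could look like (the forwarded method names are assumptions about the controller interface, not the exact FFN/RNN API):

    #include <armadillo>

    // Hypothetical wrapper that adapts a network object (e.g. an FFN or RNN
    // instance) to the layer interface the NTM expects.  The forwarded
    // method names are assumed, not taken from the actual mlpack classes.
    template<typename ModelType>
    class ControllerWrapper
    {
     public:
      ControllerWrapper(ModelType& model) : model(model) { }

      // Run the wrapped network on one input (per-column batch layout).
      void Forward(const arma::mat& input, arma::mat& output)
      {
        model.Forward(input, output);
      }

      // Propagate the error back through the wrapped network.
      void Backward(const arma::mat& input, const arma::mat& gy, arma::mat& g)
      {
        model.Backward(input, gy, g);
      }

      // Expose the wrapped network's layers so parameters can be stored
      // and restored.
      auto& Model() { return model.Model(); }

     private:
      ModelType& model;
    };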
< sumedhghaisas_> but then we have to make sure the input dimensions and output dimensions of the controller are correct
< zoq> Do you think we could provide another constructor to set the dimension?
< sumedhghaisas_> given the number of things that are required... I think writing a new layer is about the same as using RNN or FFN... even the current implementation of NTM has a generic controller... I mean it stores the layers in a vector and passes through them
< sumedhghaisas_> another constructor to RNN you mean?
< zoq> yes
< zoq> I think in this case, we should use the existing RNN/FFN class, might be cleaner.
< sumedhghaisas_> Currently we don't check if the dimensions match across layers right? or do we?
< sumedhghaisas_> I mean if I have a linear layer with 10 as output dimension
< zoq> Right, we don't check the dimension.
< sumedhghaisas_> So we just have to trust the user to set the controller dimensions correctly... but they also have to take the memory size into consideration... because the first layer also takes the last read as input
< sumedhghaisas_> But I think we should create a visitor that verifies that the network is set up correctly
< zoq> Sounds like a good idea to me, ideally a user has to specify only one dimension, since we could extract the extra information from the rest.
< sumedhghaisas_> Using RNN and FFN sounds good. But for now, we go ahead with trusting the user? :P When we implement the visitor thing... we plug that in.
< sumedhghaisas_> but the chain will break when there is a non-linearity in the middle, because it does not have any dimension information, right?
< sumedhghaisas_> or do we set that information in non-linear layers too?
< zoq> Absolutely, we should make clear in the documentation how to set the correct dimensions.
< zoq> Right, I guess in the case of a non-linear layer we would have to somehow set that information first, but the layer should know what the output size would be if the input size is known?
andrzejku has quit [Quit: My iMac has gone to sleep. ZZZzzz…]
< sumedhghaisas_> okay. So use RNN and FFN as the controller. Now the initialization... initializing memory and the previous read? I am not so sure about the correct architecture there...
< sumedhghaisas_> So every layer has to implement a function like OutputDimension(const size_t inputDim)
< zoq> By memory initialization you mean using a trainable layer? I guess for now we could go with a constant value; I think it should work reasonably well.
< sumedhghaisas_> right now it is set to zero values... does that sound good?
< zoq> yes, makes sense
< zoq> About OutputDimension(const size_t inputDim), right, but I think for now we should go with the correct user input idea.
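If layers did expose something like the proposed OutputDimension(inputDim), the verification could be a simple propagation pass; everything below (the struct and the check) is hypothetical, not existing mlpack code:

    #include <cstddef>
    #include <stdexcept>
    #include <vector>

    // Hypothetical per-layer dimension record: 0 means "not fixed", which is
    // how a shape-preserving non-linearity keeps the chain intact.
    struct LayerDims
    {
      std::size_t inputDim;
      std::size_t outputDim;
    };

    // Walk the layers, check that each fixed input dimension matches what
    // the previous layer produced, and propagate the dimension forward.
    inline std::size_t CheckDimensions(const std::vector<LayerDims>& layers,
                                       std::size_t inputDim)
    {
      for (const LayerDims& layer : layers)
      {
        if (layer.inputDim != 0 && layer.inputDim != inputDim)
          throw std::runtime_error("layer input dimension mismatch");

        inputDim = (layer.outputDim != 0) ? layer.outputDim : inputDim;
      }

      return inputDim;  // Dimension of the controller's final output.
    }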
< sumedhghaisas_> ohh yeah. This sounds like a lot of changes.
< sumedhghaisas_> okay so I will get the controller working now. I also wanted to ask about those Copy and other tasks... is that code already merged?
< sumedhghaisas_> I can maybe run some tests on that tonight...
< zoq> No, I guess we will merge the PR soon (this week), for testing you could already use it: https://github.com/mlpack/mlpack/pull/1005
< zoq> That might also be helpful: https://github.com/mlpack/models/pull/1
< zoq> I would start with the copy task.
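For reference, the copy task just asks the network to reproduce a random bit sequence; a minimal data-generation sketch (the layout is an assumption of mine, not the format used by the mlpack models repository):

    #include <armadillo>

    // Minimal copy-task data sketch: each column is one time step of a
    // random binary vector, and the target is the same sequence, to be
    // emitted after the input has been presented.
    void GenerateCopyTask(const size_t bits, const size_t seqLen,
                          arma::mat& input, arma::mat& target)
    {
      input = arma::round(arma::randu<arma::mat>(bits, seqLen));
      target = input;  // The network must reproduce the input sequence.
    }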
< sumedhghaisas_> ahh great... I will try to follow the baseline model.
< sumedhghaisas_> ahh should I add FFN and RNN to LayerTypes?
< sumedhghaisas_> because I have to accept that object in the NTM constructor somehow...
< zoq> yes, hopefully that works
< sumedhghaisas_> zoq: Hey Marcus... question about Armadillo...
< sumedhghaisas_> if I say std::move(mat1) + mat2... does Armadillo perform in-place addition?
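The question went unanswered in the channel; without asserting what Armadillo actually does here (it depends on the version and its expression templates), one way to probe it is to check whether the result reuses mat1's buffer:

    #include <armadillo>
    #include <iostream>
    #include <utility>

    int main()
    {
      arma::mat mat1(1000, 1000, arma::fill::randu);
      arma::mat mat2(1000, 1000, arma::fill::randu);

      // If Armadillo reuses mat1's storage for the sum, the result's buffer
      // will be the same pointer; if not, a new matrix was allocated.
      const double* before = mat1.memptr();
      arma::mat result = std::move(mat1) + mat2;
      std::cout << "memory reused: " << (result.memptr() == before) << "\n";

      // The unambiguous way to add in place is the compound operator:
      // mat2 += mat1;
      return 0;
    }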
andrzejku has joined #mlpack
andrzejku has quit [Client Quit]
pretorium[m] has quit [Ping timeout: 240 seconds]
sumedhghaisas_ has quit [Ping timeout: 268 seconds]
pretorium[m] has joined #mlpack