ChanServ changed the topic of #mlpack to: "Due to ongoing spam on freenode, we've muted unregistered users. See http://www.mlpack.org/ircspam.txt for more information, or also you could join #mlpack-temp and chat there."
rcurtin has joined #mlpack
robertohueso has joined #mlpack
vivekp has quit [Ping timeout: 264 seconds]
vivekp has joined #mlpack
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
vivekp has quit [Ping timeout: 252 seconds]
vivekp has joined #mlpack
davida has joined #mlpack
< davida> Could someone please share how to correctly implement Gradient Clipping on an optimizer?
< davida> I am using the following SGD with Adam:
< davida> mlpack::optimization::SGD<mlpack::optimization::AdamUpdate> optimizer(0.01, 32, 10000, 1e-05, true, mlpack::optimization::AdamUpdate(1e-8, 0.9, 0.999));
< davida> I tried to do this:
< davida> mlpack::optimization::GradientClipping clippedOptimizer(-5, 5, optimizer);
vivekp has quit [Ping timeout: 276 seconds]
< davida> ... but I get compile errors basically telling me:
< davida> 'Optimize': is not a member of 'mlpack::optimization::GradientClipping<mlpack::optimization::SGD<mlpack::optimization::AdamUpdate,mlpack::optimization::NoDecay>>'
< davida> I should add that I am using the clippedOptimizer like this:
< davida> model.Train(X, Y, clippedOptimizer);
vivekp has joined #mlpack
< zoq> davida: AdamUpdate adamUpdate(...);
< zoq> GradientClipping<AdamUpdate> clipping(-5, 5, adamUpdate);
< zoq> SGD<GradientClipping<AdamUpdate> > optimizer(0.01, 32, 100000, 1e-5, true, clipping);
< zoq> davida: Let me know if that works for you.
< davida> zoq: Thx. I will give that a go shortly and get back to you.
< davida> zoq: Yes. That works and I now realise that I have to apply clipping to the update policy and not the optimizer.
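For reference, the full pattern zoq describes above (clipping wraps the update policy, and SGD is templated on the clipped policy) looks roughly like this; the header paths are assumed for mlpack 3.0.x and the parameter values are simply the ones from this discussion:

    #include <mlpack/core/optimizers/sgd/sgd.hpp>
    #include <mlpack/core/optimizers/adam/adam_update.hpp>
    #include <mlpack/core/optimizers/gradient_clipping/gradient_clipping.hpp>

    using namespace mlpack::optimization;

    // Clip the gradient inside the update policy ...
    AdamUpdate adamUpdate(1e-8, 0.9, 0.999);
    GradientClipping<AdamUpdate> clipping(-5, 5, adamUpdate);

    // ... and hand the clipped update policy to SGD.
    SGD<GradientClipping<AdamUpdate> > optimizer(0.01, 32, 100000, 1e-5, true, clipping);

    // The optimizer is then passed to Train() as usual:
    // model.Train(X, Y, optimizer);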
< davida> I am building an RNN with my clipped-gradient optimizer. Now I am at the point of training the network with X inputs and Y labels configured as arma::mat variables. The compiler gives me an error in Train() saying it cannot convert from arma::mat to arma::cube.
< davida> Why would my RNN expect cubes? Is that a default template that needs to be overridden somewhere?
< davida> The full error reads: 'void mlpack::ann::RNN<mlpack::ann::NegativeLogLikelihood<arma::mat,arma::mat>,mlpack::ann::HeInitialization>::Train<mlpack::optimization::SGD<mlpack::optimization::GradientClipping<mlpack::optimization::AdamUpdate>,mlpack::optimization::NoDecay>>(arma::cube,arma::cube,OptimizerType &)': cannot convert argument 1 from 'arma::mat' to 'arma::cube'
< davida> Hmmm. OK - reading the documentation, it mentions the need to use cubes, with (i, j, k) representing the i'th dimension of the j'th data point at time slice k.
< davida> I was following the example in the tutorials (http://www.mlpack.org/docs/mlpack-3.0.3/doxygen/anntutorial.html) which uses arma::mat for inputs and labels. Perhaps that page needs updating.
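A sketch of the cube layout the documentation describes (the dimensions here are only illustrative, taken from the sizes mentioned later in this discussion):

    // 27 input dimensions, 1500 data points, 25 time steps (rho).
    const size_t inputDims = 27, nPoints = 1500, rho = 25;
    arma::cube X(inputDims, nPoints, rho);
    arma::cube Y(inputDims, nPoints, rho);

    // Slice k holds the values of every data point at time step k, i.e.
    // X.slice(k).col(j) is the input of sequence j at time k.

    // RNN<>::Train() then takes the cubes directly:
    // model.Train(X, Y, optimizer);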
ayesdie has joined #mlpack
ayesdie has quit [Ping timeout: 246 seconds]
caiocarvalho has joined #mlpack
cjlcarvalho has quit [Quit: Konversation terminated!]
caiocarvalho has quit [Ping timeout: 245 seconds]
< davida> I do have a question regarding this cube of inputs. If each slice is a step in time, and sequence length can vary by data point, then the cube's 'k' dimension needs to be the length of the longest data point. With this in mind, how do you instruct the RNN to stop processing the data points that are shorter than the longest one?
caiocarvalho has joined #mlpack
caiocarvalho has quit [Ping timeout: 252 seconds]
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< davida> Hi, I am struggling a little with creating an RNN. I have a cube of inputs X(27, 1500, 25) and my labels match as Y(27, 1500, 25). The input is a one-hot vector and the output should be another one-hot vector of the same dimension (the prediction). I want to have 50 nodes in my RNN. I have built it like this:
< davida> mlpack::ann::Add<> add(nbrNodes); mlpack::ann::Linear<> lookup(27, nbrNodes); mlpack::ann::TanHLayer<> tanHLayer; mlpack::ann::Linear<> linear(50, 50); mlpack::ann::Recurrent<>* recurrent = new mlpack::ann::Recurrent<>(add, lookup, linear, tanHLayer, rho); mlpack::ann::RNN<mlpack::ann::MeanSquaredError<>, mlpack::ann::HeInitialization> model(rho); model.Add<mlpack::ann::IdentityLayer<> >(); model.Add(recurrent); model.Add<m
< davida> This is basically trying to follow the example provided in the tutorial on the mlpack website with a few modifications. I am not sure at all if what I am doing here is correct since I cannot find many examples of RNNs with MLPACK.
< davida> Could someone have a look at my model and see if it makes sense? I am getting an error when I run the code telling me there is a matrix dimension mismatch: "addition: incompatible matrix dimensions: 50x32 and 50x5"
< davida> It seems a strange error to me as I do not have 5 inputs on any layer.
< davida> I should note here that my batch_size for SGD is 32, hence the 50x32.
< davida> Also, "nbrNodes" in the model above = 50
< davida> ... some more info on the above error is that it is being thrown in Backward() of the optimizer.
< davida> ...
< davida> I have tracked the error down to the optimizer: substituting a very simple optimizer works, but it will not converge, which is why I need GradientClipping. The simple optimizer I tried was:
< davida> StandardSGD opt(0.01, 1, 500 * data.size(), -100);
< davida> ... which is taken from the recurrent_network_tests.cpp
< davida> The optimizer I need to use, but which fails with the matrix addition error I mentioned before, is:
< davida> mlpack::optimization::AdamUpdate adamUpdate(10e-8, 0.9, 0.999);
< davida> mlpack::optimization::GradientClipping<mlpack::optimization::AdamUpdate> clipping(-5, 5, adamUpdate);
< davida> mlpack::optimization::SGD<mlpack::optimization::GradientClipping<mlpack::optimization::AdamUpdate> > optimizer(0.01, 32, 10000, 1e-05, true, clipping);
< davida> ...
< davida> I really would appreciate some help on this. Thanks.
< davida> ...
< davida> Some additional input on the above error after more debugging: I removed the GradientClipping and the error is still there, which means it is within the AdamUpdate portion of the optimizer. Here is the simplified optimizer code:
< davida> mlpack::optimization::AdamUpdate adamUpdate(10e-8, 0.9, 0.999);
< davida> mlpack::optimization::SGD<mlpack::optimization::AdamUpdate> optimizer(0.01, 32, 10000, 1e-05, true, adamUpdate);
< zoq> davida: Unfortunately we had to strip out the dynamic sequence size support for now, but it should be possible to reintegrate the support again. For now, you might like to pad the input/output.
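A minimal zero-padding sketch along the lines zoq suggests (this assumes the raw sequences are held in a std::vector<arma::mat> called sequences, each of size inputDims x length; these names are illustrative, not part of any mlpack API):

    // Pad every sequence up to the length of the longest one.
    size_t maxRho = 0;
    for (const arma::mat& s : sequences)
      maxRho = std::max<size_t>(maxRho, s.n_cols);

    arma::cube X(inputDims, sequences.size(), maxRho, arma::fill::zeros);
    for (size_t j = 0; j < sequences.size(); ++j)
      for (size_t k = 0; k < sequences[j].n_cols; ++k)
        X.slice(k).col(j) = sequences[j].col(k);
    // Slices beyond a sequence's real length stay zero (the padding).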
< zoq> davida: About the model, you might want to take a look at the example here: https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/recurrent_network_test.cpp
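The recurrent model in that test follows roughly the pattern below; this sketch plugs in the sizes from this discussion (27 inputs, 50 recurrent nodes, rho = 25) and keeps the MeanSquaredError loss from the model above, but it is not a verified fix for the dimension error:

    #include <mlpack/methods/ann/rnn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>

    using namespace mlpack::ann;

    const size_t inputSize = 27, nbrNodes = 50, rho = 25;

    // Building blocks for the Recurrent<> layer; the feedback linear
    // layer must map nbrNodes back to nbrNodes.
    Add<> add(nbrNodes);
    Linear<> lookup(inputSize, nbrNodes);
    SigmoidLayer<> sigmoidLayer;
    Linear<> linear(nbrNodes, nbrNodes);
    Recurrent<>* recurrent =
        new Recurrent<>(add, lookup, linear, sigmoidLayer, rho);

    RNN<MeanSquaredError<> > model(rho);
    model.Add<IdentityLayer<> >();
    model.Add(recurrent);
    model.Add<Linear<> >(nbrNodes, inputSize);  // map back to the output size
    model.Add<SigmoidLayer<> >();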
< zoq> davida: Also, if you can provide a simple mockup with some random data, I can probably provide some more input.
caiocarvalho has joined #mlpack
caiocarvalho has quit [Quit: Konversation terminated!]
caiocarvalho has joined #mlpack