rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack
<vaibhavp[m]> Hey zoq: I was looking at some code in the mlpack tests that you authored, and I find it conflicting (or maybe I am misunderstanding something). So in the [JacobianTest](https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/ann/ann_test_tools.hpp#L77) function, you have passed the input of the... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/502dfcc816c89f9c1e11fd0e2616f6cbe2f3aed2>)
<rcurtin[m]> Sometimes the input is necessary to properly compute the backward pass. I can't remember which layers do that, but if you look at the implementations of the Backward() functions, many of them have the input parameter commented out because it's not necessary. However, a couple do need to use it. I think Convolution is one of them, but I haven't checked and don't remember for sure 👍️
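For reference, the "commented out" pattern rcurtin mentions looks roughly like this (an illustrative standalone sketch, not actual mlpack code):

```cpp
#include <armadillo>

// Illustrative only: a pass-through layer whose backward pass never touches
// the forward input, so the parameter name is commented out to silence
// unused-parameter warnings.
class IdentityLayer
{
 public:
  void Forward(const arma::mat& input, arma::mat& output) { output = input; }

  void Backward(const arma::mat& /* input */,
                const arma::mat& gy,
                arma::mat& g)
  {
    g = gy; // the gradient passes straight through; the input is never used
  }
};
```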
<vaibhavp[m]> But some layers compute the Backward pass using the output of the layer as well; for example, look at the derivative of the [logistic_function](https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/activation_functions/logistic_function.hpp#L83) (which is used as the Sigmoid layer).
<vaibhavp[m]> Now, if the input is passed here, it will be wrong.
<zoq[m]> Right, we should use the output of the forward pass.
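To make the point concrete, here is a minimal standalone sketch (assumed names, not mlpack's actual API) showing that the usual logistic-derivative formula y * (1 - y) is only correct when fed the forward output:

```cpp
#include <cmath>
#include <iostream>

double logistic(const double x) { return 1.0 / (1.0 + std::exp(-x)); }

// The derivative is conventionally written in terms of the output y = f(x):
// f'(x) = y * (1 - y).
double logisticDeriv(const double y) { return y * (1.0 - y); }

int main()
{
  const double x = 2.0;          // pre-activation input
  const double y = logistic(x);  // forward output, about 0.8808

  std::cout << logisticDeriv(y) << "\n"; // correct: about 0.1050
  std::cout << logisticDeriv(x) << "\n"; // wrong: 2 * (1 - 2) = -2
  return 0;
}
```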
<vaibhavp[m]> Then I think a few tests in the ann module may need to be rectified.
<rcurtin[m]> To be clear, the layer API and the activation function API are different 👍️ the logistic function you linked to is not a layer in and of itself
<zoq[m]> True, but it forwards to the function vaibhavp referenced: https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/layer/base_layer.hpp#L104-L109
<vaibhavp[m]> rcurtin[m]: But in base_layer, Deriv is used to calculate Backward, and it uses the output of the Forward method.
<rcurtin[m]> I have no idea, I'm not able to dig deeply into the code here. Maybe there is a problem in the tests, but it would be hard to believe there is a problem at such a fundamental level since so many of our other tests and example models pass
<vaibhavp[m]> Actually, I also had a hard time believing this was the case, because I had seen this error many times and re-read the MultiLayer code multiple times to verify what was happening. But if you look at the [Backward code in multi_layer_impl.hpp](https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/layer/multi_layer_impl.hpp#L187), you will see that layerOutput[i] is passed, which is the same as the output of the [Forward method in the MultiLayer](https://github.com/mlpack/mlpack/blob/master/src/mlpack/methods/ann/layer/multi_layer_impl.hpp#L162); therefore, the output of the Forward method should be passed instead of the input of the Forward method.
<vaibhavp[m]> I don't think I explained it well, but if you look at the links it will all be clear.
<zoq[m]> If you can open a PR, we can check how many tests will fail and investigate further.
<vaibhavp[m]> sure, on it.
<vaibhavp[m]> Also worth noting: some layers do not use the JacobianTest but instead create their own checks for the Backward method, like LogSoftMax.
<rcurtin[m]> I'd love to make the tests for each layer be a little bit more automated, but I don't think anyone's had a chance to do that. Ideally every layer should pass some simple tests on the forward, backward, and gradient passes. The JacobianTest is great because it just computes a numerical approximation of the gradient... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/a28233c58a337ad468b64189ed2e6e4f5db06474>)
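For context, a simplified sketch of what such a numerical Jacobian check does (mlpack's actual helper lives in ann_test_tools.hpp and differs in its details):

```cpp
#include <armadillo>

// Simplified finite-difference Jacobian check: compare a numerical Jacobian
// (central differences on Forward()) against the analytic one recovered by
// running Backward() with one-hot vectors on the output side. Assumes the
// layer's output has exactly outSize elements.
template<typename LayerType>
double JacobianError(LayerType& layer, arma::mat& input, const size_t outSize)
{
  const double eps = 1e-6;

  // Numerical Jacobian: perturb each input element and difference the outputs.
  arma::mat numJacobian(input.n_elem, outSize);
  for (size_t i = 0; i < input.n_elem; ++i)
  {
    arma::mat outA, outB;
    input(i) += eps;
    layer.Forward(input, outA);
    input(i) -= 2 * eps;
    layer.Forward(input, outB);
    input(i) += eps; // restore
    numJacobian.row(i) = arma::vectorise(outA - outB).t() / (2 * eps);
  }

  // Analytic Jacobian: run Backward() once per output dimension.
  arma::mat output;
  layer.Forward(input, output);
  arma::mat anaJacobian(input.n_elem, outSize);
  for (size_t j = 0; j < outSize; ++j)
  {
    arma::mat gy = arma::zeros(output.n_rows, output.n_cols), g;
    gy(j) = 1.0;
    layer.Backward(output, gy, g); // passing the forward output, per this thread
    anaJacobian.col(j) = arma::vectorise(g);
  }

  // The largest elementwise disagreement should be near zero.
  return arma::abs(numJacobian - anaJacobian).max();
}
```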
<vaibhavp[m]> <zoq[m]> "If you can open a PR, we can..." <- I performed the "ANNLayerTest" tests on my local machine and all tests passed.
<vaibhavp[m]> But not all layers use the JacobianTest.
<rcurtin[m]> If it passed before your change, and it also passed after your change, at least personally I'd be uncomfortable accepting your change (that implies that the particular code you are changing has not been tested properly!), or, at least, I'd want to take a very deep dive into it to understand why 😃
<vaibhavp[m]> So, let me make a few more changes to see if something happens.
<rcurtin[m]> It's super tedious to develop like this, but one way I like to develop when I think some code isn't right is to write a test case that exposes the bug I think should exist. If that test case fails, then I can go fix the code and it will start passing, and I can sleep easy at night knowing I actually found and addressed an issue 😃
<rcurtin[m]> Yeah, that's also a great idea to build knowledge about the code... change things around, see if anything changes, rinse and repeat, then sooner or later it all starts to make sense :)
<vaibhavp[m]> <rcurtin[m]> "If it passed before your change,..." <- I completely agree. The reason the tests passed, at least from what I saw, is because most of the test correctly perform the test(without using the JacobianTest or its siblings) or either input is not even required. What I would suggest is that
<vaibhavp[m]> <rcurtin[m]> "If it passed before your change,..." <- I completely agree. The reason the tests passed, at least from what I saw, is because most of the tests are correctly written(without using the JacobianTest or its siblings) or even if they do use the JacobianTest, input is not even required(Linear layer) or derivative is already calculated in the Forward method(like ELU layer). What I would suggest is perhaps that more tests be
<vaibhavp[m]> implemented for each layer using JacobianTest(s) instead of using their own tests for Backward method. Perhaps, there should be a guide for how the a layer should be implemented for future contributors, which specifies what should passed to the Backward layer(and Gradient method also) because most people are not aware of what is happening internally in the FFN and RNN, and it is completely logically to assume that input is used to
<vaibhavp[m]> calculate the derivative of a function(perhaps because it is what they have done in the school or college)
<vaibhavp[m]> All this considered, there is an error here, and the output should be passed to the Backward method instead of the input.
<vaibhavp[m]> So, I added a JacobianTest for LogSoftMax, and it failed. This is with the current code, without my changes.
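Such a test would look roughly like the following hypothetical sketch (Catch2 style, as in mlpack's test suite; the exact name and tolerance in the actual PR may differ):

```cpp
// Hypothetical sketch of a Catch2 test case; assumes mlpack's test headers
// (e.g. ann_test_tools.hpp for the JacobianTest helper and catch.hpp) are
// already included, as they are in the real test files.
TEST_CASE("JacobianLogSoftMaxTest", "[ANNLayerTest]")
{
  arma::mat input = arma::randu<arma::mat>(10, 1);
  LogSoftMax layer;

  // Compare the numerical and analytic Jacobians; a mismatch here indicates
  // the Backward() pass is being fed the wrong matrix.
  REQUIRE(JacobianTest(layer, input) <= 1e-5);
}
```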
<vaibhavp[m]> Perhaps there is an issue here. What do you guys think?
<vaibhavp[m]> The tests are failing in PR #3465, as they should. Can someone do a sanity check on the changes I made?