rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack
say4n has quit [Quit: Connection closed for inactivity]
robobub has joined #mlpack
krushia has quit [Ping timeout: 276 seconds]
silentbatv2[m] has quit [Quit: You have been kicked for being idle]
<vaibhavp[m]> So, zoq, rcurtin, I dug a little deeper into the incorrect use of the Backward method (also the Deriv function now) and implemented a Jacobian test for all activation functions. Since zoq agrees that the output should be used to calculate the derivative, 14 activation layers failed the Jacobian test. Looking at those functions, they clearly use the input to calculate the derivative, which is incorrect. I am very surprised to see so many layers
<vaibhavp[m]> implemented incorrectly. Only LogisticFunction and TanhFunction implement their derivative correctly.
<vaibhavp[m]> Softplus and Mish are some of the activation functions that fail the JacobianTest.
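A minimal, self-contained C++ sketch of the issue being described (not mlpack's actual layer code, and the function names below are hypothetical): if Deriv() is handed the output y = f(x), the derivative must be expressed in terms of y. For softplus, y = log(1 + e^x), so f'(x) = 1 / (1 + e^(-x)) can be rewritten as 1 - e^(-y); applying the x-based formula to y instead gives a different value, which is exactly what a finite-difference Jacobian check catches.

    #include <cmath>
    #include <cstdio>

    double softplus(const double x) { return std::log1p(std::exp(x)); }

    // Correct: derivative expressed in terms of the output y = softplus(x).
    double softplusDerivFromOutput(const double y) { return 1.0 - std::exp(-y); }

    // Incorrect: the x-based formula applied to y, as if y were the input.
    double softplusDerivMisused(const double y) { return 1.0 / (1.0 + std::exp(-y)); }

    int main()
    {
      const double x = 0.7;
      const double y = softplus(x);

      // Central-difference reference, the same idea a Jacobian test relies on.
      const double h = 1e-6;
      const double numeric = (softplus(x + h) - softplus(x - h)) / (2.0 * h);

      std::printf("numeric              : %.8f\n", numeric);
      std::printf("deriv from output    : %.8f\n", softplusDerivFromOutput(y));
      std::printf("x-formula on output  : %.8f\n", softplusDerivMisused(y));
      return 0;
    }

The first two printed values agree (about 0.668 here); the third does not, which is how the failing layers show up under the Jacobian test.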
<rcurtin[m]> vaibhavp: thanks for doing this digging. I took a look at the LogSoftMax issue and PR earlier this morning; I haven't had a chance to dig too deeply there, but I guess none of our example networks or tests ever actually used a network with a LogSoftMax in it. Maybe in addition to the Jacobian test it makes sense to add a style of test where we build a really trivial small network on a very small, easy dataset and make sure it can learn
<rcurtin[m]> anything at all.
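A rough sketch of the kind of sanity test described above, assuming mlpack 4's FFN API (FFN, Linear, ReLU, LogSoftMax, NegativeLogLikelihood, Evaluate, Train); the exact layer typedefs and whether labels are 0- or 1-based should be checked against the installed version.

    #include <mlpack.hpp>
    #include <iostream>

    using namespace mlpack;

    int main()
    {
      // Two well-separated blobs in 2-D, 100 points per class.
      arma::mat data(2, 200);
      data.cols(0, 99)    = arma::randn(2, 100) * 0.1 + 2.0;
      data.cols(100, 199) = arma::randn(2, 100) * 0.1 - 2.0;

      // Class labels (assumption: 0-based; older versions expected 1-based).
      arma::mat labels(1, 200);
      labels.cols(0, 99).fill(0);
      labels.cols(100, 199).fill(1);

      // A really trivial network ending in LogSoftMax.
      FFN<NegativeLogLikelihood> model;
      model.Add<Linear>(8);
      model.Add<ReLU>();
      model.Add<Linear>(2);
      model.Add<LogSoftMax>();

      // The network should learn *something* on a dataset this easy, so the
      // loss after training should be clearly lower than before.
      const double lossBefore = model.Evaluate(data, labels);
      model.Train(data, labels);
      const double lossAfter = model.Evaluate(data, labels);

      std::cout << "loss before: " << lossBefore
                << ", loss after: " << lossAfter << std::endl;
      return 0;
    }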
<vaibhavp[m]> Also, rcurtin, can you review PR #3471 (it fixes the JacobianTest)? All the additional tests I am adding depend on it. It's a small PR.
<rcurtin[m]> it is indeed a small PR, but it has lots of implications and I have to fully understand things before I give it the thumbs-up... that's why I didn't approve it this morning
krushia has joined #mlpack