rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack
<jonpsy[m]> Anything new? Eshaan Agarwal
<EshaanAgarwal[m]> <jonpsy[m]> "Anything new? Eshaan Agarwal..." <- yeah ! i was checking values on random runs and i found that the new actor network's `actionProb` is coming out the same as the old actor network's `actionProb`. This means that the new actor is not learning !
<EshaanAgarwal[m]> EshaanAgarwal[m]: this is one example but this is consistent across all runs.
<jonpsy[m]> So i guess that backward guess was right all along
<EshaanAgarwal[m]> jonpsy[m]: Yeah ! But I am not sure what we are missing.
<jonpsy[m]> First ensure pytorch & you are calculating the same loss
<jonpsy[m]> It's possible your loss is 0
<jonpsy[m]> But I think that's unlikely, what's more likely is you're setting it 0 somewhere later OR not updating the value.
<jonpsy[m]> Either back prop function isn't working as expected, or loss is not updated. That's most likely
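The sanity check jonpsy describes can be sketched minimally: log the loss each update and flag the two failure modes he lists (loss is 0, or loss is never updated). This is a hypothetical stand-in loop, not the actual mlpack/PyTorch training code.

```python
# Hypothetical sanity check: record the loss at each update and flag the
# failure modes discussed above (loss is 0, or loss never changes).
losses = []

def record(loss_value):
    losses.append(float(loss_value))

# Stand-in for the real training loop; call record(loss) on the real loss.
for step in range(5):
    record(1.0 / (step + 1))

assert any(l != 0.0 for l in losses), "loss is always 0 -- check the loss computation"
assert len(set(losses)) > 1, "loss never changes -- check that it is being updated"
```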
<EshaanAgarwal[m]> jonpsy[m]: I don't think this is happening. Let me share screenshots of all the values I computed in that run.
<EshaanAgarwal[m]> jonpsy[m]: Agreed.
<jonpsy[m]> Can you print the updated weights AFTER the backprops.
<jonpsy[m]> I think this will make everything clear
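A minimal sketch of the check being suggested, in plain NumPy with hypothetical shapes: snapshot the parameters before the backprop/update step, run the update, and confirm the parameters actually moved. If the "after" snapshot equals the "before" snapshot, the optimizer step is not being applied to that network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical actor weights and a dummy gradient; stand-ins for the
# real network parameters in the run being debugged.
weights = rng.normal(size=(4, 3))
grad = rng.normal(size=(4, 3))
lr = 0.01

before = weights.copy()   # snapshot BEFORE the update
weights -= lr * grad      # one SGD-style update step
after = weights.copy()    # snapshot AFTER the update

# If the network is learning, the parameters must have changed.
max_change = np.max(np.abs(after - before))
print("max parameter change:", max_change)
assert max_change > 0, "weights did not change -- the update is not being applied"
```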
<EshaanAgarwal[m]> jonpsy[m]: Of both the networks ? Okay ! I will do it.
<jonpsy[m]> Those are gradients right? I asked weights.
<EshaanAgarwal[m]> jonpsy[m]: Yeah ! Sorry these were the previous screenshots that I mentioned.
<EshaanAgarwal[m]> Sending them in 1 min
<jonpsy[m]> Nw, take your time. More essential to be correct than fast
<jonpsy[m]> <EshaanAgarwal[m]> "Screenshot from 2022-09-14 20-51..." <- This comes first, the above comes later. Right?
<EshaanAgarwal[m]> jonpsy[m]: Right 😅
<jonpsy[m]> what's dGrad?
<jonpsy[m]> EshaanAgarwal[m]: For the second one, can you show me the full picture? With actor -3.238 something
<EshaanAgarwal[m]> jonpsy[m]: After applying softmax on `dLoss`. This is what we send to backward pass
<jonpsy[m]> Ok, are the `dGrad` values matching?
<jonpsy[m]> in `pytorch`?
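Besides comparing against PyTorch, the `dGrad` values can be checked independently with a finite-difference test. A hedged sketch, assuming a cross-entropy-style loss over softmax outputs (the actual loss in the run may differ): the analytic gradient `softmax(z) - onehot(target)` should match a central-difference estimate.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss(z, target):
    # Cross-entropy of softmax(z) against a one-hot target class
    # (an assumption; substitute the real loss being debugged).
    return -np.log(softmax(z)[target])

z = np.array([0.2, -1.0, 0.5])
target = 2

# Analytic gradient of the loss w.r.t. z: softmax(z) - onehot(target)
analytic = softmax(z).copy()
analytic[target] -= 1.0

# Central finite-difference estimate of the same gradient
eps = 1e-6
numeric = np.zeros_like(z)
for i in range(len(z)):
    zp, zm = z.copy(), z.copy()
    zp[i] += eps
    zm[i] -= eps
    numeric[i] = (loss(zp, target) - loss(zm, target)) / (2 * eps)

print("max gradient mismatch:", np.max(np.abs(analytic - numeric)))
assert np.allclose(analytic, numeric, atol=1e-5)
```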
<EshaanAgarwal[m]> jonpsy[m]: Let me try that ! Before that I have got the weights for both actor and critic before and after update
<jonpsy[m]> sure, send it
<EshaanAgarwal[m]> jonpsy[m]: Let me check that after this
<jonpsy[m]> create a gmeet, lets connect
<EshaanAgarwal[m]> Sure. Sending the link
<jonpsy[m]> zoq: mind joining?
<EshaanAgarwal[m]> jonpsy[m]: https://meet.google.com/pnp-rtjw-unz
<jonpsy[m]> Ping back when your net is fixed
<EshaanAgarwal[m]> jonpsy[m]: Just got back.
<EshaanAgarwal[m]> jonpsy: would it be possible that we meet at 9:45 ?
<jonpsy[m]> ok
<jonpsy[m]> sorry had company work, can we meet again
<jonpsy[m]> Eshaan Agarwal: ^^
<EshaanAgarwal[m]> jonpsy[m]: Sure.
<jonpsy[m]> same meet?
<EshaanAgarwal[m]> <jonpsy[m]> "same meet?..." <- i did try subtracting old and new weights ! but i am facing precision issues in python.
<ShubhamAgrawal[m> EshaanAgarwal[m]: subtraction is not numerically stable
<ShubhamAgrawal[m> try to take average
<ShubhamAgrawal[m> maybe that will solve some problem
<EshaanAgarwal[m]> ShubhamAgrawal[m: But how would that let me know the difference in their values ?
<ShubhamAgrawal[m> How much error are you getting rn?
<EshaanAgarwal[m]> ShubhamAgrawal[m: as of now for 0.3890 - 0.3080 i am getting ans as 0
<ShubhamAgrawal[m> <EshaanAgarwal[m]> "as of now for 0.3890 - 0.3080..." <- Is this maximum or accumulation of all errors
<ShubhamAgrawal[m> ?
<EshaanAgarwal[m]> ShubhamAgrawal[m: All are of this order
<ShubhamAgrawal[m> EshaanAgarwal[m]: Then idts it's precision error
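Shubham's conclusion can be verified directly: float64 carries roughly 15-16 significant decimal digits, so a difference of the magnitude reported (0.3890 - 0.3080 ≈ 0.081) cannot round to zero. A zero result therefore means the two weight snapshots are genuinely identical, not a precision artifact. A small sketch with hypothetical weight arrays:

```python
import numpy as np

# Two values of the magnitude reported in the log.
a, b = 0.3890, 0.3080
diff = a - b
print(diff)  # ~0.081, clearly nonzero in float64

# float64 resolution near 0.4 is ~1e-16, far below 0.081,
# so this subtraction cannot collapse to zero.
assert diff != 0.0
assert abs(diff - 0.081) < 1e-12

# A robust way to compare two weight snapshots (hypothetical arrays):
old = np.full((3, 3), b)
new = np.full((3, 3), a)
print("max abs diff:", np.max(np.abs(new - old)))
print("identical?   ", np.array_equal(new, old))
```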
<ShubhamAgrawal[m> There is something else that is missing