<EshaanAgarwal[m]> "No it isn't. Almost done. Had..." <- I have done this. Should I share the outputfile for mlpack implementation here ?
<rcurtin[m]> "@jonpsy can you log in to..." <- Hey thanks! I wanted to learn about deployments, so I thought this would be a good starter.
<EshaanAgarwal[m]> "I have done this. Should I share..." <- Sure
and in those implementations this piece of code came with this documentation - // Reset all the networks.
// Note: the actor and critic networks have an if condition before reset.
// This is because we don't want to reset a loaded (possibly pretrained) model
// passed using this constructor.
jonpsy[m]: during initialization in the constructor of PPO
well, then it shouldn't matter; but ofc you'd remove this line for now
since you're loading network params manually
jonpsy[m]: yeah, agreed, but how is it working?
EshaanAgarwal[m]: i think that it is not doing what it is supposed to do, since here i am loading a model but it was still resetting the network
didn't you remove it?
jonpsy[m]: then i did, when i checked the print statements
but then i had a question: aren't the weights initialized at the time we set the network up?
yes; but read the comment. He's assuming there can be a pre-trained model here
also, how does checking the number of elements in the network against the env sample size help in detecting that?
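A rough sketch of the idea in that comment, not the actual mlpack code: `TinyNetwork`, `maybe_reset` and the sizes used here are made up for illustration. The guard leaves a network alone only when its parameter count equals the expected count; any other count is treated as "not set up yet" and gets reset.

```python
import random


class TinyNetwork:
    """Hypothetical stand-in for a network; not the mlpack API."""

    def __init__(self, params=None):
        # An empty list models a network whose weights were never set.
        self.params = list(params) if params is not None else []

    def reset(self, n_elem, seed=0):
        # Overwrite all parameters with small random values.
        rng = random.Random(seed)
        self.params = [rng.uniform(-0.1, 0.1) for _ in range(n_elem)]


def maybe_reset(network, expected_n_elem):
    """Reset only if the parameter count differs from the expected count.

    Returns True when a reset happened, False when the weights were kept.
    """
    if len(network.params) != expected_n_elem:
        network.reset(expected_n_elem)
        return True
    return False


# A loaded model with the expected size keeps its weights.
loaded = TinyNetwork([0.5, -0.5, 0.25, 1.0])
print(maybe_reset(loaded, 4), loaded.params)
```

With a guard like this, a loaded model whose parameter count does not equal the expected value is still reset, so the value being compared against matters a lot.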
I can’t join the meeting right now, so please start without me.
I thought we are able to load the parameters already?
Didn’t you say you did that already, kinda confused.
zoq[m]: We are able to. Just when I was checking the weights and how they changed with episodes, I noticed that this piece of code reset the network during initialisation itself. Then I uncommented it and wondered what it's doing
EshaanAgarwal[m]: Commented*
Did you disable exploration?
zoq[m]: What do you mean by that ?
If we have exploration enabled we reset the weights
But looks like it’s not enabled
If you remove it, does it give the correct results?
zoq[m]: I am not sure if I have done anything related to that in implementation 👀
zoq[m]: Yes, for manually provided weights everything went smoothly. Then out of curiosity I tried to do it without manually set weights
Regardless, what about the PyTorch forward result?
jonpsy[m]: I don't know how to save that in a file. I looked it up on the internet
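One simple way to get such values into a file, sketched with the standard library only: convert each tensor to nested Python lists (in PyTorch, `tensor.detach().cpu().tolist()` does this) and dump them as JSON. The helper names, the file name and the keys `actor_out` / `critic_out` are assumptions for illustration.

```python
import json


def save_values(path, name_to_values):
    """Write a dict of name -> (nested) lists of floats to a JSON file."""
    with open(path, "w") as f:
        json.dump(name_to_values, f, indent=2)


def load_values(path):
    """Read back a dict written by save_values."""
    with open(path) as f:
        return json.load(f)


# With PyTorch, the lists would come from something like
#   actor_out.detach().cpu().tolist()
# where actor_out is a hypothetical tensor name; plain lists are used here.
if __name__ == "__main__":
    save_values("forward.json", {"actor_out": [[0.1, 0.9]], "critic_out": [0.5]})
    print(load_values("forward.json"))
```

A text format like this is easy to paste into a chat or PR and to diff against the output of the other implementation.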
Can store an SS; anyway what did you find?
jonpsy[m]: Loss values were a bit different! I think the actor loss is also coming out different. There wasn't much difference in the critic losses. Even the updated weights looked almost the same
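"A bit different" and "almost the same" can be made concrete by comparing the two sets of numbers with a tolerance. A minimal sketch, assuming the values from both implementations are available as flat lists of floats; the function names and tolerances are illustrative choices, not anything from mlpack or PyTorch.

```python
import math


def max_abs_diff(a, b):
    """Largest absolute elementwise difference between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def all_close(a, b, rel_tol=1e-5, abs_tol=1e-8):
    """True if the lists have equal length and every pair is within tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


# Example with made-up numbers standing in for two sets of losses.
print(max_abs_diff([1.0, 2.0], [1.5, 2.0]))
print(all_close([1.0, 2.0], [1.0, 2.0000001]))
```

Reporting the maximum difference for the losses, the weights and the gradients separately makes it easier to see which step first diverges.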
So that sounds like forward pass is okay.
Backward pass as well?
You checked the errors?
Can you push the debugging for the forward and backward pass?
What errors ?
Of the backward pass.
he meant backward gradients
zoq[m]: Push where ?
To the PR
I like to run it myself
or PR or chat anywhere we can see
But did you do the backward check?
zoq[m]: Ok sure! I have made some changes in the cart pole environment too.
zoq[m]: Almost going there when I noticed that weights issue and then tried to resolve that
jonpsy: if you are free would you like to join the meet ?
Might not make it.
nw, Eshaan Agarwal will write the meeting summary
EshaanAgarwal[m]: jonpsy: please look now! actually in pytorch i was saving only one of them in a variable. i changed that now
they look similar
zoq @marcusedel:matrix.org: jonpsy: can you please reopen the pull request. Apparently it got closed due to inactivity.
jonpsy: sounds good, just let me know if I can help explain the Jenkins setup or anything. it has years and years of cruft and insider knowledge from things we've set up 😃