<EshaanAgarwal[m]>
<EshaanAgarwal[m]> "No it isn't. Almost done. Had..." <- I have done this. Should I share the output file for the mlpack implementation here?
<jonpsy[m]>
<rcurtin[m]> "@jonpsy can you log in to..." <- Hey thanks! I wanted to learn about deployments, so I thought this would be a good starter.
<jonpsy[m]>
<EshaanAgarwal[m]> "I have done this. Should I share..." <- Sure
<EshaanAgarwal[m]>
and in those implementations this piece of code came with this documentation: // Reset all the networks.
<EshaanAgarwal[m]>
// Note: the actor and critic networks have an if condition before reset.
<EshaanAgarwal[m]>
// This is because we don't want to reset a loaded(possibly pretrained) model
<EshaanAgarwal[m]>
// passed using this constructor.
<EshaanAgarwal[m]>
jonpsy[m]: during initialization, in the constructor of PPO
<jonpsy[m]>
well, then it shouldn't matter; but ofc you'd remove this line for now
<jonpsy[m]>
since you're loading network params manually
<EshaanAgarwal[m]>
jonpsy[m]: yeah, agreed, but how is it working?
<EshaanAgarwal[m]>
EshaanAgarwal[m]: I think it is not doing what it is supposed to do, since here I am loading a model but it was still resetting the network
<jonpsy[m]>
didn't you remove it?
<EshaanAgarwal[m]>
jonpsy[m]: I did, once I had checked the print statements
<EshaanAgarwal[m]>
but then I had a question: aren't the weights initialized at the time we set the network up?
<jonpsy[m]>
yes; but read the comment. He's assuming there can be a pre-trained model here
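[Editor's sketch] The idea behind the quoted comment, resetting a network in the constructor only when it does not already carry loaded weights, might look roughly like the following. This is a minimal Python sketch under stated assumptions, not mlpack's code: TinyNetwork, TinyAgent, reset and n_params are hypothetical names, and the guard here is a simple "parameters are empty" test used as a stand-in for the element-count check discussed in the chat.

```python
import random


class TinyNetwork:
    """Hypothetical stand-in for a network object; not mlpack's API."""

    def __init__(self, parameters=None):
        # An empty list means "no weights have been loaded yet".
        self.parameters = list(parameters) if parameters is not None else []

    def reset(self, size, seed=0):
        # Overwrite the weights with small random values.
        rng = random.Random(seed)
        self.parameters = [rng.uniform(-0.1, 0.1) for _ in range(size)]


class TinyAgent:
    """Hypothetical agent whose constructor resets only unloaded networks."""

    def __init__(self, actor, critic, n_params):
        self.actor = actor
        self.critic = critic
        self.was_reset = {}
        for name, net in (("actor", actor), ("critic", critic)):
            # Assumed guard: reset only when nothing was loaded beforehand.
            needs_reset = len(net.parameters) == 0
            if needs_reset:
                net.reset(n_params)
            self.was_reset[name] = needs_reset


# The actor arrives with (made-up) loaded weights, the critic arrives empty.
loaded_actor = TinyNetwork([0.5, -0.25, 0.125])
fresh_critic = TinyNetwork()
agent = TinyAgent(loaded_actor, fresh_critic, n_params=3)
print(agent.was_reset)
```

With a guard like this, printing the weights before and after construction is enough to tell whether a loaded model survived the constructor, which is the kind of check described above.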
<EshaanAgarwal[m]>
also, how does checking the number of elements in the network against the env sample size help in detecting that?
<zoq[m]>
I can’t join the meeting right now, so please start without me.
<zoq[m]>
I thought we are able to load the parameters already?
<zoq[m]>
Didn’t you say you did that already, kinda confused.
<EshaanAgarwal[m]>
zoq[m]: We are able to. Just when I was checking the weights and how they changed with episodes, I noticed that this piece of code reset the network during initialisation itself. Then I uncommented it and wondered what it's doing
<EshaanAgarwal[m]>
EshaanAgarwal[m]: Commented*
<zoq[m]>
Did you disable exploration?
<EshaanAgarwal[m]>
zoq[m]: What do you mean by that ?
<zoq[m]>
If we have exploration enabled we reset the weights
<zoq[m]>
But looks like it’s not enabled
<zoq[m]>
If you remove it, does it give the correct results?
<EshaanAgarwal[m]>
zoq[m]: I am not sure if I have done anything related to that in the implementation 👀
<EshaanAgarwal[m]>
zoq[m]: Yes, with manually provided weights everything went smoothly. Then, out of curiosity, I tried it without manually set weights
<jonpsy[m]>
Regardless, what about the PyTorch forward result?
<EshaanAgarwal[m]>
jonpsy[m]: I don't know how to save that in a file. I looked it up on the internet
<jonpsy[m]>
You can store a screenshot; anyway, what did you find?
<EshaanAgarwal[m]>
jonpsy[m]: Loss values were a bit different! I think the actor loss is also coming out different. There wasn't much difference in the critic losses. Even the updated weights looked almost the same
<zoq[m]>
So that sounds like the forward pass is okay.
<zoq[m]>
Backward pass as well?
<zoq[m]>
You checked the errors?
<zoq[m]>
Can you push the debugging for the forward and backward pass?
<EshaanAgarwal[m]>
What errors ?
<zoq[m]>
Of the backward pass.
<jonpsy[m]>
he meant backward gradients
<zoq[m]>
Right
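[Editor's sketch] One way to do the comparison being asked for is to dump the forward outputs and backward gradients from both implementations as flat lists of numbers and compare them element-wise with a tolerance. The sketch below is a minimal Python example under stated assumptions: the helper names (max_abs_diff, all_close) are hypothetical, the tolerances are arbitrary, and every number is made up for illustration rather than taken from the actual mlpack or PyTorch runs.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("length mismatch: %d vs %d" % (len(a), len(b)))
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def all_close(a, b, rel_tol=1e-5, abs_tol=1e-8):
    """True when both lists have the same length and every pair is within tolerance."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Made-up values standing in for dumps from the two implementations.
forward_a = [0.12, -0.40, 0.33]
forward_b = [0.12 + 1e-9, -0.40, 0.33 - 1e-9]
grad_a = [0.05, -0.02]
grad_b = [0.07, -0.02]

forward_ok = all_close(forward_a, forward_b)
backward_ok = all_close(grad_a, grad_b)
print("forward match:", forward_ok, "max diff:", max_abs_diff(forward_a, forward_b))
print("backward match:", backward_ok, "max diff:", max_abs_diff(grad_a, grad_b))
```

Reporting the maximum difference alongside the pass/fail flag helps separate floating-point noise from a real mismatch in one of the passes.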
<EshaanAgarwal[m]>
zoq[m]: Push where ?
<jonpsy[m]>
Diary?
<zoq[m]>
To the PR
<zoq[m]>
I like to run it myself
<jonpsy[m]>
or PR or chat anywhere we can see
<zoq[m]>
But did you check the backward pass?
<EshaanAgarwal[m]>
zoq[m]: Ok sure! I have made some changes in the cart pole environment too.
<EshaanAgarwal[m]>
zoq[m]: I was almost there when I noticed that weights issue and then tried to resolve it
<EshaanAgarwal[m]>
jonpsy: if you are free would you like to join the meet ?
<zoq[m]>
Might not make it.
<jonpsy[m]>
nw, Eshaan Agarwal will write the meeting summary
<EshaanAgarwal[m]>
EshaanAgarwal[m]: jonpsy: please look now! Actually, in PyTorch I was saving only one of them in a variable. I changed that now
<EshaanAgarwal[m]>
they look similar
<EshaanAgarwal[m]>
zoq @marcusedel:matrix.org: jonpsy: can you please reopen the pull request. Apparently it got closed due to inactivity.
<rcurtin[m]>
jonpsy: sounds good, just let me know if I can help explain the Jenkins setup or anything. it has years and years of cruft and insider knowledge from things we've set up 😃