<EshaanAgarwal[m]>
* checking the forward and backward function
<jonathanplatkiew>
Thanks ryan! I saw that, though. What confuses me is that SigPack's documentation stipulates that `arma::Col<T>` with any type `T` is an option. But it seems that it is not, which is misleading.
<jonpsy[m]>
<EshaanAgarwal[m]> "Sure ! I am checking the..." <- So we're sure that inited weights of both are same? Do you have any screenshots or data to share?
<rcurtin[m]>
Maybe it's worth filing a bug report with the SigPack developers?
<EshaanAgarwal[m]>
<jonpsy[m]> "So we're sure that inited..." <- I had one question. Let's say we were able to make everything the same in the PyTorch and mlpack networks, including the weights. We still won't be able to make the state that we get from the environment equal in both implementations?
<EshaanAgarwal[m]>
EshaanAgarwal[m]: zoq:
<zoq[m]>
EshaanAgarwal[m]: True, but we can still make sure it's the same for a single run, e.g. by taking the input from PyTorch, storing it in a matrix, and returning it as part of the mlpack env.
<EshaanAgarwal[m]>
zoq[m]: Okay, sure! Then I will incorporate that. Otherwise making everything else the same won't do much.
<zoq[m]>
Yeah, we can even do it for two samples, but I would start with one since it's easier.
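A minimal sketch of the export step suggested above, assuming the observation is a flat list of floats; the file name and the `save_state()` / `load_state()` helpers are hypothetical, and the sample values are made up rather than taken from a real run. It writes one observation as a single CSV row with full precision and reads it back to confirm the round trip is exact.

```python
import csv
import os
import tempfile


def save_state(state, path):
    """Write one observation as a single CSV row, keeping full float precision."""
    with open(path, "w", newline="") as f:
        # repr() of a Python float round-trips exactly through float().
        csv.writer(f).writerow([repr(float(x)) for x in state])


def load_state(path):
    """Read the single-row CSV back into a list of floats."""
    with open(path, newline="") as f:
        return [float(x) for x in next(csv.reader(f))]


# Hypothetical observation; in practice it would come from the PyTorch run.
state = [0.0123, -0.0456, 0.0311, 0.0078]
path = os.path.join(tempfile.gettempdir(), "initial_state.csv")
save_state(state, path)
restored = load_state(path)
print(restored == state)
```

On the mlpack side, a file like this could presumably be read with `data::Load()`; whether the result comes out as a row or a column should be checked before feeding it to the network.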
<EshaanAgarwal[m]>
Also, I wanted to print the results from forward and backward. I don't see any function to get them, so should I print them in my implementation (in the mlpack library) itself for the time being using `std::cout`, or do we have a getter function for that?
<zoq[m]>
I would do it as part of the implementation itself (mlpack implementation), easiest solution.
<EshaanAgarwal[m]>
zoq[m]: cool! thanks
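Once both implementations print their forward and backward results, the numbers could be compared with a small tolerance rather than by eye. This is only a sketch: the helper names, the tolerance, and the sample values are assumptions, not output from either library.

```python
import math


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def outputs_match(a, b, abs_tol=1e-6):
    """True when both lists have the same length and agree within abs_tol."""
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Hypothetical printed values from the two forward passes.
pytorch_out = [0.5123401, -0.2200013]
mlpack_out = [0.5123403, -0.2200010]
print(outputs_match(pytorch_out, mlpack_out), max_abs_diff(pytorch_out, mlpack_out))
```

An absolute tolerance is used here because outputs near zero make a relative tolerance too strict; the right threshold depends on whether both sides use the same floating-point precision.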
<EshaanAgarwal[m]>
zoq[m]: Is there an easier way to load this? Otherwise I would have to change the environment implementation too, to incorporate loading from a vector.
<zoq[m]>
I would just modify the `Sample()` function to load the matrix and return it, and just discard the rest of the implementation.
<zoq[m]>
Or if you think it's easier, just not deal with the env and provide the output from pytorch.
<EshaanAgarwal[m]>
zoq[m]: I guess you are talking about `InitialSample()` here, since the implementation of `Sample()` is the same in both. But the initial sample that we get would differ because of the random functions of the two implementations.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: And a different initial sample can lead to different results.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: i meant this !
<zoq[m]>
Yeah, I would "overwrite" both functions, but you can step through all the steps to make sure you get the right output.
<EshaanAgarwal[m]>
zoq[m]: OK! I will look into it and make the necessary changes.
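The idea of "overwriting" both functions and stepping through the recorded steps could look roughly like the sketch below. `ReplayEnvironment` and its method names are hypothetical and do not mirror the actual mlpack or Gym signatures; the recorded values are made up.

```python
class ReplayEnvironment:
    """Hypothetical environment that replays recorded states instead of sampling."""

    def __init__(self, recorded_states):
        if not recorded_states:
            raise ValueError("need at least one recorded state")
        self._states = [list(s) for s in recorded_states]
        self._index = 0

    def initial_sample(self):
        # Always restart from the first recorded state.
        self._index = 0
        return list(self._states[0])

    def sample(self):
        # Advance to the next recorded state; fail once the recording runs out.
        if self._index + 1 >= len(self._states):
            raise IndexError("no more recorded states")
        self._index += 1
        return list(self._states[self._index])


# Hypothetical states exported from the other implementation.
recorded = [[0.01, -0.02], [0.02, -0.01], [0.03, 0.00]]
env = ReplayEnvironment(recorded)
first = env.initial_sample()
second = env.sample()
print(first, second)
```

Because nothing is drawn at random, both networks see identical inputs at every step, so any remaining difference has to come from the networks or the update rule.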
<jonpsy[m]>
Super busy with office work today; wouldn't be able to attend. If zoq you're attending the meet then I expect Eshaan Agarwal to write meeting summary. Else if no one's attending I'm still expecting some progress report.
<EshaanAgarwal[m]>
jonpsy[m]: I am comfortable with both! I will share my progress for today. I have a bit of direction for what else I have to do for now (make sure the initial sample of the environment is the same for both implementations). We can maybe meet tomorrow, and then I would be able to share more consolidated progress.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: Because without making that the same, we won't be able to make the initial conditions the same for both networks.
<zoq[m]>
Are you able to finish it by tomorrow?
<jonpsy[m]>
I mean, I can't guarantee being able to attend meets besides weekends.
<EshaanAgarwal[m]>
zoq[m]: You mean changes in environment? Sure I would by then
<jonpsy[m]>
Didn't we ensure the initial weights are the same for both? Why that topic again? I'm confused.
<EshaanAgarwal[m]>
jonpsy[m]: The weights of the network, yes. But both environment implementations use a random function to get the initial sample. Although the range from which we draw the random value is the same, the final random state that we sample would differ.
<EshaanAgarwal[m]>
This would make the initial sample that each policy gets different, and therefore the corresponding values and actions (that our networks compute) will be different in virtually all cases.
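The point about a shared range but different draws can be illustrated with two independently seeded generators. The range, the state size, and the seeds below are assumptions chosen for illustration only.

```python
import random

LOW, HIGH = -0.05, 0.05  # hypothetical sampling range shared by both environments


def draw_initial_state(rng, size=4):
    """Draw an initial state uniformly from [LOW, HIGH] using the given generator."""
    return [rng.uniform(LOW, HIGH) for _ in range(size)]


# Two generators standing in for the two frameworks' independent random sources.
state_a = draw_initial_state(random.Random(1))
state_b = draw_initial_state(random.Random(2))

print(state_a)
print(state_b)
print(state_a == state_b)
```

Both states lie in the same interval, yet they are not equal, so matching the network weights alone cannot make the first forward pass identical.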
<zoq[m]>
We should make sure, we finish the steps we discussed, which includes making sure the backward step works as expected.
<jonpsy[m]>
EshaanAgarwal[m]: Can't we fix it too?
<jonpsy[m]>
Seems only natural.
<EshaanAgarwal[m]>
zoq[m]: Sure thing! We would be able to make sure of that only when the initial sample is the same.
<EshaanAgarwal[m]>
jonpsy[m]: Sure if I save that from Pytorch and load it in mlpack implementation.
<jonpsy[m]>
should be a quick one
<EshaanAgarwal[m]>
EshaanAgarwal[m]: But it would be a bit messy, because for every training run that I make, I will have to do it manually again.
<EshaanAgarwal[m]>
jonpsy[m]: Will have to make changes in environment and compile it. That might take some time.
<jonpsy[m]>
EshaanAgarwal[m]: We only need it for one forward & backward pass.
<jonpsy[m]>
So we can set the random number fixed, such as 0.6
<EshaanAgarwal[m]>
jonpsy[m]: OK, but we won't be able to control that in our PyTorch implementation since we use OpenAI Gym. So in my opinion, loading the sample from PyTorch into mlpack seems a bit more reasonable.
<EshaanAgarwal[m]>
Quite tedious. Although from what I could make out from digging in the code, I am sure that forward pass is working just fine.
<EshaanAgarwal[m]>
Bug is somewhere in our `update()` as you mentioned in the previous meet. 😬
<EshaanAgarwal[m]>
<EshaanAgarwal[m]> "I am comfortable with both! I..." <- zoq @marcusedel:matrix.org: would we be meeting today then ?
<zoq[m]>
Let's do it tomorrow, looks like you are working on it right now, unless you have any questions.
<EshaanAgarwal[m]>
zoq[m]: Cool. I have a direction to work on and don't have any questions. Will ask if I get any.