04:43
<
jonpsy[m] >
so i found out that it is indeed training
04:43
<
jonpsy[m] >
but the error seems to be with the env
06:57
<
jonpsy[m] >
say4n: could you have a look at ```agent.learn()``` code part?
07:01
<
say4n[m] >
jonpsy[m]: Sure, I can have a look later today. Specifically what about it?
09:23
<
jonpsy[m] >
so, i suppose they've used ```model``` and ```model_``` as ```learningNetwork``` and ```targetNetwork```, but I'm not sure
09:23
<
jonpsy[m] >
and how they've implemented the algo in general.
09:24
<
jonpsy[m] >
we could have a chat after you've had a pass, thoughts?
09:33
<
jonpsy[m] >
* and how they've implemented the algo in general. ```loss```
10:59
<
jonpsy[m] >
thanks a lot!
11:47
rcurtin_1rc has joined #mlpack
11:51
AbhishekMishra[m has quit [*.net *.split]
11:51
LokeshJawale[m] has quit [*.net *.split]
11:51
RudraPatil[m] has quit [*.net *.split]
11:51
HrithikNambiar[m has quit [*.net *.split]
11:51
jonpsy[m] has quit [*.net *.split]
11:51
jjb[m] has quit [*.net *.split]
11:51
JatoJoseph[m] has quit [*.net *.split]
11:51
_slack_mlpack_34 has quit [*.net *.split]
11:51
_slack_mlpack_16 has quit [*.net *.split]
11:51
Shadow3049[m] has quit [*.net *.split]
11:51
mlpack-inviter[m has quit [*.net *.split]
11:51
KrishnaSashank[m has quit [*.net *.split]
11:51
Pushker[m] has quit [*.net *.split]
11:51
rcurtin[m] has quit [*.net *.split]
11:51
AbhinavGudipati[ has quit [*.net *.split]
11:51
GauravGhati[m] has quit [*.net *.split]
11:51
M7Ain7Soph77Ain7 has quit [*.net *.split]
11:51
MayankRaj[m] has quit [*.net *.split]
11:51
halfy has quit [*.net *.split]
11:51
ShivamShaurya[m] has quit [*.net *.split]
11:51
_slack_mlpack_25 has quit [*.net *.split]
11:51
_slack_mlpack_13 has quit [*.net *.split]
11:51
_slack_mlpack_U7 has quit [*.net *.split]
11:51
Cadair has quit [*.net *.split]
11:51
MatheusAlcntaraS has quit [*.net *.split]
11:51
jeffin143[m] has quit [*.net *.split]
11:51
AbhishekNimje[m] has quit [*.net *.split]
11:51
VarunGupta[m] has quit [*.net *.split]
11:51
AyushiJain[m] has quit [*.net *.split]
11:51
ABHINAVANAND[m] has quit [*.net *.split]
11:51
dkipke[m] has quit [*.net *.split]
11:51
ShivamNayak[m] has quit [*.net *.split]
11:51
OleksandrNikolsk has quit [*.net *.split]
11:51
DavidportlouisDa has quit [*.net *.split]
11:51
RishabhGoel[m] has quit [*.net *.split]
11:51
DivyanshKumar[m] has quit [*.net *.split]
11:51
Kaushalc64[m] has quit [*.net *.split]
11:51
SergioMoralesE[m has quit [*.net *.split]
11:51
ServerStatsDisco has quit [*.net *.split]
11:51
swaingotnochills has quit [*.net *.split]
11:51
AyushSingh[m] has quit [*.net *.split]
11:51
SlackIntegration has quit [*.net *.split]
11:51
zoq[m]1 has quit [*.net *.split]
11:51
GopiManoharTatir has quit [*.net *.split]
11:51
sailor[m] has quit [*.net *.split]
11:51
NippunSharmaNipp has quit [*.net *.split]
11:51
M074AABGKS has quit [*.net *.split]
11:51
_slack_mlpack_27 has quit [*.net *.split]
11:51
ShahAnwaarKhalid has quit [*.net *.split]
11:51
kartikdutt18kart has quit [*.net *.split]
11:51
AbdullahKhilji[m has quit [*.net *.split]
11:51
_slack_mlpack_14 has quit [*.net *.split]
11:51
_slack_mlpack_31 has quit [*.net *.split]
11:51
ZanHuang[m] has quit [*.net *.split]
11:51
fieryblade[m] has quit [*.net *.split]
11:51
AlexNguyen[m] has quit [*.net *.split]
11:51
shrit[m] has quit [*.net *.split]
11:51
rcurtin_irc has quit [*.net *.split]
11:53
AbhishekMishra[m has joined #mlpack
11:53
LokeshJawale[m] has joined #mlpack
11:53
RudraPatil[m] has joined #mlpack
11:53
HrithikNambiar[m has joined #mlpack
11:53
JatoJoseph[m] has joined #mlpack
11:53
_slack_mlpack_34 has joined #mlpack
11:53
jjb[m] has joined #mlpack
11:53
jonpsy[m] has joined #mlpack
11:53
_slack_mlpack_16 has joined #mlpack
11:56
KshitijAggarwal[ has quit [Ping timeout: 240 seconds]
11:56
AvikantSrivastav has quit [Ping timeout: 240 seconds]
11:56
TrinhNgo[m] has quit [Ping timeout: 240 seconds]
11:56
SaiVamsi[m] has quit [Ping timeout: 240 seconds]
11:56
MohomedShalik[m] has quit [Ping timeout: 240 seconds]
11:56
RishabhGarg108[m has quit [Ping timeout: 240 seconds]
11:56
_slack_mlpack_37 has quit [Ping timeout: 240 seconds]
11:56
AbhishekMishra[m has quit [Ping timeout: 245 seconds]
11:56
LokeshJawale[m] has quit [Ping timeout: 245 seconds]
11:56
HrithikNambiar[m has quit [Ping timeout: 245 seconds]
11:56
jjb[m] has quit [Ping timeout: 245 seconds]
11:56
_slack_mlpack_34 has quit [Ping timeout: 245 seconds]
11:56
rcurtin_matrixor has quit [Ping timeout: 252 seconds]
11:56
MatrixTravelerbo has quit [Ping timeout: 252 seconds]
11:56
HARSHCHAUHAN[m] has quit [Ping timeout: 252 seconds]
11:56
prasad-dashprasa has quit [Ping timeout: 252 seconds]
11:56
jonathanplatkiew has quit [Ping timeout: 252 seconds]
11:56
AmanKashyap[m] has quit [Ping timeout: 252 seconds]
11:56
Aakash-kaushikAa has quit [Ping timeout: 252 seconds]
11:56
NitikJain[m] has quit [Ping timeout: 252 seconds]
11:56
abernauer[m] has quit [Ping timeout: 252 seconds]
11:56
VedantaJha[m] has quit [Ping timeout: 252 seconds]
11:57
GauravGhati[m] has joined #mlpack
11:57
rcurtin[m] has joined #mlpack
11:57
shrit[m] has joined #mlpack
11:57
MayankRaj[m] has joined #mlpack
11:57
KrishnaSashank[m has joined #mlpack
11:57
Shadow3049[m] has joined #mlpack
11:57
AbhinavGudipati[ has joined #mlpack
11:57
M7Ain7Soph77Ain7 has joined #mlpack
11:57
mlpack-inviter[m has joined #mlpack
11:57
_slack_mlpack_13 has joined #mlpack
11:57
Pushker[m] has joined #mlpack
11:57
AlexNguyen[m] has joined #mlpack
11:57
ShivamShaurya[m] has joined #mlpack
11:57
_slack_mlpack_25 has joined #mlpack
11:57
_slack_mlpack_U7 has joined #mlpack
11:57
AyushSingh[m] has joined #mlpack
11:57
M074AABGKS has joined #mlpack
11:57
ZanHuang[m] has joined #mlpack
11:57
swaingotnochills has joined #mlpack
11:57
kartikdutt18kart has joined #mlpack
11:57
sailor[m] has joined #mlpack
11:57
NippunSharmaNipp has joined #mlpack
11:57
fieryblade[m] has joined #mlpack
11:57
zoq[m]1 has joined #mlpack
11:57
ShahAnwaarKhalid has joined #mlpack
11:57
SlackIntegration has joined #mlpack
11:57
GopiManoharTatir has joined #mlpack
11:57
_slack_mlpack_27 has joined #mlpack
11:57
AbdullahKhilji[m has joined #mlpack
11:57
_slack_mlpack_14 has joined #mlpack
11:57
_slack_mlpack_31 has joined #mlpack
11:57
Cadair has joined #mlpack
11:57
AyushiJain[m] has joined #mlpack
11:57
AbhishekNimje[m] has joined #mlpack
11:57
MatheusAlcntaraS has joined #mlpack
11:57
jeffin143[m] has joined #mlpack
11:57
VarunGupta[m] has joined #mlpack
11:57
ABHINAVANAND[m] has joined #mlpack
11:57
OleksandrNikolsk has joined #mlpack
11:57
RishabhGoel[m] has joined #mlpack
11:57
dkipke[m] has joined #mlpack
11:57
DavidportlouisDa has joined #mlpack
11:57
Kaushalc64[m] has joined #mlpack
11:57
ShivamNayak[m] has joined #mlpack
11:57
DivyanshKumar[m] has joined #mlpack
11:57
SergioMoralesE[m has joined #mlpack
12:00
Gulshan[m] has quit [Ping timeout: 240 seconds]
12:00
_slack_mlpack_28 has quit [Ping timeout: 240 seconds]
12:00
ABoodhayanaSVish has quit [Ping timeout: 240 seconds]
12:00
swaingotnochills has quit [Ping timeout: 252 seconds]
12:00
AyushSingh[m] has quit [Ping timeout: 252 seconds]
12:00
SlackIntegration has quit [Ping timeout: 252 seconds]
12:00
zoq[m]1 has quit [Ping timeout: 252 seconds]
12:00
sailor[m] has quit [Ping timeout: 252 seconds]
12:00
NippunSharmaNipp has quit [Ping timeout: 252 seconds]
12:00
GopiManoharTatir has quit [Ping timeout: 252 seconds]
12:00
fieryblade[m] has quit [Ping timeout: 252 seconds]
12:00
ZanHuang[m] has quit [Ping timeout: 252 seconds]
12:00
M074AABGKS has quit [Ping timeout: 252 seconds]
12:00
_slack_mlpack_14 has quit [Ping timeout: 252 seconds]
12:00
_slack_mlpack_27 has quit [Ping timeout: 252 seconds]
12:00
ShahAnwaarKhalid has quit [Ping timeout: 252 seconds]
12:00
AbdullahKhilji[m has quit [Ping timeout: 252 seconds]
12:00
kartikdutt18kart has quit [Ping timeout: 252 seconds]
12:00
_slack_mlpack_31 has quit [Ping timeout: 252 seconds]
12:00
ArunavShandeelya has quit [Ping timeout: 252 seconds]
12:00
Gman[m] has quit [Ping timeout: 252 seconds]
12:00
GaborBakos[m] has quit [Ping timeout: 252 seconds]
12:00
KumarArnav[m] has quit [Ping timeout: 252 seconds]
12:00
RudraPatil[m] has quit [Ping timeout: 245 seconds]
12:00
jonpsy[m] has quit [Ping timeout: 245 seconds]
12:00
JatoJoseph[m] has quit [Ping timeout: 245 seconds]
12:00
_slack_mlpack_16 has quit [Ping timeout: 245 seconds]
12:01
mlpack-inviter[m has quit [Ping timeout: 256 seconds]
12:01
Shadow3049[m] has quit [Ping timeout: 256 seconds]
12:01
KrishnaSashank[m has quit [Ping timeout: 256 seconds]
12:01
AlexNguyen[m] has quit [Ping timeout: 256 seconds]
12:01
Pushker[m] has quit [Ping timeout: 256 seconds]
12:01
rcurtin[m] has quit [Ping timeout: 256 seconds]
12:01
GauravGhati[m] has quit [Ping timeout: 256 seconds]
12:01
AbhinavGudipati[ has quit [Ping timeout: 256 seconds]
12:01
M7Ain7Soph77Ain7 has quit [Ping timeout: 256 seconds]
12:01
MayankRaj[m] has quit [Ping timeout: 256 seconds]
12:01
_slack_mlpack_13 has quit [Ping timeout: 256 seconds]
12:01
shrit[m] has quit [Ping timeout: 256 seconds]
12:01
_slack_mlpack_25 has quit [Ping timeout: 256 seconds]
12:01
ShivamShaurya[m] has quit [Ping timeout: 256 seconds]
12:01
_slack_mlpack_U7 has quit [Ping timeout: 256 seconds]
12:01
jeffin143[m] has quit [Ping timeout: 272 seconds]
12:01
MatheusAlcntaraS has quit [Ping timeout: 272 seconds]
12:01
Cadair has quit [Ping timeout: 272 seconds]
12:01
AbhishekNimje[m] has quit [Ping timeout: 272 seconds]
12:01
dkipke[m] has quit [Ping timeout: 272 seconds]
12:01
AyushiJain[m] has quit [Ping timeout: 272 seconds]
12:01
DavidportlouisDa has quit [Ping timeout: 272 seconds]
12:01
OleksandrNikolsk has quit [Ping timeout: 272 seconds]
12:01
ABHINAVANAND[m] has quit [Ping timeout: 272 seconds]
12:01
VarunGupta[m] has quit [Ping timeout: 272 seconds]
12:01
ShivamNayak[m] has quit [Ping timeout: 272 seconds]
12:01
SergioMoralesE[m has quit [Ping timeout: 272 seconds]
12:01
DivyanshKumar[m] has quit [Ping timeout: 272 seconds]
12:01
RishabhGoel[m] has quit [Ping timeout: 272 seconds]
12:01
Kaushalc64[m] has quit [Ping timeout: 272 seconds]
12:01
HarshVardhanKuma has quit [Ping timeout: 272 seconds]
12:01
SoumyadipSarkar[ has quit [Ping timeout: 272 seconds]
12:01
swaingotnochill[ has quit [Ping timeout: 272 seconds]
12:01
ryan[m]1 has quit [Ping timeout: 272 seconds]
12:01
say4n[m] has quit [Ping timeout: 272 seconds]
12:01
_slack_mlpack_10 has quit [Ping timeout: 272 seconds]
12:01
_slack_mlpack_22 has quit [Ping timeout: 272 seconds]
12:01
GauravTirodkar[m has quit [Ping timeout: 272 seconds]
12:01
Gauravkumar[m] has quit [Ping timeout: 268 seconds]
12:01
sdev_7211[m] has quit [Ping timeout: 268 seconds]
12:01
bisakh[m] has quit [Ping timeout: 268 seconds]
12:01
ChaithanyaNaik[m has quit [Ping timeout: 268 seconds]
12:01
ManishKausikH[m] has quit [Ping timeout: 268 seconds]
12:01
ronakypatel[m] has quit [Ping timeout: 268 seconds]
12:01
huberspot[m] has quit [Ping timeout: 268 seconds]
12:01
FranchisNSaikia[ has quit [Ping timeout: 268 seconds]
12:01
Amankumar[m] has quit [Ping timeout: 268 seconds]
12:01
Saksham[m] has quit [Ping timeout: 268 seconds]
12:01
RishabhGarg108Ri has quit [Ping timeout: 268 seconds]
12:01
_slack_mlpack_17 has quit [Ping timeout: 268 seconds]
12:01
_slack_mlpack_24 has quit [Ping timeout: 268 seconds]
12:01
EricTroupeTester has quit [Ping timeout: 268 seconds]
12:01
SiddhantJain[m] has quit [Ping timeout: 276 seconds]
12:01
TathagataRaha[m] has quit [Ping timeout: 276 seconds]
12:01
HemalMamtora[m] has quit [Ping timeout: 276 seconds]
12:01
DillonKipke[m] has quit [Ping timeout: 276 seconds]
12:01
DirkEddelbuettel has quit [Ping timeout: 276 seconds]
12:01
Prometheus[m] has quit [Ping timeout: 276 seconds]
12:01
LolitaNazarov[m] has quit [Ping timeout: 276 seconds]
12:01
fazamuhammad[m] has quit [Ping timeout: 276 seconds]
12:01
M068AABMUC has quit [Ping timeout: 276 seconds]
12:01
gitter-badgerThe has quit [Ping timeout: 276 seconds]
12:01
zoq[m] has quit [Ping timeout: 276 seconds]
12:01
heisenbuugGopiMT has quit [Ping timeout: 276 seconds]
12:01
AyushKumarLavani has quit [Ping timeout: 276 seconds]
12:01
_slack_mlpack_U4 has quit [Ping timeout: 276 seconds]
12:01
_slack_mlpack_19 has quit [Ping timeout: 276 seconds]
12:01
_slack_mlpack_U0 has quit [Ping timeout: 276 seconds]
12:04
_slack_mlpack_U0 has joined #mlpack
12:04
_slack_mlpack_U4 has joined #mlpack
12:04
_slack_mlpack_U7 has joined #mlpack
12:04
_slack_mlpack_13 has joined #mlpack
12:05
_slack_mlpack_10 has joined #mlpack
12:05
_slack_mlpack_34 has joined #mlpack
12:05
_slack_mlpack_16 has joined #mlpack
12:05
_slack_mlpack_22 has joined #mlpack
12:05
_slack_mlpack_37 has joined #mlpack
12:05
_slack_mlpack_31 has joined #mlpack
12:05
_slack_mlpack_19 has joined #mlpack
12:06
_slack_mlpack_25 has joined #mlpack
12:07
_slack_mlpack_28 has joined #mlpack
12:08
_slack_mlpack_14 has joined #mlpack
12:09
LokeshJawale[m] has joined #mlpack
12:09
HrithikNambiar[m has joined #mlpack
12:09
jjb[m] has joined #mlpack
12:09
AbhishekMishra[m has joined #mlpack
12:12
AvikantSrivastav has joined #mlpack
12:12
RishabhGarg108[m has joined #mlpack
12:12
MohomedShalik[m] has joined #mlpack
12:12
SaiVamsi[m] has joined #mlpack
12:12
KshitijAggarwal[ has joined #mlpack
12:12
TrinhNgo[m] has joined #mlpack
12:12
Aakash-kaushikAa has joined #mlpack
12:12
NitikJain[m] has joined #mlpack
12:12
HARSHCHAUHAN[m] has joined #mlpack
12:12
abernauer[m] has joined #mlpack
12:12
VedantaJha[m] has joined #mlpack
12:13
jonathanplatkiew has joined #mlpack
12:13
AmanKashyap[m] has joined #mlpack
12:13
prasad-dashprasa has joined #mlpack
12:25
KumarArnav[m] has joined #mlpack
12:32
AlexNguyen[m] has joined #mlpack
12:32
Pushker[m] has joined #mlpack
12:32
GauravGhati[m] has joined #mlpack
12:32
AbhinavGudipati[ has joined #mlpack
12:32
mlpack-inviter[m has joined #mlpack
12:32
Shadow3049[m] has joined #mlpack
12:32
M7Ain7Soph77Ain7 has joined #mlpack
12:32
rcurtin[m] has joined #mlpack
12:32
ShivamShaurya[m] has joined #mlpack
12:32
shrit[m] has joined #mlpack
12:32
KrishnaSashank[m has joined #mlpack
12:32
MayankRaj[m] has joined #mlpack
12:32
_slack_mlpack_24 has joined #mlpack
12:32
_slack_mlpack_17 has joined #mlpack
12:33
Gulshan[m] has joined #mlpack
12:33
ABoodhayanaSVish has joined #mlpack
12:33
_slack_mlpack_27 has joined #mlpack
12:33
SoumyadipSarkar[ has joined #mlpack
12:33
HarshVardhanKuma has joined #mlpack
12:33
swaingotnochill[ has joined #mlpack
12:33
GauravTirodkar[m has joined #mlpack
12:33
say4n[m] has joined #mlpack
12:33
ryan[m]1 has joined #mlpack
12:33
JatoJoseph[m] has joined #mlpack
12:33
RudraPatil[m] has joined #mlpack
12:33
ArunavShandeelya has joined #mlpack
12:33
Gman[m] has joined #mlpack
12:33
jonpsy[m] has joined #mlpack
12:33
GaborBakos[m] has joined #mlpack
12:37
jeffin143[m] has joined #mlpack
12:37
AbhishekNimje[m] has joined #mlpack
12:37
dkipke[m] has joined #mlpack
12:37
RishabhGoel[m] has joined #mlpack
12:37
MatheusAlcntaraS has joined #mlpack
12:37
SergioMoralesE[m has joined #mlpack
12:37
Kaushalc64[m] has joined #mlpack
12:37
AyushiJain[m] has joined #mlpack
12:37
DavidportlouisDa has joined #mlpack
12:37
DivyanshKumar[m] has joined #mlpack
12:37
ShivamNayak[m] has joined #mlpack
12:37
ABHINAVANAND[m] has joined #mlpack
12:38
VarunGupta[m] has joined #mlpack
12:38
OleksandrNikolsk has joined #mlpack
12:40
AyushSingh[m] has joined #mlpack
12:40
GopiManoharTatir has joined #mlpack
12:40
ShahAnwaarKhalid has joined #mlpack
12:40
fieryblade[m] has joined #mlpack
12:40
NippunSharmaNipp has joined #mlpack
12:40
kartikdutt18kart has joined #mlpack
12:40
M074AABGKS has joined #mlpack
12:40
swaingotnochills has joined #mlpack
12:40
ZanHuang[m] has joined #mlpack
12:40
zoq[m]1 has joined #mlpack
12:40
sailor[m] has joined #mlpack
12:41
AbdullahKhilji[m has joined #mlpack
12:54
bisakh[m] has joined #mlpack
12:54
huberspot[m] has joined #mlpack
12:54
RishabhGarg108Ri has joined #mlpack
12:54
Gauravkumar[m] has joined #mlpack
12:54
ManishKausikH[m] has joined #mlpack
12:54
ChaithanyaNaik[m has joined #mlpack
12:54
ronakypatel[m] has joined #mlpack
12:54
FranchisNSaikia[ has joined #mlpack
12:54
Saksham[m] has joined #mlpack
12:54
AyushKumarLavani has joined #mlpack
12:54
Prometheus[m] has joined #mlpack
12:54
fazamuhammad[m] has joined #mlpack
12:54
zoq[m] has joined #mlpack
12:54
TathagataRaha[m] has joined #mlpack
12:54
DillonKipke[m] has joined #mlpack
12:54
heisenbuugGopiMT has joined #mlpack
12:54
SiddhantJain[m] has joined #mlpack
12:54
LolitaNazarov[m] has joined #mlpack
12:54
HemalMamtora[m] has joined #mlpack
12:54
DirkEddelbuettel has joined #mlpack
12:54
M068AABMUC has joined #mlpack
12:57
Amankumar[m] has joined #mlpack
12:57
sdev_7211[m] has joined #mlpack
12:57
EricTroupeTester has joined #mlpack
13:01
<
kartikdutt18kart >
Sure.
13:11
rcurtin_matrixor has joined #mlpack
13:13
ServerStatsDisco has joined #mlpack
13:16
gitter-badgerThe has joined #mlpack
13:26
MatrixTravelerbo has joined #mlpack
13:27
halfy has joined #mlpack
13:31
Cadair has joined #mlpack
13:31
SlackIntegration has joined #mlpack
13:56
<
shrit[m] >
heisenbuug (Gopi M Tatiraju): would you send me the link for the meeting
13:57
<
heisenbuugGopiMT >
Is mlpack room busy?
13:58
<
heisenbuugGopiMT >
Okay...
14:00
<
shrit[m] >
I will join in a couple of seconds
14:01
<
heisenbuugGopiMT >
Yupp
14:04
<
zoq[m] >
room is free now
15:07
<
say4n[m] >
<jonpsy[m]> "we could have a chat after you'v" <- went through the method :)
15:08
<
jonpsy[m] >
@ me when you're free
15:16
<
jonpsy[m] >
btw, could someone please explain what this means?
15:38
<
heisenbuugGopiMT >
I can try. Do the S and A in the exponent stand for state and action?
15:38
<
heisenbuugGopiMT >
R must represent the real numbers. Can you send what each symbol means? Or is that what you want to know?
15:40
<
say4n[m] >
jonpsy: now?
15:49
<
jonpsy[m] >
free now right?
15:49
<
jonpsy[m] >
> I can try. Do the S and A in the exponent stand for state and action?
15:49
<
jonpsy[m] >
> R must represent the real numbers. Can you send what each symbol means? Or is that what you want to know?
15:49
<
jonpsy[m] >
i'll get back to this
15:50
<
jonpsy[m] >
so before we continue, lemme share some more details I've found
15:50
<
jonpsy[m] >
the paper says they're using a modified version of HER
15:50
<
jonpsy[m] >
aka Hindsight Experience Replay (A.2.3)
15:51
<
jonpsy[m] >
so the logic they're going for is
15:52
<
jonpsy[m] >
per episode, they generate a preference vector from some distribution
15:54
<
jonpsy[m] >
let's call it w_0; this is the preference we're meant to learn. While learning from the replay buffer, they generate N_w preference vectors for each sample in the buffer.
15:54
<
jonpsy[m] >
then they search for a w' among those N_w vectors such that the utilities roughly match: w_0 * Q =~ w' * Q
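A minimal sketch of the sampling step described above, assuming NumPy, a Dirichlet distribution for the preferences, and a made-up vector-valued Q. The helper names `sample_preferences` and `closest_preference` are hypothetical and do not come from the paper or its code. It draws N_w preference vectors and picks the w' whose scalarized utility w' · Q is closest to w_0 · Q, which is only one reading of the message above.

```python
import numpy as np

def sample_preferences(rng, n_w, n_objectives):
    """Draw n_w preference vectors; each is non-negative and sums to 1."""
    return rng.dirichlet(np.ones(n_objectives), size=n_w)

def closest_preference(w0, candidates, q):
    """Return the candidate w' whose utility w'.q is nearest to w0.q."""
    target = w0 @ q
    utilities = candidates @ q
    return candidates[np.argmin(np.abs(utilities - target))]

rng = np.random.default_rng(0)
w0 = sample_preferences(rng, 1, 2)[0]       # preference drawn for this episode
candidates = sample_preferences(rng, 8, 2)  # N_w = 8 (assumed value)
q = np.array([1.0, 3.0])                    # made-up multi-objective Q(s, a)
w_prime = closest_preference(w0, candidates, q)
```

The Dirichlet choice only guarantees that each preference is a valid convex weighting; the log does not say which distribution the authors use.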
16:02
<
jonpsy[m] >
you there?
16:03
<
jonpsy[m] >
i guess you understood what i was saying
16:03
<
jonpsy[m] >
I've written it in the docs as well, so you can always re-read it
16:04
<
jonpsy[m] >
wanna discuss on ```agent.learn()``` ?
16:04
<
jonpsy[m] >
hm, would you like to start..?
16:05
<
say4n[m] >
right, so the first thing they check is the size of the replay buffer, and if there are enough samples in it, they go on to produce minibatches for training
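The check described above can be sketched roughly as follows, using only the standard library. The function name `sample_minibatch`, the threshold (buffer length at least the batch size), and uniform sampling are assumptions for illustration, not the actual `agent.learn()` code.

```python
import random
from collections import deque

def sample_minibatch(buffer, batch_size, rng):
    """Return batch_size transitions, or None while the buffer is too small."""
    if len(buffer) < batch_size:
        return None
    return rng.sample(list(buffer), batch_size)

rng = random.Random(0)
buffer = deque(maxlen=100)                 # oldest transitions are dropped first
early = sample_minibatch(buffer, 4, rng)   # buffer is still empty here
for step in range(10):
    buffer.append((step, step + 1))        # made-up (state, next_state) pairs
minibatch = sample_minibatch(buffer, 4, rng)
```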
16:05
<
jonpsy[m] >
about the ```state_batch```
16:06
<
jonpsy[m] >
i think they're copying the same state right?
16:06
<
jonpsy[m] >
so, if state was [1]
16:06
<
say4n[m] >
lemme look at batchify
16:06
<
jonpsy[m] >
first, unsqueeze would be ```[[1]]```
16:06
<
say4n[m] >
weight_num is a scalar?
16:07
<
jonpsy[m] >
yes, it's the number of weights sampled from the distribution
16:09
<
say4n[m] >
jonpsy[m]: yes, so all x.s elements of the minibatch are unsqueezed with the map and then these are multiplied with weight_num
16:10
<
jonpsy[m] >
yep, so we'll get [ [1], [1], [1], ... ] with weight_num entries
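A small sketch of that batching step, with NumPy arrays standing in for the tensors. It assumes "multiplied with weight_num" means Python list repetition (which the `[ [1], [1], ... ]` message suggests) rather than scaling the values; the two-state minibatch is made up.

```python
import numpy as np

weight_num = 3
states = [np.array([1.0]), np.array([2.0])]   # made-up minibatch states

# np.newaxis plays the role of unsqueeze(0): shape (1,) becomes (1, 1)
unsqueezed = [s[np.newaxis, :] for s in states]

# Multiplying a Python list by an int repeats the list; nothing is scaled
repeated = unsqueezed * weight_num
state_batch = np.concatenate(repeated, axis=0)
```

With a single state `[1]` this gives exactly the `weight_num` copies of `[1]` mentioned above; with several states the whole minibatch is repeated in order.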
16:11
<
jonpsy[m] >
btw, about the `model` and `model_`??
16:15
<
jonpsy[m] >
hm, i thought they were learning & target networks
16:16
<
say4n[m] >
you mean y and Q from eq 6?
16:18
<
say4n[m] >
also one more question, do they reassign the values to be the same for model and model_ apart from when they do init?
16:18
<
jonpsy[m] >
I don't think so..
16:18
<
jonpsy[m] >
you mean making them equal, right?
16:19
<
jonpsy[m] >
hm, if they had done that, then it would've been a target and learning network, right?
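For reference, the usual learning/target pattern in DQN-style code keeps two copies of the network and periodically copies the learning weights into the target. Whether `model` and `model_` follow this pattern is not settled in this conversation; the sketch below uses a hypothetical `TinyNet` stand-in and an assumed sync interval.

```python
import copy
import numpy as np

class TinyNet:
    """Stand-in for a Q-network: a single weight matrix."""
    def __init__(self, rng):
        self.w = rng.normal(size=(2, 2))

rng = np.random.default_rng(0)
learning_net = TinyNet(rng)
target_net = copy.deepcopy(learning_net)   # equal only at initialisation
sync_every = 5                             # hypothetical sync interval
in_sync = []

for step in range(1, 11):
    learning_net.w += 0.1                  # pretend gradient update
    if step % sync_every == 0:
        target_net = copy.deepcopy(learning_net)   # hard update
    in_sync.append(bool(np.allclose(target_net.w, learning_net.w)))
```

If the two models are never made equal after init, as guessed above, the second one is not acting as a conventional target network.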
16:26
<
say4n[m] >
okay from the loss function definition in the code, tauQ should be our y
16:28
<
jonpsy[m] >
yeah the optimality symbol
16:28
<
jonpsy[m] >
wondering about the mask thingy...
16:29
<
say4n[m] >
mask for nonterminal states
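A non-terminal mask is typically used so that only transitions whose episode continues get a bootstrapped term in the target. The sketch below is a scalar simplification with made-up numbers, assuming NumPy; the code under discussion works with vector-valued Q and may differ in detail.

```python
import numpy as np

gamma = 0.99
rewards = np.array([1.0, 0.0, 2.0])
next_q_max = np.array([5.0, 4.0, 3.0])     # made-up max_a Q(s', a)
terminal = np.array([False, True, False])

nonterminal_mask = ~terminal
targets = rewards.copy()
# Bootstrap only where the episode continues; terminal rows keep y = r
targets[nonterminal_mask] += gamma * next_q_max[nonterminal_mask]
```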
16:30
<
jonpsy[m] >
btw the paper did mention using a target + learning network
16:30
<
jonpsy[m] >
and your idea that it's the same network, just detached, seems to be correct
16:31
<
jonpsy[m] >
so, I'm wondering where the two networks are
16:31
<
say4n[m] >
jonpsy[m]: you mean the model definitions?
16:32
<
say4n[m] >
jonpsy[m]: my hypothesis is that they mask and use all but the terminal batches because the terminal minibatch may be smaller than their mini-batch size.
16:33
<
jonpsy[m] >
about this
16:34
<
say4n[m] >
ah that is why they are probably masking it right?
16:34
<
jonpsy[m] >
Yeaaaha
16:34
<
jonpsy[m] >
makes sense right?
16:35
<
jonpsy[m] >
okay so that part is clear
16:35
<
jonpsy[m] >
> you mean the model definitions?
16:35
<
jonpsy[m] >
and their usage, yeah
16:36
<
say4n[m] >
^ here they define the `EnvelopeLinearCQN` model
16:38
<
jonpsy[m] >
so thats the model type right?
16:38
<
jonpsy[m] >
i was talkin about ```learningNetwork``` and ```targetNetwork```
16:39
<
say4n[m] >
yes that is the model, the forward propagation is implemented in the forward() method
16:39
<
say4n[m] >
as in the architecture
16:40
<
say4n[m] >
btw, you were asking about some {16, 32, 64, 32} from the paper right?
16:41
<
jonpsy[m] >
yeah i figured that out
16:41
<
jonpsy[m] >
<jonpsy[m]> "image.png" <- any idea what this means?
16:41
<
say4n[m] >
left side is probably the Q function parameterised on the preferences
16:46
<
say4n[m] >
I am not entirely sure, it looks something like saying Q_w(s,a) is a subset of a mapping that maps preferences to R^m for all states and actions. In other words, it says that given a preference vector, the Q maps to R^m for each element in SxA.
16:46
<
say4n[m] >
I think James can explain the notation better. 😅
16:47
<
jonpsy[m] >
ah, so the SxA exponent means for each state-action pair
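The exponent notation is standard set theory: B^A denotes the set of all functions from A to B. Assuming the expression in the image has the form guessed at above (the image itself is not part of this log, and Ω as the symbol for the preference space is an assumption), it can be unpacked as:

```latex
Q \in \left(\Omega \to \mathbb{R}^m\right)^{\mathcal{S} \times \mathcal{A}}
\quad\Longleftrightarrow\quad
Q : \mathcal{S} \times \mathcal{A} \to \left(\Omega \to \mathbb{R}^m\right),
\qquad Q(s, a)(w) \in \mathbb{R}^m .
```

In words: for every state-action pair (s, a), Q gives a function that takes a preference vector w and returns an m-dimensional value.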
16:47
<
jonpsy[m] >
wow, first time comin across this notation
16:48
<
jonpsy[m] >
<jonpsy[m]> "i was talkin about ```learningNe" <- thoughts on this?
16:49
<
say4n[m] >
jonpsy[m]: me neither :p
16:50
<
say4n[m] >
jonpsy[m]: none as of now
16:50
<
jonpsy[m] >
Thanks a ton! That's already a lot of help
20:55
<
shrit[m] >
rcurtin: The path for ensmallen is good now, but CMake is not able to extract the version
21:14
<
shrit[m] >
I resolved it 👍️
21:14
<
shrit[m] >
It passed. I think nothing is left on my side for this pull request
21:23
<
rcurtin[m] >
(I like that the element client displays confetti for a 🎉 😃)
21:25
<
rcurtin[m] >
I am watching the logs---it looks like there are some linking issues?
21:25
<
rcurtin[m] >
##[error]cf_main.obj(0,0): Error LNK2019: unresolved external symbol wrapper2_dgetrf_ referenced in function "public: static bool __cdecl arma::auxlib::solve_square_rcond<class arma::Mat<double> >(class arma::Mat<double> &,double &,class arma::Mat<double> &,struct arma::Base<double,class arma::Mat<double> > const &,bool)" (??$solve_square_rcond@V?$Mat@N@arma@@@auxlib@arma@@SA_NAEAV?$Mat@N@1@AEAN0AEBU?$Base@NV?$Mat@N@arma@@@1@_N@Z)
21:26
<
rcurtin[m] >
it looks like CMake is not linking against libarmadillo?
21:27
<
rcurtin[m] >
but, I guess, I should wait for the build to finish. I guess we could check `${ARMADILLO_LIBRARIES}` and `${MLPACK_LIBRARIES}` from CMake to see what is being used
21:31
<
shrit[m] >
Well, I did not touch armadillo at all
21:31
<
shrit[m] >
it might be an old error
21:31
<
rcurtin[m] >
yeah---I'll take a look at the build when it's done, and if it did fail, I can take a quick look and get some idea of what might be wrong
21:31
<
rcurtin[m] >
honestly I think getting the ensmallen path right was the hard part here 😃 nothing should be different about Armadillo, so maybe there is some tiny tweak or something but it should not be difficult to get it working