<jonpsy[m]>
Eshaan Agarwal: where are we currently?
<EshaanAgarwal[m]>
<jonpsy[m]> "Eshaan Agarwal: where are we..." <- Almost wrote the maze generation code. I will test it after classes.
<EshaanAgarwal[m]>
When the Q-learning constructor is called, it in turn calls the environment's InitialSample() function. But the environment object isn't initialised at that point, so its private variables aren't initialised either, and I get an error (InitialSample() uses a loop that needs to index into my maze matrix).
<EshaanAgarwal[m]>
How can I deal with this ?
<EshaanAgarwal[m]>
This problem arises because we have a random maze, which makes the starting points random every time. I select one of the starting points of the particular maze for the episode using InitialSample()
<EshaanAgarwal[m]>
What could be a possible workaround ?
<EshaanAgarwal[m]>
In previous environments, the initial sample never depended on anything, so that worked.
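One possible workaround for the problem described above is to make sure the environment is in a valid state even when it is default-constructed, so that a call to InitialSample() from the agent's constructor always has a grid to index into. The following is only a minimal sketch under that assumption: `MazeSketch`, its members, and the 0 = free / 1 = wall encoding are hypothetical names for illustration and are not mlpack's actual environment API.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the maze environment; not mlpack's actual class.
class MazeSketch
{
 public:
  // The default constructor delegates to the full constructor and builds a
  // valid 1x1 maze, so InitialSample() never sees an empty grid.
  MazeSketch() : MazeSketch(1, 1, 0) { }

  MazeSketch(std::size_t rows, std::size_t cols, unsigned seed) :
      rows(rows), cols(cols), grid(rows * cols, 0), rng(seed)
  {
    assert(rows > 0 && cols > 0);
    // Every cell is free in this sketch; real maze generation would go here.
  }

  // Pick a random free cell (value 0) as the start of an episode.
  std::pair<std::size_t, std::size_t> InitialSample()
  {
    std::vector<std::size_t> freeCells;
    for (std::size_t i = 0; i < grid.size(); ++i)
      if (grid[i] == 0)
        freeCells.push_back(i);
    assert(!freeCells.empty());

    std::uniform_int_distribution<std::size_t> pick(0, freeCells.size() - 1);
    const std::size_t idx = freeCells[pick(rng)];
    return { idx / cols, idx % cols };  // (row, column) of the start cell.
  }

 private:
  std::size_t rows, cols;
  std::vector<int> grid;  // Row-major storage; 0 = free cell, 1 = wall.
  std::mt19937 rng;
};
```

With this layout, `MazeSketch().InitialSample()` is safe to call immediately, and a properly sized maze can replace the default one later without changing the agent.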
<EshaanAgarwal[m]>
jonpsy: zoq:
<jonpsy[m]>
you need not push it ig, since we're dealing with large-scale tests. It'll only slow down our test suite
<EshaanAgarwal[m]>
i am just pushing the maze generation code
<jonpsy[m]>
ok, can u show me some samples
<EshaanAgarwal[m]>
jonpsy[m]: sample of mazes ?
<jonpsy[m]>
+ lets get HER & others running.
<jonpsy[m]>
EshaanAgarwal[m]: whats the N, M you've tested it with
<EshaanAgarwal[m]>
jonpsy[m]: i mean i checked with 100 * 100 ! otherwise it works for any n and m.
<EshaanAgarwal[m]>
10*10
<jonpsy[m]>
for the start point to goal
<EshaanAgarwal[m]>
not HER though ! HER i checked with 10*10
<jonpsy[m]>
the num steps is root(M* N)
<jonpsy[m]>
right?
<EshaanAgarwal[m]>
jonpsy[m]: start point is random from any 0th cell, and using the dfs approach i have fixed the goal for each generated matrix
<EshaanAgarwal[m]>
jonpsy[m]: number of steps for what ?
<EshaanAgarwal[m]>
if you can jump on call ,i might as well explain what i did.
<EshaanAgarwal[m]>
i can show a sample of 10 * 10 generated maze if you want
<jonpsy[m]>
can't
<jonpsy[m]>
anyway, from what we discussed. The idea was to get a random start point, move X number of steps randomly. Then mark it as goal
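The idea above (random start, X random moves, mark the end cell as the goal) can be sketched roughly as follows. This is an illustration only: `RandomWalkGoal` and `Cell` are hypothetical names, the walk ignores walls, and moves that would leave the grid are simply skipped, which may differ from the DFS-based code under discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>

using Cell = std::pair<std::size_t, std::size_t>;  // (row, column)

// Take `steps` random moves from `start` inside a rows x cols grid and
// return the cell where the walk ends; that cell becomes the goal.
Cell RandomWalkGoal(std::size_t rows,
                    std::size_t cols,
                    Cell start,
                    std::size_t steps,
                    std::mt19937& rng)
{
  assert(start.first < rows && start.second < cols);
  std::uniform_int_distribution<int> dir(0, 3);

  Cell cur = start;
  for (std::size_t s = 0; s < steps; ++s)
  {
    // A move that would leave the grid is skipped (the walker stays put).
    switch (dir(rng))
    {
      case 0: if (cur.first > 0) --cur.first; break;           // up
      case 1: if (cur.first + 1 < rows) ++cur.first; break;    // down
      case 2: if (cur.second > 0) --cur.second; break;         // left
      default: if (cur.second + 1 < cols) ++cur.second; break; // right
    }
  }
  return cur;
}
```

Because skipped moves still count as steps, the goal can end up closer to the start than X cells, so X acts only as an upper bound on the start-to-goal distance.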
<EshaanAgarwal[m]>
jonpsy[m]: i have exactly the same
<jonpsy[m]>
so how's X being defined here
<EshaanAgarwal[m]>
jonpsy[m]: for now i have kept X as 0.5*n*m
<EshaanAgarwal[m]>
but we can change it of course
<jonpsy[m]>
<EshaanAgarwal[m]> "number of steps for what ?" <- .
<jonpsy[m]>
<jonpsy[m]> "the num steps is root(M* N)" <- .
<EshaanAgarwal[m]>
jonpsy[m]: ok i will change it to that
<EshaanAgarwal[m]>
apart from that all things work.
<jonpsy[m]>
Cool, with that done, can you try generating a 500 x 500 matrix
<EshaanAgarwal[m]>
jonpsy[m]: ok should i test the agent on it ?
<EshaanAgarwal[m]>
also what about exploration steps ?
<jonpsy[m]>
let's start with being able to create a matrix
<jonpsy[m]>
500x500, then 1k x 1k
<jonpsy[m]>
and time it
<EshaanAgarwal[m]>
jonpsy[m]: time the generation of matrix ?
<jonpsy[m]>
yh
<jonpsy[m]>
it wouldnt matter much, but im just interested to know
<EshaanAgarwal[m]>
ok how do we time it ?
<EshaanAgarwal[m]>
<EshaanAgarwal[m]> "ok how do we time it ?" <- jonpsy:
<EshaanAgarwal[m]>
<jonpsy[m]> "." <- i think the sqrt(n*m) steps we use to make the matrix will be too few ! sqrt(n*m) is even less than n + m. Some factor is necessary. for now i am keeping it at 3 * sqrt(n*m)
<EshaanAgarwal[m]>
This is more appropriate.
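To make the step counts discussed here concrete: by the AM-GM inequality, sqrt(n*m) <= (n + m) / 2, so sqrt(n*m) is always below n + m. For a 10*10 maze the candidates are sqrt(n*m) = 10, n + m = 20, 3 * sqrt(n*m) = 30 and 0.5 * n * m = 50; for 500*500 they are 500, 1000, 1500 and 125000. A small helper along these lines could compute the scaled value; `WalkSteps` and its `factor` parameter are hypothetical names, and rounding up is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Number of random-walk steps for an n x m maze: factor * sqrt(n * m),
// rounded up. The default factor of 3 is the value proposed in the chat.
std::size_t WalkSteps(std::size_t n, std::size_t m, double factor = 3.0)
{
  assert(factor > 0.0);
  const double area = static_cast<double>(n) * static_cast<double>(m);
  return static_cast<std::size_t>(std::ceil(factor * std::sqrt(area)));
}
```

Keeping the factor as a parameter makes it easy to try other values without touching the generation code.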
<EshaanAgarwal[m]>
also i am able to generate the matrices.
<jonpsy[m]>
ok
<zoq[m]>
<EshaanAgarwal[m]> "jonpsy:..." <- You can use Armadillo's tic/toc
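In Armadillo the tic/toc timer is the `arma::wall_clock` class: call `tic()` before the work and `toc()` afterwards to get the elapsed seconds as a double. To stay free of dependencies, the sketch below shows the same pattern with `std::chrono::steady_clock` from the standard library instead. `GenerateGrid` is a hypothetical placeholder for the real maze generator, and `TimeGeneration` is likewise an illustrative name.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical placeholder for the maze generator being timed.
std::vector<int> GenerateGrid(std::size_t rows, std::size_t cols)
{
  return std::vector<int>(rows * cols, 0);
}

// Returns the wall-clock seconds spent generating one rows x cols grid.
double TimeGeneration(std::size_t rows, std::size_t cols)
{
  const auto tic = std::chrono::steady_clock::now();
  const std::vector<int> grid = GenerateGrid(rows, cols);
  const auto toc = std::chrono::steady_clock::now();

  // Use the result so the generation call is not trivially discarded.
  assert(grid.size() == rows * cols);
  return std::chrono::duration<double>(toc - tic).count();
}
```

Calling `TimeGeneration(500, 500)` and then `TimeGeneration(1000, 1000)` gives the two timings asked for above; averaging over several calls would smooth out noise.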
<EshaanAgarwal[m]>
zoq[m]: Ok i will check this out !
<EshaanAgarwal[m]>
jonpsy: what else do I need to work on ?
<EshaanAgarwal[m]>
So we should set that manually and not have a formula for it
<EshaanAgarwal[m]>
jonpsy[m]: What should I set the exploration steps and max steps to ?
<EshaanAgarwal[m]>
I was currently running 50*50 and it has taken almost 1 hr to complete 80-100 iterations
<jonpsy[m]>
oh
<EshaanAgarwal[m]>
And that is even with fewer exploration steps than are needed for good performance, I guess.
<EshaanAgarwal[m]>
I think 200*200 is punching way over the belt. Like in the HER paper they said that even using HER they weren't able to solve the bit flipping environment with length more than 15
<EshaanAgarwal[m]>
Anything more than 50*50 seems a bit too much in my opinion, going by all the runs that I tried.
<zoq[m]>
The purpose of the test is to make sure HER works, so if it works on a reasonable number of different settings it’s fine.
<zoq[m]>
That said we can test it on a larger size outside of the test suite.
<EshaanAgarwal[m]>
zoq[m]: I think, as per what we discussed, 4*4 or 10*10 was ok for testing purposes. We were doing this to gauge the limits of our algo
<zoq[m]>
gives us some more insight into the correctness of the implementation
<zoq[m]>
Yes, makes sense.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: In fact, in the test I implemented it works on 10*10. And as per the PR tests they have all converged
<zoq[m]>
Good
<EshaanAgarwal[m]>
And I verified that independently in my system too
<zoq[m]>
And they converge for each run or 1 out of 5?
<EshaanAgarwal[m]>
zoq[m]: I will say 4/5 for the threshold I have set.
<EshaanAgarwal[m]>
But we check for only one run
<EshaanAgarwal[m]>
I set the threshold just to make sure that it's feasible and also not a one time thing
<EshaanAgarwal[m]>
EshaanAgarwal[m]: Meaning it needs to converge in only one out of the five.
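The "converge in at least one out of five runs" rule can be expressed as a small retry loop. This is a sketch only: `ConvergesInAnyRun` and `runTrial` are hypothetical names, and `runTrial` stands in for whatever trains the agent once and checks the result against the threshold.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Runs up to `maxRuns` independent trials and returns true as soon as one of
// them reports convergence. `runTrial` receives the zero-based run index.
bool ConvergesInAnyRun(const std::function<bool(std::size_t)>& runTrial,
                       std::size_t maxRuns = 5)
{
  assert(maxRuns > 0);
  for (std::size_t run = 0; run < maxRuns; ++run)
  {
    if (runTrial(run))
      return true;  // Stop early; later runs are not needed.
  }
  return false;
}
```

Stopping at the first success keeps the test suite fast in the common case, while the remaining runs absorb the occasional unlucky seed.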
<zoq[m]>
Yes was just curious how stable it is.
<EshaanAgarwal[m]>
zoq[m]: Yup it does.
<EshaanAgarwal[m]>
zoq[m]: Ok so how do you suggest that I should proceed !
<EshaanAgarwal[m]>
I have pushed the code with 10*10 in the PR. I am currently checking 50*50 but frankly it might need more exploration steps to perform and it's quite time consuming.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: 10*10 as the test. For test I think that is more than sufficient and serves the purpose well
<zoq[m]>
If you can let it run on the side, I would start on the documentation for HER, which includes a description of the method and an example; I guess we can use the maze.
<EshaanAgarwal[m]>
zoq[m]: Where do I have to write the description ? I think I have provided inline documentation with the code. I will go over it again today and push all the changes