<jonpsy[m]>
We'll have to prove it works here (again, with numbers & graphs). Then I'll have one final look at the PR & it's okay from my side to merge HER
<jonpsy[m]>
s/400/NxM/, s/x/(N/, s/400/may not be equal to M)/
<EshaanAgarwal[m]>
jonpsy[m]: You want this for the test in mlpack too ?
<jonpsy[m]>
a) We could keep N, M random natural numbers.
<jonpsy[m]>
b) Correct, but we're generating maze randomly & we'll train for *each* of these mazes. So if you generate K mazes, you'll have K profiles.
<EshaanAgarwal[m]>
> <@jonpsy:matrix.org> a) We could keep N, M random natural numbers.
<EshaanAgarwal[m]>
> b) Correct, but we're generating maze randomly & we'll train for *each* of these mazes. So if you generate K mazes, you'll have K profiles.
<EshaanAgarwal[m]>
>
<EshaanAgarwal[m]>
The issue with N and M is that we would then have to increase the steps and limits accordingly
<EshaanAgarwal[m]>
We have that fixed
<jonpsy[m]>
Fair point
<EshaanAgarwal[m]>
The steps for exploring etc. are fixed, and they determine everything
<EshaanAgarwal[m]>
I say that for a simple test this is fine !
<EshaanAgarwal[m]>
For proving its performance we can pick a reasonable size, maybe 6*6 or 8*8, test it separately, and maybe work on random mazes this time
<jonpsy[m]>
So, you're fixing N, M
<jonpsy[m]>
Again, it need not be square
<EshaanAgarwal[m]>
Can you explain point b? Ideally, for each episode we train on, we can have a random maze for it to solve (provided we write the code for that), and it keeps training over episodes until it either reaches the threshold average reward or runs out of max episodes.
<EshaanAgarwal[m]>
jonpsy[m]: Yeah we can do that ! But fixing it is essential to set the exploration steps and total step limits
<jonpsy[m]>
EshaanAgarwal[m]: Fine
<jonpsy[m]>
EshaanAgarwal[m]: Whatever you coded till now, was for one maze right?
<EshaanAgarwal[m]>
jonpsy[m]: It solves only one particular maze in each episode and tries to get better at it
<EshaanAgarwal[m]>
As of now
<jonpsy[m]>
over the episodes, the maze is fixed?
<EshaanAgarwal[m]>
jonpsy[m]: Yes as of now ! The maze that you provided ! 4*4 one
<jonpsy[m]>
if `run()` is your code till now. What we do is
<EshaanAgarwal[m]>
> <@jonpsy:matrix.org> if `run()` is your code till now. What we do is
<EshaanAgarwal[m]>
> `[run(maze) for maze in random_maze]`
<EshaanAgarwal[m]>
Thing is that the q impl code is fixed! If we change anything in it, it will have to be reflected for other environments.
<EshaanAgarwal[m]>
What I suggest is that we generate a random maze for each episode
<EshaanAgarwal[m]>
that it trains on
<jonpsy[m]>
Not sure 1 episode is enough for it to converge
<jonpsy[m]>
I'm not telling you to write a code that we'll commit
<EshaanAgarwal[m]>
jonpsy[m]: No, but it will have 100s of episodes of different mazes! So hopefully it will learn from them to solve any randomly generated maze
<jonpsy[m]>
a separate cpp file where you'll do this. You need not push this to our code
<jonpsy[m]>
EshaanAgarwal[m]: Ah, you're going that way.
<jonpsy[m]>
Okay, I think this might be good. From what I know, it (HER, that is) should be able to learn from each maze
<EshaanAgarwal[m]>
jonpsy[m]: Suggesting. Although I will say for the test purposes what we have done till now is okay
<EshaanAgarwal[m]>
jonpsy[m]: So should I try this ? Will have to see how it goes.
<jonpsy[m]>
So, every episode it'll be a new maze
<jonpsy[m]>
and since HER's entire point is to adapt to multiple goals
<jonpsy[m]>
it should be able to pick real reward after K episodes
<EshaanAgarwal[m]>
jonpsy[m]: Hopefully yes.
<jonpsy[m]>
& consistently maintain it, if not increase
<jonpsy[m]>
We'll need a big loop for this, again, with proper graphs comparing each policy
<EshaanAgarwal[m]>
jonpsy[m]: Yeah I mean will see how it runs and set reward threshold accordingly
<jonpsy[m]>
for now, don't set thresholds
<jonpsy[m]>
just let it run for like 1k or 2k episodes
<jonpsy[m]>
store the data, and plot average reward
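A minimal sketch of the "store the data, and plot average reward" step, assuming the training loop already produces one total reward per episode. The helper names `moving_average` and `save_rewards`, the window size, and the CSV layout are illustrative assumptions, not part of mlpack.

```python
import csv
import io


def moving_average(rewards, window=100):
    """Return the mean of the last `window` rewards at each episode."""
    averages = []
    total = 0.0
    for i, reward in enumerate(rewards):
        total += reward
        if i >= window:
            # Drop the reward that just fell out of the window.
            total -= rewards[i - window]
        averages.append(total / min(i + 1, window))
    return averages


def save_rewards(rewards, stream, window=100):
    """Write episode index, reward, and running average as CSV rows."""
    writer = csv.writer(stream)
    writer.writerow(["episode", "reward", "avg_reward"])
    averages = moving_average(rewards, window)
    for episode, (reward, avg) in enumerate(zip(rewards, averages)):
        writer.writerow([episode, reward, avg])


if __name__ == "__main__":
    # Hypothetical rewards; a real run would log 1k-2k episodes.
    buffer = io.StringIO()
    save_rewards([0.0, 0.0, 1.0, 1.0], buffer, window=2)
    print(buffer.getvalue())
```

The CSV can then be loaded by any plotting tool to compare the average-reward curves of the policies side by side.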
<EshaanAgarwal[m]>
jonpsy[m]: Hopefully 500-700 episodes should do! I will also see how many exploration steps need to be set for good performance
<EshaanAgarwal[m]>
jonpsy[m]: How do we do that ?
<jonpsy[m]>
keep the matrix big, let's not leave things for random luck here
<jonpsy[m]>
I'm thinking 1k x 1k
<EshaanAgarwal[m]>
jonpsy[m]: I think we are setting it way too much
<jonpsy[m]>
haha
<jonpsy[m]>
700 x 256, 256 x 700, 700 x 700
<EshaanAgarwal[m]>
jonpsy[m]: How about 200*250 something ? Since it's a random maze too
<jonpsy[m]>
For starters you could do that
<jonpsy[m]>
but let's be ambitious here
<EshaanAgarwal[m]>
Also can you provide me any reference for the random maze algorithm ?
<EshaanAgarwal[m]>
jonpsy[m]: Sure. For mlpack_test purposes too should we keep it this big ?
<jonpsy[m]>
which is why i said you need not push this code
<jonpsy[m]>
You can create a thread in our mlpack github Issues & store the benchmarks
<EshaanAgarwal[m]>
Also how do you store the test results ?
<jonpsy[m]>
Figure it out
<EshaanAgarwal[m]>
jonpsy[m]: Okay ! I will see it.
<jonpsy[m]>
Would you be able to do it today?
<EshaanAgarwal[m]>
EshaanAgarwal[m]: A little help here please 😅.
<EshaanAgarwal[m]>
jonpsy[m]: I actually have vivas! They will end by 6 pm. I will try to complete it by tomorrow morning.
<jonpsy[m]>
EshaanAgarwal[m]: Honestly, I have no idea myself. But if I had to start, I'd loop over the matrix and set each cell to a random value from `{0, -1}`. Finally, I'd pick a random point `(x, y)` and set `matrix[x][y] = +1`
<jonpsy[m]>
The only problem here is, we might end up with a wall
<EshaanAgarwal[m]>
jonpsy[m]: We will have to check for walls too right !
<EshaanAgarwal[m]>
jonpsy[m]: Yeah ! That's why I asked.
<jonpsy[m]>
We could have another go to detect & remove walls.
<jonpsy[m]>
That'll be O(2N) to find & correct
<jonpsy[m]>
but it's a one-time cost, so it's fine
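A rough sketch of the random-fill idea, assuming -1 is a wall, 0 is free, +1 is the goal, and the agent starts at `(0, 0)`. Instead of a second pass that removes walls, this version swaps in a simpler approach: check reachability with a breadth-first search and regenerate the maze if the goal is walled off. All function names and the `wall_prob` parameter are hypothetical.

```python
import random
from collections import deque


def random_fill_maze(rows, cols, wall_prob, rng):
    """Fill each cell with -1 (wall) or 0 (free), then place one goal (+1)."""
    maze = [[-1 if rng.random() < wall_prob else 0 for _ in range(cols)]
            for _ in range(rows)]
    maze[0][0] = 0  # assumed start cell, always kept free
    candidates = [(r, c) for r in range(rows) for c in range(cols)
                  if (r, c) != (0, 0)]
    goal = rng.choice(candidates)
    maze[goal[0]][goal[1]] = 1
    return maze, goal


def goal_reachable(maze, start=(0, 0)):
    """Breadth-first search from `start` over non-wall cells."""
    rows, cols = len(maze), len(maze[0])
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if maze[r][c] == 1:
            return True
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in seen and maze[nr][nc] != -1):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False


def generate_solvable_maze(rows, cols, wall_prob=0.3, seed=None):
    """Regenerate random mazes until the goal is reachable from the start."""
    rng = random.Random(seed)
    while True:
        maze, goal = random_fill_maze(rows, cols, wall_prob, rng)
        if goal_reachable(maze):
            return maze, goal


if __name__ == "__main__":
    maze, goal = generate_solvable_maze(6, 8, seed=0)
    for row in maze:
        print(" ".join(f"{cell:2d}" for cell in row))
```

The reachability check visits each cell at most once, so it is linear in the number of cells, but the retry loop may run several times when `wall_prob` is high.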
<EshaanAgarwal[m]>
jonpsy[m]: How is it one time
<EshaanAgarwal[m]>
We are making a new maze in each episode
<EshaanAgarwal[m]>
jonpsy[m]: Ohkay ! Another thing can we fix the dimension of matrices to generate ?
<EshaanAgarwal[m]>
jonpsy[m]: Okay I will try my best. Another thing ! What about PPO ?
<jonpsy[m]>
Let's focus on getting one thing delivered. We'll move to PPO then
<EshaanAgarwal[m]>
EshaanAgarwal[m]: I am just saying this because of the paucity of time we have !
<EshaanAgarwal[m]>
jonpsy[m]: Okay. I will try to wrap up HER. But I would suggest that we at least try to merge PPO with the test side by side, because gauging the performance of HER can take time if we keep increasing the expectations.
<EshaanAgarwal[m]>
jonpsy: I was giving some thought to randomly generating a maze for each episode! I think that wouldn't work. If we give the agent a different maze each time, what exactly can it learn, even with HER? It never knows the nearby cells, so it will not be able to make any correlation. It won't be able to learn and will just take random actions all the time.
<EshaanAgarwal[m]>
Point of HER was to achieve multiple goals in a same setting.
<EshaanAgarwal[m]>
I suggest we generate a random maze but keep it the same across all training episodes.
<EshaanAgarwal[m]>
EshaanAgarwal[m]: And what we can do is require that it converges in 5 independent runs, where each run can have a different maze.
<EshaanAgarwal[m]>
Anything more than that would be out of the scope.
<fieryblade[m]>
<EshaanAgarwal[m]> "Also can you provide me any..." <- We can use Minimum Spanning Tree or a random walk sort of algorithm
<fieryblade[m]>
<EshaanAgarwal[m]> "I suggest we can make a random..." <- We can fix the maze for certain number of episodes then change it at intervals.
<EshaanAgarwal[m]>
fieryblade[m]: Actually we call the new maze at the start of each episode.
<EshaanAgarwal[m]>
<fieryblade[m]> "We can use Minimum Spanning Tree..." <- Can you please elaborate more on this ! If there is any sample it would help 😅
<EshaanAgarwal[m]>
fieryblade[m]: We can try this ! I will see what we can do.
<fieryblade[m]>
I'll find some resources for it. But in simple sense, for MST, we just create a graph where each cell wall is an edge with random weight. We just find a MST for it.
<fieryblade[m]>
The second one is easier as we just do a random walk like Depth First Search with no consideration for direction except that one node is visited only once.
<fieryblade[m]>
s/one/a/
<EshaanAgarwal[m]>
> <@fieryblade313:matrix.org> I'll find some resources for it. But in simple sense, for MST, we just create a graph where each cell wall is an edge with random weight. We just find a MST for it.
<EshaanAgarwal[m]>
> The second one is easier as we just do a random walk like Depth First Search with no consideration for direction except that one node is visited only once.
<EshaanAgarwal[m]>
I have a question ! Let's say on doing DFS we find that we couldn't reach the place ! Then how do we rectify the maze ?
<EshaanAgarwal[m]>
Also I will look at the first approach in some time! I have a viva right now.
<fieryblade[m]>
So with both these algorithms, the maze is continuous. Thus there will not be a region which is completely disconnected.
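The MST idea can be sketched as follows, assuming Kruskal's algorithm with a union-find structure (the chat does not say which MST algorithm to use). Every wall between two adjacent cells becomes an edge with a random weight, and the walls on the resulting spanning tree are opened as passages. The function name `kruskal_maze` is hypothetical.

```python
import random


def kruskal_maze(rows, cols, seed=None):
    """Return the passages (pairs of adjacent cells) of a random spanning tree."""
    rng = random.Random(seed)

    # One weighted edge per wall between horizontally/vertically adjacent cells.
    edges = []
    for r in range(rows):
        for c in range(cols):
            if r + 1 < rows:
                edges.append((rng.random(), (r, c), (r + 1, c)))
            if c + 1 < cols:
                edges.append((rng.random(), (r, c), (r, c + 1)))
    edges.sort()  # cheapest walls are considered first

    parent = {(r, c): (r, c) for r in range(rows) for c in range(cols)}

    def find(cell):
        # Find the set representative, halving the path as we go.
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    passages = []
    for _, a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # The two cells are not yet connected: open the wall between them.
            parent[root_a] = root_b
            passages.append((a, b))
    return passages


if __name__ == "__main__":
    print(len(kruskal_maze(4, 4, seed=0)), "passages for a 4x4 maze")
```

A spanning tree over `rows * cols` cells always has `rows * cols - 1` edges and connects every cell, which is why no region ends up disconnected.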
<EshaanAgarwal[m]>
fieryblade[m]: Ok so you want to create the maze using DFS or this ?
<EshaanAgarwal[m]>
EshaanAgarwal[m]: Not like first initialise randomly and then rectify it
<fieryblade[m]>
Both the algorithms require random values. In MST we initialize weights randomly and in randomized DFS we walk randomly.
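A minimal sketch of the randomized DFS variant, assuming a rectangular grid of cells and a start at `(0, 0)`. It uses an explicit stack rather than recursion, so large mazes do not hit Python's recursion limit. The function name `dfs_maze` is an illustrative assumption.

```python
import random


def dfs_maze(rows, cols, start=(0, 0), seed=None):
    """Walk randomly, visiting each cell once; return the passages opened."""
    rng = random.Random(seed)
    visited = {start}
    stack = [start]
    passages = []
    while stack:
        r, c = stack[-1]
        unvisited = [
            (nr, nc)
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in visited
        ]
        if not unvisited:
            stack.pop()  # dead end: backtrack to the previous cell
            continue
        nxt = rng.choice(unvisited)  # random direction, no preference
        visited.add(nxt)
        passages.append(((r, c), nxt))
        stack.append(nxt)
    return passages


if __name__ == "__main__":
    print(len(dfs_maze(4, 4, seed=0)), "passages for a 4x4 maze")
```

Because the walk backtracks until every cell has been visited, the passages again form a spanning tree, so any goal cell is reachable from the start.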
<EshaanAgarwal[m]>
fieryblade[m]: I think I have got the gist! If you could find any resource then please do share whenever possible! I am thinking of going with random DFS to first get a path (for this we fix the starting and goal cell) and then fill the others with -1
<jonpsy[m]>
fieryblade: so you're saying to use a path-finding algo to generate the actual path
<jonpsy[m]>
<EshaanAgarwal[m]> "I think I have got the jist ! If..." <- The actual path should be filled with 0, the remaining can be randomly filled with either 0/-1. So that there can be multiple ways to go from Start => End
<fieryblade[m]>
<jonpsy[m]> "fieryblade: so you're sayin..." <- in a sense, but we are just randomly walking and not doing any pathfinding
<fieryblade[m]>
Eshaan Agarwal: I was not able to find any code on this, only some videos on how it works.
<EshaanAgarwal[m]>
<jonpsy[m]> "The actual path should be filled..." <- How should I proceed then ?
<EshaanAgarwal[m]>
<fieryblade[m]> "Eshaan Agarwal: I was not..." <- I will check this out ! So should I go with MST ? Pls suggest
<EshaanAgarwal[m]>
jonpsy: zoq
<EshaanAgarwal[m]>
<jonpsy[m]> "The actual path should be filled..." <- i think we should keep the goal cell fixed and then create a path from it.
<fieryblade[m]>
<EshaanAgarwal[m]> "I will check this out ! So..." <- Anything is fine, but I think randomized DFS will be simpler.
<EshaanAgarwal[m]>
fieryblade[m]: could there be an issue of stack overflow there ?
<EshaanAgarwal[m]>
also how do we get the path ! i am not able to visualise properly
<EshaanAgarwal[m]>
fieryblade: we were thinking of a maze using (-1 for wall, 0 for path and 1 for goal). I am not sure how we would create that using the method you mentioned (it is more of a graph approach, with edges between cells that can be removed to make the maze).
<EshaanAgarwal[m]>
i was thinking this
<EshaanAgarwal[m]>
EshaanAgarwal[m]: let's do a random DFS for 80-90 steps in a maze! Put 0 in all of them, and the rest can be 0 or -1
<fieryblade[m]>
<EshaanAgarwal[m]> "fieryblade: we were thinking a..." <- The graph generated from the algos can easily be converted to (-1, 0, 1). You already have 0 and 1 in the graph. Now if two adjacent cells don't have an edge between them, you know it's a -1.
<fieryblade[m]>
s/algos/algorithms/
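One possible way to do that conversion, assuming the graph is given as a list of passages between adjacent cells. The sketch expands an R x C cell graph into a (2R+1) x (2C+1) matrix so that each wall gets its own matrix entry; this expanded layout and the name `passages_to_grid` are assumptions, not something agreed in the chat.

```python
def passages_to_grid(rows, cols, passages, goal):
    """Convert a cell graph into a matrix of -1 (wall), 0 (path), 1 (goal)."""
    # Start with walls everywhere, then carve out cells and passages.
    grid = [[-1] * (2 * cols + 1) for _ in range(2 * rows + 1)]
    for r in range(rows):
        for c in range(cols):
            grid[2 * r + 1][2 * c + 1] = 0  # every cell is walkable
    for (r1, c1), (r2, c2) in passages:
        # The entry between two adjacent cells is their midpoint.
        grid[r1 + r2 + 1][c1 + c2 + 1] = 0
    goal_r, goal_c = goal
    grid[2 * goal_r + 1][2 * goal_c + 1] = 1
    return grid


if __name__ == "__main__":
    # Two cells side by side, joined by one passage; goal in the right cell.
    for row in passages_to_grid(1, 2, [((0, 0), (0, 1))], goal=(0, 1)):
        print(row)
```

Adjacent cells with no passage between them keep the -1 at their midpoint, which matches the rule described above.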
<EshaanAgarwal[m]>
fieryblade[m]: -1 at where ?
<EshaanAgarwal[m]>
can we jump on quick call maybe ?
<EshaanAgarwal[m]>
zoq: jonpsy Can we have a meet at 10 PM IST today? Regarding deliverables of the project and the next steps for HER (regarding the maze and its generation or benchmarking)
<jonpsy[m]>
fieryblade: you there in the meet?
<EshaanAgarwal[m]>
<fieryblade[m]> "We can fix the maze for certain..." <- Changing the maze during training again misses the point, in my view! HER can help in a multi-goal situation, but in the same environment. If we change the maze in the middle of training, it will go back to square one.
<EshaanAgarwal[m]>
HER is supposed to be efficient at taking us to any reachable place in the same environment. So we can definitely change the goal cell in the same environment, or maybe after it has trained or converged we can ask it to reach a different goal cell than the one we trained it for, but not an entirely different maze.
<EshaanAgarwal[m]>
jonpsy[m]: He was. If you are free. Maybe we can do it right now too.
<jonpsy[m]>
allow?
<EshaanAgarwal[m]>
jonpsy[m]: What ? 😅
<EshaanAgarwal[m]>
Ohkay just a minute even I left the meet.
<fieryblade[m]>
We are joining the meet, can you allow us to join it
<zoq[m]>
<fieryblade[m]> "We are joining the meet, can you..." <- Do you still want to have the 10 pm ist meeting?
<akhunti1[m]>
otherwise it is throwing me this error
<akhunti1[m]>
Could you please give me some suggestions on how I can compile with the given specification? I mean mlpack 3.1.1, Armadillo 9.300.2, ensmallen 2.16.2 and Boost 1.67
<rcurtin[m]>
the error message you are showing is not at compilation time; it is at runtime
<rcurtin[m]>
also, I think that you should not have any problem using armadillo 10 instead of 9 with mlpack 3.1.1
<rcurtin[m]>
in fact, I am not sure that the error message you are showing has anything to do with mlpack
<akhunti1[m]>
Sorry, it is at runtime
<akhunti1[m]>
The file libarmadillo.so.10 comes from Armadillo 10.
<akhunti1[m]>
To run mlpack 3.1.1, it is searching for libarmadillo.so.10 inside the Docker container.
<rcurtin[m]>
when you say "it is searching", I am not sure what "it" is. I think that it is not mlpack that is looking for libarmadillo.so.10 based on the output you pasted
<rcurtin[m]>
unless you are using the Python bindings and that is what happens when you run `import mlpack`?
<rcurtin[m]>
I'm not learning seldon, I don't have time
<akhunti1[m]>
No no
<akhunti1[m]>
I am just sharing it with you.
<akhunti1[m]>
Basically, I am trying to integrate mlpack with Seldon, so that I can containerize an mlpack C++ model and it will create an HTTP endpoint for deployment, like Flask and Django do for Python-based machine learning models.
<akhunti1[m]>
Because for a C++-based machine learning model we cannot create a REST API
<akhunti1[m]>
for prediction.
<rcurtin[m]>
right, that seems like a reasonable thing to do; but I am thinking that your problem has to do with linking somewhere. I don't have deep advice or specific suggestions, other than that you should carefully inspect each thing that you are compiling to make sure it is linked the way you expect
<akhunti1[m]>
<rcurtin[m]> "unless you are using the..." <- yes
<rcurtin[m]>
all I can say is, building the Python bindings correctly and getting them to link correctly can be a really awful and tedious affair... you will want to inspect the exact command line being used to compile them when you build mlpack
krushia_ has quit [Quit: Konversation terminated!]