#mlpack on 2022-10-31 — irc logs at libera.irclog.whitequark.org

2021-07-27 15:44 rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack

02:53 <jonpsy[m]> Eshaan Agarwal: how's the gdb progress? Is only your test failing or all the test failing??

03:06 <EshaanAgarwal[m]> <jonpsy[m]> "Eshaan Agarwal: how's the gdb..." <- I did not check other tests but I don't think they should fail. Also checking all tests takes some time. Let me do that now.

03:06 <jonpsy[m]> That should. be the first thing to do. Added test shouldn't break existing test

03:08 <EshaanAgarwal[m]> jonpsy[m]: Ok ! I will runn them and let you know but as far as I feel they shouldn't !

03:19 <EshaanAgarwal[m]> If I have to check the equality of two matrices.I can simply do 'matriceA == matriceB' but this will return a matrix object. How could I convert that into a bool ?

03:20 <EshaanAgarwal[m]> s/'/`/, s/'/`/

03:20 <EshaanAgarwal[m]> * If I have to check the equality of two matrices.I can simply do

03:20 <EshaanAgarwal[m]> `matriceA == matriceB`

03:20 <EshaanAgarwal[m]> but this will return a matrix object. How could I convert that into a bool ?

03:24 <EshaanAgarwal[m]> jonpsy: zoq: I think I have fixed the reward now.

03:24 <EshaanAgarwal[m]> I basically wanted to check whether the two binary vectors ( goal and state ) are equal or not.

03:47 <jonpsy[m]> Phew good one

03:47 <jonpsy[m]> > <@eshaanagarwal:matrix.org> If I have to check the equality of two matrices.I can simply do... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/00576fccf01e2e73f242e31029f89de829d061f9>)

03:48 <jonpsy[m]> AFAIK. I think a better way to compare this is using `std::equal`

03:49 <EshaanAgarwal[m]> jonpsy[m]: How to use that ?

03:49 * jonpsy[m] uploaded an image: (54KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/MboiNBCjJinnUOvZhmUgTrej/image.png >

03:49 <EshaanAgarwal[m]> jonpsy[m]: Yes ! For this right now I did this. Let me paste that

03:49 <EshaanAgarwal[m]> EshaanAgarwal[m]: `sum(nextState.Data() == transitionGoal.Data()) == nextState.Data().n_elem`

03:49 <jonpsy[m]> https://arma.sourceforge.net/docs.html, always keep this handy with you whenever `arma` doesn't work as expected.

03:50 <EshaanAgarwal[m]> jonpsy[m]: yeah that is why i needed to convert that mat object to bool ! but i couldnt find something direct to do that.

03:50 <EshaanAgarwal[m]> EshaanAgarwal[m]: so i did this

03:51 <jonpsy[m]> these are binary values, right?

03:51 <jonpsy[m]> 1 & 0. Right?

03:51 <EshaanAgarwal[m]> jonpsy[m]: yes and after equality checking it will be a matrix of binary values depicting indvidual element's quality

03:52 <jonpsy[m]> why use `arma::vec`

03:52 <EshaanAgarwal[m]> s/quality/equality/

03:52 <EshaanAgarwal[m]> jonpsy[m]: can you elaborate ?

03:53 <jonpsy[m]> okay. What's the `dtype` `arma::vec` stores?

03:54 <EshaanAgarwal[m]> jonpsy[m]: i got what you are asking but the issue with uvec was that in Train() it uses the state and action values and there we have used normal vec so it gives some kind of error. i had tried that

03:55 <jonpsy[m]> I thought it was all templatized

03:55 <jonpsy[m]> Nw, consider `approx_equal`

03:56 <jonpsy[m]> check the armadillo doc link, look for `approx_equal`. Set rel. tol to a predefined value

03:57 <EshaanAgarwal[m]> Okay !

03:57 <EshaanAgarwal[m]> I just did the test ! One of the other test is failing but I am not sure if I have made any changes to that. It's just not converging

03:57 <jonpsy[m]> Create a gmeet link

03:57 <EshaanAgarwal[m]> `RewardClippedAcrobotWithDQN... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/27a1b327aef75ea9434eb779180adef932517467>)

03:58 <EshaanAgarwal[m]> jonpsy[m]: just a minute

03:58 <jonpsy[m]> > <@eshaanagarwal:matrix.org> `RewardClippedAcrobotWithDQN... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/962f15cf061b84bf9c7ceddcb73488512337b4ea>)

03:58 <EshaanAgarwal[m]> jonpsy[m]: i did made some changes in that ! because i added that goal parameter in all functions but nothing else

03:58 <EshaanAgarwal[m]> so not sure why failing

04:00 <EshaanAgarwal[m]> jonpsy[m]: https://meet.google.com/pnp-rtjw-unz

05:12 * jonpsy[m] uploaded an image: (72KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/acPUnHVwCQLNbPEZqAKLaUwE/image.png >

05:12 <jonpsy[m]> Getting this weird error in DEBUG mode :?

06:04 * EshaanAgarwal[m] uploaded an image: (45KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/jQczVDlJIvjJINhWPzQOmtlc/Screenshot%20from%202022-10-31%2011-34-23.png >

06:06 <EshaanAgarwal[m]> EshaanAgarwal[m]: getting this error when i tried to compile mlpack test after i cloned it from mlpack repo master branch

07:01 <EshaanAgarwal[m]> <EshaanAgarwal[m]> "Screenshot from 2022-10-31 11-34..." <- strangely this happened when i built it in debug mode

10:16 <jonpsy[m]> is the issue persisting?

10:17 <EshaanAgarwal[m]> jonpsy[m]: Not right now ! But definitely there were some different warning when I used make with debug mode

11:05 kristjansson has quit [Ping timeout: 250 seconds]

11:07 kristjansson has joined #mlpack

11:07 <EshaanAgarwal[m]> <jonpsy[m]> "is the issue persisting?" <- Fixed the acrobat test ! It was a very trivial but small thing. Pushing changes sometime.

11:54 <jonpsy[m]> Heard you fixed the test? Eshaan Agarwal

11:54 <jonpsy[m]> HELL YEAH

11:55 <jonpsy[m]> what was teh trick?

11:56 <EshaanAgarwal[m]> jonpsy[m]: During the function calls, the next state which was passed in the function and the next state variable present in the function created a mess. So I took care of that.

11:56 <EshaanAgarwal[m]> It was updating values in wrong variables hence wasn't learning

11:56 <jonpsy[m]> Thought so

11:57 <jonpsy[m]> So all RL tests are passing?

11:57 <EshaanAgarwal[m]> jonpsy[m]: I guess !

11:57 <jonpsy[m]> Run all test

11:57 <jonpsy[m]> + your own test

11:57 <EshaanAgarwal[m]> jonpsy[m]: All RL right ?

11:58 <jonpsy[m]> yeah

11:58 <EshaanAgarwal[m]> Will do ! Hopefully they should work now 😀

11:58 <jonpsy[m]> I'm having a good feeling it will

11:58 <jonpsy[m]> so if HER passes as well; we need to benchmark, thats v. imp

11:59 <jonpsy[m]> a concrete graph & numbers. I want a report showing HER is consistently better

11:59 <EshaanAgarwal[m]> jonpsy[m]: For that we might require some other evironments ? 👀

12:02 <jonpsy[m]> Lets start with bit flipping

12:02 <EshaanAgarwal[m]> jonpsy[m]: For HER I think we should think over our thresholds for bit flipping environment. Nevertheless I have started the test. Only QLearningTest file is running ! Rest have converged and passed

12:03 <jonpsy[m]> You told me you've taken someone else work as inspiratoin for bit flip right?

12:03 <jonpsy[m]> Someone else had this code; what were their result?

12:04 <EshaanAgarwal[m]> jonpsy[m]: I mean it was a fairly easy environment. But I took reference from 2 repo. Let me check if they gave their result there or not

12:04 <jonpsy[m]> Also, pls mention the source of ur code in the file

12:04 <EshaanAgarwal[m]> jonpsy[m]: Sure ! I will do

12:10 <EshaanAgarwal[m]> <jonpsy[m]> "Also, pls mention the source..." <- Can I share the test file of one of the implementations here ?

12:24 <jonpsy[m]> sure

12:25 <jonpsy[m]> but generally paste the link of the source in code always. Any helpful doc. related to it is also welcome

12:27 <EshaanAgarwal[m]> jonpsy[m]: https://github.com/IntelLabs/coach/blob/master/rl_coach/tests/memories/test_hindsight_experience_replay.py i could find this one ! others dont have test.

12:27 <jonpsy[m]> what happened to the Qlearnign test

12:27 <EshaanAgarwal[m]> jonpsy[m]: going on ! SAC test is going right now ! it takes time

12:27 <EshaanAgarwal[m]> i checked in master branch too

12:31 <EshaanAgarwal[m]> jonpsy[m]: all passed except my new test

12:32 * EshaanAgarwal[m] uploaded an image: (38KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/UCXrKLJSaKicoNUftCWpSSjs/Screenshot%20from%202022-10-31%2018-01-39.png >

12:38 <jonpsy[m]> That's good news (relatively speaking)

12:38 <jonpsy[m]> so only thing remaining is setting up a good thresh

12:46 <EshaanAgarwal[m]> <jonpsy[m]> "so only thing remaining is..." <- Let me dig into it more. I think it's not performing as expected

13:09 <akhunti1[m]> Hi rcurtin

13:09 <akhunti1[m]> cmake_minimum_required(VERSION 3.4.1)... (full message at <https://libera.ems.host/_matrix/media/v3/download/libera.chat/3e872f520ba9365caabada58fcbdc93d5de814ee>)

13:10 <akhunti1[m]> This is my cmakelist.txt file .

13:10 * akhunti1[m] uploaded an image: (78KiB) < https://libera.ems.host/_matrix/media/v3/download/matrix.org/eszoLAozJwsgtLyBwUjuaXYF/image.png >

13:11 <akhunti1[m]> This is the error i am getting .and unfortunately i am not able to resolve the issue in cmakelist.txt file .

13:12 <akhunti1[m]> If time permit could you pls look into this .

13:19 <rcurtin[m]> akhunti1: I'm sorry, time doesn't really permit... the best I can do is give quick guesses. my suggestion would be to check that libarmadillo.so.10 exists in the system. once you have confirmed that, it looks like you are loading armadillo from inside Python? so you might want to check the exact path that Python is trying to use to load libarmadillo.so.10. unfortunately, I don't know enough about your situation to tell you precisely how to

13:19 <rcurtin[m]> do that, so you will probably have to do some investigation and reading

13:22 <akhunti1[m]> Thanks rcurtin for your time

13:22 <akhunti1[m]> I will try your suggesation.

13:24 <akhunti1[m]> rcurtin[m]: Yes I am loading armadillo from inside Python.

14:42 <EshaanAgarwal[m]> jonpsy: zoq: I had a question. For all the imaginary transitions that I am storing should I also add their reward to the total reward of the episode ?

14:55 <jonpsy[m]> Wouldn't make sense imo

14:55 <jonpsy[m]> Your cricket skills isn't valuable during football & vice-versa

14:55 <jonpsy[m]> * Nobody asks how fast you swim in a cricket match*

14:56 <EshaanAgarwal[m]> jonpsy[m]: But kind of that's the whole point right ? We give positive reward even when we have not achieved goal according to goal strategy

14:56 <EshaanAgarwal[m]> Just asking

14:56 <jonpsy[m]> To guide the process yes, but it isn't the ultimate aim

14:56 <EshaanAgarwal[m]> Becausw then no matter how ! The reward for each episode would always be 1

14:56 <EshaanAgarwal[m]> EshaanAgarwal[m]: At Max*

14:57 <jonpsy[m]> It's good that its adaptable but we shouldn't forget the REAL goal here

14:57 <jonpsy[m]> perhaps we could increase iter

14:58 <EshaanAgarwal[m]> Because final state is the goal we need to achieve for which we give it a reward in bit flipping case. In real life scenarios like picking of something by robotic arm we still could have calculated reward by calculating distance between the place where robot placed and the target place coordinates

14:59 <EshaanAgarwal[m]> jonpsy[m]: Even then we are giving reward in the epsiode when it reached final goal right ? So for that particular transition only we would hav positive reward in the whole episode

14:59 <jonpsy[m]> EshaanAgarwal[m]: Yes, but the entire point is that we don't engineer reward. That it should be sparse & just a "Win" "no Win" like scenario

14:59 <EshaanAgarwal[m]> * calculated reward a good reward for our agent by calculating

15:00 <EshaanAgarwal[m]> jonpsy[m]: Yes agreed ; I am just saying that then just for the but flip case the reward threshold would 1 ! What could be gauged to see HER performance is the number of steps the trained agent takes to reach it goal

15:01 <jonpsy[m]> We should only get `1` reward when we actually achieve our goal

15:02 <EshaanAgarwal[m]> jonpsy[m]: Yes ! And in our case we also don't have any intermediate reward ! The only performance HER should be bringing to the table should be that it solves to the goal faster then others which might not even solve it

15:02 <jonpsy[m]> yep

15:03 <jonpsy[m]> HER engineers rewards for itself, so env by itself should just give `1` only when it reaches true goal.

15:03 <EshaanAgarwal[m]> EshaanAgarwal[m]: Okay then I might need to see the test again ! Because threshold wise it's working fine.

15:03 <jonpsy[m]> First, do it freely. See how many iter it takes for it to achieve real goal

15:04 <EshaanAgarwal[m]> jonpsy[m]: Yeah actually we were not checking that before. Will take a look at it and see then

15:05 <jonpsy[m]> so it was reachin real goal?

15:06 <jonpsy[m]> we can try:

15:06 <jonpsy[m]> a) Use another policy, see how fast it reaches the goal (if it does at all)

15:06 <jonpsy[m]> b) The code you've taken this from, see how iter does it take. We could use that as thresh

15:07 <EshaanAgarwal[m]> > <@jonpsy:matrix.org> we can try:

15:07 <EshaanAgarwal[m]> > b) The code you've taken this from, see how iter does it take. We could use that as thresh

15:07 <EshaanAgarwal[m]> > a) Use another policy, see how fast it reaches the goal (if it does at all)

15:07 <EshaanAgarwal[m]> yeah we can do it that way ! let me take a look at it ! All i am pointing was that setting threshold during train is not neede here

15:09 <EshaanAgarwal[m]> jonpsy[m]: a episode runs till it get to terminal state ! that can only achieve when a) it reached its goal in that case reward is 1 or b> exploration steps got over in which case it gets 0

15:10 <EshaanAgarwal[m]> EshaanAgarwal[m]: fthreshold for reward*

15:10 <jonpsy[m]> EshaanAgarwal[m]: so are we sure we fall in a)?

15:10 <EshaanAgarwal[m]> jonpsy[m]: it will reflect in episode return ! in most of the episode when its exploring we get 1 has reward

15:11 <jonpsy[m]> so per episode, it is able to clear the true goal

15:11 <jonpsy[m]> in most cases i.e.

15:11 <EshaanAgarwal[m]> instead of setting reward threshold for training ! lets give it a suitable samples and then see whether it reached the goal in good number of steps or not

15:12 <EshaanAgarwal[m]> jonpsy[m]: yes as per isterminal function! let me point you the code for it

15:12 <jonpsy[m]> EshaanAgarwal[m]: yeah thats crap

15:12 <EshaanAgarwal[m]> EshaanAgarwal[m]: https://github.com/mlpack/mlpack/pull/3283/files#diff-7d507d43784594b5d221fb77f3750feadae8191533cd4f4417769c9f8911dbeeR184

15:13 <jonpsy[m]> Or what we could do is, it should be able to collect true reward K% of the total episodes

15:13 <jonpsy[m]> i.e. success_rate

15:13 <EshaanAgarwal[m]> jonpsy[m]: yeah so we were doing it kinda wrong !

15:14 <EshaanAgarwal[m]> jonpsy[m]: that could be done too ! i will look into it and revert back to you in some time

15:14 <jonpsy[m]> Ok. Also once you've finalised `successRate` or whicever criteria you like. Run it on the original code repo first, then yours, then other policies

15:14 <jonpsy[m]> & report the results

15:15 <EshaanAgarwal[m]> jonpsy[m]: i will have to see if that repo works or not ! we can try other enviornments or maybe SAC. since that also takes replay

15:15 <jonpsy[m]> We need to make sure its sparse

15:15 <jonpsy[m]> & bit flip is super easy to implement

15:16 <EshaanAgarwal[m]> jonpsy[m]: yeah ! SAC then because writing environment is an issue.

15:16 <jonpsy[m]> What's the ETA of this?

15:17 <EshaanAgarwal[m]> jonpsy[m]: benchmarking ? if all goes well by tomorrow

15:17 <EshaanAgarwal[m]> meanwhile can we get ppo merged ?

15:17 <jonpsy[m]> its fixed?

15:17 <jonpsy[m]> did zoq push his changes?

15:17 <EshaanAgarwal[m]> jonpsy[m]: he hasnt

15:18 <EshaanAgarwal[m]> thats why i asked that we could wrap up maybe one implementation too side by side.

15:19 <EshaanAgarwal[m]> i can help in doing the wrapping up things

15:19 <jonpsy[m]> Nw, focus on getting HER up & running

15:20 <EshaanAgarwal[m]> jonpsy[m]: It's up I guess ! Let's see how it performs.

15:55 <EshaanAgarwal[m]> zoq: jonpsy: can you pls reopen the PPO pull request ?

15:59 <zoq[m]> <EshaanAgarwal[m]> "zoq: jonpsy: can you pls..." <- I will open a new PR.

18:35 <coatless[m]> Seems like some messages are going through to Slack and others aren't :'(

18:35 <coatless[m]> Sorry for the re-open/closed PR spam.

18:36 <coatless[m]> rcurtin: want a handle with the python bindings on conda?

19:07 <coatless[m]> handle <-> help. 0 coffee today :'(

20:50 <rcurtin[m]> coatless: yeah, I have been back and forth with the Slack bridging folks, but the main maintainer is on vacation right now

20:50 <rcurtin[m]> I actually think I am getting closer with the Windows build, but if you have any specific ideas, I'm all ears. It appears that the install path for Windows is wrong, but I need to figure out what "right" is, then I can make a patch and it should be good to go