ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
< abernauer[m]> Ok, I should have the bindings built in the morning; my Armadillo version is outdated.
BlakJak888 has joined #mlpack
< BlakJak888> I am trying to compile the LSTM_STOCK_PREDICTION example but I am facing an error in Visual Studio 2019. The error is "C1060: compiler is out of heap space". I have successfully compiled the MNIST digit example, so I do not believe it is a problem with my setup. Has anyone come across this error before? Any advice?
OmarWagih1Gitter has quit [Ping timeout: 244 seconds]
OmarWagih1Gitter has joined #mlpack
SakshamRastogiG4 has quit [Ping timeout: 244 seconds]
SakshamRastogiG4 has joined #mlpack
BlakJak888 has quit [Remote host closed the connection]
ImQ009 has joined #mlpack
d1 has quit [Ping timeout: 272 seconds]
d1 has joined #mlpack
< shrit[m]> You can try adding `/Zm200` to the compiler options to resolve this error
gaulishcoin has quit [Quit: The Lounge - https://thelounge.chat]
< shrit[m]> However, it is not guaranteed to resolve this issue.
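For reference, one way to pass that option is through the project settings in Visual Studio (C/C++ -> Command Line -> Additional Options), or, if the example is configured with CMake, on the configure line; the build directory and configuration below are assumptions:

    cmake .. -DCMAKE_CXX_FLAGS="/Zm200"
    cmake --build . --config Release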
gaulishcoin has joined #mlpack
gaulishcoin has quit [Quit: The Lounge - https://thelounge.chat]
prioryofsion has joined #mlpack
prioryofsion has quit [Max SendQ exceeded]
casiofx991es has joined #mlpack
casiofx991es has quit [Max SendQ exceeded]
BlakJak888 has joined #mlpack
gaulishcoin has joined #mlpack
< BlakJak888> @shrit That was the first thing I tried; I actually used /Zm2000 and got the same error. I have since narrowed it down to a few lines of code that are generating the error.
< BlakJak888> This code gives an error:
< BlakJak888> if (bTrain || bLoadAndTrain)
< BlakJak888> {
< BlakJak888>   // RNN regression model.
< BlakJak888>   RNN<MeanSquaredError<>, HeInitialization> model(rho);
< BlakJak888>   if (bLoadAndTrain)
< BlakJak888>   {
< BlakJak888>     // The model will be trained further.
< BlakJak888>     // std::cout << "Loading and further training model..." << std::endl;
< BlakJak888>     data::Load(modelFile, "LSTMMulti", model);
< BlakJak888>   }
< BlakJak888> }
< BlakJak888> Commenting out data::Load(modelFile, "LSTMMulti", model); causes the code to compile successfully
< BlakJak888> However
radioactive11 has joined #mlpack
< BlakJak888> ... actually no however
< BlakJak888> It definitely seems that the offending line of code is the data::Load()
radioactive11 has quit [Remote host closed the connection]
< BlakJak888> In fact, when I put back the full code for the example, I had to comment out the two places where it tries to load the "LSTMMulti" model. Leaving the data::Save() call for "LSTMMulti" in place seemed to compile OK.
< BlakJak888> Could it be a problem with boost::serialization ?
< BlakJak888> I am using boost_1_74_0
< BlakJak888> Going back to 1_73_0 seemed to make no difference. Here are the compiler error lines:
< BlakJak888> D:\sdk\boost_1_73_0\boost\mpl\bind.hpp(539,1): fatal error C1060: compiler is out of heap space
< BlakJak888> 1>  The command exited with code 2.
< BlakJak888> 1>Done executing task "CL" -- FAILED.
< BlakJak888> 1>Done building target "ClCompile" in project "StockPriceLSTM.vcxproj" -- FAILED.
< BlakJak888> 1>
< BlakJak888> 1>Done building project "StockPriceLSTM.vcxproj" -- FAILED.
< BlakJak888> 1>
< BlakJak888> 1>Build FAILED.
< BlakJak888> The error is exactly the same with 1_74_0
hyyyyoui has joined #mlpack
< hyyyyoui> Hi
< hyyyyoui> I have a question about building mlpack
< hyyyyoui> If you do not want to build everything in the library, individual components of the build can be specified:
< hyyyyoui> Similar to the above, what is the build target for the ann method (if there is one)?
< hyyyyoui> Also, the normal make command does something like a clean build, right? Is there a way to avoid that? It takes a long time to build on my system.
< zoq> hyyyyoui: That refers to executables; since there is no executable for the ann part, it's not necessary. You can accelerate the build process if you disable the executables -> cmake -DBUILD_CLI_EXECUTABLES=OFF
< zoq> hyyyyoui: About clean, if you don't run 'make install' it will not clean the build folder.
< zoq> hyyyyoui: You can also disable the test build with "-DBUILD_TESTS=OFF"
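Putting zoq's suggestions together, a typical configure-and-build sequence could look like the following; the checkout path, build directory, and core count are assumptions:

    cd mlpack && mkdir build && cd build
    cmake -DBUILD_CLI_EXECUTABLES=OFF -DBUILD_TESTS=OFF ..
    make -j4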
< zoq> BlakJak888: I'll try to reproduce the issue later; it could be a serialization problem.
< zoq> BlakJak888: Btw. do you build against the latest master branch?
< BlakJak888> zoq: I followed the website instructions for a Windows Build from Source and downloaded the latest release from mlpack.org
< BlakJak888> 3.4.1
< BlakJak888> ensmallen 2.14.2
< zoq> BlakJak888: Can you build the git master branch?
< BlakJak888> armadillo 9.900.3
< BlakJak888> I can.
< BlakJak888> Where should I clone from?
< BlakJak888> ???
< zoq> BlakJak888: yes - https://github.com/mlpack/mlpack
< hyyyyoui> zoq: I was able to get better speed with your suggestions. Thanks.
< hyyyyoui> > Moving header files to include/mlpack/
< hyyyyoui> This step takes a lot of time after the make command. Anything I can try for this?
< zoq> hyyyyoui: You can avoid that step: don't call "make install" and link against the build folder.
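A rough sketch of what "link against the build folder" could look like, assuming mlpack was cloned to ~/mlpack and built in ~/mlpack/build; the program name, paths, and the exact set of link libraries are assumptions that depend on the setup:

    g++ my_program.cpp -o my_program \
        -I ~/mlpack/build/include \
        -L ~/mlpack/build/lib \
        -lmlpack -larmadillo -lboost_serialization -fopenmp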
BlakJak888 has quit [Remote host closed the connection]
< hyyyyoui> i dont get you
< hyyyyoui> the command I am running is make -j4
< hyyyyoui> do i change some flag in cmake?
gtank___ has quit [Ping timeout: 272 seconds]
vansika__ has quit [Ping timeout: 272 seconds]
vansika__ has joined #mlpack
BlakJak888 has joined #mlpack
gtank___ has joined #mlpack
< BlakJak888> zoq: I just recompiled the latest master mlpack code from GitHub. The error is the same, and it disappears when commenting out the lines data::Load(modelFile, "LSTMMulti", model); and data::Load(modelFile, "LSTMMulti", modelP);
< BlakJak888> This time the compiler failed with this error:
< BlakJak888> D:\sdk\boost_1_74_0\boost\serialization\singleton.hpp(156,1): fatal error C1060: compiler is out of heap space
< BlakJak888> different header, but same memory issue
< shrit[m]> BlakJak888: These errors are not related to mlpack; they are related to the Visual Studio compiler.
< shrit[m]> If you can use Ubuntu, mlpack should build perfectly. However, if you stay with VS, you need to find a way to increase the heap space for the compiler.
< shrit[m]> BlakJak888: Which version of Visual Studio are you using?
< BlakJak888> VS 2019 Version 16.7.5 (latest)
< BlakJak888> I need to use Windows
< shrit[m]> Ok, so mlpack is three parts: the library, the bindings, and the tests. Which part is failing for you?
< shrit[m]> You can compile mlpack using make mlpack -j4
< shrit[m]> Sorry you are using Windows
< shrit[m]> Looking at the failing code, it is probably the bindings that are failing
< BlakJak888> I don't follow you. MLPACK compiles fine.
< BlakJak888> The example fails
< BlakJak888> It only compiles when I comment out data::Load() calls that load back the LSTMMulti model
< BlakJak888> It looks like a boost::serialization issue
< BlakJak888> Refer to the earlier exchange between me and zoq
< shrit[m]> OK, I see, I missed your comments above
blakjak8881 has joined #mlpack
< shrit[m]> Sorry, it seems that boost serialization headers are consuming the entire memory in VS
blakjak8881 has quit [Quit: Leaving.]
blakjak8881 has joined #mlpack
BlakJak888 has quit [Remote host closed the connection]
blakjak8881 has quit [Client Quit]
blakjak888 has joined #mlpack
< blakjak888> shrit: I mentioned before that I am using Boost 1_74_0
< blakjak888> I tried 1_73_0 and it gave the same error
< blakjak888> I have managed to compile and run other examples. It seems the problem is specific to this LSTM example
< shrit[m]> We are currently working on replacing boost serialization with cereal
< shrit[m]> The LSTM examples are working fine; the problem is related to Visual Studio.
< rcurtin> abernauer[m]: looks like Armadillo isn't installed in /usr/local/lib/; probably you meant to use /usr/local/include/?
< abernauer[m]> yeah, I ran apt list for the dev version of Armadillo and it's installed, so it's probably the wrong directory
< yashwants19[m]> Hey [abernauer](https://matrix.to/#/@abernauer:matrix.org), maybe you can use this artifact created by GitHub Actions for installing the mlpack R bindings.
< yashwants19[m]> This artifact was created during the 3.4.1 release.
< yashwants19[m]> After extracting this artifact, you can directly install the R bindings using R CMD INSTALL mlpack_3.4.1.tar.gz
< abernauer[m]> Ok I got a 404 error on the link
< yashwants19[m]> Try this.
< yashwants19[m]> You can download mlpack_r_tarball
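Assuming the artifact downloads as a zip archive (which is how GitHub Actions usually packages artifacts), the install would look roughly like:

    unzip mlpack_r_tarball.zip
    R CMD INSTALL mlpack_3.4.1.tar.gz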
< abernauer[m]> ok will do
< yashwants19[m]> 👍
hyyyyoui has quit [Remote host closed the connection]
ImQ009 has quit [Quit: Leaving]
< rcurtin> shrit[m]: the microsoft deliverability team did the same thing for me... they "implemented mitigation" but provided no reason why anything was wrong
< shrit[m]> rcurtin: that is good to hear. Now I think if an email is sent to an Outlook address it will still go to spam
< rcurtin> if that happens, I will keep bothering them, because there is no good reason for that message to go to spam
< shrit[m]> cereal is building on my machine, but not on the build farm. Also, the heap issue is still present for the feed_forward test even though it has been separated into two files
< shrit[m]> I know, but they are going to say that they cannot guarantee that the message goes to the inbox
< rcurtin> how did you split it into two files?
< rcurtin> ideally you want to split it in a way that not all the headers are required in both files
< rcurtin> or at least split it such that some types are only used in one file and not another
< shrit[m]> I know, but most of the headers are necessary in both of them I think
< rcurtin> if you can remove any of them at all, it's likely to help
< shrit[m]> I will try a different splitting
< rcurtin> one more thing to possibly try is to compile without optimizations on Windows... maybe that is what's using up the RAM?
< rcurtin> even if it is just a workaround for now, I think it is okay---in the longer term, if we can remove more and more of boost from mlpack, this should help fix the RAM usage in visual studio
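With the Visual Studio generator the configuration is chosen at build time, so one way to try an unoptimized build is to build the Debug configuration; the mlpack_test target name here assumes mlpack's default test target:

    cmake --build . --config Debug --target mlpack_test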
< shrit[m]> There is nothing I can remove from the ffn tests
< shrit[m]> The one that is failing is the second one, which contains the kmeans headers.
< shrit[m]> The first one is not failing, as I have already removed the kmeans headers.
< rcurtin> does the second file only contain RBFNetworkTest?
< shrit[m]> no, I have split them equally,
< rcurtin> ok... try with RBFNetworkTest only in the second file... let's see what that does
< rcurtin> I have a few more ideas too... I am wondering if our use of `cotire` is causing problems on Windows
< rcurtin> but, see what that does first...
< rcurtin> I wonder if disabling it entirely could help, or maybe the COTIRE_MINIMUM_NUMBER_OF_TARGET_SOURCES could be set lower
< zoq> I wonder if it builds if we remove some layers from layer_types.hpp, like AlphaDropout, MiniBatchDiscrimination and SpatialDropout.
< shrit[m]> @zoq Do you mean removing them from the boost variant ?
< shrit[m]> Now only the VS15 build is failing due to this error
< zoq> yes, and also comment out the tests.
< rcurtin> I'm not sure what CMake is setting up MSVC to do with precompiled headers... if it's not making them, that could also be the source of the problem
< rcurtin> I *think* cotire does this automatically, but I'm not sure
< zoq> rcurtin: Great numbers, I really need to set up my env.
< shrit[m]> Great work
< rcurtin> admittedly, the original kernels were even slower than the CPU, so it's easy to do better :-D
< shrit[m]> rcurtin: Did you manually do all these graphs?
< rcurtin> I learned a lot about tools like cuda-memcheck, oclgrind, etc... my first implementations were really wrong and had all kinds of bugs
< rcurtin> shrit[m]: there's a utility script `create-plots.py` in the benchmarks/ directory that I used
< zoq> Looking at the numbers, they are close to peak bandwidth.
< rcurtin> close to it, although not as good at the smaller sizes. I think that might be kernel startup overhead, but I haven't dug into it
< rcurtin> even cuBLAS is slower at smaller sizes, so I figured "comparable to cuBLAS" was good enough :)
< zoq> So if I get this right, I first check out #6 and then #9?
< rcurtin> actually if you check out #9 directly it should work
< rcurtin> you can then go into `benchmarks/`, do `make accu` (or `make dot` or both), then run the benchmark:
< rcurtin> ./accu device_name trials n_elem out_csv
< rcurtin> for example, I ran like:
< rcurtin> ./accu rtx2080ti 5 1000000000 accu_results.csv
< rcurtin> I'm not sure what that will do on an OpenCL-only device though, you might need to comment out all cuda-related code in accu.cpp... I have never tried that
< zoq> Okay, will it pick the first device, if I have more than one?
< rcurtin> yeah, I think it will
< rcurtin> it can be configured to pick a certain device, but I think the call to coot::get_rt().init() might need to be modified; can't remember how
< zoq> okay, great will test it out first thing tomorrow
< rcurtin> it should output to stdout which device it chose
< rcurtin> if you have problems let me know, I have no idea how robust bandicoot is :-D
< rcurtin> it works on my 3 nearly identical nvidia devices... but that's all I have...
< zoq> I think I have an AMD card somewhere that I can test, but mainly use Nvidia here as well.
< rcurtin> I have some truly ancient Radeons somewhere, but I think they are from before OpenCL was even invented
< rcurtin> actually... my chromebook has an ARM Mali GPU, it might be interesting to try there
< rcurtin> but I have no idea if there is good support for actually using it
< shrit[m]> ARM Mali GPU, I never thought about doing GPU calculations on IoT devices :D
< rcurtin> I would be really interested to see if it would give any speedup... I guess we can find out
< shrit[m]> me too, I never thought about such hardware; usually people are only looking at GPU farm calculations, etc.
< shrit[m]> I am very happy that we have the Soft Actor-Critic algorithm. Many thanks to Nishant Kumar
< zoq> Found a Radeon HD 5970 :)
< zoq> shrit[m]: Yeah, Nishant did a great job.
< shrit[m]> I am trying to put it on my drones :D. I just want to see if I can get something out of it, or whether they will eventually crash into some mountain
< shrit[m]> 5970 this one is old
< zoq> Yeah, I don't use it anymore, but that's what I found on my shelf.
< zoq> Nice, I think you have some sort of simulation you can test it on?
< shrit[m]> Even though it is 10 years old, maybe it can beat my two Xeons
< shrit[m]> I have a full simulation framework that I have been working on for the last 3 years.
< zoq> Nice, if you have some results, I'd be interested to see them.
< shrit[m]> Even though it took me 3 years to write the framework, the only requirement is to have State and Action classes that are compatible with the API
< shrit[m]> Actually, I never used RL on my drones; I am still stuck with supervised learning
< shrit[m]> It is harder to train a multi-agent system, since all the environments we have are only compatible with one agent at a time
< shrit[m]> Once I have any results I will make a video and share it.
< zoq> The API is all Shangtong Zhang's work.
< shrit[m]> Yes, I know, I use the same State/Action mechanism in supervised learning. I took a lot of inspiration from these two classes last year. Now I am happy to see that I just have to add some callbacks to the train function.
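For context, mlpack's RL agents interact with an environment only through nested State and Action classes. Below is a minimal sketch of such a pair, loosely modeled on the CartPole environment bundled with mlpack; the HoverTask name, its dimensions, and the placeholder dynamics and reward are illustrative assumptions, not part of any real simulator.

    // A hypothetical environment exposing the State/Action interface that
    // mlpack's RL agents expect.
    #include <mlpack/prereqs.hpp>

    class HoverTask
    {
     public:
      // Continuous observation: wraps an arma::colvec and exposes Encode().
      class State
      {
       public:
        State() : data(dimension) { }
        State(const arma::colvec& data) : data(data) { }

        arma::colvec& Data() { return data; }
        const arma::colvec& Encode() const { return data; }

        // Number of entries in the encoded state vector.
        static constexpr size_t dimension = 4;

       private:
        arma::colvec data;
      };

      // Discrete action: an enum value plus the number of possible actions.
      class Action
      {
       public:
        enum actions { throttleDown, hold, throttleUp };
        Action::actions action;

        static const size_t size = 3;
      };

      // One environment step: fill nextState and return the reward.
      double Sample(const State& state, const Action& /* action */, State& nextState)
      {
        nextState = state;  // Placeholder dynamics.
        return 1.0;         // Placeholder reward.
      }

      // Starting state of an episode.
      State InitialSample() { return State(); }

      // Whether the episode has ended.
      bool IsTerminal(const State& /* state */) const { return false; }
    };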