ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
HARSHCHAUHAN[m] has quit [Ping timeout: 268 seconds]
TaapasAgrawalGit has quit [Ping timeout: 268 seconds]
Cadair has quit [Ping timeout: 268 seconds]
KhizirSiddiquiGi has quit [Ping timeout: 268 seconds]
KritikaGuptaGitt has quit [Ping timeout: 268 seconds]
LolitaNazarov[m] has quit [Ping timeout: 268 seconds]
NishantKumarGitt has quit [Ping timeout: 268 seconds]
siddhant_jain[m] has quit [Ping timeout: 268 seconds]
PulkitgeraGitter has quit [Ping timeout: 268 seconds]
rishishounakGitt has quit [Ping timeout: 268 seconds]
AvikantSrivastav has quit [Ping timeout: 268 seconds]
SeverinoTessarin has quit [Ping timeout: 268 seconds]
aryan-026[m] has quit [Ping timeout: 268 seconds]
RishabhGoel[m] has quit [Ping timeout: 268 seconds]
AbhishekNimje[m] has quit [Ping timeout: 268 seconds]
RV784Gitter[m] has quit [Ping timeout: 268 seconds]
siddhant_jain[m] has joined #mlpack
RV784Gitter[m] has joined #mlpack
HARSHCHAUHAN[m] has joined #mlpack
NishantKumarGitt has joined #mlpack
KhizirSiddiquiGi has joined #mlpack
Cadair has joined #mlpack
KritikaGuptaGitt has joined #mlpack
TaapasAgrawalGit has joined #mlpack
AbhishekNimje[m] has joined #mlpack
PulkitgeraGitter has joined #mlpack
SeverinoTessarin has joined #mlpack
AvikantSrivastav has joined #mlpack
rishishounakGitt has joined #mlpack
RishabhGoel[m] has joined #mlpack
aryan-026[m] has joined #mlpack
LolitaNazarov[m] has joined #mlpack
_slack_mlpack_19 has quit [Ping timeout: 268 seconds]
_slack_mlpack_19 has joined #mlpack
ImQ009 has joined #mlpack
EmmanuelLykosGit has quit [Ping timeout: 268 seconds]
EmmanuelLykosGit has joined #mlpack
Samyak has joined #mlpack
Anton59 has joined #mlpack
< Anton59> Does mlpack have equivalent functionality of Tensorflow, Pytorch, Scikit-learn ?
< Anton59> A second question: Is the library going to continue to be supported in the next 3-5 years?
Samyak has quit [Remote host closed the connection]
< zoq> Anton59: mlpack implements some neural network functions, but not as many as TF or PyTorch, so at this point I wouldn't directly compare it with TF and PyTorch; it's similar to scikit-learn, meaning our focus is not necessarily on neural networks, but more on implementing cutting-edge methods and making them fast.
< rcurtin[m]> Hi @Anton59, I don't think we are going anywhere anytime soon :)
< Anton59> ok, because I am thinking of working with it in my research, since I like C++ and am not a big fan of Python.
< zoq> If I remember right, Ryan correct me if I'm wrong, mlpack is now 13 years old?
< zoq> started in 2007, I think
< rcurtin[m]> yeah, 2007 :)
< zoq> rcurtin[m]: Did you switch to another client?
< Anton59> so if you compare it with scikit-learn, does it cover most of the functionality? maybe with a little bit more typing, but nevertheless ...
< rcurtin[m]> @Anton59 it sounds like mlpack might be a good choice in this case---I think it is the most mature of the "stable" C++ machine learning toolkits, and development is pretty active
< rcurtin[m]> @zoq yeah, I finally made the switch over to matrix a handful of weeks ago for all my chat clients
< rcurtin> I still keep IRC open though since it's what our logging system uses ;)
< rcurtin[m]> @Anton59 I would say so; there are some places where mlpack has some methods that scikit does not, and also vice versa, but I think for the "typical" algorithms like random forest, linear models, decision trees, k-means, and this type of thing, we should be pretty much at parity
< rcurtin[m]> in some cases, mlpack is faster (especially with k-means), in part due to the C++ implementation, and in part due to the choice of better algorithms under the hood
< rcurtin[m]> sometimes, scikit has its algorithms actually implemented in C though (via Cython) and when they do this the performance is quite comparable to mlpack
< Anton59> thanks, then I can count on it + boost + a DB library and I have everything
< Anmol2001Gitter[> can anyone help me with this error while testing ?
< Anmol2001Gitter[> ~/mlpack-3.4.2/build$ bin/mlpack_test -t KNNTest
< Anmol2001Gitter[> Test setup error: no test cases matching filter or all test cases were disabled
< rcurtin[m]> @Anton59 awesome
< rcurtin[m]> @Anmol2001 we moved our testing framework to Catch recently, so the syntax is a little different: try `bin/mlpack_test "[KNNTest]"`
< rcurtin[m]> did you find documentation somewhere that has the `-t` in it?
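A quick sketch of the new invocation, assuming the Catch2-based `mlpack_test` from git master; the tag goes in square brackets, and Catch2 can also list what is available:

```sh
bin/mlpack_test "[KNNTest]"   # run all test cases tagged [KNNTest]
bin/mlpack_test --list-tests  # Catch2: list all test cases
bin/mlpack_test --list-tags   # Catch2: list all available tags
```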
< zoq> rcurtin[m]: Interesting, so I guess the IRC mlpack user base lost another one.
< rcurtin[m]> zoq: sadly it might be true :-D
< rcurtin[m]> oh nice! I didn't know about that file
< Anmol2001Gitter[> @rcurtin yes, i found `-t` in the documentation
< rcurtin[m]> Anmol2001 (Gitter): could you point out where? would you be interested in updating it?
< Anmol2001Gitter[> sure
< Anmol2001Gitter[> here in doxygen
< rcurtin[m]> ah, ok, so that will be in the file `doc/guide/build.hpp`
< Anmol2001Gitter[> yeah
< Anmol2001Gitter[> current status is as above
< rcurtin[m]> are you running on the current git master version?
< Anmol2001Gitter[> actually i followed the doxygen documentation for setup
< Anton59> Wanted to report an issue I discovered. When I try this: cmake -D DEBUG=ON -D PROFILE=ON ../
< rcurtin[m]> Anmol2001 (Gitter): okay, but what version of mlpack are you using?
< Anmol2001Gitter[> 3.4.2
< Anton59> If I try to compile with debugging and profiling, my Linux machine freezes when I do make -j4 with 4 cores on my CPU. If I do just cmake ../ then things are fine.
anmol has joined #mlpack
< zoq> Anton59: Just build with fewer jobs, make -j1 or make -j2; mlpack uses a lot of memory for the build (we make heavy use of templates)
< rcurtin[m]> Anmol2001 (Gitter): sorry, I did not know you were using that version. at that time we were in the middle of the transition, so it looks like KNNTest was a part of `mlpack_catch_test`. if you update to git master then all the tests are in `mlpack_test`
< zoq> Anton59: Also we are working on reducing the memory usage.
< Anmol2001Gitter[> @Anton59 try with -j3 or -j2; sometimes memory gets exhausted when using more cores
< Anton59> Thank you, I thought this might be the case. Will try.
< anmol> welcome brother
anmol has quit [Ping timeout: 245 seconds]
Anton59 has quit [Remote host closed the connection]
Anton20 has joined #mlpack
< Anton20> Even with make -j 2, when configuration is done with cmake -D DEBUG=ON -D PROFILE=ON ../, the memory gets exhausted and the machine freezes. Because most of it is in headers, maybe I don't need to build with these options turned on. Or at least I can turn off the profiling and just try with debugging.
< RishabhGarg108Gi> @Anton20 , If you want to use mlpack just for research work and do not plan to contribute, then you can build with -DBUILD_TESTS=OFF. Building the tests is really heavy, so if you skip them, you can pretty smoothly build with just -j1 and it won't take much time.
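Putting the advice above together, a minimal low-memory configuration might look like this (DEBUG, PROFILE, and BUILD_TESTS are the options discussed in this conversation):

```sh
cd mlpack-3.4.2/build
cmake -D DEBUG=ON -D PROFILE=ON -D BUILD_TESTS=OFF ../
make -j1   # build serially to keep peak memory usage down
```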
< siddhant_jain[m]> Anton20 just use make
< Anton20> make is like make -j 1 I hope
< Anmol2001Gitter[> i think by default it uses one core
< iamarchit123Gitt> with 8GB RAM, if you are building only with DEBUG and PROFILE, I have seen the default -j value suffice, but if you are building the tests then I have found -j1 is necessary. 2 of the tests (i think one is ann) consume exceptional amounts of memory, and when they ran in parallel my Ubuntu machine crashed. I came back after coffee only to see it dead; at the time of the crash only 30 MB of RAM was free :)
Anton20 has quit [Ping timeout: 245 seconds]
Antonhr has joined #mlpack
< Antonhr> I appreciate everybody's comments, but if you see that I log in to the chat with different names, that is because my machine crashes when I try to compile with debug on, even if I exclude the tests. The only way everything compiles is if I do cmake ../ and make.
< rcurtin[m]> Antonhr: how much RAM do you have available? and do you need to compile the tests?
< rcurtin[m]> sorry that things are painful right now, it is definitely a known pain point that we are working on
< Antonhr> So maybe putting a warning on the website where this statement appears (cmake -D DEBUG=ON -D PROFILE=ON ../), along with RAM expectations, would be better. I have:
< Antonhr> anton@anton-Precision-7720:~/mlpack-3.4.2/build$ free -h
                         total        used        free      shared  buff/cache   available
           Mem:           15Gi       2.0Gi        10Gi       531Mi       2.3Gi        12Gi
           Swap:          2.0Gi          0B       2.0Gi
< Antonhr> 16 GB RAM
< rcurtin> so you have like 10GB RAM free and mlpack won't build with only one core?
< rcurtin> I am still not really understanding; are you building the tests? do you need to build all of the tests?
< Antonhr> Do not need to build the tests. Even excluding the tests, as soon as I enable debugging (and, worse yet, profiling) the build does not finish. It progressively grinds Ubuntu 20.04 to a screeching halt.
< rcurtin[m]> so you are configuring with `cmake -DBUILD_TESTS=OFF`, and then typing `make`; is it building bindings for other languages?
< Antonhr> Yes sir.
< rcurtin[m]> do you need bindings for other languages?
< Antonhr> No need, just C++ pure library,
< rcurtin[m]> ok; if that's all you need, you can disable the bindings... `-DBUILD_PYTHON_BINDINGS=OFF -DBUILD_JULIA_BINDINGS=OFF -DBUILD_R_BINDINGS=OFF -DBUILD_CLI_EXECUTABLES=OFF`
< rcurtin[m]> or, you can just make the `mlpack` target: `make mlpack`
< rcurtin[m]> after `make mlpack`, the library will be in `lib/` and the headers will be in `include/` under your build directory
< Antonhr> I will try this to speed things up, thanks.
< rcurtin[m]> I'd suggest just reconfiguring CMake to disable all the bindings
< rcurtin[m]> I guess I forgot `-DBUILD_GO_BINDINGS=OFF` too
< Antonhr> All of them, no preference for Go.
< rcurtin[m]> I forget how many different languages we have bindings for now :)
< Anmol2001Gitter[> i just made a PR, have a look when you're free: https://github.com/mlpack/mlpack/pull/2770
< Antonhr> Yes, but the reason for me is simpler - C++.
< rcurtin[m]> 👍️
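Collecting the options from this exchange, a C++-only configuration would look roughly like the following; the final `make mlpack` builds only the library target, leaving the result in `lib/` and the headers in `include/` under the build directory:

```sh
cmake -DBUILD_TESTS=OFF \
      -DBUILD_CLI_EXECUTABLES=OFF \
      -DBUILD_PYTHON_BINDINGS=OFF \
      -DBUILD_JULIA_BINDINGS=OFF \
      -DBUILD_GO_BINDINGS=OFF \
      -DBUILD_R_BINDINGS=OFF ../
make mlpack   # library ends up in lib/, headers in include/
```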
< RishabhGarg108Gi> @rcurtin , many times it's tedious to write all these options to disable all the other bindings. Would it be a good idea to have another cmake option -DBUILD_BINDINGS that can be used to enable or disable all bindings with just one option?
< rcurtin[m]> RishabhGarg108 (Gitter): hmm, yeah! that could be nice; alternatively, maybe the better idea would be to disable all the `BUILD_X_BINDINGS` options by default
< RishabhGarg108Gi> Yep, it would be better to disable all of them by default. I will open an issue for this :+1:
< rcurtin[m]> thanks!
Antonhr has quit [Remote host closed the connection]
< Anmol2001Gitter[> @rcurtin i am interested in updating doc/guide/build.hpp, as i myself faced the issue :)
< rcurtin[m]> yes, please, go ahead
PulkitgeraGitter has quit [Ping timeout: 268 seconds]
robotcatorGitter has quit [Ping timeout: 268 seconds]
Antonhr has joined #mlpack
PulkitgeraGitter has joined #mlpack
robotcatorGitter has joined #mlpack
Antonhr has quit [Remote host closed the connection]
< jeffin143[m]> > make is like make -j 1 I hope
< jeffin143[m]> @anton By default it will take as many cores as you have, that is 4
< rcurtin[m]> @jeffin143 are you sure on that one? my understanding is that `make` will only use one core if you don't specify a `-j` option
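For reference, GNU make runs a single job at a time unless `-j` is passed:

```sh
make              # GNU make default: one job at a time
make -j           # no limit on job count; can easily exhaust RAM
make -j"$(nproc)" # one job per available core
```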
< rcurtin[m]> that's very strange, make with only one core fails but two succeeds?
< iamarchit123Gitt> but in my case make -j2 failed and make -j1 passed with BUILD_TESTS on; plain make fails
< Anmol2001Gitter[> the issue in this link is a little similar
< rcurtin[m]> are any of you using `ninja` to build?
< iamarchit123Gitt> has anyone tried enabling swap memory to increase the effective RAM?
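For anyone who wants to try that, one common way to add swap on Linux looks like this (the size and path here are only an example):

```sh
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
free -h   # verify the new swap is active
```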
< jeffin143[m]> > that's very strange, make with only one core fails but two succeeds?
< jeffin143[m]> Ryan, that was probably 1 year ago; I have been using a good computer since then
< jeffin143[m]> But I definitely remember that happened
< rcurtin[m]> interesting; I know that ninja will build in parallel by default but I don't know of any systems that will default `make` to multiple cores
< jeffin143[m]> Maybe I used to reverse-search and hit tab, and once I might have used make -j4, and thus...
< jeffin143[m]> Not sure
< jeffin143[m]> I will test again
< Anmol2001Gitter[> @rcurtin i have made the required documentation changes with the test commands
< HARSHCHAUHAN[m]> Hi everyone, I am trying to set up the mlpack env on my local machine.
< HARSHCHAUHAN[m]> After this command "cmake -D DEBUG=ON -D PROFILE=ON ../"
< HARSHCHAUHAN[m]> I am getting an error.
< HARSHCHAUHAN[m]> can anyone help me out please!!
< rcurtin[m]> the error message shows that the version of cereal on your system is too old; install a newer version and try again 👍️
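Since cereal is header-only, one way to get a newer version than the distribution provides is to copy the headers from upstream; installing into /usr/local/include is an assumption here:

```sh
git clone https://github.com/USCiLab/cereal.git
sudo cp -r cereal/include/cereal /usr/local/include/
```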
< zoq> rcurtin[m]: I did a quick benchmark between mlpack (master) and mlpack (3.3.2) to see if the memory usage increased; I patched out the nn stuff for both, and also disabled all bindings.
< zoq> mlpack master uses 3686364 kbytes and mlpack 3.3.2 uses 3157104 kbytes
< zoq> build time also increased from 27:02.16 to 31:50.38
< zoq> I'll do the same with the nn code now.
< zoq> I think we added some code in between besides catch2 and cereal, but 530 MB seems strange.
< rcurtin[m]> fascinating! do you want to try 3.4.2 also? that has (some) catch2 but not cereal
< zoq> Yes, let's test 3.4.2 as well
< rcurtin[m]> it's also possible that the benefit of removing boost won't be seen until we remove all of it; so if we are still using visitor and math in places, those still may be including a huge amount
< rcurtin[m]> let me see if I can get gcc or clang to output some information on how long it spends in each phase of compilation; that could be helpful too
< zoq> But somehow we increased the memory footprint.
< rcurtin[m]> right, I have some ideas for why that could be, but let me get a breakdown of compilation time first
< zoq> Would be nice to include that in our CI; I'm using 'time' right now, and it would be easy to add just to get some numbers.
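A sketch of the 'time' approach, assuming GNU time is installed at /usr/bin/time (the shell builtin does not report memory); peak memory of the build, children included, appears in kbytes:

```sh
/usr/bin/time -v make -j1 mlpack 2>&1 | grep "Maximum resident set size"
```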
< rcurtin[m]> agreed, maybe there is some way for Jenkins to track it?
< zoq> I'll look into it.
ImQ009 has quit [Quit: Leaving]
< rcurtin[m]> -ftime-report on `adaboost_test.cpp` and `ann_layer_test.cpp` on mlpack master: https://pastebin.com/tY5HSvZm
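One way to produce such reports for a whole build, assuming gcc (the flag writes its output to stderr):

```sh
cmake -DCMAKE_CXX_FLAGS="-ftime-report" ../
make mlpack_test 2> ftime-report.log
```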
< rcurtin[m]> `ann_layer_test.cpp` takes almost 8GB of RAM by itself!! :-O
< zoq> wow, insane
< rcurtin[m]> let's see what that was on 3.3.2 (I realize that the test is probably a little bit smaller, and there were fewer layers, but still, I want to see if it is a huge jump or something)
< zoq> Okay, if it's that bad, I'll put https://github.com/mlpack/mlpack/issues/2647 on the top of my list.
< rcurtin[m]> on 3.3.2: https://pastebin.com/J0LHRCDR
< rcurtin[m]> that's 6.3GB, so still quite a lot but it's a good bit less than master
< rcurtin[m]> I want to see if I can figure out why it's taking so much memory
< rcurtin[m]> I can see that many of the other files that don't use the NN toolkit usually use ~a few hundred MB
< zoq> I guess just adding one layer to LayerTypes will have a huge effect.
< rcurtin[m]> I remember from a long time ago when I profiled with `-ftime-report` that a massive amount of time was being spent just parsing includes
< rcurtin[m]> however, the reports I'm seeing now don't seem to have massive amount of time spent in parsing
< rcurtin[m]> so I am trying to dig further and understand what's going on there
< rcurtin[m]> I had believed for a long time that simply removing `#include`s would be very helpful (this is, in part, the reason I've thought removing boost would be super helpful), but the results I am seeing now make me wonder if my thoughts were incorrect
< zoq> I guess you are still right, but it might not have the effect we wanted to see.
< rcurtin[m]> so I'm trying again removing cotire, which I think will use precompiled headers; maybe that is doing a really good job of reducing parse time?
< rcurtin[m]> oh wow---without cotire, suddenly simply parsing for `adaboost_test.cpp` takes 850MB and 5 seconds, whereas with cotire it takes 0.4s and 45MB... so things are way better than they *could* be! :)
< zoq> this is crazy, so cotire did an awesome job, I guess I haven't appreciated it enough.
< rcurtin[m]> and on `ann_layer_test.cpp`, without cotire parsing takes 12 seconds and 1.0GB, with cotire it takes 4 seconds and 330MB (probably because some headers are not precompiled there)
< rcurtin[m]> I agree, I think I underappreciated cotire too!
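For illustration, this is roughly what cotire automates, done by hand with gcc's precompiled headers (file names here are hypothetical):

```sh
g++ -x c++-header heavy_includes.hpp -o heavy_includes.hpp.gch
g++ -c some_test.cpp   # gcc now loads heavy_includes.hpp.gch instead of reparsing
```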
< rcurtin[m]> I found this tool: https://github.com/mikael-s-persson/templight
< rcurtin[m]> I think I'll play with it and see if I can make a callgraph for template instantiations; maybe this will give useful information too
< zoq> looks like a promising tool
< rcurtin[m]> it'll take a little while to get set up with it; it requires a custom llvm build
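Per the templight README, usage looks roughly like the following; the exact flags and the trace file name are written from memory, so treat this as an unverified sketch:

```sh
# profile time and memory of template instantiations while compiling:
templight++ -Xtemplight -profiler -Xtemplight -memory -c ann_layer_test.cpp
# convert the resulting trace, e.g. to callgrind format (hypothetical trace name):
templight-convert --format callgrind ann_layer_test.trace.pbf
```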
< rcurtin[m]> at least compiling LLVM proves that there is something out there that's more computationally intensive to compile than mlpack :-D
< zoq> haha
< zoq> so it looks like right now you need at least 4.8 GB of free memory
< rcurtin[m]> yeah; I guess the GCC output must not be reporting on peak memory usage at one time but the total (including things that are deallocated and reallocated later)
< rcurtin[m]> I suppose, maybe we can at least say that we are helping people keep their houses warm in winter :)
< zoq> haha, the only problem is in some areas you don't heat with electricity, because it's expensive :(
< rcurtin[m]> true :-D
< zoq> But I guess it's a nice byproduct, if you like it warm.
< zoq> Here it's currently 6 degrees celsius (outside).
< zoq> Also, I'm building on a remote machine.
< rcurtin[m]> same here, a little colder than I would expect for Atlanta this time of year
< zoq> Also, I don't expect any snow this year, maybe beginning of next year; which doesn't matter really, because we are in a lockdown.
< rcurtin[m]> it would give you something interesting and different to look at out the window :)
< zoq> oh yes definitely
< shrit[m]> I am currently compiling neural network code, only one .cc file, two functions and two models in one file. It is using 20 percent of my 32 GB of RAM
< shrit[m]> So I am not surprised if ann_layer_test.cpp is taking about 8 GB of RAM
< rcurtin[m]> I suspect it might be a quick fix to just split the file into several files, so that all the instantiation doesn't happen all in one file
< rcurtin[m]> but, that's not a great solution, only a quick one
< shrit[m]> Agreed, I would blame the boost visitors for this increase. Otherwise I cannot see anything else that consumes that much RAM
< rcurtin[m]> heavy use of SFINAE could also be to blame, but I would suspect the visitor paradigm too since it is super template heavy
< shrit[m]> We can split it, but that will not halve the amount of consumed RAM
< rcurtin[m]> I am still compiling this custom LLVM version, but I'm hoping when I finally manage to make it work it might shed some light on where the painful part is :)
< shrit[m]> Hope that too :)