ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/
gaulishcoin has quit [Read error: Connection reset by peer]
< RishabhGarg108Gi> I think this macro is provided by Catch, and it makes it very easy to perform various checks when writing unit tests. You can have a look at https://github.com/mlpack/mlpack/issues/2523 to appreciate how handy this macro is.
yashwants19[m] has joined #mlpack
< AakashkaushikGit> Hey @zoq , I will have some breaks between exams, so I will keep updating that ann_exp repository.
gaulishcoin0 has joined #mlpack
< iamarchit123Gitt> @RishabhGarg108 if we are using REQUIRE and TEST_CASE from Catch in our own custom code, do we need to link against any library when compiling, apart from including catch.hpp?
ImQ009 has joined #mlpack
< RishabhGarg108Gi> No. All you need is catch.hpp; this header is sufficient.
< zoq> AakashkaushikGit: Sounds good, I will work on it as well.
Sunanda has joined #mlpack
< zoq> Hello everyone, the casual mlpack video meeting starts in about 15 minutes - https://zoom.us/j/3820896170, for more details checkout the community page - https://www.mlpack.org/community.html
Sunanda has quit [Remote host closed the connection]
< rcurtin[m]> wow, only 1m40s to compile the PCA example with templight... it finished as soon as I left the meeting
< rcurtin[m]> but `dot` is having more trouble turning the graphviz graph into an image... :)
Antonhr has joined #mlpack
< Antonhr> I was able to compile with `cmake -D DEBUG=ON -D PROFILE=ON ../ && make -j 4` with no problems after increasing the /swapfile on my Linux machine.
< Antonhr> The compilation was dipping into the swap file claiming up to 11GB of it in addition to my 16GB RAM. So my initial 2GB /swapfile did not stand a chance.
< iamarchit123Gitt> how much did you allocate for the swapfile, and how much free RAM was available when you started the compilation?
< rcurtin[m]> Antonhr: wow, I'm surprised it needed to be so big. we've been digging into memory usage over here too; still tracking down what our major culprits are
< iamarchit123Gitt> 11GB :O
< Antonhr> I created a 16GB /swapfile and it worked; since I have an SSD, there was no slowdown at all. I was able to browse and do other work while waiting for it to finish.
< Antonhr> I had about 11GB of unused RAM when I started, but around 50% to 80% of the way through, the compilation eats it all up and starts borrowing from the swap. I was watching it just for fun.
< Antonhr> Using the swap was way faster and cheaper than upgrading the laptop with another 16GB. So I am glad it worked fine.
< Antonhr> ```
< Antonhr> $ free -h
< Antonhr>                total        used        free      shared  buff/cache   available
< Antonhr> Mem:            15Gi        14Gi       169Mi        53Mi       504Mi       319Mi
< Antonhr> Swap:           15Gi        11Gi       4.9Gi
< Antonhr> ```
< Antonhr> Worst case, when 11GB of swap was claimed.
< shrit[m]> Agreed, we are working on reducing the memory consumption during compilation. By the way, it is always recommended to have an equal amount of RAM and swap on Linux :+1:
< rcurtin[m]> shrit: really? this must be new (...or newer than when I started just sizing all my swap at ~2GB in 2006...)
< shrit[m]> I feel I can no longer compile mlpack on my laptop, only on my workstation
< rcurtin[m]> I never went and updated my understanding of how to set swap sizes, so maybe I should do it differently in the future :)
< Antonhr> I wonder why the default swap size for Ubuntu 20.04 is 2GB if I have 16GB of RAM.
< iamarchit123Gitt> If someone has an HDD, will it be as smooth? An HDD has slower access than an SSD, which is faster and closer to RAM speed, if I am not wrong.
< Antonhr> Modern HDDs have caches, so it may not be that bad, but with GB-sized accesses an SSD will perform better.
< shrit[m]> I remember that the rule was swap = RAM * 2 for small RAM; this was fine when I was using my old 64 MB RAM laptop.
< shrit[m]> By today's standards, 8 GB of RAM should be the minimum for a laptop or any workstation, since most of them are shipped with Windows.
< shrit[m]> So the swap should be at least the same amount. Knowing that most laptops and PCs are sold with 512GB to 1TB of storage, 8 GB of swap is not a big loss.
< Antonhr> yes, even 16GB of swap seems not a great loss out of 1TB
Antonhr has quit [Remote host closed the connection]
< shrit[m]> I put 18 GB of swap on my workstation. Knowing that my RAM is 32GB, I have rarely seen it being used; only once, when I was compiling a physics engine, did it use about 12 GB of swap, for a total memory usage of 44GB.
< rcurtin[m]> ha! after 5 hours my `dot` run of the template call graph for `PCA<>` finished. It gave this output:
< rcurtin[m]> ```
< rcurtin[m]> dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.000267811 to fit
< rcurtin[m]> ```
< rcurtin[m]> and it produced an image that's 32766 pixels wide and 1 pixel tall
< rcurtin[m]> so... not very useful...
< rcurtin[m]> although, maybe better than the alternative; without scaling, it would have a resolution 12M x 375 and take 34GB to represent in memory (not sure how good png encoding is)
< shrit[m]> it is very strange
< shrit[m]> I do not know how much information is generated, but this is a lot for only PCA
< shrit[m]> this should be around 5.5K * 5.5K pixels for one image
< rcurtin[m]> according to the text output there are 176k template instantiations during PCA compilation
< rcurtin[m]> of those, 67k are in `std::`
< abernauer[m]> that sounds like a problem lol
< zoq> did the bitmap export work for the fibonacci example?
< rcurtin[m]> wait, sorry, I got that backwards, *108k* of those are in `std::`
< rcurtin[m]> ~35k are in `arma::`
< rcurtin[m]> and yeah, bitmap export worked just fine for the fibonacci example
< zoq> hm, does that mean there are ~33k template instantiations for PCA?
< rcurtin[m]> I'm not sure yet
ImQ009 has quit [Quit: Leaving]
< shrit[m]> For this number of instantiations, I would award gcc a prize; it is very fast, though
< shrit[m]> they need a trophy :+1:
< rcurtin[m]> ha! I am trying to run templight again, filtering out all instantiations in system headers
< rcurtin[m]> hopefully this will give something a bit smaller and more manageable
< shrit[m]> All of these instantiations are there to reduce running time.
< rcurtin[m]> ok, I think that I have read enough of the templight-tools documentation to understand what I am looking at, and I have learned enough about kcachegrind to understand its results
< rcurtin[m]> it seems that quite some time is spent compiling cereal internals (xml.hpp, json.hpp), but to me this is a little confusing, as I never actually used any cereal functionality in the example program
< rcurtin[m]> https://www.ratml.org/misc/pca.cg (you can load that with kcachegrind)
< rcurtin[m]> and here's the code: https://www.ratml.org/misc/pca.cpp
< rcurtin[m]> a somewhat significant amount of time is spent instantiating Armadillo classes (no surprise there)
< shrit[m]> my browser is on fire when loading the pca.cg, I will download it
< rcurtin[m]> wow, it is trying to load that directly? probably downloading and then opening with kcachegrind is the right thing
< shrit[m]> actually it is on fire when loading only the text
< rcurtin[m]> just ballparking: ~20-30% of time is spent in cereal headers; ~2-5% of time in STL headers; ~15-25% of time in Armadillo headers... if I'm reading that right
< rcurtin[m]> now that I understand this, let me try again with the RNN example
< shrit[m]> I can see it right now
< rcurtin[m]> one of the things that I am struggling to understand right now is why we are instantiating all this cereal stuff when we aren't even using it in that example code
< shrit[m]> I can see all of it
< shrit[m]> I have no idea
< shrit[m]> but we are including cereal.
< shrit[m]> So I do not know if this counts, because we are including it in the core
< rcurtin[m]> yeah, we are including it, but that should not cause types to be instantiated, it should just be parsed, that's all
< rcurtin[m]> there must be some class somewhere we are instantiating, which requires some cereal type or something
< shrit[m]> Actually, we can not see the code on your website
< rcurtin[m]> wrong filename
< shrit[m]> but I can think of one place: when loading the data, we are calling cereal
< rcurtin[m]> hmm, I am not calling `data::Load()` though
< shrit[m]> Maybe armadillo
< shrit[m]> because we are adding a serialize function to Armadillo types; that is the reason we include armadillo later.
< shrit[m]> So if we include arma, we include cereal
< rcurtin[m]> maybe? my understanding is that those are template functions, so only if we actually call them will things be instantiated
< rcurtin[m]> I'm trying to look up the entire call chain to see what the highest-level parent of the cereal instantiations is
< rcurtin[m]> I can't find any tool that can load the full graph visualization of instantiations... so all I seem to be able to use is kcachegrind
< rcurtin[m]> so I don't know how to get the answer to the question of why we are instantiating things in cereal
< rcurtin[m]> I think the best idea I have for how to do this is to slowly remove includes from `core.hpp` and see when cereal stops showing up...
< shrit[m]> When I look at the ELF object in kcachegrind, I can see only cereal::JSONInputArchive
< shrit[m]> which will include everything else.
< rcurtin[m]> I'm wondering if it is getting instantiated as part of things like `HasSerialize` and the other things we are using like that with SFINAE
< shrit[m]> Maybe
< rcurtin[m]> I got the RNN example done too... the boost visitor stuff completely dwarfs the cereal time
< rcurtin[m]> I'm uploading the callgraph file now
< rcurtin[m]> whereas cereal was ~20-30% in the PCA example, here it appears to be... 2-3%?
< rcurtin[m]> just a few of the `*_visitor_impl.hpp` seem to account for 60% of the template instantiation time
< shrit[m]> omg
< shrit[m]> I believe this is normal, when you think about it taking 5 minutes to compile a file
< rcurtin[m]> you know... I think one "quick fix" (but it is not a great fix) would be to merge a bunch of the visitors
< rcurtin[m]> that kind of breaks the entire visitor idea, of course... since you want to have one visitor type per task
< rcurtin[m]> a lot of the pain appears to be under `variant<LayerTypes>::apply_visitor(VisitorType)` for lots of different `VisitorType`s
< shrit[m]> I know, and in the end only one type is required
< rcurtin[m]> I'm going to go do some other things for a while, but I also want to try compiling `knn_main.cpp`, since that uses `KNNModel` which also uses `boost::visitor`
< rcurtin[m]> it would be interesting to see the behavior there, and also that is a much simpler example to adapt to see the effect on compilation times (but I am not sure it will tell us too much about the ANN experiment, it will just give some small idea)
< shrit[m]> Agreed
< shrit[m]> It will take a lot of time until we see something come out of the ann experiment.
< shrit[m]> I do not see the visitor approach going away before 6 to 10 months from now
< zoq> If the inheritance approach does not introduce any big slowdowns, I expect we can remove the visitor approach at least from the ann codebase in way less time.
< zoq> I will try to run some experiments over the weekend.