#mlpack on 2022-04-14 — irc logs at libera.irclog.whitequark.org

2021-07-27 15:44 rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack

03:14 <ShubhamAgrawal[m> rcurtin: In ANN, why don't we store gradients and forward data in the layer itself and store pointers to these allocations in FFN?

03:18 <rcurtin[m]> that makes a huge amount of programming overhead for anyone writing a new layer, and increases opportunities for bugs (because there is more code allocating memory in different places)

03:19 <ShubhamAgrawal[m> rcurtin[m]: Oh I am thinking about it for my proposal to change it to in-layer.

03:19 <ShubhamAgrawal[m> Because now you need to add gradients from different layers to get a gradient for a ResNet- or Inception-like network

03:21 <rcurtin[m]> I would not support the idea of holding the memory in-layer. that's how it used to be, and there were very very many bugs (some never solved) that resulted from the complexity of that design

03:22 <rcurtin[m]> those bugs mostly had to do with aliases that went out of date or got deallocated and so forth

03:22 <rcurtin[m]> for your situation, I don't think it's too painful in the DAG case to simply allocate a temporary matrix for the gradient, then add it to the final gradient

03:23 <ShubhamAgrawal[m> rcurtin[m]: Ok

03:23 <ShubhamAgrawal[m> I can leave it as it is and change backprop algo for DAG

03:23 <rcurtin[m]> it is true that a copy is suboptimal; but when compared with the cost of the forward and backward passes themselves, I suspect it will not be too painful

03:23 <rcurtin[m]> (and if it is, we can find out later with a profiler and tune it as needed)
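Editor's sketch of the approach rcurtin describes above: when several branches of a DAG consume the same output, each branch writes its gradient into a temporary buffer, which is then added to the final gradient. This is a minimal illustration under stated assumptions, not mlpack's implementation. `Gradient`, `BranchBackward`, and `AccumulateBranchGradients` are hypothetical names, and `std::vector<double>` stands in for the Armadillo matrices mlpack actually uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a dense gradient buffer (mlpack uses Armadillo).
using Gradient = std::vector<double>;

// Hypothetical callback: one downstream branch writes its gradient with
// respect to the shared output into `out`.
using BranchBackward = std::function<void(Gradient& out)>;

// Sum the gradients flowing back from every branch that consumed the same
// output. The caller owns all memory; no branch holds an alias to it.
Gradient AccumulateBranchGradients(const std::vector<BranchBackward>& branches,
                                   const std::size_t size)
{
  Gradient total(size, 0.0);
  Gradient temp(size, 0.0);  // Temporary buffer, reused for each branch.

  for (const BranchBackward& backward : branches)
  {
    std::fill(temp.begin(), temp.end(), 0.0);
    backward(temp);
    assert(temp.size() == size);  // A branch must not resize the buffer.

    // The extra pass rcurtin calls suboptimal: add the temporary to the total.
    for (std::size_t i = 0; i < size; ++i)
      total[i] += temp[i];
  }

  return total;
}
```

The cost is one extra pass over the buffer per branch, which is the copy rcurtin expects to be small next to the forward and backward passes themselves.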

03:23 <ShubhamAgrawal[m> Ok thanks

03:24 <rcurtin[m]> I used to feel much more strongly about making code as fast as possible everywhere; upon reflection over the years, sometimes the cost of maintaining clever code is too high---especially if that clever code was not actually solving an empirically demonstrated performance bottleneck

03:25 <ShubhamAgrawal[m> rcurtin[m]: Yeah

03:25 <ShubhamAgrawal[m> I sometimes think of approaches that are too abstract.

04:04 <fieryblade[m]> .

04:14 <ShubhamAgrawal[m> zoq: For writing pre-trained models proposal, do I need to write API as API is written in models wiki page?

04:14 <ShubhamAgrawal[m> Or should I copy it to proposal too?

09:00 say4n[m] has quit [Quit: You have been kicked for being idle]

09:00 AkifEmreapolu[m] has quit [Quit: You have been kicked for being idle]

09:00 TarekNasser[m]1 has quit [Quit: You have been kicked for being idle]

09:00 manav71ManavSang has quit [Quit: You have been kicked for being idle]

10:34 Niket has joined #mlpack

10:43 Niket has quit [Ping timeout: 250 seconds]

16:00 hitesh-anandhite has quit [Quit: You have been kicked for being idle]

17:27 krushia_ has joined #mlpack

17:33 krushia has quit [*.net *.split]

19:32 <zoq[m]> > <@shubhamag:matrix.org> zoq: For writing pre-trained models proposal, do I need to write API as API is written in models wiki page?

19:32 <zoq[m]> > Or should I copy it to proposal too?

19:32 <zoq[m]> If it's the same no need to copy it in again, you can add a reference; if it differs from the API please point that out.

21:34 krushia_ is now known as krushia