rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack
ahmed_abdelatty has joined #mlpack
<AnwaarKhalid[m]> Hello rcurtin! I'm also sorry for the late response; I was on a trek 😃
<AnwaarKhalid[m]> I agree that the concatenation operation should be integrated with the DAG class. In general, we should integrate all merging layers (add_merge, multiply_merge & concat) with the DAG class. I think we can achieve this with the `Add` function, wherein a user can specify an arbitrary number of `MultiLayer`s & how to merge these layers. So something like:
<AnwaarKhalid[m]> `DAG.Add(MultiLayer A, MultiLayer B, MultiLayer C, "merge_type")`
<AnwaarKhalid[m]> So, if we decide to stick with the `MultiLayer` adaptation of merging layers, we can internally add these layers to the network. This way these merging layers can still be used in a regular `FFN` as well. What do you think about this?
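[A minimal usage sketch of the proposal above. `DAG` is the class under discussion and does not exist in mlpack; the `Add` overload, the branch setup, and the `"merge_type"` string are all assumptions taken from this message, not a real interface.]
```cpp
// Hypothetical sketch only: DAG does not exist yet; the Add() overload is
// the one proposed above, not mlpack's actual API.
MultiLayer<arma::mat> branchA, branchB, branchC;
branchA.Add<Linear>(32);  // each branch is a sequential sub-network
branchB.Add<Linear>(32);
branchC.Add<Linear>(32);

DAG dag;
// One call registers the three branches and states how to merge their
// outputs before they are fed to the next node.
dag.Add(branchA, branchB, branchC, "merge_type");
```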
ahmed_abdelatty has quit [Quit: Client closed]
ahmed_abdelatty has joined #mlpack
AhmedAbdelatty has joined #mlpack
ahmed_abdelatty has quit [Quit: Client closed]
AhmedAbdelatty has quit [Client Quit]
tamandeepsTamand has quit [Quit: You have been kicked for being idle]
<ShubhamAgrawal[m> > <@khalidanwaar:matrix.org> Hello rcurtin ! I'm also sorry for the late response, I was on a trek 😃... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/d1be40ede5e1b724a6c68d836405cc2868265725)
<ShubhamAgrawal[m> s/😃//, s/./.😃/
fieryblade[m] has joined #mlpack
<ShubhamAgrawal[m> rcurtin: zoq @marcusedel:matrix.org: Can you please look at `TEST_CASE_METHOD`? The static code analysis checks flag it with a Static Initialization Order Fiasco warning.
<rcurtin[m]> Anwaar Khalid: no worries about the slow response :) for the DAG class, I don't see the reason we need `"merge_type"`. we can adopt the semantic that a layer with multiple in-edges will have its outputs concatenated (so e.g. three layers each outputting a 2x6x2 tensor will result in a 2x6x2x3 input to the next layer), and then we can have a `Sum` layer (or similar), which could reduce the `2x6x2x3` input to a `2x6x2x1` output by summing the
<rcurtin[m]> values along the last axis
<rcurtin[m]> (the DAG network can perform an optimization to avoid forming the 2x6x2x3 input, if a `Sum` layer (or whatever we call it) has multiple in-edges)
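[To make the proposed semantics concrete, here is a small self-contained Armadillo sketch. The 2x6x2 shapes and the flattened-column representation are taken from the messages above; the `Sum` layer itself is not implemented here.]
```cpp
#include <armadillo>

int main()
{
  // Three layer outputs, each a 2x6x2 tensor stored as a flattened
  // 24-element column (shapes from the discussion above).
  arma::vec outA(24, arma::fill::randu);
  arma::vec outB(24, arma::fill::randu);
  arma::vec outC(24, arma::fill::randu);

  // Three in-edges into one node: stack the outputs along a new last
  // axis, so the next layer sees a 2x6x2x3 input (72 elements flattened).
  arma::vec concatenated = arma::join_cols(outA, arma::join_cols(outB, outC));

  // What a Sum layer would produce instead: reduce 2x6x2x3 to 2x6x2x1 by
  // summing along the last axis. As noted above, the DAG can skip building
  // `concatenated` entirely when the consumer is a Sum layer.
  arma::vec summed = outA + outB + outC;

  concatenated.print("2x6x2x3 input (flattened):");
  summed.print("2x6x2x1 input (flattened):");
  return 0;
}
```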
<rcurtin[m]> I'm not sure that I understand the rest of the comment, and how it relates to `MultiLayer` though
<rcurtin[m]> Shubham Agrawal: can you provide more information on the `TEST_CASE_METHOD` issue? I am not sure what you're referring to
<ShubhamAgrawal[m> You can see here
<ShubhamAgrawal[m> I have cross-checked: all 503 issues came from the same method
<rcurtin[m]> I have no idea what you are talking about though; how did you come to this issue? what even is the issue? is this a compilation issue? what is the context?
<ShubhamAgrawal[m> <rcurtin[m]> "I have no idea what you are..." <- I also don't know much about it. It's all warnings in the static code analysis checks. I just counted the number of `TEST_CASE_METHOD` calls and it comes out to 503. I just thought there was some issue in this code.
<rcurtin[m]> okay, it is something that is coming up in the static code analysis checks for one of your PRs?
<rcurtin[m]> which PR is it? did you change any code related to the line that you pasted?
<ShubhamAgrawal[m> rcurtin[m]: In multiple PRs
<rcurtin[m]> I'm sorry to ask so many questions, but please understand I have basically zero context on what you are up to or what you are trying to do---and even if you did tell me before, I jump between so many things that you should not in general assume that I remember any of what you have told me :)
<ShubhamAgrawal[m> rcurtin[m]: I tried to think about it, but the Catch code is too complex for me right now.
<rcurtin[m]> I don't understand the response---you didn't answer either question that I asked. I agree that the catch code is very complicated, and ideally we should ignore any static code analysis errors that come from catch and not our code
<rcurtin[m]> if "we should ignore static code analysis errors" is the answer you were looking for originally, cool, we have solved the issue 😄 but maybe not? I am not sure
SuvarshaChennare has quit [Quit: You have been kicked for being idle]
<ShubhamAgrawal[m> rcurtin[m]: Then maybe it's fine. It may come from circular dependencies in the util folder. But that will be fixed once mlpack gets to header-only.
<rcurtin[m]> I dunno, if you are hoping that making mlpack header-only will result in only one translation unit, that is actually not true for the tests, as each test will be compiled in its own .cpp file
<rcurtin[m]> but again, I am not 100% sure what you are thinking, so maybe you are right, I guess we will find out? 😄
<rcurtin[m]> (I am going to step out for lunch, my responses may be a bit slow for a little while)
<ShubhamAgrawal[m> And also one more thing
<ShubhamAgrawal[m> How will we determine the first iteration in `Forward`?
<ShubhamAgrawal[m> Or can we use `ComputeDimension` for computing the dimensions of the other pooling layers in the case of `AdaptivePool`?
<rcurtin[m]> if we changed topics, you have to assume again that I have no context :) can you state the question clearly? I think you are talking about a pooling layer? but I am not sure of the details of what the real question is
<rcurtin[m]> sorry if writing the questions out fully is tedious, but I promise you will get a more coherent answer if you can provide more details in the question 😄
<ShubhamAgrawal[m> rcurtin[m]: nvm I figured it out
<AnwaarKhalid[m]> <rcurtin[m]> "Anwaar Khalid: no worries..." <- If I understood that correctly, you're saying if a layer has more than one incoming edge, the DAG network should default to concatenating the outputs unless it's a `Sum` layer for example, in which case it should just add the outputs along the last axis.
<AnwaarKhalid[m]> <rcurtin[m]> "I'm not sure that I understand..." <-
<AnwaarKhalid[m]> What I meant was: if we adapt all the merging layers to inherit from `MultiLayer`, like you have done for concat, the `Add` function can create the corresponding merging layer on the fly and add the layers to its internal network. So, e.g.,
<AnwaarKhalid[m]> `DAG.Add(MultiLayer A, MultiLayer B, MultiLayer C, "concat")` -- creates a concat layer and adds the layers A, B & C to `concat.Network()`
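[A rough sketch of what that overload could do internally, assuming a `Concat` adapted to inherit from `MultiLayer` as described above. The `DAG` class, the fixed three-branch signature, and the `network` member are all hypothetical.]
```cpp
// Hypothetical sketch: DAG does not exist yet, and this Concat is the
// MultiLayer-based adaptation discussed above, not a merged mlpack layer.
template<typename MatType>
void DAG<MatType>::Add(MultiLayer<MatType>* a,
                       MultiLayer<MatType>* b,
                       MultiLayer<MatType>* c,
                       const std::string& mergeType)
{
  if (mergeType == "concat")
  {
    // Create the merging layer on the fly; since Concat is itself a
    // MultiLayer, the branches become its internal network, and the merge
    // behaves like any other layer (so it works inside a plain FFN too).
    Concat<MatType>* merge = new Concat<MatType>();
    merge->Add(a);
    merge->Add(b);
    merge->Add(c);
    network.push_back(merge);
  }
  // "add_merge" and "multiply_merge" would construct the analogous layers.
}
```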
<rcurtin[m]> for the DAG network, it will not (in general) work to implement the concatenation operation as a `MultiLayer`. what happens if I create a DAG such that layer `A` has an out-edge to *two* layers elsewhere in the DAG? the `Concat` layer built as a `MultiLayer` can only work for a subset of possible DAGs
<AnwaarKhalid[m]> You mean something like this?
<rcurtin[m]> I don't understand the DAG, which are the inputs and which are the outputs? is the flow bottom to top or top to bottom?
<AnwaarKhalid[m]> networkA, networkB & networkC are `MultiLayer`s
<AnwaarKhalid[m]> it's top to bottom
<rcurtin[m]> I don't understand. if networkA is a (sequential) `MultiLayer` with one output layer, how can I put that as input into a concatenation? it would be a trivial concatenation
<AnwaarKhalid[m]> Okay okay.. let me explain :D
<AnwaarKhalid[m]> Concat_Merge will concat the output of Add_Merge & Mul_Merge. Add_Merge will add the outputs of networkB & networkC. Similarly, Mul_Merge will multiply the outputs of networkD & networkE.
<rcurtin[m]> so the flow is bottom to top then?
<AnwaarKhalid[m]> So, it's top to bottom.
<AnwaarKhalid[m]> The diagram reflects the order in which the layers will be added to the DAG network.
<rcurtin[m]> you just described the data flow as being bottom to top
<rcurtin[m]> the user specifies input into the network; they receive output out of the final layer; you are saying that `networkA` produces the final output? then it is bottom to top
<AnwaarKhalid[m]> The input to the network first passes through `networkA` and generates `outputA`.... (full message at https://libera.ems.host/_matrix/media/r0/download/libera.chat/653f2a6b113f5743e14d947625498bbbbcfafe83)
<rcurtin[m]> if this is the case, then the diagram that is drawn makes no sense to me; I would have expected to see the data flow through the layers
<AnwaarKhalid[m]> s/added/multiplied/
<rcurtin[m]> regardless, I am sure that my point still stands that the `MultiLayer` approach to concatenation cannot express all possible DAGs
<rcurtin[m]> suppose, for instance, that `networkA` (which I am understanding to be a sequential series of layers) has a layer whose output we would like to use as input to the next layer in `networkA`, but *also* as input to possibly multiple concatenative nodes (i.e. DAG nodes with multiple in-edges)
<rcurtin[m]> if you are attempting to store the concatenative node in such a way that it holds all of its inputs, this limits what you can express
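[A tiny self-contained sketch of the representational point; the types here are invented for illustration. A DAG that stores edges separately from the layers can give one layer several consumers, while a Concat built as a `MultiLayer` owns its input sub-networks, so each sub-network can feed only that single merge node.]
```cpp
#include <cstddef>
#include <map>
#include <vector>

// Invented-for-illustration DAG representation: layers are indices, and
// edges live outside the layers themselves.
struct DagEdges
{
  std::map<std::size_t, std::vector<std::size_t>> outEdges;
};

int main()
{
  DagEdges dag;
  // Layer 0 feeds layer 1 (the next layer in its sequence) *and* two
  // different merge nodes (2 and 3). An edge list expresses this easily;
  // a Concat that owns its input layers cannot, because layer 0 would
  // have to live inside both merges at once.
  dag.outEdges[0] = {1, 2, 3};
  return 0;
}
```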
<AnwaarKhalid[m]> I think I get your point. The `MultiLayer` approach would not be able to handle all the cases where the input to the layers can be different. I'll think more on this. Thanks for the quick responses :)
<rcurtin[m]> maybe for now the `MultiLayer` approach is the best solution we have until we have the DAG network implemented---but at least when we do implement the DAG network itself, we will not be able to use the `MultiLayer` approach
krushia has quit [Quit: Konversation terminated!]
krushia has joined #mlpack
krushia has quit [Ping timeout: 276 seconds]