<AnwaarKhalid[m]>
Hello rcurtin ! I'm also sorry for the late response, I was on a trek 😃
<AnwaarKhalid[m]>
I agree that the concatenation operation should be integrated with the DAG class. In general, we should integrate all merging layers (add_merge, multiply_merge & concat) with the DAG class. I think we can achieve this with the `Add` function, wherein a user can specify an arbitrary number of `MultiLayer`s and how to merge them. So something like:
<AnwaarKhalid[m]>
`DAG.Add(MultiLayer A, MultiLayer B, MultiLayer C, "merge_type")`
<AnwaarKhalid[m]>
So, if we decide to stick with the `MultiLayer` adaptation of merging layers, we can internally add these layers to the network. This way these merging layers can still be used in a regular `FFN` as well. What do you think about this?
ahmed_abdelatty has quit [Quit: Client closed]
ahmed_abdelatty has joined #mlpack
AhmedAbdelatty has joined #mlpack
ahmed_abdelatty has quit [Quit: Client closed]
AhmedAbdelatty has quit [Client Quit]
tamandeepsTamand has quit [Quit: You have been kicked for being idle]
<ShubhamAgrawal[m>
rcurtin, zoq @marcusedel:matrix.org: Can you please look at `TEST_CASE_METHOD`? Static code analysis flags it for the Static Initialization Order Fiasco.
<rcurtin[m]>
Anwaar Khalid: no worries about the slow response :) for the DAG class, I don't see the reason we need `"merge_type"`. we can adopt the semantic that a layer with multiple in-edges will have the outputs of its incoming layers concatenated (so e.g. three layers each outputting a 2x6x2 tensor will result in a 2x6x2x3 input to the next layer), and then we can have a `Sum` layer (or similar), which could reduce the `2x6x2x3` input to a `2x6x2x1` output by summing the values along the last axis
<rcurtin[m]>
(the DAG network can perform an optimization to avoid forming the 2x6x2x3 input, if a `Sum` layer (or whatever we call it) has multiple in-edges)
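A minimal sketch of the semantics described above, under stated assumptions: each incoming output is a flat buffer of equal size (24 elements for a 2x6x2 tensor), storage is column-major so the new last axis is the slowest-varying one, and the names `Flat`, `ConcatLastAxis`, `SumLastAxis`, and `SumDirect` are hypothetical and not mlpack API. It shows that summing the inputs directly gives the same result as concatenating and then reducing along the last axis, which is what would let a DAG network skip forming the 2x6x2x3 input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for illustration only; not part of mlpack.
// One layer output, flattened (e.g. 2 * 6 * 2 = 24 elements).
using Flat = std::vector<double>;

// Concatenate k same-sized outputs along a new last axis.  Assuming
// column-major storage, the new axis varies slowest, so the result is
// simply the k buffers laid end to end.
Flat ConcatLastAxis(const std::vector<Flat>& inputs)
{
  assert(!inputs.empty());
  const std::size_t n = inputs[0].size();
  Flat out;
  out.reserve(n * inputs.size());
  for (const Flat& in : inputs)
  {
    assert(in.size() == n);
    out.insert(out.end(), in.begin(), in.end());
  }
  return out;
}

// Reduce an (n x k) concatenated buffer to n elements by summing over the
// last axis: out[i] = sum over j of concat[j * n + i].
Flat SumLastAxis(const Flat& concat, const std::size_t k)
{
  assert(k > 0 && concat.size() % k == 0);
  const std::size_t n = concat.size() / k;
  Flat out(n, 0.0);
  for (std::size_t j = 0; j < k; ++j)
    for (std::size_t i = 0; i < n; ++i)
      out[i] += concat[j * n + i];
  return out;
}

// The shortcut: sum the inputs directly and never materialize the
// concatenated buffer.
Flat SumDirect(const std::vector<Flat>& inputs)
{
  assert(!inputs.empty());
  Flat out(inputs[0].size(), 0.0);
  for (const Flat& in : inputs)
  {
    assert(in.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i)
      out[i] += in[i];
  }
  return out;
}
```

For three inputs, `SumLastAxis(ConcatLastAxis(inputs), 3)` and `SumDirect(inputs)` produce the same buffer; only the first allocates the three-times-larger intermediate.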
<rcurtin[m]>
I'm not sure that I understand the rest of the comment, and how it relates to `MultiLayer` though
<rcurtin[m]>
Shubham Agrawal: can you provide more information on the `TEST_CASE_METHOD` issue? I am not sure what you're referring to
<ShubhamAgrawal[m>
I have cross-checked: all 503 issues come from the same method
<rcurtin[m]>
I have no idea what you are talking about though; how did you come to this issue? what even is the issue? is this a compilation issue? what is the context?
<ShubhamAgrawal[m>
<rcurtin[m]> "I have no idea what you are..." <- I also don't know much about it. It's all warnings in the static code analysis checks. I just counted the number of `TEST_CASE_METHOD` calls and it comes out to 503. I just thought there is some issue in this code.
<rcurtin[m]>
okay, it is something that is coming up in the static code analysis checks for one of your PRs?
<rcurtin[m]>
which PR is it? did you change any code related to the line that you pasted?
<ShubhamAgrawal[m>
rcurtin[m]: In multiple PRs
<rcurtin[m]>
I'm sorry to ask so many questions but please understand I have basically zero context on what you are up to or what you are trying to do---and even if you did tell me before, I jump between so many things you should not in general assume that I remember any of what you have told me :)
<ShubhamAgrawal[m>
rcurtin[m]: I tried to think about it, but the catch code is too complex for me right now.
<rcurtin[m]>
I don't understand the response---you didn't answer either question that I asked. I agree that the catch code is very complicated, and ideally we should ignore any static code analysis errors that come from catch and not our code
<rcurtin[m]>
if "we should ignore static code analysis errors" is the answer you were looking for originally, cool, we have solved the issue 😄 but maybe not? I am not sure
SuvarshaChennare has quit [Quit: You have been kicked for being idle]
<ShubhamAgrawal[m>
rcurtin[m]: Then maybe it's fine. It may come from circular dependencies in the util folder, but that will be fixed once mlpack becomes header-only
<rcurtin[m]>
I dunno, if you are hoping that making mlpack header-only will result in only one translation unit, that is actually not true for the tests, as each test will be compiled in its own .cpp file
<rcurtin[m]>
but again, I am not 100% sure what you are thinking, so maybe you are right, I guess we will find out? 😄
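For context on the warning being discussed: the Static Initialization Order Fiasco arises when a namespace-scope object in one translation unit uses, during its own construction, a namespace-scope object from another translation unit that may not be initialized yet. The sketch below is an assumption-laden illustration, not Catch's actual code: `Registry`, `AutoRegister`, and `RegisteredTest` are hypothetical names. It shows the construct-on-first-use idiom, where the shared object is a function-local static, so registrars that run during static initialization always find it constructed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Construct-on-first-use: the registry is a function-local static, so it is
// initialized the first time Registry() is called, independent of the order
// in which namespace-scope objects across translation units are constructed.
std::vector<std::string>& Registry()
{
  static std::vector<std::string> tests;
  return tests;
}

// Hypothetical registrar whose constructor runs during static
// initialization, similar in spirit to what a test-registration macro
// might emit (an assumption, not taken from any real framework's source).
struct AutoRegister
{
  explicit AutoRegister(const std::string& name)
  {
    Registry().push_back(name);
  }
};

// Namespace-scope registrars.  Within one translation unit these are
// constructed in order of definition.
static AutoRegister regA("TestA");
static AutoRegister regB("TestB");

// Bounds-checked lookup of a registered test name.
const std::string& RegisteredTest(const std::size_t i)
{
  assert(i < Registry().size());
  return Registry()[i];
}
```

If `Registry` were instead a plain namespace-scope `std::vector` defined in a different .cpp file, a static analyzer could reasonably warn that `regA` might run before the vector is constructed; that is the pattern such checks look for, whether or not it is a real problem in a given codebase.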
<rcurtin[m]>
(I am going to step out for lunch, my responses may be a bit slow for a little while)
<ShubhamAgrawal[m>
And also one more thing
<ShubhamAgrawal[m>
How will we determine the first iteration in `Forward`?
<ShubhamAgrawal[m>
Or can we use ComputeDimension for computing the dimensions of the other pooling layer in the case of AdaptivePool?
<rcurtin[m]>
if we changed topics you have to assume again that I have no context :) can you state the question clearly? I think you are talking about a pooling layer? but I am not sure of the details of what the real question is
<rcurtin[m]>
sorry if writing the questions out fully is tedious, but I promise you will get a more coherent answer if you can provide more details in the question 😄
<ShubhamAgrawal[m>
rcurtin[m]: nvm I figured it out
<AnwaarKhalid[m]>
<rcurtin[m]> "Anwaar Khalid: no worries..." <- If I understood that correctly, you're saying if a layer has more than one incoming edge, the DAG network should default to concatenating the outputs unless it's a `Sum` layer for example, in which case it should just add the outputs along the last axis.
<AnwaarKhalid[m]>
<rcurtin[m]> "I'm not sure that I understand..." <-
<AnwaarKhalid[m]>
What I meant was that if we adapt all the merging layers to inherit from `MultiLayer`, like you have done for concat, the `Add` function can create the corresponding merging layer on the fly and add the layers to its internal network. So, e.g.
<AnwaarKhalid[m]>
`DAG.Add(MultiLayer A, MultiLayer B, MultiLayer C, "concat")` -- creates a concat layer and adds the layers A, B & C to `concat.Network()`
<rcurtin[m]>
for the DAG network, it will not (in general) work to implement the concatenation operation as a `MultiLayer`. what happens if I create a DAG such that layer `A` has an out-edge to *two* layers elsewhere in the DAG? the `Concat` layer built as a `MultiLayer` can only work for a subset of possible DAGs
<rcurtin[m]>
I don't understand the DAG, which are the inputs and which are the outputs? is the flow bottom to top or top to bottom?
<AnwaarKhalid[m]>
network A, B & C -- are multilayers
<AnwaarKhalid[m]>
it's top to bottom
<rcurtin[m]>
I don't understand. if networkA is a (sequential) `MultiLayer` with one output layer, how can I put that as input into a concatenation? it would be a trivial concatenation
<AnwaarKhalid[m]>
Okay okay.. let me explain :D
<AnwaarKhalid[m]>
Concat_Merge will concat the outputs of Add_Merge & Mult_Merge. Add_Merge will add the outputs of networkB & networkC. Similarly, Mult_Merge will multiply the outputs of networkD & networkE.
<rcurtin[m]>
so the flow is bottom to top then?
<AnwaarKhalid[m]>
So, it's top to bottom.
<AnwaarKhalid[m]>
The diagram reflects the order in which the layers will be added to the DAG network.
<rcurtin[m]>
you just described the data flow as being bottom to top
<rcurtin[m]>
the user specifies input into the network; they receive output out of the final layer; you are saying that `networkA` produces the final output? then it is bottom to top
<rcurtin[m]>
if this is the case, then the diagram that is drawn makes no sense to me; I would have expected to see the data flow through the layers
<AnwaarKhalid[m]>
s/added/multiplied/
<rcurtin[m]>
regardless, I am sure that my point still stands that the `MultiLayer` approach to concatenation cannot express all possible DAGs
<rcurtin[m]>
suppose, for instance, that `networkA` (which I am understanding to be a sequential series of layers) has a layer whose output we would like to use as input to the next layer in `networkA`, but *also* as input to possibly multiple concatenative nodes (i.e. DAG nodes with multiple in-edges)
<rcurtin[m]>
if you are attempting to store the concatenative node in such a way that it holds all of its inputs, this limits what you can express
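The limitation can be sketched by storing edges on the graph itself rather than inside a merge node. Everything below is hypothetical and not mlpack API (`Dag`, `AddNode`, `Connect`, `IsConcatenative`, `BuildExample`); it also assumes, for simplicity, that edges only go from earlier-added nodes to later-added ones, which keeps the graph acyclic by construction. The example builds the case just described: node `A` feeds the next layer in its own chain and two separate concatenative nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch; none of these names are mlpack API.
class Dag
{
 public:
  // Add a node and return its id.
  std::size_t AddNode(const std::string& name)
  {
    names.push_back(name);
    inEdges.emplace_back();
    outEdges.emplace_back();
    return names.size() - 1;
  }

  // Add an edge from -> to.  Requiring from < to is a simplifying
  // assumption that rules out cycles.
  void Connect(const std::size_t from, const std::size_t to)
  {
    assert(from < to && to < names.size());
    outEdges[from].push_back(to);
    inEdges[to].push_back(from);
  }

  std::size_t InDegree(const std::size_t n) const { return inEdges[n].size(); }
  std::size_t OutDegree(const std::size_t n) const { return outEdges[n].size(); }

  // A node with more than one in-edge would have its inputs concatenated.
  bool IsConcatenative(const std::size_t n) const { return InDegree(n) > 1; }

 private:
  std::vector<std::string> names;
  std::vector<std::vector<std::size_t>> inEdges;
  std::vector<std::vector<std::size_t>> outEdges;
};

// A feeds its successor in the sequential chain and two concatenative nodes.
Dag BuildExample()
{
  Dag g;
  const std::size_t a = g.AddNode("A");            // id 0
  const std::size_t aNext = g.AddNode("A_next");   // id 1
  const std::size_t b = g.AddNode("B");            // id 2
  const std::size_t c1 = g.AddNode("Concat1");     // id 3
  const std::size_t c2 = g.AddNode("Concat2");     // id 4
  g.Connect(a, aNext);
  g.Connect(a, c1);
  g.Connect(a, c2);
  g.Connect(b, c1);
  g.Connect(b, c2);
  return g;
}
```

Here `A` has three out-edges and is shared by `Concat1` and `Concat2`. A merge node that owns its inputs as children would have to hold `A` twice, or could not express the sharing at all, which is the restriction described above.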
<AnwaarKhalid[m]>
I think I get your point. The `MultiLayer` approach would not be able to handle all the cases where the input to the layers can be different. I'll think more on this. Thanks for the quick responses :)
<rcurtin[m]>
maybe for now the `MultiLayer` approach is the best solution we have until we have the DAG network implemented---but at least when we do implement the DAG network itself, we will not be able to use the `MultiLayer` approach