ChanServ changed the topic of #mlpack to: Due to ongoing spam on freenode, we've muted unregistered users. See http://www.mlpack.org/ircspam.txt for more information, or also you could join #mlpack-temp and chat there.
berFt27 has joined #mlpack
berFt27 has quit [Remote host closed the connection]
JSharp18 has joined #mlpack
JSharp18 has quit [Remote host closed the connection]
ninsei has joined #mlpack
ninsei has quit [Remote host closed the connection]
j-fish has joined #mlpack
j-fish is now known as Guest91136
tomaw29 has joined #mlpack
Guest91136 has quit [Remote host closed the connection]
tomaw29 has quit [Remote host closed the connection]
clorophormo has joined #mlpack
clorophormo has quit [Ping timeout: 256 seconds]
bambams19 has joined #mlpack
bambams19 has quit [K-Lined]
NightMonkey25 has joined #mlpack
ShikharJ_ has joined #mlpack
NightMonkey25 has quit [Remote host closed the connection]
< ShikharJ_>
rcurtin: Are you there?
ShikharJ_ has quit [Quit: Page closed]
< akhandait>
zoq: Yeah, I am trying to see why the error goes to -nan
thekingofbandit4 has joined #mlpack
thekingofbandit4 has quit [Killed (Sigyn (Spam is off topic on freenode.))]
richardjohn12 has joined #mlpack
richardjohn12 has quit [Remote host closed the connection]
roger_rabbit has joined #mlpack
roger_rabbit has quit [K-Lined]
SkyPatrol has joined #mlpack
SkyPatrol has quit [Remote host closed the connection]
ImQ009 has joined #mlpack
n-st7 has joined #mlpack
< rcurtin>
ShikharJ_: yes, I am now
n-st7 has quit [Ping timeout: 264 seconds]
Hoosilon16 has joined #mlpack
Hoosilon16 has quit [Remote host closed the connection]
Asoka4 has joined #mlpack
Asoka4 has quit [Read error: Connection reset by peer]
Venusaur21 has joined #mlpack
Venusaur21 has quit [Remote host closed the connection]
< akhandait>
zoq: The test should pass with the current build.
< akhandait>
zoq: I had some doubts about the transposed conv issue
ShikharJ_ has joined #mlpack
< ShikharJ_>
rcurtin: I was wondering, how often does a new release happen for mlpack?
< ShikharJ_>
rcurtin: And what constitutes a major release (what makes one go from let's say a 2.3 release to a 3.0)?
< ShikharJ_>
zoq: I have updated the work product report, I'll make a blog post using the same material as well. Also, I'll take up the remaining work now.
ShikharJ_ has quit [Ping timeout: 252 seconds]
Guest34098 has joined #mlpack
Guest34098 has quit [K-Lined]
< rcurtin>
ShikharJ_: it's all arbitrary :)
< rcurtin>
I try to release, e.g., once a month, but it doesn't always happen because a month is pretty short and the releases aren't automated
< rcurtin>
I figured a 3.1.0 release at the end of GSoC with the new project code merged would be good
vivekp has quit [Ping timeout: 240 seconds]
em has joined #mlpack
em has quit [Read error: Connection reset by peer]
CGML20 has joined #mlpack
CGML20 has quit [Remote host closed the connection]
< zoq>
ShikharJ_: Sounds good :)
< zoq>
akhandait: Here to help.
< rcurtin>
also, I should say about releases, I'm not picky at all, if anyone else wants to spearhead a release I have no problem with that at all :)
< akhandait>
zoq: We need to take the output width and height of the transposed conv layer as parameters
< zoq>
akhandait: the output width of the forward pass?
< akhandait>
yes
< akhandait>
If you see Relationship 14 of that paper, o' = s(i' − 1) + a + k − 2p
< akhandait>
sorry, it didn't copy correctly
< zoq>
let me open the paper
< akhandait>
Yeah, it will be better
< akhandait>
Relationship 14 is the final, most general formula for transposed conv layers
< akhandait>
according to that, to calculate o', we need 'a'
< akhandait>
a = (i + 2p − k) mod s
< zoq>
I see
< akhandait>
where i is the input of the associated conv layer, which means it is the output of the trans conv
< akhandait>
Knowing only the input, s, p and k, multiple values of the output are possible
< akhandait>
What we can do is check if that relationship holds true and throw an error otherwise
< zoq>
agreed, good idea
< akhandait>
zoq: That's taken care of then, another thing I wanted to clarify
< zoq>
static_assert or something like that
< akhandait>
Oh, I used a Fatal
< akhandait>
is that okay?
< zoq>
Sure that works as well.
< zoq>
this will do an exit afterwards, which I think is fine
< zoq>
we can't continue anyway
< akhandait>
Yeah
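For reference, a minimal sketch of the check discussed above (the function name and signature here are made up for illustration; only Log::Fatal is mlpack's), using Relationship 14 with a = (o' + 2p − k) mod s, where o' is the user-supplied output width:

    // Hypothetical helper: validate a user-supplied transposed conv output
    // width against Relationship 14: o' = s * (i' - 1) + a + k - 2p, where
    // a = (o' + 2p - k) mod s.
    #include <mlpack/core.hpp>

    void CheckTransposedConvOutputSize(const long inputWidth,   // i'
                                       const long outputWidth,  // o' (user supplied)
                                       const long kernel,       // k
                                       const long stride,       // s
                                       const long padding)      // p
    {
      const long a = (outputWidth + 2 * padding - kernel) % stride;
      const long expected = stride * (inputWidth - 1) + a + kernel - 2 * padding;
      if (a < 0 || expected != outputWidth)
      {
        // Log::Fatal prints the message and then aborts, as discussed above.
        mlpack::Log::Fatal << "Transposed convolution: output width "
            << outputWidth << " is inconsistent with input width " << inputWidth
            << ", kernel size " << kernel << ", stride " << stride
            << " and padding " << padding << "." << std::endl;
      }
    }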
< akhandait>
About inserting the zeros in between, do you know of any other way we can do that, instead of creating a bigger matrix of zeros and setting alternate elements to the input values?
< akhandait>
We will need to use loops
< akhandait>
but I can't think of any other way
< zoq>
Ideally, we would just skip the values in the conv operation, but right now I can't see any other solution than using loops.
< akhandait>
I am a little confused, can we insert zeros between individual elements using submat, like in Figure 4.6 of that paper?
< akhandait>
Will submat set the elements alternately if we give the correct indices of rows/columns?
< zoq>
ah sorry for the confusion, this is just another way to insert the values in a bigger matrix, to "avoid" the for loops
< akhandait>
so we can directly insert the elements alternately in the bigger matrix
< zoq>
exactly
< akhandait>
I think it will save at least some time
< akhandait>
I will time it and see
< zoq>
okay, sounds good
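For what it's worth, a rough Armadillo sketch of the two approaches being compared (the function names are illustrative, not mlpack code): one inserts the (stride − 1) zeros between input units with explicit loops, the other assigns through a non-contiguous submatrix view built from regspace indices:

    #include <armadillo>

    // Expand 'input' so that (stride - 1) zeros sit between neighbouring
    // elements, using a plain double loop.
    arma::mat ExpandWithLoops(const arma::mat& input, const size_t stride)
    {
      arma::mat out(stride * (input.n_rows - 1) + 1,
                    stride * (input.n_cols - 1) + 1, arma::fill::zeros);
      for (size_t j = 0; j < input.n_cols; ++j)
        for (size_t i = 0; i < input.n_rows; ++i)
          out(i * stride, j * stride) = input(i, j);
      return out;
    }

    // Same expansion through a non-contiguous submatrix view: every
    // stride-th row/column of 'out' is selected and assigned in one go.
    arma::mat ExpandWithSubmat(const arma::mat& input, const size_t stride)
    {
      arma::mat out(stride * (input.n_rows - 1) + 1,
                    stride * (input.n_cols - 1) + 1, arma::fill::zeros);
      out(arma::regspace<arma::uvec>(0, stride, out.n_rows - 1),
          arma::regspace<arma::uvec>(0, stride, out.n_cols - 1)) = input;
      return out;
    }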
Chords has joined #mlpack
Chords has quit [Read error: Connection reset by peer]
ShikharJ_ has joined #mlpack
ImQ009_ has joined #mlpack
ImQ009_ has quit [Read error: Connection reset by peer]
< ShikharJ_>
akhandait: Sorry if I'm interrupting, but if I follow your conversation correctly about inserting the zero elements directly, then your results would be significantly different from what we obtain using other frameworks.
ImQ009_ has joined #mlpack
ImQ009 has quit [Ping timeout: 240 seconds]
< akhandait>
ShikharJ: Sorry, I am not sure how other frameworks implement this. Can you explain a bit? This seems to be correct according to the paper
< ShikharJ_>
akhandait: Also i is not the output of the trans conv layer. O is.
ImQ009_ has quit [Read error: Connection reset by peer]
ImQ009 has joined #mlpack
< akhandait>
ShikharJ: Yes, i is not, but it is the input of the associated conv layer, which is basically the same thing: if the conv goes from 2 -> 4 (i = 2), the trans conv will go from 4 -> 2 (o = 2)
< akhandait>
ShikharJ: I had a discussion with Marcus last time on #mlpack-temp. Sorry you had to miss it. I will send a txt file to you so that you can go through it.
< ShikharJ_>
akhandait: About flipping the kernel, yeah it doesn't matter mathematically, but it was done in the Forward pass because a Trans Conv is simply a Conv operation done in reverse.
< ShikharJ_>
akhandait: Could you mention why you think the full convolution on the forward pass is incorrect?
< akhandait>
ShikharJ: I am not sure it's incorrect, but it's extremely inefficient, more so for larger matrices
< akhandait>
As I think about it again, I don't think it's incorrect. It will do the job
< ShikharJ_>
akhandait: Okay, let's leave that for now. What about the stride being one? Why should it depend on the input stride, when we are taking a full convolution?
< akhandait>
ShikharJ: You are correct, we won't need the stride for a full convolution. It's just that we shouldn't always perform a full convolution. That's the reason we will need the stride of the associated convolution operation (to insert zeros in between the input units).
< akhandait>
As mentioned in the paper, when the stride of a convolution layer is > 1, the stride of the associated transposed conv layer is < 1. That's the reason we insert the zeros.
< akhandait>
Now that I think about it again, I think performing a full convolution in a transposed conv layer is not a correct backward operation for a conv layer which used a stride > 1
< akhandait>
for example
< akhandait>
Let's say that a conv layer goes from 64x64 to 32x32 using k = 33, p = 0, s = 1, then it's correct for its associated transposed conv layer to use k = 33, p = 0 to go from 32x32 to 64x64
< akhandait>
but,
< akhandait>
if a conv layer goes from 64x64 to 32x32 using k = 5, p = 2, s = 2, then its correct associated transposed operation should be k = 5, p = 2, s = 1 (but with zeros inserted between the input units)
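For reference, both examples are consistent with the standard conv output size, o = floor((i + 2p − k) / s) + 1, and with Relationship 14, o' = s(i' − 1) + a + k − 2p where a = (i + 2p − k) mod s:

    k = 33, p = 0, s = 1:  conv: (64 + 0 - 33)/1 + 1 = 32;   trans conv: 1*(32 - 1) + 0 + 33 - 0 = 64
    k = 5,  p = 2, s = 2:  conv: floor((64 + 4 - 5)/2) + 1 = 32;   trans conv: 2*(32 - 1) + 1 + 5 - 4 = 64
    (in the second case a = (64 + 4 - 5) mod 2 = 1, i.e. one extra zero on the bottom/right edge of the
    zero-inserted 63x63 input, which is then convolved with k = 5, p = 2, s = 1)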
< ShikharJ_>
akhandait: "Performing a full convolution is not a correct backward operation for a conv layer which used > 1 stride". I don't think this is correct, a full convolution with stride one is the correct backwards operation on a conv layer.
ImQ009 has quit [Quit: Leaving]
< akhandait>
ShikharJ: Yeah, it will do the job mathematically, but for that particular conv operation (> 1 stride), using the same kernel size (with a fractional stride) is the better associated operation. At least this is what I have understood from the paper.
< akhandait>
zoq: I timed it, surprisingly, using loops is more than two times faster than using submat in this case
< akhandait>
this is when I use [] instead of ()
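A self-contained illustration of the access-operator part of that difference (illustrative snippet, not the benchmark from the timing above): arma::wall_clock timing of element writes, where mat::operator[] does unchecked linear indexing while operator() is bounds-checked unless ARMA_NO_DEBUG is defined:

    #include <armadillo>
    #include <iostream>

    int main()
    {
      arma::mat m(1000, 1000, arma::fill::zeros);
      arma::wall_clock timer;

      timer.tic();
      for (size_t i = 0; i < m.n_elem; ++i)
        m[i] = 1.0;                          // unchecked linear indexing
      std::cout << "operator[]: " << timer.toc() << "s\n";

      timer.tic();
      for (size_t c = 0; c < m.n_cols; ++c)
        for (size_t r = 0; r < m.n_rows; ++r)
          m(r, c) = 2.0;                     // bounds-checked element access
      std::cout << "operator(): " << timer.toc() << "s\n";

      std::cout << arma::accu(m) << std::endl;  // keep the writes observable
    }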
< ShikharJ_>
akhandait: Furthermore, if you just look at it intuitively, with a kernel size that small while upsizing an image, you're going to lose a lot of accuracy.
< ShikharJ_>
akhandait: Transconv is nothing but the general conv operation reversed.
< akhandait>
ShikharJ: Hmm, I am not sure about that. I think using a huge kernel size for a transposed_conv when a smaller one was used in the conv layer is not the correct solution for that (assuming it's losing accuracy)
< akhandait>
ShikharJ: exactly
< akhandait>
We hardly ever use a huge kernel size in a conv operation if the input is big (we use s > 1). The same way, if we use a full convolution to reverse a conv operation which used s > 1, we are not exactly reversing it.
< akhandait>
We are just getting it to the same size, but not using the 'reverse' of the conv layer
< akhandait>
So, as I said, if a conv layer goes from 64x64 to 32x32 using k = 33, p = 0, s = 1, then it's correct for its associated transposed conv layer to use k = 33, p = 0 to go from 32x32 to 64x64
< akhandait>
that's the correct reverse operation
< akhandait>
but if a conv layer goes from 64x64 to 32x32 using k = 5, p = 2, s = 2, then its correct associated transposed operation should be k = 5, p = 2, s = 1 (but with zeros inserted between the input units)
< akhandait>
that's the correct reverse operation for that case
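As a quick numeric illustration of the first case (illustrative snippet, not mlpack code): a transposed conv with p = 0 and s = 1 corresponds to a full convolution, and Armadillo's conv2 with the "full" shape maps 32x32 back to 32 + 33 - 1 = 64 for k = 33:

    #include <armadillo>
    #include <iostream>

    int main()
    {
      arma::mat input(32, 32, arma::fill::randu);
      arma::mat kernel(33, 33, arma::fill::randu);

      // "full" implicitly pads the input with k - 1 zeros on each side.
      arma::mat output = arma::conv2(input, kernel, "full");
      std::cout << output.n_rows << " x " << output.n_cols << std::endl;  // 64 x 64
    }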
< ShikharJ_>
Okay let's break this case by case.
< akhandait>
I think the paper has done that for us :)
< akhandait>
but I will be happy to go through it again
< akhandait>
maybe we both will get to learn some things
< ShikharJ_>
akhandait: "We hardly ever use a huge kernel size in conv operation if the input is big. The same way, if we use a full convolution to reverse a conv operation which used s > 1, we are not exactly reversing it." This is true, but we take this approximation while backpropagating in conv networks in general.
< akhandait>
also, intuitively, it seems very appropriate to use the same kernel size for a transposed conv operation as we used in the associated conv operation
< akhandait>
ShikharJ: Yes we do, but that doesn't justify using a full convolution for a transposed conv when we used s > 1 for its corresponding conv operation
< akhandait>
about my previous point, if you see Figure 4.6 of that paper, I don't think we lose accuracy this way
< ShikharJ_>
akhandait: It does, because the input is smaller (or rather denser, in an information-theoretic sense) than the output in transposed conv.
< akhandait>
Yeah, each output unit (of the transposed conv), which should correspond to an input unit (of the conv operation), is affected only by those input units (of the trans conv) whose corresponding output units (again, of the conv op) were affected by the corresponding input unit of the conv op
< akhandait>
Also, I think it's always better to trust the paper we are following than our intuition :)
< akhandait>
Sorry if my last point is confusing to read, I hope you get what I am trying to say
< ShikharJ_>
akhandait: Yeah, confused me big time :P zoq: What would you say about this?