ChanServ changed the topic of #mlpack to: Due to ongoing spam on freenode, we've muted unregistered users. See http://www.mlpack.org/ircspam.txt for more information, or also you could join #mlpack-temp and chat there.
beaver1 has joined #mlpack
beaver1 has quit [Remote host closed the connection]
unreal_ has joined #mlpack
unreal_ is now known as Guest40437
Guest40437 has quit [Remote host closed the connection]
sheep28 has joined #mlpack
sheep28 has quit [Killed (TheMaster (Spam is not permitted on freenode.))]
vivekp has quit [Ping timeout: 272 seconds]
witness has joined #mlpack
philips29 has joined #mlpack
philips29 has quit [Remote host closed the connection]
vivekp has joined #mlpack
MLJens has joined #mlpack
< MLJens>
Hello. Regarding my question from yesterday: yes, I checked the dimensions of the labels and the training sets.
< MLJens>
For MNIST, I have 28x28 = 784 input neurons.
< MLJens>
The digits are from 0 to 9 - so I have 10 output neurons.
< MLJens>
For faster computation, I limited the number of training examples to 991.
< MLJens>
My training set has 784 rows and 991 columns.
< MLJens>
My labels have one row and 991 columns
< MLJens>
So I stepped into network.train, which raises the exception.
< MLJens>
The error occurs in the function 'double Optimize(DecomposableFunctionType& function, arma::mat& iterate)'.
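(For reference, a minimal sketch of the setup being described, assuming the mlpack FFN API of this period; the hidden layer size and activation are illustrative and not MLJens's actual code:)

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>

    using namespace mlpack::ann;

    arma::mat trainData;    // 784 rows x 991 columns: one MNIST image per column
    arma::mat trainLabels;  // 1 row x 991 columns: one class label per column

    FFN<NegativeLogLikelihood<>> model;
    model.Add<Linear<>>(784, 100);   // hidden layer size chosen arbitrarily here
    model.Add<SigmoidLayer<>>();
    model.Add<Linear<>>(100, 10);    // 10 output neurons, one per digit
    model.Add<LogSoftMax<>>();

    model.Train(trainData, trainLabels);  // the call that ends up in Optimize(...)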
< zoq>
MLJens: What does the target vector for digit 0 look like?
< MLJens>
You mean in the training set? {0 0 0 0 ... 1.0 1.0 ... 0 0 0 0}^T So white pixels are zero and black pixels are 1.0. I did some preprocessing so that only zeros and ones appear in the training sets.
< MLJens>
It's interesting that after the exception is thrown, the training set has 104802528 rows and 34772056277008 columns.
< zoq>
Actually I was talking about the target/labels vector
< zoq>
I guess it looks like: [0, .... 0, 1, .... 1, 2, ... 2, ...]?
< MLJens>
Correct, it has one row and 991 (number of training examples) columns.
< MLJens>
In my post on Stack Overflow, the old code used a vector for the labels, but I have since changed that to a matrix with one row and nExample columns.
< zoq>
instead of starting with 0, can you start with 1 as the first label, so 0 becomes 1, 1 becomes 2, and so on?
< zoq>
MLJens: Can you increase the labels by one?
< zoq>
the value of the labels
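(A sketch of that fix, assuming the labels are held in an arma::mat called trainLabels; the suggestion amounts to shifting the 0-9 digit labels to 1-10, since mlpack's NegativeLogLikelihood loss indexes classes starting from 1:)

    // Shift the digit labels from {0, ..., 9} to {1, ..., 10}; a label of 0
    // would otherwise index out of bounds inside the loss function.
    trainLabels += 1.0;   // arma::mat of size 1 x numExamples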
< MLJens>
Thank you so much - that fixed this problem.
< MLJens>
OK, my code is still training - I don't know yet if everything works - but the current error is gone.
< MLJens>
One more question: since the training is taking quite a long time, is there a way to plot the convergence of the solver at each iteration?
< zoq>
great, perhaps it's a good idea to decrease the number of training iterations
< zoq>
did you build with -DDEBUG=OFF?
< zoq>
the optimizer should print the loss to the console
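(When mlpack is used as a library rather than through its command-line programs, that output goes through mlpack::Log::Info, which is silent by default; one way to make it visible - an assumption about MLJens's setup, not something stated in the conversation - is:)

    #include <mlpack/core.hpp>

    // Enable mlpack's informational output so the optimizer's progress
    // messages (including the objective value) appear on stdout.
    mlpack::Log::Info.ignoreInput = false;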
< MLJens>
Thanks, I will check that. I have just 50 training examples, so I don't want to decrease it any more. In the end, there are 60000 examples :-)
< MLJens>
Greetings from Berlin Ostbahnhof - I saw you're located in Berlin too.
< zoq>
Thanks :)
< zoq>
Next time we can meet up to solve the issue :)
< MLJens>
That would be even better, but in the end you helped me a lot.
< MLJens>
Now I'm going to let it train, as I have to leave for a concert at Lido. If it hasn't finished by tomorrow, I'll write here again. If it has... I'm sure some more questions will come up, as I want to get deeper into this library.
< zoq>
Sounds good, have fun and see you tomorrow.
< MLJens>
Thank you, see you...
MLJens has quit [Quit: Page closed]
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
caiojcarvalho has joined #mlpack
caiojcarvalho has quit [Client Quit]
caiojcarvalho has joined #mlpack
caiojcarvalho has quit [Client Quit]
caiojcarvalho has joined #mlpack
caiojcarvalho has quit [Client Quit]
cjlcarvalho has joined #mlpack
wenhao has joined #mlpack
ImQ009 has joined #mlpack
ImQ009 has quit [Quit: Leaving]
sophiebits has joined #mlpack
sophiebits has quit [Remote host closed the connection]
< rcurtin>
zoq: I am looking at the LayerNorm code and trying to understand it... I am thinking that maybe the line 'output.each_row() %= gamma' should be 'output.each_col() %= gamma', where gamma has length equal to input.n_rows (i.e. the dimension of the input)
< rcurtin>
from what I can tell of the paper, I think that we are learning one gamma value and one beta value for each dimension of the input
< rcurtin>
let me know what you think... I am not a huge expert on layer normalization, this is only based on a quick review of the paper and code :)
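(A minimal Armadillo sketch of the each_col() variant rcurtin is describing - an illustration of the idea, not mlpack's actual LayerNorm code; the function name and epsilon are made up:)

    #include <armadillo>

    // Layer normalization forward pass over a batch stored column-wise
    // (one data point per column), with one gain/bias per input dimension.
    arma::mat LayerNormForward(const arma::mat& input,
                               const arma::vec& gamma,  // length input.n_rows
                               const arma::vec& beta,   // length input.n_rows
                               const double eps = 1e-8)
    {
      // Per-column mean and (biased) variance, taken across the dimensions.
      arma::rowvec mean = arma::mean(input, 0);
      arma::rowvec stdev = arma::sqrt(arma::var(input, 1, 0) + eps);

      arma::mat output = input;
      output.each_row() -= mean;   // center every data point
      output.each_row() /= stdev;  // scale every data point to unit variance

      // One learned gain g and bias b per input dimension, hence each_col().
      output.each_col() %= gamma;
      output.each_col() += beta;
      return output;
    }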
wenhao has quit [Ping timeout: 252 seconds]
mdoep26 has joined #mlpack
harding19 has joined #mlpack
mdoep26 has quit [Remote host closed the connection]
miaows14 has joined #mlpack
harding19 has quit [Remote host closed the connection]
miaows14 has quit [Remote host closed the connection]
koenig14 has joined #mlpack
koenig14 has quit [Killed (Sigyn (Spam is off topic on freenode.))]
< rcurtin>
but then I don't follow the need for the 'gamma' or 'beta' parameters (or what their shape should be)
< rcurtin>
I think that 'gamma' and 'beta' correspond to g and b in Eq.5 of the paper, with the sentence:
< rcurtin>
"They also learn an adaptive bias b and gain g for each neuron after the normalization."
< rcurtin>
to me that seems to imply that there's one bias and gain parameter for each neuron, and the number of neurons (I think) would be equal to the number of features (input.n_rows)
< rcurtin>
I could be wrong---like I said I am not an expert, so correct me if I am wrong :)
< rcurtin>
Shikhar's link doesn't seem to mention the bias/gain parameter. In fact I wonder if you get effectively the same thing if you remove bias/gain from LayerNorm, and then create a network with a LayerNorm layer followed immediately by a Linear layer
< rcurtin>
(this would mean there are no parameters to be learned in the LayerNorm layer, I suppose)
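(Roughly this kind of network, that is - a sketch only, assuming mlpack's LayerNorm layer takes the input size in its constructor; inputSize and outputSize are placeholders:)

    // Hypothetical: a parameter-free LayerNorm followed by a Linear layer,
    // which can learn at least as much as a per-dimension gain and bias.
    FFN<NegativeLogLikelihood<>> model;
    model.Add<LayerNorm<>>(inputSize);            // normalization only
    model.Add<Linear<>>(inputSize, outputSize);   // learns weights and bias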
< rcurtin>
please forgive me if what I wrote makes no sense :)