ChanServ changed the topic of #mlpack to: Due to ongoing spam on freenode, we've muted unregistered users. See http://www.mlpack.org/ircspam.txt for more information, or join #mlpack-temp and chat there.
< davida>
zoq: I tried LeakyReLU<> to see if it was better. I think I forgot to switch that back before uploading the file. The result isn't any different. I will try doubling the epochs and post the result in TrainingResults.txt again.
< davida>
What bothers me is that the result is so much worse than the same model on TensorFlow. I could understand a small difference, but not a 50% delta.
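For reference, a minimal sketch of how the activation swap looks in an mlpack FFN (the layer sizes here are illustrative, not the actual model from this conversation; 0.03 is mlpack's default leaky slope):

#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/layer/layer.hpp>

using namespace mlpack::ann;

void AddLayers(FFN<NegativeLogLikelihood<>>& model)
{
  // Illustrative sizes only: 3 input maps, 8 output maps, 4x4 kernel,
  // stride 1, no padding, 64x64 input images.
  model.Add<Convolution<>>(3, 8, 4, 4, 1, 1, 0, 0, 64, 64);
  // model.Add<ReLULayer<>>();   // the standard ReLU layer
  model.Add<LeakyReLU<>>(0.03);  // the leaky variant being compared
  // ... pooling / linear / LogSoftMax layers as before ...
}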
< zoq>
davida: I'll have to take a closer look into the conv layer weight initialization.
< davida>
zoq: I completed the 200 epochs and uploaded the results with the standard ReLU layer. No improvement at all with longer training; in fact, it plateaus after about 50 epochs.
< davida>
BTW I renamed Week1Main.cpp to ConvolutionModelApplication.cpp.
< davida>
zoq: As the Python example uses batch gradient descent, I set my SGD parameters to BatchSize=40 and Iterations=27 (40 x 27 = 1,080 = the number of training examples) to get one complete pass through the dataset (assuming my understanding of how this works is correct). Something quite weird happened: training and test accuracy did not change for the entire 100 epochs and remained very low at 18.6%.
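A sketch of that optimizer setup (argument order: step size, batch size, max iterations, tolerance, shuffle; depending on the mlpack version the class lives in mlpack::optimization or, in later releases, in ensmallen's ens namespace - the values below other than 40, 27, 1,080, and the 100 epochs are assumptions):

using namespace mlpack::optimization;  // ens:: in newer releases

// 27 iterations of batch size 40 = one pass over the 1,080 training examples.
SGD<> optimizer(0.01,   // step size (alpha)
                40,     // batch size
                27,     // max iterations per Train() call
                1e-5,   // tolerance
                true);  // shuffle each pass

// One Train() call per "epoch"; training continues from the current weights.
for (size_t epoch = 0; epoch < 100; ++epoch)
  model.Train(trainSetX, trainSetY, optimizer);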
< zoq>
davida: Same if you use a step size of 0.01?
< davida>
zoq: checking now - I think I did try multiple values of alpha and there was no real improvement.
< zoq>
davida: Okay, one more test: what about using RandomInitialization instead of XavierInitialization?
< zoq>
davida: Either way I'll take a closer look into the conv layer.
< davida>
zoq: I did try RandomInitialization already and it made no difference.
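For reference, the initialization rule is the second template parameter of FFN, so swapping it is a one-line change (a sketch using mlpack's ann init rules; layers omitted):

#include <mlpack/methods/ann/ffn.hpp>
#include <mlpack/methods/ann/init_rules/random_init.hpp>
#include <mlpack/methods/ann/init_rules/glorot_init.hpp>

using namespace mlpack::ann;

// Xavier/Glorot initialization:
FFN<NegativeLogLikelihood<>, XavierInitialization> xavierModel;

// Plain uniform random initialization, mlpack's default rule:
FFN<NegativeLogLikelihood<>, RandomInitialization> randomModel;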
< zoq>
davida: Okay, strange.
< davida>
zoq: I have tried multiple combinations of hyperparameters and cannot get the result to improve beyond about 50%, and even that requires setting MaxIterations to 100,000.
< davida>
zoq: This is why I was wondering if I had actually coded my model correctly. In the Python exercise, Andrew Ng uses one-hot vectors, but I am assuming all of that is taken care of by NegativeLogLikelihood<>.
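For what it's worth, mlpack's NegativeLogLikelihood<> takes the targets as a single row of class labels rather than one-hot vectors (in the mlpack 3.x API those labels are expected to be 1-based), so a one-hot matrix from the Python exercise would need to be collapsed into labels first. A sketch, assuming the one-hot matrix has classes in rows and examples in columns:

#include <mlpack/core.hpp>

// oneHotY: numClasses x numExamples, exactly one 1 per column.
// Returns a 1 x numExamples row of labels in [1, numClasses], which is the
// format NegativeLogLikelihood<> expects (1-based in the mlpack 3.x API).
arma::mat OneHotToLabels(const arma::mat& oneHotY)
{
  arma::mat labels(1, oneHotY.n_cols);
  for (size_t i = 0; i < oneHotY.n_cols; ++i)
    labels(0, i) = arma::index_max(oneHotY.col(i)) + 1;  // +1 for 1-based labels
  return labels;
}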
< davida>
It might be a bit easier if we could actually see the cost after each epoch, but I am really not sure how to calculate that.
< davida>
I am using Train & Test accuracy as a proxy.
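One way to get both numbers: call Predict() on the train and test sets after each pass and compute the accuracy and a mean negative-log-likelihood style cost from the log-probabilities it returns (a sketch; Report is a hypothetical helper, and it assumes a LogSoftMax output layer and 1-based labels):

#include <mlpack/core.hpp>
#include <iostream>

// logProbs: numClasses x numExamples output of model.Predict() when the
// last layer is LogSoftMax; labels: 1 x numExamples, 1-based.
void Report(const arma::mat& logProbs, const arma::mat& labels)
{
  size_t correct = 0;
  double cost = 0.0;
  for (size_t i = 0; i < logProbs.n_cols; ++i)
  {
    const arma::uword truth = (arma::uword) labels(0, i) - 1;  // back to 0-based
    if (arma::index_max(logProbs.col(i)) == truth)
      ++correct;
    cost -= logProbs(truth, i);  // -log p(true class)
  }
  std::cout << "accuracy: " << (double) correct / logProbs.n_cols
            << "  mean NLL: " << cost / logProbs.n_cols << std::endl;
}

// Usage after each Train() call, e.g.:
//   arma::mat out;
//   model.Predict(trainSetX, out);
//   Report(out, trainSetY);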
< zoq>
davida: Do you think you could save and upload the train/test data as arma_binary ( trainSetX.save("A.bin"); )?
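In case it helps, the Armadillo side of that is just the following (arma_binary is already .save()'s default format, so the explicit flag is optional; the file names are placeholders):

// Saving:
trainSetX.save("trainSetX.bin", arma::arma_binary);
trainSetY.save("trainSetY.bin", arma::arma_binary);

// Loading in another program:
arma::mat X, Y;
X.load("trainSetX.bin");  // format is auto-detected
Y.load("trainSetY.bin");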