ChanServ changed the topic of #mlpack to: Due to ongoing spam on freenode, we've muted unregistered users. See http://www.mlpack.org/ircspam.txt for more information, or also you could join #mlpack-temp and chat there.
LordCow23 has quit [Read error: Connection reset by peer]
mac_nibblet9 has quit [Remote host closed the connection]
feirlane27 has joined #mlpack
feirlane27 has quit [Remote host closed the connection]
jamesd1 has joined #mlpack
jamesd1 has quit [K-Lined]
Game_Freak has joined #mlpack
Game_Freak is now known as Guest34047
Guest34047 has quit [Remote host closed the connection]
kuzetsa6 has joined #mlpack
kuzetsa6 has quit [Remote host closed the connection]
jose18 has joined #mlpack
jose18 has quit [Remote host closed the connection]
Kirjava20 has joined #mlpack
Kirjava20 has quit [Remote host closed the connection]
Genesis- has joined #mlpack
Genesis- has quit [Remote host closed the connection]
heyimwill20 has joined #mlpack
heyimwill20 has quit [Remote host closed the connection]
jadaking has joined #mlpack
jadaking has quit [Remote host closed the connection]
zipleen6 has joined #mlpack
meka17 has joined #mlpack
zipleen6 has quit [Remote host closed the connection]
meka17 has quit [K-Lined]
cnf8 has joined #mlpack
cnf8 has quit [Remote host closed the connection]
mutk has joined #mlpack
mutk has quit [K-Lined]
fikka has joined #mlpack
fikka has quit [Remote host closed the connection]
gsora20 has joined #mlpack
gsora20 has quit [Remote host closed the connection]
evil_steve21 has joined #mlpack
evil_steve21 has quit [Remote host closed the connection]
Ferus6 has joined #mlpack
Ferus6 has quit [Remote host closed the connection]
< ShikharJ> zoq: Those are minor changes, which I think can be done in your PR itself? I'd suggest that we remove the default parameters altogether, instead of setting the size as 10 for BatchNorm and as 1 for LayerNorm.
< zoq> ShikharJ: Okay, I'll make the necessary changes later today.
ivarmedi has joined #mlpack
ivarmedi has quit [Remote host closed the connection]
Dominionionion22 has joined #mlpack
Dominionionion22 has quit [Remote host closed the connection]
kaospunk_ has joined #mlpack
kaospunk_ has quit [Remote host closed the connection]
robertohueso has joined #mlpack
robertohueso has quit [Client Quit]
robertohueso has joined #mlpack
Bercik0 has joined #mlpack
Bercik0 has quit [Remote host closed the connection]
IrishFBall3227 has joined #mlpack
IrishFBall3227 has quit [Remote host closed the connection]
mvantellingen6 has joined #mlpack
mvantellingen6 has quit [Remote host closed the connection]
joepie9127 has joined #mlpack
joepie9127 has quit [Remote host closed the connection]
gtrs_ has joined #mlpack
gtrs_ has quit [Remote host closed the connection]
blakjak888 has joined #mlpack
blakjak has joined #mlpack
blakjak has quit [Client Quit]
blakjak888_ has joined #mlpack
blakjak888 has quit [Ping timeout: 256 seconds]
blakjak888_ has quit [Client Quit]
blakjak888 has joined #mlpack
blakjak888_ has joined #mlpack
blakjak888__ has joined #mlpack
blakjak888__ has quit [Client Quit]
blakjak888__ has joined #mlpack
blakjak888__ has quit [Client Quit]
blakjak888 has quit [Ping timeout: 256 seconds]
blakjak888__ has joined #mlpack
blakjak888_ has quit [Ping timeout: 256 seconds]
blakjak888__ has quit [Client Quit]
blakjak888_ has joined #mlpack
< blakjak888_> I am trying to get a little help with my first MLPACK FFN. I am going through Andrew Ng's Deeplearning.ai course and doing the projects in parallel in C++ with MLPACK and Armadillo. Is this the right place to ask some questions related to the use of MLPACK? I have searched a lot for examples but could not find anything to explain why my code is failing.
< blakjak888_> I build an FFN like this:
< blakjak888_> mlpack::ann::FFN<mlpack::ann::CrossEntropyError<>, mlpack::ann::RandomInitialization> model; model.Add<mlpack::ann::Linear<> >(trainSetX.n_rows, 25); model.Add<mlpack::ann::ReLULayer<> >(); model.Add<mlpack::ann::Linear<> >(25, 12); model.Add<mlpack::ann::ReLULayer<> >(); model.Add<mlpack::ann::Linear<> >(12, 6); model.Add<mlpack::ann::LogSoftMax<> >();
< blakjak888_> ... with my optimizer like this: mlpack::optimization::SGD<mlpack::optimization::AdamUpdate> optimizer(0.0001, 64, 10000, 1e-8, true, mlpack::optimization::AdamUpdate(1e-8, 0.9, 0.999));
< blakjak888_> However, I am always getting the same error no matter how I tune my "step" value. It tells me that SGD does not converge.
< blakjak888_> ... sorry, it actually converges to NaN.
< blakjak888_> I have double-checked my input data and it exactly matches my Python data. The input data is 12288 x 1080.
< blakjak888_> I suspect I may be missing a step somewhere, however when I follow any examples that I can find on the web for FFNs they pretty much do what I am doing here.
< blakjak888_> Can anyone give some advice or point me in the right direction?
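For readability, here is the network and optimizer described above as a single self-contained sketch; the include paths follow the mlpack 3 source layout and the data-loading step is left as a placeholder, so treat it as an outline rather than a drop-in program:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/ann/ffn.hpp>
    #include <mlpack/methods/ann/layer/layer.hpp>
    #include <mlpack/core/optimizers/sgd/sgd.hpp>
    #include <mlpack/core/optimizers/adam/adam_update.hpp>

    int main()
    {
      arma::mat trainSetX;        // 12288 x 1080: one 64x64x3 image per column.
      arma::mat oneHotTrainSetY;  // 6 x 1080: one-hot labels, one column per image.
      // ... load the data here ...

      // 12288 -> 25 -> 12 -> 6 feed-forward network with ReLU activations.
      mlpack::ann::FFN<mlpack::ann::CrossEntropyError<>,
                       mlpack::ann::RandomInitialization> model;
      model.Add<mlpack::ann::Linear<> >(trainSetX.n_rows, 25);
      model.Add<mlpack::ann::ReLULayer<> >();
      model.Add<mlpack::ann::Linear<> >(25, 12);
      model.Add<mlpack::ann::ReLULayer<> >();
      model.Add<mlpack::ann::Linear<> >(12, 6);
      model.Add<mlpack::ann::LogSoftMax<> >();

      // SGD with the Adam update rule: step size, batch size, max iterations,
      // tolerance, shuffle, AdamUpdate(epsilon, beta1, beta2).
      mlpack::optimization::SGD<mlpack::optimization::AdamUpdate> optimizer(
          0.0001, 64, 10000, 1e-8, true,
          mlpack::optimization::AdamUpdate(1e-8, 0.9, 0.999));

      model.Train(trainSetX, oneHotTrainSetY, optimizer);
    }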
< rcurtin> blakjak888_: sorry that you're having issues
< rcurtin> there are some examples you could look at in src/mlpack/tests/feedforward_network_test.cpp, and perhaps those would be useful
< rcurtin> if you're training with cross-entropy error I guess this is classification... can you tell me what your labels are?
< blakjak888_> I have 6 labels. The data is 1080 examples of 64x64x3 (RGB) images of a hand showing 1, 2, 3, 4, or 5 fingers, or a 0 indicated by thumb and forefinger together.
< rcurtin> sounds good. do the labels take values between 0 and 5, or 1 and 6? (or something else?)
< blakjak888_> 0 to 5
< rcurtin> hm. so, I am not sure on this (zoq should correct me) but I think that the output for using CrossEntropyError needs to be one-hot encoded
< rcurtin> let me check an example...
< rcurtin> right, I think this is the case; so instead of having a vector like [0 3 2 1 1 ...] where each element is a label, try using a matrix with 6 rows, where only the row corresponding to the true label is 1 and all others are zeros
< blakjak888_> Yes. I am using one-hot encoding
< rcurtin> ohh, I see
< rcurtin> ok
< blakjak888_> So my label matrix is 6x1080
< rcurtin> right, that sounds good
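A minimal Armadillo sketch of that encoding, with illustrative variable names (labels in {0, ..., 5}, one column per example):

    arma::Row<size_t> labels;      // e.g. [0 3 2 1 1 ...], one entry per example.
    const size_t numClasses = 6;
    arma::mat oneHotY(numClasses, labels.n_elem, arma::fill::zeros);
    for (size_t i = 0; i < labels.n_elem; ++i)
      oneHotY(labels[i], i) = 1.0;  // put a 1 in the row of the true class.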
< rcurtin> the model you're using is pretty simple, so I don't think simplifying it further will make a difference...
< blakjak888_> I have manually checked all the input matrices and they seem to match exactly what I had in the Coursera deeplearning.ai tutorial.
< rcurtin> have you tried using a different optimizer than Adam?
< blakjak888_> I came to suspect that my optimizer was not setup correctly due to the error message I was getting.
< rcurtin> it seems like a bit of a long shot, since usually Adam will work ok
< blakjak888_> I tried GradientDescent
< rcurtin> just glancing at the definition of 'optimizer' it seems to have ok parameters to me
< rcurtin> did that work?
< blakjak888_> same error. In fact I tried several optimizers and all were giving me the same error.
< blakjak888_> Converging to NaN
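For reference, switching to the plain SGD update only changes the optimizer definition; a sketch with the same hyperparameters as the Adam setup above (SGD<> defaults to the vanilla update policy, and the variable names come from the earlier sketch):

    // Step size, batch size, max iterations, tolerance, shuffle.
    mlpack::optimization::SGD<> optimizer(0.0001, 64, 10000, 1e-8, true);
    model.Train(trainSetX, oneHotTrainSetY, optimizer);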
< rcurtin> hm, ok, that is unexpected. based on what you've told me so far this *should* work, so there must be something else
< rcurtin> can you show me the full code?
< blakjak888_> I even cut my FFN to 2 layers. Linear then Sigmoid. ... but same error
< blakjak888_> Sure.
< blakjak888_> Is there a way to post code here?
< rcurtin> I'd use pastebin, then copy the pastebin link here
< blakjak888_> I am not so familiar with IRC
< rcurtin> no worries :)
< blakjak888_> I am using a browser right now
< rcurtin> yeah, webchat works reasonably well enough I think :)
< blakjak888_> Should I paste the file or a copy of the text?
< rcurtin> either should be fine as long as I can see the code :)
< blakjak888_> not sure how to do a pastebin on this
< blakjak888_> If I just paste now I think it will join all the lines together like above.
< blakjak888_> I'll try
< rcurtin> hmmm, that's unfortunate... I guess you could re-add the line breaks in but that's a bit tedious
< blakjak888_> Yup. Won't allow it. Text too long
< rcurtin> maybe just the part of the code concerned with the building of the network and the training?
< blakjak888_> Can I email you the text?
< rcurtin> sure, that can work
< rcurtin> ryan@ratml.org
< blakjak888_> Should be in your inbox. I cut out all the header stuff.
< blakjak888_> Just sent main()
< rcurtin> in your data load code, you can also do 'mlpack::data::Load("train_signs.h5", trainSetX)' and I think that will get everything in the format you need :)
< rcurtin> that's just a minor comment, it shouldn't make any difference for the actual program :)
< blakjak888_> I tried that for the load, but this is a multi-dimensional H5 dataset
< blakjak888_> I found this was the only way to get the data in the layout I needed for Armadillo matrices
< rcurtin> ah, ok, the Armadillo functionality for loading HDF5 isn't perfect
< rcurtin> (sorry about that!)
< rcurtin> it works well if the hdf5 file just has a single matrix
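For a file that does hold a single matrix, the call would look roughly like this, assuming Armadillo was built with HDF5 support (the third argument makes a load failure fatal; by default the matrix is transposed so each column is one data point):

    arma::mat trainSetX;
    mlpack::data::Load("train_signs.h5", trainSetX, true);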
< rcurtin> are you sure that 'oneHotTrainSetY' doesn't contain any strange values?
< blakjak888_> I can post the code. It is very simple.
< rcurtin> also, I'm not sure if the call to ResetParameters() is needed before Train()
< blakjak888_> And I checked that too.
< blakjak888_> #include <armadillo> template <typename T> arma::Mat<T> oneHot(const arma::Row<T>& in) { //arma::Row<T> uniqueIn(arma::unique(in)); arma::Mat<T> oneHotMtx; //oneHotMtx.zeros(arma::Mat<T>(arma::unique(in)).n_cols, in.n_cols); oneHotMtx.zeros(arma::max(in)+1, in.n_cols); for (unsigned int i = 0; i <= in.n_cols; ++i) oneHotMtx.at(in.at(i), i) = 1; return oneHotMtx; }
< blakjak888_> Messy. I can email it.
< rcurtin> thanks; I think I understand it like this, but it's easier to follow with line breaks :)
< rcurtin> I don't see anything wrong with that code either
< rcurtin> so my only thoughts for debugging here are ResetParameters(), a different optimizer, or perhaps that something about the data is being loaded very weird. However, you say you've already checked the data so that seems unlikely
< rcurtin> if none of those ideas of mine work, I'm wondering if the best idea is to open a GitHub issue. I don't expect that something's wrong with the FFN code, because I've certainly trained a lot of networks like the one you're using here
< rcurtin> another thing to try (though the comments imply you've already tried it) is to use NegativeLogLikelihood<> instead of CrossEntropyError<>; if you did that, what were the results?
< rcurtin> (unfortunately NegativeLogLikelihood<> expects a vector of labels from 1 to n_classes, not one-hot encoded. so that is a little confusing)
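A short sketch of that conversion, with illustrative names: recover 0-based class indices from the one-hot matrix and shift them into the 1-based range NegativeLogLikelihood<> expects:

    arma::urowvec labels0 = arma::index_max(oneHotTrainSetY, 0);     // 0-based class per column.
    arma::mat nllY = arma::conv_to<arma::mat>::from(labels0) + 1.0;  // values in {1, ..., 6}.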
< blakjak888_> Is the Linear function WX+b? Is there a way to manually set the W and b parameters rather than use RandomInit?
< rcurtin> there are a lot of initializations in src/mlpack/methods/ann/init_rules/, I suppose you could use those to set something explicitly
tokenrove16 has joined #mlpack
< rcurtin> or, rather, I mean, you could write your own initialization class with the same API as in there, and perhaps set all the weights to 1 or something for debugging
< rcurtin> unfortunately accessing the parameters of individual layers is a little bit hard because the design of the C++ code uses boost::variant for speed
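A minimal sketch of such a debugging initializer, mirroring the Initialize() interface of the classes in src/mlpack/methods/ann/init_rules/ (the class name and fill value here are illustrative, not part of mlpack):

    class FixedInitialization
    {
     public:
      FixedInitialization(const double value = 1.0) : value(value) { }

      // Fill a weight matrix of the requested size with a fixed value.
      template<typename eT>
      void Initialize(arma::Mat<eT>& W, const size_t rows, const size_t cols)
      {
        W.set_size(rows, cols);
        W.fill(value);
      }

      // Same for cube-shaped weights (e.g. convolutional layers).
      template<typename eT>
      void Initialize(arma::Cube<eT>& W, const size_t rows, const size_t cols,
                      const size_t slices)
      {
        W.set_size(rows, cols, slices);
        W.fill(value);
      }

     private:
      double value;
    };

    // Usage: mlpack::ann::FFN<mlpack::ann::CrossEntropyError<>, FixedInitialization> model;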
< blakjak888_> That was what I was thinking. Are the parameters for each function and the format needed documented? e.g. I would expect that for my Linear<> layer the parameters should be two matrices, W = 25x12288 and b = 25x1, but when I tried to examine the Parameters() return value it gave me some strange matrix
< rcurtin> right, that Parameters() matrix is actually the entire set of parameters in the entire network
< blakjak888_> Oh, sorry, I see you replied before I asked
< rcurtin> so it's the concatenation of all the weights and biases in the network
< rcurtin> this is nice for speed, because everything is localized in memory
< rcurtin> however, it's less nice for actually inspecting what is going on
< rcurtin> it would be possible (but irritating) to write a method that actually returned the weight matrix only of a single layer, but even then because of the boost::variant usage things get really hard with types
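As a rough illustration only (the layer-by-layer ordering inside Parameters() is an assumption here, not something confirmed above): for the 12288 -> 25 -> 12 -> 6 network the vector should hold 25*12288 + 25 + 12*25 + 12 + 6*12 + 6 = 307615 entries, and the first layer's weights and bias would then be sliced out like this:

    const arma::mat& params = model.Parameters();  // one long column of all weights and biases.
    arma::mat W1 = arma::reshape(params.rows(0, 25 * 12288 - 1), 25, 12288);
    arma::vec b1 = params.rows(25 * 12288, 25 * 12288 + 24);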
< blakjak888_> Perhaps I could start with a much simpler example to test whether it is actually my MLPACK installation causing problems. Do you know where I could find something very basic/simple for testing an FFN with a small dataset and a known outcome?
< rcurtin> there's an MNIST example model that just uses a simple network
< rcurtin> you can get the data from the repository in the right format: https://github.com/mlpack/models
tokenrove16 has quit [Ping timeout: 264 seconds]
< rcurtin> I think it should be pretty easy to copy-paste the code from that example (or use the example in its entirety) and see if it works
< rcurtin> another idea would be to look in src/mlpack/tests/feedforward_network_test.cpp, or even run the tests
< blakjak888_> Thanks. I actually tried a similar version of that last night, but it was using a convolution network so it was not quite like mine. This looks much better. I will give it a go and see if I have any similar convergence issues.
< rcurtin> you could do 'make mlpack_test' to build the tests, then run all the tests and see if there are any issues
< rcurtin> like I said, this one is confusing me a little bit. based on everything you've shown me it *should* work just fine
< rcurtin> sorry that you are having issues :(
< blakjak888_> I have a Windows installation so M$ could be screwing it up!!!
< rcurtin> ahh, yeah, it can be a little more difficult to use mlpack on Windows. but it is possible :)
< rcurtin> we do test our code with AppVeyor so at least there it properly builds and runs on a Windows environment. but many things could be different between your setup and that one...
< blakjak888_> I will try now and post back when I get a result.
< rcurtin> sure; I will be out for lunch soon, but I'll try and respond when I'm able to
< rcurtin> and if you'd rather move to Github issues for debugging I'm happy to try and help there when I'm able also
< blakjak888_> Thx
< blakjak888_> Well the code is running:
< blakjak888_> Reading data ... Training ... [NVBLAS] NVBLAS_CONFIG_FILE environment variable is set to 'D:\usr\C++\FirstLSTM\FirstLSTM\nvblas.conf' 0 - accuracy: train = 29.5741%, valid = 29.2143%
< blakjak888_> and clearly the training is working:
< blakjak888_> 1 - accuracy: train = 50.4286%, valid = 48.6429% 2 - accuracy: train = 59.5979%, valid = 58.8095% 3 - accuracy: train = 65.2407%, valid = 63.9524%
< blakjak888_> ... so I will play with this working code and my data to see if it is actually my data that is causing the problem
< rcurtin> hmm, ok. I did wonder if anything was 'weird' about the data (maybe a hidden NaN somewhere or something) but based on what you said it seemed unlikely
< rcurtin> and since it's from a course the data should be ok also
< blakjak888_> I noticed that in this example code there is no normalization of the data.
< blakjak888_> The training is done directly on the pixel data.
< rcurtin> that shouldn't necessarily be an issue; if the pixel data is between 0 and 255 the network should be able to adjust just fine
< rcurtin> it's just weird things like NaN or Inf that'll definitely cause problems (or uninitialized memory being used? I didn't see any of that just from looking at the code though)
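If you did want to scale the raw 0-255 pixels into [0, 1] anyway, it is a one-line Armadillo operation on the feature matrix:

    trainSetX /= 255.0;  // in-place element-wise scaling of the 12288 x 1080 matrix.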
< blakjak888_> It's a huge dataset, about 13 MB, so it's hard to check everything, but this has at least given me a direction to go in. Thanks for your help
< rcurtin> sure :)
tedjp has joined #mlpack
Guest13996 has joined #mlpack
tedjp has quit [Remote host closed the connection]
Guest13996 has quit [Remote host closed the connection]
roidelapluie25 has joined #mlpack
roidelapluie25 has quit [Remote host closed the connection]
KnownSyntax23 has joined #mlpack
KnownSyntax23 has quit [Remote host closed the connection]
catsup1 has joined #mlpack
catsup1 has quit [Ping timeout: 252 seconds]
noeatnosleep22 has joined #mlpack
LeoTh3o28 has joined #mlpack
noeatnosleep22 has quit [Remote host closed the connection]
LeoTh3o28 has quit [Remote host closed the connection]
okamis has joined #mlpack
okamis has quit [Remote host closed the connection]
cjlcarvalho has joined #mlpack
Osleg13 has joined #mlpack
Osleg13 has quit [Remote host closed the connection]
etonka28 has joined #mlpack
etonka28 has quit [Remote host closed the connection]
cjlcarvalho has quit [Ping timeout: 252 seconds]
lumidify has joined #mlpack
lumidify has quit [Remote host closed the connection]
Guest28198 has joined #mlpack
LaunchpadMcQuack has joined #mlpack
Guest28198 has quit [Ping timeout: 252 seconds]
LaunchpadMcQuack has quit [Remote host closed the connection]
FruitieX18 has joined #mlpack
pepo3 has joined #mlpack
FruitieX18 has quit [Remote host closed the connection]
pepo3 has quit [Remote host closed the connection]
hayer_ has joined #mlpack
hayer_ has quit [Remote host closed the connection]
phadthai28 has joined #mlpack
phadthai28 has quit [Ping timeout: 252 seconds]