verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
mikeling has joined #mlpack
< rcurtin> wow, I just found some really cool usage of mlpack right before I went to bed:
< rcurtin> the paper is paywalled, but I want to highlight a bit in there:
< rcurtin> "MLPack [15] was chosen to be used for the NN library. Although MLPack's artificial neural network (ANN) is not as mature as other software libraries, it provided many advantages.
< rcurtin> First, the build process was relatively simple and the required dependency list was short.
< rcurtin> Second, its API is well-documented both with function definitions in Doxygen and code usage examples.
< rcurtin> Third, other machine learning algorithms in MLPack have been used by our cognitive communications colleagues at NASA GRC, so using a common library would ease incorporation of our cognitive engine with their activities.
< rcurtin> Additionally, MLPack does not support the Levenberg-Marquardt backpropagation algorithm for training. The authors wrote their own implementation of the algorithm and verified its performance using MATLAB's "trainlm" function in its Neural Network Toolbox."
< rcurtin> This is really exciting; I think I will send the authors an email to find out if our code has been used in space :)
< rcurtin> (this is a personal life goal for me, to do something that ends up in space... I guess code counts, sort of !)
< rcurtin> (also, still lots of confusion about the capitalization of mlpack... not sure how to fix that...)
vivekp has quit [Ping timeout: 248 seconds]
vivekp has joined #mlpack
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
kris1 has quit [Quit: kris1]
kris___ has quit [Quit: Connection closed for inactivity]
kris1 has joined #mlpack
< zoq> rcurtin: Exciting, we should implement Levenberg-Marquardt and release it as a space edition :)
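For context, the Levenberg-Marquardt training mentioned above interpolates between Gauss-Newton and gradient descent via a damping term. Below is a minimal NumPy sketch of the core update (not mlpack's API; the function names and the toy exponential-fit problem are illustrative):

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, x0, lam=1e-2, iters=50):
    """Minimize ||residual(x)||^2 with a damped Gauss-Newton step."""
    x = x0.astype(float)
    for _ in range(iters):
        r = residual(x)
        J = jacobian(x)
        # Solve (J^T J + lam * I) delta = -J^T r
        A = J.T @ J + lam * np.eye(x.size)
        delta = np.linalg.solve(A, -J.T @ r)
        x_new = x + delta
        if np.sum(residual(x_new) ** 2) < np.sum(r ** 2):
            x = x_new
            lam *= 0.5   # good step: move toward the Gauss-Newton direction
        else:
            lam *= 2.0   # bad step: fall back toward gradient descent
    return x

# Toy problem: fit y = a * exp(b * t); the true parameters are (a, b) = (2.0, -0.5).
t = np.linspace(0, 4, 20)
y = 2.0 * np.exp(-0.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)
p = levenberg_marquardt(res, jac, np.array([1.0, 0.0]))
```

The damping schedule (halve on improvement, double on failure) is one common heuristic; MATLAB's trainlm, which the paper's authors validated against, adapts the damping similarly.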
< zoq> Also, I always wonder why people write MLPack; the main paper uses MLPACK, so I'm not sure I get the connection. I guess at this point we've settled on mlpack; at least that's what I use :)
vivekp has quit [Ping timeout: 260 seconds]
vivekp has joined #mlpack
kris1 has quit [Quit: kris1]
kris1 has joined #mlpack
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Remote host closed the connection]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Remote host closed the connection]
kris___ has joined #mlpack
< kris___> This is the output on cifar dataset with 5000 images. Something seems to be wrong in the evaluation code.
< lozhnikov> kris___: try to increase the radius e.g. multiply it by 1.5 or 2
< lozhnikov> actually, the paper doesn't restrict the upper bound
< kris1> I did try that; I increased it by a factor of 3, but the same thing happens.
< kris1> I have done the preprocessing as per the paper.
< kris1> I think something might be wrong in the preprocessing part.
< kris1> Could you have a look at the gist?
< lozhnikov> maybe it is reasonable to increase slabPenalty and initial visiblePenalty
< kris1> Hmm, right now I am using the values provided in the paper; let me make the changes and see.
< lozhnikov> I looked through the preprocessing part. Actually it doesn't correspond to the paper. The paper states:
< lozhnikov> "We use the following protocol. We train mcRBM on 8x8
< lozhnikov> color image patches sampled at random locations, and then
< lozhnikov> we apply the algorithm to extract features convolutionally
< lozhnikov> over the whole 32x32 image by extracting features on a
< lozhnikov> 7x7 regularly spaced grid (stepping every 4 pixels)"
< kris___> Yes, but why do you say that the preprocessing is different?
< kris___> patches[:,:,channel, i * 7 + j, img] = img_data[i*4 : i*4 + 8, j*4 : j*4+8, channel]
< lozhnikov> 1. You don't sample patches at random locations
< kris___> I did it like we do with a CNN.
< lozhnikov> 2. You don't sample color patches; you use whitening instead.
< lozhnikov> okay, the authors refer to paper [13]: "We produced image features from an ssRBM model trained on patches using the same procedure as [13]"
< lozhnikov> and that paper states the same that I just wrote
< lozhnikov> i.e. you should sample a number of color patches at different locations and then train the ssRBM on them
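Sampling training patches at random locations, as lozhnikov describes, could look like this (a hypothetical helper, assuming images stored as an (n, 32, 32, 3) array):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_random_patches(images, n_patches, patch=8):
    """images: (n, 32, 32, 3); draw n_patches color patches at random locations."""
    n, h, w, c = images.shape
    out = np.empty((n_patches, patch * patch * c))
    for k in range(n_patches):
        idx = rng.integers(n)                  # random image
        r = rng.integers(h - patch + 1)        # random top-left corner
        col = rng.integers(w - patch + 1)
        out[k] = images[idx, r:r + patch, col:col + patch, :].ravel()
    return out

train = sample_random_patches(np.zeros((10, 32, 32, 3)), 1000)
```

The number of patches to draw is indeed an extra knob; as noted below, the paper does not pin it down.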
< kris___> Hmm, so I shouldn't center and whiten the patches?
< kris___> Also, how many random samples should I take? Is that another hyperparameter?
< lozhnikov> it seems the paper doesn't describe that
< kris___> Centering and whitening of the patches are required AFAIK; otherwise we get the assertion error that visiblePenalty is less than zero.
< lozhnikov> Could you elaborate a bit? I don't see the connection
< kris___> Well, when I did not do the preprocessing I was getting this error. I don't have a theoretical explanation right now.
< kris___> I would have to work that out.
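The centering-and-whitening step being debated is commonly done with ZCA whitening on the flattened patches. A minimal sketch, assuming patches are rows of X (the `eps` regularizer is an illustrative choice, not a value from the paper):

```python
import numpy as np

def zca_whiten(X, eps=1e-2):
    """Center the rows of X and apply ZCA whitening:
    X_w = (X - mu) @ W with W = E diag(1/sqrt(s + eps)) E^T,
    where (s, E) are the eigenvalues/vectors of the sample covariance."""
    mu = X.mean(axis=0)
    Xc = X - mu
    cov = Xc.T @ Xc / X.shape[0]
    s, E = np.linalg.eigh(cov)
    W = E @ np.diag(1.0 / np.sqrt(s + eps)) @ E.T
    return Xc @ W

X = np.random.default_rng(1).normal(size=(500, 48))
Xw = zca_whiten(X)
```

After whitening, the per-dimension variance is roughly 1, which plausibly explains why skipping it can push learned penalty terms out of their valid range, though as said above that connection would need to be worked out properly.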
mikeling has quit [Quit: Connection closed for inactivity]
< kris___> lozhnikov: I re-read the training procedure. I don't understand it.
< kris___> You train the ssRBM on color image patches
< kris___> randomly sampled
< lozhnikov> yeah
< kris___> but i don't get this line "extract features convolutionally"
< kris___> Are these required for training the logistic regression?
< kris___> because we already have the features for the ssRBM.
< lozhnikov> I think that means you have to sample 49 hidden variables from the image and concatenate them
< kris___> Yes but those features are required for which algorithm?
< lozhnikov> that is you should sample 49 patches (the shape of the image is 32x32, the step size is equal to 4), then sample hidden variables from each patch and concatenate the results
< lozhnikov> then you train logistic regressor on these features
< lozhnikov> If I understand your question right these features are required for logistic regression
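The pipeline lozhnikov describes (49 grid patches per image, hidden activations from the trained model per patch, concatenated into one feature vector for the logistic regressor) can be sketched like this. The `rbm_hidden` stand-in below is hypothetical; the real ssRBM's conditional hidden distribution is more involved than a plain sigmoid layer:

```python
import numpy as np

def rbm_hidden(patch_vec, W, b):
    """Stand-in for the trained model's hidden activations given a flattened patch."""
    return 1.0 / (1.0 + np.exp(-(W @ patch_vec + b)))   # sigmoid units

def image_features(img, W, b, patch=8, stride=4):
    """Concatenate hidden activations from the 49 grid patches of one 32x32 image."""
    n = (img.shape[0] - patch) // stride + 1             # 7 positions per axis
    feats = []
    for i in range(n):
        for j in range(n):
            v = img[i * stride:i * stride + patch, j * stride:j * stride + patch, :].ravel()
            feats.append(rbm_hidden(v, W, b))
    return np.concatenate(feats)                          # length 49 * n_hidden

rng = np.random.default_rng(2)
W, b = rng.normal(size=(16, 192)) * 0.01, np.zeros(16)
f = image_features(rng.normal(size=(32, 32, 3)), W, b)
```

These concatenated vectors, one per image, are what the logistic regressor is then trained and evaluated on.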
< kris___> Okay, so this is for the testing part of the algorithm.
< lozhnikov> yeah
< kris___> The ssRBM is trained on random 8x8 patches. Is that correct?
< lozhnikov> yeah, I think that's correct
< kris___> Okay, I will try to complete this by morning.
< kris___> What should we do about the GAN?
< lozhnikov> I think we should complete the test
< kris___> The problem seems to be that I don't have computational resources.
< lozhnikov> But I have :)
< kris___> Okay, I think we should complete the Batch Normalization just to be sure.
< lozhnikov> actually, I think we've got another issue. The oreilly example states that the step size should be equal to 3e-4. But our implementation shouldn't work with this step size (you can check the loss function to confirm)
< lozhnikov> it is enough to look at the pretrain phase
< lozhnikov> so, I guess the implementation of the discriminator differs from the oreilly example
< lozhnikov> but the discriminator network hasn't got batch normalization layers at all
kris1 has quit [Quit: kris1]
kris___ has quit [Quit: Connection closed for inactivity]