< jenkins-mlpack> Project mlpack - nightly matrix build build #448: STILL UNSTABLE in 1 hr 33 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/448/
witness___ has joined #mlpack
oldbeardo has joined #mlpack
< oldbeardo> naywhayare: I remembered one more thing that we had discussed earlier
< oldbeardo> any optimizer in the package looks for Evaluate(arma::mat& parameters) and Gradient(arma::mat& parameters, arma::mat& gradient) functions for an algorithm's implementation
< oldbeardo> instead of this we should have a default like CostGradient(arma::mat& parameters, arma::mat& gradient), since many of the methods involve common computations for both functions
< oldbeardo> also, splitting it into two effectively achieves nothing (there's no reason that comes to mind immediately)
< oldbeardo> I know that making this change will mean a lot of refactoring, but I think this is something worth doing now; it can potentially run methods at twice the speed
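For context, the interface oldbeardo is describing asks each function class to implement two separate methods, so any work they share is done twice. A minimal self-contained illustration of that duplication (a hypothetical least-squares objective, not an actual mlpack class; the const qualifiers are just for the sketch):

#include <armadillo>

// Hypothetical objective f(theta) = 0.5 * || X * theta - y ||^2.  Both
// Evaluate() and Gradient() need the residual (X * theta - y), so calling
// them separately computes it twice.
class ExampleFunction
{
 public:
  ExampleFunction(const arma::mat& data, const arma::vec& responses) :
      data(data), responses(responses) { }

  // Objective value.
  double Evaluate(const arma::mat& parameters) const
  {
    const arma::vec residual = data * parameters - responses; // shared work
    return 0.5 * arma::dot(residual, residual);
  }

  // Gradient: X^T * (X * theta - y).
  void Gradient(const arma::mat& parameters, arma::mat& gradient) const
  {
    const arma::vec residual = data * parameters - responses; // recomputed
    gradient = data.t() * residual;
  }

 private:
  const arma::mat& data;
  const arma::vec& responses;
};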
oldbeardo has quit [Quit: Page closed]
cuphrody has joined #mlpack
marcus_z1q has joined #mlpack
marcus_z1q is now known as marcus_zoq
oldbeardo has joined #mlpack
< naywhayare> oldbeardo: I think there are some cases where Evaluate() is called more often than Gradient() so it makes sense to split them up
< naywhayare> but you are right that in some cases simultaneously evaluating the objective function and the gradient is more efficient
< oldbeardo> naywhayare: okay, in which cases does that happen?
< naywhayare> I can't remember right now. I'd have to look it up
< oldbeardo> okay, because as I see it, the optimizer only calls the Evaluate() and Gradient() functions once per iteration
< oldbeardo> but I haven't seen every algorithm in the library, so you must be right
< naywhayare> well, it's possible that I'm wrong. but I seem to remember some instances where the objective function is desired and the gradient is not
< naywhayare> it makes sense to find some way to avoid duplicating work, though, so there is definitely some improvement that we could do
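For reference, the once-per-iteration pattern being discussed looks roughly like this inside a plain gradient descent loop (a schematic sketch over the Evaluate()/Gradient() interface above, not mlpack's actual optimizer code):

#include <armadillo>
#include <cstddef>

// Schematic gradient descent over any function class exposing
// Evaluate(parameters) and Gradient(parameters, gradient).
template<typename FunctionType>
double Optimize(FunctionType& function, arma::mat& parameters,
                const double stepSize = 0.01, const size_t maxIterations = 1000)
{
  arma::mat gradient;
  double objective = function.Evaluate(parameters);

  for (size_t i = 0; i < maxIterations; ++i)
  {
    // One Gradient() call and one Evaluate() call per iteration; any work
    // shared between the two (e.g. a forward pass) is done twice.
    function.Gradient(parameters, gradient);
    parameters -= stepSize * gradient;
    objective = function.Evaluate(parameters);
  }

  return objective;
}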
< oldbeardo> okay, then we could probably associate a flag with the function, 0 says only objective, 1 says only gradient and 2 says both
< naywhayare> maybe... flags are runtime, though. it'd be better to have all this figured out at compile-time
< naywhayare> I have to step out for a little while... I'll be back later
< oldbeardo> sure
< naywhayare> ok, so I think one important thing is that we don't modify the FunctionType abstraction too much
< naywhayare> we don't want to have so many methods that a user has to implement... ideally just Evaluate() and Gradient()
< naywhayare> but, I think what we can do is use some template metaprogramming techniques to use an EvaluateGradient() function if the user implemented it in their FunctionType class
< naywhayare> (the EvaluateGradient() function might need a better name that indicates it's giving both the objective and gradient as results)
< oldbeardo> okay, that seems fair, won't require refactoring, just writing a new function if needed
< naywhayare> actually implementing the template metaprogramming bit might be a little bit complicated though
< naywhayare> I'm not sure what exactly that will entail yet
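One possible shape for that metaprogramming, assuming the combined method is called EvaluateGradient() as suggested above and returns the objective while filling in the gradient (all of the names below are hypothetical and the detection only matches one exact signature; this is a sketch, not anything that exists in mlpack at this point):

#include <armadillo>
#include <type_traits>

// Detects whether FunctionType has a member with the exact signature
//   double EvaluateGradient(const arma::mat&, arma::mat&) const;
// (a real implementation would have to be more forgiving about signatures).
template<typename FunctionType>
struct HasEvaluateGradient
{
 private:
  template<typename T, double (T::*)(const arma::mat&, arma::mat&) const>
  struct Check { };

  template<typename T>
  static std::true_type Test(Check<T, &T::EvaluateGradient>*);

  template<typename T>
  static std::false_type Test(...);

 public:
  static const bool value = decltype(Test<FunctionType>(0))::value;
};

// Chosen when the function class provides the combined EvaluateGradient().
template<typename FunctionType>
typename std::enable_if<HasEvaluateGradient<FunctionType>::value, double>::type
EvaluateWithGradient(FunctionType& function, const arma::mat& parameters,
                     arma::mat& gradient)
{
  return function.EvaluateGradient(parameters, gradient);
}

// Fallback: call Evaluate() and Gradient() separately, duplicating any
// shared computation.
template<typename FunctionType>
typename std::enable_if<!HasEvaluateGradient<FunctionType>::value, double>::type
EvaluateWithGradient(FunctionType& function, const arma::mat& parameters,
                     arma::mat& gradient)
{
  function.Gradient(parameters, gradient);
  return function.Evaluate(parameters);
}

An optimizer could then call EvaluateWithGradient() everywhere and get the cheaper combined path whenever a function class happens to provide it, while existing FunctionType implementations keep working unchanged.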
< oldbeardo> yeah, well, you know the intricacies better, I just thought of an improvement
< oldbeardo> though I would like to see the solution to the problem
< oldbeardo> also, I have written a Softmax Regression implementation, should I open a ticket?
< naywhayare> sure, feel free
< naywhayare> I am pretty busy for the next few weeks so I don't know when I'll get to it, but I'll take a look
< naywhayare> you do have commit access too now, so you could just check it into a directory and leave it uncompiled in trunk/ for now
< oldbeardo> okay, well I wanted you to have a quick look, haven't written the tests yet
< naywhayare> ok; still, you can commit it to svn; that's what trunk/ is for
< naywhayare> if we do a release soon and it isn't done, I'll just pull it out of the release
< naywhayare> and leave it in trunk/ for future work
< oldbeardo> okay, for now I will just upload the files to the ticket, haven't acquainted myself with svn yet
< naywhayare> ok, whatever you like :)
< oldbeardo> when are you planning for a release?
< naywhayare> I was hoping to get it done sometime this month
< naywhayare> but I have a paper deadline on June 6th and I'm pretty far away from having the paper done...
< naywhayare> so I don't know if I'll have time to do it before mid-June
< naywhayare> there are a couple minor things to clean up -- Saheb patched most of the dual-tree algorithm constructors but not all of them, so I need to finish that
< naywhayare> lots of other little things too; I just can't remember them all right now. I think I could look through the open tickets and that would show a bunch of them
< naywhayare> the first step toward a release is sitting down and writing down everything that needs to be finished before that particular release can happen, and I haven't done that yet :)
< oldbeardo> okay, that sounds like mid-June then :)
< naywhayare> yeah, probably...
cuphrody has quit [Ping timeout: 276 seconds]
< oldbeardo> just opened the ticket, can you have a quick look?
< naywhayare> I don't really have the time to take a good look right now
< naywhayare> can you detail how it is meant to be used? you use softmax regression after training a set of stacked sparse autoencoders, right?
< naywhayare> i.e. you take the output of the last sparse autoencoder and then use softmax regression, I think. I could be wrong; I'm not sure
< oldbeardo> yes, so this is how it's done
< oldbeardo> you decide on a number of layers that you want in your network
< oldbeardo> you train that many autoencoders greedily one after the other
< oldbeardo> then you attach a classifier (Softmax Regression) to the last layer, and fine-tune the parameter weights
< naywhayare> ok
< oldbeardo> apart from this you can also use Softmax independently as a classifier
< naywhayare> so I see this from the UFLDL wiki:
< naywhayare> "In these notes, we describe the Softmax regression model. This model generalizes logistic regression to classification problems where the class label y can take on more than two possible values. "
< oldbeardo> yes
< naywhayare> mlpack already has logistic regression for two-class problems; would it make more sense to generalize the logistic regression code to be softmax regression?
< naywhayare> so that we don't have separate logistic regression and softmax regression implementations
< naywhayare> I don't know the answer to this of course because I have done very little reading on softmax regression. you are the expert :)
< oldbeardo> well, Softmax is the generalization, so what you are suggesting is removing Logistic Regression
< naywhayare> yeah; either removing logistic regression or merging the two implementations in a way that provides both logistic regression and softmax regression
< naywhayare> if two-class softmax regression is equivalent to logistic regression, then it doesn't make sense to have a separate logistic regression implementation, in my opinion
< oldbeardo> ummm, you won't be merging the two for sure, there's nothing in the Logistic Regression module that will improve the Softmax implementation
< naywhayare> true -- probably the only part that could be merged is the logistic_regression_main.cpp executable and the documentation it contains
< oldbeardo> the two are equivalent, read the comment in the Evaluate() function in softmax_regression_function.cpp, just above the 'probabilities' calculation
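To make the claimed equivalence concrete (a standalone sketch, not the code attached to the ticket): softmax regression assigns P(y = j | x) proportional to exp(theta_j . x), and with only two classes that collapses to the logistic sigmoid of (theta_0 - theta_1) . x, which is exactly logistic regression:

#include <armadillo>
#include <cmath>
#include <iostream>

// Softmax class probabilities for a single point x, given a parameter
// matrix theta with one row per class.
arma::vec SoftmaxProbabilities(const arma::mat& theta, const arma::vec& x)
{
  arma::vec scores = theta * x;
  scores -= scores.max();                  // shift for numerical stability
  const arma::vec expScores = arma::exp(scores);
  return expScores / arma::accu(expScores);
}

int main()
{
  // A single 3-dimensional point.
  arma::vec x(3);
  x(0) = 0.5;  x(1) = -1.2;  x(2) = 2.0;

  // Two-class softmax parameters, one row per class.
  const arma::mat theta("0.3 0.1 -0.4; -0.2 0.5 0.7");

  const arma::vec p = SoftmaxProbabilities(theta, x);
  p.print("softmax probabilities");

  // With two classes, P(y = 0 | x) equals the logistic sigmoid of
  // (theta_0 - theta_1) . x, i.e. plain logistic regression.
  const double z = arma::dot(theta.row(0) - theta.row(1), x);
  std::cout << "logistic sigmoid: " << 1.0 / (1.0 + std::exp(-z)) << std::endl;

  return 0;
}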
< naywhayare> okay
< naywhayare> I don't have too much time I can put into this today... unfortunately I have to do some grading work, which is time-consuming and tedious...
< oldbeardo> sure, I was going to end the discussion but you asked me about Stacked Autoencoders :)
< naywhayare> yeah; thank you for the explanation
< naywhayare> that will help my understanding of what the code does and how it will be used
< oldbeardo> no problem
oldbeardo has quit [Quit: Page closed]
witness___ has quit [Quit: Connection closed for inactivity]
ryde has joined #mlpack
< ryde> hello. Can anyone here provide hints on compiling mlpack with a BLAS (Intel MKL) living in a non-standard location?
< ryde> armadillo was a bit of a challenge with MKL, but I eventually found the right paths. Now mlpack, on the other hand, finds armadillo but is unable to locate the MKL includes. Has anyone had similar problems in the past?
< naywhayare> ryde: I think the issue is that when you use armadillo with MKL you have to compile with -lmkl, but mlpack doesn't automatically do that
< naywhayare> what errors are you getting?