< jenkins-mlpack> Project mlpack - nightly matrix build build #456: STILL UNSTABLE in 1 hr 39 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/456/
naywhayare has joined #mlpack
oldbeardo has joined #mlpack
< naywhayare> oldbeardo: before the summer starts, we should work out some times in which I will try to be available every day on IRC
< naywhayare> the time zone difference makes it a little difficult; I think the difference between you and me is 9 and a half hours
< naywhayare> so probably when you do most of your work, I will be sleeping, but maybe we can arrange some times so that I am available at the beginning and end of your workdays
< oldbeardo> I don't think it will be an issue
< naywhayare> ok; if I am responsive enough now, then we can just keep doing what we are already doing :)
< oldbeardo> right now I leave early because I get internet access only till 11pm
< oldbeardo> once I'm back at home I can continue working till 2-3 am
< naywhayare> ok, if that works for you
< naywhayare> I can start waking up a little earlier too, if that's easier for you
< oldbeardo> yeah, it would be
< oldbeardo> not to sound rude, but I thought Ajinkya was my mentor
< naywhayare> ah, I thought we were co-mentoring, but you're right
< naywhayare> I knew he was helping with one of the CF projects but I couldn't remember if it was Sumedh or you
< naywhayare> I should have just looked it up
< oldbeardo> well, I have no problem if you were my mentor too :)
< naywhayare> ok, well if you have something set up with him, then that's good. I'll try to start waking up earlier anyway... that would probably be good for my productivity :)
< oldbeardo> yup, it will be your lunch time soon, won't it?
< oldbeardo> I'm asking because I'm going to upload the Softmax Regression code soon, would be great if we could finish it today, so that the next thing I do is QUIC-SVD
< naywhayare> I'm actually cooking lunch at home today
< naywhayare> so I won't be leaving the computer
< oldbeardo> oh great
< oldbeardo> naywhayare: have you ever worked with GPUs?
< naywhayare> oldbeardo: no, I haven't. I have basic knowledge of GPUs and I have attended some introductory presentations on using CUDA and similar libraries
< naywhayare> but I would not call myself an expert in any way :)
< oldbeardo> okay, no problem, I uploaded the code just now
< oldbeardo> IRC looks so empty now that the GSoC formalities are over
< naywhayare> yeah, not many people idling in here anymore
< naywhayare> still more active than it was two years ago :)
< naywhayare> let me download the code you uploaded, hang on
< oldbeardo> okay, wait are you saying that two years back you were the only active member on IRC? :D
< naywhayare> basically, yeah. even jenkins-mlpack wasn't here :)
< naywhayare> one of the old guys who used to work for the fastlab (james cline / Rodya) used to idle in here
< naywhayare> but he works for Rdio now and I don't think he's got time for mlpack anymore
< oldbeardo> what's Rdio?
< naywhayare> streaming music service; I don't think they're that popular, but I think they're doing okay
< naywhayare> I've never used it and I don't know too much about them
< oldbeardo> okay, just saw their website, looks in the early stages, and their service isn't available in India
< naywhayare> yeah, I think they are still a pretty small company
< naywhayare> ok, the test looks good; I think maybe ComputeAccuracy() should be in the test and not in SoftmaxRegression, though
< naywhayare> I can't think of cases other than testing when a user might want to call ComputeAccuracy()
< naywhayare> I could see that maybe a user might want to say "ok, how did the model do?" and print that as output
< naywhayare> but realistically I think a better idea might be to have a function in src/mlpack/core/ somewhere that scores predictions given true labels
< naywhayare> good to see the change to sp_mat worked fine
< naywhayare> although you only call GetGroundTruthMatrix() once, in the constructor of SoftmaxRegressionFunction; do you think that it should just be inlined into the constructor since it's not used anywhere else?
< oldbeardo> wait, I'll have to look, I forgot how I had arranged it
< oldbeardo> well, I made it this way so that if the code is needed for some other implementation, it can easily be copied
< oldbeardo> and inlining won't provide any significant advantage
< naywhayare> the only advantage it provides is code simplicity
< naywhayare> I think if someone else needs it, we can refactor it out then, if you think that's reasonable
< oldbeardo> I think it's better this way, I guess I'm a sucker for abstraction
< naywhayare> ok, we can leave it as is then
< naywhayare> I spend a lot of time trying to figure out how to keep APIs clean, so when I see a function I often think "do we need another function? can we get rid of it?"
< oldbeardo> okay, I haven't dealt with issues like these, what are the problems that arise because of this?
< naywhayare> when the API gets very complex it can be confusing for new users
< naywhayare> so ideally you want to make it as simple as possible for users to do most of what they need to do
< naywhayare> but you also want to provide flexibility so more advanced users can do more complex things
< naywhayare> so finding the right cutoff for what to provide as functions, what to have as template parameters, and so forth, can be quite difficult
< naywhayare> at one point mlpack used Epetra for sparse matrix support; in my opinion that is an example of a very bloated API
< naywhayare> Epetra is impossible to understand and is incredibly complex and has a million different types and functions and so forth
< oldbeardo> right, I get your point
< naywhayare> I'm sure Epetra does some very useful things, but... wow, it would take a long time to learn what it does
< oldbeardo> can't we make the internal functions private, instead of inlining?
< naywhayare> a private function would only be useful to that class
< naywhayare> I don't have a huge problem with leaving it public; SoftmaxRegressionFunction isn't a class most users should be using anyway
< naywhayare> kind of an internal class
< oldbeardo> okay, any other issues with the code?
< naywhayare> yeah, the ComputeAccuracy() function, and then I think we should adapt the logistic_regression_main.cpp executable to use SoftmaxRegression, and then make a softmax_regression_main.cpp
< naywhayare> alternately just make one softmax_regression_main.cpp that does a superset of what logistic_regression_main.cpp does
< oldbeardo> I took the ComputeAccuracy() function from the LogisticRegression class itself
< naywhayare> huh, why did I accept that then? let me look at its history
< naywhayare> weird, I don't think I spent enough time with this code before I accepted it
< naywhayare> let me think about it for a little while. if you can think of cases where ComputeAccuracy() or ComputeError() would be useful to an end user, let me know
< naywhayare> I think maybe the better option is to remove ComputeAccuracy() and then add some other method somewhere that computes accuracy given a list of predictions and true labels
< naywhayare> so it's not just in logistic regression or softmax regression
< oldbeardo> it will be useful to check the training set accuracy
< oldbeardo> but yes, this can be made a common function
< naywhayare> ok, I agree
< naywhayare> now we have to figure out where to put the common function :)
< naywhayare> I think maybe we will need a new namespace or directory under src/mlpack/core/
< naywhayare> I'm not sure ComputeAccuracy() would fit under anything that's already there... arma_extend, data (which is save/load/change labels), dists, kernels, math, metrics, optimizers, tree, util
< oldbeardo> how about core/performance
< naywhayare> yeah, let's do that for now and maybe we'll think of something better later
< naywhayare> performance sort of implies runtime performance, not classification performance, but I don't have a better word
< naywhayare> one last concern: the LogisticRegression code allows one to specify a decision boundary
< naywhayare> this is common for most two-class classifiers, but could we generalize that to the multi-class case?
< oldbeardo> ummm, I don't think so, I will check it out anyway
< naywhayare> I think maybe the user could pass some weighting vector, then you could multiply the class probabilities for each point by the weighting vector before selecting the max
< naywhayare> think about the two class case... if you want a decision boundary of 0.5, you could pass [1 1] (the default), but if you want a decision boundary of 0.25, you could pass something different, like [1.5 0.5]?
< naywhayare> I haven't done the exact math yet to figure out what the exact relation is
< naywhayare> but I think something like that might work
< oldbeardo> see the problem there is you assume a linear separation in the binary case
< oldbeardo> so you have to check only for a single line
< oldbeardo> in the multiclass case you will have to check for an exponential number of line combinations to come up with the class
< naywhayare> hang on a second, let me sketch something out
< oldbeardo> at least that's what my intuition tells me
< naywhayare> ok, hang on, uploading some images
< oldbeardo> sure
< oldbeardo> which site are you using? because some of them are blocked
< naywhayare> my own :)
< oldbeardo> heh, nice one
< naywhayare> the hard part is getting them off my phone
< naywhayare> ok
< oldbeardo> while looking at core/arma_extend/ I happened to see files named 'SpMat_extra_bones', 'SpMat_extra_meat'
< naywhayare> I've drawn three gaussians and then an approximate decision surface between them
< oldbeardo> what's that about?
< naywhayare> let me finish this real quick and I'll tell you what those are
< naywhayare> the explanation for that is a little complex :)
< oldbeardo> okay sure
< naywhayare> anyway, I then modified the image so that the pdf of g_2 is weighted about twice as heavily
< naywhayare> and I tried to draw what I thought the modified decision surface would look like
< naywhayare> to classify a point, just like softmax regression, you just find the probability of the point coming from each of those gaussians, and select the gaussian with max probability
< naywhayare> but if I double the probability of the point coming from g2, then I get a modified decision surface like in the image, but I didn't have to calculate numerous lines; just multiply the probability estimate for g2 by 2 (or whatever factor)
< naywhayare> but I'm not sure that's the same thing that is happening in logistic regression
< oldbeardo> I'm actually a little confused by this
< oldbeardo> I have no idea about the Gaussian interpretation of Logistic Regression
< naywhayare> no, I wasn't using logistic regression specifically as an example
< naywhayare> this is some classifier that has no relation to logistic regression or softmax regression
< naywhayare> the similarity is that it classifies in the same way softmax regression does
< naywhayare> I think that you should be able to represent softmax regression as a series of PDFs in some space (like these three gaussians)
< oldbeardo> is this an existing algorithm or are you trying to come up with a new one?
< naywhayare> no, I'm trying to extend the idea of a modifiable decision boundary from logistic regression to softmax regression
< naywhayare> I don't know if anyone has done that before
< naywhayare> ideally I would like to entirely remove the logistic regression module and replace it with softmax regression
< naywhayare> but to do this we need to be sure that softmax regression has the same functionality; the only thing it's missing is the decision boundary parameter
< naywhayare> so I was trying to think of a multiclass generalization to the decision boundary parameter
< oldbeardo> actually we shouldn't
< naywhayare> you don't think so?
< oldbeardo> yes I don't, this might be a silly reason but I'll say it anyway
< naywhayare> it's ok, go ahead :)
< oldbeardo> while testing today I used the same Gaussians as in the Logistic Regression test
< oldbeardo> they had base points as "1.0 1.0 1.0" and "9.0 9.0 9.0"
< oldbeardo> using that dataset, I was getting an accuracy of 52%
< oldbeardo> so I worked out the math, turns out Softmax does not give the Logistic cost function when num_classes=2
< oldbeardo> I did this mentally so I may be incorrect
< naywhayare> I thought that it worked out to be the same... hang on, let me look up that site that said it was the same
< oldbeardo> the point I'm making is that Softmax has a bias towards feature points with a higher norm
< naywhayare> can you explain why that is? I'm trying to understand
< oldbeardo> okay
< oldbeardo> the probability for a class is exp(lin_j) / sum(exp(lin_i))
< oldbeardo> if you take num_classes = 2 it becomes exp(lin_0) / (exp(lin_0) + exp(lin_1))
< oldbeardo> which is 1 / (1 + exp(lin_1 - lin_0))
< naywhayare> lin_j = \theta_j^T * x?
< naywhayare> just to be sure we are on the right page
< oldbeardo> yes
< naywhayare> ok
< oldbeardo> now this is not the same as sigmoid(lin_0)
< oldbeardo> so, the learned weights are in favour of the class which has higher norm for its data points
< oldbeardo> at least that's what I inferred from the printed probabilities
< naywhayare> I need to spend some time thinking about why this is true
< naywhayare> but I see what you mean
< naywhayare> I have to go in a few moments, so I wanted to clarify one more thing -- you asked about Mat_extra_bones.hpp and so forth
< naywhayare> SpMat_extra_bones.hpp, specifically
< naywhayare> in the Armadillo code, the class Mat is defined in Mat_bones.hpp
< naywhayare> at the bottom of the file, it has this nice stanza:
< naywhayare> #ifdef ARMA_EXTRA_MAT_PROTO
< naywhayare> #include ARMA_INCFILE_WRAP(ARMA_EXTRA_MAT_PROTO)
< naywhayare> #endif
< naywhayare> but that's inside the definition of the Mat class
< naywhayare> so to extend the functionality of the Mat class, you just have to define ARMA_EXTRA_MAT_PROTO and create that file, and include some things in it
< oldbeardo> okay, nice naming though :)
< naywhayare> it's the same for SpMat, Col, Row, and so forth
< naywhayare> ah, you can thank Conrad for the bones/meat naming; that was his idea
< naywhayare> anyway, the SpMat_extra_bones.hpp file just adds the batch constructors for SpMat if the version of Armadillo being used is older than 3.810.0
< naywhayare> most of the stuff in arma_extend/ is backports so that mlpack can work with older versions of Armadillo while still utilizing new functionality
< oldbeardo> okay, also I wanted to ask
< oldbeardo> I was thinking of shifting to Ubuntu 14.04, will the latest armadillo work?
< naywhayare> it should, and if it doesn't, we'll just fix the bugs :)
< naywhayare> anything else before I go? I assume you're about to leave too
< oldbeardo> no, that's it, yup I'm about to
< naywhayare> ok, I'm gonna go then. see you later
< oldbeardo> yup, see ya
oldbeardo has quit [Quit: Page closed]