< naywhayare>
oldbeardo: before the summer starts, we should work out some times when I will try to be available on IRC every day
< naywhayare>
the time zone difference makes it a little difficult; I think the difference between you and me is 9 and a half hours
< naywhayare>
so probably when you do most of your work, I will be sleeping, but maybe we can arrange some times so that I am available at the beginning and end of your workdays
< oldbeardo>
I don't think it will be an issue
< naywhayare>
ok; if I am responsive enough now, then we can just keep doing what we are already doing :)
< oldbeardo>
right now I leave early because I get internet access only till 11pm
< oldbeardo>
once I'm back at home I can continue working till 2-3 am
< naywhayare>
ok, if that works for you
< naywhayare>
I can start waking up a little earlier too, if that's easier for you
< oldbeardo>
yeah, it would be
< oldbeardo>
not to sound rude, but I thought Ajinkya was my mentor
< naywhayare>
ah, I thought we were co-mentoring, but you're right
< naywhayare>
I knew he was helping with one of the CF projects but I couldn't remember if it was Sumedh or you
< naywhayare>
I should have just looked it up
< oldbeardo>
well, I have no problem if you were my mentor too :)
< naywhayare>
ok, well if you have something set up with him, then that's good. I'll try to start waking up earlier anyway... that would probably be good for my productivity :)
< oldbeardo>
yup, it will be your lunch time soon, won't it?
< oldbeardo>
I'm asking because I'm going to upload the Softmax Regression code soon; it would be great if we could finish it today, so that the next thing I work on is QUIC-SVD
< naywhayare>
I'm actually cooking lunch at home today
< naywhayare>
so I won't be away from a computer
< oldbeardo>
oh great
< oldbeardo>
naywhayare: have you ever worked with GPUs?
< naywhayare>
oldbeardo: no, I haven't. I have basic knowledge of GPUs and I have attended some introductory presentations on using CUDA and similar libraries
< naywhayare>
but I would not call myself an expert in any way :)
< oldbeardo>
okay, no problem, I uploaded the code just now
< oldbeardo>
IRC looks so empty now that the GSoC formalities are over
< naywhayare>
yeah, not many people idling in here anymore
< naywhayare>
still more active than it was two years ago :)
< naywhayare>
let me download the code you uploaded, hang on
< oldbeardo>
okay, wait, are you saying that two years back you were the only active member on IRC? :D
< naywhayare>
basically, yeah. even jenkins-mlpack wasn't here :)
< naywhayare>
one of the old guys who used to work for the fastlab (james cline / Rodya) used to idle in here
< naywhayare>
but he works for Rdio now and I don't think he's got time for mlpack anymore
< oldbeardo>
what's Rdio?
< naywhayare>
streaming music service; I don't think they're that popular, but I think they're doing okay
< naywhayare>
I've never used it and I don't know too much about them
< oldbeardo>
okay, just saw their website; it looks like they're in the early stages, and their service isn't available in India
< naywhayare>
yeah, I think they are still a pretty small company
< naywhayare>
ok, the test looks good; I think maybe ComputeAccuracy() should be in the test and not in SoftmaxRegression, though
< naywhayare>
I can't think of cases other than testing when a user might want to call ComputeAccuracy()
< naywhayare>
I could see that maybe a user might want to say "ok, how did the model do?" and print that as output
< naywhayare>
but realistically I think a better idea might be to have a function in src/mlpack/core/ somewhere that scores predictions given true labels
< naywhayare>
good to see the change to sp_mat worked fine
< naywhayare>
although you only call GetGroundTruthMatrix() once, in the constructor of SoftmaxRegressionFunction; do you think that it should just be inlined into the constructor since it's not used anywhere else?
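(For illustration, the inlined version might look roughly like this; the member and argument names are guesses, not the actual mlpack code, and groundTruth is assumed to be an arma::sp_mat member:)

    // Sketch: build the numClasses x numPoints ground truth matrix directly
    // in the constructor, instead of calling GetGroundTruthMatrix().
    SoftmaxRegressionFunction::SoftmaxRegressionFunction(
        const arma::mat& data,
        const arma::Row<size_t>& labels,
        const size_t numClasses) :
        data(data),
        numClasses(numClasses)
    {
      // One-hot encoding: a 1 in row labels(i) of column i, for each point i.
      groundTruth.zeros(numClasses, labels.n_elem);
      for (size_t i = 0; i < labels.n_elem; ++i)
        groundTruth(labels(i), i) = 1.0;
    }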
< oldbeardo>
wait, I'll have to look, I forgot how I had arranged it
< oldbeardo>
well, I made it this way so that if the code is needed for some other implementation, it can easily be copied
< oldbeardo>
and inlining won't provide any significant advantage
< naywhayare>
the only advantage it provides is code simplicity
< naywhayare>
I think if someone else needs it, we can refactor it out then, if you think that's reasonable
< oldbeardo>
I think it's better this way, I guess I'm a sucker for abstraction
< naywhayare>
ok, we can leave it as is then
< naywhayare>
I spend a lot of time trying to figure out how to keep APIs clean, so when I see a function I often think "do we need another function? can we get rid of it?"
< oldbeardo>
okay, I haven't dealt with issues like these, what are the problems that arise because of this?
< naywhayare>
when the API gets very complex it can be confusing for new users
< naywhayare>
so ideally you want to make it as simple as possible for users to do most of what they need to do
< naywhayare>
but you also want to provide flexibility so more advanced users can do more complex things
< naywhayare>
so finding the right cutoff for what to provide as functions, what to have as template parameters, and so forth, can be quite difficult
< naywhayare>
in my opinion, Epetra is an example of a very bloated API
< naywhayare>
at one point mlpack used Epetra for sparse matrix support, but... Epetra is impossible to understand; it's incredibly complex, with a million different types and functions and so forth
< oldbeardo>
right, I get your point
< naywhayare>
I'm sure Epetra does some very useful things, but... wow, it would take a long time to learn what it does
< oldbeardo>
can't we make the internal functions private, instead of inlining?
< naywhayare>
a private function would only be useful to that class
< naywhayare>
I don't have a huge problem with leaving it public; SoftmaxRegressionFunction isn't a class most users should be using anyway
< naywhayare>
kind of an internal class
< oldbeardo>
okay, any other issues with the code?
< naywhayare>
yeah, the ComputeAccuracy() function, and then I think we should adapt the logistic_regression_main.cpp executable to use SoftmaxRegression, and then make a softmax_regression_main.cpp
< naywhayare>
alternatively, just make one softmax_regression_main.cpp that does a superset of what logistic_regression_main.cpp does
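(A rough outline of what such a softmax_regression_main.cpp could look like, modeled on the other mlpack 1.x executables; the option names and description text here are illustrative, not the real program:)

    // Hypothetical sketch of softmax_regression_main.cpp; all option names
    // are made up for illustration.
    #include <mlpack/core.hpp>
    #include "softmax_regression.hpp"

    PROGRAM_INFO("Softmax Regression", "A multi-class generalization of "
        "logistic regression.");

    PARAM_STRING_REQ("input_file", "Training dataset.", "i");
    PARAM_STRING_REQ("labels_file", "Labels for the training set.", "l");
    PARAM_INT("number_of_classes", "Number of classes.", "c", 2);

    using namespace mlpack;

    int main(int argc, char** argv)
    {
      CLI::ParseCommandLine(argc, argv);

      arma::mat data;
      data::Load(CLI::GetParam<std::string>("input_file"), data, true);

      arma::mat labels;
      data::Load(CLI::GetParam<std::string>("labels_file"), labels, true);

      // ... train a SoftmaxRegression model and report its predictions ...

      return 0;
    }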
< oldbeardo>
I took the ComputeAccuracy() function from the LogisticRegression class itself
< naywhayare>
huh, why did I accept that then? let me look at its history
< naywhayare>
weird, I don't think I spent enough time with this code before I accepted it
< naywhayare>
let me think about it for a little while. if you can think of cases where ComputeAccuracy() or ComputeError() would be useful to an end user, let me know
< naywhayare>
I think maybe the better option is to remove ComputeAccuracy() and then add some other method somewhere that computes accuracy given a list of predictions and true labels
< naywhayare>
so it's not just in logistic regression or softmax regression
< oldbeardo>
it will be useful to check the training set accuracy
< oldbeardo>
but yes, this can be made a common function
< naywhayare>
ok, I agree
< naywhayare>
now we have to figure out where to put the common function :)
< naywhayare>
I think maybe we will need a new namespace or directory under src/mlpack/core/
< naywhayare>
I'm not sure ComputeAccuracy() would fit under anything that's already there... arma_extend, data (which is save/load/change labels), dists, kernels, math, metrics, optimizers, tree, util
< oldbeardo>
how about core/performance
< naywhayare>
yeah, let's do that for now and maybe we'll think of something better later
< naywhayare>
performance sort of implies runtime performance, not classification performance, but I don't have a better word
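(A minimal sketch of what that common function could look like, under the hypothetical core/performance placement; nothing here exists in mlpack yet:)

    // Hypothetical header: src/mlpack/core/performance/accuracy.hpp
    #include <mlpack/core.hpp>

    namespace mlpack {
    namespace performance /* hypothetical namespace */ {

    // Fraction of predictions matching the true labels, in [0, 1].
    inline double Accuracy(const arma::Row<size_t>& predictions,
                           const arma::Row<size_t>& labels)
    {
      return (double) arma::accu(predictions == labels) / labels.n_elem;
    }

    } // namespace performance
    } // namespace mlpack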
< naywhayare>
one last concern: the LogisticRegression code allows one to specify a decision boundary
< naywhayare>
this is common for most two-class classifiers, but could we generalize that to the multi-class case?
< oldbeardo>
ummm, I don't think so, I will check it out anyway
< naywhayare>
I think maybe the user could pass some weighting vector, then you could multiply the class probabilities for each point by the weighting vector before selecting the max
< naywhayare>
think about the two-class case... if you want a decision boundary of 0.5, you could pass [1 1] (the default), but if you want a decision boundary of 0.25, you could pass something different, like [1.5 0.5]?
< naywhayare>
I haven't done the math yet to figure out what the exact relation is
< naywhayare>
but I think something like that might work
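(A sketch of the weighting idea in Armadillo; the names probabilities and weights are hypothetical. In the two-class case this picks class 1 exactly when p_1 > w_0 / (w_0 + w_1), so the weights determine where the boundary sits:)

    #include <armadillo>

    // probabilities: numClasses x numPoints matrix of class probabilities.
    // weights: numClasses-length weighting vector ([1 1 ... 1] = unweighted).
    arma::Row<arma::uword> WeightedPredict(const arma::mat& probabilities,
                                           const arma::vec& weights)
    {
      // Scale each point's class probabilities by the per-class weights.
      arma::mat scaled = probabilities;
      scaled.each_col() %= weights;

      arma::Row<arma::uword> predictions(scaled.n_cols);
      for (arma::uword i = 0; i < scaled.n_cols; ++i)
      {
        arma::uword maxClass;
        const arma::vec col = scaled.col(i);
        col.max(maxClass); // index of the largest weighted probability
        predictions[i] = maxClass;
      }
      return predictions;
    }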
< oldbeardo>
see, the problem there is that you assume a linear separation in the binary case
< oldbeardo>
so you have to check only for a single line
< oldbeardo>
in the multiclass case you will have to check for an exponential number of line combinations to come up with the class
< naywhayare>
hang on a second, let me sketch something out
< oldbeardo>
at least that's what my intuition tells me
< naywhayare>
ok, hang on, uploading some images
< oldbeardo>
sure
< oldbeardo>
which site are you using? because some of them are blocked
< naywhayare>
my own :)
< oldbeardo>
heh, nice one
< naywhayare>
the hard part is getting them off my phone
< naywhayare>
I drew three gaussians, and tried to draw what I thought the modified decision surface would look like
< naywhayare>
to classify a point, just like in softmax regression, you find the probability of the point coming from each of those gaussians and select the gaussian with maximum probability
< naywhayare>
but if I double the probability of the point coming from g2, then I get a modified decision surface like in the image, without having to calculate numerous lines; I just multiply the probability estimate for g2 by 2 (or whatever factor)
< naywhayare>
but I'm not sure that's the same thing that is happening in logistic regression
< oldbeardo>
I'm actually a little confused by this
< oldbeardo>
I have no idea about the Gaussian interpretation of Logistic Regression
< naywhayare>
no, I wasn't using logistic regression specifically as an example
< naywhayare>
this is some classifier that has no relation to logistic regression or softmax regression
< naywhayare>
the similarity is that it classifies in the same way softmax regression does
< naywhayare>
I think that you should be able to represent softmax regression as a series of PDFs in some space (like these three gaussians)
< oldbeardo>
is this an existing algorithm or are you trying to come up with a new one?
< naywhayare>
no, I'm trying to extend the idea of a modifiable decision boundary from logistic regression to softmax regression
< naywhayare>
I don't know if anyone has done that before
< naywhayare>
ideally I would like to entirely remove the logistic regression module and replace it with softmax regression
< naywhayare>
but to do this we need to be sure that softmax regression has the same functionality; the only thing it's missing is the decision boundary parameter
< naywhayare>
so I was trying to think of a multiclass generalization to the decision boundary parameter
< oldbeardo>
actually we shouldn't
< naywhayare>
you don't think so?
< oldbeardo>
right, I don't; this might be a silly reason, but I'll say it anyway
< naywhayare>
it's ok, go ahead :)
< oldbeardo>
while testing today I used the same Gaussians as in the Logistic Regression test
< oldbeardo>
they had base points as "1.0 1.0 1.0" and "9.0 9.0 9.0"
< oldbeardo>
using that dataset, I was getting an accuracy of 52%
< oldbeardo>
so I worked out the math; it turns out Softmax does not give the Logistic cost function when num_classes = 2
< oldbeardo>
I did this mentally so I may be incorrect
< naywhayare>
I thought that it worked out to be the same... hang on, let me look up that site that said it was the same
< oldbeardo>
the point I'm making is that Softmax has a bias towards feature vectors with a higher norm
< naywhayare>
can you explain why that is? I'm trying to understand
< oldbeardo>
okay
< oldbeardo>
the probability for a class is exp(lin_j) / sum(exp(lin_i))
< oldbeardo>
if you take num_classes = 2 it becomes exp(lin_0) / (exp(lin_0) + exp(lin_1))
< oldbeardo>
which is 1 / (1 + exp(lin_1 - lin_0))
< naywhayare>
lin_j = \theta_j^T * x?
< naywhayare>
just to be sure we are on the right page
< oldbeardo>
yes
< naywhayare>
ok
< oldbeardo>
now this is not the same as sigmoid(lin_0)
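(Writing the algebra out, with lin_j = \theta_j^\top x and \sigma the sigmoid:)

    P(y = 0 \mid x)
      = \frac{e^{\theta_0^\top x}}{e^{\theta_0^\top x} + e^{\theta_1^\top x}}
      = \frac{1}{1 + e^{(\theta_1 - \theta_0)^\top x}}
      = \sigma\left( (\theta_0 - \theta_1)^\top x \right)

so the two-class softmax is the sigmoid of the difference of the two linear terms, not sigmoid(lin_0) alone; it coincides with logistic regression only under the reparameterization \theta \leftarrow \theta_0 - \theta_1.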
< oldbeardo>
so, the learned weights are in favour of the class which has higher norm for its data points
< oldbeardo>
at least that's what I inferred from the printed probabilities
< naywhayare>
I need to spend some time thinking about why this is true
< naywhayare>
but I see what you mean
< naywhayare>
I have to go in a few moments, so I wanted to clarify one more thing -- you asked about Mat_extra_bones.hpp and so forth
< naywhayare>
SpMat_extra_bones.hpp, specifically
< naywhayare>
in the Armadillo code, the class Mat is defined in Mat_bones.hpp
< naywhayare>
at the bottom of the file, it has this nice stanza:
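(the stanza itself is missing from the log; from Armadillo's Mat_bones.hpp it is presumably:)

    #ifdef ARMA_EXTRA_MAT_PROTO
      #include ARMA_INCFILE_WRAP(ARMA_EXTRA_MAT_PROTO)
    #endif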
< naywhayare>
but that's inside the definition of the Mat class
< naywhayare>
so to extend the functionality of the Mat class, you just have to define ARMA_EXTRA_MAT_PROTO and create that file, and include some things in it
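(As a sketch of how that hook might be used; the exact define in mlpack's arma_extend may differ:)

    // This must be defined before <armadillo> is first included; Armadillo
    // then pulls the extra prototypes into the Mat class definition.
    #define ARMA_EXTRA_MAT_PROTO mlpack/core/arma_extend/Mat_extra_bones.hpp
    #include <armadillo>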
< oldbeardo>
okay, nice naming though :)
< naywhayare>
it's the same for SpMat, Col, Row, and so forth
< naywhayare>
ah, you can thank Conrad for the bones/meat naming; that was his idea
< naywhayare>
anyway, the SpMat_extra_bones.hpp file just adds the batch constructors for SpMat if the version of Armadillo being used is older than 3.810.0
< naywhayare>
most of the stuff in arma_extend/ is backports so that mlpack can work with older versions of Armadillo while still utilizing new functionality
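(A standalone illustration of the SpMat batch constructor being backported; this is plain Armadillo, not mlpack code:)

    #include <armadillo>

    int main()
    {
      // Build a sparse matrix from (row, col) locations and their values in
      // one batch, instead of inserting elements one at a time.
      arma::umat locations(2, 3);
      locations(0, 0) = 0; locations(1, 0) = 0; // nonzero at (0, 0)
      locations(0, 1) = 1; locations(1, 1) = 2; // nonzero at (1, 2)
      locations(0, 2) = 2; locations(1, 2) = 1; // nonzero at (2, 1)

      arma::vec values("1.0 2.0 3.0");
      arma::sp_mat m(locations, values); // the batch constructor
      m.print("m:");

      return 0;
    }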
< oldbeardo>
okay, also I wanted to ask
< oldbeardo>
I was thinking of shifting to Ubuntu 14.04, will the latest armadillo work?
< naywhayare>
it should, and if it doesn't, we'll just fix the bugs :)
< naywhayare>
anything else before I go? I assume you're about to leave too
< oldbeardo>
no, that's it, yup I'm about to
< naywhayare>
ok, I'm gonna go then. see you later