naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
avonmoll has joined #mlpack
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
govg has quit [Ping timeout: 272 seconds]
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
< jenkins-mlpack> Starting build #2108 for job mlpack - svn checkin test (previous build: SUCCESS)
< jenkins-mlpack> Project mlpack 1.0.10 - matrix build build #1: ABORTED in 6 hr 27 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%201.0.10%20-%20matrix%20build/1/
< jenkins-mlpack> Project mlpack - svn checkin test build #2108: SUCCESS in 1 hr 31 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2108/
< jenkins-mlpack> Ryan Curtin: Correctly handle SortPolicy abstraction.
sumedhghaisas has joined #mlpack
< jenkins-mlpack> Project mlpack 1.0.10 - matrix build build #2: ABORTED in 5 hr 18 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%201.0.10%20-%20matrix%20build/2/
sumedhghaisas has quit [Ping timeout: 272 seconds]
sumedhghaisas has joined #mlpack
avonmoll has joined #mlpack
avonmoll_ has joined #mlpack
avonmoll has quit [Ping timeout: 240 seconds]
avonmoll_ is now known as avonmoll
avonmoll_ has joined #mlpack
avonmoll has quit [Ping timeout: 246 seconds]
avonmoll_ is now known as avonmoll
oldbeardo has joined #mlpack
< oldbeardo> naywhayare: I have some questions about the partial specialization
< oldbeardo> does writing a specialization mean that we have to redefine the whole class? or can we just modify some methods?
< naywhayare> oldbeardo: I think that you have to partially specialize the whole class
< oldbeardo> no, what I meant was in the partially specialized class can we modify only some functions?
< naywhayare> oh, no... you have to modify everything
< naywhayare> well, not modify, necessarily... you can use the same implementation for some things
< naywhayare> but it's not like inheritance; you can't just leave a method unspecified
< naywhayare> i.e. if you don't specify Tolerance(), then your partially specialized class simply won't have it
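A minimal C++ sketch of the point above (hypothetical class and method names, not mlpack's actual code): a partial specialization is a separate class definition, so any member not re-declared inside it simply does not exist for that specialization.

    // Minimal sketch with hypothetical names; not mlpack code.
    struct FullTag { };
    struct DiagonalTag { };

    template<typename FittingType, typename CovarianceTag>
    class Fitter
    {
     public:
      void Estimate() { /* generic implementation */ }
      double Tolerance() { return 1e-10; }
    };

    // Partial specialization: a completely separate class definition.
    template<typename FittingType>
    class Fitter<FittingType, DiagonalTag>
    {
     public:
      void Estimate() { /* diagonal-specific implementation */ }
      // Tolerance() is not declared here, so
      // Fitter<F, DiagonalTag>::Tolerance() does not exist; unlike
      // inheritance, members of the primary template are not carried over.
    };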
< oldbeardo> ah, that sucks
< oldbeardo> naywhayare: maybe someone else should work on it, I'm not familiar with GMM or the gmm code
< naywhayare> oldbeardo: sure, sounds fine; I was just giving ideas when I suggested it
< oldbeardo> I don't really understand the tradeoffs that are being made, or the inefficiencies that are present
< naywhayare> basically, as it currently stands, each iteration of the EM algorithm trains the full covariance matrix, and then sets every off-diagonal entry to 0
< naywhayare> but a faster algorithm can be derived when diagonal covariance is assumed
< naywhayare> Armadillo implements the faster diagonal-covariance GMM training algorithm, so, it makes some amount of sense to just wrap that instead of writing one from scratch
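A rough sketch of the two approaches just described, assuming a recent Armadillo (4.400 or newer) that provides the arma::gmm_diag class; the learn() parameters here are illustrative, not tuned for any particular dataset.

    // Sketch only; assumes Armadillo >= 4.400, which provides arma::gmm_diag.
    #include <armadillo>

    int main()
    {
      // 5-dimensional points, one observation per column.
      arma::mat data(5, 1000, arma::fill::randn);

      // Current approach (per EM iteration): estimate the full covariance,
      // then throw away the off-diagonal entries.
      arma::mat fullCov = arma::cov(data.t());
      arma::mat diagCov = arma::diagmat(fullCov);

      // Wrapping Armadillo's diagonal-covariance GMM trainer instead avoids
      // that wasted work.
      arma::gmm_diag model;
      model.learn(data, 3 /* gaussians */, arma::maha_dist, arma::random_subset,
                  10 /* k-means iters */, 20 /* EM iters */, 1e-10, false);

      return 0;
    }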
< oldbeardo> redefining the whole class sounds like overkill for this
< naywhayare> so, there are a couple options
< naywhayare> EMFit is a pretty simple class, and both Estimate() overloads would need to be modified
< naywhayare> another idea is to further templatize the class, splitting out the function for training the covariance
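For example (a hypothetical sketch of that idea; the policy names and the Fit() signature are assumptions, not mlpack's actual EMFit interface): the covariance update becomes a policy class passed as a template parameter, so the diagonal case only ever computes the diagonal.

    // Hypothetical covariance-update policies; names and Fit() signature are
    // illustrative, not mlpack's actual EMFit interface.
    #include <armadillo>

    class FullCovariance
    {
     public:
      static void Fit(const arma::mat& observations,
                      const arma::vec& probabilities,
                      const arma::vec& mean,
                      arma::mat& covariance)
      {
        // Weighted full covariance estimate.
        arma::mat diffs = observations.each_col() - mean;
        covariance = (diffs * arma::diagmat(probabilities) * diffs.t()) /
            arma::accu(probabilities);
      }
    };

    class DiagonalCovariance
    {
     public:
      static void Fit(const arma::mat& observations,
                      const arma::vec& probabilities,
                      const arma::vec& mean,
                      arma::mat& covariance)
      {
        // Only the diagonal is estimated; off-diagonal work is skipped.
        arma::mat diffs = observations.each_col() - mean;
        arma::vec d = (arma::square(diffs) * probabilities) /
            arma::accu(probabilities);
        covariance = arma::diagmat(d);
      }
    };

    // The EM fitter would then take the policy as an extra template parameter:
    //   template<typename InitialClusteringType, typename CovariancePolicy>
    //   class EMFit { /* ... CovariancePolicy::Fit(obs, probs, mean, cov); */ };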
< oldbeardo> yes, this sounds like a good one
< oldbeardo> anyway, I don't think I should be the one making this change
< naywhayare> fair enough; it's not a high-priority issue, so it can wait until someone else comes along
< oldbeardo> okay, I will see if there is something else I can work on
< naywhayare> do you like CSS / web development at all? I have been meaning for months to restyle the GSoC blog to match the rest of the mlpack website, and to make it not just for GSoC
< oldbeardo> heh, no, I haven't done any of it before
< oldbeardo> by the way, should I close ticket #349?
< naywhayare> if I remember right, the main issue with softmax regression was that it did not give the same results as logistic regression, even though it should
< naywhayare> take a look at the bottom of comment:1 -- "If the derivation is right, then on that test set with two Gaussians that are far apart, we should get much higher accuracy than 52%."
< oldbeardo> yes, I remember that, and I also remember how you wanted me to check it
< oldbeardo> but I think the main issue here is that Softmax will never give predictions based on theta_1 - theta_2
< naywhayare> why do you say that?
< oldbeardo> the way you wanted me to test it was by making the cost function converge and making predictions based on theta_1 - theta_2
< oldbeardo> where theta_1 and theta_2 are the weights learned by the module
< oldbeardo> in practice, it will always suffer from bias while predicting
< naywhayare> I'm not following why you say that; can you elaborate on what you mean?
< oldbeardo> as far as the correctness of the module is concerned, tests for that are already in place
< naywhayare> "Further, if the cost function J(θ) is minimized by some setting of the parameters (\theta_1, \theta_2,\ldots, \theta_k), then it is also minimized by (\theta_1 - \psi, \theta_2 - \psi,\ldots, \theta_k - \psi) for any value of ψ."
< oldbeardo> yes, I'm not arguing against that
< oldbeardo> that is mathematically correct
< naywhayare> right... then if J(θ) has a non-unique minimizer, and one of those minimizers is the logistic regression minimizer, then the performance of logistic regression and softmax regression should be identical
< oldbeardo> true, except you will need a function to map the parameters of softmax into the parameters of logistic
< oldbeardo> and thus, in practice, the weights are different, and so is their performance
< naywhayare> although you need a function to map between the two parameter representations, they are functionally equivalent:
< naywhayare> "Notice also that by setting ψ = θ1, one can always replace θ1 with \theta_1 - \psi = \vec{0} (the vector of all 0's), without affecting the hypothesis."
< naywhayare> i.e., predictions should be the same regardless of ψ
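Spelling out that step (standard softmax algebra, written here in LaTeX; not taken from the log itself): subtracting any ψ from every parameter vector leaves all class probabilities, and hence all predictions, unchanged.

    % The common factor e^{-\psi^\top x} cancels from every class probability:
    \[
      P(y = j \mid x)
        = \frac{e^{(\theta_j - \psi)^\top x}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^\top x}}
        = \frac{e^{-\psi^\top x} \, e^{\theta_j^\top x}}{e^{-\psi^\top x} \sum_{l=1}^{k} e^{\theta_l^\top x}}
        = \frac{e^{\theta_j^\top x}}{\sum_{l=1}^{k} e^{\theta_l^\top x}}.
    \]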
< naywhayare> there is a more concrete derivation to show that the output of softmax regression in the two-class case is equivalent to logistic regression
< oldbeardo> okay, so maybe being overparameterized acts against Softmax in the binary case
< naywhayare> no, the overparameterization doesn't make a difference; the hypothesis output is the same regardless
< oldbeardo> by that I mean, maybe the theta learned in Logistic is different from the theta_1 - theta_2 learned in Softmax
< oldbeardo> "More formally, we say that our softmax model is overparameterized, meaning that for any hypothesis we might fit to the data, there are multiple parameter settings that give rise to exactly the same hypothesis function h_θ mapping from inputs x to the predictions."
< naywhayare> I think that it is necessary that theta = theta_1 - theta_2; however, regardless of that, the performance should be the same
< naywhayare> since these things give rise to the same hypothesis function
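The two-class derivation naywhayare refers to, sketched in LaTeX (standard material, not taken from ticket #349 itself): with k = 2 the softmax hypothesis collapses to the logistic regression hypothesis with θ = θ_1 - θ_2, so matching minimizers must give identical predictions.

    % Two-class softmax reduces to logistic regression:
    \[
      P(y = 1 \mid x)
        = \frac{e^{\theta_1^\top x}}{e^{\theta_1^\top x} + e^{\theta_2^\top x}}
        = \frac{1}{1 + e^{-(\theta_1 - \theta_2)^\top x}}
        = \sigma\big((\theta_1 - \theta_2)^\top x\big),
      \qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
    \]
    % This is exactly the logistic regression hypothesis with
    % \theta = \theta_1 - \theta_2, so the two models should achieve the same
    % accuracy on the two-Gaussian test set.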
< oldbeardo> I'll be back in some time
< naywhayare> okay, see you later
< oldbeardo> naywhayare: I'm back, I still haven't found a suitable explanation for this
oldbeardo has quit [Quit: Page closed]
sumedhghaisas has quit [Ping timeout: 264 seconds]
sumedhghaisas has joined #mlpack
jenkins-mlpack has quit [Remote host closed the connection]
jenkins-mlpack has joined #mlpack
jenkins-mlpack has quit [Client Quit]
jenkins-mlpack has joined #mlpack
< jenkins-mlpack> Starting build #2109 for job mlpack - svn checkin test (previous build: SUCCESS)
sumedhghaisas has quit [Ping timeout: 240 seconds]
jenkins-mlpack has quit [Ping timeout: 240 seconds]
sumedhghaisas has joined #mlpack
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
sumedhghaisas has quit [Ping timeout: 272 seconds]