naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
avonmoll has joined #mlpack
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
govg has quit [Ping timeout: 272 seconds]
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
< jenkins-mlpack> Starting build #2108 for job mlpack - svn checkin test (previous build: SUCCESS)
< jenkins-mlpack> Project mlpack 1.0.10 - matrix build build #1: ABORTED in 6 hr 27 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%201.0.10%20-%20matrix%20build/1/
< jenkins-mlpack> Project mlpack - svn checkin test build #2108: SUCCESS in 1 hr 31 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2108/
< jenkins-mlpack> Ryan Curtin: Correctly handle SortPolicy abstraction.
sumedhghaisas has joined #mlpack
< jenkins-mlpack> Project mlpack 1.0.10 - matrix build build #2: ABORTED in 5 hr 18 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%201.0.10%20-%20matrix%20build/2/
sumedhghaisas has quit [Ping timeout: 272 seconds]
sumedhghaisas has joined #mlpack
avonmoll has joined #mlpack
avonmoll_ has joined #mlpack
avonmoll has quit [Ping timeout: 240 seconds]
avonmoll_ is now known as avonmoll
avonmoll_ has joined #mlpack
avonmoll has quit [Ping timeout: 246 seconds]
avonmoll_ is now known as avonmoll
oldbeardo has joined #mlpack
< oldbeardo> naywhayare: I have some questions about the partial specialization
< oldbeardo> does writing a specialization mean that we have to redefine the whole class? or can we just modify some methods?
< naywhayare> oldbeardo: I think that you have to partially specialize the whole class
< oldbeardo> no, what I meant was in the partially specialized class can we modify only some functions?
< naywhayare> oh, no... you have to modify everything
< naywhayare> well, not modify, necessarily... you can use the same implementation for some things
< naywhayare> but it's not like inheritance; you can't just leave a method unspecified
< naywhayare> i.e. if you don't specify Tolerance(), then your partially specialized class simply won't have it
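A minimal C++ sketch of the point above (hypothetical class and method names, not mlpack's actual code): a partial specialization is a separate class definition, so any member not re-declared inside it simply does not exist for that specialization.

    // Minimal sketch with hypothetical names; not mlpack code.
    struct FullTag { };
    struct DiagonalTag { };

    template<typename FittingType, typename CovarianceTag>
    class Fitter
    {
     public:
      void Estimate() { /* generic implementation */ }
      double Tolerance() { return 1e-10; }
    };

    // Partial specialization: a completely separate class definition.
    template<typename FittingType>
    class Fitter<FittingType, DiagonalTag>
    {
     public:
      void Estimate() { /* diagonal-specific implementation */ }
      // Tolerance() is not declared here, so
      // Fitter<F, DiagonalTag>::Tolerance() does not exist; unlike
      // inheritance, members of the primary template are not carried over.
    };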
< oldbeardo> ah, that sucks
< oldbeardo> naywhayare: maybe someone else should work on it, I'm not familiar with GMM or the gmm code
< naywhayare> oldbeardo: sure, sounds fine; I was just giving ideas when I suggested it
< oldbeardo> I don't really understand the tradeoffs that are being made, or the inefficiencies that are present
< naywhayare> basically, as it currently stands, each iteration of the EM algorithm trains the full covariance matrix, and then sets every off-diagonal entry to 0
< naywhayare> but a faster algorithm can be derived when diagonal covariance is assumed
< naywhayare> Armadillo implements the faster diagonal-covariance GMM training algorithm, so, it makes some amount of sense to just wrap that instead of writing one from scratch
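A rough sketch of the two approaches just described, assuming a recent Armadillo (4.400 or newer) that provides the arma::gmm_diag class; the learn() parameters here are illustrative, not tuned for any particular dataset.

    // Sketch only; assumes Armadillo >= 4.400, which provides arma::gmm_diag.
    #include <armadillo>

    int main()
    {
      // 5-dimensional points, one observation per column.
      arma::mat data(5, 1000, arma::fill::randn);

      // Current approach (per EM iteration): estimate the full covariance,
      // then throw away the off-diagonal entries.
      arma::mat fullCov = arma::cov(data.t());
      arma::mat diagCov = arma::diagmat(fullCov);

      // Wrapping Armadillo's diagonal-covariance GMM trainer instead avoids
      // that wasted work.
      arma::gmm_diag model;
      model.learn(data, 3 /* gaussians */, arma::maha_dist, arma::random_subset,
                  10 /* k-means iters */, 20 /* EM iters */, 1e-10, false);

      return 0;
    }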
< oldbeardo> redefining the whole class sounds like overkill for this
< naywhayare> so, there are a couple options
< naywhayare> EMFit is a pretty simple class, and both Estimate() overloads would need to be modified
< naywhayare> another idea is to further templatize the class, splitting out the function for training the covariance
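For example (a hypothetical sketch of that idea; the policy names and the Fit() signature are assumptions, not mlpack's actual EMFit interface): the covariance update becomes a policy class passed as a template parameter, so the diagonal case only ever computes the diagonal.

    // Hypothetical covariance-update policies; names and Fit() signature are
    // illustrative, not mlpack's actual EMFit interface.
    #include <armadillo>

    class FullCovariance
    {
     public:
      static void Fit(const arma::mat& observations,
                      const arma::vec& probabilities,
                      const arma::vec& mean,
                      arma::mat& covariance)
      {
        // Weighted full covariance estimate.
        arma::mat diffs = observations.each_col() - mean;
        covariance = (diffs * arma::diagmat(probabilities) * diffs.t()) /
            arma::accu(probabilities);
      }
    };

    class DiagonalCovariance
    {
     public:
      static void Fit(const arma::mat& observations,
                      const arma::vec& probabilities,
                      const arma::vec& mean,
                      arma::mat& covariance)
      {
        // Only the diagonal is estimated; off-diagonal work is skipped.
        arma::mat diffs = observations.each_col() - mean;
        arma::vec d = (arma::square(diffs) * probabilities) /
            arma::accu(probabilities);
        covariance = arma::diagmat(d);
      }
    };

    // The EM fitter would then take the policy as an extra template parameter:
    //   template<typename InitialClusteringType, typename CovariancePolicy>
    //   class EMFit { /* ... CovariancePolicy::Fit(obs, probs, mean, cov); */ };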
< oldbeardo> yes, this sounds like a good one
< oldbeardo> anyway, I don't think I should be the one making this change
< naywhayare> fair enough; it's not a high-priority issue, so it can wait until someone else comes along
< oldbeardo> okay, I will see if there is something else I can work on
< naywhayare> do you like CSS / web development at all? I have been meaning for months to restyle the GSoC blog to match the rest of the mlpack website, and to make it not just for GSoC
< oldbeardo> heh, no, I haven't done any of it before
< oldbeardo> by the way, should I close ticket #349?
< naywhayare> if I remember right, the main issue with softmax regression was that it did not give the same results as logistic regression, even though it should
< naywhayare> take a look at the bottom of comment:1 -- "If the derivation is right, then on that test set with two Gaussians that are far apart, we should get much higher accuracy than 52%."
< oldbeardo> yes, I remember that, and I also remember how you wanted me to check it
< oldbeardo> but I think the main issue here is that Softmax will never give predictions based on theta_1 - theta_2
< naywhayare> why do you say that?
< oldbeardo> the way you wanted me to test it was by making the cost function converge and making predictions based on theta_1 - theta_2
< oldbeardo> where theta_1 and theta_2 are the weights learned by the module
< oldbeardo> in practice, it will always suffer from bias while predicting
< naywhayare> I'm not following why you say that; can you elaborate on what you mean?
< oldbeardo> as far as the correctness of the module is concerned, tests for that are already in place
< naywhayare> "Further, if the cost function J(θ) is minimized by some setting of the parameters (\theta_1, \theta_2,\ldots, \theta_k), then it is also minimized by (\theta_1 - \psi, \theta_2 - \psi,\ldots, \theta_k - \psi) for any value of ψ."
< oldbeardo> yes, I'm not arguing against that
< oldbeardo> that is mathematically correct
< naywhayare> right... then if J(θ) has a non-unique minimizer, and one of those minimizers is the logistic regression minimizer, then the performance of logistic regression and softmax regression should be identical
< oldbeardo> true, except you will need a function to map the parameters of softmax into the parameters of logistic
< oldbeardo> and thus, in practice, the weights are different, and so is their performance
< naywhayare> although you need a function to map between the two parameter representations, they are functionally equivalent:
< naywhayare> "Notice also that by setting ψ = θ1, one can always replace θ1 with \theta_1 - \psi = \vec{0} (the vector of all 0's), without affecting the hypothesis."
< naywhayare> i.e., predictions should be the same regardless of ψ
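Spelling out that step (standard softmax algebra, written here in LaTeX; not taken from the log itself): subtracting any ψ from every parameter vector leaves all class probabilities, and hence all predictions, unchanged.

    % The common factor e^{-\psi^\top x} cancels from every class probability:
    \[
      P(y = j \mid x)
        = \frac{e^{(\theta_j - \psi)^\top x}}{\sum_{l=1}^{k} e^{(\theta_l - \psi)^\top x}}
        = \frac{e^{-\psi^\top x} \, e^{\theta_j^\top x}}{e^{-\psi^\top x} \sum_{l=1}^{k} e^{\theta_l^\top x}}
        = \frac{e^{\theta_j^\top x}}{\sum_{l=1}^{k} e^{\theta_l^\top x}}.
    \]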
< naywhayare> there is a more concrete derivation to show that the output of softmax regression in the two-class case is equivalent to logistic regression
< oldbeardo> okay, so maybe being overparameterized acts against Softmax in the binary case
< naywhayare> no, the overparameterization doesn't make a difference; the hypothesis output is the same regardless
< oldbeardo> by that I mean, maybe the theta learned in Logistic is different from the theta_1 - theta_2 learned in Softmax
< oldbeardo> "More formally, we say that our softmax model is overparameterized, meaning that for any hypothesis we might fit to the data, there are multiple parameter settings that give rise to exactly the same hypothesis function h_θ mapping from inputs x to the predictions."
< naywhayare> I think that it is necessary that theta = theta_1 - theta_2; however, regardless of that, the performance should be the same
< naywhayare> since these things give rise to the same hypothesis function
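The two-class derivation naywhayare refers to, sketched in LaTeX (standard material, not taken from ticket #349 itself): with k = 2 the softmax hypothesis collapses to the logistic regression hypothesis with θ = θ_1 - θ_2, so matching minimizers must give identical predictions.

    % Two-class softmax reduces to logistic regression:
    \[
      P(y = 1 \mid x)
        = \frac{e^{\theta_1^\top x}}{e^{\theta_1^\top x} + e^{\theta_2^\top x}}
        = \frac{1}{1 + e^{-(\theta_1 - \theta_2)^\top x}}
        = \sigma\big((\theta_1 - \theta_2)^\top x\big),
      \qquad \sigma(z) = \frac{1}{1 + e^{-z}}.
    \]
    % This is exactly the logistic regression hypothesis with
    % \theta = \theta_1 - \theta_2, so the two models should achieve the same
    % accuracy on the two-Gaussian test set.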
< oldbeardo> I'll be back in some time
< naywhayare> okay, see you later
< oldbeardo> naywhayare: I'm back, I still haven't found a suitable explanation for this
oldbeardo has quit [Quit: Page closed]
sumedhghaisas has quit [Ping timeout: 264 seconds]
sumedhghaisas has joined #mlpack
jenkins-mlpack has quit [Remote host closed the connection]
jenkins-mlpack has joined #mlpack
jenkins-mlpack has quit [Client Quit]
jenkins-mlpack has joined #mlpack
< jenkins-mlpack> Starting build #2109 for job mlpack - svn checkin test (previous build: SUCCESS)
sumedhghaisas has quit [Ping timeout: 240 seconds]
jenkins-mlpack has quit [Ping timeout: 240 seconds]
sumedhghaisas has joined #mlpack
avonmoll has quit [Quit: ChatZilla 0.9.90.1 [Firefox 31.0/20140716183446]]
sumedhghaisas has quit [Ping timeout: 272 seconds]