naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< jenkins-mlpack> Project mlpack - svn checkin test build #2007: SUCCESS in 1 hr 20 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2007/
< jenkins-mlpack> sumedhghaisas: * added local minima storing functionality to termination policies
andrewmw94 has quit [Quit: Leaving.]
andrewmw94 has joined #mlpack
andrewmw94 has quit [Client Quit]
Anand has joined #mlpack
< Anand> Marcus : I have modified the mlpack interface for all methods, and added linear regression for weka, scikit, and mlpack. Please have a look. I will add linear regression for shogun and matlab today and then merge into master later today!
Anand has quit [Ping timeout: 246 seconds]
naywhayare has joined #mlpack
< jenkins-mlpack> Project mlpack - nightly matrix build build #512: FAILURE in 5 hr 1 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/512/
< jenkins-mlpack> * sumedhghaisas: * added local minima storing functionality to termination policies
< jenkins-mlpack> * Ryan Curtin: Lengthen comments that weren't 80 columns long. This may be the most trivial
< jenkins-mlpack> fix ever in my long, decorated history of trivial commits.
< jenkins-mlpack> * Ryan Curtin: Very minor changes.
< jenkins-mlpack> * saxena.udit: IsDistinct() improved.
< jenkins-mlpack> * Ryan Curtin: Don't use arma::unique() because it's slow.
< jenkins-mlpack> * Ryan Curtin: Use bool instead of int for tracking convergence.
< jenkins-mlpack> * Ryan Curtin: Fix some formatting issues; no functionality change.
< jenkins-mlpack> * Ryan Curtin: Const-correctness and 80-character lines... very trivial fix, no functionality
< jenkins-mlpack> change.
< jenkins-mlpack> * saxena.udit: Entropy calculation improved.
< jenkins-mlpack> * andrewmw94: R tree now has dataset and indices
< jenkins-mlpack> * Ryan Curtin: Include mlpack/core.hpp.
< jenkins-mlpack> * Ryan Curtin: Another test to make sure the correct splitting attribute is used.
< jenkins-mlpack> * Ryan Curtin: Fix some formatting, fix backwards entropy splitting, add getters/setters, and
< jenkins-mlpack> comment a little bit about the internal structure of the class.
naywhayare has joined #mlpack
sumedhghaisas has joined #mlpack
Anand has quit [Ping timeout: 246 seconds]
Anand has joined #mlpack
< Anand> Marcus : Does Shogun support linear regression from Python just like logistic regression? Look at how we did logistic regression for shogun. We didn't have to write any C code.
Anand has quit [Ping timeout: 246 seconds]
< jenkins-mlpack> Starting build #2008 for job mlpack - svn checkin test (previous build: SUCCESS)
< jenkins-mlpack> Project mlpack - svn checkin test build #2008: SUCCESS in 1 hr 20 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2008/
< jenkins-mlpack> saxena.udit: Changes are part of perceptron code review, as discussed with Ryan
sumedh_ has joined #mlpack
sumedhghaisas has quit [Ping timeout: 245 seconds]
< jenkins-mlpack> Starting build #2009 for job mlpack - svn checkin test (previous build: SUCCESS)
udit_s has joined #mlpack
andrewmw94 has joined #mlpack
< naywhayare> andrewmw94: my solution to the weird corel dataset bug is in http://www.mlpack.org/trac/changeset/16808
< naywhayare> I had to modify the tree abstraction very slightly, by adding the function MinimumBoundDistance()
< naywhayare> I implemented this in HRectBound (r16809) and then added a function MinimumBoundDistance() to the tree types; for RectangleTree, it just passes on HRectBound::MinWidth()
< andrewmw94> ok. I'm not sure I understand the error in the dual tree traverser, but it's good to know it is fixed.
< andrewmw94> does the MinimumBoundDistance() return the MinWidth() for BSP trees too? Because I don't think that matches the comment.
< naywhayare> yeah, that is what it returns
< naywhayare> oh, hang on... I have botched my terminology
< naywhayare> I have to divide everything by 2... MinimumBoundDistance = MinWidth / 2
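A minimal sketch, for illustration only, of the delegation described above: a tree node answering MinimumBoundDistance() from its bound's cached minimum width, halved. The member name "bound" and the class layout are assumptions, not the actual RectangleTree code.

    // Sketch: expose MinimumBoundDistance() by delegating to the bound's
    // cached MinWidth(), divided by two as noted above.
    template<typename BoundType>
    class ExampleTreeNode
    {
     public:
      // Minimum distance from the node's center to any edge of its bound.
      double MinimumBoundDistance() const { return bound.MinWidth() / 2.0; }

     private:
      BoundType bound; // e.g. an HRectBound, which caches its minimum width.
    };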
< andrewmw94> ahh. I think I get it now
< andrewmw94> how long did it take you to find this? It looks painful.
< naywhayare> probably 3-4 hours to find the bug, and then I spent about a day thinking about the right way to fix it
< andrewmw94> not too bad I guess...
< andrewmw94> but I don't envy you
< naywhayare> well... I also wrote all the code, so going into it I had some idea of what the bug was
< naywhayare> the actual code that was wrong was an attempt at a clever way to prune a node combination without actually doing the O(d) MinDistance() calculation between them
< andrewmw94> ooh, that reminds me. When I was looking through the neighbor search code, I saw that you have a bunch of different ways to calculate bounds (5 I think)
< andrewmw94> which seems like it could be slower than just using the one that works best. But then I thought, maybe we could change it so higher up in the tree, where a prune would save a ton of computation, it does more precise bounds checking. But as it reaches the bottom, it just does something fast.
< andrewmw94> Do you think that would have potential?
< naywhayare> yes, I think that would be a good idea
< naywhayare> how to implement it is a little less clear...
< naywhayare> I guess you could do it based on node.NumDescendants()
< naywhayare> but some of those 5 bounds also depend on bounds propagating from children or parents, so we'd need to also be sure that we weren't breaking those
< naywhayare> each of those bounds can be derived using different variants of the triangle inequality
< naywhayare> and they are basically bounds on "what is the largest possible nearest neighbor distance of any query point in the given query node, given everything we know so far?"
< naywhayare> and each of the bounds considers some different aspect of the "everything we know so far"
< andrewmw94> yeah. Changing it would be complicated. I was thinking it could be based on the depth of the tree below this node. R trees are balanced, so that gives you branchingfactor^depth * minFillLeaves as a lower bound.
< andrewmw94> but we wouldn't want it specific to R trees
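A rough sketch of the idea floated above, not anything in mlpack's neighbor search code: spend the exact bound computation only where a prune pays off, i.e. on nodes with many descendants, and use a cheap cached value otherwise. The function name, the cheapCachedBound parameter, and the threshold are made up for illustration; NumDescendants() and MinDistance() are the tree methods mentioned in the conversation.

    #include <cstddef>

    // Illustration only: near the root a successful prune saves a lot of work,
    // so the exact O(d) MinDistance() bound is worth computing; near the leaves
    // a looser, already-available bound is used instead.
    template<typename TreeType>
    double PruningBound(const TreeType& queryNode,
                        const TreeType& referenceNode,
                        const double cheapCachedBound) // e.g. propagated from a parent
    {
      const std::size_t preciseThreshold = 1000; // hypothetical tuning parameter

      if (queryNode.NumDescendants() > preciseThreshold)
        return queryNode.MinDistance(referenceNode); // exact, but O(d)
      else
        return cheapCachedBound; // loose, but essentially free
    }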
Anand has joined #mlpack
< Anand> Marcus : I was talking about linear regression for shogun. Logistic regression is already done.
< Anand> You used modshogun for multiclass logistic regression. Are there similar imports and methods for linear regression?
< Anand> Also, we need to think about matlab linear regression code
< jenkins-mlpack> Project mlpack - svn checkin test build #2009: SUCCESS in 1 hr 19 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2009/
< jenkins-mlpack> saxena.udit: Minor improvement. No major functionality changes
< jenkins-mlpack> Starting build #2010 for job mlpack - svn checkin test (previous build: SUCCESS)
Anand has quit [Ping timeout: 246 seconds]
< udit_s> naywhayare: Hey! Did you get around to the AdaBoost mail?
< jenkins-mlpack> Project mlpack - svn checkin test build #2010: SUCCESS in 1 hr 35 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2010/
< jenkins-mlpack> * Ryan Curtin: Oops, this needed to be divided by 2.
< jenkins-mlpack> * Ryan Curtin: Use slightly safer Width().
< jenkins-mlpack> * Ryan Curtin: Use the bound's cached MinWidth() for MinimumBoundDistance().
< jenkins-mlpack> * Ryan Curtin: Test MinWidth().
< jenkins-mlpack> * Ryan Curtin: Add MinWidth(), which is a better solution than having the tree calculate it by
< jenkins-mlpack> hand.
< jenkins-mlpack> * Ryan Curtin: Fix elusive bug that only occurred in particularly rare situations.
< jenkins-mlpack> * Ryan Curtin: Add MinimumBoundDistance().
< jenkins-mlpack> This represents the minimum distance between the center of a node and any edge
< jenkins-mlpack> of the bound. Note that for ball bounds, this is equivalent to the furthest
< jenkins-mlpack> descendant distance.
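To make the ball-bound note above concrete: for a ball, the minimum distance from the center to the edge is the radius, which is also the furthest any descendant point can be from the center. A sketch under the assumption that the bound type offers a Radius() accessor; this is illustrative, not the mlpack implementation.

    // Sketch: for a ball-shaped bound, the center-to-edge distance and the
    // furthest descendant distance coincide -- both equal the radius.
    template<typename BallBoundType>
    double MinimumBoundDistance(const BallBoundType& bound)
    {
      return bound.Radius();
    }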
Anand has joined #mlpack
< Anand> Marcus : I don't see how the current code gives us the predicted labels. I am not concerned about the MSE calculation, just the predicted labels.
< marcus_zoq> Anand: Ah okay, we need to modify the timing code in the way we did for logistic regression. So if the user passes more than one file, we use the second file as the test set. We can assume that the last row of the training set contains the responses. Afterwards, we can use 'model.apply(RealFeatures(testSet.T)).get_labels()' to get the labels. Does this make sense?
< Anand> Ok. So, is this apply(..) method applicable for linear regression too?
< Anand> If yes, then I got it
< marcus_zoq> Anand: The apply function is capable of predicting the labels.
< Anand> Ok. Got it. And what about matlab?
< marcus_zoq> Anand: We need to rewrite the matlab code; it looks like the regress function isn't able to predict new data. We can use a combination of fitlm and feval.
< marcus_zoq> Anand: If you like I can make the necessary changes.
< Anand> I guess fitlm is like mnrfit used in logistic regression and feval is like mnrval, right?
< Anand> maybe I can do this myself!
< marcus_zoq> Anand: Right, you can use matlab on the build server, right?
< Anand> Oh, yes, I will need to run it on the build server. I will try. Otherwise, I will make the changes and you can have a look then!
< marcus_zoq> Anand: Okay
Anand has quit [Ping timeout: 246 seconds]
udit_s has quit [Quit: Leaving]
sumedh_ has quit [Ping timeout: 255 seconds]
oldbeardo has joined #mlpack
< oldbeardo> naywhayare: there?
< naywhayare> oldbeardo: only sort of
< naywhayare> I am helping someone inspect a car today... but I have my phone, which is maybe ok for this :)
< oldbeardo> naywhayare: hmmm, that may not be enough
< naywhayare> :-(
< naywhayare> you can leave messages in the channel and I will try to answer when I can
sumedh_ has joined #mlpack
< oldbeardo> naywhayare: I also wanted to inform you that I may be unavailable for a week starting Saturday; I'm switching cities
< oldbeardo> naywhayare: I will try to add the tests by tomorrow
oldbeardo has quit [Quit: Page closed]
< jenkins-mlpack> Starting build #2011 for job mlpack - svn checkin test (previous build: SUCCESS)
udit_s has joined #mlpack
< udit_s> naywhayare: Hey, are you free now?
< naywhayare> udit_s: yeah, I am back now
< udit_s> I'll complete the gaussian distribution test for the perceptron by tomorrow. Other than that, you said you were done with the Perceptron?
< naywhayare> yeah, I think so
< naywhayare> I still need to go through decision_stump_main.cpp and perceptron_main.cpp, but that shouldn't take long... I just keep forgetting
< naywhayare> I am learning about AdaBoost now so that I can respond well to your email :)
< udit_s> Okay. Whenever you're free, let's talk about AdaBoost, because I'm kinda stuck on those two points I've mentioned.
< udit_s> Oh. Cool.
< naywhayare> right; I am working on an answer now. give me a handful of minutes (maybe 20 to 30? I need to do some reading) and I will have a good response for you
< udit_s> Sure. Take your time. Actually, I just wanted to catch you before I go to sleep. I wanted to get started on it by tomorrow.
< udit_s> Let's talk tomorrow then? Say, 1300 UTC?
< udit_s> We could have a discussion similar to what we did before the Perceptron...
< udit_s> And I'll work on it over the weekend.
< naywhayare> okay
< naywhayare> I should be awake by 1300 UTC
< udit_s> What would you prefer? Would now (the next hour or so) be a better time for you?
< jenkins-mlpack> Project mlpack - svn checkin test build #2011: SUCCESS in 1 hr 18 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20svn%20checkin%20test/2011/
< jenkins-mlpack> siddharth.950: Adding Regularized SVD Code
< naywhayare> nah, let's do it tomorrow morning
< naywhayare> (sorry for the slow response, I stepped out)
< udit_s> okay.
udit_s has quit [Quit: Leaving]
sumedh__ has joined #mlpack
sumedh_ has quit [Read error: Connection reset by peer]
andrewmw94 has left #mlpack []