cameron.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< zoq> If I'm right Nishant Mehta implemented LARS/LASSO.
< stephentu> oh i didnt take a very close look haha
< stephentu> cool
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 244 seconds]
decltype_me has quit [Ping timeout: 265 seconds]
decltype_me has joined #mlpack
stephentu has quit [Ping timeout: 256 seconds]
dataVinci has joined #mlpack
decltype_me has quit [Quit: Konversation terminated!]
dataVinci has quit [Quit: Leaving]
dataVinci has joined #mlpack
sumedh has joined #mlpack
dataVinci has quit [Quit: Leaving]
govg has quit [Quit: leaving]
govg has joined #mlpack
stephentu has joined #mlpack
< naywhayare> stephentu: it sounds like I need to sit down and redesign the testing framework so that one can 'make lrsdp_test' or something like that
< naywhayare> I've been hearing a lot of people asking that question recently
< naywhayare> I think it's possible, I just need to think of what CMake voodoo is necessary to do it nicely :)
sumedh has quit [Ping timeout: 244 seconds]
sumedh has joined #mlpack
< stephentu> naywhayare: i dont really understand the auto bug. c++ is still magical sometimes
< naywhayare> if I had to guess, the standard probably doesn't make a firm guarantee of what 'auto' will evaluate to be (probably just "something that compiles and works")
< naywhayare> and since armadillo has all these ridiculous types that can be cast back and forth for the purposes of delayed evaluation, different compiler versions are probably picking different armadillo class instantiations and it all goes to hell for whatever reason
< naywhayare> I may put down the time to find a minimal case where 'auto' causes issues with Armadillo, but for me at least it's pretty low priority
< naywhayare> (my possibly archaic opinion is that I just don't like auto because it can make code ambiguous to read, but that's not good enough justification to tell other people they shouldn't use it)
< stephentu> naywhayare: auto is for lazy people like me who dont use real IDEs so typing a full typename a::b<arg1, arg2, ...>::c::d blah blah is annoying
< stephentu> although in this case i saved like 5 characters
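As a side note on the 'auto' discussion above: a minimal sketch of the kind of pitfall that seems to be at play, assuming the culprit is 'auto' deducing Armadillo's delayed-evaluation expression types instead of arma::mat (a guess at the failure mode, not a reproduction of the actual bug):

    #include <armadillo>

    int main()
    {
      arma::mat A(10, 10, arma::fill::randu);
      arma::mat B(10, 10, arma::fill::randu);

      // Writing out the type forces the product to be evaluated right here.
      arma::mat C = A * B;

      // With 'auto', the deduced type is Armadillo's delayed-evaluation
      // expression (roughly arma::Glue<..., arma::glue_times>), which only
      // holds references to A and B -- no multiplication has happened yet.
      auto D = A * B;

      // D is evaluated from whatever A and B contain at the point of use, so
      // modifying A first (or letting it go out of scope) changes the result.
      A.zeros();
      arma::mat E = D;  // evaluated here; E is all zeros, unlike C

      return 0;
    }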
< stephentu> anyways, on an unrelated note, do you have any other ideas for mlpack projects
< naywhayare> real IDEs? who needs 'em! I use screen + vim :)
< stephentu> i'm in this fun mode of procrastinating from doing actual research
< naywhayare> hm, so I was really glad to see you work with the LRSDP code because originally I thought it would provide a nice easy way to solve machine learning problems, faster than using MATLAB + SeDuMi or whatever
< naywhayare> I had planned to make LMNN work and also MVU, but I could never get LRSDP to consistently converge
< naywhayare> then my research went a different direction and it became clear that I could no longer afford the time towards something which clearly wasn't going anywhere fast
< naywhayare> especially given my lack of knowledge of optimizers and numerical solvers (at the time. maybe I could do better now -- but maybe not)
< naywhayare> one challenge which might be suitable for you might be an SVM implementation, if you would find that interesting
< naywhayare> we used to have one, long ago, but it was broken and untested, so I removed it
< stephentu> is there any reason you dont just use libsvm
< stephentu> or whatever the standard is called
< naywhayare> I don't mind dependencies for lower-level stuff, but I consider libsvm to be at the same level of abstraction as mlpack
< naywhayare> the shogun project wraps all kinds of other libraries at the same abstraction level, like cover trees and libsvm and liblinear and all sorts of stuff like that
< naywhayare> but in some sense this means that what shogun gives you is just a nice wrapper around those other libraries (and whether or not it's "nice" is in the eye of the beholder...)
< naywhayare> often, wrapping may also incur slowdowns when data structures need to be copied or modified
< naywhayare> my view is that by having our own implementation, we can provide something more flexible than the libsvm implementation
< naywhayare> there is some way to use trees to accelerate SMO, I think, but the person who was associated with that is long gone, and I wouldn't touch his code with a ten-foot stick anyway...
< stephentu> if i were to implement svms i'd just use SGD
< stephentu> so i might not be the person for that :)
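For what it's worth, the SGD route stephentu mentions is roughly a Pegasos-style subgradient method on the hinge loss; a rough sketch in Armadillo terms (not an mlpack API, and the names are made up for illustration):

    #include <armadillo>

    // Train a linear SVM by stochastic subgradient descent on
    //   (lambda / 2) ||w||^2 + (1 / n) sum_i max(0, 1 - y_i w' x_i).
    // data: d x n matrix (one point per column); labels: entries in {-1, +1}.
    arma::vec SgdLinearSvm(const arma::mat& data,
                           const arma::vec& labels,
                           const double lambda = 1e-4,
                           const size_t epochs = 10)
    {
      arma::vec w(data.n_rows, arma::fill::zeros);
      size_t t = 1;
      for (size_t e = 0; e < epochs; ++e)
      {
        // Pegasos proper samples points at random; sequential passes keep
        // the sketch short.
        for (size_t i = 0; i < data.n_cols; ++i, ++t)
        {
          const double eta = 1.0 / (lambda * t);  // decaying step size
          const double margin = labels(i) * arma::dot(w, data.col(i));

          // Subgradient step: shrink w, then add the hinge term if violated.
          w *= (1.0 - eta * lambda);
          if (margin < 1.0)
            w += eta * labels(i) * data.col(i);
        }
      }
      return w;
    }

A new point x would then be classified by the sign of arma::dot(w, x); a bias term and a projection step are left out to keep the sketch minimal.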
< stephentu> how big are the matrices for MVU
< stephentu> i'm actually looking to try to implement an interior point solver for SDPs
< stephentu> (for fun)
< naywhayare> what originally brought me to MVU was this paper by a former labmate:
< stephentu> but if the MVU matrices are small then it might be better to use
< naywhayare> I won't bother sending the PDF because the paper itself is horribly written and incomprehensible
< naywhayare> but, it contains a blurb that suggests that the speech signal can be unfolded into 3 dimensions, not 39 (which is what's usually used for speech processing)
< naywhayare> that would be an exciting result, but I couldn't replicate his results on the TIMIT database, which is something like 1.7 million points in 39 dimensions
< naywhayare> for LMNN I think people would want to use matrices of about the same size or larger (depending on the dataset); the dimensionality should not be too incredibly high for LMNN
< stephentu> i've never heard about this stuff so i'll check it out
< naywhayare> a comparison of Weinberger's LMNN reference implementation and a fast LRSDP implementation that converges to a similar result would bring a lot of eyes to the implementation and might actually be paperworthy on its own (maybe in a workshop?)
< naywhayare> do you want some references to read on what LMNN and MVU are?
< stephentu> sure
< stephentu> can you send to tu.stephenl@gmail.com
< naywhayare> or, I guess just those acronyms should be good enough to search; still, I'll send some to that email
< stephentu> thanks
< naywhayare> ok, sent
< stephentu> what do you work on?
< naywhayare> I've spent the past handful of years thinking about the class of algorithms known as dual-tree algorithms
< naywhayare> dual-tree algorithms are best known for providing quick solutions to problems like nearest neighbor search, but they can actually be applied to a pretty wide range of problems
< naywhayare> minimum spanning tree calculation, density estimation, k-means clustering, mixture model parameter estimation, maximum inner-product search, even approximate matrix multiplication
< naywhayare> the basic idea is that we'll build a query tree (for nearest neighbor, this is a tree built on the query points: those points for which we want the nearest neighbor) and a reference tree (for nearest neighbor, this is a tree built on the set of points we are searching in)
< naywhayare> then we'll do a dual-tree traversal where we visit combinations of nodes in the two trees
< naywhayare> and where we can, we'll prune away branches (which is what makes it fast)
< naywhayare> so, somewhat unsurprisingly, a lot of the algorithms in mlpack are dual-tree algorithms :)
< naywhayare> they used to be about all of the algorithms, but many of the other contributors left when they graduated from the lab, and since then people with different interests have appeared, so mlpack is a bit more general-purpose now :)
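To make the traversal-and-pruning idea above concrete, here is a small sketch for nearest neighbor search on one-dimensional points with a simple interval tree; tree construction is omitted (assume nodes come from median splits), the names are made up for illustration, and mlpack's real trees and traversals are far more general:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <memory>
    #include <vector>

    struct Node
    {
      std::vector<size_t> points;         // indices of all descendant points
      double lo, hi;                      // bounding interval of those points
      std::unique_ptr<Node> left, right;  // children (null at leaves)
      bool IsLeaf() const { return left == nullptr; }
    };

    // Lower bound on the distance between any point in a and any point in b.
    double MinDistance(const Node& a, const Node& b)
    {
      if (a.hi < b.lo) return b.lo - a.hi;
      if (b.hi < a.lo) return a.lo - b.hi;
      return 0.0;
    }

    // Visit combinations of query and reference nodes, pruning combinations
    // that cannot improve any query point's current nearest neighbor.
    void DualTreeSearch(const Node& q, const Node& r,
                        const std::vector<double>& queries,
                        const std::vector<double>& refs,
                        std::vector<double>& bestDist,   // initialized to infinity
                        std::vector<size_t>& bestRef)
    {
      // The worst current candidate distance among q's descendants.
      double bound = 0.0;
      for (const size_t qi : q.points)
        bound = std::max(bound, bestDist[qi]);

      // Prune: nothing in r can get closer than 'bound' to anything in q.
      if (MinDistance(q, r) > bound)
        return;

      if (q.IsLeaf() && r.IsLeaf())
      {
        // Base case: exhaustively compare the points in the two leaves.
        for (const size_t qi : q.points)
          for (const size_t ri : r.points)
          {
            const double d = std::abs(queries[qi] - refs[ri]);
            if (d < bestDist[qi]) { bestDist[qi] = d; bestRef[qi] = ri; }
          }
      }
      else if (q.IsLeaf())
      {
        DualTreeSearch(q, *r.left, queries, refs, bestDist, bestRef);
        DualTreeSearch(q, *r.right, queries, refs, bestDist, bestRef);
      }
      else if (r.IsLeaf())
      {
        DualTreeSearch(*q.left, r, queries, refs, bestDist, bestRef);
        DualTreeSearch(*q.right, r, queries, refs, bestDist, bestRef);
      }
      else
      {
        DualTreeSearch(*q.left, *r.left, queries, refs, bestDist, bestRef);
        DualTreeSearch(*q.left, *r.right, queries, refs, bestDist, bestRef);
        DualTreeSearch(*q.right, *r.left, queries, refs, bestDist, bestRef);
        DualTreeSearch(*q.right, *r.right, queries, refs, bestDist, bestRef);
      }
    }

One call on the roots of a query tree and a reference tree, with bestDist initialized to infinity, fills in each query point's nearest reference index; the pruning check is what turns the quadratic scan into something much cheaper when the data is clustered.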
< stephentu> neat, this is a whole world that is unfamiliar to me
< naywhayare> people at NIPS sometimes tell me "that's not machine learning", and they're sort of right -- it's closer to computational geometry in some respects
< naywhayare> how about yourself? when I think of Ben Recht, I think of kernel embeddings, but I think that was years ago and he's probably doing something else now
< stephentu> our lab actually has very diverse interests
< stephentu> theres a core group of people into compressed sensing
< stephentu> and they are the ones that give the weekly talks that go over my head
< naywhayare> when I think compressed sensing, I think "sparse L1-penalized regression"... is that in the right direction?
< naywhayare> there are people here at Georgia Tech who do compressed sensing, but I've never had a big interest in it (I guess I need to figure out how to do it with trees first)
< stephentu> maybe circa 2006
< naywhayare> yeah, I think my knowledge of it dates from about then too :)
< stephentu> a lot of them work on proving convergence of optimization methods for SDPs
< stephentu> like one recent work was showing that the SDP for phase retrieval
< stephentu> has a very nice natural 1st order method
< stephentu> with nice convergence rates
< stephentu> etc
< stephentu> myself i'm still pretty new
< stephentu> so trying to figure out the landscape
< naywhayare> it took me several years to get my feet on the ground and figure out what it was I wanted to do
< stephentu> i had a brief stint with bayesian non-parametrics this summer
< stephentu> but i think that excitement has died
< stephentu> now i'm into sdps and trying to think about how to actually make them work at scale
< stephentu> but that might be too ambitious
< stephentu> cause honestly LRSDP sucks
< stephentu> like it works great when it works
< stephentu> but theres no theory
< naywhayare> you mentioned another SDP solving approach that was fast, but I can't remember the name of it
< stephentu> so dual ascent methods are pretty fast
< stephentu> but the problem is recovering the primal solution
< stephentu> is kind of annoying
< stephentu> and it only works when your problem has nice structure
< stephentu> specifically, when the trace is constant over the feasible set
< stephentu> so it works nice for graph combinatoric problems
< stephentu> but not for nuclear norm minimization
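For reference, one standard way to state the structural condition stephentu is describing (a general reading of "constant trace", not taken from any particular paper):

    % Primal SDP and the constant-trace reduction to an eigenvalue problem.
    \begin{align*}
      \text{(P)}\qquad \min_{X \succeq 0} \; \langle C, X \rangle
        \quad \text{s.t.} \quad \langle A_i, X \rangle = b_i, \; i = 1, \dots, m.
    \end{align*}
    If every feasible $X$ satisfies $\operatorname{tr}(X) = a$ for a known constant $a$,
    the dual can be written as the nonsmooth eigenvalue problem
    \begin{align*}
      \min_{y \in \mathbb{R}^m} \;
        a \, \lambda_{\max}\!\Big( \sum_{i=1}^m y_i A_i - C \Big) - b^\top y,
    \end{align*}
    which subgradient- and bundle-style dual methods handle well; the catch, as noted
    above, is recovering a primal $X$ from a good dual $y$.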
< naywhayare> I don't think it's too ambitious of an idea though
< naywhayare> certainly the payoff would be high if you figured something out :)
< stephentu> ya its one of those
< stephentu> people have been pounding at this for years
< naywhayare> do people use sampling approaches in practice for SDPs? i.e. Nystroem approximation type stuff
< naywhayare> if that makes any sense... I think it does?
< stephentu> not sure
< stephentu> that is an interesting thought
< naywhayare> I'm not forming the thought completely, but one direction to consider might be some type of hierarchical approximation scheme
< naywhayare> where you come up with some approximate solution to the whole SDP matrix by selecting only some of the points in the dataset
< naywhayare> and then have additive higher-quality solutions on nearby subsets of the points
< naywhayare> or... something like that
< naywhayare> maybe there's nothing there, but maybe there is. I'm reminded of a recent ICML paper:
< naywhayare> that's sort of hierarchical approximation of a kernel matrix, but the basic ideas there might be somehow applicable. I am not an optimization expert, though, so maybe my thoughts are entirely in outer space :)
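On the Nystroem thought above, the kernel-matrix version is easy to write down in Armadillo; whether anything similar helps for the matrix variable of an SDP is exactly the speculative part, and the function name here is made up for illustration:

    #include <armadillo>

    // Approximate a symmetric PSD matrix K (n x n) by sampling m landmark
    // columns: K is roughly C * pinv(W) * C.t(), a rank-(at most m) surrogate.
    arma::mat NystroemApprox(const arma::mat& K, const size_t m)
    {
      // Pick m landmark indices uniformly at random, without replacement.
      const arma::uvec landmarks = arma::randperm(K.n_rows, m);

      const arma::mat C = K.cols(landmarks);                // n x m
      const arma::mat W = K.submat(landmarks, landmarks);   // m x m

      return C * arma::pinv(W) * C.t();
    }

For a kernel matrix this gives a low-rank stand-in that can be used wherever K appears; the hierarchical refinement on nearby subsets of points mentioned above is only a thought, not worked out here.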
< naywhayare> anyway, it's getting late... headed home for now
< stephentu> thanks for the thoughts-- i have no intuition if SDPs can actually be approximated reasonably well (unlike the case for SVMs and other classification problems).
< stephentu> recently, ben gave me a bunch of papers relating to the condition number of SDPs
< stephentu> i think that may be the way to go to try to capture the notion of "stability" of an SDP
< stephentu> but these papers are pretty deep and i haven't been able to wrap my head around them