verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
stephentu has joined #mlpack
< stephentu> naywhayare: there are two kinds of ML algorithms
< stephentu> ones that you can prove shit about
< stephentu> and ones that work in practice
< stephentu> if you try to do both your life really sucks
< stephentu> haha
< naywhayare> stephentu: truth! :)
< naywhayare> one of the things that has really interested me lately is deriving runtime bounds that depend on easily calculable measures of the dataset
< naywhayare> worst-case analysis is somewhat pointless: you never have a worst-case dataset in practice (well... depending, but generally, yeah, datasets are well-behaved to some extent)
< naywhayare> so if I can come up with a measure of "goodness" of a dataset for a particular task, and I know how algorithms fare with respect to that measure, I can make an informed decision about which algorithm to use
< naywhayare> in the nearest neighbor search literature, at least, there is not very much in this vein right now
< naywhayare> I'm hopeful that I can find some time to think about this type of stuff in the future :)
< naywhayare> what is the algorithm that you are working on? a matrix completion algorithm? :)
< stephentu> naywhayare: that'd be an awesome thing to have, seems quite difficult for optimization algorithms
< stephentu> naywhayare: this stupid idea to try to make stochastic algorithms parallelizable and still equivalent to some serial order
< stephentu> naywhayare: there are enough matrix completion algorithms
< naywhayare> ah, okay
< stephentu> the craziest one i've seen being
< naywhayare> well hopefully the submission goes well :)
< stephentu> gradient descent on an orthogonal group manifold
< stephentu> as if anybody is actually going to implement that
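(For reference, a "gradient descent on an orthogonal group manifold" step usually amounts to projecting a Euclidean gradient onto the tangent space at the current iterate and then retracting back onto the manifold. The sketch below, in NumPy, shows one common way to do that with a QR-based retraction; it is only an illustration of the general technique, not any specific paper's algorithm, and the function name is made up for the example.)

    import numpy as np

    def orthogonal_group_step(Q, euclidean_grad, step_size):
        """One hypothetical gradient step constrained to the orthogonal group.

        Q is the current iterate (square, with Q.T @ Q = I) and
        euclidean_grad is the gradient of the objective at Q taken in
        ordinary Euclidean space."""
        # Project the Euclidean gradient onto the tangent space at Q:
        # tangent vectors at Q have the form Q @ A with A skew-symmetric.
        A = Q.T @ euclidean_grad
        riemannian_grad = Q @ (0.5 * (A - A.T))

        # Step in the ambient space, then retract back onto the manifold
        # via a QR factorization (one standard choice of retraction).
        Y = Q - step_size * riemannian_grad
        Q_new, R = np.linalg.qr(Y)
        # Flip column signs so the diagonal of R is nonnegative, which
        # makes the retraction deterministic.
        signs = np.sign(np.sign(np.diag(R)) + 0.5)
        return Q_new * signs

The retraction that keeps every iterate exactly on the manifold is the part that makes methods like this awkward to implement, let alone parallelize.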
< naywhayare> heh
< stephentu> the idea here is
< stephentu> what if we didn't have to write a new NIPS/ICML paper
< stephentu> every time we wanted to parallelize some ML algorithm
< stephentu> although it's unclear to me if that's a desirable goal
< stephentu> since we'll have fewer things to work on
< naywhayare> maybe, but it's worth noting that most abstractions of that type ("free parallelization!") come at some cost, and someone could come in and do a better job by hand
< naywhayare> still, I think it would be a great thing to have, if it works
< stephentu> oh ya
< stephentu> it has
< stephentu> not just some cost
< stephentu> but
< stephentu> A LOT of cost
< stephentu> that's why i've been raging
< stephentu> trying to figure out clever ways to cut the costs
< stephentu> there really is no free lunch :(
< naywhayare> yeah; at the very least, if you can say "the cost is overridden if you have enough processors", then that's good enough
< stephentu> well in our case we're hoping it's "the cost is overridden if your gradients are super expensive"
< stephentu> like if the gradient is ComputeDigitsOfPi(10000)
< naywhayare> ah, okay, I see
< stephentu> then i think we're good
< stephentu> or like mining bitcoins
< naywhayare> I've had to spend a lot of space in the NIPS paper I'm working on outlining the use case
< naywhayare> it's a k-means algorithm, and it's fast only when k is large and the number of points is also large
< naywhayare> but justifying that k is actually large in practice... that took some digging to find out that people were actually doing that
< stephentu> interesting
< naywhayare> I feel like I have to devote an entire page of the submission to justifying that, though; otherwise a reviewer goes, without searching, "yeah but nobody is actually doing that"
< stephentu> i applaud you for still innovating on k-means
< stephentu> i would have thought everything was done
< naywhayare> that's what I thought too; personally, I don't think there's any huge innovation
< naywhayare> I've taken off-the-shelf dual-tree nearest neighbor search techniques, observed that k-means is nearest neighbor search, and put the two together
< naywhayare> plus some other exploitation of the problem structure to get additional speedup; for example, sometimes the nearest centroid of a point can't change between iterations
< naywhayare> you need low-dimensional data (like, below 50 dimensions), large datasets (several hundred thousand points and up), and large k (500? 1000? in that ballpark or larger) to see any speedup though
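(The two observations above, that the k-means assignment step is just nearest neighbor search of each point against the centroids and that a point's nearest centroid often can't change between iterations, can be sketched in a few lines. The version below uses brute-force search as the fallback and a simple Hamerly-style distance-bound prune; it is only an illustration of the idea, not the dual-tree algorithm being discussed, and all of the names are invented for the example.)

    import numpy as np

    def assignment_step(points, centroids, owners, upper, lower):
        """One k-means assignment pass, phrased as nearest neighbor search
        of every point against the set of centroids.

        owners[i] is the index of point i's current nearest centroid,
        upper[i] is an upper bound on its distance to that centroid, and
        lower[i] is a lower bound on its distance to every other centroid."""
        for i in range(len(points)):
            # Prune: if the upper bound on the distance to the current
            # owner already beats the lower bound on every other centroid,
            # the nearest centroid cannot have changed, so skip the search.
            if upper[i] <= lower[i]:
                continue
            # Otherwise do exact nearest neighbor search over the centroids
            # (a real implementation would use a tree here, not brute force).
            dists = np.linalg.norm(centroids - points[i], axis=1)
            order = np.argsort(dists)
            owners[i] = order[0]
            upper[i] = dists[order[0]]
            lower[i] = dists[order[1]]
        return owners, upper, lower

    def loosen_bounds(upper, lower, owners, centroid_movement):
        """After the centroids move, loosen the bounds by how far each
        centroid moved so they remain valid for the next pass."""
        upper += centroid_movement[owners]
        # No non-owner centroid moved by more than the largest movement,
        # so subtracting it keeps the lower bound valid.
        lower -= centroid_movement.max()
        return upper, lower

In this style of bound-based pruning, points for which the prune fires never have to search the centroids at all, which is where most of the per-iteration savings come from.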
< naywhayare> we'll see if the reviewers think it's interesting :) last year, they didn't, but I've since improved the algorithm significantly
< stephentu> naywhayare: sounds cool
< stephentu> good luck!
< stephentu> i'll be going to NIPS regardless
< stephentu> so we can hang out in the cold
< naywhayare> :)
< naywhayare> I should be there too
< naywhayare> I'll be at ICML in France also, if you are going
< naywhayare> anyway, I'm getting pretty tired... gonna call it a night