verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< stephentu>
naywhayare: so it seems that things like laplacian eigenmaps
< stephentu>
should scale pretty well
< stephentu>
for large dataset sizes
< stephentu>
since it's just a sparse eigenvector problem
< stephentu>
do you have a sense for why something like an MVU SDP is worth it?
< naywhayare>
better results, maybe? at the time I pursued it because a senior labmate (the Nick Vasiloglou whose papers I linked) thought it was a good idea and I didn't know how to evaluate if it was a good idea or if there was something easier to do
< stephentu>
right, the problem i'm having is
< stephentu>
there are like 20 different manifold methods
< stephentu>
and there is no real way to evaluate them
< stephentu>
and like everybody does swiss roll
< stephentu>
but they all seem to work on swiss roll
< stephentu>
and after that it's like well... images?
< stephentu>
maybe ill try on some image datasets
< naywhayare>
you kind of have to set up a bit of a pipeline... map to low-dimensional spaces, then train a classifier
< naywhayare>
but one of the issues there is that if you want to do that Right, you need out-of-sample extensions so that you can map new points to the low-dimensional spaces
< naywhayare>
I think I've mentioned that paper before though
< naywhayare>
speech is an interesting problem it might be applied to, but really any common machine learning problem that kernel methods perform well on should provide a decent challenge, I think
< stephentu>
ya i saw that paper
< stephentu>
ok i think maybe the question i'll try to answer is
< stephentu>
are there any non-trivial problems where the MVU does better than something like laplacian eigenmaps
< naywhayare>
I am interested to know what you find out
< stephentu>
naywhayare: might be a while haha
< stephentu>
but i think mlpack will benefit from this
< stephentu>
i'll try to dogfood myself for a while
< stephentu>
w/o going back to python haha
< naywhayare>
yeah... if you notice things that are poorly designed or ugly to implement, please, we can make it better :)
< naywhayare>
I've done my best to make the API easy for experiments like this, but also with time constraints; it can always be improved :)
< stephentu>
yep will do
< stephentu>
it's really hard to compete w/ sklearn though
< stephentu>
it's just too easy to use
< stephentu>
then you build huge pipelines around it
< stephentu>
and get locked in
< naywhayare>
yeah, and that's almost more of a Python vs. C++ difference (and it also has to do with the number of users sklearn has, which helps them with documentation and also having ready-to-go scripts and stuff like that)
< naywhayare>
I think people can be convinced to convert to mlpack assuming that mlpack either offers much faster algorithms, or offers algorithms that are closer to the state-of-the-art
< naywhayare>
both of these things take a lot of developer time, though...
< stephentu>
the most precious of resources
curiousguy13 has quit [Ping timeout: 240 seconds]
< stephentu>
naywhayare: so arpack has a way of solving generalized eigenvalue problems
< stephentu>
Ly = lambda D y
< stephentu>
where L, D are psd matrices
< stephentu>
do you have a sense of how hard it would be to hook up armadillo code to use it
< stephentu>
i assume it uses arpack to solve eigenvalue problems for sparse matrices?
< stephentu>
laplacian eigenmaps requires solving such a generalized eigenvalue problem
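For reference, the problem described above can be written out on a tiny graph. This is a minimal sketch, assuming the usual Laplacian eigenmaps setup where D is the diagonal degree matrix of a symmetric weight matrix W and L = D - W (the chat itself only states L y = lambda D y with L and D PSD). The helper names and the 3-node path graph are illustrative, not mlpack or Armadillo code.

```python
def degree_and_laplacian(W):
    """Return (D, L) for a symmetric weight matrix W given as nested lists."""
    n = len(W)
    degrees = [sum(row) for row in W]
    # D is diagonal with the node degrees; L = D - W (assumed setup).
    D = [[degrees[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    L = [[D[i][j] - W[i][j] for j in range(n)] for i in range(n)]
    return D, L


def matvec(A, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def residual(L, D, lam, y):
    """Largest absolute entry of L y - lam * D y."""
    Ly = matvec(L, y)
    Dy = matvec(D, y)
    return max(abs(a - lam * b) for a, b in zip(Ly, Dy))


# Path graph 0 - 1 - 2 with unit edge weights (illustrative data).
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 1.0],
     [0.0, 1.0, 0.0]]
D, L = degree_and_laplacian(W)

# The constant vector solves the problem with lambda = 0; an embedding
# would use the eigenvectors for the next-smallest eigenvalues.
trivial = residual(L, D, 0.0, [1.0, 1.0, 1.0])
# (1, 0, -1) solves it with lambda = 1 for this particular graph.
first_nontrivial = residual(L, D, 1.0, [1.0, 0.0, -1.0])
```

Both residuals vanish for this graph, which is a quick way to sanity-check a candidate eigenpair before wiring up a real sparse solver.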
curiousguy13 has joined #mlpack
jbc__ has joined #mlpack
jbc_ has quit [Ping timeout: 245 seconds]
jbc__ is now known as jbc_
stephentu has quit [Ping timeout: 252 seconds]
curiousguy13 has quit [Ping timeout: 265 seconds]
curiousguy13 has joined #mlpack
dhfromkorea has joined #mlpack
dhfromkorea has quit [Remote host closed the connection]
dhfromkorea has joined #mlpack
curiousguy13 has quit [Ping timeout: 246 seconds]
curiousguy13 has joined #mlpack
dhfromkorea has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
stephentu has joined #mlpack
< naywhayare>
stephentu: Armadillo does wrap ARPACK via eigs_sym() and eigs_gen(), but that doesn't do generalized eigenvalue problems
dhfromkorea has joined #mlpack
< naywhayare>
I think the assumption is that D = I, for what Armadillo has implemented
< naywhayare>
it certainly wouldn't be hard to extend the code that's there to do laplacian eigenmaps
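One way to see how a D = I solver relates to the generalized problem: when D is diagonal with strictly positive entries (an assumption; it holds for a degree matrix with no isolated nodes), substituting z = D^{1/2} y turns L y = lambda D y into the standard symmetric problem (D^{-1/2} L D^{-1/2}) z = lambda z. The sketch below uses plain power iteration as a stand-in for a real eigensolver, so it finds the largest eigenvalue rather than the small ones an embedding needs; it only illustrates the change of variables and is not what Armadillo or ARPACK does internally.

```python
import math


def normalized_operator(L, d):
    """M = D^{-1/2} L D^{-1/2} for diagonal D with positive entries d."""
    n = len(d)
    return [[L[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
            for i in range(n)]


def matvec(A, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def top_eigenpair(M, iters=200):
    """Power iteration; a stand-in for a standard symmetric eigensolver."""
    z = [1.0] + [0.0] * (len(M) - 1)
    for _ in range(iters):
        w = matvec(M, z)
        norm = math.sqrt(sum(v * v for v in w))
        z = [v / norm for v in w]
    # Rayleigh quotient of the unit vector z.
    lam = sum(a * b for a, b in zip(z, matvec(M, z)))
    return lam, z


# Laplacian and degrees of the path graph 0 - 1 - 2 (illustrative data).
L = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
d = [1.0, 2.0, 1.0]

M = normalized_operator(L, d)
lam, z = top_eigenpair(M)

# Map back to the generalized problem: y = D^{-1/2} z.
y = [zi / math.sqrt(di) for zi, di in zip(z, d)]
Ly = matvec(L, y)
gen_residual = max(abs(a - lam * di * yi) for a, di, yi in zip(Ly, d, y))
```

The recovered y satisfies the original generalized equation with the same lambda, so any solver that handles the D = I case can in principle be reused this way when D is cheap to invert.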
< stephentu>
naywhayare: so what's the right way to do that?
< stephentu>
modify armadillo?
< naywhayare>
you could modify armadillo and add a function to src/mlpack/core/arma_extend/
< naywhayare>
take a look at sp_auxlib_meat.hpp, either eigs_sym() or eigs_gen(), and you can see how I wrapped it for armadillo
< naywhayare>
(ARPACK is frustrating and complex, but at the very least I've done the FORTRAN wrapping and you should be able to pretty easily adapt the code that's there)
< stephentu>
cool thanks
< stephentu>
ya i'm well aware of the fun of using ARPACK
< stephentu>
or any fortran library
< stephentu>
they really love those 17 parameter functions
< naywhayare>
yeah...
< naywhayare>
the "reverse communication interface" really entertains me
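For readers who have not met the pattern: in a reverse communication interface the solver never receives the matrix. It returns to the caller with a request ("multiply this vector for me"), the caller does the work however it likes, and then calls the solver again with the answer. The toy below is a sketch of that control flow only; the class, the string tags, and the method names are hypothetical and are not ARPACK's actual routines or flags.

```python
import math


class PowerIterationRC:
    """Toy reverse-communication solver: it never sees the matrix."""

    def __init__(self, n, iters=200):
        self.x = [1.0] + [0.0] * (n - 1)  # unit start vector
        self.remaining = iters
        self.eigenvalue = None

    def step(self, y=None):
        """Advance the solver; y is the caller's answer to the last request."""
        if y is not None:
            # y = A x for the unit vector x we handed out last time.
            self.eigenvalue = sum(a * b for a, b in zip(self.x, y))
            norm = math.sqrt(sum(v * v for v in y))
            self.x = [v / norm for v in y]
            self.remaining -= 1
        if self.remaining == 0:
            return "done", self.eigenvalue
        # Hand control back to the caller with a request for A x.
        return "matvec", list(self.x)


def dominant_eigenvalue(matvec, n):
    """Caller-side driver loop: answer requests until the solver is done."""
    solver = PowerIterationRC(n)
    request, payload = solver.step()
    while request == "matvec":
        request, payload = solver.step(matvec(payload))
    return payload


# The matrix lives entirely on the caller's side (illustrative data).
A = [[2.0, 1.0],
     [1.0, 2.0]]


def apply_A(x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


largest = dominant_eigenvalue(apply_A, 2)
```

The appeal of the design is that the caller can use any storage format (sparse, dense, matrix-free); the cost is the driver loop and the bookkeeping of state between calls.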
< stephentu>
would you recommend adding a new function
< stephentu>
like geneigs_sym
< stephentu>
or modifying eigs_sym
< naywhayare>
I'd add a new function
< naywhayare>
ah, Armadillo has "eig_pair()" which solves Ax = Bx\lambda, which I think is the same system you're solving
< naywhayare>
so you could call it "eigs_pair()"
< naywhayare>
ah, that'll make things easy for patching Armadillo upstream
< stephentu>
ah
< stephentu>
i see
< stephentu>
so it works for dense matrices
< stephentu>
but not sparse
< stephentu>
got it
< naywhayare>
yeah, eig_pair() probably uses whatever LAPACK functionality is there, and almost certainly finds _all_ the eigenvalues
< naywhayare>
ARPACK will actually work with dense matrices too, and I have been thinking of extending Armadillo so that eigs_sym() and eigs_gen() both work with dense arguments
stephentu_ has joined #mlpack
dhfromkorea has quit [Read error: Connection reset by peer]
dhfromkorea has joined #mlpack
dhfromkorea has quit [Remote host closed the connection]
< stephentu_>
naywhayare: oh god this arpack stuff is horrible
dhfromkorea has joined #mlpack
< naywhayare>
stephentu_: yeah, it can be. one of the nice things is that the arma::arpack namespace wraps most of the types so you don't need snaupd(), dnaupd(), znaupd(), and cnaupd(), just 'naupd()'
stephentu_ has quit [Ping timeout: 245 seconds]
stephentu_ has joined #mlpack
< naywhayare>
okay, I think that the weird networking issue is solved
< naywhayare>
it was either a bad ethernet cable or a bad port in the wall (potentially, and probably, both)
< naywhayare>
very weird problem to track down, and I'm still not 100% sure that it's solved (I can't rule out some configuration change elsewhere on campus having caused, and then later fixed, the problem, but it sure seems like I could reproduce it by changing cable/wall port)
< zoq>
Great to hear that you 'potentially' fixed the issue ... I guess the problem cost you a lot of time.
< naywhayare>
I spent most of the time waiting on the networking folks to prepare a new switch to try and check the logs on their end
< naywhayare>
I think the majority of what I did was unplug and replug cables for a few minutes, and do lots of pings inside while(true) loops :)
< zoq>
Okay, sounds like the network team should take all the credit :)
sumedhghaisas has quit [Ping timeout: 245 seconds]
dhfromkorea has quit [Remote host closed the connection]