verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< stephentu> naywhayare: so it seems that things like laplacian eigenmaps
< stephentu> should scale pretty well
< stephentu> for large dataset sizes
< stephentu> since it's just a sparse eigenvector problem
< stephentu> do you have a sense for why something like an MVU SDP is worth it?
< naywhayare> better results, maybe? at the time I pursued it because a senior labmate (the Nick Vasiloglou whose papers I linked) thought it was a good idea and I didn't know how to evaluate if it was a good idea or if there was something easier to do
< stephentu> right, this is the problem i'm having:
< stephentu> there are like 20 different manifold methods
< stephentu> and there is no real way to evaluate them
< stephentu> and like everybody does swiss roll
< stephentu> but they all seem to work on swiss roll
< stephentu> and after that it's like well... images?
< stephentu> maybe ill try on some image datasets
< naywhayare> you kind of have to set up a bit of a pipeline... map to low-dimensional spaces, then train a classifier
< naywhayare> but one of the issues there is that if you want to do that Right, you need out-of-sample extensions so that you can map new points to the low-dimensional spaces
< naywhayare> there are some techniques for this, but I don't know how good they are: http://machinelearning.wustl.edu/mlpapers/paper_files/NIPS2003_AA23.pdf
< naywhayare> I think I've mentioned that paper before though
< naywhayare> speech is an interesting problem it might be applied to, but really any common machine learning problem that kernel methods perform well on should provide a decent challenge, I think
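[editor's note: the evaluation pipeline described above — map to a low-dimensional space, then train a classifier — can be sketched in a few lines of NumPy. This is a toy dense illustration, not mlpack's API; the function names are made up, and a dense eigh() stands in for the sparse solver being discussed.]

```python
import numpy as np

def laplacian_eigenmaps(X, n_components=2, n_neighbors=5):
    """Embed rows of X by solving L y = lambda D y on a symmetric kNN graph."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
    W = np.zeros((n, n))                                 # binary kNN weights
    for i in range(n):
        for j in np.argsort(d2[i])[1:n_neighbors + 1]:
            W[i, j] = W[j, i] = 1.0
    deg = W.sum(axis=1)
    L = np.diag(deg) - W                                 # graph Laplacian
    # reduce L y = lambda D y to the standard symmetric problem
    # (D^{-1/2} L D^{-1/2}) u = lambda u,  with  y = D^{-1/2} u
    Dm12 = np.diag(1.0 / np.sqrt(deg))
    _, U = np.linalg.eigh(Dm12 @ L @ Dm12)
    return (Dm12 @ U)[:, 1:n_components + 1]  # drop the trivial lambda = 0

def one_nn_accuracy(Y, labels):
    """Leave-one-out 1-NN accuracy in the embedded space."""
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return (labels[d2.argmin(axis=1)] == labels).mean()
```

[on two well-separated Gaussian blobs, leave-one-out 1-NN in the 2-D embedding recovers the labels almost perfectly; a real comparison of MVU vs. laplacian eigenmaps would swap out the embedding step and keep the classifier fixed.]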
< stephentu> ya i saw that paper
< stephentu> ok i think maybe the question i'll try to answer is
< stephentu> are there any non-trivial problems where MVU does better than something like laplacian eigenmaps
< naywhayare> I am interested to know what you find out
< stephentu> naywhayare: might be a while haha
< stephentu> but i think mlpack will benefit from this
< stephentu> i'll try to dogfood myself for a while
< stephentu> w/o going back to python haha
< naywhayare> yeah... if you notice things that are poorly designed or ugly to implement, please, we can make it better :)
< naywhayare> I've done my best to make the API easy for experiments like this, but also with time constraints; it can always be improved :)
< stephentu> yep will do
< stephentu> it's really hard to compete w/ sklearn though
< stephentu> it's just too easy to use
< stephentu> then you build huge pipelines around it
< stephentu> and get locked in
< naywhayare> yeah, and that's almost more of a Python vs. C++ difference (and it also has to do with the number of users sklearn has, which helps them with documentation and also having ready-to-go scripts and stuff like that)
< naywhayare> I think people can be convinced to convert to mlpack assuming that mlpack either offers much faster algorithms, or offers algorithms that are closer to the state-of-the-art
< naywhayare> both of these things take a lot of developer time, though...
< stephentu> the most precious of resources
curiousguy13 has quit [Ping timeout: 240 seconds]
< stephentu> naywhayare: so arpack has a way of solving generalized eigenvalue problems
< stephentu> Ly = lambda D y
< stephentu> where L, D are psd matrices
< stephentu> do you have a sense of how hard it would be to hook up armadillo code to use it
< stephentu> i assume it uses arpack to solve eigenvalue problems for sparse matrices?
< stephentu> laplacian eigenmaps requires solving such a generalized eigenvalue problem
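[editor's note: when D is positive definite, the generalized problem L y = lambda D y reduces to a standard symmetric eigenproblem, which is one way to proceed when only a plain symmetric solver is wired up. A small self-checking NumPy example on the path graph 1-2-3, chosen purely for illustration:]

```python
import numpy as np

W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])        # adjacency of the path graph 1-2-3
D = np.diag(W.sum(axis=1))          # degree matrix
L = D - W                           # combinatorial graph Laplacian

# substitute y = D^{-1/2} u:  (D^{-1/2} L D^{-1/2}) u = lambda u
Dm12 = np.diag(1.0 / np.sqrt(np.diag(D)))
lam, U = np.linalg.eigh(Dm12 @ L @ Dm12)
Y = Dm12 @ U                        # generalized eigenvectors

# each pair satisfies the original system L y = lambda D y;
# for this graph lam is exactly [0, 1, 2] (normalized Laplacian spectrum)
for k in range(3):
    assert np.allclose(L @ Y[:, k], lam[k] * (D @ Y[:, k]))
```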
curiousguy13 has joined #mlpack
jbc__ has joined #mlpack
jbc_ has quit [Ping timeout: 245 seconds]
jbc__ is now known as jbc_
stephentu has quit [Ping timeout: 252 seconds]
curiousguy13 has quit [Ping timeout: 265 seconds]
curiousguy13 has joined #mlpack
dhfromkorea has joined #mlpack
dhfromkorea has quit [Remote host closed the connection]
dhfromkorea has joined #mlpack
curiousguy13 has quit [Ping timeout: 246 seconds]
curiousguy13 has joined #mlpack
dhfromkorea has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
stephentu has joined #mlpack
< naywhayare> stephentu: Armadillo does wrap ARPACK via eigs_sym() and eigs_gen(), but that doesn't do generalized eigenvalue problems
dhfromkorea has joined #mlpack
< naywhayare> I think the assumption is that D = I, for what Armadillo has implemented
< naywhayare> it certainly wouldn't be hard to extend the code that's there to do laplacian eigenmaps
< stephentu> naywhayare: so what's the right way to do that?
< stephentu> modify armadillo?
< naywhayare> you could modify armadillo and add a function to src/mlpack/core/arma_extend/
< naywhayare> take a look at sp_auxlib_meat.hpp, either eigs_sym() or eigs_gen(), and you can see how I wrapped it for armadillo
< naywhayare> (ARPACK is frustrating and complex, but at the very least I've done the FORTRAN wrapping and you should be able to pretty easily adapt the code that's there)
< stephentu> cool thanks
< stephentu> ya i'm well aware of the fun of using ARPACK
< stephentu> or any fortran library
< stephentu> they really love those 17 parameter functions
< naywhayare> yeah...
< naywhayare> the "reverse communication interface" really entertains me
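[editor's note: the reverse communication pattern is easy to demonstrate outside FORTRAN: the solver never touches the matrix, it just hands control back and asks the caller for one matrix-vector product per step. A toy generator-based power iteration sketching that control flow — illustrative only, not ARPACK's actual interface:]

```python
import numpy as np

def power_iteration_rci(n, tol=1e-10, max_iter=1000):
    """Yields a vector x; the caller must send back A @ x."""
    x = np.ones(n) / np.sqrt(n)
    lam_old = 0.0
    for _ in range(max_iter):
        y = yield x                  # hand control back: "please apply A to x"
        lam = x @ y                  # Rayleigh quotient estimate
        x = y / np.linalg.norm(y)
        if abs(lam - lam_old) < tol:
            return                   # converged
        lam_old = lam

A = np.array([[3., 1.],
              [1., 2.]])
solver = power_iteration_rci(2)
x = next(solver)
try:
    while True:
        x = solver.send(A @ x)       # the caller performs every matvec
except StopIteration:
    pass                             # x now approximates the dominant eigenvector
```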
< stephentu> would you recommend adding a new function
< stephentu> like geneigs_sym
< stephentu> or modifying eigs_sym
< naywhayare> I'd add a new function
< naywhayare> ah, Armadillo has "eig_pair()" which solves Ax = lambda Bx, which I think is the same system you're solving
< naywhayare> so you could call it "eigs_pair()"
< naywhayare> ah, that'll make things easy for patching Armadillo upstream
< stephentu> ah
< stephentu> i see
< stephentu> so it works for dense matrices
< stephentu> but not sparse
< stephentu> got it
< naywhayare> yeah, eig_pair() probably uses whatever LAPACK functionality is there, and almost certainly finds _all_ the eigenvalues
< naywhayare> ARPACK will actually work with dense matrices too, and I have been thinking of extending Armadillo so that eigs_sym() and eigs_gen() both work with dense arguments
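[editor's note: for comparison, SciPy's eigsh() wraps the same symmetric ARPACK driver and already accepts both sparse input and the generalized form L y = lambda D y through its M= argument. A small sanity check on a 12-node ring graph — SciPy shown only as a reference point, not the Armadillo wrapping being discussed:]

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

n = 12
W = sp.diags([1.0, 1.0], [1, -1], shape=(n, n)).tolil()
W[0, n - 1] = W[n - 1, 0] = 1.0        # close the cycle
W = W.tocsr()
deg = np.asarray(W.sum(axis=1)).ravel()
D = sp.diags(deg)
L = (D - W).tocsc()

# three smallest generalized eigenpairs via shift-invert near zero
# (sigma is slightly negative so L - sigma*D stays positive definite;
#  L itself is singular, so sigma = 0 would fail to factorize)
vals, vecs = eigsh(L, k=3, M=D, sigma=-0.01)
```

[for the ring the generalized eigenvalues are 1 - cos(2*pi*k/n), so the three smallest are 0 and a repeated 1 - cos(pi/6).]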
stephentu_ has joined #mlpack
dhfromkorea has quit [Read error: Connection reset by peer]
dhfromkorea has joined #mlpack
dhfromkorea has quit [Remote host closed the connection]
< stephentu_> naywhayare: oh god this arpack stuff is horrible
dhfromkorea has joined #mlpack
< naywhayare> stephentu_: yeah, it can be. one of the nice things is that the arma::arpack namespace wraps most of the types so you don't need snaupd(), dnaupd(), znaupd(), and cnaupd(), just 'naupd()'
stephentu_ has quit [Ping timeout: 245 seconds]
stephentu_ has joined #mlpack
< naywhayare> okay, I think that the weird networking issue is solved
< naywhayare> it was either a bad ethernet cable or a bad port in the wall (potentially, and probably, both)
< naywhayare> very weird problem to track down, and I'm still not 100% sure that it's solved (I can't rule out some configuration change elsewhere on campus having caused, and then later fixed, the problem, but it sure seems like I could reproduce it by changing cable/wall port)
< zoq> Great to hear that you 'potentially' fixed the issue ... I guess the problem cost you a lot of time.
< naywhayare> I spent most of the time waiting on the networking folks to prepare a new switch to try and check the logs on their end
< naywhayare> I think the majority of what I did was unplug and replug cables for a few minutes, and do lots of pings inside while(true) loops :)
< zoq> Okay, sounds like the network team should take all the credit :)
sumedhghaisas has quit [Ping timeout: 245 seconds]
dhfromkorea has quit [Remote host closed the connection]
< naywhayare> :)
curiousguy13 has quit [Ping timeout: 252 seconds]
curiousguy13 has joined #mlpack
stephentu_ has quit [Ping timeout: 240 seconds]
stephentu_ has joined #mlpack