< naywhayare> stephentu: yeah, I was planning on submitting an application
< naywhayare> want to be a mentor? :)
< naywhayare> we'll probably have fewer students this year... last year I mentored three or four, it was a terrible idea; took way more time than I had expected
< naywhayare> this year I'll mentor one...
< stephentu> naywhayare: i'm happy to be a mentor
< stephentu> i have a few neat ideas
< stephentu> parallel optimization
< stephentu> is the theme
< stephentu> (they arent novel ideas, just it'd be great to get this stuff implemented solidly)
< stephentu> also the manifold stuff since i might not be able to get to it
< stephentu> shit is starting to pick up
< stephentu> also another idea is like trying to make mlpack more composable like sklearn
< stephentu> the pipelines abstraction
< stephentu> but thats more of a software engineering problem
< stephentu> also mlpack is very optimization / dual tree heavy
< stephentu> we're kind of light on the whole bayesian thing
< stephentu> graphical model
< stephentu> i'm not the biggest fan of that stuff, but i would be happy to mentor some implementations of say np bayes
< stephentu> even just LDA might be useful
< naywhayare> sure, I could agree with that
< naywhayare> we'll have to make sure the implementations are high-quality and fast
< naywhayare> speaking of which, I'm hoping to spend some time soon writing up more detailed style/coding guidelines, and I wouldn't mind having you (and marcus and whoever else) take a look at them too
< naywhayare> imitating scikit's functionality is fine, but we have to make sure there is some compelling reason to use mlpack instead of scikit in those cases
< stephentu> sure
< stephentu> i might be more of a fan of immutability than you are
< stephentu> but happy to take a look
< naywhayare> the idea being that once you create an optimizer/classifier/regressor/some other machine learning object, its parameters cannot be changed?
< stephentu> something like that. like it kind of bothers me that for the SDP object, you can just change the C and Ai matrices like that
< stephentu> you can even make the Ais and bs inconsistent w/ each other
< stephentu> (eg different lengths)
< naywhayare> well, I mean, we have to assume some intelligence on the part of the user
< naywhayare> but I do see what you mean
< stephentu> haha
< stephentu> but honestly after using it
< stephentu> i have found it much easier
< stephentu> to deal with
< stephentu> mutating in place
< stephentu> versus
< stephentu> constructing all args
< stephentu> and then using std::move
< stephentu> its ok now, but if we want to move to parallelism
< stephentu> having immutable objects makes reasoning about code so much easier
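The trade-off discussed above (mutating an SDP object in place versus constructing all arguments up front and handing them over with std::move) can be sketched roughly as follows. This is only an illustration under stated assumptions: ImmutableSdp, its accessors, and the Matrix alias are hypothetical names invented for this sketch and are not mlpack's actual SDP class, which uses Armadillo matrix types.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dense matrix type (assumption for this sketch).
using Matrix = std::vector<std::vector<double>>;

// Hypothetical immutable SDP description: objective matrix C, constraint
// matrices A_i, and right-hand sides b_i. Not mlpack's real API.
class ImmutableSdp
{
 public:
  // Arguments are taken by value and moved into the members, so a caller
  // can pass std::move(x) to avoid copying large matrices.
  ImmutableSdp(Matrix cIn, std::vector<Matrix> aIn, std::vector<double> bIn) :
      c(std::move(cIn)), a(std::move(aIn)), b(std::move(bIn))
  {
    // Inconsistent inputs (different numbers of A_i and b_i) are rejected
    // once, at construction, instead of being possible at any later point.
    if (a.size() != b.size())
      throw std::invalid_argument("need exactly one b entry per A matrix");
  }

  // Only const accessors are exposed, so the problem cannot be changed
  // after construction and can be read from several threads safely.
  const Matrix& C() const { return c; }
  const std::vector<Matrix>& A() const { return a; }
  const std::vector<double>& B() const { return b; }
  std::size_t NumConstraints() const { return b.size(); }

 private:
  Matrix c;
  std::vector<Matrix> a;
  std::vector<double> b;
};
```

The cost of this design is the one mentioned in the conversation: every argument has to be built before the object exists. In exchange, the length check runs exactly once and no later code path can leave the Ais and bs out of sync.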
< stephentu> do gsoc mentors get anything?
< stephentu> or is it all charity
< naywhayare> Google pays $500 per mentor plus t-shirt and possibly a trip to the mentor summit in October in San Jose (each organization gets to send two members)
< stephentu> san jose
< stephentu> that sounds so exciting
< stephentu> </sarcasm>
< naywhayare> well, for you it is not as exciting as for someone who has never been to California :)
< naywhayare> really though, the summit is a lot of fun
< naywhayare> even though San Jose isn't very exciting
< naywhayare> (it is walkable though -- I had to run halfway across town to buy a laptop because mine abruptly died there last year)
< naywhayare> (it took... about an hour to get to the nearest best buy... sigh)
< stephentu> uber?
< naywhayare> I was being cheap :)
< stephentu> i'm sure its fun i was just kidding
< naywhayare> :)
< naywhayare> they closed down Great America for us last year, it was a good time
< stephentu> oh the amusement park next to the hilton?
< naywhayare> yeah
< naywhayare> they didn't have all the rides open, but still, it was a great time, no lines or anything
< stephentu> ya i mean google has quite a bit of clout in the silicon valley
< naywhayare> marcus and I met up with the shogun guys:
< naywhayare> it's true :)
< naywhayare> we didn't find anyone from scikit or any other machine learning libraries though
< stephentu> hey which one are you
< stephentu> i realize i have no idea what you look like
< naywhayare> leftmost, the least european of the bunch :)
< stephentu> nice
< naywhayare> marcus is next to me, then the three shogun guys (one of them must have won the lottery for extra slots I guess)
< naywhayare> the whole program is a lot of fun, but it can be a bit of a time sink depending on the student
< stephentu> we get to filter students though right?
< stephentu> i actually did gsoc in undergrad
< stephentu> as a participant
< naywhayare> the first year I mentored Marcus and one other student... Marcus basically didn't need any help from me, and the other student simply disappeared
< naywhayare> ah, neat! for which organization?
< stephentu> scala
< naywhayare> and yeah, we get to select
< stephentu> i was trying to beef up the remote actors implementation
< stephentu> but it took about the entire summer to get it working
< stephentu> and then i never got to really integrate it
< stephentu> b/c actually writing real software is much harder
< naywhayare> last year I think we had 30-40 applications and accepted 5... of all the applications, probably only 6-8 were actually competitive
< naywhayare> yeah
< naywhayare> a summer isn't very much time to really get something baked into the code
< stephentu> ya it was way too ambitious
< naywhayare> the CF code has gone through two summers now and still needs some more cleanup
< stephentu> so ya i'd like to see the other side of it
< naywhayare> one of my next free-time projects was going to be to sit down with the CF code, integrate it all, and write up a nice tutorial on how to use CF with numerous different factorizations
< naywhayare> yeah, the time contribution is a bit picky
< stephentu> and we still need to hook up matrix completion
< stephentu> to CF
< naywhayare> sorry... that didn't make any sense... let me try again:
< naywhayare> the time contribution is a bit tricky
< naywhayare> like I said my two students in the first year needed basically nothing
< naywhayare> so the next year I mentored four, thinking it would be about the same
< stephentu> didn't scale?
< naywhayare> well, the students needed more help for their projects
< naywhayare> so I was spending like 15-20 hours a week on GSoC
< stephentu> gross
< naywhayare> a lot got done for mlpack, but I didn't get much research done...
< stephentu> so i was thinking since i'll prob be doing research
< stephentu> and studying for prelims
< stephentu> this would be a good way to break up the routine
< stephentu> a
< stephentu> bit
< naywhayare> yeah
< naywhayare> with just one (or even two) students it's pretty nice, maybe a handful of hours here or there to help them out
< stephentu> ya i'm thinking one good student
< naywhayare> I felt stretched really thin last summer; I didn't manage to get super-involved with any of the students
< stephentu> who is ambitious
< naywhayare> there's some code that I still haven't managed to look at and it's what, five months later?
< stephentu> even better if i can recruit them from berkeley undergrads
< naywhayare> :)
< naywhayare> I got to meet one of the students last year... he's an undergraduate at CUA in DC, and I was passing through, so I got some lunch with him
< naywhayare> there really aren't very many applicants from the US, generally, though
< naywhayare> one of the reasons is that it's not too hard to pull down ~$6k or so in a summer doing just a regular internship with a software company or whatever
< stephentu> oh ya you take a pay cut
< stephentu> but i mean
< stephentu> who wants to write like
< stephentu> another web app
< naywhayare> it's true
< stephentu> we're doing some cool shit here
< naywhayare> machine learning is a bit of a niche too, so I think we see fewer of the really excited open-source people as a result
< stephentu> actually i was at facebook though and it was a lot of fun
< stephentu> but i got really lucky
< naywhayare> well... maybe? I don't know what the typical applicant pool for GSoC looks like
< stephentu> and got to hack on the compiler
< naywhayare> that does sound like a good time
< stephentu> but most people were doing boring stuff
< stephentu> like writing php apps
< naywhayare> in 2010 I interned at Google, with the Similar Pages team, who were having me run some big mapreduces
< naywhayare> they took three days to run
< naywhayare> but my team didn't have anything else for me to do
< naywhayare> so... I played a ___lot___ of pinball
< stephentu> lol
< stephentu> oh internships
< stephentu> machine learning is niche?
< stephentu> i feel like
< stephentu> everybody out here
< stephentu> is doing machine learning
< naywhayare> I mean, I asked repeatedly, "I want to hunt bugs! Give me something to do!", and I got a lot of "well, we'll look into that... have you checked out this cafe yet?"
< naywhayare> you live in a bubble :)
< stephentu> well i used to live in cambridge, MA
< stephentu> and it was like that too
< naywhayare> cambridge is also a bubble...
< stephentu> when i was at MIT
< stephentu> everybody did machine learning
< naywhayare> yeah
< naywhayare> MIT :)
< stephentu> i guess i only live in bubbles
< stephentu> i just assume like everybody has gone through the derivation of SVM
< stephentu> in their free time
< naywhayare> maybe a better way to put what I meant is that despite the fact that everyone is talking about machine learning, the set of people who can actually write and understand machine learning algorithms is quite small
< stephentu> then why are there so many fucking papers in NIPS
< stephentu> haha
< stephentu> its insane
< stephentu> i guess the few who understand
< stephentu> really understand
< stephentu> and crank stuff out like machines
< naywhayare> from a NIPS perspective, people look like they are machines because they are part of a machine (a big research lab)
< naywhayare> speaking from experience, without a lab, it's very hard to get anything published at NIPS
< naywhayare> in part because as a one-man show it's very hard to keep your finger on the pulse of what all reviewers will say
< naywhayare> and also because in a lab, risk is distributed because papers have many co-authors
< naywhayare> so you can have some people working on the crazier ideas that might not work, and some others work on the solid ideas that might only be incremental
< naywhayare> but my opinion is based on my experience, which is only my experience and a small subset of all experience, so, YMMV :)
< stephentu> i see
< stephentu> thats interesting
< naywhayare> maybe? I dunno, it's certainly not guaranteed to be a well-informed opinion :)
< naywhayare> anyway... I'm headed to bed. I hope to find some time to work on mlpack tomorrow, but I'm pretty underwater with the ICML deadline coming up
< stephentu> later dude
