sumedh has joined #mlpack
sumedh has quit [Client Quit]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 252 seconds]
Android53 has joined #mlpack
< naywhayare> man, where is everybody? figured people would be showing up by now...
< naywhayare> wait, what the heck? Doodle gave me an incorrect time! it seems to think 11:30pm eastern time is 4:30 UTC, but it's actually 3:30 UTC!
< naywhayare> some kind of daylight savings time issue
< naywhayare> no, I'm the one who's wrong. I converted to London time to get the UTC time, but... London isn't UTC, it's BST
< naywhayare> I had made the assumption London was on GMT... I don't know why...
< marcus_zoq> hm
< naywhayare> the only person not able to make it now will be Andrew, unless he decides to stay up late
< naywhayare> so the consequences aren't terrible
< naywhayare> anyway, I've learned my lesson today: London != UTC
< marcus_zoq> Okay, see you in an hour :)
< naywhayare> actually, it's more complex than that; in the summer, London time is BST, but in the winter it's GMT
< naywhayare> daylight savings time is frustrating and overcomplex, in my opinion...
< marcus_zoq> you're right
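The conversion mix-up discussed above can be checked with a short sketch. It assumes daylight saving time is in effect on both sides (US Eastern at UTC-4, London at UTC+1) and uses a hypothetical calendar date; only the 11:30pm Eastern meeting time comes from the log.

```python
from datetime import datetime, timedelta, timezone

# Fixed summer offsets (assumption: daylight saving time is in effect).
EDT = timezone(timedelta(hours=-4), "EDT")  # US Eastern, summer
BST = timezone(timedelta(hours=1), "BST")   # London, summer
UTC = timezone.utc

# Hypothetical date; only the 11:30pm Eastern time is taken from the log.
meeting_eastern = datetime(2014, 5, 1, 23, 30, tzinfo=EDT)

# Convert the same instant into UTC and into London summer time.
meeting_utc = meeting_eastern.astimezone(UTC)
meeting_london = meeting_eastern.astimezone(BST)

print(meeting_utc.strftime("%H:%M %Z"))     # 03:30 UTC
print(meeting_london.strftime("%H:%M %Z"))  # 04:30 BST
```

Reading the London clock as if it were UTC gives 4:30 instead of 3:30, which is the one-hour error described in the log.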
< Android53> Sorry hit the wrong button
ajinkya has joined #mlpack
udit_s has joined #mlpack
oldbeardo has joined #mlpack
sumedhghaisas has joined #mlpack
< naywhayare> I'll wait a few more minutes for Anand to show up and maybe Andrew will too
< naywhayare> it turns out I screwed up -- 3:30 UTC was the time that everyone could make
< naywhayare> but due to some screwups in time zone conversions I made it 4:30 UTC
< naywhayare> so... sorry about that :(
< oldbeardo> heh, that's what I was wondering, as to why you kept it an hour late
< naywhayare> I'll give it until 4:35 UTC then I'll go ahead and start
< naywhayare> ok, well, 4:35 UTC it is.
< naywhayare> So, first of all, welcome to the Google Summer of Code! Congratulations again on your acceptance.
< naywhayare> If at some point I say something that's unclear, feel free to immediately ask a question, so we can get things cleared up immediately
< naywhayare> Right now we're in the community bonding period. This lasts until May 19, which is when coding starts. Then, on June 27th, there is the midterm evaluation, and coding ends on August 18th (then the code gets submitted to Google)
< naywhayare> Next I'll talk briefly about student expectations and mentor expectations, so that we can be sure that we're all on the same page
< naywhayare> Students are expected to work a full work week; it's okay if some weeks you work more or less than others, but in the end the amount of time put in should be equivalent to a full-time job
< naywhayare> Regular contact and communication with mentors is expected; preferably via IRC, but email is okay too. IRC is real-time, so it's often better for debugging and helping out
< naywhayare> Also, students are expected to send a weekly email to the mailing list reporting their progress. These should be concise reports; they are for the community (or whoever is on the mailing list) to have an idea of how your work is progressing
< naywhayare> Here's an example written by Marcus last year: https://mailman.cc.gatech.edu/pipermail/mlpack/2013-July/000142.html (there are lots more examples too)
< naywhayare> Please give advance warning to mentors, in case you will be away for a day or two. There's no need to let a mentor know every time that you get up for lunch or whatever, but if you'll be uncontactable for a while, we should know.
< naywhayare> Disappearing students are a common GSoC problem, so keeping in regular contact helps prevent these types of situations.
< naywhayare> We don't expect that every student completes everything laid out in their proposal. Sometimes the reality of the projects is a little more difficult. When this happens, the student and the mentor will work together to figure out what needs to be changed.
< naywhayare> We really hope that we don't have to fail any students, but we will fail a student who disappears, doesn't appear to be working the expected amount of time, or who is otherwise seriously underperforming.
< naywhayare> If any of those situations occur, the student will be made aware with warnings, so failure will not be unexpected. We hope we've chosen you guys well enough that this won't happen, but we need to lay it out so that we are all clear.
< naywhayare> Now, expectations for mentors (including myself): mentors should be in regular contact with students and be able to provide reasonably quick responses (i.e. within a day or two, if not in real-time)
< naywhayare> Mentors should work out with their students times in which they are both available regularly to work with each other, and they should be willing to help debug technical and code problems.
< naywhayare> The mentor shouldn't be doing the majority of the work, of course, but mentors are there to help and their services should be utilized when necessary.
< naywhayare> Mentors should also notify students if they'll be away for some amount of time. This is particularly relevant for me, because I'm prone to disappearing. I will be more up front with my plans over the summer so that this doesn't happen :)
< naywhayare> Lastly, the first part of the summer is the most important for helping students get started.
< naywhayare> Ok, so, before I move on, any questions about expectations? Hopefully I haven't put you all to sleep :)
< marcus_zoq> I'm awake :)
< naywhayare> Tough crowd :) anyway, moving on, I wanted to give a bit of history about the library
< sumedhghaisas> Me too I guess :)
< naywhayare> mlpack was originally designed in 2007/2008 by a group at Georgia Tech called the FASTLab. It was called "FASTLIB/MLPACK" at the time, and was primarily a collection of experimental research algorithms
< naywhayare> I wasn't involved at that time. I was hired by the FASTLab in late 2009 and took over primary maintenance from the other group, who had mostly graduated.
< naywhayare> I spent most of 2010 and 2011 working with a team of some others (mainly James Cline and Neil Slagle, neither of whom are around anymore) reworking the basic design decisions behind the library
< naywhayare> During this process we produced a design document for the future; it's now a few years old, but explains some design decisions. Maybe some of you have already seen it, but if you are interested:
< naywhayare> For the most part, until 2013, the group working on mlpack was mainly the FASTLab, at Georgia Tech in Atlanta (where I am)
< naywhayare> at the end of 2011, mlpack 1.0.0 was released with a corresponding paper at a NIPS workshop; later, in 2013, a JMLR paper. Also in 2013, we were accepted to GSoC, which has really helped the library become more than a small graduate student project
< naywhayare> but I want to clarify what happened with the FASTLab a little bit, because I often talk to people who are confused about it. many of the FASTLab students, when they graduated, founded a company that is now called Skytree
< naywhayare> it seems to be a fairly successful machine learning startup. but, when they did that, they did not maintain contact with mlpack and so Skytree isn't affiliated with mlpack (and vice versa)
< naywhayare> even so, some people who are now at Skytree occasionally contribute to mlpack (it's still LGPL code though, of course)
< naywhayare> so anyway, where I am going with this is that I want to say that mlpack is no longer a project centered at Georgia Tech, and we have contributors from all around the world (including you!)
< naywhayare> when I talk to other researchers at conferences, more and more are familiar with mlpack, and better yet, those of them that have used it always praise it
< naywhayare> so, we are doing a good job!
< naywhayare> I also want to point out that there are a few active contributors who don't hang out on IRC. Parikshit Ram, a former FASTLabber, now at Skytree, is one. Another is Ryan Birmingham (sometimes in here as 'birm') who is an undergraduate student working with me
< naywhayare> and also Michael Fox, who is working for a hedge fund in New York. I say this so that if you see random names you don't recognize in the commit logs, you can know who they are :)
< naywhayare> so that's an abridged history and a quick description of how things are (at least in my view)
< naywhayare> Lastly, I had a few random things to mention:
< naywhayare> we have a build farm (http://www.ratml.org/misc_img/build_farm.jpg) that is in my lab that I maintain. I hook up random computers that I find. If you need accounts on any of the systems there for testing purposes, let me know
< naywhayare> There are sparc64 and sparc32 systems, and I also want to add an ARM phone and an old powerpc box. These are useful for testing different configurations of mlpack (albeit maybe unnecessary, but I have fun :))
< naywhayare> these are all hooked up to the build server: http://big.cc.gt.atl.ga.us/
< naywhayare> The Trac wiki has some useful links:
< naywhayare> http://www.mlpack.org/trac/wiki/NewStyleGuidelines -- code style guidelines
< naywhayare> http://www.mlpack.org/trac/wiki/UsingCMake -- a quick CMake tutorial I wrote a long time ago
< naywhayare> http://www.mlpack.org/trac/wiki/MLPACKOnWindows -- compiling mlpack on Windows (it isn't fun or easy)
< naywhayare> http://www.mlpack.org/trac/wiki/AutomaticBenchmark -- Marcus' documentation on the automatic benchmarking system, deployed to http://www.mlpack.org/benchmark.html
< naywhayare> if you find any bugs, feel free to fix them! or, if you want to solve some Trac tickets that are unclear, feel free to ask for clarification
< naywhayare> I'd like to release mlpack 1.0.9 in the next month or so, so I'll be preparing that release. if anyone is interested in helping out with the various tasks that need to happen, let me know; there is a lot to do :)
< naywhayare> the last link I have is the mlpack-svn list, which is where the commit emails go: https://lists.cc.gatech.edu/mailman/listinfo/mlpack
Anand has joined #mlpack
< naywhayare> and finally, I want to conclude with a request that GSoC communication remains public; this helps preserve the openness of our open-source project
< naywhayare> also, if you ask a question in public, you're asking more people and are more likely to get a good response :)
< naywhayare> Anand: hey, I'll send you the logs of what I wrote before shortly
< naywhayare> so, that's everything I had to say. sorry for writing so much... any questions?
< Anand> Hi! Sorry for being late.
< Anand> I will see the logs
< naywhayare> Anand: no worries. I don't have the automatic logs set up on mlpack.org yet, but I will email these ones to you...
< Anand> If I have missed something really important, please tell me now
< marcus_zoq> I have no questions, thanks.
< ajinkya> one questions, or rather something to discuss... are we planning to have regular irc meetings or after this its mostly mentor-mentee thing (with irc support ofcourse)
< ajinkya> *question
< naywhayare> Anand: I sent you the logs; they should be in your inbox
< Anand> Yes, I got them!
< naywhayare> ajinkya: I'm not sure. I wouldn't mind the idea of regular meetings, but I think it's pretty hard to coordinate the times given how spread out we all are
< naywhayare> does anyone have any opinions on that?
< ajinkya> we can fix on one and whoever can will attend ... i dont know, just a suggestion
< oldbeardo> naywhayare: well, I think you are right, taking a poll everytime is not feasible
< udit_s> No questions from me. Everything seems pretty clear.
< Anand> It is a good idea to have such meetings on a regular basis (at least twice or thrice a month). But, given how spread out we all are, even once a month is not a bad idea!
< udit_s> We could have a fixed time where people could drop by for weekly progress and discussions if they were able to.
< naywhayare> oldbeardo: yeah; this time is okay for ajinkya and I in the US, and you guys in India, but I think poor Marcus has woken up in the middle of the night for this :(
< ajinkya> i think if we do it a bit later than this, it will still be fine in india as its day time there and we can push for Marcus' morning time
< oldbeardo> naywhayare: yes, maybe we could have a GSoC progress page, where everyone can post comments about their projects
< ajinkya> but yeh, probably can be taken offline
< naywhayare> ajinkya: yeah; that's reasonable. it means Andrew (in DC) won't be able to make it, but I don't mind staying up super late :)
< ajinkya> yeh me neither :)
< naywhayare> oldbeardo: that is a decent idea. although I think the weekly emails should take care of keeping everyone updated on what is going on
< marcus_zoq> naywhayare: me neither
< Anand> How about a common blog?
< Anand> We can write about our progress there so that every one interested can read it up.
< naywhayare> Anand: I could probably set one up on mlpack.org, if there is sufficient interest
< naywhayare> to me it's sounding like regular IRC meetings are somewhat unfeasible, and an asynchronous system where students can push updates about their projects is a better solution
< oldbeardo> I like the idea of a common blog
< Anand> Exactly
< naywhayare> okay; if someone wants to set one up, I can use mod_rewrite in apache to make it load on mlpack.org/gsocblog/ (or something like that), like I do with mlpack.org/trac/ (which is actually trac.research.cc.gatech.edu/fastlab/ )
< naywhayare> any volunteers? :)
< marcus_zoq> If there is sufficient interest, I can setup the blog.
< Anand> So instead of sending a mail on the mailing list (as Ryan mentioned in the chat that I missed), writing about the progress on the blog is a better idea
< naywhayare> I would consider either a blog update or a mailing list post as equivalent, as long as the student is regular about their updates (weekly, at least)
< sumedhghaisas> Yes even I agree with Anand...
< Anand> So, everyone ok with this? Then Marcus could set the blog up
< udit_s> yep.
< naywhayare> ok, so four of five students are interested in using a blog. I'll have to talk with Andrew separately (I assume he's asleep now; I think he wakes up early)
< oldbeardo> naywhayare, ajinkya, sumedhghaisas I had a question about my project in specific
< naywhayare> oldbeardo: yes, there will need to be a little bit of coordination between you and sumedh
< naywhayare> Google asks that students not work together on projects. however, although you are both working on CF, you are both implementing different new algorithms and won't be working together
< naywhayare> the part that will need coordination is ensuring that everyone can agree on an API that works for both of you
< sumedhghaisas> The design decisions have to be taken together I guess....
< oldbeardo> naywhayare: okay, my question was do our applications have some common algorithms?
< naywhayare> yeah; I think we will have to modify the CF abstractions to work with both of your improvements, but I think this is possible
< naywhayare> oldbeardo: in my checking during the application process, it did not seem like you were implementing common algorithms
< naywhayare> there is a similarity, but I don't think it's exact; let me pull up the applications and remember
< oldbeardo> sure
< sumedhghaisas> its about regularized SVD if I am not wrong...
< naywhayare> yeah; Siddharth proposed to implement Regularized SVD and Sumedh proposed to implement regularized ALS through NMF
< naywhayare> the two things are similar but I don't think they are identical
< oldbeardo> naywhayare: well I proposed 3 algorithms, so there's a good chance one of them will be common
< naywhayare> oldbeardo: nah, Sumedh did not propose QUIC-SVD or PMF
< oldbeardo> naywhayare: that's fine then, it should work out well
< naywhayare> either way, we will have to keep the implementations independent, at least until the summer is over
< sumedhghaisas> naywhayare: And I have 1 algorithm which uses regularized SVD... :)
< naywhayare> I assume that there will probably be some common functionality between your two projects. at the end of the summer, we can work on merging the two to reduce the codebase size; but we can't do this until the end of the summer because of Google's rules
< sumedhghaisas> so we should implement it separately?
< naywhayare> yeah; we will all have to work together to ensure the CF abstractions work with both your contributions, but for each of the individual algorithms you should be working separately
< oldbeardo> can Sumedh not use the Regularized SVD implementation that I will work on?
< naywhayare> I am not sure of this and I will have to ask Carol
< naywhayare> I wanted to wait to figure this out until the time comes for Sumedh to use regularized SVD, so that the specifics of the situation are more clear
< ajinkya> Ryan, I can do that if you want. Towards the end of the summer we can see how we can reuse the code and implementations and gel them together ?
< naywhayare> either way, I am sure we can find a solution that Google will be happy with, and that won't be any extra work for you guys
< naywhayare> ajinkya: yeah, definitely. we have to wait until after the code is submitted to Google, so, the end of August, but after that, definitely
< ajinkya> yep thats what i meant
< cuphrody> hey, I'm quite confused with the dual-tree algorithm.. is there any explanation?
< ajinkya> Google is pretty strict about these rules
< naywhayare> cuphrody: yeah. there are some papers. let me find you a link to a mailing list message that has links to papers :)
< sumedhghaisas> oldbeardo: So we will be implementing these algorithms on our own ...
< cuphrody> appreciate for that:)
< naywhayare> ajinkya: I'm not sure why Google is so strict. you could always ask Carol... I am sure she would give a very long response :)
< oldbeardo> sumedhghaisas: it all depends on what Google has to say about this
< Anand> cuphrody : A.G. Gray and A.W. Moore. "N-body problems in statistical learning." Advances in Neural Information Processing Systems (2001): 521-527.
< Anand> A.W. Moore. "Nonparametric density estimation: toward computational tractability." Proceedings of the Third SIAM International Conference on Data Mining (2003).
< oldbeardo> though I'm completely against writing multiple implementations for the same algorithm
< Anand> cuphrody: A. Beygelzimer, S. Kakade, and J.L. Langford. "Cover trees for nearest neighbor." Proceedings of the 23rd International Conference on Machine Learning (2006).
< cuphrody> oh.. so much material...:(
< naywhayare> Anand: thank you! I couldn't find a link
< Anand> cuphrody: P. Ram, D. Lee, W.B. March, A.G. Gray. "Linear-time algorithms for pairwise statistical problems." Advances in Neural Information Processing Systems (2009).
< Anand> Yes, I got all of them from Ryan!
< naywhayare> cuphrody: yeah... there is a lot of material... that paper by Ram and Lee et al is mostly theory, by the way
< cuphrody> thanks very much = =!
< sumedhghaisas> oldbeardo: yeah me too... that would be a coding overhead :)
< naywhayare> cuphrody: another possibility is "Tree-Independent Dual-Tree Algorithms", a paper we submitted to ICML last year; http://machinelearning.wustl.edu/mlpapers/papers/icml2013_curtin13
< naywhayare> cuphrody: that one might give a better overview of how dual-tree algorithms fit into mlpack
< cuphrody> that must be very useful thanks:)
< naywhayare> oldbeardo: sumedhghaisas: yeah; we will have to talk with Google and see. I am pretty convinced that the major parts of your projects do not overlap
< naywhayare> so I don't think there will be large amounts of code duplication
< naywhayare> we will definitely have a clearer picture in a few months
< oldbeardo> naywhayare: yup, thanks for the discussion
< ajinkya> yeh lets worry about that later
< udit_s> Okay guys, I'll catch you later. ryan, marcus, I'll be in touch with you.
< naywhayare> anyway, I am getting tired, but I had one more thing I wanted to do. I think most of us have talked to each other at some point, but I don't know if we know much about each other. so I wanted to give an introduction, and if anyone else does too, feel free
< udit_s> Also, the link for the blog will be up on the ML, right ?
< naywhayare> so, to start, my name is Ryan and I live in Atlanta. I've been a student at Georgia Tech for... almost a decade. When I'm not working on mlpack or research, I like to race go-karts, and sometimes I drive old cars around with friends: http://www.ratml.org/misc_img/saturn_v.thumb.png
< naywhayare> udit_s: yeah, when we get that set up we'll put a link there
< oldbeardo> wow, this is new
< sumedhghaisas> naywhayare: cool :)
udit_s has left #mlpack []
< ajinkya> naywhayare: did you steal the ramblin wreck!?
< naywhayare> ajinkya: ha! no, actually, I have a friend who has spent years restoring an old Model A and painted it to look like the wreck
< Anand> naywhayare : Cool stuff!
< naywhayare> I am lucky that he invites me along to drive it sometimes :)
< ajinkya> pretty neat that looks!
< cuphrody> my name is Jialun Wang.. I'm a Chinese student in HUST and I just want to participate in mlpack regardless summer of code :) happy to know you guys.
< Anand> Ok, so for those who do not know me : I am Anand Soni. I am a final year student pursuing my Bachelors degree in computer science from the Indian Institute of Technology Bombay. I live in Mumbai, India.
< oldbeardo> naywhayare: do you own these cars? cool rides I should say
< naywhayare> oldbeardo: no, I only own the BMW in the back. it looks a lot nicer than it actually is. :) the other two are a friend's, but we take them on trips now and then to make sure they still work
< naywhayare> cuphrody: nice to meet you! I'm glad you're interested in helping, and if you have any questions, always feel free to ask
< oldbeardo> naywhayare: the one in the middle is too good
< ajinkya> ok i ll go next
< cuphrody> naywhayare: :) I'll be online for 24 hours.
< ajinkya> my name is Ajinkya, originally from Pune, India. Graduate Georgia Tech in Dec 2013. I was part of the FastLab where MLPACK was started, and redesigned by Ryan :)
< ajinkya> I was a part for a short span then moved to the Bay on west coast for work
< jenkins-mlpack> Project mlpack - nightly matrix build build #441: STILL UNSTABLE in 1 hr 36 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/441/
< naywhayare> jenkins-mlpack: that's not a very friendly introduction. :(
< jenkins-mlpack> naywhayare did you mean me? Unknown command 'that's'
< jenkins-mlpack> Use '!jenkins help' to get help!
< ajinkya> haha
< cuphrody> = =!
< Anand> naywhayare : Am I getting ssh access to any of your machines?
< sumedhghaisas> Okay my turn ....
< naywhayare> Anand: if you need it, I can set it up. I spent some time setting up an LDAP/Kerberos server, so we only need to set up one account
< sumedhghaisas> I am Sumedh Ghaisas, 3rd year student at BITS Pilani - Goa Campus. I live in Mumbai, India. Leisure time... unn... except studying for exams everything is included in leisure time...
< sumedhghaisas> :)
< sumedhghaisas> I like playing table tennis for that matter...
< Anand> naywhayare : I think it will be good if I get one.
< naywhayare> Anand: ok, hang on a moment
< naywhayare> let's see if I remember how to do this
< Anand> naywhayare : Ok. You can send me the details later!
Android53 has quit [Remote host closed the connection]
< sumedhghaisas> naywhayare: can there be a common login there?? so u can just share that with us...
< oldbeardo> I'm Siddharth, final year student of Computer Science at BITS - Goa. I live in Mumbai, India. Sumedh and I have many things in common it seems
< naywhayare> sumedhghaisas: not a common login, but if necessary I can set you up with individual accounts. I'd like to keep the number of accounts to a minimum (for security reasons mainly) but if there's something you'd like to investigate on one of them I'll happily make you an account
< sumedhghaisas> so you need not make a logic for each one of us....
< sumedhghaisas> *login i mean
< naywhayare> nah, shared passwords are a bad idea. it's easy to make new accounts anyway
< ajinkya> MLPACK is going to be a hit in MUMBAI this year it seems
< naywhayare> oldbeardo: are you also a table tennis player like Sumedh?
< naywhayare> I have played a handful of times... I am not very good :)
< sumedhghaisas> ajinkya: plus one on that :)
< ajinkya> I suck at it
< oldbeardo> naywhayare: heh, yes I'm
< sumedhghaisas> I used to play for a club till 8th...
< naywhayare> man, you guys really do have a lot in common! :)
< sumedhghaisas> then .. studies :(
< ajinkya> :)
< ajinkya> Ok, I am off for now.. good to meet y'all
< ajinkya> happy coding!
< oldbeardo> ajinkya: see you soon :)
< sumedhghaisas> Yeah... The best part is... I didn't know about Siddharth.. until recently .... I guess it goes same for Siddharth too...
< naywhayare> ajinkya: see you later
< oldbeardo> yes, it does
< ajinkya> definitely. We ll keep in touch.
< Anand> everyone : Nice to meet you all. I need to go now! I think we will have a nice time this Summer. Let us stay connected.
< cuphrody> see u :)
< naywhayare> Anand: see you later
< naywhayare> thanks everyone for coming to this meeting, by the way
< cuphrody> :) happy to participate
< naywhayare> it's getting quite late here, so I think I'm going to go to bed soon, once I figure out how to add a new user to LDAP/Kerberos, or once I get frustrated and give up for now :)
< cuphrody> bye:)
< oldbeardo> naywhayare: thanks for the kickstarter meeting, nice meeting everyone here :)
< sumedhghaisas> Okay seems odd for an Indian (considering time)... but now time to sleep :) bye :) nice meeting everyone...
< marcus_zoq> Buy, nice to see you.
< Anand> Good bye, everyone! :)
< oldbeardo> marcus_zoq: heh, I think you meant 'bye'
Anand has quit [Quit: Page closed]
< marcus_zoq> right :)
< marcus_zoq> Maybe, I need more sleep.
< oldbeardo> haha, everyone does :)
< oldbeardo> see you all later :)
sumedhghaisas has quit [Quit: Leaving]
oldbeardo has quit [Quit: Page closed]
ajinkya has quit [Ping timeout: 240 seconds]