verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
govg has quit [Ping timeout: 260 seconds]
govg has joined #mlpack
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
govg has quit [Ping timeout: 252 seconds]
govg has joined #mlpack
govg has quit [Ping timeout: 244 seconds]
govg has joined #mlpack
keonkim has quit [Ping timeout: 250 seconds]
keonkim has joined #mlpack
govg has quit [Quit: leaving]
govg has joined #mlpack
Mathnerd314 has quit [Ping timeout: 260 seconds]
bang has joined #mlpack
bang is now known as Guest54688
Guest54688 has quit [Quit: Page closed]
govg has quit [Ping timeout: 260 seconds]
mentekid has quit [Remote host closed the connection]
mentekid has joined #mlpack
mentekid has quit [Ping timeout: 276 seconds]
mentekid has joined #mlpack
< tham> zoq : Hi, I have some questions I want to ask
< tham> About the autopilot, how do you detect obstacles?
< tham> By radar? Computer vision? Both? Or some other tech?
nilay has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
< zoq> tham: That depends on the algorithm. You can use the laser scanner at the bottom, the laser scanner on the top, or cameras.
< zoq> tham: and of course combinations
< tham> zoq : Thanks
tham has quit [Quit: Page closed]
nilay has joined #mlpack
< zoq> nilay: Hello, how are things going? Have you thought about the initial interface? If you like I can also propose something and we could go from there.
< nilay> zoq: Hi, I haven't thought about the initial interface until now. I was thinking of implementing the random forest first.
< zoq> nilay: Sounds like a good start, I think it's a good idea to discuss the initial interface of the random forest before we start coding.
< nilay> ok
< nilay> zoq: I do not get how only a decision stump can be used for a random forest.
< zoq> nilay: So, if you like I can propose something or you could do that if you like.
< zoq> nilay: I think, it would be clear how to use them in the interface.
< nilay> zoq: ok then.
< zoq> nilay: So, if you like I can come up with something that we could use as discussion basis.
< nilay> zoq: that would be good
< zoq> nilay: okay, good. I'll see if I can write something down at the end of the day, and we can probably talk about it tomorrow, if you have time?
< nilay> zoq: I had a question: when starting to code, if I make the files (random_forest.cpp, random_forest.hpp, etc.) in the methods directory, then how do I execute only them?
< nilay> zoq: end of the day by utc?
< zoq> nilay: Does the current time work for you?
< nilay> yes
< zoq> nilay: okay, good. I think the best idea is to use a unit test to do that. Note, if you build mlpack you only build changes.
< nilay> zoq: what is a unit test?
< nilay> Do you mean I just do ../cmake and it'll take care of everything?
< zoq> nilay: Also you have to write a CMakefile to build the code. Take a look at https://github.com/mlpack/mlpack/tree/master/src/mlpack/methods/decision_stump especially at the CMakeLists.txt file.
< zoq> nilay: You can basically change the files according to your project.
< nilay> zoq: I'll take a look at it. So I just write the CMake file and the other files and do ../cmake?
< zoq> nilay: yes
< zoq> nilay: About how to test your code, I'll go and send you an email in a couple of hours. I have to go to a meeting now.
< nilay> zoq: ok
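The unit-test idea zoq mentions can be illustrated with a minimal plain-C++ sketch. This is only an illustration of the concept; mlpack's real tests use a dedicated test framework, and the MajorityVote function below is hypothetical, not mlpack code:

```cpp
#include <cassert>
#include <vector>

// Hypothetical function under test (not mlpack code): majority vote
// over a non-empty set of integer labels.
int MajorityVote(const std::vector<int>& labels) {
  int best = labels[0], bestCount = 0;
  for (int candidate : labels) {
    int count = 0;
    for (int label : labels)
      if (label == candidate)
        ++count;
    if (count > bestCount) {
      bestCount = count;
      best = candidate;
    }
  }
  return best;
}

// A unit test exercises one small piece of code in isolation and checks
// its result against answers worked out by hand.
void TestMajorityVote() {
  assert(MajorityVote({1, 2, 2, 3, 2}) == 2);
  assert(MajorityVote({7}) == 7);
}
```

The point is that each new source file gets a matching test that can be built and run on its own, so changes can be verified without running a whole pipeline.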
nilay has quit [Ping timeout: 250 seconds]
nilay has joined #mlpack
mentekid has quit [Ping timeout: 276 seconds]
nilay has quit [Ping timeout: 250 seconds]
nilay has joined #mlpack
mentekid has joined #mlpack
Mathnerd314 has joined #mlpack
< zoq> nilay: I just sent you some instructions on how to create the project and how to test and run the code. Let me know if anything isn't clear.
< nilay> zoq: ok, i will try them. Can we discuss the interface?
< nilay> zoq: we need to use OpenCV for various image functions; we will just not use the trained model OpenCV provides, right?
< zoq> nilay: great, can we discuss the interface tomorrow?
< nilay> zoq: okay sure.
< nilay> zoq: can you answer my OpenCV query?
< zoq> nilay: I think we do not need any special OpenCV image functions for the project; e.g., we don't need OpenCV to compute the image gradient, we can do it ourselves in a couple of lines. Since OpenCV comes with a huge number of dependencies, it doesn't make much sense to me to install OpenCV just to use, I guess, two functions.
< nilay> zoq: if we don't use opencv, how do we read image?
< zoq> nilay: e.g. by using arma::load(...)
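As a sketch of how little code is needed to read a grayscale image without OpenCV (an alternative to the arma::load(...) route zoq mentions), here is a minimal reader for 8-bit binary PGM ("P5") files. It is an illustrative sketch, not mlpack code, and it skips PGM header comment lines:

```cpp
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Minimal reader for 8-bit binary PGM ("P5") images -- one way to load
// a grayscale image without OpenCV. Returns the pixels as rows.
// Note: real PGM headers may contain '#' comments, which this skips over.
std::vector<std::vector<unsigned char>> ReadPGM(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  std::string magic;
  int width, height, maxval;
  in >> magic >> width >> height >> maxval;
  if (!in || magic != "P5" || maxval > 255)
    throw std::runtime_error("not an 8-bit binary PGM file");
  in.get();  // consume the single whitespace byte after the header
  std::vector<std::vector<unsigned char>> image(
      height, std::vector<unsigned char>(width));
  for (auto& row : image)
    in.read(reinterpret_cast<char*>(row.data()), width);
  return image;
}
```
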
< nilay> zoq: Inspired by Lim et al. [31], we use a similar set of color and gradient channels (originally developed for fast pedestrian detection [12]). We compute three color channels in CIE-LUV color space along with normalized gradient magnitude at two scales (original and half resolution). Additionally, we split each gradient magnitude channel into four channels based on orientation. And what about this?
< zoq> nilay: Take a look at the rgb2luv function in https://github.com/ArtanisCV/StructuredForests/blob/master/utils.p .., no need to use opencv.
< zoq> nilay: The same applies for the gradient function.
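The "couple of lines" for the image gradient could be sketched as follows, using central differences on a grayscale image stored as a row-major std::vector<double>. This is a plain-C++ illustration under those assumptions, not the project's actual code:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Gradient magnitude via central differences -- the "couple of lines"
// needed instead of an OpenCV call. `img` is a grayscale image,
// row-major, `w` columns by `h` rows; coordinates are clamped at the
// borders, so edge pixels effectively use one-sided differences.
std::vector<double> GradientMagnitude(const std::vector<double>& img,
                                      int w, int h) {
  auto at = [&](int x, int y) {
    x = std::max(0, std::min(w - 1, x));
    y = std::max(0, std::min(h - 1, y));
    return img[y * w + x];
  };
  std::vector<double> mag(img.size());
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      const double gx = 0.5 * (at(x + 1, y) - at(x - 1, y));
      const double gy = 0.5 * (at(x, y + 1) - at(x, y - 1));
      mag[y * w + x] = std::hypot(gx, gy);
    }
  return mag;
}
```

The orientation channels from the paper would follow the same pattern, binning std::atan2(gy, gx) into four ranges.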
< nilay> zoq: ok
< zoq> nilay: Armadillo comes with a histogram function, so we could use that
< zoq> nilay: And also to do convolution with a triangle filter.
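The two operations zoq mentions can be sketched in plain C++ for the 1-D case; in real code Armadillo's own hist and conv functions would replace these. The fixed [1 2 1]/4 kernel below is one common choice of small triangle filter, assumed here for illustration:

```cpp
#include <vector>

// Fixed-bin histogram over [lo, hi) -- the operation arma::hist provides.
// Values outside the range are ignored.
std::vector<int> Histogram(const std::vector<double>& data,
                           int bins, double lo, double hi) {
  std::vector<int> counts(bins, 0);
  for (double v : data) {
    const int b = static_cast<int>((v - lo) / (hi - lo) * bins);
    if (b >= 0 && b < bins)
      ++counts[b];
  }
  return counts;
}

// 1-D convolution with a normalized triangle filter [1 2 1] / 4 --
// a small smoothing kernel; the signal is clamped at its edges.
std::vector<double> TriangleSmooth(const std::vector<double>& s) {
  const int n = static_cast<int>(s.size());
  std::vector<double> out(n);
  for (int i = 0; i < n; ++i) {
    const double left = s[i > 0 ? i - 1 : 0];
    const double right = s[i < n - 1 ? i + 1 : n - 1];
    out[i] = 0.25 * left + 0.5 * s[i] + 0.25 * right;
  }
  return out;
}
```
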
< nilay> zoq: ok, I get it now. Many optimizations can be done for random forests, as seen here ( https://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm ). Do we want to do them?
< nilay> zoq: Or do we just build the random forest as given in the paper? Or do you think we should discuss this tomorrow?
< zoq> nilay: I would go with the random forest as described in the paper. And do optimizations afterwards.
< nilay> zoq: ok, and how do we use the decision stump? The tree will not grow by more than one level if a decision stump is used.
< zoq> nilay: Right, we have to modify the code, or use Cloud's code that is already partially modified.
< zoq> nilay: So read images and labels -> extract features -> train structured trees -> merge ensemble -> done
< nilay> zoq: do we assume the labels are provided to us?
< zoq> nilay: Yes, we can start with a subset of the BSDS500 dataset (http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/resources.html) to test the code.
< nilay> zoq: ok, and is there an Armadillo tutorial I can look at? I cannot seem to find one with a simple Google search.
< zoq> nilay: I don't think so, but the docs (http://arma.sourceforge.net/docs.html) are pretty good.
mentekid has quit [Ping timeout: 276 seconds]
< nilay> zoq: ok, that's all my doubts for now :)
< nilay> thanks.
< zoq> nilay: Sure no problem, once we start coding all this becomes more clear :)
< zoq> I really like Andrew's blog post: http://mlpack.org/gsocblog/andrew-week-6.html ... And I guess, at some point you realize the same.
< nilay> zoq: yes, I hope so. Right now I look most of the things up too. But I guess that is the result of using so many different languages.
< nilay> zoq: The paper (Fast Edge Detection Using Structured Forests) talks about the following (Section 4, page 5, input features): "Our learning approach predicts a structured 16 x 16 segmentation mask from a larger 32 x 32 image patch." They augment the image patch to 32x32x3, but never really say how to find the structured 16x16 labels.
< zoq> nilay: yeah, the paper left out some details. In that case the dataset already contains the labels. So each pixel in the input image (or a 32 x 32 patch) has a corresponding segmentation label that we use as the label.
< nilay> zoq: why is the segmentation label 16x16 and the image patch 32x32?
wasiq has joined #mlpack
< nilay> zoq: or are there 4 such labels to cover the entire area?
< zoq> nilay: You could also use the entire area, but if I remember correctly they use 16x16 for performance reasons.
mentekid has joined #mlpack
< nilay> zoq: so we convert the 32x32 segmentation label to 16x16 by max voting (or some other metric) in each 2x2 patch?
< zoq> nilay: yes
< nilay> zoq: ok
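The 2x2 max-voting downsampling just agreed on could look like this, with labels stored as a row-major vector of ints (an illustrative plain-C++ sketch, not the project's code):

```cpp
#include <map>
#include <vector>

// Downsample a square segmentation label image by a factor of 2 using
// majority ("max") voting in each 2x2 patch -- e.g. a 32x32 label patch
// becomes 16x16. `labels` is row-major with even side length `n`;
// ties are broken toward the smallest label value.
std::vector<int> DownsampleLabels(const std::vector<int>& labels, int n) {
  const int m = n / 2;
  std::vector<int> out(m * m);
  for (int y = 0; y < m; ++y)
    for (int x = 0; x < m; ++x) {
      std::map<int, int> votes;
      for (int dy = 0; dy < 2; ++dy)
        for (int dx = 0; dx < 2; ++dx)
          ++votes[labels[(2 * y + dy) * n + (2 * x + dx)]];
      int best = -1, bestCount = 0;
      for (const auto& v : votes)
        if (v.second > bestCount) {
          bestCount = v.second;
          best = v.first;
        }
      out[y * m + x] = best;
    }
  return out;
}
```
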
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
< rcurtin> mentekid: I finished my simulations on the box with OpenBLAS, did I get you those results?
< rcurtin> ah right, I see that I did now... I agree, it seems like cutoff 0.05 or so seems to be a good choice
< rcurtin> but I'm a bit concerned because your system showed such different patterns
< mentekid> rcurtin: hey so I haven't had the time to run it properly on my system yet
< mentekid> I will run it today though so we can take a better look
< mentekid> I hope my results will agree with yours so we can decide for one version of the code and put a lid on this :)
< mentekid> Do you have time now or later to talk about the project? I have a few questions
< rcurtin> yeah, I have time now
< rcurtin> sorry for the slow response on that :)
< mentekid> no problem :)
< mentekid> So first of all regarding the blog, I think I would prefer posting something on the mlpack website. How do I do that?
< rcurtin> sure, so let's make sure that is all still working...
< rcurtin> okay, so the blog itself is still up, but I don't see any link to it from mlpack.org
< rcurtin> let me update the website... I guess I could add a link in "learn about mlpack" and under the "how can I join the mlpack community" page
< mentekid> if it's too much hassle I could always just post in the list
< mentekid> I mean if I'm the only student that prefers the blog, there's no point in making everyone check in every week instead of simply getting an email
< rcurtin> nah it's no problem---I have been wanting to use the blog more anyway
< rcurtin> for instance to write up a post "here is how I did <task> in mlpack", those can be helpful
< rcurtin> and I suspect you will not be the only one who wants to provide updates as a blog post; in 2014 that was the preference of all the students
< zoq> yeah, I think some of the others would also like to use the blog ... we just have to make sure it works as it did the last time.
< rcurtin> okay, I updated the webpage with links
< mentekid> cool then :)
< rcurtin> the blog itself is found at https://github.com/zoq/blog/
< rcurtin> and I *think* that you make a post just by writing markdown in content/blog/
< rcurtin> maybe it is more complex than that... I am about to find out :)
< zoq> That's right, just create a single markdown file and push it. I can send everyone an email with some instructions and invite everyone to the repo.
< zoq> A markdown file with some metadata at the top: https://raw.githubusercontent.com/zoq/blog/master/content/blog/AnandWeekTen.md
< mentekid> Cool that
< mentekid> (sorry pressed enter)
< mentekid> That's practical, so I just push to the repo and then it appears on the blog.
< zoq> mentekid: right
< mentekid> another question is regarding my timeline and milestones
< mentekid> I'm not sure but I think some of the deliverables can be moved a bit sooner
< mentekid> for example, I've allocated almost a week for proposing changes and getting feedback, but I believe I can do that by Sunday so I can get right to the interesting part sooner
< mentekid> (a) should I revise the milestones and resend them somewhere? (b) do you have any feedback regarding the milestones I've set?
< mentekid> (I still can't find the blog link by the way)
< rcurtin> try ctrl+r on the mlpack website
< rcurtin> the blog itself is at http://www.mlpack.org/gsocblog/
< rcurtin> let me look at your proposal once I finish this first blog post, just a moment...
< rcurtin> I thought this post would only take a minute to write... :)
sumedhghaisas has joined #mlpack
< rcurtin> zoq: I think the github webhook for the blog repo needs to be updated to point to big.mlpack.org:7780
< zoq> rcurtin: okay
< zoq> rcurtin: nice blog post :)
< rcurtin> okay, I think I got that looking decent
< rcurtin> I want to do some more CSS work with the blog site, like maybe to make the fonts the same, but maybe I will get to that later
< rcurtin> mentekid: okay, let me look at your proposal timeline now, sorry that took so long
< mentekid> it's ok, I'm setting up the timing tests now too no hurry
< rcurtin> I see what you mean about the milestones and timeline, I think you are already ahead of schedule
< rcurtin> also what latex package did you use to make that timeline?
< mentekid> it was a ripoff from stackoverflow, let me find the thread
< rcurtin> I think that there's no need to change the timeline unless you are falling far behind; 1.5 weeks for the C++ implementation might be a bit short, but LSH is a much simpler algorithm than, e.g., nearest neighbor search with trees
< rcurtin> ah, okay, it's just a pretty tabular environment; nice!
< rcurtin> yeah, I am not sure I have too much feedback on the timeline... it looks good to me
< rcurtin> if the reality deviates from the plan (like if you are running early) that's not an issue at all
< rcurtin> and if you're running behind, also not an issue, you have a bunch of "blue sky" time allocated and we can use that if necessary
< mentekid> cool then. Yeah, my concern was with how closely I will follow it; right now I have no idea if I'll fall far behind or go too fast...
< mentekid> But I've already dived into the code so I more or less know what I want to do, at least regarding the multiprobe part
< rcurtin> it's always hard to know; my personal prediction is, you will be ahead of schedule for most of the implementation, but the testing will probably go over schedule somewhat
< rcurtin> I have no idea if I will be right with that prediction though :)
< rcurtin> also, some things came today!
< mentekid> no that sounds about right I think
< rcurtin> oh... this is not the right window. oops. but I did get some packages and it was nice :)
< mentekid> by the way, I made some changes to lsh for my thesis, so it would allow me to change the projection tables (there wasn't a way to change them before)
< rcurtin> just like an accessor for the projections matrix?
< rcurtin> if you want to submit a PR for that, feel free, that could be useful to other people too
< mentekid> I started making it like that but then I saw I would end up re-implementing half a function
< mentekid> so I just added a default argument to LSHSearch.Train()
< mentekid> by default it's an empty vector, but you can change it to any vector you want
< rcurtin> why a vector and not a matrix? I thought you would want to specify all the projections, not just one
< mentekid> no I mean a std::vector of arma::mat objects
< mentekid> that's how the LSHSearch object stores the projection tables, as a vector of matrices
< rcurtin> oh, right, I misread it, I thought it was just one projection table, but yeah, it is many
< rcurtin> actually, I am not sure what I was thinking, but it was incorrect :)
< rcurtin> I guess in Train() then we need to check and make sure that if the user passed anything in the std::vector, that it is the right number of tables
< rcurtin> and throw a std::invalid_argument otherwise
< mentekid> yeah I haven't done that because it was meant for personal use so I just did a quick and dirty swap, but I should anyway
< mentekid> my purpose was to change the projections because I've found a way to make more reliable projections using PCA, but I think it could also be helpful in the testing process because now we can simply run some small examples by hand and see what happens
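The validation discussed above (checking the projection tables passed into Train() and throwing std::invalid_argument on a mismatch) might be sketched like this. The names and signature are hypothetical, not mlpack's actual API, and a plain std::vector of flat matrices stands in for std::vector<arma::mat>:

```cpp
#include <stdexcept>
#include <vector>

// Stand-in for std::vector<arma::mat>: each "table" is one flat matrix.
using ProjectionTables = std::vector<std::vector<double>>;

// Sketch of the check discussed above: if the caller supplies projection
// tables, there must be exactly `numTables` of them; an empty vector
// means "generate random projections" (the default behaviour).
// Train and numTables are illustrative names only.
void Train(size_t numTables, const ProjectionTables& projections) {
  if (!projections.empty() && projections.size() != numTables)
    throw std::invalid_argument("wrong number of projection tables");
  // ... otherwise use the supplied tables, or generate random ones ...
}
```

With a default argument of an empty ProjectionTables, existing callers keep the old random-projection behaviour while tests can inject hand-built tables.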
< rcurtin> so actually that is a bit related to the paper I am about to submit...
< rcurtin> there is this hashing algorithm for furthest neighbor search (a more niche problem)
< rcurtin> and it chooses projections randomly
< rcurtin> but when you instead choose your projections based on the data (like you might with PCA), you get an algorithm which empirically performs way better
< rcurtin> (fewer projections needed for the same performance)
< mentekid> yeah it's almost the same thing with LSH
< mentekid> if you replace just a few projections with PCs you get better results
< mentekid> is the furthest neighbor done for machine learning or other purposes by the way?
< mentekid> I've only seen it in stuff like fluid dynamics I think
< rcurtin> oh, there are applications in fluid dynamics? that is good to know
< rcurtin> it's used for a couple of embedding algorithms
< rcurtin> to be perfectly honest I didn't think it had very interesting applications, but they were specifically asking for algorithms in the CFP for the conference
< mentekid> I think they use it in Fast Multipole Method which is the most headache-inducing algorithm I've read :P
< rcurtin> ah, okay
< rcurtin> FMM... that is where dual-tree algorithms come from :)
< mentekid> which is used in fluid dynamics and electromagnetics and stuff
< mentekid> right, there you have a target and a source tree right?
< rcurtin> I can't remember a part of the FMM where furthest neighbor is used, but I didn't study it too in-depth, so maybe I should revisit it
< mentekid> I can't say I'm really familiar, but I remember the part where you find groups of points that don't "intersect"
< mentekid> I only read the algorithm once and didn't like it
< mentekid> but I watched somebody present his thesis related to it, that's where I got the impression they used fnn
< mentekid> I might be wrong though
< rcurtin> I'll read through it later and let you know... I need a better motivation for the algorithm, so I am definitely looking for applications :)
< mentekid> nice :) where are you submitting?
nilay has quit [Ping timeout: 250 seconds]
< rcurtin> SISAP ("similarity search and applications")
sumedhghaisas has quit [Ping timeout: 276 seconds]
< mentekid> oh Japan, very nice! Good luck :)
< rcurtin> yeah, we will see if it gets in... but I think I have good theory (an absolute approximation guarantee) and I have good results, so it should be no issue
< rcurtin> I have never been to Japan... I would really like to go
wasiq has quit [Ping timeout: 276 seconds]
mentekid has quit [Ping timeout: 244 seconds]
mentekid has joined #mlpack