verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
witness_ has joined #mlpack
manish7294 has joined #mlpack
vivekp has joined #mlpack
< manish7294> zoq: Ah! that looks strange, as we are not passing any initial transformation and it's being generated by shogun itself at line 56 of LMNNImp.cpp
< manish7294> I got it --- if you look at the letter dataset, the labels are in the first column of the csv file. That may be the reason for this unusual error.
< manish7294> so, the letter dataset needs an update :)
< manish7294> And regarding the 100% accuracy --- it seems strange to me because if we comment out transformedData and just compute accuracy on the original iris, we also get 100%. Which is quite strange in itself, as in iris there are some points which are just far away from their original class
manish7294 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
manish7294 has joined #mlpack
manish7294 has quit [Ping timeout: 260 seconds]
manish7294 has joined #mlpack
manish7294_ has joined #mlpack
< manish7294_> Same with the balance_scale dataset --- it too has labels in the first column
< manish7294_> Things seem to be connecting now. It seems this is the reason why I was getting 20% and 40.235% on these two and 100% on all the others.
manish7294 has quit [Ping timeout: 260 seconds]
< manish7294_> Since the query set is the same as the training set, it looks like for k = 1 shogun's kNN is predicting each point as its own nearest neighbor, hence the 100% accuracy.
< manish7294_> Let me verify this by changing the value of k
< manish7294_> Right, this time things work --- got 96.6667 on iris :)
manish7294_ has quit [Quit: Page closed]
< Atharva> sumedhghaisas: Hi Sumedh
vivekp has quit [Ping timeout: 260 seconds]
vivekp has joined #mlpack
< jenkins-mlpack> Project docker mlpack nightly build build #353: STILL UNSTABLE in 3 hr 7 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/353/
vivekp has quit [Ping timeout: 265 seconds]
vivekp has joined #mlpack
vivekp has quit [Ping timeout: 255 seconds]
vivekp has joined #mlpack
vivekp has quit [Ping timeout: 256 seconds]
vivekp has joined #mlpack
wenhao has joined #mlpack
< rcurtin> manish7294_: sorry I was unavailable over the weekend, looks like you got things figured out
< rcurtin> with mlpack kNN, you can specify just a single reference set to do all-nearest-neighbors, which won't count the nearest point
< rcurtin> er, sorry, "which won't count the point as its own nearest neighbor"
< rcurtin> but it looks like shogun doesn't have that support... I wonder if, to make it work right, you need to query for each point individually with the rest of the points as the reference set
< rcurtin> you could also ask in #shogun if there's a better way to do it
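A minimal sketch of the mlpack usage being described above, assuming mlpack 3.x; the dataset file name is hypothetical and this is not the benchmark script itself:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/neighbor_search/neighbor_search.hpp>

    using namespace mlpack;

    int main()
    {
      arma::mat dataset;
      data::Load("iris.csv", dataset, true);  // mlpack stores points as columns.

      // Monochromatic (all-nearest-neighbors) search: only a reference set is
      // given, so a point is never reported as its own nearest neighbor.
      neighbor::KNN knn(dataset);

      arma::Mat<size_t> neighbors;
      arma::mat distances;
      knn.Search(1, neighbors, distances);  // k = 1 finds the nearest *other* point.

      // neighbors(0, i) is the index of point i's nearest neighbor, never i itself.
      return 0;
    }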
manish7294 has joined #mlpack
< manish7294> rcurtin: Hi, how'd the race go?
< manish7294> things are good if we just avoid k = 1
< rcurtin> manish7294: it was two events... one was an "ironman", a 1-hour endurance race
< manish7294> so maybe we can have k = 3 for benchmarking
< manish7294> great :)
< rcurtin> the other event was a series of races, but we were randomly assigned karts, and the karts were not good, so as a result I did not do well
< rcurtin> but I still had fun :)
< manish7294> wow! you are wearing a cat t-shirt :)
< rcurtin> my concern about using k=3 is that it'll give different results than mlpack would for k=3
< rcurtin> yeah, I ordered that shirt for fun but it turned out to be made of a really nice under armour-like material (don't know what it's called)
< rcurtin> so it's perfect for activities where I sweat a lot :)
< manish7294> why do you think it will give different results
< rcurtin> the kNN classification is the weighted average of the nearest k neighbors
< rcurtin> in shogun, if the 1st nearest neighbor is always the query point, then we get different results than mlpack, where the 1st nearest neighbor is not the query point
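For illustration, with hypothetical labels: suppose a query point has label A and its three nearest other points have labels B, A, B. mlpack's k = 3 vote is over {B, A, B} and predicts B, while shogun's k = 3 vote counts the query point itself as the first neighbor, so it is over {A, B, A} and predicts A; the same k can therefore produce different classifications.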
< manish7294> Right, I totally missed that
< manish7294> Okay I will ask on #shogun
< rcurtin> it might be worth looking through their documentation to see if there is any other idea first
< rcurtin> I only took a quick look
< manish7294> sure
< manish7294> or what we can do is keep k for mlpack training 1 less than the k used in shogun accuracy prediction
< manish7294> like for k = 1, we can put shogun to k = 2
< rcurtin> no, that may still result in problems
< rcurtin> there are still situations where that could give different results for mlpack and shogun; with shogun's k one higher, the query point's own label always gets an extra vote, which can change the prediction or force a tie-break that mlpack never faces
< manish7294> okay, then I shall try to find out more
< manish7294> would it be good to discuss this problem on #shogun, mentioning this particular use case with reference to mlpack?
< rcurtin> you can mention mlpack, but I don't think the folks in there are familiar with the particular project we are doing
< rcurtin> so, up to you :) I doubt it will make much difference
< manish7294> okay I will try my best to explain myself
< zoq> rcurtin: Nice, this t-shirt is the best :)
< manish7294> zoq: Can you please check the letters and balance dataset?
< zoq> manish7294: Hold on, let me start the benchmark.
< manish7294> zoq: not that, the datasets on mlpack.org
< manish7294> they have labels in the first column
vivekp has quit [Ping timeout: 264 seconds]
< manish7294> And that may be the reason you got that error yesterday
< zoq> I see
< manish7294> and I think wine.csv doesn't have labels in it, not sure though.
< zoq> Great that you figured it out.
< manish7294> zoq: it was possible because of your debugging :)
< zoq> manish7294: for the wine dataset the last column contains the labels, but SplitTrainData should remove that part.
vivekp has joined #mlpack
< sumedhghaisas> Atharva: Hi Atharva
< manish7294> zoq: Thanks! I was unsure about it as I only had a very quick glance over it.
< sumedhghaisas> Sorry little busy today. Is it possible to catch up tomorrow?
< zoq> ShikharJ: Great blog update, always nice to see good results.
< Atharva> sumedhghaisas: Sure Sumedh! I have rebased the repar PR, do check it when you get time. Till then I will complete the second PR.
< sumedhghaisas> Atharva: ahh I did give it a look.
< sumedhghaisas> there seems to be some static code check errors.
< sumedhghaisas> although I am not sure how to solve them?
vivekp has quit [Ping timeout: 245 seconds]
< Atharva> Yeah, I read the details, they are due to the empty constructor in the repar layer. But all the other layers have a constructor like that one.
< sumedhghaisas> hmm... let's ask Marcus then :)
< sumedhghaisas> zoq: Hey Marcus
< sumedhghaisas> Are we following the static code checks for the ann layers?
vivekp has joined #mlpack
< zoq> sumedhghais: Yes, but sometimes the static analysis check returns nonsense. You are talking about #1420?
< manish7294> zoq: Currently we have two elements (timing, Accuracy) in the metrics of LMNN, but only timing is being shown as the benchmarks execute.
< zoq> manish7294: You are right, we should print everything.
< zoq> Atharva: We can ignore the first two, I guess you could use arma::ones<arma::Mat<eT> > instead of arma::ones<arma::Mat<eT>> to solve the two; not sure.
< zoq> Atharva: You can fix the third issue by setting latentSize, stochastic, and includeKl to zero inside the constructor initialization list, I think.
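A minimal sketch of the fix being suggested, as a hypothetical repar-layer skeleton; only the member names latentSize, stochastic, and includeKl are taken from the discussion, everything else is illustrative:

    #include <armadillo>  // for arma::mat
    #include <cstddef>    // for size_t

    template<typename InputDataType = arma::mat,
             typename OutputDataType = arma::mat>
    class ReparLayer
    {
     public:
      // Initialize the members in the constructor initialization list so the
      // static analysis check no longer flags them as uninitialized, while the
      // constructor body stays empty like the other ann layers.
      ReparLayer() :
          latentSize(0),
          stochastic(false),
          includeKl(false)
      {
        // Nothing to do here.
      }

     private:
      size_t latentSize;
      bool stochastic;
      bool includeKl;
    };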
< manish7294> zoq: So, is this a problem with my code, or does it need to be implemented in the benchmarks?
< zoq> manish7294: This has to be implemented inside the main benchmark script; we could open an issue and maybe someone (including myself) will pick it up in the next few days?
< manish7294> zoq: sounds good :)
< zoq> manish7294: can you open the issue?
< manish7294> zoq: I will try to, if I can.
< zoq> manish7294: Thanks, if not I can open one later today.
< manish7294> zoq: sure, will open in about an hour from now.
< manish7294> or, I think I can do it now also :)
< zoq> wenhao: Really interesting results, do you think you could accumulate the results over multiple runs and include the runtime as another metric?
< sumedhghaisas> zoq: yes #1420
< sumedhghaisas> I looked at the static code analysis result. Don't know if it makes sense
< zoq> wenhao: Interesting, it looks like there is no difference between the different search policies.
< zoq> sumedhghais: Right, we can ignore the first two, see my comments above.
< rcurtin> manish7294: I'll provide some updated comments later today or tomorrow... after a week off, I have a lot to catch up on it seems...
< manish7294> rcurtin: Great :)
ImQ009 has joined #mlpack
< rcurtin> wenhao: I took a look at #1410, there is some really nice refactoring there. thank you for your hard work!
< rcurtin> one cool thing also is that with the NeighborSearchPolicy templatized, it would be possible to plug in LSH instead of tree-based kNN if a user wanted it
< rcurtin> I guess that could be useful if the rank of the decomposition was very large, so that the kNN search was high-dimensional, where LSH could perform better
< Atharva> I was thinking about upgrading to 18.04, has anybody encountered any problems with mlpack on it?
< rcurtin> I haven't had any problems
< ShikharJ> Atharva: Retreat comrade, before you regret!
< rcurtin> (with mlpack that is)
< Atharva> ShikharJ: Why what happened?
< rcurtin> I use debian unstable on most systems anyway :)
< Atharva> rcurtin: How is the nvidia driver support on 18.04, have they improved it?
< rcurtin> I haven't had any issues; sometimes the driver version is a little bit old, but if you install via apt there are no issues I've seen
< ShikharJ> Atharva: Lots of issues with boot and shutdown routines, no bumblebee support for switching off the GPU, lots of apt packages don't work.
< rcurtin> oh my, maybe don't take my word for it then :)
< Atharva> Getting mixed reviews here :p
< ShikharJ> Atharva: If your current system works fine, don't make the mistake of upgrading it.
< ShikharJ> There's a reason why people still use 14.04
< Atharva> Maybe I will wait then, I don't have any problems with 16.04 other than the fact that Nvidia drivers are a bit hard to get started with
< ShikharJ> Atharva: It would only be harder with the newer versions; you see, Nvidia provides software after a particular Linux OS is released. Till then only the older ones are provided, which may not even work.
< ShikharJ> You might want to try 17.10, it's a major improvement over 16.04.
< ShikharJ> 17.10 and 18.04 are pretty similar, you wouldn't even feel the difference.
< rcurtin> my recommendation, which may be more work, would be to switch to debian unstable, which is what ubuntu is a derivative of
< Atharva> ShikharJ: Thanks! I will give this a thought over the next weekend.
< rcurtin> but it depends on personal preference. I like minimal systems so debian is a good starting place for me
< Atharva> rcurtin: Will it be suitable for someone like me who doesn't have a lot of experience with different linux distributions?
< Atharva> How different is it from ubuntu? Are any commands different?
< rcurtin> Atharva: I would say it depends on how comfortable you are with the command line. Ubuntu is generally understood to be "easier" and typically has nicer GUI tools and everything
< rcurtin> but I prefer working with the command-line wherever possible, so Debian is fine for me. both are built on apt, so the process of installing and upgrading packages is roughly the same
< rcurtin> but again, I would say, give it a shot if you like, but maybe you might not like it. only one way to find out :)
< Atharva> rcurtin: Yes, only one way to find out. If I do it I will let you know how I find it to be. :)
< rcurtin> ah, sorry, I did say "debian unstable" but I would recommend instead to use "debian testing"
< rcurtin> releases happen much less often than Ubuntu's
< Atharva> Okay, thanks, I will check it out.
< rcurtin> so debian stable ("stretch") is a little old, but testing ("buster") will have more up-to-date packages
< Atharva> You said you use 18.04, does it kind of run parallel to ubuntu?
< rcurtin> no, I was using 18.04 in a docker container for some unrelated work at Symantec
< rcurtin> but I have built mlpack in that same setup, no problems
< Atharva> Okayy
< ShikharJ> zoq: Where does math::MakeAlias originate from?
< rcurtin> ShikharJ: do you mean which file?
< ShikharJ> zoq: I don't think there's support for arma::cube for that.
< ShikharJ> This needs to be extended
< rcurtin> it's in src/mlpack/core/math/, and yeah, if you want to add cube support it would be great
< ShikharJ> rcurtin: Thanks!
< rcurtin> of course, happy to help :)
< ShikharJ> rcurtin: Is there a better way of doing `math::MakeAlias(const_cast<arma::cube>(arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize)), false);`? Since this is not a pointer or a reference, it leads to compilation issues.
< rcurtin> the inner arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize) is already most of the way to an alias
< ShikharJ> But we need to cast away the constness.
< rcurtin> if you add 'false' as a fifth constructor parameter, it's completely an alias
< rcurtin> I guess I am not fully seeing the need for the MakeAlias() function, since it basically just calls the advanced constructor you've already written there
< ShikharJ> Yeah, I misrepresented the question.
< ShikharJ> The question is how to remove the constness from an object like arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize, false, false);
< rcurtin> but that shouldn't give you a const object, I don't think
< rcurtin> if the problem is that input is const, you can do 'arma::cube(const_cast<arma::cube>(input).memptr(), inputWidth, inputHeight, inSize * batchSize, false);'
< ShikharJ> rcurtin: That's exactly the issue, I'll give it a try.
< rcurtin> yeah, I think that is basically the exact code that's already a part of MakeAlias() for vectors and matrices, but I am not 100% sure (I am not looking at it right this moment)
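A minimal sketch of the alias construction being discussed, assuming input is a const arma::mat whose memory holds inputWidth * inputHeight * inSize * batchSize elements; the wrapper function is purely illustrative:

    #include <armadillo>

    // Build a non-owning cube over the memory of a const input matrix, without
    // copying. Note that const_cast needs a reference (or pointer) target type;
    // casting to a plain object type such as arma::cube does not compile.
    void BuildCubeAlias(const arma::mat& input,
                        const size_t inputWidth,
                        const size_t inputHeight,
                        const size_t inSize,
                        const size_t batchSize)
    {
      // Advanced cube constructor: (aux_mem, n_rows, n_cols, n_slices,
      // copy_aux_mem = false, strict = false) reuses the existing memory.
      arma::cube inputTemp(const_cast<arma::mat&>(input).memptr(),
                           inputWidth, inputHeight, inSize * batchSize,
                           false, false);

      // inputTemp is now an alias: writing to it writes into input's memory.
      (void) inputTemp;
    }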
< rcurtin> ok, finally, our mlpack paper is accepted into the Journal of Open Source Software: http://joss.theoj.org/papers/10.21105/joss.00726
< ShikharJ> Congrats rcurtin and everyone!
< zoq> rcurtin: Awesome! Thanks for keeping up with all the comments!
< rcurtin> it was really useful to have random new people come try to use the software and post their feedback; we found a lot of documentation issues
< rcurtin> I think I will try and find random people on the street and give them a few dollars to try it out and see what they think :)
< ShikharJ> rcurtin: Just curious, does mlpack submit a paper every time a new version is released?
< rcurtin> no, we submitted one for the original release to a NIPS workshop and then submitted a longer version to the JMLR open source software track
< rcurtin> but version 3 is so different and has so much more, it was time to submit somewhere again
< rcurtin> maybe it would be useful to submit another paper for version 4? I am not sure, let's see what happens when we get there :)
< ShikharJ> rcurtin: Ah, I see, hope to stick around till that time :)
< zoq> rcurtin: That is an interesting idea, I suppose they are somewhat familiar with toolboxes.
< rcurtin> yeah, this is a big problem that I think JOSS is thinking about... how often to submit a new paper for a new version?
< rcurtin> I don't think we could submit to JMLR MLOSS again
< zoq> rcurtin: I guess it makes sense to at least allow someone to update the paper in some form, e.g. to add new names.
< ShikharJ> rcurtin: Could we look for other journals (such as maybe PeerJ)?
< zoq> rcurtin: But with all the effort they put into the review, I'm not sure they have the manpower to do a review for every version.
< rcurtin> right, I think it might be difficult for them to provide reviewers. so I suspect they would frown upon us submitting a new version every month
< rcurtin> (that would also make it really hard for people to know what to cite when they use it)
< rcurtin> ShikharJ: I don't know about PeerJ, but you're right, maybe there are other efforts out there
< zoq> agreed, good point
manish7294 has quit [Ping timeout: 255 seconds]
ImQ009 has quit [Quit: Leaving]
wenhao has quit [Ping timeout: 260 seconds]