#mlpack on 2018-06-18 — irc logs at libera.irclog.whitequark.org

2015-01-15 23:05 verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/

01:53 witness_ has joined #mlpack

02:09 manish7294 has joined #mlpack

02:14 vivekp has joined #mlpack

02:22 < manish7294> zoq: Ah! that's looks strange, as we are not passing any initial transformation and it's being generated by shogun itself at line 56 of LMNNImp.cpp

02:25 < manish7294> I got it, if you see letter dataset---the labels are in the first column of csv file. That may be the reason of this unusual error.

02:26 < manish7294> so, letter dataset needs an update :)

02:30 < manish7294> And regarding 100% accuracy----It seems strange to me because if we comment out transformedData and just carry out accuracy on the original iris, then also we get 100%. Which is quite strange in itself as in iris there are some points which are just far away from there original class

02:31 manish7294 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]

02:32 manish7294 has joined #mlpack

02:41 manish7294 has quit [Ping timeout: 260 seconds]

02:48 manish7294 has joined #mlpack

02:48 < manish7294> https://pasteboard.co/HqoR4mi.png

02:57 manish7294_ has joined #mlpack

02:57 < manish7294_> Same is with balance_scale dataset --- it too have labels in first column

02:58 < manish7294_> Things seems to be connecting now, It seems this is the reason why I was getting 20% and 40.235% on these two and 100% on all others.

03:00 manish7294 has quit [Ping timeout: 260 seconds]

03:00 < manish7294_> Since we have query set same as training set, It looks like for k = 1 the shogun's knn is prediciting itself as the nearest point and hence the 100% accuracy.

03:01 < manish7294_> Let me verify this by changing the value of k

03:32 < manish7294_> Right, This time things work --- got 96.6667 on iris :)

05:00 manish7294_ has quit [Quit: Page closed]

08:48 < Atharva> sumedhghaisas: Hi Sumedh

09:11 vivekp has quit [Ping timeout: 260 seconds]

09:16 vivekp has joined #mlpack

10:21 < jenkins-mlpack> Project docker mlpack nightly build build #353: STILL UNSTABLE in 3 hr 7 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/353/

11:40 vivekp has quit [Ping timeout: 265 seconds]

11:46 vivekp has joined #mlpack

12:02 vivekp has quit [Ping timeout: 255 seconds]

12:05 vivekp has joined #mlpack

12:19 vivekp has quit [Ping timeout: 256 seconds]

12:20 vivekp has joined #mlpack

13:00 wenhao has joined #mlpack

13:13 < rcurtin> manish7294_: sorry I was unavailable over the weekend, looks like you got things figured out

13:14 < rcurtin> with mlpack kNN, you can specify just a single reference set to do all-nearest-neighbors, which won't count the nearest point

13:14 < rcurtin> er, sorry, "which won't count the point as its own nearest neighbor"

13:15 < rcurtin> but it looks like shogun doesn't have that support... I wonder if to make it work right, if you need to query for each point individually with the rest of the points as the reference set

13:15 < rcurtin> you could also ask in #shogun if there's a better way to do it

13:16 manish7294 has joined #mlpack

13:18 < manish7294> rcurtin: Hi, how's race went?

13:18 < manish7294> things are good if we just avoid k = 1

13:18 < rcurtin> manish7294: it was two events... one was an "ironman", a 1-hour endurance race

13:18 < manish7294> so maybe we can have k = 3 for benchmarking

13:19 < rcurtin> in that one, I placed 2nd: https://scontent-dfw5-1.xx.fbcdn.net/v/t1.0-9/35362869_1677295408985256_4960853661157687296_o.jpg?_nc_cat=0&oh=e6817d90c07c60182d5f4bdc708f756b&oe=5BA5CC32

13:19 < manish7294> great :)

13:19 < rcurtin> the other event was a series of races, but we are randomly assigned karts, and the karts were not good, so as a result I did not do well

13:19 < rcurtin> but I still had fun :)

13:20 < manish7294> wow! you are wearing cat tshirt :)

13:20 < rcurtin> my concern about using k=3 is that it'll give different results than mlpack would for k=3

13:20 < rcurtin> yeah, I ordered that shirt for fun but it turned out to be made of a really nice under armour-like material (don't know what it's called)

13:20 < rcurtin> so it's perfect for activities where I sweat a lot :)

13:21 < manish7294> why do you think it will give different results

13:23 < rcurtin> the kNN classification is the weighted average of the nearest k neighbors

13:23 < rcurtin> in shogun, if the 1st nearest neighbor is always the query point, then we get different results than mlpack, where the 1st nearest neighbor is not the query point

13:23 < manish7294> Right, I totally missed that

13:24 < manish7294> Okay I will ask on #shogun

13:25 < rcurtin> it might be worth looking through their documentation to see if there is any other idea first

13:25 < rcurtin> I only took a quick look

13:25 < manish7294> sure

13:27 < manish7294> or what we can do is keep k for mlpack training 1 less than the k used in shogun accuracy prediction

13:27 < manish7294> like for k = 1, we can put shogun to k = 2

13:27 < rcurtin> no, that may still result in problems

13:28 < rcurtin> there are still situations where that could give different results for mlpack and shogun

13:28 < manish7294> okay, then I shall try to find out more

13:30 < manish7294> will it be good to discuss this problem mentioning this particular with reference to mlpack on #shogun

13:32 < rcurtin> you can mention mlpack, but I don't think the folks in there are familiar with the particular project we are doing

13:32 < rcurtin> so, up to you :) I doubt it will make much difference

13:35 < manish7294> okay I will try my best to explain myself

13:39 < zoq> rcurtin: Nice, this t-shirt is the best :)

13:50 < manish7294> zoq: Can you please check the letters and balance dataset?

13:52 < zoq> manish7294: Hold on let me start the benchmark,.

13:53 < manish7294> zoq: not that, the datasets on the mlpack.org

13:53 < manish7294> they have labels in the first column

13:53 vivekp has quit [Ping timeout: 264 seconds]

13:53 < manish7294> And that may be the reason you got that error yesterday

13:54 < zoq> I see

13:55 < manish7294> and I think wine.csv doesn't have labels in it, not sure though.

13:55 < zoq> Great that you figured it out.

13:56 < manish7294> zoq: was possible because of your debugging :)

13:58 < zoq> manish7294: for the wine dataset the last column contains the labels, but SplitTrainData should remove that part.

14:01 vivekp has joined #mlpack

14:04 < sumedhghaisas> Atharva: Hi Atharva

14:05 < manish7294> zoq: Thanks! I was unsure about it as I only had a very quick glance over it.

14:05 < sumedhghaisas> Sorry little busy today. Is it possible to catch up tomorrow?

14:06 < zoq> ShikharJ: Great blog update, always nice to see good results.

14:06 < Atharva> sumedhghaisas: Sure Sumedh! I have rebased the repar PR, do check it when you get time. Till then I will complete the second PR.

14:07 < sumedhghaisas> Atharva: ahh I did give it a look.

14:07 < sumedhghaisas> there seems to be some static code check errors.

14:07 < sumedhghaisas> although I am not sure how to solve them?

14:08 vivekp has quit [Ping timeout: 245 seconds]

14:08 < Atharva> Yeah I read the details, they are due to the emtpy constructor in the repar layer. But all the other layers have a constructor like that one.

14:09 < sumedhghaisas> hmm... lets ask Marcus then :)

14:09 < sumedhghaisas> zoq: Hey Marcus

14:10 < sumedhghaisas> Are we following static code check for ann layers?

14:10 vivekp has joined #mlpack

14:11 < zoq> sumedhghais: Yes, but sometimes the static analysis check returns nonsense. You are talking about #1420?

14:13 < manish7294> zoq: Currently we have two elements (timing, Accuracy) in metrics of LMNN but only timing is being shown as benchmarks execute.

14:14 < zoq> manish7294: You are right, we should print everything.

14:14 < zoq> Atharva: http://masterblaster.mlpack.org/job/pull-requests-mlpack-static-code-analysis/1680/cppcheckResult/source.all/?before=5&after=5

14:15 < zoq> Atharva: We can ignore the first two, I guess you could use arma::ones<arma::Mat<eT> > instead of arma::ones<arma::Mat<eT>> to solve the two; not sure.

14:16 < zoq> Atharva: You can fix the third issue setting latentSize, stochastic, includeKl to zero? inside the constructor initialization list.

14:17 < manish7294> zoq: So, is this a problem from my code or it's need to be implemented in benchmarks ?

14:21 < zoq> manish7294: This has to be implemented inside the main benchmark script; we could open an issue and maybe someone (including myself) will pick it up in the next days?

14:22 < manish7294> zoq: sounds good :)

14:23 < zoq> manish7294: can you open the issue?

14:23 < manish7294> zoq: I too will try if I can.

14:23 < zoq> manish7294: Thanks, if not I can open one later today.

14:23 < manish7294> zoq: sure, will open in about an hour from now.

14:24 < manish7294> or,I think I can do it now also :)

14:28 < zoq> wenhao: Really interesting results, do you think you could accumulate the results over multiple runs and include the runtime as another metric?

14:35 < sumedhghaisas> zoq: yes #1420

14:36 < sumedhghaisas> I looked at the static code analysis result. Don't know if it makes sense

14:37 < zoq> wenhao: Interesting looks like, there is no difference between the different search policies.

14:37 < zoq> sumedhghais: Right, we can ignore the first two, see my comments above.

14:38 < rcurtin> manish7294: I'll provide some updated comments later today or tomorrow... after a week off, I have a lot to catch up on it seems...

14:38 < manish7294> rcurtin: Great :)

14:41 ImQ009 has joined #mlpack

15:02 < rcurtin> wenhao: I took a look at #1410, there is some really nice refactoring there. thank you for your hard work!

15:03 < rcurtin> one cool thing also is that with the NeighborSearchPolicy templatized, it would be possible to plug in LSH instead of tree-based kNN if a user wanted it

15:03 < rcurtin> I guess that could be useful if the rank of the decomposition was very large, so that the kNN search was high-dimensional, where LSH could perform better

17:26 < Atharva> I was thinking about upgrading to 18.04, has anybody encountered any problem with mlpack on it?

17:27 < rcurtin> I haven't had any problems

17:27 < ShikharJ> Atharva: Retreat comrade, before you regret!

17:27 < rcurtin> (with mlpack that is)

17:27 < Atharva> ShikharJ: Why what happened?

17:28 < rcurtin> I use debian unstable on most systems anyway :)

17:28 < Atharva> rcurtin: How is the nvidia driver support on 18.04, have they improved it?

17:28 < rcurtin> I haven't had any issues, sometimes the driver version is a little bit old, but if you install via apt there's no issues I've seen

17:28 < ShikharJ> Atharva: Lots of issues with boot and shutdown routines, no bumblebee support for switching off GPU, lots of apts don't work.

17:29 < rcurtin> oh my, maybe don't take my word for it then :)

17:29 < Atharva> Getting mixed reviews here :p

17:29 < ShikharJ> Atharva: If your current system works fine, don't make the mistake of upgrading it.

17:30 < ShikharJ> There's a reason why people still use 14.04

17:30 < Atharva> Maybe I will wait then, I don't have any problems with 16.04 other than the fact that Nvidia drivers are a bit hard to get started with

17:31 < ShikharJ> Atharva: It would only be harder with the newer versions, you see, Nvidia provides software after a particular Linux OS is released. Till then only the older ones are provided, which may not even work.

17:32 < ShikharJ> You might want to try 17.10, it's a major improvement over 16.04.

17:32 < ShikharJ> 17.10 and 18.04 are pretty similar, you wouldn't even feel the difference.

17:34 < rcurtin> my recommendation, which may be more work, would be to switch to debian unstable, which is what ubuntu is a derivative of

17:35 < Atharva> ShikharJ: Thanks! I will give this a thought over the next weekend.

17:35 < rcurtin> but it depends on personal preference. I like minimal systems so debian is a good starting place for me

17:36 < Atharva> rcurtin: Will it be suitable for someone like me who doesn't have a lot of experience with different linux distributions?

17:36 < Atharva> How different is it from ubuntu? Are any commands different?

17:36 < rcurtin> Atharva: I would say it depends on how comfortable you are with the command line. Ubuntu is generally understood to be "easier" and typically has nicer GUI tools and everything

17:37 < rcurtin> but I prefer working with the command-line wherever possible, so Debian is fine for me. both are built on apt, so the process of installing and upgrading packages is roughly the same

17:37 < rcurtin> but again, I would say, give it a shot if you like, but maybe you might not like it. only one way to find out :)

17:38 < Atharva> rcurtin: Yes, only one way to find out. If I do it I will let you know how I find it to be. :)

17:38 < rcurtin> ah, sorry, I did say "debian unstable" but I would recommend instead to use "debian testing"

17:38 < rcurtin> the releases are much less often than Ubuntu

17:39 < Atharva> Okay, thanks, I will check it out.

17:39 < rcurtin> so debian stable ("stretch") is a little old, but testing ("buster") will have more up-to-date packages

17:40 < Atharva> You said you use 18.04, does it kind of run parallel to ubuntu?

17:40 < rcurtin> no, I was using 18.04 in a docker container for some unrelated work at Symantec

17:40 < rcurtin> but I have built mlpack in that same setup, no problems

17:41 < Atharva> Okayy

18:11 < ShikharJ> zoq: Where does math::MakeAlias originate from?

18:11 < rcurtin> ShikharJ: do you mean which file?

18:11 < ShikharJ> zoq: I don't think there's a support for arma::cube for that.

18:11 < ShikharJ> This needs to be extended

18:11 < rcurtin> it's in src/mlpack/core/math/, and yeah, if you want to add cube support it would be great

18:13 < ShikharJ> rcurtin: Thanks!

18:13 < rcurtin> of course, happy to help :)

18:26 < ShikharJ> rcurtin: Is there a better way of doing `math::MakeAlias(const_cast<arma::cube>(arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize)), false);`, since this is not a pointer or a reference, so it leads to compilation issues.

18:27 < rcurtin> the inner arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize) is already most of the way to an alias

18:27 < ShikharJ> But we need to cast away the constness.

18:27 < rcurtin> if you add 'false' as a fifth constructor parameter, it's completely an alias

18:28 < rcurtin> I guess, I am not fully seeing the need for the MakeAlias() function since the MakeAlias() function basically just calls the advanced constructor you've already written there

18:29 < ShikharJ> Yeah, I misrepresented the question.

18:30 < ShikharJ> The question is how to remove the constness from an object like arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize, false, false);

18:30 < rcurtin> but that shouldn't give you a const object, I don't think

18:30 < rcurtin> if the problem is that input is const, you can do 'arma::cube(const_cast<arma::cube>(input).memptr(), inputWidth, inputHeight, inSize * batchSize, false);

18:30 < rcurtin> '

18:31 < ShikharJ> rcurtin: That's exactly the issue, I'll give it a try.

18:31 < rcurtin> yeah, I think that is basically the exact code that's already a part of MakeAlias() for vectors and matrices, but I am not 100% sure (I am not looking at it right this moment)

19:13 < rcurtin> ok, finally, our mlpack paper is accepted into the Journal of Open Source Software: http://joss.theoj.org/papers/10.21105/joss.00726

19:24 < ShikharJ> Congrats rcurtin and everyone!

19:25 < zoq> rcurtin: Awesome! Thanks for keeping up with all the comments!

19:26 < rcurtin> it was really useful to have random new people come try to use the software and post their feedback; we found a lot of documentation issues

19:26 < rcurtin> I think I will try and find random people on the street and give them a few dollars to try it out and see what they think :)

19:27 < ShikharJ> rcurtin: Just curious, does mlpack submit a paper everytime a new version release happens?

19:27 < rcurtin> no, we submitted one for the original release to a NIPS workshop and then submitted a longer version to the JMLR open source software track

19:27 < rcurtin> but version 3 is so different and has so much more, it was time to submit somewhere again

19:28 < rcurtin> maybe it would be useful to submit another paper for version 4? I am not sure, let's see what happens when we get there :)

19:28 < ShikharJ> rcurtin: Ah, I see, hope to stick around till that time :)

19:28 < zoq> rcurtin: That is an interesting idea, I suppose they are somewhat familiar with toolboxes.

19:29 < rcurtin> yeah, this is a big problem that I think JOSS is thinking about... how often to submit a new paper for a new version?

19:29 < rcurtin> I don't think we could submit to JMLR MLOSS again

19:31 < zoq> rcurtin: I guess it makes sense to at least allow someone to update the paper in some forms e.g. add new names.

19:32 < ShikharJ> rcurtin: Could we look for other journals (such as maybe PeerJ)?

19:33 < zoq> rcurtin: But with all the effort they put into the review not sure they have the manpower to do a review for every version.

19:33 < rcurtin> right, I think it might be difficult for them to provide reviewers. so I suspect they would frown upon us submitting a new version every month

19:33 < rcurtin> (that would also make it really hard for people to know what to cite when they use it)

19:34 < rcurtin> ShikharJ: I don't know about PeerJ, but you're right, maybe there are other efforts out there

19:34 < zoq> agreed, good point

20:07 manish7294 has quit [Ping timeout: 255 seconds]

20:16 ImQ009 has quit [Quit: Leaving]

22:34 wenhao has quit [Ping timeout: 260 seconds]