verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
witness_ has joined #mlpack
manish7294 has joined #mlpack
vivekp has joined #mlpack
< manish7294>
zoq: Ah! that looks strange, as we are not passing any initial transformation and it's being generated by shogun itself at line 56 of LMNNImpl.cpp
< manish7294>
I got it: if you look at the letter dataset --- the labels are in the first column of the CSV file. That may be the reason for this unusual error.
< manish7294>
so, the letter dataset needs an update :)
< manish7294>
And regarding the 100% accuracy --- it seems strange to me, because if we comment out transformedData and just compute the accuracy on the original iris, we also get 100%. That is quite strange in itself, as iris has some points which are just far away from their original class
manish7294 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
< manish7294_>
Same with the balance_scale dataset --- it too has labels in the first column
< manish7294_>
Things seem to be connecting now. It seems this is why I was getting 20% and 40.235% on these two and 100% on all the others.
manish7294 has quit [Ping timeout: 260 seconds]
< manish7294_>
Since the query set is the same as the training set, it looks like for k = 1 shogun's kNN is predicting each point as its own nearest neighbor, and hence the 100% accuracy.
< manish7294_>
Let me verify this by changing the value of k
< manish7294_>
Right, this time things work --- got 96.6667 on iris :)
< rcurtin>
manish7294_: sorry I was unavailable over the weekend, looks like you got things figured out
< rcurtin>
with mlpack kNN, you can specify just a single reference set to do all-nearest-neighbors, which won't count the point as its own nearest neighbor
< rcurtin>
but it looks like shogun doesn't have that support... I wonder if, to make it work right, you need to query each point individually with the rest of the points as the reference set
< rcurtin>
you could also ask in #shogun if there's a better way to do it
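For reference, a minimal sketch of the single-reference-set mode described above, using the mlpack 3.x API (the dataset path is a placeholder): constructing NeighborSearch without a separate query set runs a monochromatic search, so no point is returned as its own neighbor.

```cpp
#include <mlpack/core.hpp>
#include <mlpack/methods/neighbor_search/neighbor_search.hpp>

using namespace mlpack;
using namespace mlpack::neighbor;

int main()
{
  arma::mat dataset;
  data::Load("iris.csv", dataset, true); // placeholder path

  // Monochromatic all-nearest-neighbors: with no separate query set,
  // each point's own entry is excluded from its neighbor list.
  NeighborSearch<NearestNeighborSort> knn(dataset);

  arma::Mat<size_t> neighbors;
  arma::mat distances;
  knn.Search(1, neighbors, distances); // a true 1-nearest-neighbor per point
}
```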
manish7294 has joined #mlpack
< manish7294>
rcurtin: Hi, how did the race go?
< manish7294>
things are good if we just avoid k = 1
< rcurtin>
manish7294: it was two events... one was an "ironman", a 1-hour endurance race
< manish7294>
so maybe we can have k = 3 for benchmarking
< rcurtin>
the other event was a series of races, but we were randomly assigned karts, and the karts were not good, so as a result I did not do well
< rcurtin>
but I still had fun :)
< manish7294>
wow! you are wearing a cat t-shirt :)
< rcurtin>
my concern about using k=3 is that it'll give different results than mlpack would for k=3
< rcurtin>
yeah, I ordered that shirt for fun, but it turned out to be made of a really nice Under Armour-like material (don't know what it's called)
< rcurtin>
so it's perfect for activities where I sweat a lot :)
< manish7294>
why do you think it will give different results
< rcurtin>
the kNN classification is a vote over the labels of the nearest k neighbors (possibly distance-weighted)
< rcurtin>
in shogun, if the 1st nearest neighbor is always the query point, then we get different results than mlpack, where the 1st nearest neighbor is not the query point
< manish7294>
Right, I totally missed that
< manish7294>
Okay I will ask on #shogun
< rcurtin>
it might be worth looking through their documentation to see if there is any other idea first
< rcurtin>
I only took a quick look
< manish7294>
sure
< manish7294>
or what we can do is keep the k for mlpack one less than the k used for shogun's accuracy prediction
< manish7294>
like for k = 1, we can put shogun to k = 2
< rcurtin>
no, that may still result in problems
< rcurtin>
there are still situations where that could give different results for mlpack and shogun
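To see why, a toy sketch (the tie-breaking rule here is an assumption, not documented shogun behavior): with the query point included, shogun's k = 2 vote contains the point's own label, so whenever the true nearest neighbor disagrees with it, the outcome can differ from mlpack's k = 1.

```cpp
#include <iostream>
#include <vector>

// Toy majority vote over binary labels, ordered nearest-first, with ties
// broken toward the nearest neighbor (an assumed tie-break rule).
int Vote(const std::vector<int>& labels)
{
  int ones = 0;
  for (const int l : labels)
    ones += l;
  const int zeros = (int) labels.size() - ones;
  if (ones == zeros)
    return labels[0]; // tie: the nearest neighbor wins
  return (ones > zeros) ? 1 : 0;
}

int main()
{
  // The query point has label 0; its true nearest neighbor has label 1.
  std::cout << Vote({ 1 }) << std::endl;    // mlpack, k = 1: predicts 1.
  std::cout << Vote({ 0, 1 }) << std::endl; // shogun, k = 2 (self first): 0.
}
```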
< manish7294>
okay, then I shall try to find out more
< manish7294>
would it be good to discuss this problem on #shogun, mentioning this particular use case with reference to mlpack?
< rcurtin>
you can mention mlpack, but I don't think the folks in there are familiar with the particular project we are doing
< rcurtin>
so, up to you :) I doubt it will make much difference
< manish7294>
okay I will try my best to explain myself
< zoq>
rcurtin: Nice, this t-shirt is the best :)
< manish7294>
zoq: Can you please check the letters and balance datasets?
< zoq>
manish7294: Hold on, let me start the benchmark.
< manish7294>
zoq: not that --- the datasets on mlpack.org
< manish7294>
they have labels in the first column
vivekp has quit [Ping timeout: 264 seconds]
< manish7294>
And that may be the reason you got that error yesterday
< zoq>
I see
< manish7294>
and I think wine.csv doesn't have labels in it, not sure though.
< zoq>
Great that you figured it out.
< manish7294>
zoq: it was only possible because of your debugging :)
< zoq>
manish7294: for the wine dataset the last column contains the labels, but SplitTrainData should remove that part.
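In case it helps, this is roughly what that split amounts to (a generic Armadillo/mlpack sketch, not the actual SplitTrainData code; the path is a placeholder):

```cpp
#include <mlpack/core.hpp>

int main()
{
  // wine.csv: feature columns plus a trailing label column.
  arma::mat raw;
  mlpack::data::Load("wine.csv", raw, true); // placeholder path

  // mlpack loads data transposed (points as columns), so the CSV's last
  // column becomes the last row of 'raw'.
  arma::Row<size_t> labels =
      arma::conv_to<arma::Row<size_t>>::from(raw.row(raw.n_rows - 1));
  arma::mat features = raw.rows(0, raw.n_rows - 2);
}
```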
vivekp has joined #mlpack
< sumedhghaisas>
Atharva: Hi Atharva
< manish7294>
zoq: Thanks! I was unsure about it as I only had a very quick glance over it.
< sumedhghaisas>
Sorry, a little busy today. Is it possible to catch up tomorrow?
< zoq>
ShikharJ: Great blog update, always nice to see good results.
< Atharva>
sumedhghaisas: Sure Sumedh! I have rebased the repar PR, do check it when you get time. Till then I will complete the second PR.
< sumedhghaisas>
Atharva: ahh I did give it a look.
< sumedhghaisas>
there seem to be some static code check errors.
< sumedhghaisas>
although I am not sure how to solve them?
vivekp has quit [Ping timeout: 245 seconds]
< Atharva>
Yeah, I read the details; they are due to the empty constructor in the repar layer. But all the other layers have a constructor like that one.
< sumedhghaisas>
hmm... let's ask Marcus then :)
< sumedhghaisas>
zoq: Hey Marcus
< sumedhghaisas>
Are we following the static code checks for the ann layers?
vivekp has joined #mlpack
< zoq>
sumedhghais: Yes, but sometimes the static analysis check returns nonsense. You are talking about #1420?
< manish7294>
zoq: Currently we have two elements (timing, Accuracy) in the LMNN metrics, but only timing is being shown as the benchmarks execute.
< zoq>
manish7294: You are right, we should print everything.
< zoq>
Atharva: We can ignore the first two; I guess you could use arma::ones<arma::Mat<eT> > instead of arma::ones<arma::Mat<eT>> to solve those two; not sure.
< zoq>
Atharva: You can fix the third issue by setting latentSize, stochastic, and includeKl to zero inside the constructor initialization list.
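A minimal sketch of that suggestion (the member names come from the chat; the class shape and types are assumptions):

```cpp
#include <cstddef>

// Hypothetical skeleton of the reparametrization layer, showing only the
// default-initialization pattern suggested above.
class Repar
{
 public:
  // Initialize every member in the initialization list so the static
  // analyzer sees no uninitialized fields.
  Repar() : latentSize(0), stochastic(false), includeKl(false)
  { /* Nothing to do here. */ }

 private:
  std::size_t latentSize;
  bool stochastic;
  bool includeKl;
};
```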
< manish7294>
zoq: So, is this a problem with my code, or does it need to be implemented in the benchmarks?
< zoq>
manish7294: This has to be implemented inside the main benchmark script; we could open an issue and maybe someone (including myself) will pick it up in the next few days?
< manish7294>
zoq: sounds good :)
< zoq>
manish7294: can you open the issue?
< manish7294>
zoq: I will try too, if I can.
< zoq>
manish7294: Thanks, if not I can open one later today.
< manish7294>
zoq: sure, will open in about an hour from now.
< manish7294>
or, I think I can do it now as well :)
< zoq>
wenhao: Really interesting results, do you think you could accumulate the results over multiple runs and include the runtime as another metric?
< sumedhghaisas>
zoq: yes #1420
< sumedhghaisas>
I looked at the static code analysis result. Don't know if it makes sense
< zoq>
wenhao: Interesting, looks like there is no difference between the different search policies.
< zoq>
sumedhghais: Right, we can ignore the first two, see my comments above.
< rcurtin>
manish7294: I'll provide some updated comments later today or tomorrow... after a week off, I have a lot to catch up on it seems...
< manish7294>
rcurtin: Great :)
ImQ009 has joined #mlpack
< rcurtin>
wenhao: I took a look at #1410, there is some really nice refactoring there. thank you for your hard work!
< rcurtin>
one cool thing also is that with the NeighborSearchPolicy templatized, it would be possible to plug in LSH instead of tree-based kNN if a user wanted it
< rcurtin>
I guess that could be useful if the rank of the decomposition was very large, so that the kNN search was high-dimensional, where LSH could perform better
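For the curious, mlpack's LSHSearch exposes essentially the same interface as NeighborSearch, so the swap would look roughly like this (a sketch with placeholder data and default hash parameters):

```cpp
#include <mlpack/core.hpp>
#include <mlpack/methods/lsh/lsh_search.hpp>

using namespace mlpack;
using namespace mlpack::neighbor;

int main()
{
  // Placeholder data: 1000 random points in 100 dimensions.
  arma::mat referenceData(100, 1000, arma::fill::randu);

  // Approximate kNN via locality-sensitive hashing; attractive when the
  // dimensionality is high enough that tree-based search degrades.
  LSHSearch<> lsh(referenceData);

  arma::Mat<size_t> neighbors;
  arma::mat distances;
  lsh.Search(5, neighbors, distances);
}
```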
< Atharva>
I was thinking about upgrading to 18.04, has anybody encountered any problem with mlpack on it?
< rcurtin>
I haven't had any problems
< ShikharJ>
Atharva: Retreat comrade, before you regret!
< rcurtin>
(with mlpack that is)
< Atharva>
ShikharJ: Why what happened?
< rcurtin>
I use debian unstable on most systems anyway :)
< Atharva>
rcurtin: How is the nvidia driver support on 18.04, have they improved it?
< rcurtin>
I haven't had any issues, sometimes the driver version is a little bit old, but if you install via apt there's no issues I've seen
< ShikharJ>
Atharva: Lots of issues with the boot and shutdown routines, no bumblebee support for switching off the GPU, lots of apt packages don't work.
< rcurtin>
oh my, maybe don't take my word for it then :)
< Atharva>
Getting mixed reviews here :p
< ShikharJ>
Atharva: If your current system works fine, don't make the mistake of upgrading it.
< ShikharJ>
There's a reason why people still use 14.04
< Atharva>
Maybe I will wait then, I don't have any problems with 16.04 other than the fact that Nvidia drivers are a bit hard to get started with
< ShikharJ>
Atharva: It would only be harder with the newer versions; you see, Nvidia provides software only after a particular Linux OS is released. Till then only the older drivers are available, which may not even work.
< ShikharJ>
You might want to try 17.10, it's a major improvement over 16.04.
< ShikharJ>
17.10 and 18.04 are pretty similar, you wouldn't even feel the difference.
< rcurtin>
my recommendation, which may be more work, would be to switch to debian unstable, which is what ubuntu is a derivative of
< Atharva>
ShikharJ: Thanks! I will give this a thought over the next weekend.
< rcurtin>
but it depends on personal preference. I like minimal systems so debian is a good starting place for me
< Atharva>
rcurtin: Will it be suitable for someone like me who doesn't have a lot of experience with different linux distributions?
< Atharva>
How different is it from ubuntu? Are any commands different?
< rcurtin>
Atharva: I would say it depends on how comfortable you are with the command line. Ubuntu is generally understood to be "easier" and typically has nicer GUI tools and everything
< rcurtin>
but I prefer working with the command-line wherever possible, so Debian is fine for me. both are built on apt, so the process of installing and upgrading packages is roughly the same
< rcurtin>
but again, I would say, give it a shot if you like, but maybe you might not like it. only one way to find out :)
< Atharva>
rcurtin: Yes, only one way to find out. If I do it I will let you know how I find it to be. :)
< rcurtin>
ah, sorry, I did say "debian unstable" but I would recommend instead to use "debian testing"
< rcurtin>
the releases are much less frequent than Ubuntu's
< Atharva>
Okay, thanks, I will check it out.
< rcurtin>
so debian stable ("stretch") is a little old, but testing ("buster") will have more up-to-date packages
< Atharva>
You said you use 18.04, does it kind of run parallel to ubuntu?
< rcurtin>
no, I was using 18.04 in a docker container for some unrelated work at Symantec
< rcurtin>
but I have built mlpack in that same setup, no problems
< Atharva>
Okayy
< ShikharJ>
zoq: Where does math::MakeAlias originate from?
< rcurtin>
ShikharJ: do you mean which file?
< ShikharJ>
zoq: I don't think there's support for arma::cube for that.
< ShikharJ>
This needs to be extended
< rcurtin>
it's in src/mlpack/core/math/, and yeah, if you want to add cube support it would be great
< ShikharJ>
rcurtin: Thanks!
< rcurtin>
of course, happy to help :)
< ShikharJ>
rcurtin: Is there a better way of doing `math::MakeAlias(const_cast<arma::cube>(arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize)), false);`? Since this is not a pointer or a reference, it leads to compilation issues.
< rcurtin>
the inner arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize) is already most of the way to an alias
< ShikharJ>
But we need to cast away the constness.
< rcurtin>
if you add 'false' as a fifth constructor parameter, it's completely an alias
< rcurtin>
I guess I am not fully seeing the need for the MakeAlias() function, since it basically just calls the advanced constructor you've already written there
< ShikharJ>
Yeah, I misrepresented the question.
< ShikharJ>
The question is how to remove the constness from an object like arma::cube(input.memptr(), inputWidth, inputHeight, inSize * batchSize, false, false);
< rcurtin>
but that shouldn't give you a const object, I don't think
< rcurtin>
if the problem is that input is const, you can do 'arma::cube(const_cast<arma::cube&>(input).memptr(), inputWidth, inputHeight, inSize * batchSize, false);'
< ShikharJ>
rcurtin: That's exactly the issue, I'll give it a try.
< rcurtin>
yeah, I think that is basically the exact code that's already a part of MakeAlias() for vectors and matrices, but I am not 100% sure (I am not looking at it right this moment)
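By analogy with the Mat/Row/Col versions in src/mlpack/core/math/, a cube overload might look like this (a sketch, not the code that was eventually merged):

```cpp
#include <armadillo>

namespace mlpack {
namespace math {

// Make `input` usable under a second name without copying its memory.
// The fifth constructor argument (copy_aux_mem = false) makes it an alias.
template<typename ElemType>
arma::Cube<ElemType> MakeAlias(arma::Cube<ElemType>& input,
                               const bool strict = true)
{
  return arma::Cube<ElemType>(input.memptr(), input.n_rows, input.n_cols,
      input.n_slices, false, strict);
}

} // namespace math
} // namespace mlpack
```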
< zoq>
rcurtin: Awesome! Thanks for keeping up with all the comments!
< rcurtin>
it was really useful to have random new people come try to use the software and post their feedback; we found a lot of documentation issues
< rcurtin>
I think I will try and find random people on the street and give them a few dollars to try it out and see what they think :)
< ShikharJ>
rcurtin: Just curious, does mlpack submit a paper every time a new version is released?
< rcurtin>
no, we submitted one for the original release to a NIPS workshop and then submitted a longer version to the JMLR open source software track
< rcurtin>
but version 3 is so different and has so much more, it was time to submit somewhere again
< rcurtin>
maybe it would be useful to submit another paper for version 4? I am not sure, let's see what happens when we get there :)
< ShikharJ>
rcurtin: Ah, I see, hope to stick around till that time :)
< zoq>
rcurtin: That is an interesting idea, I suppose they are somewhat familiar with toolboxes.
< rcurtin>
yeah, this is a big problem that I think JOSS is thinking about... how often to submit a new paper for a new version?
< rcurtin>
I don't think we could submit to JMLR MLOSS again
< zoq>
rcurtin: I guess it makes sense to at least allow someone to update the paper in some form, e.g. to add new names.
< ShikharJ>
rcurtin: Could we look for other journals (such as maybe PeerJ)?
< zoq>
rcurtin: But with all the effort they put into the review, I'm not sure they have the manpower to do a review for every version.
< rcurtin>
right, I think it might be difficult for them to provide reviewers. so I suspect they would frown upon us submitting a new version every month
< rcurtin>
(that would also make it really hard for people to know what to cite when they use it)
< rcurtin>
ShikharJ: I don't know about PeerJ, but you're right, maybe there are other efforts out there