verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Mohith has joined #mlpack
< Mohith> hello guyz
Mohith has quit [Ping timeout: 250 seconds]
sumedhghaisas has joined #mlpack
Stellar_Mind has joined #mlpack
Stellar_Mind has quit [Ping timeout: 276 seconds]
< marcosirc> clear
marcosirc has quit [Quit: WeeChat 1.4]
Stellar_Mind has joined #mlpack
mohiths has joined #mlpack
< mohiths> hello i need help
< mohiths> i installed visual studio 2015
Stellar_Mind has quit [Ping timeout: 264 seconds]
< mohiths> when i clicked manage nuget packages for solution
< mohiths> it is showing me error that no projects are supported by nuget
< mohiths> please help me
Stellar_Mind has joined #mlpack
Mathnerd314 has quit [Ping timeout: 272 seconds]
< mohiths> hello
< mohiths> Can anyone help me out!
< keonkim> hello
< keonkim> mohiths: I wrote this tutorial -> http://keon.io/mlpack-on-windows.html
< mohiths> yeah i'm following that only
< mohiths> but when i clicked manage nuget packages for solution
< keonkim> mohiths: yap
< mohiths> it is showing an error that no projects are supported by nuget
< mohiths> There is some problem with nuget
< keonkim> hmm... unfortunately I don't use vs anymore, don't know if I can help without recreating the situation
< keonkim> what step were you on before getting that error message?
< mohiths> oh should i switch to linux now ?
< mohiths> is that the only solution that i have?
< mohiths> ok wait let me explain ......
< mohiths> I'm in step 2 of your tutorial
< mohiths> i created a project file using existing code
< mohiths> then it is given that i need to go to tools and click nuget package manager and then nuget packages for solutions
< mohiths> and there i'm getting error that no projects are supported by nuget
< mohiths> hey bro there ?
< keonkim> are you using the latest VS 2015?
< keonkim> yup
< keonkim> I was searching on google
< keonkim> :)
< mohiths> yeah VS 2015
< keonkim> is it updated after Oct 2015?
< mohiths> yeah It's the latest version
< keonkim> hmm strange, there is not much I can find on the internet
< mohiths> yeah me too
< mohiths> should i switch to linux ? is it easier ?
< keonkim> at least for me using it on linux is much easier.
< mohiths> oh good
< mohiths> can you please send me any links .. how to install mlpack on linux?
< mohiths> also tell me which version of linux is better ?
< keonkim> I use ubuntu. you can follow the README on github: https://github.com/mlpack/mlpack
< mohiths> which version of ubuntu ?
< keonkim> Any stable version should be fine.
< keonkim> I tried with 14.04 and 16.04
< mohiths> okay
< mohiths> what about 15.10?
< keonkim> 15.10 I never tried.
< keonkim> but it should be fine.
< mohiths> okay thank you
< keonkim> 14.04 and 16.04 are Long Term Support versions
< mohiths> time has come to switch to linux
< keonkim> so I recommend those
< mohiths> ok i'm installing 14.04
< mohiths> thanks a lot
mohiths has quit [Quit: Page closed]
< keonkim> but if you are comfortable with windows, fixing it should take less time than learning a Linux environment from scratch.
nilay has joined #mlpack
mentekid has quit [Ping timeout: 244 seconds]
mentekid has joined #mlpack
TD has quit [Ping timeout: 250 seconds]
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
Stellar_Mind has quit [Ping timeout: 244 seconds]
nilay has quit [Quit: Page closed]
sumedhghaisas has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
marcosirc has joined #mlpack
Mathnerd314 has joined #mlpack
TD has joined #mlpack
< TD> Has anyone ever installed mlpack and created a working solution in a windows environment?
< marcosirc> Hi! I think tham uses mlpack in windows. But he is not online now...
< TD> Okay, just wanted to know if this was possible. I'm stuck in between trying to fix a data feed so it works in Linux or trying to install mlpack in Windows.
< mentekid> didn't keon write a nice tutorial for mlpack on windows 10?
< mentekid> I think I saw a link a few days ago floating around... Let me check (or have you seen that?)
< TD> Yeah, and it worked great. Mlpack is successfully installed. When I try to build a custom solution though, I am getting errors that certain files can't be opened. Reached out to VS and they told me to contact the creator of the library
< TD> certain libraries
< TD> I haven't given up but just wanted to know if someone has programmed a successful solution
< TD> hope :-0
< zoq> TD: Does the same error occur if you open one of the mlpack executables?
< TD> zoq: That's my hope right now
< TD> I will let you know
< TD> I am going to try it tonight
< zoq> okay, thanks
< TD> persistently stupid has always been my strong point in solving problems :-)
< TD> Or stupid persistence
< TD> It's a combination
< zoq> nah, I'm sure, we can figure it out somehow
nilay has joined #mlpack
kwikadi has quit [Remote host closed the connection]
kwikadi has joined #mlpack
< marcosirc> rcurtin: are you online?
< rcurtin> marcosirc: yeah, I just got back from lunch
< marcosirc> rcurtin: ok! I have included more documentation as requested in the PR.
< marcosirc> And, also, I fixed an error in serialization.
< marcosirc> Loading was not working.
< rcurtin> loading of NSModel was not working?
< rcurtin> I had tested that, but it is possible something changed. one of the problems is, we only have tests for C++ code, not really for the command-line programs
< rcurtin> I think that at some point we should figure out how to test the command-line programs and integrate that with the rest of the mlpack tests
< rcurtin> but I haven't had a chance to look into it fully (I think maybe that should be next after the 2.0.2 release, which would have happened except I found a bug...)
< marcosirc> Yeah. I agree, I was planning to include some tests for serialization of NSModel.
< rcurtin> the documentation looks great, thanks for taking the time to do that
< marcosirc> Thanks!
< rcurtin> ah, I see the serialization issue you were talking about now
< rcurtin> I think we could probably make boost::variant serialize the right thing, but then we would need to have it hold something like SecondShim<NeighborSearch<...>> instead of NeighborSearch<...> objects
< marcosirc> yeah. I have thought of two different ways of fixing it.
< rcurtin> and I am not sure that is worth the extra effort
< rcurtin> have you seen src/mlpack/core/data/serialization_shim.hpp? lots of effort for such a minor change ...
< marcosirc> Yeah I have seen that...
< marcosirc> I have fixed serialization. Now it works. I use boost serialization for variants.
< rcurtin> yeah, I agree, the solution you pushed to the PR looks good to me
< marcosirc> Ok. Thanks. That is all I wanted to know.
< rcurtin> sure, glad I could help. today is a paper deadline day so I am not able to commit so much time to mlpack today, unfortunately...
< marcosirc> rcurtin: sure, no problem! Thanks for your time.
< mentekid> rcurtin: ok so all the vectorization amounted to nothing
< mentekid> search is 2-3 times slower now :(
< rcurtin> hmm :(
< rcurtin> well I still think it is useful that you wrote the code, because now we know :)
< rcurtin> do you have the code pasted somewhere so I can glance at it?
< rcurtin> maybe I will have an idea to speed it up, or, maybe we will be forced to conclude that it wasn't helpful
< mentekid> I thought it might be the sorting, but I commented it out and still got bad results so it's not that
< rcurtin> it looks like you are calculating all of the reference set distances at once, then sorting and inserting
< rcurtin> I wonder if you might be better off sequentially calculating each distance, like the BaseCase() loop
< rcurtin> but it sounds like maybe you have done some profiling of what is fast and what is slow and maybe even that would not speed things up
< mentekid> I think the main reason the vectorization is faster is because I'm doing the matrix vector multiplication
< mentekid> and I saw some pretty good cpu usage at that point, close to 400% for my 4-core machine
< mentekid> which is openBLAS using parallelism in the background
< mentekid> I can try doing it as you say sequentially though, which will indeed skip the sorting
< mentekid> But still sorting doesn't seem to be the problem in the end
< rcurtin> so the cost of assembling the vector of norms is too high then, I guess?
< mentekid> let me profile the code with armadillo debugging symbols so I know, but I guess that's the waste yeah
< mentekid> I guessed if we did enough calculations it would offset that, but apparently the candidate sets are smaller so it's too hard to balance it out
< marcosirc> Hi @zoq
< marcosirc> to benchmark some libraries all I have to do is make run... and make reports.... isn't it?
< zoq> marcosirc: If you are going to use the javascript interface there is no need to run 'make reports', but you need to set LOG=True. Also you could set BLOCK and METHODBLOCK to speed things up:
< zoq> make CONFIG=commit-benchmark.yaml MLPACK_BIN=/home/marcus/workspace/mlpack-release/bin/ MLPACK_BIN_DEBUG=/home/marcus/workspace/mlpack-release/bin/ BLOCK=mlpack METHODBLOCK=LSH LOG=True run
< zoq> If you need a machine to run some benchmarks ... I can provide access to a machine that already comes with a working benchmark system setup
< marcosirc> zoq: thanks for your reply.
< marcosirc> I don't know why I can't see any graphic...
< marcosirc> I execute make run .. as you suggested
< marcosirc> then I go to the reports directory, and I execute: "python -m SimpleHTTPServer"
< marcosirc> when I open the html page and select different options, nothing is shown...
< zoq> for any view?
< marcosirc> yes. I can't see any graphic for any view.
< zoq> hm, maybe the database is empty, can you test it with: http://big.mlpack.org/job/benchmark%20-%20mlpack%20-%20nightly/ws/reports/benchmark-daily.db
< marcosirc> mmm I don't think so because I can see the different options and methods that I have run.
< marcosirc> I will try that anyway.
< marcosirc> Mmm with your db everything works fine... it is strange... I will clean everything and start again.
< zoq> hm, can you run the benchmark with LOG=False and check if you get any results?
< marcosirc> Ok, I will check.
< marcosirc> with LOG=False it is exactly the same. I will pull the last changes from your repo and try again.
< zoq> with LOG=False the results are printed to stdout
< marcosirc> Yeah, they were printed.
< zoq> hm, can you send me the database file?
< marcosirc> zoq: I sent you an email with the db file.
< mentekid> rcurtin: sorry for being late with my response. I just completed the profiling. It seems like the final sorting actually does account for a part of the delay - around 30% of total time
< mentekid> but that alone doesn't justify the times I saw, where my "optimized" code was 2-3 times slower than the original
< mentekid> also for some reason the sift dataset resists being profiled, i have no idea why
< rcurtin> what do you mean? I dunno how a dataset can resist :)
< mentekid> I think it's just that it is small so the profiler is confused in the noise
< rcurtin> maybe use a larger version? :)
< mentekid> yeah that's what I'm doing now
< rcurtin> :)
< zoq> marcosirc: Thanks, let me take a look.
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
nilay has quit [Ping timeout: 250 seconds]
< zoq> marcosirc: Does "options: '-k 3 --seed 42 -e 0.05'" end with a space in your config file?
< marcosirc> zoq: yes.
< marcosirc> is that the problem?
< zoq> yes I think so, rc.param_name = param_name_full.split("(")[0].replace(/^\s+|\s+$/g, ''); also truncates the whitespace at the end, so if we query the database with methods.parameters == rc.param_name it doesn't match
< zoq> Can you rerun the benchmark without the white space at the end?
< zoq> that's definitely an issue, I'll fix it later today
< marcosirc> zoq: yeah! that was the problem.
< marcosirc> Now it works fine. Thanks. I think it would have taken me a lot of time to find this problem.
< zoq> you're welcome, I guess, the regex isn't that easy to read
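
The mismatch zoq describes can be reproduced in a few lines. This is only a sketch of the idea, not the benchmark system's code: the stored string, the "(3 runs)" label suffix, and the names `trim`, `stored_parameters`, and `param_name_full` are assumptions made for illustration. Only the regex is taken from the message above.

```python
import re


def trim(text):
    # Same effect as the JavaScript replace(/^\s+|\s+$/g, '') quoted above:
    # drop whitespace at the start and at the end of the string.
    return re.sub(r"^\s+|\s+$", "", text)


# Hypothetical database value: the options string was saved with the
# trailing space it had in the config file.
stored_parameters = "-k 3 --seed 42 -e 0.05 "

# Hypothetical label on the report page; everything before "(" is the name.
param_name_full = "-k 3 --seed 42 -e 0.05  (3 runs)"
param_name = trim(param_name_full.split("(")[0])

# Comparing the raw stored value with the trimmed name fails ...
exact_match = stored_parameters == param_name

# ... while trimming both sides before comparing succeeds.
normalized_match = trim(stored_parameters) == param_name
```

Trimming on only one side of the comparison is what makes the query return nothing, so normalizing the value when it is written or when it is queried would both avoid the problem.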
< mentekid> rcurtin: different datasets, similar results. I re-ran the timing test as well (with sorting, without sorting, with the old code) and the times seem to more or less agree with the profiler... So I guess it was a bad idea :/
< mentekid> And it's a bummer because OpenBLAS was running on all 4 cores and hitting good cpu usage... I guess we'll have to do parallelization ourselves
< mentekid> I'll start with the parallel query processing as we discussed. If you come up with any ideas to maybe improve the vectorized code let me know :)
< mentekid> I'll actually run a few datasets and parameters at night because all my tests were using default parameters
< rcurtin> openblas on all cores is still underperforming the existing approach?
< rcurtin> hm... that intuitively seems incorrect to me...
< rcurtin> but I guess it is possible that the cost of calculating that norms vector is just too high
< mentekid> But it shouldn't be - that's what I find so weird: We already do that calculation at least once, so why would doing it once in the beginning for all points be that wasteful...
< rcurtin> let me take a closer look at the code...
< mentekid> please do, I'm starting to believe I've done something stupid without realizing it
< rcurtin> I think the call to .cols() should be avoided; that will assemble a copy of the data matrix
< rcurtin> better to just loop over all the refIndices points and calculate their dot products
< rcurtin> as a bonus when you do it that way there is no need to store the distances of all candidates and sort them
< mentekid> you mean around line 528 right?
< mentekid> where I create a copy of the reference set
< rcurtin> yep, line 528
< mentekid> I see, I'll try doing it like BaseCase - maybe that will work better
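
A rough sketch of the sequential approach rcurtin suggests, written in plain Python rather than C++/Armadillo so it runs on its own. It is not mlpack's LSH code: the function names, the list-of-lists data layout, and the bounded heap are assumptions for illustration. Each candidate is read in place through its index, its squared distance is computed from dot products, and only the k best are kept, so no copy of the candidate set and no full list of distances is built.

```python
import heapq


def dot(a, b):
    # Plain dot product of two equal-length sequences.
    return sum(x * y for x, y in zip(a, b))


def k_nearest_sequential(reference, candidate_indices, query, k):
    """Return the k nearest candidates as (squared_distance, index), closest first."""
    query_norm = dot(query, query)
    best = []  # max-heap via negated distances, never holds more than k entries
    for index in candidate_indices:
        point = reference[index]  # read in place, no copy of the candidate set
        # ||q - r||^2 = ||q||^2 + ||r||^2 - 2 * (q . r)
        squared = query_norm + dot(point, point) - 2.0 * dot(query, point)
        if len(best) < k:
            heapq.heappush(best, (-squared, index))
        elif -squared > best[0][0]:
            # Closer than the current worst of the k best: replace it.
            heapq.heapreplace(best, (-squared, index))
    # Only the k survivors are sorted, not every candidate.
    return sorted((-negated, index) for negated, index in best)


reference = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0], [0.0, 2.0]]
nearest_two = k_nearest_sequential(reference, [0, 1, 2, 3], [0.0, 0.0], 2)
```

Whether this beats the batched matrix-vector product depends on candidate set sizes and on the BLAS in use, which is exactly what the profiling discussed above would have to show.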
marcosirc has quit [Quit: WeeChat 1.4]
< TD> Would anyone have the properties needed for VS2015 to run an executable? I am guessing Project - 'Visual C++' - 'Win32Project' - 'DLL' - 'Empty project'
< TD> And also the solution properties - 'C/C++' - 'General' - 'Additional Include Directories' - ?
< TD> Runtime Library - Multi-threaded (/MT) ?
< TD> Precompiled Header - Not Using Precompiled Headers ?
< TD> And any other solution properties I am missing?
< TD> Windows 10 environment
mentekid has quit [Ping timeout: 246 seconds]