verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
manish7294 has joined #mlpack
< manish7294> rcurtin: Are you there?
< jenkins-mlpack> Project docker mlpack weekly build build #46: NOW UNSTABLE in 2 hr 51 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20weekly%20build/46/
< jenkins-mlpack> * jaiswalshikhar87: Initial Commit
< jenkins-mlpack> * jaiswalshikhar87: Rebase to master
< jenkins-mlpack> * jaiswalshikhar87: Implement O'Reilly Test
< jenkins-mlpack> * Ryan Curtin: Handle training sets that don't divide evenly by k.
< jenkins-mlpack> * Ryan Curtin: Add ShuffleData() with weights.
< jenkins-mlpack> * Ryan Curtin: Add 'shuffle' parameter to KFoldCV.
< jenkins-mlpack> * Ryan Curtin: Safer handling of sparse matrix values arrays.
< jenkins-mlpack> * Ryan Curtin: Remove trainingSubsetSize and clarify comments.
< jenkins-mlpack> * Ryan Curtin: Rename variables for clarity.
< jenkins-mlpack> * Ryan Curtin: Expose Shuffle() to users.
< jenkins-mlpack> * Ryan Curtin: Update documentation.
< jenkins-mlpack> * Ryan Curtin: Fix condition, thanks Kirill for pointing it out.
< jenkins-mlpack> * Ryan Curtin: Add section on using the Python bindings without installing.
< jenkins-mlpack> * Ryan Curtin: Clarify any need for LD_LIBRARY_PATH setting.
< jenkins-mlpack> * Ryan Curtin: Be sure to print `output = ` if there are output parameters.
< jenkins-mlpack> * Ryan Curtin: Fix the parameter name we are using. (Thanks @rasbt!)
< jenkins-mlpack> * Ryan Curtin: Don't overwrite the 'perceptron' module in Python. (Thanks @rasbt!)
manish7294 has quit [Quit: Page closed]
vivekp has joined #mlpack
< ShikharJ> zoq: I'll help you :P Imagine an input point of 3x3 dimensions and 3 channels. So you now have a 3x3x3 input cube with batchSize = 1 (for simplicity). You want 3 output channels as well, so your kernel has (inSize * outSize = 3 * 3 = 9) slices. Let's assume the kernel also has dimensions 3x3, so your final kernel dimensions are 3x3x9, which would give you an output of 1x1x3.
< ShikharJ> zoq: Now try iterating through the Gradient step with the previous approach and the current approach: the first three input slices convolve with the first three kernel slices to give you the first output slice. So the first output slice should be convolved with the first three input slices to give you the gradients for the first three kernel slices. But this does not happen in the previous approach, as the first three
< ShikharJ> kernel gradients are instead stored at a distance of (outSize) from each other (see the variable s and how it's updated).
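(A minimal NumPy sketch of the slice layout described above; the shapes and the "previous"/"current" labels are illustrative assumptions, not mlpack's actual Gradient() code.)

    import numpy as np

    inSize, outSize = 3, 3                           # 3 input maps, 3 output maps
    kernelGrad = np.zeros((3, 3, inSize * outSize))  # one 3x3 slice per (in, out) pair

    # Previous layout (assumed): the slice for (input i, output o) sits at index
    # i * outSize + o, so the inSize slices feeding one output map are stored
    # outSize apart, matching "at a distance of (outSize) from each other":
    for o in range(outSize):
        print(o, [i * outSize + o for i in range(inSize)])   # o = 0 -> [0, 3, 6]

    # Current layout (assumed): index o * inSize + i keeps the slices for one
    # output map contiguous: o = 0 -> [0, 1, 2].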
< ShikharJ> zoq: I'll also need to add support for batches in the BilinearInterpolation method
< Atharva> sumedhghaisas: zoq: rcurtin: Should I rebase the PR even if I have resolved the conflicts?
< ShikharJ> Atharva: It's a good practice, as it helps to weed out any errors that might come up due to the underlying routines being changed.
< Atharva> ShikharJ: Okay, thanks!
wiking has quit [Quit: ZNC 1.7.0 - https://znc.in]
< rcurtin> manish7294: sorry, I went to bed early last night
wiking has joined #mlpack
< jenkins-mlpack> Project docker mlpack nightly build build #350: STILL UNSTABLE in 2 hr 46 min: http://masterblaster.mlpack.org/job/docker%20mlpack%20nightly%20build/350/
manish7294 has joined #mlpack
< manish7294> rcurtin: whoa, you are up already :)
< manish7294> rcurtin: I am facing some issues with setting up the benchmarks on slake, maybe due to dependency issues.
< manish7294> I think the MySQL server is not there and the Python version is 2.7
< manish7294> With the shogun script I am continuously getting an IO error
< manish7294> It originates from SplitTrainData in util/misc.py
sulan_ has joined #mlpack
< rcurtin> manish7294: hmm, I know you can configure the system to use a local sqlite db
< rcurtin> in the config.yaml you can do:
< rcurtin> database: 'benchmarks.db'
< rcurtin> driver: 'sqlite'
< rcurtin> and you can comment out databaseHost and port
< rcurtin> and yeah, it's an early flight today...
< rcurtin> not sure why I picked this one but it is too late to change it now :)
< manish7294> rcurtin: Now another one popped up: File "/home/manish/benchmarks/util/database.py", line 73, in Execute: return self.Execute(command); RecursionError: maximum recursion depth exceeded; Makefile:176: recipe for target '.run' failed
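(A hypothetical sketch of the failure mode behind that traceback; the real util/database.py is not shown in this log. Retrying a failed query by calling the same method again, with no retry limit, turns any persistent error into a RecursionError.)

    import sqlite3

    class Database(object):
        def __init__(self, path="benchmarks.db"):
            self.cursor = sqlite3.connect(path).cursor()

        def Execute(self, command):
            try:
                return self.cursor.execute(command)
            except sqlite3.Error:
                return self.Execute(command)  # unbounded recursion on persistent failure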
< manish7294> rcurtin: And it seems the LMNN authors applied PCA before LMNN to reduce the dimensionality of high-dimensional datasets (e.g. MNIST, 784 down to 164)
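(A sketch of that preprocessing step using scikit-learn as a stand-in; the exact pipeline the LMNN authors used is an assumption here, and random data stands in for MNIST's 784-dimensional features.)

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(1000, 784)                       # stand-in for MNIST
    X_reduced = PCA(n_components=164).fit_transform(X)  # 784 -> 164 dimensions
    print(X_reduced.shape)                              # (1000, 164); run LMNN on this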
< rcurtin> manish7294: what if you create a very simple configuration with no LMNN runs?
< rcurtin> that will help narrow down whether the issue is specific to the LMNN benchmark scripts
< manish7294> if I turn off log then it works
< manish7294> sure, I will try to debug more.
< rcurtin> hmm, can you paste your config somewhere?
< rcurtin> looks like maybe it is not specific to the LMNN scripts
< manish7294> rcurtin: Here it is https://pastebin.com/cNPBGeHH
< rcurtin> does the line 'driver : 'sqlite'' need to change to 'driver: 'sqlite''?
< rcurtin> also I'd suggest removing most of the method blocks so that when you run it only runs the benchmarks you are interested in
< rcurtin> running that whole configuration would probably take several weeks :)
< manish7294> I am using BLOCKS and METHODBLOCKS to avoid that :)
< rcurtin> ah, ok
< manish7294> does 'driver : 'sqlite'' need to change to 'driver: 'sqlite''? -> Didn't work
< rcurtin> unfortunately I'm not really in a great position to help debug
< manish7294> rcurtin: no problem, I will try more :)
< rcurtin> if you don't care about the database I think you can run with LOG=False
< rcurtin> and it will print the results instead
< rcurtin> which is probably good enough for simple runs like you are doing with LMNN
< manish7294> sure, I was just worried about not being able to scroll the log :)
wenhao has joined #mlpack
< rcurtin> it should print the results at the very end
< manish7294> but I found a trick for that: keeping tmux in copy mode
< rcurtin> right, otherwise I was going to suggest screen or tmux scrollback
< manish7294> rcurtin: By the way, where's the destination this time?
< rcurtin> today I am headed to Los Angeles
< rcurtin> or more specifically to the outskirts... I will go race karts at a track in Fontana, a far-out suburb
< manish7294> rcurtin: great, sounds exciting :)
< rcurtin> yeah, I am looking forward to it
< rcurtin> today I will practice at the track and tomorrow I will go racing
< manish7294> rcurtin: real race?
< rcurtin> yeah, it is one of my hobbies (more exciting than mariokart I think)
< manish7294> rcurtin: It's even more exciting, go for the first place :)
< manish7294> rcurtin: Really! that's awesome
< rcurtin> yeah, I have raced karts for several years now. it is a fun hobby
< rcurtin> but it turns out that for the kind of karting I do, the best racing is in california
< rcurtin> so I travel out there a lot
< manish7294> you even have your name written on your helmet, that's so cool :)
< rcurtin> anyway the plane is going to take off now... need to disconnect for now
< rcurtin> back in a little while
< manish7294> You are really living your hobby :)
< manish7294> bye :)
manish7294 has quit [Quit: Page closed]
< zoq> ShikharJ: Thanks for the clarification, this makes sense, really nice catch!
< zoq> ShikharJ: Are you going to adjust the BilinearInterpolation layer, or, if you like, I can do it as well.
< zoq> rcurtin: Nice, hopefully you can start from a good position.
< zoq> manish7294: I'll take a look at the sql issue later today.
sulan_ has quit [Quit: Leaving]
< ShikharJ> zoq: Did that already, now the PR only needs to be debugged.
< zoq> ShikharJ: Yeah, there is an issue inside the GAN class
< ShikharJ> zoq: I'm on it, the PR should be ready over the weekend, along with the GANOptimizer class.
< zoq> ShikharJ: We can do this next week, no need to work over the weekend unless you can't sleep otherwise :)
< rcurtin> manish7294: ok, back. I could have been back sooner but I fell asleep during takeoff :)
< rcurtin> zoq: yeah, hopefully. I will need to qualify well, I think I can do it
< ShikharJ> zoq: Ah, this has to be done ASAP; otherwise, we would lose a lot of time for the WGAN implementation.
< zoq> ShikharJ: I don't think we are in a rush; I agree the first phase took more time than anticipated, but you fixed a lot of important issues.
< rcurtin> I will try this out and see how well the bound could work
< rcurtin> don't feel obligated to drop what you are doing and try it; I'll let you know what I find out
manish7294 has joined #mlpack
wenhao has quit [Ping timeout: 260 seconds]
sulan_ has joined #mlpack
ImQ009 has joined #mlpack
< manish7294> rcurtin: I read your research and it is very clearly explained :)
< manish7294> Just have a doubt (you raised this earlier): finding the violation may be too costly.
< rcurtin> manish7294: there is a problem with the bounding, I think it is backward
< rcurtin> I am revising it now
< rcurtin> I did some quick simulations, I found that (at least for covertype), || L_{t + 1} - L_t ||_F^2 tends to be something like 1e-5 or smaller with sgd and lbfgs
< rcurtin> of course with sgd it will depend on the step size too
< rcurtin> but that is far smaller than the norm of the points
< rcurtin> anyway let me fix this bound, I see my logical error
< rcurtin> the results I saw imply that maybe we have to recalculate impostors for the first several iterations, but for later iterations we may be able to go a very long time without performing the calculation of impostors
< manish7294> rcurtin: I agree with this.
< manish7294> sgd and pretty much every optimizer show these characteristics
< rcurtin> the flaw is basically that I did the bounding backwards. when I go from (11) to (12), I used a lower bound on d_L_{t + 1}(x_i, x_a) and an upper bound on d_L_{t + 1}(x_i, x_b)
< rcurtin> but it should be the other way around; I need to use an upper bound on d_L_{t + 1}(x_i, x_a) and a lower bound on d_L_{t + 1}(x_i, x_b)
< manish7294> got the point :)
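(For reference, one plausible form of the bound being discussed, written as LaTeX; this is pieced together from the conversation and is an assumption, not copied from the linked PDF. Here d_L(x_i, x_j) = ||L x_i - L x_j||_2.)

    % Assumed perturbation bound on the learned distance between iterations:
    \[
      d_{L_t}(x_i, x_j) - \|L_{t+1} - L_t\|_2 \, \|x_i - x_j\|_2
        \;\le\; d_{L_{t+1}}(x_i, x_j)
        \;\le\; d_{L_t}(x_i, x_j) + \|L_{t+1} - L_t\|_2 \, \|x_i - x_j\|_2
    \]
    % Taking the upper bound for d_{L_{t+1}}(x_i, x_a) and the lower bound for
    % d_{L_{t+1}}(x_i, x_b) gives a sufficient condition for x_a to remain
    % closer than x_b, so impostors need not be recomputed every iteration.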
< manish7294> zoq: rcurtin: Can you please tell me what could be the reason behind this benchmark result: https://pastebin.com/8bTTcUh9
< manish7294> whereas the mlpack script seems to be working: https://pastebin.com/ywwNw6qS
< rcurtin> hmmm, -2 indicates some kind of error while running the script
< rcurtin> unfortunately the debugging output is not so great for the benchmarking system
< rcurtin> you might consider adding some prints throughout the shogun LMNN benchmarking script to see what is going on
< rcurtin> that's how I typically debug things like that
< manish7294> rcurtin: sure, but I don't get why it is -2; apart from -1, I am not returning -2 anywhere in the shogun script.
< rcurtin> ah, sorry, -2 indicates timeout
< rcurtin> at least according to benchmark/run_benchmark.py
< manish7294> rcurtin: You have certainly done this already and I know there's no point in me asking, but our evaluations are complete, right?
< zoq> manish7294: yes the evaluations are complete
< manish7294> zoq: Thanks! sorry if I bothered you :)
< zoq> manish7294: No worries :)
< rcurtin> manish7294: did you set the timeout to a much lower value than 9000?
< rcurtin> ah, hang on, the bound isn't wrong, it's just written in a counterintuitive way
< manish7294> rcurtin: I don't remember setting it.
< rcurtin> it's set in the configuration, you can check the 'timeout:' line
< rcurtin> if all the shogun LMNN runs are timing out, that implies that you ran the script and it took like 15 hours or more
< rcurtin> but I got the impression it gave you those results quite quickly
< manish7294> ya, within 1 min
< manish7294> you are right, it is 9000
< manish7294> Is this in seconds?
< manish7294> my bad! it is written just above the timeout line :)
yaswagner has joined #mlpack
< rcurtin> right, that is in seconds
< manish7294> So, should I comment this value out or just increase it to a large one?
< rcurtin> so it was set to 9000 with those runs?
< manish7294> right
< rcurtin> ok, so if it took only a minute to run it's clearly not timing out
< zoq> Can you print s right after 's = subprocess.check_output(...)'?
< zoq> The output might be helpful here.
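(The print zoq suggests, sketched in isolation; the actual command list passed to check_output inside the benchmarking system is assumed here.)

    import subprocess

    # Stand-in for the benchmark runner's real invocation:
    s = subprocess.check_output(["echo", "script output"])
    print(s)  # inspect the raw output before the runner parses it into timings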
< manish7294> zoq: sure, I will just let you know
vivekp has quit [Read error: Connection reset by peer]
vivekp has joined #mlpack
< zoq> ahh, I see the wine dataset is missing
< rcurtin> manish7294: you can reload the pdf, http://www.ratml.org/misc/lmnn_bounds.pdf, it is fixed now. the result is still roughly the same
< zoq> okay, so the wine issue is fixed with the latest commit
< manish7294> zoq: the main problem is with the shogun script (https://pastebin.com/8bTTcUh9). It does not appear to be a timeout, as the script contains 'except timeout_decorator.TimeoutError: return -1'
< zoq> ahh, I was looking at: https://pastebin.com/ywwNw6qS
< manish7294> zoq: that too was a problem :)
< manish7294> rcurtin: Now, it seems good :)
< zoq> manish7294: Can you check if the PCA shogun benchmark works?
< zoq> manish7294: The shogun installation on gekko failed so maybe it failed on the other ones as well?
< manish7294> zoq: started, let's see the results
< manish7294> It is definitely not a timeout, as I cross-checked with a large value.
< manish7294> zoq: It's working
< manish7294> Probably, I am doing something wrong.
< zoq> manish7294: Okay, let's see if I can reproduce the issue.
< zoq> Can you post the make command?
< rcurtin> zoq: I think I see a bug; in run_benchmark.py, the block at 422 implies that runtime timeout is -1 and failure is -2, but the block at 457 implies the opposite
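(A hypothetical reduction of the mismatch rcurtin describes; the real blocks at lines 422 and 457 of run_benchmark.py are not shown in this log.)

    # Two interpretations of the same sentinel values cannot both be right:
    def interpret_block_422(code):
        return {-1: "timeout", -2: "failure"}.get(code)

    def interpret_block_457(code):
        return {-2: "timeout", -1: "failure"}.get(code)

    assert interpret_block_422(-2) != interpret_block_457(-2)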
< manish7294> make run BLOCK=shogun METHODBLOCK=LMNN
< rcurtin> plane is landing... I will have to go now
< manish7294> zoq: I think the issue is that shogun's lmnn train needs an initial matrix to be passed, but I haven't been doing that.
< zoq> okay, I get a 'definitions undefined' error
< manish7294> Ya, I removed it later
< manish7294> forgot to do so on remote
< zoq> okay, now I get: 'LMNN' object has no attribute 'X'
< zoq> just added print(e) in the exception block
< manish7294> I think the error is on line 63
< manish7294> self shouldn't be there
< zoq> right
< zoq> numpy.float64 should be np.float64
< zoq> afterwards I get at least some timings, not sure the output is correct
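(A sketch of the numpy.float64 fix mentioned above; the variable names are stand-ins, not the script's real ones.)

    import numpy as np

    data = [[5.1, 3.5], [4.9, 3.0]]
    arr = np.asarray(data, dtype=np.float64)  # 'numpy.float64' raises a NameError
    print(arr.dtype)                          # if the module was only imported as 'np'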
< manish7294> oh! I made a bunch of mistakes; I don't know how I let them slip by :)
< manish7294> Now I am wondering why I didn't get any errors while running this?
< manish7294> zoq: Can you please post the timings you got?
< yaswagner> Hey ryan! I've been mimicking the structure of pca.pyx in Go and calling functions in the same order. It is able to call ResetTimers(), EnableTimers(), DisableBacktrace(), and DisableVerbose(), but then when it tries to call RestoreSettings("Principal Components Analysis"), I get the following error: std::invalid_argument: no settings stored under the name 'Principal Components Analysis'. I'm thinking it might be because I haven't
< yaswagner> in mlpack_main.hpp*
< yaswagner> When you hand-bound pca, had you created a PyOption and added a BINDING_TYPE_PYTHON to mlpack_main.hpp ahead of time?
< rcurtin> yaswagner: ah, right, I know what is going on here
< rcurtin> when I did the hand binding I did have to do work with the PARAM macros, so I think my email to you was a little incorrect; sorry about that
< rcurtin> for the CLI and Python bindings, what the PARAM macros do is actually declare some Option type that then registers itself with the CLI singleton when its constructor is called
< rcurtin> I am not 100% sure but I think before including pca_main.cpp, you could '#define BINDING_TYPE BINDING_TYPE_PYX' for now and we can make a different type later as needed
< rcurtin> then I think the only thing remaining would be to set the programName string
< rcurtin> unfortunately I'm not in a great place to dig in and help right now, but maybe that can help get you started
< yaswagner> Ok, that makes sense! Thanks, I'll add the #define
< rcurtin> yeah, and I think you will also need a way to set 'programName = "Principal Components Analysis"'
manish7294 has quit [Quit: Page closed]
< yaswagner> Ya, simply adding the #define BINDING_TYPE BINDING_TYPE_PYTHON doesn't work. I'll look into adding programName. Thanks!
< zoq> manish7294: I just timed the iris dataset, maybe it's just taking a long time?
< zoq> manish7294: I can post the modified script if that helps.
manish7294 has joined #mlpack
< manish7294> zoq: Is that with shogun?
< zoq> manish7294: yes
< zoq> shogun
< zoq> iris 0.332911
< manish7294> zoq: can I ask you one more favor
< zoq> yeah, sure
< manish7294> can you please try the mlpack script too
< manish7294> I think I messed up something here, so I will probably need to do a fresh build
< manish7294> zoq: And did you make any other changes to the PR code, other than what you mentioned earlier, for running this?
< zoq> I added metrics_folder = os.path.realpath(os.path.abspath(os.path.join(
< zoq>     os.path.split(inspect.getfile(inspect.currentframe()))[0], "../metrics")))
< zoq> if metrics_folder not in sys.path:
< zoq>     sys.path.insert(0, metrics_folder)
< zoq> so that the import works
< manish7294> zoq: Thanks!
< zoq> I have to build your branch hold on
< manish7294> sure
< Atharva> rcurtin: Is it okay to have a normal_distribution_impl.hpp file in dists rather than a normal_distribution.cpp? I am having some trouble with template functions.
< zoq> Atharva: Sounds fine.
< zoq> manish7294: Running the script now.
< Atharva> zoq: Thanks!
< zoq> Atharva: Do not forget to include the impl inside the hpp at the end :)
< Atharva> zoq: Did that :)
< Atharva> Is there some reason template functions give errors when built in cpp files?
< manish7294> zoq: rcurtin: Thanks to you, I somehow managed to run the script on iris, and the results are quite unbelievable: https://pastebin.com/DRHS6F5u
< manish7294> there is a large difference even though I have not used our N-iterations functionality
< manish7294> zoq: Please confirm if you get similar results. Thanks!
< manish7294> Here shogun is set to default k=1
< zoq> manish7294: Will post the results here.
< Atharva> zoq: Thanks, I will check it out.
< Atharva> zoq: As I am modifying core files, it takes a lot of time to build. Is there a quicker way to do this?
< zoq> Atharva: Not really; you can disable the executables and the tests if you're not going to use them: -DBUILD_TESTS=OFF and -DBUILD_CLI_EXECUTABLES=OFF
< zoq> Atharva: Also you can build with multiple cores: make -j4 will build the code using 4 cores.
< zoq> ohh and you can also disable the python bindings: -DBUILD_PYTHON_BINDINGS=OFF
< Atharva> Okay, I will disable the executables, tests and python bindings. As for the cores, I am using all 8 of them.
< rcurtin> manish7294: results look great so far! another thing to do would be to add the resulting kNN accuracy as a metric, so that we can know that mlpack and shogun are providing roughly the same improvement
< Atharva> zoq: Sorry to disturb again. The NegativeLogLikelihood class has not been moved to the loss functions folder. Is there some reason, or should I do it? I will open a PR tomorrow with the inputParameter member removed from layers that don't need it. I can throw in this change too.
manish7294 has quit [Ping timeout: 260 seconds]
< zoq> Atharva: No reason, we can move it.
< Atharva> zoq: okay
witness_ has joined #mlpack
< Atharva> zoq: Should I also remove the delta member of the loss layers? I don't think it's being used there
< zoq> Atharva: yes, sure
< Atharva> zoq: Okay!
ImQ009 has quit [Quit: Leaving]
< Atharva> zoq: rcurtin: I added some files to core/dists and built mlpack with them. Now I am working on a different branch where those files don't exist, but due to CXX.includecache files, my build is failing. How do I clear the cache?
< zoq> Atharva: Either remove the build folder or perhaps 'make clean' works as well.
< Atharva> Okay, thanks!
vivekp has quit [Ping timeout: 240 seconds]
< zoq> Atharva: Here to help :)
vivekp has joined #mlpack
yaswagner has quit [Quit: Page closed]
vivekp has quit [Ping timeout: 264 seconds]
sulan_ has quit [Quit: Leaving]