verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
< jenkins-mlpack> * jaiswalshikhar87: Rebase to master
< jenkins-mlpack> * jaiswalshikhar87: Implement O'Reilly Test
< jenkins-mlpack> * Ryan Curtin: Handle training sets that don't divide evenly by k.
< jenkins-mlpack> * Ryan Curtin: Add ShuffleData() with weights.
< jenkins-mlpack> * Ryan Curtin: Add 'shuffle' parameter to KFoldCV.
< jenkins-mlpack> * Ryan Curtin: Safer handling of sparse matrix values arrays.
< jenkins-mlpack> * Ryan Curtin: Remove trainingSubsetSize and clarify comments.
< jenkins-mlpack> * Ryan Curtin: Rename variables for clarity.
< jenkins-mlpack> * Ryan Curtin: Expose Shuffle() to users.
< jenkins-mlpack> * Ryan Curtin: Update documentation.
< jenkins-mlpack> * Ryan Curtin: Fix condition, thanks Kirill for pointing it out.
< jenkins-mlpack> * Ryan Curtin: Add section on using the Python bindings without installing.
< jenkins-mlpack> * Ryan Curtin: Clarify any need for LD_LIBRARY_PATH setting.
< jenkins-mlpack> * Ryan Curtin: Be sure to print `output = ` if there are output parameters.
< jenkins-mlpack> * Ryan Curtin: Fix the parameter name we are using. (Thanks @rasbt!)
< jenkins-mlpack> * Ryan Curtin: Don't overwrite the 'perceptron' module in Python. (Thanks @rasbt!)
manish7294 has quit [Quit: Page closed]
vivekp has joined #mlpack
< ShikharJ> zoq: I'll help you :P Imagine an input point of 3x3 dimensions and 3 channels. So you now have a 3x3x3 input cube, with batchSize = 1 (for simplicity). You want 3 output channels as well, so your kernel has (inSize * outSize = 3 * 3 = 9) slices. Let's assume the kernel also has dimensions 3x3, so your final kernel dimensions are 3x3x9, which would give you an output of 1x1x3.
< ShikharJ> zoq: Now try iterating through the Gradient step with the previous approach and the current approach: the first three input slices convolve with the first three kernel slices to give you the first output slice. So the first output slice should be convolved with the first three input slices, to give you the gradients for the first three kernel slices. But this does not happen in the previous approach, as the first three
< ShikharJ> kernel gradients are instead stored at a distance of (outSize) from each other (see the variable s and how it's updated).
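A toy sketch of the two indexing schemes described above (the loop structure and variable names are illustrative, not the actual mlpack ann code):

```cpp
#include <cstdio>

int main()
{
  // inSize input maps and outSize output maps; the kernel cube then has
  // inSize * outSize = 9 slices.
  const int inSize = 3, outSize = 3;

  for (int j = 0; j < outSize; ++j)    // output map (error slice)
  {
    for (int i = 0; i < inSize; ++i)   // input map
    {
      // Current approach: the gradients for output map j occupy
      // contiguous kernel slices (0, 1, 2 for j = 0).
      const int contiguous = j * inSize + i;

      // Previous approach: the slice index advanced by outSize, so the
      // same gradients landed outSize slices apart (0, 3, 6 for j = 0).
      const int strided = i * outSize + j;

      std::printf("input %d x error %d -> slice %d (previously %d)\n",
                  i, j, contiguous, strided);
    }
  }
  return 0;
}
```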
< ShikharJ> zoq: I'll also need to add support for batches in the BilinearInterpolation method
< Atharva> sumedhghaisas: zoq: rcurtin: Should I rebase the PR even if I have resolved the conflicts?
< ShikharJ> Atharva: It's good practice, as it helps to weed out any errors that might come up due to the underlying routines having changed.
< manish7294> rcurtin: whoa, you are up already :)
< manish7294> rcurtin: I am facing some issues with setting up the benchmarks on slake, maybe due to dependency issues.
< manish7294> I think the MySQL server is not there, and the Python version is 2.7
< manish7294> With the shogun script I am continuously getting an IO error
< manish7294> It is originating from SplitTrainData in util/misc.py
sulan_ has joined #mlpack
< rcurtin> manish7294: hmm, I know you can configure the system to use a local sqlite db
< rcurtin> in config.yaml you can do:
< rcurtin> database: 'benchmarks.db'
< rcurtin> driver: 'sqlite'
< rcurtin> and you can comment out databaseHost and port
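Put together, the relevant part of config.yaml would look something like the following (a sketch assembled from the lines above; the commented-out key names are assumptions):

```yaml
# config.yaml (excerpt): use a local sqlite database instead of MySQL.
database: 'benchmarks.db'
driver: 'sqlite'
# databaseHost: ...   # comment these out when using the sqlite driver
# port: ...
```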
< rcurtin> and yeah, it's an early flight today...
< rcurtin> not sure why I picked this one, but it is too late to change it now :)
< manish7294> rcurtin: Now another one popped up: File "/home/manish/benchmarks/util/database.py", line 73, in Execute: return self.Execute(command) -- RecursionError: maximum recursion depth exceeded -- Makefile:176: recipe for target '.run' failed
< manish7294> rcurtin: And it seems the LMNN authors applied PCA before LMNN to reduce the dimensionality of high-dimensional datasets (like MNIST, 784 to 164)
< rcurtin> manish7294: what if you create a very simple configuration with no LMNN runs?
< rcurtin> that will help narrow down whether the issue is specific to the LMNN benchmark scripts
< manish7294> if I turn off the log then it works
< manish7294> sure, I will try to debug more.
< rcurtin> hmm, can you paste your config somewhere?
< rcurtin> looks like maybe it is not specific to the LMNN scripts
< rcurtin> I will try this out and see how well the bound could work
< rcurtin> don't feel obligated to drop what you are doing and try it; I'll let you know what I find out
manish7294 has joined #mlpack
wenhao has quit [Ping timeout: 260 seconds]
sulan_ has joined #mlpack
ImQ009 has joined #mlpack
< manish7294> rcurtin: I read your research and it is precisely explained :)
< manish7294> Just have a doubt (you raised this earlier): finding the violations may be too costly.
< rcurtin> manish7294: there is a problem with the bounding, I think it is backward
< rcurtin> I am revising it now
< rcurtin> I did some quick simulations, and I found that (at least for covertype) || L_{t + 1} - L_t ||_F^2 tends to be something like 1e-5 or smaller with sgd and lbfgs
< rcurtin> of course with sgd it will depend on the step size too
< rcurtin> but that is far smaller than the norm of the points
< rcurtin> anyway, let me fix this bound, I see my logical error
< rcurtin> the results I saw imply that maybe we have to recalculate impostors for the first several iterations, but for later iterations we may be able to go a very long time without recalculating impostors
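As a sketch of the quantity being measured, in Armadillo (the tolerance and the skip heuristic here are illustrative assumptions, not the bound from rcurtin's notes):

```cpp
#include <armadillo>

// Decide whether impostors should be recomputed, based on how far the
// learned transformation L moved between two iterations.
bool ShouldRecomputeImpostors(const arma::mat& lPrev,
                              const arma::mat& lCur,
                              const double tol = 1e-5)
{
  // || L_{t + 1} - L_t ||_F^2; if this stays tiny (as observed for
  // covertype), the impostor sets are unlikely to have changed.
  const double change = arma::norm(lCur - lPrev, "fro");
  return (change * change) > tol;
}
```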
< manish7294> rcurtin: I agree with this.
< manish7294> sgd and pretty much every optimizer show these characteristics
< rcurtin> the flaw is basically that I did the bounding backwards. when I go from (11) to (12), I used a lower bound on d_L_{t + 1}(x_i, x_a) and an upper bound on d_L_{t + 1}(x_i, x_b)
< rcurtin> but it should be the other way around; I need to use an upper bound on d_L_{t + 11}(x_i, x_a) and a lower bound on d_L_{t + 1}(x_i, x_b)
< rcurtin> oops, t + 1 not t + 11...
< manish7294> no worries, got the point :)
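The direction of the corrected bound follows from the triangle inequality; a sketch of the standard argument (not the exact inequalities (11) and (12) from the notes):

```latex
% For any difference vector x, since the spectral norm is bounded by the
% Frobenius norm:
\| L_{t+1} x \| \le \| L_t x \| + \| L_{t+1} - L_t \|_F \, \| x \|, \qquad
\| L_{t+1} x \| \ge \| L_t x \| - \| L_{t+1} - L_t \|_F \, \| x \|.
% Taking x = x_i - x_a in the first inequality gives the needed upper
% bound on d_{L_{t+1}}(x_i, x_a); taking x = x_i - x_b in the second
% gives the lower bound on d_{L_{t+1}}(x_i, x_b).
```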
< manish7294> zoq: rcurtin: Can you please tell me what could be the reason behind this benchmark result: https://pastebin.com/8bTTcUh9
< zoq> okay, so the wine issue is fixed with the latest commit
< manish7294> zoq: the main problem is with the shogun script https://pastebin.com/8bTTcUh9 It does not appear to be a timeout, as there is a line in the script: except timeout_decorator.TimeoutError: return -1
< zoq> manish7294: Can you check if the PCA shogun benchmark works?
< zoq> manish7294: The shogun installation on gekko failed, so maybe it failed on the other ones as well?
< manish7294> zoq: started, let's see the results
< manish7294> It is definitely not a timeout, as I cross-checked with a large value.
< manish7294> zoq: It's working
< manish7294> Probably I am doing something wrong.
< zoq> manish7294: Okay, let's see if I can reproduce the issue.
< zoq> Can you post the make command?
< rcurtin> zoq: I think I see a bug; in run_benchmark.py, the block at line 422 implies that a timeout is -1 and a failure is -2, but the block at line 457 implies the opposite
< manish7294> make run BLOCK=shogun METHODBLOCK=LMNN
< rcurtin> plane is landing... I will have to go now
< manish7294> zoq: I think the issue is that shogun's LMNN train needs an initial matrix to be passed, but I haven't been doing that.
< zoq> okay, I get an error that 'definitions' is undefined
< manish7294> Ya, I removed it later
< manish7294> forgot to do so on the remote
< zoq> okay, now I get: 'LMNN' object has no attribute 'X'
< zoq> just added print(e) in the exception block
< manish7294> I think the error is on line 63
< manish7294> self shouldn't be there
< zoq> right
< zoq> numpy.float64 should be np.float64
< zoq> afterwards I get at least some timings; not sure the output is correct
< manish7294> oh! I made a bunch of mistakes, don't know how I let them pass by :)
< manish7294> Now I am wondering why I didn't get any errors while running this?
< manish7294> zoq: Can you please post the timings you got?
< yaswagner> Hey ryan! I've been mimicking the structure of pca.pyx in Go and calling functions in the same order. It is able to call ResetTimers(), EnableTimers(), DisableBacktrace(), and DisableVerbose(), but then when it tries to call RestoreSettings("Principal Components Analysis"), I get the following error: std::invalid_argument: no settings stored under the name 'Principal Components Analysis'. I'm thinking it might be because I haven't
< yaswagner> in mlpack_main.hpp*
< yaswagner> When you hand-bound pca, had you created a PyOption and added a BINDING_TYPE_PYTHON to mlpack_main.hpp ahead of time?
< rcurtin> yaswagner: ah, right, I know what is going on here
< rcurtin> when I did the hand binding I did have to work with the PARAM macros, so I think my email to you was a little incorrect; sorry about that
< rcurtin> for the CLI and Python bindings, what the PARAM macros do is actually declare some Option type that then registers itself with the CLI singleton when its constructor is called
< rcurtin> I am not 100% sure, but I think before including pca_main.cpp, you could '#define BINDING_TYPE BINDING_TYPE_PYX' for now, and we can make a different type later as needed
< rcurtin> then I think the only thing remaining would be to set the programName string
< rcurtin> unfortunately I'm not in a great place to dig in and help right now, but maybe that can help get you started
< yaswagner> Ok, that makes sense! Thanks, I'll add the #define
< rcurtin> yeah, and I think if you have a way to set 'programName = "Principal Components Analysis"' that could also be necessary
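A minimal sketch of what rcurtin is suggesting, assuming the macro names from mlpack's Python bindings; the include path and registration mechanics are taken from this conversation, not verified against the bindings code:

```cpp
// Sketch: reuse the binding machinery from a hand-written wrapper.
// BINDING_TYPE must be defined before the binding's main file is pulled
// in, so that the PARAM_* macros register their options (and the program
// name) with the CLI singleton at static-construction time.
#define BINDING_TYPE BINDING_TYPE_PYX
#include <mlpack/methods/pca/pca_main.cpp>  // illustrative path

// Later, settings are looked up by the same program name string:
//   mlpack::CLI::RestoreSettings("Principal Components Analysis");
```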
manish7294 has quit [Quit: Page closed]
< yaswagner> Ya, simply adding the #define BINDING_TYPE BINDING_TYPE_PYTHON doesn't work. I'll look into adding programName. Thanks!
< zoq> manish7294: I just timed the iris dataset, maybe it's just taking a long time?
< zoq> manish7294: I can post the modified script if that helps.
manish7294 has joined #mlpack
< manish7294> zoq: Is that with shogun?
< zoq> manish7294: yes
< zoq> shogun
< zoq> iris 0.332911
< manish7294> zoq: can I ask you one more favor
< zoq> yeah, sure
< manish7294> can you please try the mlpack script too
< manish7294> I think I messed up something here, so I will probably need to do a fresh build
< manish7294> zoq: And did you make any other changes to the PR code, other than what you mentioned earlier, to run this?
< zoq> I added metrics_folder = os.path.realpath(os.path.abspath(os.path.join(
< Atharva> rcurtin: Is it okay to have a normal_distribution_impl.hpp file in dists rather than normal_distribution.cpp? I am having some trouble with template functions.
< zoq> Atharva: Sounds fine.
< zoq> manish7294: Running the script now.
< Atharva> zoq: Thanks!
< zoq> Atharva: Do not forget to include the impl inside the hpp at the end :)
< Atharva> zoq: Did that :)
< Atharva> Is there some reason template functions cause errors when built in cpp files?
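Template definitions must be visible at the point of instantiation, so they belong in headers; definitions placed in a separate .cpp compile but fail at link time with undefined-symbol errors. The pattern zoq refers to, as an illustrative sketch (file contents are hypothetical, not the actual mlpack header):

```cpp
// --- normal_distribution.hpp (illustrative sketch) ---
template<typename DataType>
class NormalDistribution
{
 public:
  double Probability(const DataType& observation) const;
};

// Included at the end of the header so every translation unit that
// instantiates the template sees the definitions.
#include "normal_distribution_impl.hpp"

// --- normal_distribution_impl.hpp (illustrative sketch) ---
template<typename DataType>
double NormalDistribution<DataType>::Probability(
    const DataType& /* observation */) const
{
  return 0.0;  // placeholder body
}
```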
< manish7294> zoq: rcurtin: Thanks to you, I somehow managed to run the script on iris, and the results are quite unbelievable: https://pastebin.com/DRHS6F5u
< manish7294> there is a large difference even though I have not used our N-iterations functionality
< manish7294> zoq: Please confirm if you get similar results. Thanks!
< Atharva> zoq: As I am modifying core files, it takes a lot of time to build. Is there a quicker way to do this?
< zoq> Atharva: Not really; you can disable the executables and the tests if you're not going to use them: -DBUILD_TESTS=OFF and -DBUILD_CLI_EXECUTABLES=OFF
< zoq> Atharva: Also, you can build with multiple cores: make -j4 will build the code using 4 cores.
< zoq> ohh, and you can also disable the Python bindings: -DBUILD_PYTHON_BINDINGS=OFF
< Atharva> Okay, I will disable the executables, tests, and Python bindings. As for the cores, I am using all 8 of them.
< rcurtin> manish7294: results look great so far! another thing to do would be to add the resulting kNN accuracy as a metric, so that we can know that mlpack and shogun are providing roughly the same improvement
< Atharva> zoq: Sorry to disturb again. The NegativeLogLikelihood class has not been moved to the loss functions folder. Is there some reason, or should I do it? I will open a PR tomorrow with the inputParameter member removed from layers that don't need it; I can throw in this change too.
manish7294 has quit [Ping timeout: 260 seconds]
< zoq> Atharva: No reason, we can move it.
< Atharva> zoq: okay
witness_ has joined #mlpack
< Atharva> zoq: Should I also remove the delta member of the loss layers? I don't think it's being used there.
< zoq> Atharva: yes, sure
< Atharva> zoq: Okay!
ImQ009 has quit [Quit: Leaving]
< Atharva> zoq: rcurtin: I added some files to core/dists and built mlpack with them. Now I am working on a different branch where those files don't exist, but due to CXX.includecache files my build is failing. How do I clear the cache?
< zoq> Atharva: Either remove the build folder, or perhaps 'make clean' works as well.