verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Nilabhra has joined #mlpack
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
umberto has joined #mlpack
umberto has quit [Ping timeout: 250 seconds]
tsathoggua has joined #mlpack
tsathoggua has quit [Quit: Konversation terminated!]
mentekid_ has quit [Ping timeout: 244 seconds]
wasiq has quit [Ping timeout: 264 seconds]
mentekid_ has joined #mlpack
archangel4 has joined #mlpack
archangel4 has quit [Ping timeout: 260 seconds]
wasiq has joined #mlpack
Nilabhra has quit [Remote host closed the connection]
Rodya has quit [Ping timeout: 260 seconds]
Rodya has joined #mlpack
keonkim has quit [Ping timeout: 250 seconds]
keonkim has joined #mlpack
mentekid_ has quit [Ping timeout: 250 seconds]
Nilabhra has joined #mlpack
mentekid_ has joined #mlpack
zoq_ is now known as zoq
ranjan123 has quit [Ping timeout: 250 seconds]
uumberto has joined #mlpack
uumberto has quit [Client Quit]
ranjan123 has joined #mlpack
< ranjan123> Hello everybody! :D
< ranjan123> rcurtin: you there ?
< rcurtin> mentekid_: any chance I can get a copy of the sift10k and gist10k datasets you were testing with?
< rcurtin> ranjan123: hello! I am here
< ranjan123> In psgd you have commented "I am concerned that this is a lot slower than it could be. It looks like you are checking for convergence of all of the threads at once, instead of letting each thread run its own SGD instance. This means there are lots of barriers and atomic sections when I don't think they need to be there. You might be able to simplify this significantly if you use the existing SGD class."
< ranjan123> I don't get this line: "You might be able to simplify this significantly if you use the existing SGD class"
< ranjan123> I mean, what do I do with the existing SGD class?
< rcurtin> something like
< rcurtin> for (each thread)
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#778 (master - 0e6f351 : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
< rcurtin> SGD sgd(function subset, ...);
< rcurtin> sgd.Optimize()
< ranjan123> ohh
< rcurtin> yeah
< rcurtin> I am not sure exactly how simple that will be to do
< rcurtin> but something like that
< rcurtin> since in each thread you are running a separate SGD
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#779 (master - cf2b625 : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
< ranjan123> That is very simple, but it is not written in any literature. There will not be any random number generator. If the number of functions is huge, say N, then we have to allocate an array of size N instead of selecting functions randomly.
< ranjan123> I hope this is not a problem, right?
< rcurtin> no, that's not what I mean...
< rcurtin> the parallel SGD algorithm you proposed splits the dataset up
< rcurtin> and runs SGD on each subset
< ranjan123> selecting a function randomly in each thread
skon46 has joined #mlpack
< rcurtin> I don't know if this will paste correctly...
< rcurtin> nope, guess not
< rcurtin> okay, line 2 of Algorithm 2 (the one you are implementing):
< rcurtin> v_i = SGD({ c_1 , ... , c_m }, T, \nu, \omega_0) on client
< ranjan123> yes
< rcurtin> so I have misunderstood, you are not selecting a subset, you're just running SGD with different random seeds on each thread
< ranjan123> hmmm
< rcurtin> so then all that needs to be done is to make sure that the random number generator being used is thread-safe, and then you can just run an SGD instance for each thread
< ranjan123> yes
< ranjan123> I can replace algorithm 1 with the existing SGD
< rcurtin> I don't know what you mean, Algorithm 1 already is the existing SGD class
< ranjan123> yes! but not the SGD which is implemented in mlpack
< rcurtin> why do you say that?
< ranjan123> Draw j ∈ {1, ..., m} uniformly at random
< ranjan123> as you said: running SGD with different random seeds on each thread
< ranjan123> ok .
< rcurtin> the SGD implementation in mlpack shuffles the points instead of sampling uniformly at random, but the real-life difference is going to be completely negligible so I don't even see that as effectively different
< ranjan123> yes! exactly
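A rough sketch of the "one SGD instance per thread" idea discussed above might look like the following (assuming mlpack 2.x's mlpack::optimization::SGD class and an OpenMP build; ParallelSGDSketch is a hypothetical name, not mlpack code, and the final averaging follows the aggregation step of Algorithm 2):

    #include <mlpack/core.hpp>
    #include <mlpack/core/optimizers/sgd/sgd.hpp>
    #include <omp.h>
    #include <vector>

    template<typename DecomposableFunctionType>
    void ParallelSGDSketch(DecomposableFunctionType& function,
                           arma::mat& iterate)
    {
      const size_t threads = omp_get_max_threads();
      std::vector<arma::mat> results(threads, iterate);

      #pragma omp parallel
      {
        const size_t t = omp_get_thread_num();
        // Each thread runs its own independent SGD instance; no barriers or
        // atomic sections are needed until the final aggregation.  The shared
        // function object and mlpack's RNG are assumed to be safe for
        // concurrent use (per-thread seeding would cover the RNG).
        mlpack::optimization::SGD<DecomposableFunctionType> sgd(function);
        sgd.Optimize(results[t]);
      }

      // Aggregate the per-thread solutions by averaging.
      iterate.zeros();
      for (size_t t = 0; t < threads; ++t)
        iterate += results[t] / (double) threads;
    }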
< ranjan123> one more thing
archangel4 has joined #mlpack
< ranjan123> @stephentu said "provide support for sparse gradients "
< ranjan123> Honestly, I don't get the point. What does providing support mean? It would be good if you could explain it!
archangel4 has quit [Read error: Connection reset by peer]
archangel4 has joined #mlpack
< rcurtin> where did he say that?
< ranjan123> Several high level points: I think you should provide the option for a Hogwild style implementation as well. I think this is generally what people think of when they think of parallel SGD. However, to do this correctly, one should also provide support for sparse gradients-- in fact this is the case when you actually expect parallel SGD to win. When gradients are fully dense, I think the current approach you have is probably the way
< rcurtin> okay, in the PR, thanks
< rcurtin> supporting sparse gradient types will be a little more difficult, it requires a change to the DecomposableFunctionType policy
< rcurtin> right now a DecomposableFunctionType must implement Evaluate(arma::mat& coordinates) and Gradient(const arma::mat& coordinates, arma::mat& gradient)
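As a toy illustration of that policy, a decomposable function with the two methods in the simplified form quoted above might look like this (ExampleFunction is a hypothetical class; the policy mlpack's SGD actually uses also passes the index of the separable term and requires a NumFunctions() method):

    #include <mlpack/core.hpp>
    #include <cmath>

    // Toy objective f(x) = ||x - a||^2 with a dense gradient.
    class ExampleFunction
    {
     public:
      ExampleFunction(const arma::vec& a) : a(a) { }

      double Evaluate(const arma::mat& coordinates) const
      {
        return std::pow(arma::norm(coordinates - a), 2.0);
      }

      void Gradient(const arma::mat& coordinates, arma::mat& gradient) const
      {
        // Gradient of ||x - a||^2 is 2 (x - a); dense, so arma::mat is fine.
        gradient = 2.0 * (coordinates - a);
      }

     private:
      arma::vec a;
    };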
< ranjan123> hmm
< rcurtin> but what stephen is saying is that in many situations, the gradient will be sparse (i.e. better represented as an arma::sp_mat)
< ranjan123> ohk
< rcurtin> so in order to support sparse gradients, the class should be refactored to handle cases where the DecomposableFunctionType returns a sparse gradient instead of a dense gradient
< ranjan123> hmm
< rcurtin> probably some template metaprogramming should be used here to figure out if a class has void Gradient(const arma::mat&, arma::mat&) or void Gradient(const arma::mat&, arma::sp_mat&)
< rcurtin> but I think you should leave that for another time, it needs some more thought about the right way to do it
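For reference, the kind of detection rcurtin mentions could be sketched roughly like this (HasSparseGradient and GradientType are hypothetical names, not mlpack code, and the two-argument Gradient() signatures follow the simplified form quoted above):

    #include <armadillo>
    #include <type_traits>

    // Detect whether a function type provides a sparse-gradient overload,
    // i.e. void Gradient(const arma::mat&, arma::sp_mat&).
    template<typename FunctionType>
    class HasSparseGradient
    {
     private:
      template<typename U>
      static std::true_type Check(
          decltype(std::declval<U>().Gradient(
              std::declval<const arma::mat&>(),
              std::declval<arma::sp_mat&>()))*);

      template<typename U>
      static std::false_type Check(...);

     public:
      static constexpr bool value =
          decltype(Check<FunctionType>(nullptr))::value;
    };

    // Pick arma::sp_mat for the gradient when the function supports it,
    // and fall back to a dense arma::mat otherwise.
    template<typename FunctionType>
    using GradientType = typename std::conditional<
        HasSparseGradient<FunctionType>::value,
        arma::sp_mat, arma::mat>::type;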
< ranjan123> hmmm .
mentekid_ has quit [Ping timeout: 244 seconds]
< ranjan123> From your explanation, I guess it is not that hard to extend it. :P
< ranjan123> Please make some comments on the psgd code whenever you get time. I will change the style at the end.
< ranjan123> Thanks
< rcurtin> sure, I will take a look when I have a chance
palashahuja has joined #mlpack
< palashahuja> hello zoq
< rcurtin> when you update the code, make sure to leave a comment on the PR too; github doesn't notify me when there are just new commits to a PR
< ranjan123> ok
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#780 (master - 28ae007 : Ryan Curtin): The build has errored.
travis-ci has left #mlpack []
mentekid_ has joined #mlpack
< rcurtin> mentekid_: any chance I can get the sift 10k and gist 10k datasets? :)
skon46 has quit [Ping timeout: 260 seconds]
skon46 has joined #mlpack
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
ank_95_ has joined #mlpack
< zoq> palashahuja: Hello
Nilabhra has quit [Remote host closed the connection]
< palashahuja> I would like to work on #227. Is it available for me to work on?
archangel4 has quit [Ping timeout: 260 seconds]
ranjan123 has quit [Ping timeout: 250 seconds]
decltypeme has quit [Quit: Leaving]
< rcurtin> palashahuja: I can try to help, but I don't have very much time
< rcurtin> I would start by reading the density estimation trees paper so that you understand what the algorithm is
< rcurtin> a DET is basically a kd-tree with cached information about the class distribution at each node (plus a bit more information too)
< rcurtin> so I think you could use BinarySpaceTree with a custom statistic to fully represent the DET class
< palashahuja> ok cool
< palashahuja> should I read the one by P Ram ?
< rcurtin> yeah, that's the paper
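A minimal sketch of the custom statistic idea from above might look like this (DETStat is a hypothetical class, not mlpack code; the actual per-node density computation from the density estimation tree paper is only indicated in comments):

    #include <mlpack/core.hpp>
    #include <mlpack/core/tree/binary_space_tree.hpp>

    // A statistic cached in every tree node.  A StatisticType needs a default
    // constructor and a constructor taking the node it is attached to.
    class DETStat
    {
     public:
      DETStat() : density(0.0) { }

      template<typename TreeType>
      DETStat(TreeType& node) : density(0.0)
      {
        // Here the per-node density estimate from the DET paper could be
        // computed and cached, e.g. from the number of descendant points
        // and the volume of the node's bound.
        (void) node;
      }

      double Density() const { return density; }
      double& Density() { return density; }

     private:
      double density;
    };

    // This could then be plugged into a kd-tree, e.g.:
    //   mlpack::tree::BinarySpaceTree<mlpack::metric::EuclideanDistance,
    //       DETStat, arma::mat> tree(dataset);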
skon46 has quit [Quit: Leaving]
wasiq has quit [Ping timeout: 250 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#782 (master - ee3c27a : Ryan Curtin): The build has errored.
travis-ci has left #mlpack []
sumedhghaisas has joined #mlpack
Nilabhra has joined #mlpack
palashahuja has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#783 (master - 8a249cf : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
palashahuja has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#784 (master - 7e0e6ea : Ryan Curtin): The build failed.
travis-ci has left #mlpack []
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#785 (master - 56b53a0 : Ryan Curtin): The build passed.
travis-ci has left #mlpack []
palashahuja has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
travis-ci has joined #mlpack
< travis-ci> darcyliu/mlpack#11 (master - f47728c : Darcy Liu): The build passed.
travis-ci has left #mlpack []
wasiq has joined #mlpack
Nilabhra has quit [Remote host closed the connection]
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
sumedhghaisas has quit [Ping timeout: 252 seconds]
mentekid_ has quit [Ping timeout: 252 seconds]
travis-ci has joined #mlpack
< travis-ci> darcyliu/mlpack#12 (master - 56b53a0 : Ryan Curtin): The build passed.
travis-ci has left #mlpack []