#mlpack on 2017-02-20 — irc logs at libera.irclog.whitequark.org

2015-01-15 23:05 verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/

02:55 IRCFrEAK has joined #mlpack

02:55 IRCFrEAK has left #mlpack []

04:06 govg has quit [Ping timeout: 260 seconds]

04:48 govg has joined #mlpack

05:58 flyingpot has joined #mlpack

06:02 madgoat has joined #mlpack

06:02 madgoat has left #mlpack []

06:24 aashay has joined #mlpack

06:30 vpal has joined #mlpack

06:31 vivekp has quit [Ping timeout: 268 seconds]

06:52 vpal has quit [Ping timeout: 240 seconds]

06:53 vivekp has joined #mlpack

08:04 vinayakvivek has joined #mlpack

08:53 govg has quit [Ping timeout: 240 seconds]

09:00 govg has joined #mlpack

09:26 aashay has quit [Quit: Connection closed for inactivity]

09:26 flyingpot has quit [Ping timeout: 240 seconds]

10:19 flyingpot has joined #mlpack

10:25 vinayakvivek has quit [Quit: Connection closed for inactivity]

12:08 vinayakvivek has joined #mlpack

12:50 vivekp has quit [Ping timeout: 260 seconds]

13:08 flyingpot has quit [Ping timeout: 240 seconds]

13:31 aashay has joined #mlpack

13:32 vivekp has joined #mlpack

13:47 vivekp is now known as vpal

13:48 vpal is now known as vivekp

14:05 flyingpot has joined #mlpack

14:11 flyingpot has quit [Ping timeout: 260 seconds]

14:11 < vivekp> Hi, I was going through the code of the Adam optimizer to get an idea about how things are implemented there

14:12 < vivekp> and actually I can't fully understand the expression at line 122 in adam_impl.hpp

14:13 < vivekp> if I understand correctly, according to the algorithm given in the paper, I think the term "mean / (arma::sqrt(variance) + eps)"

14:13 < vivekp> is missing sqrt(biasCorrection2) in multiplication with epsilon.

14:13 < vivekp> so please correct me if wrong, but the correct expression would be "mean / (arma::sqrt(variance) + arma::sqrt(biasCorrection2) * eps)"

14:15 vinayakvivek has quit [Quit: Connection closed for inactivity]

14:16 < vivekp> ^for that particular term in the expression in line 122

14:49 < zoq> vivekp: Hello, I guess we are talking about version 9 and the update parameters step is: a * m'_t / (sqrt(v'_t) + e)

15:01 < zoq> maybe I missed something?

15:12 govg has quit [Ping timeout: 260 seconds]

15:13 < vivekp> zoq: yes v9, that is correct.

15:13 < vivekp> I'm actually confused a bit

15:13 < vivekp> we actually never calculate m'_t and v'_t explicitly as such but use two terms, i.e. biasCorrection1 and

15:13 < vivekp> biasCorrection2, in the update parameters step for m'_t and v'_t respectively.

15:14 < vivekp> I actually did some calculations on paper. At first, I was thinking that

15:14 < vivekp> maybe we ignore eps by treating it as approaching zero, but I quickly realized that was a wrong assumption

15:19 < zoq> I think you are right, it would be clearer if we renamed the two parameters; that way it would be easier to follow the paper.

15:21 < zoq> also, e is just for stability, you can ignore the term if you like

15:25 < vivekp> yeah, so basically we have that step currently like this:

15:26 < vivekp> a * (sqrt(biasCorrection2) / biasCorrection1) * (m_t / (sqrt(v_t) + eps))

15:26 < vivekp> which I think should really be:

15:27 < vivekp> a * (sqrt(biasCorrection2) / biasCorrection1) * (m_t / (sqrt(v_t) + (sqrt * biasCorrection2) * eps))

15:30 < zoq> where does (sqrt(v_t) + (sqrt * biasCorrection2) come from?

15:33 < vivekp> we already have sqrt(v_t), and sqrt(biasCorrection2) comes from taking the LCM of the denominator in the term a * m'_t / (sqrt(v'_t) + e) in the update parameters step

15:34 < vivekp> v'_t is v_t / biasCorrection2

15:43 < vivekp> oops, made a typo here " (sqrt * biasCorrection2) " -- I meant sqrt(biasCorrection2)
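
The algebra behind vivekp's point can be written out as follows. This is a sketch that assumes biasCorrection1 = 1 - beta1^t and biasCorrection2 = 1 - beta2^t (abbreviated b_1 and b_2 here, names that are not in the log), so that m'_t = m_t / b_1 and v'_t = v_t / b_2 as stated above.

```latex
\begin{align*}
\alpha \, \frac{m'_t}{\sqrt{v'_t} + \epsilon}
  &= \alpha \, \frac{m_t / b_1}{\sqrt{v_t} / \sqrt{b_2} + \epsilon} \\
  &= \alpha \, \frac{m_t / b_1}{\left(\sqrt{v_t} + \sqrt{b_2}\,\epsilon\right) / \sqrt{b_2}} \\
  &= \alpha \, \frac{\sqrt{b_2}}{b_1} \cdot \frac{m_t}{\sqrt{v_t} + \sqrt{b_2}\,\epsilon}
\end{align*}
```

Under those assumptions, the combined form matches the paper's update only when eps is multiplied by sqrt(biasCorrection2); with a bare eps the two differ by a term of order eps.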

15:44 govg has joined #mlpack

15:47 < zoq> I think you are right, but since e is small it shouldn't make a difference.

15:49 < zoq> I just checked if e.g. TensorFlow does the same thing, and it looks like they do: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/adam.py
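
To get a feel for how small the difference is, here is a minimal numeric sketch in plain Python. It is not the mlpack code: the function names, the `scale_eps` flag, and the sample values of m, v, and t are all made up for illustration, and the defaults are the ones suggested in the paper.

```python
import math


def adam_step_exact(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Step as written in the paper: alpha * m'_t / (sqrt(v'_t) + eps)."""
    bias_correction1 = 1.0 - beta1 ** t
    bias_correction2 = 1.0 - beta2 ** t
    m_hat = m / bias_correction1  # m'_t
    v_hat = v / bias_correction2  # v'_t
    return alpha * m_hat / (math.sqrt(v_hat) + eps)


def adam_step_combined(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999,
                       eps=1e-8, scale_eps=False):
    """Combined form from the discussion; scale_eps is a hypothetical flag."""
    bias_correction1 = 1.0 - beta1 ** t
    bias_correction2 = 1.0 - beta2 ** t
    step_size = alpha * math.sqrt(bias_correction2) / bias_correction1
    # With scale_eps=True, eps is multiplied by sqrt(biasCorrection2).
    eps_term = math.sqrt(bias_correction2) * eps if scale_eps else eps
    return step_size * m / (math.sqrt(v) + eps_term)


# Illustrative values only (not taken from any real run).
m, v, t = 0.01, 1e-4, 1
exact = adam_step_exact(m, v, t)
scaled = adam_step_combined(m, v, t, scale_eps=True)
unscaled = adam_step_combined(m, v, t, scale_eps=False)
relative_gap = abs(exact - unscaled) / abs(exact)
print(exact, scaled, unscaled, relative_gap)
```

For these inputs the scaled form agrees with the paper's expression up to rounding, while the unscaled form is off by a small relative amount. The gap grows when sqrt(v) is comparable to eps, which is the case where the approximation is least accurate.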

15:50 < vivekp> okay, I see

15:51 < zoq> anyway I think you are right

15:51 < vivekp> that was my initial thought, that eps shouldn't make a big difference, but I still wanted to clarify to be sure. Thanks :)

15:53 < zoq> I have to go through the latest paper (v9) and see what they changed.

15:54 flyingpot has joined #mlpack

15:54 < zoq> I hadn't noticed they updated it

15:55 < zoq> Maybe we should add a comment that points out that it's an approximation and states what the right term should be, what do you think?

15:55 < vivekp> I actually cross checked v8 with v9, and as far as algorithm 1 is concerned I didn't find any difference in that part

15:56 < vivekp> zoq: yeah, that sounds like a good idea

15:58 govg has quit [Ping timeout: 240 seconds]

15:58 flyingpot has quit [Ping timeout: 260 seconds]

16:01 < zoq> vivekp: If you like you can open a PR, don't feel obligated, I can also make the change, including the parameter naming.

16:04 < vivekp> Sure, will do. Just to clarify -- should we explicitly calculate m'_t and v'_t before the update parameters step, as done in the paper?

16:04 < vivekp> uh, I make a lot of typos

16:09 < zoq> hm, I think combining the steps is fine, and probably faster.

16:14 < vivekp> yeah, you are right.

16:14 < vivekp> Anyway, what names do you suggest for the parameters?

16:15 < vivekp> zoq: also, in the paper they proposed an extension to Adam, i.e. AdaMax, as well, which we don't have in mlpack yet.

16:16 < vivekp> So, I'd like to implement that bit if it's plausible and sounds like a good idea. What do you think?

16:19 < zoq> I would go with beta1, v, and m; in mlpack we usually don't use underscores, instead we use camel casing for all names. Do you think we can discard the time index?

16:20 < zoq> The adamax idea sounds interesting, maybe we can combine the two methods into one and just use a flag?

16:29 < vivekp> Sorry I missed something, what is the time index?

16:29 < vivekp> and yes, combining the two methods sounds good.

16:30 < vivekp> I was thinking of going with a separate implementation in a new file, but that would mean a lot of duplication in the code between Adam and AdaMax.

16:30 < vivekp> I think combining the two methods is a better idea.

16:31 < zoq> I was talking about the t in m_t, i.e. m at time t = ... I think we can drop the index and just go with m.
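
A rough sketch of the "one optimizer, one flag" idea, in plain Python rather than mlpack's C++. The class name `AdamSketch` and the `adamax` flag are hypothetical. The AdaMax branch follows Algorithm 2 of the paper, except that eps is added to the denominator to avoid dividing by zero, which is an assumption of this sketch and not part of the paper's algorithm.

```python
import math


class AdamSketch:
    """Toy Adam/AdaMax update sharing one code path (hypothetical API)."""

    def __init__(self, step_size=0.001, beta1=0.9, beta2=0.999,
                 eps=1e-8, adamax=False):
        self.step_size = step_size
        self.beta1 = beta1
        self.beta2 = beta2
        self.eps = eps
        self.adamax = adamax  # hypothetical flag selecting the AdaMax rule
        self.m = None  # first moment estimate
        self.v = None  # second moment (Adam) or infinity-norm estimate (AdaMax)
        self.t = 0

    def update(self, params, grads):
        """Return new parameters after one step; params and grads are lists."""
        if self.m is None:
            self.m = [0.0] * len(params)
            self.v = [0.0] * len(params)
        self.t += 1
        bias_correction1 = 1.0 - self.beta1 ** self.t
        bias_correction2 = 1.0 - self.beta2 ** self.t
        new_params = []
        for i, (p, g) in enumerate(zip(params, grads)):
            # The first moment update is shared by both methods.
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * g
            if self.adamax:
                # AdaMax: exponentially weighted infinity norm.
                self.v[i] = max(self.beta2 * self.v[i], abs(g))
                # eps here is this sketch's assumption, not in the paper.
                step = (self.step_size / bias_correction1) * \
                    self.m[i] / (self.v[i] + self.eps)
            else:
                # Adam: combined bias-correction form discussed above.
                self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * g * g
                step = (self.step_size * math.sqrt(bias_correction2)
                        / bias_correction1) * \
                    self.m[i] / (math.sqrt(self.v[i]) + self.eps)
            new_params.append(p - step)
        return new_params


# One step on f(x) = x^2 (gradient 2x), starting from x = 1.0.
adam_result = AdamSketch(adamax=False).update([1.0], [2.0])
adamax_result = AdamSketch(adamax=True).update([1.0], [2.0])
print(adam_result, adamax_result)
```

Only the second-moment update and the final division differ between the two branches, which is why a flag avoids most of the duplication a separate file would bring.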

16:34 < vivekp> oh, right. Yes, I think m is fine

17:19 govg has joined #mlpack

17:41 flyingpot has joined #mlpack

17:46 flyingpot has quit [Ping timeout: 260 seconds]

17:46 Kirizaki has joined #mlpack

17:46 Kirizaki has left #mlpack []

17:57 < rcurtin> oops, this is not my browser :)

17:57 < rcurtin> bah! laggy ssh connection, I hit "up and enter" accidentally, and it resends the message

17:57 < rcurtin> I should leave my screen session on the irssi window less often I guess

18:06 < zoq> rcurtin: There is an interesting OpenSSH feature ControlMaster, that could have helped in your situation, not sure.

18:24 < rcurtin> hmm, that is interesting, I think I may have to look into this!
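
For reference, ControlMaster is enabled per host in the OpenSSH client configuration. The fragment below is a minimal sketch: the host alias and hostname are placeholders, not machines from this log, and whether connection sharing would actually have helped with the laggy session is not established here.

```
# ~/.ssh/config
# "example-host" and "host.example.org" are placeholders.
Host example-host
    HostName host.example.org
    # Reuse one TCP connection for all sessions to this host.
    ControlMaster auto
    # Socket file for the shared connection (%r user, %h host, %p port).
    ControlPath ~/.ssh/cm-%r@%h:%p
    # Keep the master connection open for 10 minutes after the last session.
    ControlPersist 10m
```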

18:24 < rcurtin> also, I figured out the issue with the ultrasparc t5220s---I was simply running the non-smp kernel when I needed to run the smp kernel

18:24 < rcurtin> once I realized this, I got the installation on two of those systems (aunty.mlpack.org and ironbar.mlpack.org) finished, and I am in the process of getting them connected to masterblaster for jenkins builds now

18:25 < rcurtin> I copied the account credentials from /etc/shadow so if you want to login to aunty or ironbar you can just ssh with the same credentials as masterblaster

18:31 travis-ci has joined #mlpack

18:31 < travis-ci> mlpack/mlpack#1835 (master - 26e35e9 : Marcus Edel): The build is still failing.

18:31 < travis-ci> Change view : https://github.com/mlpack/mlpack/compare/3e747462b73e...26e35e9ec1aa

18:31 < travis-ci> Build details : https://travis-ci.org/mlpack/mlpack/builds/203536159

18:31 travis-ci has left #mlpack []

18:46 < zoq> rcurtin: sounds great, let's see if it works.

18:48 < zoq> btw. I should watch Mad Max :)

18:52 < zoq> rcurtin: Nice hardware, do you monitor the systems somehow e.g. Nagios, Munin, etc.?

19:08 < rcurtin> zoq: nope, haven't set up nagios or anything, I don't have any monitoring infrastructure in place :)

19:21 < zoq> rcurtin: Maybe another point that could be interesting for the infrastructure project.

19:30 flyingpot has joined #mlpack

19:33 < zoq> rcurtin: That reminds me, since we switched from SQLite to MySQL for the benchmark system (so that we can run multiple benchmarks at the same time), the views haven't been updated.

19:33 < zoq> I've updated the benchmark views so that they can be used with SQLite and MySQL. Since we can't use JavaScript to communicate with the MySQL database, I used a small PHP script that is called within the JavaScript code. I know PHP...

19:33 < zoq> I'm not sure if you'd like to run PHP on the mlpack.org machine.

19:33 < zoq> I can run the PHP script on my FreeBSD machine, we just have to update the user permissions of the MySQL user. Btw., this is the PHP script I'm talking about: https://github.com/zoq/benchmarks/blob/master/reports/php/mysql_wrapper.php

19:34 flyingpot has quit [Ping timeout: 260 seconds]

21:18 flyingpot has joined #mlpack

21:23 flyingpot has quit [Ping timeout: 260 seconds]

22:18 GjjvdBurg has joined #mlpack

23:42 GjjvdBurg has quit [Quit: Page closed]