verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
IRCFrEAK has joined #mlpack
IRCFrEAK has left #mlpack []
govg has quit [Ping timeout: 260 seconds]
govg has joined #mlpack
flyingpot has joined #mlpack
madgoat has joined #mlpack
madgoat has left #mlpack []
aashay has joined #mlpack
vpal has joined #mlpack
vivekp has quit [Ping timeout: 268 seconds]
vpal has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
vinayakvivek has joined #mlpack
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
aashay has quit [Quit: Connection closed for inactivity]
flyingpot has quit [Ping timeout: 240 seconds]
flyingpot has joined #mlpack
vinayakvivek has quit [Quit: Connection closed for inactivity]
vinayakvivek has joined #mlpack
vivekp has quit [Ping timeout: 260 seconds]
flyingpot has quit [Ping timeout: 240 seconds]
aashay has joined #mlpack
vivekp has joined #mlpack
vivekp is now known as vpal
vpal is now known as vivekp
flyingpot has joined #mlpack
flyingpot has quit [Ping timeout: 260 seconds]
< vivekp> Hi, I was going through the code of adam optimizer to get an idea about how things are implemented there
< vivekp> and actually I can't fully understand the expression at line 122 in adam_impl.hpp
< vivekp> if I understand correctly, according to the algorithm given in the paper, I think the term "mean / (arma::sqrt(variance) + eps)"
< vivekp> is missing sqrt(biasCorrection2) in multiplication with epsilon.
< vivekp> so please correct me if wrong, but the correct expression would be "mean / (arma::sqrt(variance) + arma::sqrt(biasCorrection2) * eps)"
vinayakvivek has quit [Quit: Connection closed for inactivity]
< vivekp> ^for that particular term in the expression in line 122
< zoq> vivekp: Hello, I guess we are talking about version 9, where the update parameters step is: a * m'_t / (sqrt(v'_t) + e)
< zoq> maybe I missed something?
govg has quit [Ping timeout: 260 seconds]
< vivekp> zoq: yes v9, that is correct.
< vivekp> I'm actually confused a bit
< vivekp> we actually never calculate m'_t and v'_t explicitly as such but use two terms i.e. biascorrection1 and
< vivekp> biascorrection2 in the update parameter step for m'_t and v'_t respectively.
< vivekp> I actually did some calculations on paper. At first, I was thinking that
< vivekp> maybe we ignore eps by treating it as approaching zero, but I quickly realized that was a wrong assumption
< zoq> I think you are right, it would be clearer if we renamed the two parameters; that way it would be easier to follow the paper.
< zoq> also, e is just for stability, you can ignore the term if you like
< vivekp> yeah, so basically we have that step currently like this:
< vivekp> a * (sqrt(biasCorrection2) / biasCorrection1) * (m_t / (sqrt(v_t) + eps))
< vivekp> which I think should really be:
< vivekp> a * (sqrt(biasCorrection2) / biasCorrection1) * (m_t / (sqrt(v_t) + (sqrt * biasCorrection2) * eps))
< zoq> where does (sqrt(v_t) + (sqrt * biasCorrection2) come from?
< vivekp> we already have sqrt(v_t), and sqrt(biasCorrection2) comes from taking a common denominator in the term a * m'_t / (sqrt(v'_t) + e) in the update parameters step
< vivekp> v'_t is v_t / biasCorrection2
< vivekp> oops, made a typo here " (sqrt * biasCorrection2) " -- I meant sqrt(biasCorrection2)
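The algebra behind vivekp's claim can be written out as follows. This is a sketch that assumes biasCorrection1 and biasCorrection2 stand for 1 - beta1^t and 1 - beta2^t, written here as c_1 and c_2 (shorthand introduced only for this derivation), consistent with "v'_t is v_t / biasCorrection2" above.

```latex
\begin{align*}
\hat{m}_t &= \frac{m_t}{c_1}, \qquad
\hat{v}_t = \frac{v_t}{c_2}, \qquad
c_1 = 1 - \beta_1^t, \quad c_2 = 1 - \beta_2^t \\[4pt]
\alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
  &= \alpha \, \frac{m_t / c_1}{\sqrt{v_t} / \sqrt{c_2} + \epsilon} \\[4pt]
  &= \alpha \, \frac{\sqrt{c_2}}{c_1} \cdot
     \frac{m_t}{\sqrt{v_t} + \sqrt{c_2} \, \epsilon}
\end{align*}
```

Multiplying numerator and denominator by sqrt(c_2) in the last step is what puts the sqrt(biasCorrection2) factor in front of eps.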
govg has joined #mlpack
< zoq> I think you are right, but since e is small it shouldn't make a difference.
< zoq> I just checked whether e.g. TensorFlow does the same thing, and it looks like they do: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/training/adam.py
< vivekp> okay, I see
< zoq> anyway I think you are right
< vivekp> that was my initial thought, that eps shouldn't make a big difference here, but I still wanted to clarify to be sure. Thanks :)
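To get a feel for how little the eps term matters in practice, the two forms can be compared numerically. The following is a minimal sketch in plain Python, not mlpack code: the function names, the `scale_eps` flag, and the sample gradient are all made up for illustration, and the default hyperparameters are the ones suggested in the Adam paper.

```python
import math

def step_paper(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Step size computed with explicit bias-corrected m'_t and v'_t."""
    bias_correction1 = 1.0 - beta1 ** t
    bias_correction2 = 1.0 - beta2 ** t
    m_hat = m / bias_correction1
    v_hat = v / bias_correction2
    return alpha * m_hat / (math.sqrt(v_hat) + eps)

def step_combined(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
                  scale_eps=False):
    """Combined form; scale_eps (hypothetical flag) multiplies eps by
    sqrt(biasCorrection2), which makes it algebraically equal to step_paper."""
    bias_correction1 = 1.0 - beta1 ** t
    bias_correction2 = 1.0 - beta2 ** t
    eps_term = math.sqrt(bias_correction2) * eps if scale_eps else eps
    return (alpha * (math.sqrt(bias_correction2) / bias_correction1)
            * m / (math.sqrt(v) + eps_term))

# First iteration with an arbitrary sample gradient g (moments start at zero).
g = 0.5
m = (1.0 - 0.9) * g
v = (1.0 - 0.999) * g * g

exact = step_paper(m, v, 1)
scaled = step_combined(m, v, 1, scale_eps=True)
unscaled = step_combined(m, v, 1)
relative_gap = abs(unscaled - exact) / exact

print("paper form:           ", exact)
print("combined, scaled eps: ", scaled)
print("combined, plain eps:  ", unscaled)
print("relative gap:         ", relative_gap)
```

For a gradient of this magnitude the gap is tiny, which matches zoq's remark. The gap grows when sqrt(v) is comparable to eps, so the approximation is not harmless for every input.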
< zoq> I have to go through the latest paper (v9) and see what they changed.
flyingpot has joined #mlpack
< zoq> I haven't noticed they updated it
< zoq> Maybe we should add a comment that points out that it's an approximation and says what the exact term should be, what do you think?
< vivekp> I actually cross checked v8 with v9, and as far as algorithm 1 is concerned I didn't find any difference in that part
< vivekp> zoq: yeah, that sounds like a good idea
govg has quit [Ping timeout: 240 seconds]
flyingpot has quit [Ping timeout: 260 seconds]
< zoq> vivekp: If you like you can open a PR, but don't feel obligated; I can also make the change, including the parameter naming.
< vivekp> Sure, will do. Just to clarify -- should we explicitly calculate m'_t and v'_t before the update parameters step, as done in the paper?
< vivekp> uh, I make a lot of typos
< zoq> hm, I think combining the steps is fine, and probably faster.
< vivekp> yeah, you are right.
< vivekp> Anyway, what names do you suggest for the parameters?
< vivekp> zoq: also, in the paper they proposed an extension to Adam, i.e. AdaMax, which we don't have in mlpack yet.
< vivekp> So, I'd like to implement that bit if it's feasible and sounds like a good idea. What do you think?
< zoq> I would go with beta1 and v and m; in mlpack we usually don't use underscores, instead we use camel casing for all names. Do you think we can discard the time index?
< zoq> The adamax idea sounds interesting, maybe we can combine the two methods into one and just use a flag?
< vivekp> Sorry I missed something, what is the time index?
< vivekp> and yes, combining the two methods sounds good.
< vivekp> I was thinking of going with a separate implementation in a new file, but that would mean a lot of duplicated code between adam and adamax.
< vivekp> I think combining the two methods is a better idea.
< zoq> I was talking about the t in m_t, i.e. m at time t = ... I think we can drop the index and just go with m.
< vivekp> oh, right. Yes, I think m is fine
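The "one optimizer, one flag" idea discussed above could look roughly like this. It is a scalar sketch in plain Python under stated assumptions, not the mlpack implementation: the class name `AdamSketch`, the `ada_max` flag, and the method names are hypothetical, the state is named m and v with the time index dropped as suggested, and the small eps in the AdaMax denominator is an added guard against division by zero that is not part of the paper's AdaMax update.

```python
import math

class AdamSketch:
    """Scalar Adam / AdaMax update selected by a flag (illustrative only)."""

    def __init__(self, step_size=0.001, beta1=0.9, beta2=0.999, eps=1e-8,
                 ada_max=False):
        self.step_size = step_size
        self.beta1 = beta1
        self.beta2 = beta2
        self.eps = eps
        self.ada_max = ada_max
        self.m = 0.0        # first moment estimate
        self.v = 0.0        # second moment (Adam) or infinity norm (AdaMax)
        self.iteration = 0

    def update(self, param, gradient):
        """Return the parameter value after one optimization step."""
        self.iteration += 1
        # The first moment update is shared by both variants.
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * gradient
        bias_correction1 = 1.0 - self.beta1 ** self.iteration

        if self.ada_max:
            # AdaMax: exponentially weighted infinity norm, no second
            # bias correction. eps is an added guard, see the lead-in.
            self.v = max(self.beta2 * self.v, abs(gradient))
            return param - ((self.step_size / bias_correction1)
                            * self.m / (self.v + self.eps))

        # Adam: combined bias-corrected step with eps scaled by
        # sqrt(biasCorrection2), the exact form derived in the discussion.
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * gradient ** 2
        bias_correction2 = 1.0 - self.beta2 ** self.iteration
        scale = math.sqrt(bias_correction2)
        return param - (self.step_size * (scale / bias_correction1)
                        * self.m / (math.sqrt(self.v) + scale * self.eps))

adam_result = AdamSketch().update(1.0, 2.0)
adamax_result = AdamSketch(ada_max=True).update(1.0, 2.0)
print(adam_result, adamax_result)
```

Only the second-moment update and the final division differ between the two branches, which is the duplication a separate file would have repeated.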
govg has joined #mlpack
flyingpot has joined #mlpack
flyingpot has quit [Ping timeout: 260 seconds]
Kirizaki has joined #mlpack
Kirizaki has left #mlpack []
< rcurtin> oops, this is not my browser :)
< rcurtin> bah! laggy ssh connection, I hit "up and enter" accidentally, and it resends the message
< rcurtin> I should leave my screen session on the irssi window less often I guess
< zoq> rcurtin: There is an interesting OpenSSH feature ControlMaster, that could have helped in your situation, not sure.
< rcurtin> hmm, that is interesting, I think I may have to look into this!
< rcurtin> also, I figured out the issue with the ultrasparc t5220s---I was simply running the non-smp kernel when I needed to run the smp kernel
< rcurtin> once I realized this, I got the installation on two of those systems (aunty.mlpack.org and ironbar.mlpack.org) finished, and I am in the process of getting them connected to masterblaster for jenkins builds now
< rcurtin> I copied the account credentials from /etc/shadow, so if you want to log in to aunty or ironbar you can just ssh with the same credentials as masterblaster
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1835 (master - 26e35e9 : Marcus Edel): The build is still failing.
travis-ci has left #mlpack []
< zoq> rcurtin: sounds great, let's see if it works.
< zoq> btw. I should watch Mad Max :)
< zoq> rcurtin: Nice hardware, do you monitor the systems somehow e.g. Nagios, Munin, etc.?
< rcurtin> zoq: nope, haven't set up nagios or anything, I don't have any monitoring infrastructure in place :)
< zoq> rcurtin: Maybe another point that could be interesting for the infrastructure project.
flyingpot has joined #mlpack
< zoq> rcurtin: That reminds me, since we switched from SQLite to MySQL for the benchmark system, so that we can run multiple benchmarks at the same time, the views haven't been updated.
< zoq> I've updated the benchmark views so that they can be used with SQLite and MySQL. Since we can't use javascript to communicate with the MySQL database directly, I used a small PHP script that is called within the javascript code. I know, PHP...
< zoq> I'm not sure if you'd like to run PHP on the mlpack.org machine.
< zoq> I can run the PHP script on my FreeBSD machine, we just have to update the user permissions of the MySQL user. Btw., this is the PHP script I'm talking about: https://github.com/zoq/benchmarks/blob/master/reports/php/mysql_wrapper.php
flyingpot has quit [Ping timeout: 260 seconds]
flyingpot has joined #mlpack
flyingpot has quit [Ping timeout: 260 seconds]
GjjvdBurg has joined #mlpack
GjjvdBurg has quit [Quit: Page closed]