verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
IRCFrEAK has joined #mlpack
IRCFrEAK has left #mlpack []
govg has quit [Ping timeout: 260 seconds]
govg has joined #mlpack
flyingpot has joined #mlpack
madgoat has joined #mlpack
madgoat has left #mlpack []
aashay has joined #mlpack
vpal has joined #mlpack
vivekp has quit [Ping timeout: 268 seconds]
vpal has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
vinayakvivek has joined #mlpack
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
aashay has quit [Quit: Connection closed for inactivity]
flyingpot has quit [Ping timeout: 240 seconds]
flyingpot has joined #mlpack
vinayakvivek has quit [Quit: Connection closed for inactivity]
vinayakvivek has joined #mlpack
vivekp has quit [Ping timeout: 260 seconds]
flyingpot has quit [Ping timeout: 240 seconds]
aashay has joined #mlpack
vivekp has joined #mlpack
vivekp is now known as vpal
vpal is now known as vivekp
flyingpot has joined #mlpack
flyingpot has quit [Ping timeout: 260 seconds]
< vivekp>
Hi, I was going through the code of adam optimizer to get an idea about how things are implemented there
< vivekp>
and actually I can't fully understand the expression at line 122 in adam_impl.hpp
< vivekp>
if I understand correctly, according to the algorithm given in the paper, I think the term "mean / (arma::sqrt(variance) + eps)"
< vivekp>
is missing sqrt(biasCorrection2) in multiplication with epsilon.
< vivekp>
so please correct me if wrong, but the correct expression would be "mean / (arma::sqrt(variance) + arma::sqrt(biasCorrection2) * eps)"
vinayakvivek has quit [Quit: Connection closed for inactivity]
< vivekp>
^for that particular term in the expression in line 122
< zoq>
vivekp: Hello, I guess we are talking about version 9, and the update parameters step is: a * m'_t / (sqrt(v'_t) + e)
< zoq>
maybe I missed something?
govg has quit [Ping timeout: 260 seconds]
< vivekp>
zoq: yes v9, that is correct.
< vivekp>
I'm actually confused a bit
< vivekp>
we actually never calculate m'_t and v'_t explicitly as such but use two terms i.e. biascorrection1 and
< vivekp>
biascorrection2 in the update parameter step for m'_t and v'_t respectively.
< vivekp>
I actually did some calculations on paper. At first, I was thinking that
< vivekp>
maybe we ignore eps by treating it as approaching zero, but I quickly realized that was a wrong assumption
< zoq>
I think you are right, it would be clearer if we renamed the two parameters; that way it would be easier to follow the paper.
< zoq>
also, e is just for stability, you can ignore the term if you like
< vivekp>
yeah, so basically we have that step currently like this:
< zoq>
where does (sqrt(v_t) + (sqrt * biasCorrection2) come from?
< vivekp>
we already have sqrt(v_t), and sqrt(biasCorrection2) comes from taking a common denominator in the term a * m'_t / (sqrt(v'_t) + e) in the update parameters step
< vivekp>
v'_t is v_t / biasCorrection2
< vivekp>
oops, made a typo here " (sqrt * biasCorrection2) " -- I meant sqrt(biasCorrection2)
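For reference, the algebra vivekp describes can be written out as follows. This is a sketch that assumes biasCorrection1 = 1 - beta1^t and biasCorrection2 = 1 - beta2^t (written b_1 and b_2 below), consistent with "v'_t is v_t / biasCorrection2" above; alpha and epsilon are the a and e of the update step.

```latex
\begin{aligned}
\hat{m}_t &= \frac{m_t}{b_1}, \qquad \hat{v}_t = \frac{v_t}{b_2}, \\
\alpha \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}
  &= \alpha \, \frac{m_t / b_1}{\sqrt{v_t} / \sqrt{b_2} + \epsilon}
   = \alpha \, \frac{\sqrt{b_2}}{b_1} \cdot \frac{m_t}{\sqrt{v_t} + \sqrt{b_2}\,\epsilon}.
\end{aligned}
```

So folding the bias corrections into the step size leaves a factor of sqrt(b_2) on epsilon, which is the term being asked about.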
govg has joined #mlpack
< zoq>
I think, you are right, but since e is small it shouldn't make a difference.
< vivekp>
that was my initial thought, that eps shouldn't make a big difference, but I still wanted to clarify to be sure. Thanks :)
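A quick numerical check of the point above, as a minimal scalar sketch. The function names and the sample values of m, v and t are made up for illustration and are not taken from adam_impl.hpp; the defaults are the ones suggested in the Adam paper.

```python
import math


def adam_step_explicit(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Step size computed as in the paper: bias-correct m and v first."""
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return alpha * m_hat / (math.sqrt(v_hat) + eps)


def adam_step_combined(m, v, t, alpha=0.001, beta1=0.9, beta2=0.999,
                       eps=1e-8, scale_eps=False):
    """Step size with the bias corrections folded into the learning rate.

    scale_eps=True multiplies eps by sqrt(biasCorrection2), which makes the
    result algebraically identical to adam_step_explicit; scale_eps=False
    uses the plain eps discussed in the conversation.
    """
    bias_correction1 = 1.0 - beta1 ** t
    bias_correction2 = 1.0 - beta2 ** t
    eps_term = math.sqrt(bias_correction2) * eps if scale_eps else eps
    scaled_alpha = alpha * math.sqrt(bias_correction2) / bias_correction1
    return scaled_alpha * m / (math.sqrt(v) + eps_term)


# Hypothetical moment estimates, chosen only for illustration.
m, v, t = 0.1, 0.01, 1
explicit = adam_step_explicit(m, v, t)
exact = adam_step_combined(m, v, t, scale_eps=True)
approx = adam_step_combined(m, v, t, scale_eps=False)
relative_gap = abs(approx - explicit) / abs(explicit)
```

For these values the unscaled eps changes the step only marginally, because sqrt(v) dominates the denominator. The gap grows when sqrt(v) is comparable to eps, so "shouldn't make a difference" holds only while the variance estimate is not extremely small.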
< zoq>
I have to go through the latest paper (v9) and see what they changed.
flyingpot has joined #mlpack
< zoq>
I hadn't noticed they updated it
< zoq>
Maybe we should add a comment that points out that it's an approximation and the right term should be, what do you think?
< vivekp>
I actually cross checked v8 with v9, and as far as algorithm 1 is concerned I didn't find any difference in that part
< vivekp>
zoq: yeah, that sounds like a good idea
govg has quit [Ping timeout: 240 seconds]
flyingpot has quit [Ping timeout: 260 seconds]
< zoq>
vivekp: If you like you can open a PR, but don't feel obligated; I can also make the change, including the parameter naming.
< vivekp>
Sure, will do. Just to be clarify -- should we explcitly calculate m'_t and v'_t before update parameters step as done in the paper?
< vivekp>
uh, I make a lot of typos
< zoq>
hm, I think combining the steps is fine, and probably faster.
< vivekp>
yeah, you are right.
< vivekp>
Anyway, what names do you suggest for the parameters?
< vivekp>
zoq: also, in the paper they proposed an extension to Adam, i.e. AdaMax, which we don't have in mlpack yet.
< vivekp>
So, I'd like to implement that bit if it's plausible and sounds like a good idea. What do you think?
< zoq>
I would go with beta1 and v and m. In mlpack we usually don't use underscores; instead we use camel casing for all names. Do you think we can discard the time index?
< zoq>
The adamax idea sounds interesting, maybe we can combine the two methods into one and just use a flag?
< vivekp>
Sorry I missed something, what is the time index?
< vivekp>
and yes, combining the two methods sounds good.
< vivekp>
I was thinking of going with a separate implementation in a new file, but that would mean a lot of duplicated code between adam and adamax.
< vivekp>
I think combining the two methods is a better idea.
< zoq>
I was talking about the t in m_t, i.e. m at time t = ... I think we can drop the index and just go with m.
< vivekp>
oh, right. Yes, I think m is fine
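One way the "single class plus a flag" idea could look, as a scalar sketch only. The class name AdamSketch and the flag name ada_max are hypothetical, not mlpack API. The update rules follow Algorithms 1 and 2 of the Adam paper, with the Adam branch using the combined form and the plain eps discussed earlier.

```python
import math


class AdamSketch:
    """Scalar Adam/AdaMax update sharing one first-moment estimate.

    Hypothetical illustration; names and structure are assumptions.
    """

    def __init__(self, step_size=0.001, beta1=0.9, beta2=0.999,
                 eps=1e-8, ada_max=False):
        self.step_size = step_size
        self.beta1 = beta1
        self.beta2 = beta2
        self.eps = eps
        self.ada_max = ada_max
        self.m = 0.0  # first moment estimate
        self.v = 0.0  # second moment (Adam) or infinity-norm accumulator (AdaMax)
        self.t = 0    # iteration counter

    def update(self, iterate, gradient):
        """Return the new iterate after one step with the given gradient."""
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * gradient
        bias_correction1 = 1.0 - self.beta1 ** self.t

        if self.ada_max:
            # AdaMax: exponentially weighted infinity norm, no second bias
            # correction. As in the paper there is no eps here, so this
            # divides by zero if every gradient seen so far is exactly 0.
            self.v = max(self.beta2 * self.v, abs(gradient))
            return iterate - (self.step_size / bias_correction1) * self.m / self.v

        # Adam: bias corrections folded into the step size.
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * gradient * gradient
        bias_correction2 = 1.0 - self.beta2 ** self.t
        scaled_step = self.step_size * math.sqrt(bias_correction2) / bias_correction1
        return iterate - scaled_step * self.m / (math.sqrt(self.v) + self.eps)


adam_result = AdamSketch().update(0.0, 1.0)
adamax_result = AdamSketch(ada_max=True).update(0.0, 1.0)
```

Only the second accumulator and the final division differ between the two branches, which is why a flag avoids most of the duplication a separate file would bring.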
govg has joined #mlpack
flyingpot has joined #mlpack
flyingpot has quit [Ping timeout: 260 seconds]
Kirizaki has joined #mlpack
Kirizaki has left #mlpack []
< rcurtin>
oops, this is not my browser :)
< rcurtin>
bah! laggy ssh connection, I hit "up and enter" accidentally, and it resends the message
< rcurtin>
I should leave my screen session on the irssi window less often I guess
< zoq>
rcurtin: There is an interesting OpenSSH feature ControlMaster, that could have helped in your situation, not sure.
< rcurtin>
hmm, that is interesting, I think I may have to look into this!
< rcurtin>
also, I figured out the issue with the ultrasparc t5220s---I was simply running the non-smp kernel when I needed to run the smp kernel
< rcurtin>
once I realized this, I got the installation on two of those systems (aunty.mlpack.org and ironbar.mlpack.org) finished, and I am in the process of getting them connected to masterblaster for jenkins builds now
< rcurtin>
I copied the account credentials from /etc/shadow so if you want to login to aunty or ironbar you can just ssh with the same credentials as masterblaster
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#1835 (master - 26e35e9 : Marcus Edel): The build is still failing.
< zoq>
rcurtin: sounds great, let's see if it works.
< zoq>
btw. I should watch Mad Max :)
< zoq>
rcurtin: Nice hardware, do you monitor the systems somehow e.g. Nagios, Munin, etc.?
< rcurtin>
zoq: nope, haven't set up nagios or anything, I don't have any monitoring infrastructure in place :)
< zoq>
rcurtin: Maybe another point that could be interesting for the infrastructure project.
flyingpot has joined #mlpack
< zoq>
rcurtin: That reminds me, since we switched from SQLite to MySQL for the benchmark system so that we can run multiple benchmarks at the same time, the views haven't been updated.
< zoq>
I've updated the benchmark views so that they can be used with both SQLite and MySQL. Since we can't use JavaScript to communicate with the MySQL database directly, I used a small PHP script that is called from the JavaScript code. I know, PHP...
< zoq>
I'm not sure if you'd like to run PHP on the mlpack.org machine.