ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
vivekp has quit [Ping timeout: 246 seconds]
xiaohong has joined #mlpack
xiaohong has quit [Ping timeout: 260 seconds]
< jenkins-mlpack2> Project docker mlpack weekly build build #54: STILL UNSTABLE in 6 hr 11 min: http://ci.mlpack.org/job/docker%20mlpack%20weekly%20build/54/
sreenik[m] has quit [Remote host closed the connection]
aleixrocks[m] has quit [Remote host closed the connection]
chandramouli_r has quit [Remote host closed the connection]
Sergobot has quit [Write error: Connection reset by peer]
chandramouli_r has joined #mlpack
aleixrocks[m] has joined #mlpack
Sergobot has joined #mlpack
sreenik[m] has joined #mlpack
< jenkins-mlpack2> Project docker mlpack nightly build build #370: STILL UNSTABLE in 3 hr 28 min: http://ci.mlpack.org/job/docker%20mlpack%20nightly%20build/370/
KimSangYeon-DGU has joined #mlpack
KimSangYeon-DGU has quit [Remote host closed the connection]
vivekp has joined #mlpack
< ShikharJ> sakshamB: Toshal: I don't think I'll be able to make it to the meeting today. Let's talk on Monday? If you have any messages, please feel free to leave them in the channel.
< sakshamB> ShikharJ: Alright, that's fine with me. Have a great weekend :)
KimSangYeon-DGU has joined #mlpack
sumedhghaisas has joined #mlpack
< sumedhghaisas> KimSangYeon-DGU: Hey
< sumedhghaisas> Sorry for the delay
< KimSangYeon-DGU> Hi Sumedh!!
< KimSangYeon-DGU> No worries :)
favre49 has joined #mlpack
< KimSangYeon-DGU> I implemented QGMM and found a method to update alpha using NLL + lambda * approx constraint
< KimSangYeon-DGU> Lagrange multiplier
< sumedhghaisas> great. So what lambda value did you use?
< KimSangYeon-DGU> I'll update the equation for alpha optimization soon
< KimSangYeon-DGU> Ah, I treated lambda as a constant
< KimSangYeon-DGU> It cancelled out later
< sumedhghaisas> hmmm... okay, I'm a little confused
< KimSangYeon-DGU> When I implemented QGMM, optimizing alpha seemed to be the main difficulty
< sumedhghaisas> so you implemented NLL + lambda * approx constraint
< KimSangYeon-DGU> Yeah
< KimSangYeon-DGU> I used the equation to calculate the alpha
< sumedhghaisas> and what's the update procedure?
< KimSangYeon-DGU> First,
< KimSangYeon-DGU> I added KMeans clustering to initialize the parameters
< KimSangYeon-DGU> Second, I implemented QGMM except for the alpha optimization
< KimSangYeon-DGU> Third, I derived the alpha optimization
< KimSangYeon-DGU> The second step was done several days ago
< sumedhghaisas> what do you mean QGMM except for alpha optimzation?
< KimSangYeon-DGU> The third step was done recently
< KimSangYeon-DGU> Ahh,
< KimSangYeon-DGU> At first, when I implemented QGMM, I didn't know how to update alpha
< KimSangYeon-DGU> So, I thought about the equation you mentioned
< KimSangYeon-DGU> NLL + lambda * approx constraint
< sumedhghaisas> huh
< sumedhghaisas> so you update which parameters with QGMM?
< KimSangYeon-DGU> Yeah
< KimSangYeon-DGU> Finally, I updated alpha and theta
< KimSangYeon-DGU> but alpha optimization needs to be verified
< KimSangYeon-DGU> Because I just calculated it
< KimSangYeon-DGU> I want to check the equations with you
< sumedhghaisas> wait... still confused. So you initialized all the parameters
< sumedhghaisas> there are means, variances, theta, and alpha
< sumedhghaisas> so you used the equation given in the paper?
< KimSangYeon-DGU> I initialized all the parameters using the KMeans clustering algorithm and then trained the parameters using the equations in my final proposal
< sumedhghaisas> okay ... all the parameters?
< KimSangYeon-DGU> Ahh
< sumedhghaisas> I mean trained all the parameters using the equations?
< KimSangYeon-DGU> Yeah
< sumedhghaisas> so where is NLL + lambda * constraint used?
< KimSangYeon-DGU> for the alpha optimization
< sumedhghaisas> hmmm ... okay I think the code would make me understand it a little better :P
< KimSangYeon-DGU> Oh.. sorry for the confusion...
< sumedhghaisas> So mean, variance, and theta are updated using the equations in your paper
< sumedhghaisas> but for alpha you use the NLL equation?
< KimSangYeon-DGU> Right
< sumedhghaisas> I see.
< sumedhghaisas> so in each iteration of the update, you first update mean, variance, and theta
< sumedhghaisas> and then use the NLL equation to update alpha...
< sumedhghaisas> and then repeat for n iterations
< sumedhghaisas> is that the training?
< KimSangYeon-DGU> Right
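A minimal Python sketch of the loop just confirmed: KMeans initialization, per-iteration Gaussian updates, then the alpha update, repeated for n iterations. Classical-GMM-style E/M steps stand in for the paper's QGMM update equations, which are not quoted in the chat, and the theta phase parameter is omitted; the actual implementation is quantum_emfit.py.

    import numpy as np
    from scipy.stats import multivariate_normal
    from sklearn.cluster import KMeans

    def fit(X, k=2, iters=100):
        # 1. Initialize with KMeans, as in the classical EM pipeline.
        km = KMeans(n_clusters=k, n_init=10).fit(X)
        means = km.cluster_centers_
        covs = [np.cov(X[km.labels_ == j].T) + 1e-6 * np.eye(X.shape[1])
                for j in range(k)]
        alpha = np.full(k, 1.0 / k)
        for _ in range(iters):
            # Responsibilities from the current parameters (E-step stand-in).
            dens = np.stack([a * multivariate_normal(m, c).pdf(X)
                             for a, m, c in zip(alpha, means, covs)], axis=1)
            resp = dens / dens.sum(axis=1, keepdims=True)
            Nk = resp.sum(axis=0)
            # 2. Update means and covariances (stand-ins for the paper's
            #    mean/variance/theta update equations).
            means = (resp.T @ X) / Nk[:, None]
            covs = [(resp[:, j, None] * (X - means[j])).T @ (X - means[j]) / Nk[j]
                    + 1e-6 * np.eye(X.shape[1]) for j in range(k)]
            # 3. Alpha update: with two components whose weights sum to one,
            #    the condition alpha_k / (alpha_k + alpha_k') = N_k / N
            #    quoted later in the chat reduces to N_k / N.
            alpha = Nk / len(X)
        return alpha, means, covs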
< sumedhghaisas> okay :)
< sumedhghaisas> so what are the results?
< KimSangYeon-DGU> Cool
< KimSangYeon-DGU> Our project repository
< KimSangYeon-DGU> I checked it against the classical GMM
< KimSangYeon-DGU> There is one problem. For some specific observations, QGMM violates the constraint (20) in the paper
< KimSangYeon-DGU> To prevent that, I think we need another constraint.
< sumedhghaisas> hmmm.. Sorry I couldn't find the results in that link
< sumedhghaisas> is it in the code itself?
< KimSangYeon-DGU> Oh sorry, I thought you meant the code
< sumedhghaisas> ahh yes that too.
< sumedhghaisas> Which file is it?
< sumedhghaisas> I will go over it a little bit quickly
< KimSangYeon-DGU> quantum_emfit.py, lines 99-101
< KimSangYeon-DGU> From NLL + lambda * constraint, I derived
< KimSangYeon-DGU> (alpha_{k} / (alpha_{k} + alpha_{k'})) = N_{k} / N
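Written out in LaTeX, the objective and the condition KimSangYeon-DGU states, with the approximate constraint left generic as g(alpha, theta) since the exact form of constraint (20) is not quoted in the chat:

    % The Lagrangian objective described above, with the approximate
    % constraint written generically:
    \mathcal{L}(\alpha, \lambda) = \mathrm{NLL}(\alpha, \theta) + \lambda \, g(\alpha, \theta)
    % Setting \partial\mathcal{L}/\partial\alpha_k = 0, lambda cancels out
    % (as noted earlier) and, per the chat, the condition reduces to
    \frac{\alpha_k}{\alpha_k + \alpha_{k'}} = \frac{N_k}{N}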
< sumedhghaisas> hmm... okay firstly is this procedure converging?
< KimSangYeon-DGU> Yeah
< sumedhghaisas> interesting...
< sumedhghaisas> okay, going through the code again in some detail
< KimSangYeon-DGU> Thanks!!
< sumedhghaisas> Just a quick question... in the code, which is the alpha parameter?
< KimSangYeon-DGU> Ah, it is named weights
< KimSangYeon-DGU> I'll change it
< KimSangYeon-DGU> in the QuantumGMM class
< sumedhghaisas> ahh okay that makes sense
< KimSangYeon-DGU> The QuantumGMM class stores the alpha parameter in the 'weights' variable
< sumedhghaisas> also why are you using the KMeans fit?
< KimSangYeon-DGU> Ahh, because the classical EM algorithm uses KMeans to initialize the parameters before training.
< sumedhghaisas> umm... it should also work without. Did you try without?
< KimSangYeon-DGU> Yeah
< sumedhghaisas> For a simple enough dataset it should find the centers
< KimSangYeon-DGU> If we don't use it, it doesn't train well
< KimSangYeon-DGU> When I introduced KMeans to QGMM, the performance increased
< sumedhghaisas> okay. But did you see the means and variance after KMeans finishes?
< sumedhghaisas> maybe the result is already really close?
< KimSangYeon-DGU> I checked it
< KimSangYeon-DGU> It is not close
< sumedhghaisas> interesting...
< KimSangYeon-DGU> You can check the performance by typing "python main.py"
< sumedhghaisas> okay could you give me the means before and after QGMM algorithm?
< KimSangYeon-DGU> Yes
< sumedhghaisas> ahh okay. does it print just after KMeans too?
< KimSangYeon-DGU> Wait a moment
< sumedhghaisas> I am trying to run it but it's missing sklearn.cluster
< KimSangYeon-DGU> Initial mean is zero
< sumedhghaisas> I will install it later and run the file
< KimSangYeon-DGU> Yeah
< KimSangYeon-DGU> Sumedh
< sumedhghaisas> Don't want to keep you awake for too long actually
< KimSangYeon-DGU> The covariance is quite different between KMeans and QGMM
< KimSangYeon-DGU> but the mean is quite similar
< KimSangYeon-DGU> sorry for the confusion, but QGMM is more accurate
< KimSangYeon-DGU> I'm okay :)
< KimSangYeon-DGU> I used two distributions, one representing each class
< KimSangYeon-DGU> The initial means of d1 and d2 are zero
< KimSangYeon-DGU> After KMeans, the mean of d1 is [ 5.15701407 6.36799727 3.09944505 3.18260357 2.03883024]
< KimSangYeon-DGU> the mean of d2 is [ 0.96782953 -0.9264934 -0.01378471 1.00829962 0.89609672]
< KimSangYeon-DGU> After QGMM, the mean of d1 is [ 5.01880026 6.16773633 2.96954073 3.10592078 1.93759082]
< KimSangYeon-DGU> the mean of d2 is [ 0.93731248 -1.00229649 -0.02631165 0.9955708 0.91626373]
< KimSangYeon-DGU> Finally, the actual mean of d1 is [5, 6, 3, 3, 2]
< KimSangYeon-DGU> the actual mean of d2 is [1, -1, 0, 1, 1]
< sumedhghaisas> hmmm... the means are really close I think
< sumedhghaisas> could you try with an initialization a little farther away?
< KimSangYeon-DGU> Yeah, the covariance is really different
< sumedhghaisas> maybe our method only converges when the parameters are already close to correct?
< KimSangYeon-DGU> Hmm..
< sumedhghaisas> that also
< sumedhghaisas> Also I thought the NLL equation would be used to update all the parameters :)
< sumedhghaisas> basically you take gradient with respect to each one
< sumedhghaisas> I have a suspicion that the update equations are wrong...
< KimSangYeon-DGU> Oops... which one??
< sumedhghaisas> ahh no not the code
< sumedhghaisas> so the formulation they use which is
< KimSangYeon-DGU> Ahh
< sumedhghaisas> so you take equation 16 and differentiate it
< sumedhghaisas> with respect to each parameter, right?
< KimSangYeon-DGU> Yeah, right
< sumedhghaisas> but in the derivation the Q is expanded right?
< sumedhghaisas> which I think is wrong. Because Q is just an estimate using the last iteration parameters
< sumedhghaisas> parameters inside Q are not variables but constants
< sumedhghaisas> For example
< sumedhghaisas> if you look at the GMM update, you do the E step to compute estimates, which are used in the M step as constants
< sumedhghaisas> If you look at the derivation of the GMM EM update, you will see the same behavior
< sumedhghaisas> Q is taken as an estimate and not an equation
< KimSangYeon-DGU> Ahh...
< sumedhghaisas> E step and M step are separate in that way
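For reference, the standard GMM EM decomposition sumedhghaisas is describing (a textbook formula, not taken from the QGMM paper): the responsibilities gamma are computed from the old parameters theta^(t) in the E step and then held constant while Q is maximized over the new parameters in the M step.

    % E step: responsibilities from the OLD parameters \theta^{(t)} (constants).
    \gamma_{nk} =
      \frac{\alpha_k^{(t)} \, \mathcal{N}(x_n \mid \mu_k^{(t)}, \Sigma_k^{(t)})}
           {\sum_j \alpha_j^{(t)} \, \mathcal{N}(x_n \mid \mu_j^{(t)}, \Sigma_j^{(t)})}
    % M step: maximize Q over the NEW parameters, with \gamma_{nk} held fixed.
    Q(\theta \mid \theta^{(t)}) =
      \sum_n \sum_k \gamma_{nk}
      \left[ \log \alpha_k + \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \right]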
< KimSangYeon-DGU> So, I should re-derive it
< KimSangYeon-DGU> Oops...
< sumedhghaisas> but if you don't expand the Q you don't get any closed form solution
< sumedhghaisas> I tried that
< KimSangYeon-DGU> Hmm...
< KimSangYeon-DGU> When I used the equations
< KimSangYeon-DGU> the covariances were trained well
< KimSangYeon-DGU> but KMeans alone didn't get them right
< sumedhghaisas> Now, at the minimum point, the estimate and the update should give the same answers
< sumedhghaisas> but away from it we don't get the closed-form solution
< sumedhghaisas> also in your derivation
< sumedhghaisas> i mean their derivation as well
< sumedhghaisas> when they expand the Q
< sumedhghaisas> equation 33
< sumedhghaisas> there are G's as well in the equation
< sumedhghaisas> now, G also contains the variable 'mu', which is the mean
< sumedhghaisas> so why is that variable treated differently than the other 'mu' variables?
< sumedhghaisas> this is all very non mathy
< KimSangYeon-DGU> Agreed
< sumedhghaisas> I have never seen an EM proof that actually does this kinda thing
< sumedhghaisas> actually no EM proof ever expands Q
< sumedhghaisas> cause that is a constant
< sumedhghaisas> Actually you should read the derivation of GMM EM
< sumedhghaisas> maybe you will understand better
< KimSangYeon-DGU> Yeah, I understand
< sumedhghaisas> So that's why we were just going to bail on the updates
< sumedhghaisas> we use good old gradient descent
< sumedhghaisas> use NLL + lambda * constraint to do gradient descent
< sumedhghaisas> but it's good to know that the updates converge around the correct answer
< KimSangYeon-DGU> for alpha, mean, and covariances?
< sumedhghaisas> actually that makes sense
< sumedhghaisas> near the correct answer
< sumedhghaisas> the estimate and the updated parameters will be the same
< sumedhghaisas> so Q can be taken as a constant or as an equation
< sumedhghaisas> hmmm
< sumedhghaisas> we need to think this through some more
< sumedhghaisas> ahh yes
< sumedhghaisas> all parameters could be updated with gradient descent
< KimSangYeon-DGU> I have a question
< sumedhghaisas> okay
< KimSangYeon-DGU> In unsupervised learning, can we use gradient descent?
< sumedhghaisas> yes surely
< KimSangYeon-DGU> Ahh
< KimSangYeon-DGU> Thanks
< sumedhghaisas> we use gradient descent to maximize the LL
< KimSangYeon-DGU> Yeah
< sumedhghaisas> or to minimize NLL
< sumedhghaisas> given the constraint, that becomes NLL + lambda * constraint
< sumedhghaisas> so it's really easy
< sumedhghaisas> the loss is that equation
< sumedhghaisas> use TensorFlow or PyTorch to update all parameters with the Adam optimizer or something
< KimSangYeon-DGU> Yeah, I'll try it
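A hedged PyTorch sketch of this suggestion, using a diagonal-Gaussian mixture as a stand-in for the QGMM density (whose exact form is not quoted in the chat) and a fixed lambda; the mixture weights are left unnormalized so the penalty term actually binds, loosely mirroring QGMM's normalization constraint.

    import math
    import torch

    def loss_fn(X, means, log_vars, logits, lam=1.0):
        # Log-density of each point under each diagonal-Gaussian component.
        var = log_vars.exp()                          # (k, d)
        diff = X[:, None, :] - means[None, :, :]      # (n, k, d)
        log_comp = -0.5 * ((diff ** 2 / var).sum(-1)  # (n, k)
                           + log_vars.sum(-1)
                           + X.shape[1] * math.log(2 * math.pi))
        weights = logits.exp()   # unnormalized weights, playing the role of alpha
        nll = -torch.logsumexp(log_comp + logits, dim=1).mean()
        # Penalty: total mass should be 1. Here the "integral" reduces to the
        # weight sum; for QGMM it would be the batch estimate of the
        # normalization integral from the constraint below equation 9.
        return nll + lam * (weights.sum() - 1.0) ** 2

    # Toy data with two clusters; Adam updates all parameters jointly.
    X = torch.randn(500, 2) + 4.0 * torch.randint(0, 2, (500, 1))
    means = torch.randn(2, 2, requires_grad=True)
    log_vars = torch.zeros(2, 2, requires_grad=True)
    logits = torch.zeros(2, requires_grad=True)
    opt = torch.optim.Adam([means, log_vars, logits], lr=0.05)
    for _ in range(300):
        opt.zero_grad()
        loss_fn(X, means, log_vars, logits).backward()
        opt.step()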
< sumedhghaisas> I don't have strong confidence it will work but we can try
< KimSangYeon-DGU> Right
< sumedhghaisas> theoretically it should work with the correct constraint
< sumedhghaisas> but we are using an approximate constraint, remember?
< KimSangYeon-DGU> Can you remind me of it?
< sumedhghaisas> so the constraint is given just under equation 9
< KimSangYeon-DGU> Thanks
< sumedhghaisas> that the probability distribution should integrate to 1
< KimSangYeon-DGU> Yeah
< sumedhghaisas> but we can't find that integral
< sumedhghaisas> so we just use the current batch of points to estimate it
< KimSangYeon-DGU> Ah, for normalizing?
< sumedhghaisas> yes yes
< KimSangYeon-DGU> I remember we use the batch for normalization
< KimSangYeon-DGU> ah Cool
< sumedhghaisas> so we do NLL + lambda * constraint for a batch
< sumedhghaisas> and optimize it
< sumedhghaisas> i am not sure it works but lets try :P
< KimSangYeon-DGU> Got it :)
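One way to write the batch approximation just described; treating the batch points as samples from a known density q is an assumption here, since the chat only says the current batch is used to estimate the integral:

    % Exact constraint (below equation 9): the model density integrates to 1.
    \int p(x \mid \alpha, \theta) \, dx = 1
    % Batch approximation over B points, assuming x_b are drawn from a
    % known (or estimated) density q:
    \frac{1}{B} \sum_{b=1}^{B} \frac{p(x_b \mid \alpha, \theta)}{q(x_b)} \approx 1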
< sumedhghaisas> okay you should sleep now.
< KimSangYeon-DGU> Yeah, thanks for the meeting!
< sumedhghaisas> we are already way past 2AM :P
< KimSangYeon-DGU> :)
< sumedhghaisas> lets catch up tomorrow over hangouts if you have any questions
< KimSangYeon-DGU> Okay!
< sumedhghaisas> I will be out tomorrow so won't be able to get a stable connection for IRC
< favre49> zoq: In your experience, how long does NEAT take on double pole balancing with no velocities?
< KimSangYeon-DGU> Ah~ that makes sense
< KimSangYeon-DGU> Thanks again :)
< favre49> I tested it on double pole balancing with velocities, and the results are almost too good, it seems the initial population itself usually has a member that can last indefinitely (probably through wiggling)
< favre49> On double pole balancing without velocities, I have run it for 500-700 generations, and the agent usually reaches 450ish steps.
< favre49> I'm making some changes to the fitness function though, so hopefully it gets better with that and some parameter tweaking
< favre49> Also, I was wondering what other features we should add to the NEAT implementation once testing is done (hopefully soon).
< favre49> In my proposal I mentioned Phased Searching from SharpNEAT which claimed to reduce genome bloat, but I'm not sure how much of a problem that is, based on the testing I've done. Moreover, the author argued that speciation prevents it. I'm ambivalent though, it may be interesting to try. I'll do what you say on that one.
favre49 has quit [Remote host closed the connection]
< zoq> favre49: Don't really have a number in mind, but around 500 sounds reasonable. In your tests you are going for a single run, but note that e.g. the CartPole task is considered "solved" when the agent obtains an average reward of at least 195.0 over 100 consecutive episodes. We should run consecutive episodes as well.
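A small sketch of the consecutive-episode evaluation zoq describes, assuming the classic (pre-0.26) gym API; the placeholder policy stands in for an evolved NEAT network.

    import gym

    def solved(policy, episodes=100, threshold=195.0):
        # CartPole-v0 counts as solved when the average reward over 100
        # consecutive episodes is at least 195.
        env = gym.make("CartPole-v0")
        total = 0.0
        for _ in range(episodes):
            obs, done, ep_reward = env.reset(), False, 0.0
            while not done:
                obs, reward, done, _ = env.step(policy(obs))
                ep_reward += reward
            total += ep_reward
        return total / episodes >= threshold

    print(solved(lambda obs: 0))  # placeholder policy; a NEAT genome goes here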
< zoq> favre49: I really liked the idea (Phased Searching), so I would test it out.
< zoq> favre49: Maybe it makes sense to run NEAT against OpenAI's gym, that way we could get some recordings and see what the result looks like.
< zoq> favre49: If you like I can set something up.
favre49 has joined #mlpack
< favre49> zoq Sounds good, the OpenAI Gym idea also sounds great.
< favre49> What do you think about implementing some sort of config file?
< zoq> favre49: Right, definitely a good idea, perhaps the mlpack serialization feature is all we need; we could write a simple class that holds some parameters and see if the serialization output (txt, xml) looks simple enough. What do you think?
< favre49> zoq Yup, are there any examples I can read for this?
< favre49> I just found it, I think: http://mlpack.org/doc/stable/doxygen/formatdoc.html
< zoq> right, so basically all you need is to implement the serialize function
< zoq> that's one example
< zoq> and save/load the class as mentioned in the guide
< favre49> Ah yes thanks a lot. I'll implement these over the next week.
favre49 has quit [Remote host closed the connection]
< zoq> this would also allow us to stop and continue the training process
sumedhghaisas has quit [Ping timeout: 260 seconds]