ChanServ changed the topic of #mlpack to: "mlpack: a fast, flexible machine learning library :: We don't always respond instantly, but we will respond; please be patient :: Logs at http://www.mlpack.org/irc/"
abernauer has quit [Remote host closed the connection]
xiaohong has joined #mlpack
xiaohong has quit [Remote host closed the connection]
xiaohong has joined #mlpack
xiaohong has quit [Remote host closed the connection]
xiaohong has joined #mlpack
k3nz0_ has joined #mlpack
k3nz0__ has quit [Ping timeout: 246 seconds]
KimSangYeon-DGU has quit [Remote host closed the connection]
xiaohong has quit [Remote host closed the connection]
xiaohong has joined #mlpack
KimSangYeon-DGU has joined #mlpack
sumedhghaisas has joined #mlpack
< KimSangYeon-DGU>
Hi Ghaisas!
< KimSangYeon-DGU>
sumedhghaisas: Hey Ghaisas!
< sumedhghaisas>
Hey :) How's it going?
< sumedhghaisas>
I was just going through the document
< KimSangYeon-DGU>
Currently working
< sumedhghaisas>
sorry couldn't find some time before
< KimSangYeon-DGU>
Ahh, no worries
< sumedhghaisas>
How's the multiple cluster case going?
< KimSangYeon-DGU>
I'm writing the code
< KimSangYeon-DGU>
When I'm done, I'll let you know
< KimSangYeon-DGU>
I think I need some time
< sumedhghaisas>
Surely. Nicely written conclusion :)
< KimSangYeon-DGU>
Thanks :)
< sumedhghaisas>
Do you discuss it further here?
< sumedhghaisas>
I saw a case where 2 clusters converged on 1 cluster
< KimSangYeon-DGU>
Ah, actually, no. What I wanted to say was written in the document
< sumedhghaisas>
that seems normal enough
< KimSangYeon-DGU>
Ah, when I used a low lambda, the 2 clusters converged on 1 cluster
< sumedhghaisas>
interesting ... so initial phi matters a lot it seems
< KimSangYeon-DGU>
Yeah, initial phi matters
< sumedhghaisas>
so when you put initial value as 90 what is the final value of phi that you get?
< KimSangYeon-DGU>
Wait a moment
< sumedhghaisas>
If you could, could you also add a graph of phi changing over iterations? That would be useful for analysis
< KimSangYeon-DGU>
It depends on the test case; in test case 4, it's 84.2564
< KimSangYeon-DGU>
In test case 5, phi is 89.18866.
< sumedhghaisas>
so phi is not changing much from 90
< KimSangYeon-DGU>
Yeah
< sumedhghaisas>
in the document it's test case 1, right?
< sumedhghaisas>
this might be due to the fact that our dataset is from independent clusters
< sumedhghaisas>
maybe we need to change the dataset to change phi
< KimSangYeon-DGU>
In the test case 1, the last phi was 89.28725
< KimSangYeon-DGU>
Ah, I'll try
< KimSangYeon-DGU>
Yes, right, the phi didn't change much
< sumedhghaisas>
okay I have an interesting experiment to add to this to support this hypothesis
< sumedhghaisas>
so in Figure 2 (c)
< KimSangYeon-DGU>
Oh, yeah
< sumedhghaisas>
you have set the initial phi as 180
< sumedhghaisas>
also report the final phi, but I suspect it's not changed much
< sumedhghaisas>
phi 180 signifies that 2 clusters are subtracting each other at the intersection
< sumedhghaisas>
that's why they can occupy 1 dataset cluster together
< KimSangYeon-DGU>
The final phi was 177.96943931
xiaohong has quit [Ping timeout: 246 seconds]
< KimSangYeon-DGU>
Ah
< sumedhghaisas>
what was the final phi in Figure 2 (b)
< sumedhghaisas>
?
< KimSangYeon-DGU>
92.32387192
< sumedhghaisas>
very very interesting
< sumedhghaisas>
so when it was zero it finds the correct phi also
< sumedhghaisas>
okay so the new experiment is
< sumedhghaisas>
in the Figure 2 (c) experiment, put the initial clusters a little farther from each other than they are right now
< KimSangYeon-DGU>
Yeah
< sumedhghaisas>
check for different initial distances from each other
< sumedhghaisas>
I suspect that is what's causing this
< KimSangYeon-DGU>
Okay
< KimSangYeon-DGU>
I'll do that
< sumedhghaisas>
again, do the same (a) (b) (c) with different initial distances
< KimSangYeon-DGU>
Got it
< sumedhghaisas>
I bet you will find some distance when they actually find the correct clusters with phi 90
< KimSangYeon-DGU>
Ah, yes
< sumedhghaisas>
this turned out to be a very good research direction as it also provides us clues for further research
< KimSangYeon-DGU>
Sounds good :)
< sumedhghaisas>
one viable next direction is to change the dataset in an appropriate way to see if it's able to find different phis
< KimSangYeon-DGU>
Yes
< sumedhghaisas>
I have an idea in that way
< KimSangYeon-DGU>
What idea??
< KimSangYeon-DGU>
Can you tell me?
< sumedhghaisas>
In Figure 2 (b) you have found correct values
< KimSangYeon-DGU>
yeah
< sumedhghaisas>
so we have QGMM distribution in the end
< KimSangYeon-DGU>
Right
< sumedhghaisas>
use that distribution and change its phi value manually
< KimSangYeon-DGU>
Ah~
< KimSangYeon-DGU>
Nice idea
< sumedhghaisas>
make it lets say 120 or 30
< sumedhghaisas>
and then normalize it again
< sumedhghaisas>
and sample from it
< sumedhghaisas>
hmmm... but how would you sample from QGMM that we need to understand
< sumedhghaisas>
we only have PDF and CDF cannot be computed
< KimSangYeon-DGU>
Hmm..
< sumedhghaisas>
we have to use some crazy sampling technique for this.. a simple CDF technique won't cut it
< sumedhghaisas>
That might be our next research topic
< KimSangYeon-DGU>
Yeah
< KimSangYeon-DGU>
I'll think about it
< sumedhghaisas>
how to sample from QGMM ... if we could do that we can create a good dataset
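One standard technique for sampling when only a (possibly unnormalised) PDF is available, with no tractable CDF, is rejection sampling. Below is a minimal sketch for a hypothetical 1-D two-component QGMM-style density; the means, width, amplitudes, and phi are made-up illustration values (not from the experiments discussed above), and an ordinary 50/50 GMM is used as the proposal.

```python
import numpy as np

rng = np.random.default_rng(0)

def npdf(x, mu, sig):
    """Standard 1-D Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

# Hypothetical parameters for illustration only.
MU1, MU2, SIG, PHI = -2.0, 2.0, 1.0, np.pi / 3
A1 = A2 = np.sqrt(0.5)  # amplitudes alpha_k

def qgmm_pdf(x):
    """Unnormalised two-component QGMM-style density:
    |a1 psi1 + a2 e^{i phi} psi2|^2 with psi_k = sqrt(N_k)."""
    n1, n2 = npdf(x, MU1, SIG), npdf(x, MU2, SIG)
    return A1**2 * n1 + A2**2 * n2 + 2 * A1 * A2 * np.sqrt(n1 * n2) * np.cos(PHI)

def sample(n):
    """Rejection sampling: propose from a plain 50/50 GMM q(x),
    accept with probability p(x) / (M * q(x))."""
    # Envelope: since sqrt(n1*n2) <= (n1+n2)/2, p(x) <= n1 + n2 = 2*q(x),
    # so M = 2.5 gives a valid bound p(x) <= M*q(x).
    M = 2.5
    out = []
    while len(out) < n:
        comp = rng.integers(0, 2)
        x = rng.normal((MU1, MU2)[comp], SIG)
        q = 0.5 * (npdf(x, MU1, SIG) + npdf(x, MU2, SIG))
        if rng.uniform() < qgmm_pdf(x) / (M * q):
            out.append(x)
    return np.array(out)

xs = sample(500)
```

Rejection sampling only needs the density up to a constant, which matches the "we only have PDF, no CDF" situation; for higher dimensions or peakier interference terms an MCMC method would likely be needed instead.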
< sumedhghaisas>
wait, there's another hacky way of doing it
< sumedhghaisas>
so when you generate the normal dataset
< sumedhghaisas>
create a circle in the middle of the dataset ... middle of the 2 centers
< sumedhghaisas>
with radius R
< KimSangYeon-DGU>
Yeah
< sumedhghaisas>
and delete 67 percent of the points in the circle
< sumedhghaisas>
change R to get different datasets
< sumedhghaisas>
67 percent would basically generate Gaussian like effect
< KimSangYeon-DGU>
Ahh..
< sumedhghaisas>
we can test Figure 2 (b) situation for these different datasets to see what phi do we get
< sumedhghaisas>
if we are still getting 90 then something is wrong
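The dataset trick described above (two Gaussian clusters, a circle of radius R centred between the two means, and 67 percent of the points inside the circle deleted) can be sketched as follows; the separation d, radius R, and sample sizes are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_dataset(d=4.0, R=1.5, n=1000, drop_frac=0.67):
    """Two 2-D Gaussian clusters a distance d apart, with drop_frac of
    the points deleted inside a circle of radius R at the midpoint."""
    c1 = np.array([-d / 2, 0.0])
    c2 = np.array([d / 2, 0.0])
    pts = np.vstack([rng.normal(c1, 1.0, size=(n, 2)),
                     rng.normal(c2, 1.0, size=(n, 2))])
    mid = (c1 + c2) / 2
    inside = np.linalg.norm(pts - mid, axis=1) < R
    # Randomly delete drop_frac of the points that fall inside the circle.
    idx = np.flatnonzero(inside)
    drop = rng.choice(idx, size=int(drop_frac * idx.size), replace=False)
    return np.delete(pts, drop, axis=0)

data = make_dataset()
```

Varying R (and the cluster separation d) then yields the family of datasets to re-run the Figure 2 (b) setup on.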
< KimSangYeon-DGU>
Yeah, thanks for the idea
< KimSangYeon-DGU>
If I get stuck doing that research, can I ask you questions later?
< sumedhghaisas>
Surely. Anytime.
< sumedhghaisas>
I always like some good research
< KimSangYeon-DGU>
:)
< sumedhghaisas>
Try to generate the datasets first and see if they look different enough
< KimSangYeon-DGU>
Got it, your idea is really nice
< sumedhghaisas>
rest of the job is straightforward
< KimSangYeon-DGU>
Generating the dataset
< sumedhghaisas>
lets hope it works
< KimSangYeon-DGU>
Yeah
< KimSangYeon-DGU>
Thanks!
< sumedhghaisas>
if the phi is not changing much in the research document you sent me
< sumedhghaisas>
just note down the final phi values as well
< sumedhghaisas>
no need to add graphs
< sumedhghaisas>
little less work I guess
< sumedhghaisas>
Okay so far we seem to have 3 tasks at hand.
< sumedhghaisas>
1. Change distance in Figure 2 (c) and see the effect
< sumedhghaisas>
2. Multiple cluster case
< sumedhghaisas>
3. crazy gaussian effect idea with radius change
< KimSangYeon-DGU>
Right :)
< sumedhghaisas>
I will leave it to you to prioritize
< KimSangYeon-DGU>
Ah, thanks
< sumedhghaisas>
and take your time ... its always better to get done the right way :)
< KimSangYeon-DGU>
I agree :)
< sumedhghaisas>
Do you have any questions in the multiple cluster case?
< KimSangYeon-DGU>
Thanks for organizing the tasks
< KimSangYeon-DGU>
How should I manage Phi?
< KimSangYeon-DGU>
as a matrix?
< sumedhghaisas>
good question. but it won't be between 2 clusters anymore, right?
< sumedhghaisas>
it will be assigned to each cluster separately?
< sumedhghaisas>
just want to make sure I understand correctly
< KimSangYeon-DGU>
Hmm
< KimSangYeon-DGU>
Can you give some time to think about it?
< sumedhghaisas>
basically it will be equation 6 in the paper right?
< sumedhghaisas>
where each cluster 'k' has phi_k
< KimSangYeon-DGU>
Ah, yeah, but it is equation (7)?
< KimSangYeon-DGU>
Ahh, I understand
< KimSangYeon-DGU>
Because cos(phi)_{1,2} = cos(phi)_{2,1}, is it possible to assign the phi to each cluster separately?
< KimSangYeon-DGU>
Actually, I intended to use a matrix
< sumedhghaisas>
ahh yes equation 7 my bad
< sumedhghaisas>
yes, just create that many trainable variables
< sumedhghaisas>
basically whenever you are using equation 7 in the code
< sumedhghaisas>
use all the variables to generate the subtractions
< sumedhghaisas>
a matrix will be a little harder to implement
< sumedhghaisas>
this will be easier as it's just subtractions basically... in a loop
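One way to picture the "one trainable phi_k per cluster, pairwise subtractions in a loop" idea is the sketch below; the exact form of equation 7 is in the paper, and the parameter values here are made up for illustration. The pairwise terms use cos(phi_i - phi_j), so the symmetry cos(phi)_{i,j} = cos(phi)_{j,i} holds automatically without a matrix.

```python
import numpy as np

def npdf(x, mu, sig):
    """Standard 1-D Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2 * np.pi))

def qgmm_density(x, alphas, mus, sigs, phis):
    """Unnormalised K-cluster QGMM-style density with one phi_k per
    cluster: diagonal terms plus a loop over pairwise interference
    terms 2 a_i a_j sqrt(N_i N_j) cos(phi_i - phi_j)."""
    K = len(alphas)
    comps = np.array([npdf(x, mus[k], sigs[k]) for k in range(K)])
    dens = sum(alphas[k] ** 2 * comps[k] for k in range(K))
    for i in range(K):
        for j in range(i + 1, K):
            dens += (2 * alphas[i] * alphas[j]
                     * np.sqrt(comps[i] * comps[j])
                     * np.cos(phis[i] - phis[j]))
    return dens

# Hypothetical 3-cluster example.
val = qgmm_density(0.0, [0.6, 0.6, 0.5], [-2.0, 0.0, 2.0],
                   [1.0, 1.0, 1.0], [0.0, np.pi / 2, np.pi])
```

In the training code each phi_k would simply be a separate trainable variable, and the double loop generates exactly the pairwise subtraction terms mentioned above.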
< KimSangYeon-DGU>
Okay
< sumedhghaisas>
And yes take your time to understand
< KimSangYeon-DGU>
Yeah :)
< sumedhghaisas>
you can setup something again this week to ask questions
< KimSangYeon-DGU>
Cool!
< KimSangYeon-DGU>
Thanks for the meeting
< KimSangYeon-DGU>
I'll ping you
< sumedhghaisas>
while you are trying to understand this, I would recommend prioritizing the other 2 tasks
< KimSangYeon-DGU>
Ahh, right
< sumedhghaisas>
let's take our time to implement the multiple cluster case as it's a little bit complex
< KimSangYeon-DGU>
Okay
< sumedhghaisas>
I need to attend another meeting now; if you still have some questions, could you send an email or ping me on Hangouts?
< KimSangYeon-DGU>
No :)
< KimSangYeon-DGU>
Yeah, If I have a question, I'll ping you or send emails :)
< sumedhghaisas>
great. Gotta run. Best of luck for the work
< KimSangYeon-DGU>
Thanks!!
k3nz0__ has joined #mlpack
k3nz0_ has quit [Ping timeout: 248 seconds]
xiaohong has joined #mlpack
ImQ009 has joined #mlpack
xiaohong has quit [Remote host closed the connection]
KimSangYeon-DGU has quit [Remote host closed the connection]
gtank___ has quit [Ping timeout: 244 seconds]
gtank___ has joined #mlpack
gtank___ has quit [Ping timeout: 252 seconds]
gtank___ has joined #mlpack
sumedhghaisas has quit [Ping timeout: 260 seconds]
vivekp has quit [Ping timeout: 245 seconds]
< sreenik[m]>
zoq: Thanks for the comment on the PR. I will sort that out. On a different note, I thought of asking you one thing. For the *mlpack-onnx* converter, it would be helpful to have an ONNX API that builds a model layer by layer (just like FFN in mlpack or the somewhat roundabout procedure in Torch). However, none exists at the moment for C++. Now, the ONNX model being a protobuf file, it is possible to create a valid model
< sreenik[m]>
from scratch, but it will be rather painstaking to do so. To avoid this, I am thinking of creating a *mlpack-Torch* converter instead. The reason I chose Torch (over Tensorflow, etc.) is that Torch itself maintains a Torch-to-ONNX converter (as opposed to ONNX maintaining it), not to mention that it has solid documentation too. However, Torch brings in a dependency of about 180MB. What would you suggest?
< zoq>
sreenik[m]: So this means I could convert from mlpack to torch and from torch to whatever is supported?
< sreenik[m]>
Yes, exactly. Secondly, as Torch has a Torch-to-ONNX converter, we can convert to ONNX and then to anything we want
ImQ009 has quit [Quit: Leaving]
< zoq>
sreenik[m]: I see, so the base for us isn't ONNX but torch. I guess the only issue is that torch does come with a bigger footprint (180MB)?
< sreenik[m]>
That's the only issue, hopefully
< zoq>
sreenik[m]: Personally I don't mind to go via torch, if it makes things easier or even opens more opportunities.
< zoq>
sreenik[m]: If Atharva and you like the idea, fine with me; probably a good idea to check the pipeline beforehand.
< sreenik[m]>
Yes Atharva and I have discussed this issue. We wanted to take your opinion before proceeding. So would you suggest to create a mini-converter (with one or two supported layers) and see if everything is going right?
< zoq>
sreenik[m]: Yes, I think that is a reasonable approach, might give us some insight about what works and what not.
< sreenik[m]>
Yup. I'd need to look into extracting the layers from the mlpack model too. I hope that won't be a hassle
< sreenik[m]>
zoq: Do you remember if mlpack has biases stored before weights in the *parameters* matrix?
< zoq>
sreenik[m]: Maybe you can go even simpler with LinearNoBias instead of Linear.
< zoq>
sreenik[m]: The bias comes last.
< zoq>
sreenik[m]: Maybe LinearNoBias followed by Sigmoid?
< sreenik[m]>
That too sounds reasonable. Quite relieved it comes last; I have assumed that throughout the onnx-mlpack converter that's completed
< zoq>
sreenik[m]: Good, would be easy to change the position.