verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has joined #mlpack
sumedhghaisas2 has quit [Ping timeout: 264 seconds]
__amir__ has quit []
navneet has joined #mlpack
< navneet> Hi
himu has joined #mlpack
himu has quit [Remote host closed the connection]
himu has joined #mlpack
himu has quit [Remote host closed the connection]
SuRyA has joined #mlpack
SuRyA has quit [Ping timeout: 246 seconds]
sumedhghaisas has quit [Ping timeout: 248 seconds]
sumedhghaisas has joined #mlpack
Trion has joined #mlpack
AndroUser2 has joined #mlpack
AndroUser2 has quit [Read error: Connection reset by peer]
daivik has joined #mlpack
surya has joined #mlpack
MK_18 has joined #mlpack
< MK_18> I am a first-timer at GSoC. I have good experience in C++, and I want to learn about machine learning. Am I in the right place to achieve my goal?
sumedhghaisas has quit [Ping timeout: 276 seconds]
sumedhghaisas has joined #mlpack
Trion has quit [Ping timeout: 264 seconds]
daivik has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
daivik has joined #mlpack
rajiv_ has joined #mlpack
< daivik> rcurtin: zoq: I'm having a lot of trouble deciding on an idea for my GSoC proposal - I'm not particularly keen on any of the ideas on the ideas page. I was hoping you could address some of my concerns:
< daivik> 1. Are the ideas on the ideas page "preferred" in any way? If I were to propose something of my own -- which I'm leaning towards -- would it somehow be less ideal than selecting an idea from the ideas page? I realise that proposing something new would require a greater effort on the part of whoever mentors
< daivik> me to check my work/review my code. Does that factor in at all when you're reviewing student applications?
< daivik> 2. Is there a reason I don't see SVMs in any of the project ideas -- even in past years? IMHO, SVMs are pretty ubiquitous in machine learning -- and mlpack doesn't have an implementation yet. Would an SVM implementation (both the C-SVM and nu-SVM formulations, with the SMO algorithm for solving the QP) be something that could perhaps make a good proposal?
daivik has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
ironstark_ has joined #mlpack
wiking_ has joined #mlpack
Trion has joined #mlpack
Trion has quit [Remote host closed the connection]
ironstark_ is now known as ironstark
wiking has quit [Ping timeout: 260 seconds]
ironstark has quit [Changing host]
ironstark has joined #mlpack
wiking_ is now known as wiking
rajiv_ has quit [Ping timeout: 260 seconds]
surya has quit [Ping timeout: 264 seconds]
namratamukhija has joined #mlpack
Trion has joined #mlpack
manish7294 has joined #mlpack
< namratamukhija> Is it a good idea to include an option for returning accuracy, precision, and F1 score for programs? For example, running the linear regression program creates an output file with the predictions, but there is no option for the user to specify that he/she would like the accuracy (or any other metric) to be reported.
< namratamukhija> Also, maybe we can only provide this option when the user supplies a test file with the actual responses?
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< manish7294> namratamukhija: If I am not wrong, mlpack already implements those. See src/mlpack/core/cv/metrics
< namratamukhija> manish7294: Thanks for pointing me to the files. I couldn't see how to specify them when running the linear regression program. Maybe I'm missing it, but do we have a --evaluate-metrics kind of option (wherein the user can specify the metrics he/she wishes to be reported) when running a program through CLI commands?
< manish7294> namratamukhija: The API is structured so that each program tackles one particular problem at a time. Separately, the API lets the user call each metric of his/her choice independently, without depending on the others. So I think it's best to keep it this way.
< manish7294> Sorry if the above text seems a bit ambiguous.
< rcurtin> daivik: I would say the ideas on the ideas list are probably preferred, but you should not feel restricted to them. for SVMs, I think the reason we have not done this in years past is that we would need an implementation competitive with libsvm, and this is not the easiest thing
< manish7294> namratamukhija: Ah, I guess I haven't answered your question completely. You can use any of those metrics after training your model, maybe like this: mlpack::cv::Accuracy metric; metric.Evaluate(model, dataset, labels);
< manish7294> rcurtin: what do you suggest would be the best way to represent 3-dimensional data, as in the case of (xi-xl)(xi-xl)^T - (xi-xj)(xi-xj)^T for i, j, l in 1..n, in 2D matrix form?
< rcurtin> ahh, I am not fully sure I understand the question
< rcurtin> it seems to me like each of x_i, x_j, and x_l could be stored as arma::vec
< rcurtin> (I am assuming each of those are 1x3 (or 3x1) vectors)
sumedhghaisas has quit [Read error: No route to host]
sumedhghaisas has joined #mlpack
< manish7294> rcurtin: Sorry for stating this so tersely. Here the xi's are vectors, and I need to find and store the value of the above expression for all possible triplets xi, xj, xl in a 2D matrix. This is regarding the LMNN SDP form.
< rcurtin> hmm, I think if each of those values (xi-xl)(xi-xl)^T - (xi-xj)(xi-xj)^T are scalars, it seems like an arma::cube would be the best way
< rcurtin> but that will be expensive for large n, so maybe there is a way that only some of those are needed at a time
< rcurtin> but it has been a while since I have thought about LMNN in detail so I am not sure on that
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< MK_18> I am a first-timer at GSoC. I have good experience in C++, and I want to learn about machine learning. Am I in the right place to achieve my goal?
< manish7294> rcurtin: we won't be able to use arma::cube either way, as it will not be compatible with the SDP solvers. Though exploiting the sparsity of the matrix could lead to some solution. Will look more into it.
< rcurtin> MK_18: mlpack could be a good place for that, yes :)
< rcurtin> manish7294: ok, I see. feel free to propose whatever solution turns out to be best; I am not sure on this one
< MK_18> rcurtin, can you tell me how to get my proposal accepted? From what I am seeing here, there is a lot of competition.
< manish7294> rcurtin: In the worst case (which I hope doesn't happen), LMNN may not turn out to be optimally solvable with SDPs. Though we have the standard SDP form, I am myself not sure whether it will work out as expected; it has become somewhat research-oriented. If so, would it be possible to switch to the gradient-based algorithm proposed in the literature?
YTTM has joined #mlpack
YTTM has quit [Client Quit]
Yttrium has joined #mlpack
MK_18 has quit [Ping timeout: 256 seconds]
manthan has joined #mlpack
Trion has quit [Quit: Entering a wormhole]
< manthan> In the pruning algorithm for decision trees that I have opened a PR for, I traverse the tree using DFS and turn nodes into leaves if the accuracy on the validation set improves. I continue this until all the nodes are traversed, and the resulting tree is the pruned tree.
< manthan> Any suggestions for optimising this? The complexity I calculate is O(n * validationSetSize), which may or may not be a problem depending on the validation set size.
thepro has joined #mlpack
< manthan> In addition to this, I was planning to store the starting and ending index of the training data at each node; this can help find the count of points under the current node.
poomani98_ has joined #mlpack
< poomani98_> Hello. I've sent my GSoC draft proposal, and I'd like to work on it. Where can I contact my mentor?
< thepro> Which project do you want to work on?
manish729493 has joined #mlpack
manish7294 has quit [Quit: AndroIRC - Android IRC Client ( http://www.androirc.com )]
manish729493 is now known as manish7294
< manish7294> poomani98_: You can contact the mentors right here, or through the mailing list: http://lists.mlpack.org/mailman/listinfo/mlpack
rf_sust2018 has joined #mlpack
< poomani98_> I looked through the documentation of the available modules and found that 'autoregression' is also missing, so I sent a draft proposal for implementing it.
< poomani98_> The proposal is to build 'autoregression', 'moving average', 'ARMA', and 'ARIMA'. I would like to get feedback on the proposal, so that I can either change it to work on projects that are on the missing list or start working on it.
sumedhghaisas has quit [Ping timeout: 246 seconds]
< zoq> poomani98_: If you submitted a draft, we will look over it once we have a chance. The main challenge I see here is that we have to find a mentor who would like to mentor the idea and also knows the topic.
sumedhghaisas has joined #mlpack
ImQ009 has joined #mlpack
poomani98_ has quit [Ping timeout: 260 seconds]
< manthan> does the above idea sound good?
MystikNinja has joined #mlpack
< MystikNinja> Hi all! I'm looking to apply to GSoC 2018 and want to work with mlpack. In particular, I'm looking at the issue with MVU+LRSDP and would like to get to the bottom of it over the summer. However, I can't find any relevant issue open on GitHub or code files in the source. Could someone point me to where I should be looking?
MystikNinja has quit [Quit: Page closed]
robertohueso has joined #mlpack
MystikNinja2 has joined #mlpack
< rcurtin> MystikNinja2: there have been some posts on the mailing list about this so you might want to take a look at the archives; for the LRSDP code itself you can find it in src/mlpack/core/optimizers/sdp/
< rcurtin> and for the MVU code it is in src/mlpack/methods/mvu/ but it is out of date and it does not work, so part of the project would be that it would have to be rewritten
< rcurtin> definitely good knowledge of SDPs and related literature will be a necessity for that project, so be sure you are up to speed with any relevant papers
MystikNinja has joined #mlpack
MystikNinja has quit [Client Quit]
< manish7294> rcurtin: What's your view on LRML metric learning? The method can be expected to work with LRSDP, as its original solution is obtained by solving an SDP.
< MystikNinja2> rcurtin: Do you think it is feasible to acquire the necessary knowledge during the course of the project?
< manish7294> MystikNinja2: You need to be clear about the project while proposing it. So it will be good if you can get as familiar with it as possible.
< MystikNinja2> manish7294: Fair enough. Are there any demonstrably failing tests that are part of the source code?
< manish7294> MystikNinja2: The method did not prove to converge previously; consequently, its implementation is deprecated. So it will require a good amount of work to restore it.
< rcurtin> MystikNinja2: yes, basically, the project would entail reimplementing MVU (although that should just be an adjustment to a modified API, it should not be too hard)
< rcurtin> and then the hard part is debugging why it does not converge with LRSDP
MystikNinja has joined #mlpack
< rcurtin> manish7294: I don't have knowledge of LRML but if there is reason to believe it will outperform LMNN (in terms of accuracy or performance) I would not have a huge problem switching the project to that
< rcurtin> I have to get lunch now, but I will be back later
MystikNinja2 has quit [Ping timeout: 260 seconds]
MystikNinja has quit [Ping timeout: 260 seconds]
MystikNinja has joined #mlpack
< manish7294> rcurtin: Sure, take your time. I am leaving a message here in case you find some time. The two methods cover their respective parts of supervised and semi-supervised metric learning, and each excels in its own setting. So, I intend to follow the LMNN algorithm from the literature and subsequently build an LRML implementation using LRSDP. Doing so, we will cover
< manish7294> the whole spectrum of metric learning techniques, as we already have NCA for the unsupervised part. And both algorithms are known to give good results. Does it sound like a fair idea? Please give your view whenever you find time, as it's a turning point for my proposal.
< MystikNinja> Is there any existing testing framework for the MVU+LRSDP code?
K4k has quit [Read error: Connection reset by peer]
thepro has quit [Ping timeout: 240 seconds]
< manish7294> MystikNinja: Unfortunately, there are currently no tests for it. So you may have to work out the testing part yourself, perhaps based on the literature.
thepro has joined #mlpack
Yttrium has quit [Ping timeout: 265 seconds]
daivik has joined #mlpack
Yttrium has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
csoni has joined #mlpack
poomani98 has joined #mlpack
daivik has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
K4k has joined #mlpack
poomani98 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
poomani98 has joined #mlpack
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
MystikNinja has quit [Ping timeout: 240 seconds]
poomani98 has quit [Ping timeout: 264 seconds]
Yttrium has quit [Ping timeout: 245 seconds]
daivik has joined #mlpack
MystikNinja has joined #mlpack
MystikNinja has quit [Client Quit]
Yttrium has joined #mlpack
< thepro> Is it useful to implement factor analysis for dimensionality reduction, given that we already have PCA implemented?
thepro has quit [Quit: Leaving]
namratamukhija has quit [Ping timeout: 260 seconds]
Yttrium has quit [Ping timeout: 255 seconds]
thepro has joined #mlpack
thepro has quit [Client Quit]
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
Yttrium has joined #mlpack
Atharva has joined #mlpack
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
< Atharva> dk97: Are you actively reviewing #1294?
< Atharva> dk97 implemented the KL divergence loss in PR #1294. Should I mention in my proposal to continue from that or should I plan to redo that?
< Atharva> I am talking about VAE project
< dk97[m]> I will be updating the style, and probably shift the MSE and Cross-entropy loss to the loss function folder
< Atharva> dk97: okay :)
< dk97[m]> Atharva:
Yttrium has quit [Ping timeout: 240 seconds]
Yttrium has joined #mlpack
guest6489 has joined #mlpack
guest6489 has quit [Client Quit]
MystikNinja has joined #mlpack
MK_18 has joined #mlpack
MystikNinja has quit [Ping timeout: 246 seconds]
Atharva has quit [Quit: Page closed]
Yttrium has quit [Quit: Leaving]
poomani98 has joined #mlpack
poomani98 has quit [Client Quit]
poomani98 has joined #mlpack
poomani98 has quit [Client Quit]
poomani98 has joined #mlpack
poomani98 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
poomani98 has joined #mlpack
csoni has quit [Quit: Connection closed for inactivity]
poomani98 has quit [Ping timeout: 264 seconds]
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
imraj has joined #mlpack
poomani98 has joined #mlpack
robertohueso has quit [Quit: leaving]
imraj has quit [Ping timeout: 260 seconds]
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
< daivik> rcurtin: thanks for the inputs. I have been going through some of the libsvm code, and the paper that it is based on -- really tricky stuff, solving that quadratic program. I agree that getting an implementation competitive with theirs is a tough task (although I'm not sure why that's so critical - isn't a perhaps slightly suboptimal
< daivik> implementation a good place to start?). I would also like to get your thoughts on another idea I've been thinking about. I'm not really sure that I can give a good name to it; basically, it would just be a bunch of clustering algorithms. Currently, I find that the following clustering methods exist in mlpack: k-means, mean shift, DBSCAN, and GMM
< daivik> clustering (also single-linkage HAC, masquerading as a Euclidean minimum spanning tree). I think that adding the following algorithms, perhaps along with a refactoring of the existing clustering methods, could be a nice contribution:
< daivik> 1. Agglomerative Clustering with Ward, Complete and Average linkage
< daivik> 2. BIRCH
< daivik> 3. Affinity Propagation
< daivik> 4. Spectral Clustering
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
poomani98 has quit [Remote host closed the connection]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
daivik has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
< rcurtin> daivik: I agree, that could be nice. Personally, at the moment my preference for something to mentor would be more along the lines of improving or expanding existing support rather than adding something new (that is more code to maintain), but I can't disagree that adding more clustering algorithms could be useful.
sumedhghaisas2 has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
daivik has joined #mlpack
rf_sust2018 has quit [Ping timeout: 265 seconds]
rf_sust2018 has joined #mlpack
manthan has quit [Ping timeout: 260 seconds]
rf_sust2018 has quit [Quit: Leaving.]
daivik has quit [Quit: http://www.kiwiirc.com/ - A hand crafted IRC client]
MystikNinja has joined #mlpack
ImQ009 has quit [Quit: Leaving]
s1998_ has joined #mlpack
s1998_ has quit [Ping timeout: 260 seconds]
manish7294 has quit [Ping timeout: 256 seconds]
< MystikNinja> Is there a specific paper that the current MVU+LRSDP implementation is based on? I was looking at Vasiloglou et al.
s1998_ has joined #mlpack
< s1998_> zoq, w.r.t RBFN as custom deep learning layer,
< s1998_> This constructor looks better : https://pastebin.com/0TYTxEXy
< s1998_> I was thinking of how the internal working would happen
< s1998_> I could see two difficulties and their possible solution which I mentioned here : https://pastebin.com/jTe88GmV
< s1998_> Please let me know if you think it is feasible
< s1998_> and, if it is, whether it is decent enough for the codebase?
MystikNinja has quit [Ping timeout: 264 seconds]
MK_18 has quit [Ping timeout: 264 seconds]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
< zoq> s1998_: Do you think a user could pass an already instantiated object?