verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has joined #mlpack
sumedhghaisas2 has quit [Ping timeout: 264 seconds]
__amir__ has quit []
navneet has joined #mlpack
< navneet>
Hi
himu has joined #mlpack
himu has quit [Remote host closed the connection]
himu has joined #mlpack
himu has quit [Remote host closed the connection]
SuRyA has joined #mlpack
SuRyA has quit [Ping timeout: 246 seconds]
sumedhghaisas has quit [Ping timeout: 248 seconds]
sumedhghaisas has joined #mlpack
Trion has joined #mlpack
AndroUser2 has joined #mlpack
AndroUser2 has quit [Read error: Connection reset by peer]
daivik has joined #mlpack
surya has joined #mlpack
MK_18 has joined #mlpack
< MK_18>
I am a first-timer in GSoC. I have good experience in C++, and I want to learn about machine learning. Am I in the right place to achieve my goal?
sumedhghaisas has quit [Ping timeout: 276 seconds]
< daivik>
rcurtin: zoq: I'm having a lot of trouble deciding on an idea for my GSoC proposal - I'm not particularly keen on any of the ideas on the ideas page. I was hoping you could address some of my concerns:
< daivik>
1. Are the ideas on the ideas page "preferred" in any way? If I were to propose something of my own -- which I'm leaning towards -- would it somehow be less ideal than if I selected an idea from the ideas page? I realise that if I were to propose something new, it would require a greater effort on the part of whoever mentors me to check my work/review my code. Does that factor in at all when you're reviewing student applications?
< daivik>
2. Is there a reason I don't see SVMs in any of the project ideas -- even in past years? IMHO, SVMs are pretty ubiquitous in machine learning -- and mlpack doesn't have an implementation yet. Would an SVM implementation (both C-SVM and nu-SVM formulations, with the SMO algorithm for solving the QP) be something that could perhaps make a good proposal?
Trion has quit [Remote host closed the connection]
ironstark_ is now known as ironstark
wiking has quit [Ping timeout: 260 seconds]
ironstark has quit [Changing host]
ironstark has joined #mlpack
wiking_ is now known as wiking
rajiv_ has quit [Ping timeout: 260 seconds]
surya has quit [Ping timeout: 264 seconds]
namratamukhija has joined #mlpack
Trion has joined #mlpack
manish7294 has joined #mlpack
< namratamukhija>
Is it a good idea to include an option for returning accuracy, precision, and F1 score for programs? For example, on running the linear regression program, an output file with the predictions is created. However, there is no option for the user to specify that he/she would like the accuracy (or any other metric) to be reported.
< namratamukhija>
Also, maybe we should only provide this option when the user gives a test file with the actual responses?
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< manish7294>
namratamukhija: If I am not wrong, mlpack already implements those. See src/mlpack/core/cv/metrics.
< namratamukhija>
manish7294: Thanks for pointing me to the files. I couldn't see how to specify them when running the linear regression program. Maybe I'm missing it, but do we have a --evaluate-metrics kind of option (wherein the user can specify the metrics he/she wishes to be reported) when running a program through CLI commands?
< manish7294>
namratamukhija: The API is structured so that each program tackles one particular problem at a time. Alternatively, the API lets the user call each method of his/her choice independently, without depending on the others. So I think it's best to keep it this way.
< manish7294>
Sorry if the above text seems a bit ambiguous.
< rcurtin>
daivik: I would say the ideas on the ideas list are probably preferred, but you should not feel restricted to them. For SVMs, I think the reason we have not done this in years past is that we would need an implementation competitive with libsvm, and this is not the easiest thing.
< manish7294>
namratamukhija: Ah, I guess I haven't answered your question completely. You can use any of those metrics after training your model, maybe like this: mlpack::cv::Accuracy metric; metric.Evaluate(model, dataset, labels);
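(For reference, a minimal sketch of that metric usage, assuming mlpack 3.x header paths; LogisticRegression is only an example model, and any classifier exposing Classify() should work the same way:)

    #include <iostream>
    #include <mlpack/core.hpp>
    #include <mlpack/core/cv/metrics/accuracy.hpp>
    #include <mlpack/methods/logistic_regression/logistic_regression.hpp>

    int main()
    {
      arma::mat dataset;         // column-major: one column per point
      arma::Row<size_t> labels;  // one label (0 or 1) per point
      // ... fill dataset and labels ...

      // Train the model.
      mlpack::regression::LogisticRegression<> model(dataset, labels);

      // Accuracy::Evaluate() is static, so it can be called with or without an
      // Accuracy instance.
      const double acc = mlpack::cv::Accuracy::Evaluate(model, dataset, labels);
      std::cout << "Accuracy: " << acc << std::endl;
      return 0;
    }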
< manish7294>
rcurtin: what do you suggest would be the best way to represent three-dimensional data, as in the case of (xi - xl)(xi - xl)^T - (xi - xj)(xi - xj)^T for i, j, l in 1..n, in 2D matrix form?
< rcurtin>
ahh, I am not fully sure I understand the question
< rcurtin>
it seems to me like each of x_i, x_j, and x_l could be stored as arma::vec
< rcurtin>
(I am assuming each of those are 1x3 (or 3x1) vectors)
sumedhghaisas has quit [Read error: No route to host]
sumedhghaisas has joined #mlpack
< manish7294>
rcurtin: Sorry for stating this so tersely. Here the xi's are vectors, and I need to find and store the value of the above expression for all possible triplets (xi, xj, xl) in a 2D matrix. This is regarding the LMNN SDP form.
< rcurtin>
hmm, I think if each of those values (xi-xl)(xi-xl)^T - (xi-xj)(xi-xj)^T is a scalar, it seems like an arma::cube would be the best way
< rcurtin>
but that will be expensive for large n, so maybe there is a way that only some of those are needed at a time
< rcurtin>
but it has been a while since I have thought about LMNN in detail so I am not sure on that
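(For context: since the xi's are column vectors, each such term is a d x d outer-product difference rather than a scalar. A minimal Armadillo sketch for a single triplet, with illustrative names:)

    #include <armadillo>

    // (xi - xl)(xi - xl)^T - (xi - xj)(xi - xj)^T for one triplet (i, j, l).
    // With column vectors, the result is a d x d matrix, not a scalar.
    arma::mat TripletTerm(const arma::vec& xi,
                          const arma::vec& xj,
                          const arma::vec& xl)
    {
      const arma::vec dil = xi - xl;
      const arma::vec dij = xi - xj;
      return dil * dil.t() - dij * dij.t();
    }

Storing all O(n^3) of these d x d matrices is what becomes expensive, which is why computing only the needed terms on the fly may be preferable.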
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
< MK_18>
I am a first-timer in GSoC. I have good experience in C++, and I want to learn about machine learning. Am I in the right place to achieve my goal?
< manish7294>
rcurtin: we won't be able to use arma::cube either way, as it will not be compatible with the SDP solvers. Though exploiting the sparsity of the matrix could lead to some solution. I will look more into it.
< rcurtin>
MK_18: mlpack could be a good place for that, yes :)
< rcurtin>
manish7294: ok, I see. feel free to propose whatever solution turns out to be best; I am not sure on this one
< MK_18>
rcurtin: can you tell me how to get my proposal accepted? From what I am seeing here, there is a lot of competition.
< manish7294>
rcurtin: In the worst case, which I hope doesn't happen, LMNN may not turn out to be optimally solvable with SDPs. Though we have the standard SDP form, I myself am not sure whether it will work out as expected; it has kind of become more research-oriented. Would it be possible to switch to the gradient-based algorithm proposed in the literature?
YTTM has joined #mlpack
YTTM has quit [Client Quit]
Yttrium has joined #mlpack
MK_18 has quit [Ping timeout: 256 seconds]
manthan has joined #mlpack
Trion has quit [Quit: Entering a wormhole]
< manthan>
In the pruning algorithm for decision trees that I have made a PR for, I traverse the tree using DFS and make a node a leaf if the accuracy on the validation set improves. I continue this until all the nodes are traversed, and the resulting tree is the pruned tree.
< manthan>
Any suggestions for optimising this? If I calculate the complexity, it is O(n * validationSetSize); it may or may not be a problem depending on the validation set size.
thepro has joined #mlpack
< manthan>
In addition to this, I was planning to store the starting and ending index of the training data at each node; it can help to find the count of points under the current node.
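(A rough, self-contained sketch of that reduced-error pruning pass, using a hypothetical minimal Node type rather than the PR's actual DecisionTree interface:)

    #include <cstddef>
    #include <vector>
    #include <armadillo>

    // Hypothetical minimal tree node, only to illustrate the DFS pruning pass;
    // the actual PR works against mlpack's DecisionTree class.
    struct Node
    {
      std::vector<Node> children;   // empty => leaf
      size_t majorityClass = 0;     // prediction if this node is a leaf
      size_t dimension = 0;         // split dimension
      double splitValue = 0.0;      // numeric split threshold

      size_t Classify(const arma::vec& point) const
      {
        if (children.empty())
          return majorityClass;
        return (point(dimension) < splitValue) ?
            children[0].Classify(point) : children[1].Classify(point);
      }
    };

    // Accuracy of the whole tree on a (column-major) validation set.
    double Accuracy(const Node& root, const arma::mat& data,
                    const arma::Row<size_t>& labels)
    {
      size_t correct = 0;
      for (size_t i = 0; i < data.n_cols; ++i)
        correct += (root.Classify(data.col(i)) == labels(i));
      return (double) correct / (double) data.n_cols;
    }

    // Post-order DFS: prune the children first, then tentatively turn this node
    // into a leaf and keep the change only if validation accuracy does not drop.
    // Call as: Prune(tree, tree, validData, validLabels);
    void Prune(Node& root, Node& node,
               const arma::mat& validData, const arma::Row<size_t>& validLabels)
    {
      for (Node& child : node.children)
        Prune(root, child, validData, validLabels);
      if (node.children.empty())
        return;

      const double before = Accuracy(root, validData, validLabels);
      std::vector<Node> backup = std::move(node.children);
      node.children.clear();
      if (Accuracy(root, validData, validLabels) < before)
        node.children = std::move(backup);  // revert: accuracy got worse
    }

Each prune decision re-evaluates the whole tree on the validation set, which is where the O(n * validationSetSize) cost above comes from.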
poomani98_ has joined #mlpack
< poomani98_>
Hello. I've sent my GSoC draft proposal and I'd like to work on it. Where can I contact my mentor?
< poomani98_>
I looked through the documentation of available modules and found that 'autoregression' is also missing, so I sent a draft proposal for implementing it.
< poomani98_>
The proposal is to build 'autoregression', 'moving average', 'ARMA', and 'ARIMA'. I would like to get feedback on the proposal, so that I can either change it to work on projects that are on the missing list or start working on it.
sumedhghaisas has quit [Ping timeout: 246 seconds]
< zoq>
poomani98_: If you submitted a draft, we will look over it once we have a chance. The main challenge I see here is that we have to find a mentor who would like to mentor the idea and also knows the topic.
sumedhghaisas has joined #mlpack
ImQ009 has joined #mlpack
poomani98_ has quit [Ping timeout: 260 seconds]
< manthan>
Does the above idea sound good?
MystikNinja has joined #mlpack
< MystikNinja>
Hi all! I'm looking to apply to GSoC 2018 and want to work with mlpack. In particular, I'm looking at the issue with MVU+LRSDP and would like to get to the bottom of it over the summer. However, I can't find any relevant issue open on GitHub or code files in the source. Could someone point me to where I should be looking?
MystikNinja has quit [Quit: Page closed]
robertohueso has joined #mlpack
MystikNinja2 has joined #mlpack
< rcurtin>
MystikNinja2: there have been some posts on the mailing list about this so you might want to take a look at the archives; for the LRSDP code itself you can find it in src/mlpack/core/optimizers/sdp/
< rcurtin>
and for the MVU code, it is in src/mlpack/methods/mvu/, but it is out of date and does not work, so part of the project would be rewriting it
< rcurtin>
definitely good knowledge of SDPs and related literature will be a necessity for that project, so be sure you are up to speed with any relevant papers
MystikNinja has joined #mlpack
MystikNinja has quit [Client Quit]
< manish7294>
rcurtin: What's your view on LRML metric learning? The method can be expected to work with LRSDP, as its original solution is obtained by solving an SDP.
< MystikNinja2>
rcurtin: Do you think it is feasible to acquire the necessary knowledge during the course of the project?
< manish7294>
MystikNinja2: You need to be clear about the project while proposing it. So it would be good if you can get as familiar with it as possible.
< MystikNinja2>
manish7294: Fair enough. Are there any tests in the source code that are demonstrably failing?
< manish7294>
MystikNinja2: The method did not prove to converge earlier; consequently, its implementation was deprecated. So it will require a good amount of work to revive it.
< rcurtin>
MystikNinja2: yes, basically, the project would entail reimplementing MVU (although that should just be an adjustment to a modified API, it should not be too hard)
< rcurtin>
and then the hard part is debugging why it does not converge with LRSDP
MystikNinja has joined #mlpack
< rcurtin>
manish7294: I don't have knowledge of LRML but if there is reason to believe it will outperform LMNN (in terms of accuracy or performance) I would not have a huge problem switching the project to that
< rcurtin>
I have to get lunch now, but I will be back later
MystikNinja2 has quit [Ping timeout: 260 seconds]
MystikNinja has quit [Ping timeout: 260 seconds]
MystikNinja has joined #mlpack
< manish7294>
rcurtin: Sure, take your time. I am leaving a message here in case you find some time. The two methods cover their respective parts of supervised and semi-supervised learning, and both have been shown to excel. So I intend to follow the LMNN algorithm from the literature and then build an LRML implementation using LRSDP. Doing so, we will cover the whole spectrum of metric learning techniques, as we already have NCA for the unsupervised part, and both algorithms should give good results. Does that sound like a fair idea? Please give your view whenever you find time, as it's a turning point for my proposal.
< MystikNinja>
Is there any existing testing framework for the MVU+LRSDP code?
K4k has quit [Read error: Connection reset by peer]
thepro has quit [Ping timeout: 240 seconds]
< manish7294>
MystikNinja: Unfortunately, there is currently no test implementation for it. So you may have to work out the testing part yourself, perhaps based on the literature.
thepro has joined #mlpack
Yttrium has quit [Ping timeout: 265 seconds]
daivik has joined #mlpack
Yttrium has joined #mlpack
sumedhghaisas has quit [Read error: Connection reset by peer]
poomani98 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
poomani98 has joined #mlpack
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
MystikNinja has quit [Ping timeout: 240 seconds]
poomani98 has quit [Ping timeout: 264 seconds]
Yttrium has quit [Ping timeout: 245 seconds]
daivik has joined #mlpack
MystikNinja has joined #mlpack
MystikNinja has quit [Client Quit]
Yttrium has joined #mlpack
< thepro>
Is it useful to implement factor analysis for dimensionality reduction, given that we already have PCA implemented?
thepro has quit [Quit: Leaving]
namratamukhija has quit [Ping timeout: 260 seconds]
Yttrium has quit [Ping timeout: 255 seconds]
thepro has joined #mlpack
thepro has quit [Client Quit]
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
Yttrium has joined #mlpack
Atharva has joined #mlpack
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
< Atharva>
dk97: Are you actively reviewing #1294?
< Atharva>
dk97 implemented the KL divergence loss in PR #1294. Should I mention in my proposal that I will continue from that, or should I plan to redo it?
< Atharva>
I am talking about the VAE project.
< dk97[m]>
I will be updating the style, and probably shift the MSE and cross-entropy losses to the loss functions folder.
< Atharva>
dk97: okay :)
< dk97[m]>
Atharva:
Yttrium has quit [Ping timeout: 240 seconds]
Yttrium has joined #mlpack
guest6489 has joined #mlpack
guest6489 has quit [Client Quit]
MystikNinja has joined #mlpack
MK_18 has joined #mlpack
MystikNinja has quit [Ping timeout: 246 seconds]
Atharva has quit [Quit: Page closed]
Yttrium has quit [Quit: Leaving]
poomani98 has joined #mlpack
poomani98 has quit [Client Quit]
poomani98 has joined #mlpack
poomani98 has quit [Client Quit]
poomani98 has joined #mlpack
poomani98 has quit [Quit: Yaaic - Yet another Android IRC client - http://www.yaaic.org]
poomani98 has joined #mlpack
csoni has quit [Quit: Connection closed for inactivity]
poomani98 has quit [Ping timeout: 264 seconds]
sumedhghaisas has quit [Read error: Connection reset by peer]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
imraj has joined #mlpack
poomani98 has joined #mlpack
robertohueso has quit [Quit: leaving]
imraj has quit [Ping timeout: 260 seconds]
manish7294 has quit [Remote host closed the connection]
manish7294 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
< daivik>
rcurtin: thanks for the inputs. I have been going through some of the libsvm code, and the paper that it is based on -- really tricky stuff, solving that quadratic program. I agree that getting an implementation competitive with theirs is a tough task (although I'm not sure why that's so critical - isn't an implementation, perhaps slightly suboptimal, a good place to start?). I would also like to get your thoughts on another idea I've been thinking about. I'm not really sure I can give it a good name - basically, it would just be a bunch of clustering algorithms. Currently, I find that the following clustering methods exist in mlpack: k-means, mean shift, DBSCAN, and GMM clustering (also single-linkage HAC - masquerading as a Euclidean minimum spanning tree). I think that adding the following algorithms, perhaps along with a refactoring of the existing clustering methods, could be a nice contribution:
< daivik>
1. Agglomerative Clustering with Ward, Complete and Average linkage
< daivik>
2. BIRCH
< daivik>
3. Affinity Propagation
< daivik>
4. Spectral Clustering
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
poomani98 has quit [Remote host closed the connection]
sumedhghaisas2 has joined #mlpack
sumedhghaisas has quit [Ping timeout: 246 seconds]
< rcurtin>
daivik: I agree, that could be nice. Personally, at the moment my preference for something to mentor would be more along the lines of improving or expanding existing support, more so than adding something new (that is more code to maintain), but I can't disagree that adding more clustering algorithms could be useful.
sumedhghaisas2 has quit [Read error: Connection reset by peer]