verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
nilay has joined #mlpack
marcosirc has quit [Quit: WeeChat 1.4]
tham has joined #mlpack
< tham> @keonkim I listed out the features already done/undone and some questions here
< tham> I think most of the features proposed are done already
< tham> Please tell me if I missed something
< tham> Do mlpack or Armadillo provide any function for Gram-Schmidt computation?
< tham> The fast PCA algorithm uses GS (Gram-Schmidt) to compute the eigenvectors; it is not hard to implement
< tham> But I prefer to reuse an existing solution first
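For reference, a minimal classical Gram-Schmidt sketch in Armadillo (not an existing mlpack or Armadillo routine, though arma::orth() returns an orthonormal basis via SVD and may already be close enough):

    #include <armadillo>

    // Classical Gram-Schmidt: orthonormalize the columns of X into Q.
    arma::mat GramSchmidt(const arma::mat& X)
    {
      arma::mat Q(X.n_rows, X.n_cols);
      for (arma::uword i = 0; i < X.n_cols; ++i)
      {
        arma::vec v = X.col(i);
        // Remove the projections onto the basis vectors found so far.
        for (arma::uword j = 0; j < i; ++j)
          v -= arma::dot(Q.col(j), X.col(i)) * Q.col(j);
        Q.col(i) = v / arma::norm(v, 2);
      }
      return Q;
    }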
Stellar_Mind has joined #mlpack
< nilay> after unrolling a std::tuple using recursion, how can we access both the previous element, std::get<I - 1>, and the next, std::get<I + 1>? is it because a tuple is a doubly linked list?
< tham> nilay : Could you post the code?
< tham> if you pass in the whole tuple as a parameter
< tham> you can access it like this std::get<I-1>(tuple_type), std::get<I+1>(tuple_type)
< tham> You can treat it as a "vector" with random access ability, but you have to specify the index at compile time
< tham> keonkim : sorry, I used the wrong symbol to notify you (by @)
< tham> You can see that the cnn code iterates the tuple by index, but it still passes the whole network (tuple type)
< nilay> tham: this is the code used in cnn.hpp; zoq sent me a mail explaining this... http://pastebin.com/CsAi4P16 .. so I could use std::get<I + 3> also; it's just that we must know the index at compile time, so a loop will not work in this case
< tham> A normal loop cannot
< tham> you must use recursion to specify the index of the element you want to access
< nilay> why do we have to give it at compile time?
< tham> Because that is a restriction of std::tuple
< tham> If you want to access an element of a tuple, you need to know either the index of the element or its type at compile time
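A standalone sketch of the compile-time recursion tham describes (the standard std::tuple iteration idiom, similar in spirit to the cnn code, but not the actual mlpack code):

    #include <iostream>
    #include <tuple>

    // Base case: index I has reached the end of the tuple.
    template<std::size_t I = 0, typename... Tp>
    typename std::enable_if<I == sizeof...(Tp), void>::type
    PrintAll(const std::tuple<Tp...>& /* t */) { }

    // Recursive case: handle element I, then recurse with I + 1.
    // I is a template parameter, so it is known at compile time.
    template<std::size_t I = 0, typename... Tp>
    typename std::enable_if<I < sizeof...(Tp), void>::type
    PrintAll(const std::tuple<Tp...>& t)
    {
      std::cout << std::get<I>(t) << '\n';
      PrintAll<I + 1, Tp...>(t);
    }

    int main()
    {
      std::tuple<int, double, char> t(1, 2.5, 'c');
      PrintAll(t); // unrolled at compile time, no run-time loop
    }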
< nilay> ok thanks.
< nilay> so if all the values in the tuple were int, we could use a loop?
< tham> no, that is not what I mean
< tham> What I mean is
< tham> std::tuple<int, double, char> vvv = .....
< tham> then you can access the elements as
< tham> std::get<int>(vvv), std::get<double>(vvv), std::get<char>(vvv) or
< tham> std::get<0>(vvv), std::get<1>(vvv), std::get<2>(vvv)
< tham> but in both cases the access is known at compile time
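A tiny runnable version of that example; note that std::get by type is C++14 and only compiles if the type occurs exactly once in the tuple:

    #include <iostream>
    #include <tuple>

    int main()
    {
      std::tuple<int, double, char> vvv(1, 2.5, 'c');

      // By index: the index must be a compile-time constant.
      std::cout << std::get<0>(vvv) << '\n';

      // By type (C++14): valid because double occurs exactly once.
      std::cout << std::get<double>(vvv) << '\n';
    }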
< nilay> so in the code i pasted before, we don't know what layers are in the network at compile time?
< tham> You already know it: when you construct the network as a tuple,
< tham> you already specify the layer types you want to construct
< tham> nilay : I added some comments on the code--http://pastebin.com/iTf1Kv3Q
< tham> Do you need a simpler example?
< tham> This kind of trick looks a little confusing at first, but you will get familiar with it very soon
< tham> The differences are: the syntax is a little bit different, and part of the information has to be known at compile time
< nilay> yes, I know what it does, but the thing is, templates are also specialized at compile time only
< nilay> so we know the type at compile time?
< tham> yes
< tham> in other words, we have to know them
< tham> else we cannot apply template tricks
< tham> If you can only deduce the type/index at run time, then you have to rely on dynamic features
< tham> like the (in most cases) notorious RTTI,
< tham> virtual functions, std::function,
< tham> or void* (rarely seen these days)
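A minimal sketch of the run-time alternative tham lists: with type erasure (here std::function), a plain run-time loop works, at the cost of indirection:

    #include <functional>
    #include <iostream>
    #include <vector>

    int main()
    {
      // The concrete types are erased, so elements can be chosen at run time.
      std::vector<std::function<void()>> layers;
      layers.push_back([] { std::cout << "conv layer\n"; });
      layers.push_back([] { std::cout << "pooling layer\n"; });

      // A normal run-time loop now works, unlike with std::tuple.
      for (const auto& layer : layers)
        layer();
    }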
Mathnerd314_ has quit [Ping timeout: 240 seconds]
< keonkim> tham: hello, I was out for lunch, sorry
< keonkim> tham: I read your pastebin; the initial plan was written without a deep understanding of the features inside mlpack & armadillo. Turns out some of the core functions I mentioned in the proposal already exist in both libraries.
< keonkim> tham: I agree with the facts you pointed out in the pastebin.
< keonkim> For DataIO, I also don't want a new dependency (especially one not relevant to machine learning) to be introduced to mlpack.
< keonkim> For Data Transformation... I think it is basically done once DatasetMapper and Imputer are merged.
< keonkim> For Statistical Analysis, I will finish this by this week.
< tham> nilay : another example--http://pastebin.com/vfrCCm5U
< tham> keonkim : looking forward to that
< keonkim> For Mathematical Operator, I think the most relevant and useful application of being able to parse time data is to allow time-series data analysis, but I am not used to this, so I need to think on it.
< keonkim> tham: I guess most of the features written in the proposal will be done after statistical analysis?
< keonkim> tham: GSoC admins said initial project plans can shift if mentors and mentees agree on it. So I was thinking:
< tham> keonkim : I agree
< keonkim> 1. I could do a simple project using ann module and write a tutorial.
< keonkim> 2. Make more Policy and Imputation classes
< tham> "allow time series data analysis", could you give us some examples?
< tham> most of the features will be done after statistical analysis
< keonkim> tham: You know, some economic datasets have titles like "Monthly Milk Production" or "Annual Unemployment Rate". (financial datasets too)
< keonkim> tham: and they come with all kinds of date formats, like 1993-04, 2013.12, 2016/06
< keonkim> tham: currently mlpack cannot parse them, but maybe that's up to users to preprocess
< tham> The problem is there are many time formats
< tham> The second question is: what do you want to do after you parse them?
< tham> I think Boost.Date_Time can parse most of the time formats
< tham> we need to think about this; I will study it more. It would be better if there were some libraries we could refer to
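A small sketch of what parsing custom date formats with Boost.Date_Time could look like (the ParseDate helper is hypothetical, not an mlpack or Boost function):

    #include <boost/date_time/gregorian/gregorian.hpp>
    #include <iostream>
    #include <locale>
    #include <sstream>

    // Hypothetical helper: parse a date string with a given format.
    boost::gregorian::date ParseDate(const std::string& text,
                                     const std::string& format)
    {
      std::istringstream ss(text);
      // The locale takes ownership of the facet.
      ss.imbue(std::locale(ss.getloc(),
          new boost::gregorian::date_input_facet(format)));

      boost::gregorian::date d;
      ss >> d;
      return d;
    }

    int main()
    {
      // Different separators for the same kind of data.
      std::cout << ParseDate("2013.12.01", "%Y.%m.%d") << '\n';
      std::cout << ParseDate("2016/06/15", "%Y/%m/%d") << '\n';
    }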
< keonkim> after parsing them,
< keonkim> tham: I wanted to implement a stock prediction model using mlpack (I guess that is going to be after gsoc)
< tham> stock prediction model....
< tham> sounds interesting
< keonkim> :)
< tham> Which machine learning technique do you want to use?
< tham> ann?
< keonkim> I think it is going to be a little hard, but the goal is to use cnn.
< tham> keonkim : cnn works very well on image classification tasks
< tham> but I am not sure it works for stock prediction too
< keonkim> tham: Yeah, there are not many resources on it. That's why I wanted to give it a try.
< tham> zoq is the expert on ann; I think you could ask him to give you some recommendations
< tham> autoencoders and RBMs are a more general way to extract features
< tham> keonkim : hope you get good results on stock prediction
< keonkim> tham: haha I hope :)
< keonkim> tham: Boost.Date_Time seems to be the perfect library for manipulating dates. Maybe I could just use it myself instead of introducing it into mlpack.
mentekid has joined #mlpack
< tham> keonkim : glad you like it :)
< nilay> tham: thanks, that example helped
mentekid has quit [Ping timeout: 264 seconds]
tham has quit [Ping timeout: 250 seconds]
mentekid has joined #mlpack
Stellar_Mind has quit [Ping timeout: 252 seconds]
Stellar_Mind has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
Stellar_Mind has quit [Ping timeout: 244 seconds]
Stellar_Mind has joined #mlpack
nilay has joined #mlpack
marcosirc has joined #mlpack
Stellar_Mind has quit [Ping timeout: 272 seconds]
nilay has quit [Ping timeout: 250 seconds]
Stellar_Mind has joined #mlpack
Mathnerd314 has joined #mlpack
< keonkim> rcurtin zoq tham: I believe #694 is ready to be merged. Please have a look. :)
< zoq> keonkim: Sure, I'll go and take a look at the code later today; any idea why the imputation test fails?
< rcurtin> keonkim: I'll take a look also later today
Stellar_Mind has quit [Ping timeout: 252 seconds]
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1176 (master - 98babfc : Ryan Curtin): The build was broken.
travis-ci has left #mlpack []
< zoq> nilay: In the PrepareData function, don't we use arma::umat loc(lenLoc / 2, 2); locations instead of arma::umat loc(lenLoc * 2, 2);?
mentekid has quit [Ping timeout: 258 seconds]
sumedhghaisas has joined #mlpack
nilay has joined #mlpack
< nilay> zoq: we take all the posLoc locations (size = lenLoc) and all the negLoc locations (size = lenLoc), so lenLoc * 2 locations in total.
< keonkim> zoq: the failure is caused by the other model (probabilistic)
< keonkim> oh wait it was failing in ImputationTest. I was confused because the test passes on my computer. I will look into it tomorrow.
< keonkim> I think it is just a minor bug in test code.
< zoq> nilay: hm, the reference code returns 1000 locations. In that case, the feature vector is way smaller.
< zoq> keonkim: sounds good
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1179 (master - e7b9b04 : Marcus Edel): The build is still failing.
travis-ci has left #mlpack []
nilay has quit [Ping timeout: 250 seconds]
sumedhghaisas has quit [Ping timeout: 260 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Read error: No route to host]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 258 seconds]
marcosirc has quit [Quit: WeeChat 1.4]
< zoq> nilay: n_pos_per_gt = int(ceil(float(n_pos) / n_img / len(bnds))) returns the number of positive locations and n_neg_per_gt the number of negative locations, in both cases 500 locations. If Boundary contains more than 500 positive or negative locations, we just randomly choose the right number of locations.
< zoq> nilay: We could also use 2000 locations, but that would increase the number of features and, in the end, the runtime.
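A minimal Armadillo sketch of that random subsampling (the SampleLocations name is illustrative, not the actual PrepareData code):

    #include <armadillo>

    // Keep at most maxCount randomly chosen rows (locations) of loc.
    arma::umat SampleLocations(const arma::umat& loc,
                               const arma::uword maxCount)
    {
      if (loc.n_rows <= maxCount)
        return loc;

      // shuffle() permutes the rows; keep the first maxCount of them.
      const arma::umat shuffled = arma::shuffle(loc);
      return shuffled.head_rows(maxCount);
    }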
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#1180 (master - 6147ed0 : Marcus Edel): The build was fixed.
travis-ci has left #mlpack []