verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
nilay has joined #mlpack
marcosirc has quit [Quit: WeeChat 1.4]
tham has joined #mlpack
< tham>
@keonkim I listed out the features already done/undone and some questions here
< tham>
I think most of the features proposed are done already
< tham>
Please tell me if I missed something
< tham>
Does mlpack or armadillo provide any function for Gram-Schmidt computation?
< tham>
the fast PCA algorithm uses GS (Gram-Schmidt) to compute the eigenvectors; it is not hard to implement
< tham>
But I prefer to reuse existing solution first
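If no existing routine turns up, classical Gram-Schmidt is short to write. Below is a dependency-free sketch (all names here are illustrative, not mlpack or Armadillo API); note that Armadillo's qr() also yields an orthonormal basis as a by-product.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Classical Gram-Schmidt: orthonormalise a set of vectors in place.
// Each inner vector is one column; all must have the same dimension.
using Vec = std::vector<double>;

double Dot(const Vec& a, const Vec& b)
{
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

void GramSchmidt(std::vector<Vec>& v)
{
  for (std::size_t j = 0; j < v.size(); ++j)
  {
    // Subtract the projections onto the previously orthonormalised vectors.
    for (std::size_t k = 0; k < j; ++k)
    {
      const double p = Dot(v[j], v[k]);
      for (std::size_t i = 0; i < v[j].size(); ++i) v[j][i] -= p * v[k][i];
    }
    // Normalise the remainder.
    const double n = std::sqrt(Dot(v[j], v[j]));
    for (double& x : v[j]) x /= n;
  }
}
```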
Stellar_Mind has joined #mlpack
< nilay>
after unrolling a std::tuple using recursion, how can we access both the previous element, std::get<I - 1>, and the next one, std::get<I + 1>? Is it because a tuple is a doubly linked list?
< tham>
nilay : Could you post the code?
< tham>
if you pass in the whole tuple parameters
< tham>
you can access it like this std::get<I-1>(tuple_type), std::get<I+1>(tuple_type)
< tham>
You can treat it as a "vector" with random access ability, but you have to specify the index at compile time
< tham>
keonkim : sorry, I used the wrong symbol to notify you (by @)
< tham>
You can see that the cnn code iterates the tuple by index, but it still passes the whole network (tuple type)
< nilay>
tham: this is the code used in cnn.hpp, zoq sent me a mail explaining this . . . http://pastebin.com/CsAi4P16 .. so I could use std::get<I + 3> also; it's just that we must know the index at compile time, so a loop will not work in this case
< tham>
A normal loop cannot
< tham>
you must use recursion to specify the index of the element you want to access
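The recursion looks roughly like this minimal sketch (SumTuple is a made-up example, not the cnn code): the index I is a template parameter, so std::get<I> is legal, and the "loop" advances by instantiating the function again with I + 1, with std::enable_if selecting the terminating overload once I runs past the last element.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Base case: I is one past the last element; stop the recursion.
template<std::size_t I = 0, typename... Ts>
typename std::enable_if<I == sizeof...(Ts), void>::type
SumTuple(const std::tuple<Ts...>& /* t */, double& /* sum */)
{
}

// Recursive case: visit element I, then recurse with I + 1.
template<std::size_t I = 0, typename... Ts>
typename std::enable_if<I < sizeof...(Ts), void>::type
SumTuple(const std::tuple<Ts...>& t, double& sum)
{
  sum += std::get<I>(t);   // The index is a compile-time constant here.
  SumTuple<I + 1>(t, sum); // "I++" happens by recursion, not at run time.
}
```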
< nilay>
why do we have to give it at compile time?
< tham>
Because it is the restriction of std::tuple
< tham>
If you want to access the element of tuple, you need to know the index of the element at compile time, or the type of the tuple element
< nilay>
ok thanks.
< nilay>
so if all the values in the tuple were int, we could use a loop
< nilay>
?
< tham>
no, that is not what I mean
< tham>
What I mean is
< tham>
std::tuple<int, double, char> vvv = .....
< tham>
then you can access the element as
< tham>
std::get<int>(vvv), std::get<double>(vvv), std::get<char>(vvv)
< tham>
This kind of trick looks a little confusing at first, but you will get familiar with them very soon
< tham>
The differences are: the syntax is a little bit different, and part of the information has to be known at compile time
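That example, filled in (TupleAccessDemo is just an illustrative name; std::get by type requires C++14 and a type that occurs exactly once in the tuple):

```cpp
#include <cassert>
#include <tuple>

void TupleAccessDemo()
{
  std::tuple<int, double, char> vvv{42, 3.14, 'x'};

  // By index -- the index must be a compile-time constant:
  assert(std::get<0>(vvv) == 42);

  // By type (since C++14) -- only valid when the type occurs exactly once:
  assert(std::get<double>(vvv) == 3.14);
  assert(std::get<char>(vvv) == 'x');
}
```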
< nilay>
yes, I know what it does, but the thing is that templates are also specialized at compile time only
< nilay>
so we know the type at compile time?
< tham>
yes
< tham>
in other words, we have to know them
< tham>
else we cannot apply template tricks
< tham>
If you only can deduce the type/index at run time, then you have to rely on dynamic features
< tham>
Like the notorious (in most cases) RAII
< tham>
sorry, not RAII
< tham>
RTTI
< tham>
virtual, std::function
< tham>
void* (rarely seen these days)
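A small sketch of the run-time alternative (the Layer/Doubler/AddOne names are invented for illustration, not mlpack types): with virtual dispatch, a plain run-time loop over heterogeneous elements works, at the cost of an indirection per call.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// The type of each element is erased behind a common interface, so the
// choice of operation can be made at run time -- unlike std::tuple.
struct Layer
{
  virtual ~Layer() = default;
  virtual double Forward(double x) const = 0;
};

struct Doubler : Layer { double Forward(double x) const override { return 2 * x; } };
struct AddOne  : Layer { double Forward(double x) const override { return x + 1; } };

double Run(const std::vector<std::unique_ptr<Layer>>& net, double x)
{
  for (const auto& l : net) x = l->Forward(x);  // An ordinary loop suffices.
  return x;
}
```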
Mathnerd314_ has quit [Ping timeout: 240 seconds]
< keonkim>
tham: hello, I was out for lunch sorry
< keonkim>
tham: I read your pastebin; the initial plan was written without deep understanding of the features inside mlpack & armadillo. Turns out some of the core functions I mentioned in the proposal already exist in both libraries.
< keonkim>
tham: I agree with the facts you pointed out in the pastebin.
< keonkim>
For DataIO, I also don't want a new dependency (especially one not relevant to machine learning) to be introduced to mlpack.
< keonkim>
For Data Transformation.. I think it is basically done once DatasetMapper and Imputer are merged.
< keonkim>
For Statistical Analysis, I will finish this by this week.
< keonkim>
For Mathematical Operator, I think the most relevant and useful application of being able to parse time data is to allow time series data analysis. But I am not familiar with this, so I need to think about it.
< keonkim>
tham: I guess most of the features written in the proposal are done after statistical analysis
< keonkim>
?
< keonkim>
tham: GSoC admins said initial project plans can shift if mentors and mentees agree on it. So I was thinking
< tham>
keonkim : I agree
< keonkim>
1. I could do a simple project using the ann module and write a tutorial.
< keonkim>
2. Make more Policy and Imputation classes
< tham>
"allow time series data analysis", could you give us some examples?
< tham>
most of the features will be done after statistical analysis
< keonkim>
tham: You know, some economic datasets have titles like "Monthly Milk Production" or "Annual Unemployment Rate". (financial datasets too)
< keonkim>
tham: and they come with all kinds of date formats, like 1993-04, 2013.12, 2016/06
< keonkim>
tham: currently mlpack cannot parse them, but maybe that's up to users to preprocess
< tham>
The problem is there are many time formats
< tham>
The second question is: what do you want to do after you parse them?
< tham>
I think boost date time can parse most of the time format
< tham>
we need to think about this; I will study this more. It would be better if there are some libraries we can refer to
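As a dependency-free sketch of the parsing problem (ParseYearMonth is a hypothetical name, not an mlpack or Boost API), the standard library's std::get_time can already try a list of candidate formats covering the examples above:

```cpp
#include <cassert>
#include <ctime>
#include <iomanip>
#include <sstream>
#include <string>

// Try each known year-month format until one parses cleanly.
bool ParseYearMonth(const std::string& s, int& year, int& month)
{
  for (const char* fmt : { "%Y-%m", "%Y.%m", "%Y/%m" })
  {
    std::istringstream in(s);
    std::tm tm = {};
    in >> std::get_time(&tm, fmt);
    if (!in.fail())
    {
      year = tm.tm_year + 1900;  // tm_year counts from 1900.
      month = tm.tm_mon + 1;     // tm_mon counts from 0.
      return true;
    }
  }
  return false;
}
```

Boost.Date_Time offers much richer facilities (calendars, durations, locales); this only shows that the multi-format problem itself is tractable.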
< keonkim>
after parsing them,
< keonkim>
tham: I wanted to implement stock prediction model using mlpack (I guess that is going to be after gsoc)
< tham>
stock rediction model....
< tham>
prediction model....
< tham>
sounds interesting
< keonkim>
:)
< tham>
Which machine learning technique do you want to use?
< tham>
ann?
< keonkim>
I think it is going to be a little hard, but the goal is using cnn.
< tham>
keonkim : cnn works very well on image classification tasks
< tham>
but I am not sure it works for stock prediction too
< keonkim>
tham: Yeah, there are not many resources on it. That's why I wanted to give it a try.
< tham>
zoq is the expert on ann, I think you could ask him to give you some recommendations
< tham>
autoencoders and RBMs are a more general way to extract features
< tham>
keonkim : hope you get good results on stock prediction
< keonkim>
tham: haha I hope :)
< keonkim>
tham: Boost.Date_Time seems to be the perfect library for manipulating dates. Maybe I could use it by itself instead of introducing it into mlpack.
mentekid has joined #mlpack
< tham>
keonkim : glad you like it :)
< nilay>
tham: thanks, that example helped,
mentekid has quit [Ping timeout: 264 seconds]
tham has quit [Ping timeout: 250 seconds]
mentekid has joined #mlpack
Stellar_Mind has quit [Ping timeout: 252 seconds]
Stellar_Mind has joined #mlpack
nilay has quit [Ping timeout: 250 seconds]
Stellar_Mind has quit [Ping timeout: 244 seconds]
Stellar_Mind has joined #mlpack
nilay has joined #mlpack
marcosirc has joined #mlpack
Stellar_Mind has quit [Ping timeout: 272 seconds]
nilay has quit [Ping timeout: 250 seconds]
Stellar_Mind has joined #mlpack
Mathnerd314 has joined #mlpack
< keonkim>
rcurtin zoq tham: I believe #694 is ready to be merged. Please have a look. :)
< zoq>
keonkim: Sure I'll go and take a look at the code later today, any idea why the imputation test fails?
< rcurtin>
keonkim: I'll take a look also later today
Stellar_Mind has quit [Ping timeout: 252 seconds]
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#1176 (master - 98babfc : Ryan Curtin): The build was broken.
sumedhghaisas has quit [Ping timeout: 260 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Read error: No route to host]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Ping timeout: 258 seconds]
marcosirc has quit [Quit: WeeChat 1.4]
< zoq>
nilay: n_pos_per_gt = int(ceil(float(n_pos) / n_img / len(bnds))) returns the number of positive locations, and n_neg_per_gt the number of negative locations; in both cases 500 locations. If the boundary map contains more than 500 positive or negative locations, we just randomly choose the right number of locations.
< zoq>
nilay: We could also use 2000 locations, but that would increase the number of features and, in the end, the runtime.
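That logic, restated as a C++ sketch (PosPerGt and Subsample are illustrative names, not the actual edge-detection code): split the positive-location budget evenly across images and boundary maps, then randomly subsample any map that offers more candidates than its share.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Budget of positive locations per ground-truth boundary map:
// total budget nPos split evenly over nImg images and nBnds maps.
int PosPerGt(int nPos, int nImg, int nBnds)
{
  return (int) std::ceil((double) nPos / nImg / nBnds);
}

// If a map offers more candidate locations than the budget,
// randomly keep exactly `budget` of them.
std::vector<int> Subsample(std::vector<int> candidates, int budget,
                           std::mt19937& rng)
{
  if ((int) candidates.size() > budget)
  {
    std::shuffle(candidates.begin(), candidates.end(), rng);
    candidates.resize(budget);
  }
  return candidates;
}
```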
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#1180 (master - 6147ed0 : Marcus Edel): The build was fixed.