verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
ank_95_ has joined #mlpack
< zoq> uzipaz: Have you thought about reducing the dimension, e.g. using PCA?
ank_95_ has quit [Quit: Connection closed for inactivity]
uzipaz has joined #mlpack
< uzipaz> zoq: the original dataset I was given had about 1600 features and 2050 samples... it also contained many missing values... we used WEKA to do feature selection with best-first and genetic search, which reduced the dataset to 150 features / 1123 samples and 842 features / 1123 samples respectively
< uzipaz> zoq: didn't try using PCA though
ank_95_ has joined #mlpack
travis-ci has joined #mlpack
< travis-ci> mlpack/mlpack#718 (master - fb3994c : Marcus Edel): The build passed.
travis-ci has left #mlpack []
uzipaz has quit [Quit: Page closed]
sumedhghaisas has joined #mlpack
Nilabhra has joined #mlpack
ranjan123 has joined #mlpack
ranjan123 has quit [Quit: Page closed]
sumedhghaisas has quit [Ping timeout: 264 seconds]
christie has joined #mlpack
tsathoggua has joined #mlpack
tsathoggua has quit [Client Quit]
ranjan123 has joined #mlpack
jand has joined #mlpack
< jand> Hi, I have a dataset of 60k samples, each of dimension 4k. I train a DET, but I always get +inf as the density estimate for all training examples. I use the default params and 10-fold cross-validation. Is it due to the volume at the specific leaves being very small, such that f_N(x) becomes +inf? Thanks for any help you can give.
Nilabhra has quit [Read error: Connection reset by peer]
ranjan123 has quit [Quit: Page closed]
jand has quit [Ping timeout: 250 seconds]
< rcurtin> jand: that's almost certainly what is happening, with 4k dimensions
< rcurtin> ah, too late, they already left... well, hopefully they know where to find the IRC logs...
jandrews_ has joined #mlpack
< jandrews_> hi rcurtin
< jandrews_> i read the irc logs :)
< jandrews_> were you going to say anything else, before i lost my connection
< jandrews_> ?
< rcurtin> ah great, good to know you got the answer :)
< rcurtin> I don't really have any other suggestions... the DET volume calculations are generally done in logspace, which helps with the extremely small volumes in very large dimensions
< rcurtin> but in 4000 dimensions the volumes will still get too small or too large
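A minimal sketch of the effect described above, not mlpack's DET code. It assumes a leaf that is an axis-aligned box and the usual leaf estimate of the form n_leaf / (N * V_leaf). The side length (0.5) and the point counts are made-up illustration values, and the helper names are hypothetical.

```python
import math

def log_volume(side_lengths):
    """Log of the volume of an axis-aligned box: sum of the log side lengths."""
    return sum(math.log(s) for s in side_lengths)

def log_density(n_leaf, n_total, log_vol):
    """Log of the leaf density estimate n_leaf / (n_total * volume)."""
    return math.log(n_leaf) - math.log(n_total) - log_vol

def to_linear(log_value):
    """Convert a log-space value back to linear space; inf on overflow."""
    try:
        return math.exp(log_value)
    except OverflowError:
        return math.inf

dims = 4000
sides = [0.5] * dims          # assumed leaf: 0.5 wide in every dimension
n_leaf, n_total = 10, 60000   # assumed counts, for illustration only

# Linear space: the product of 4000 factors of 0.5 underflows to 0.0.
linear_volume = math.prod(sides)

# Log space: the same volume is an ordinary finite number.
lv = log_volume(sides)
ld = log_density(n_leaf, n_total, lv)

# Converting the density back to linear space overflows to +inf.
density = to_linear(ld)

print(linear_volume, lv, ld, density)
```

The log-space values stay finite throughout; the +inf only appears when the density is exponentiated back, which is consistent with a tool reporting +inf for every point in very high dimensions.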
< rcurtin> maybe you could try PCA or some other dimensionality reduction technique (or even just feature selection of some sort?) to reduce the dimensionality before using DETs?
< jandrews_> ok, great. thanks for the suggestion. i had thought about PCA.
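The PCA suggestion can be sketched as follows. This is a toy, standard-library-only illustration that finds a single principal direction by power iteration; it is not mlpack's PCA implementation, and the function names and data are hypothetical. For a real 60k x 4k dataset a proper linear algebra library would be the practical choice.

```python
import math
import random

def center(rows):
    """Subtract the per-feature mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def top_component(rows, iters=200, seed=0):
    """Leading principal direction of centered rows, via power iteration."""
    d = len(rows[0])
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    for _ in range(iters):
        # Apply X^T (X v) without forming the d x d covariance matrix.
        proj = [sum(r[j] * v[j] for j in range(d)) for r in rows]
        w = [sum(proj[i] * rows[i][j] for i in range(len(rows)))
             for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def project(rows, v):
    """Coordinate of each row along direction v (one value per sample)."""
    return [sum(r[j] * v[j] for j in range(len(v))) for r in rows]

# Toy data: 3 features, with almost all variance along (1, 1, 0).
ts = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
data = [[t, t + 0.01 * ((i % 3) - 1), 0.0] for i, t in enumerate(ts)]

centered = center(data)
v = top_component(centered)
reduced = project(centered, v)   # 3 features reduced to 1 per sample

print(v)
print(reduced)
```

Repeating the idea for several components (or using a library PCA) would give a lower-dimensional dataset on which leaf volumes are far less extreme.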
jandrews_ has quit [Ping timeout: 250 seconds]
christie has quit [Ping timeout: 250 seconds]
ank_95_ has quit [Quit: Connection closed for inactivity]