verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sgupta has quit [Ping timeout: 260 seconds]
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
mikeling has joined #mlpack
chenzhe has joined #mlpack
< chenzhe>
a simple question: in Armadillo, when we call solve(A, b) for a short fat matrix A, does it automatically give the least-squares solution? The example on Armadillo's Wikipedia page seems to suggest so~
< chenzhe>
sorry, I mean, tall thin matrix A~
< rcurtin>
chenzhe: I think it does, I will check for sure tomorrow
< rcurtin>
the source will be something like auxlib::solve() in armadillo_bits/auxlib_meat.hpp
< rcurtin>
but that may be a hard file to read, Armadillo internals can be kinda crazy :)
< chenzhe>
rcurtin: Thanks a lot Ryan!
chenzhe has quit [Ping timeout: 246 seconds]
chenzhe has joined #mlpack
chenzhe has quit [Read error: Connection reset by peer]
chenzhe has joined #mlpack
chenzhe has quit [Ping timeout: 255 seconds]
kris1 has joined #mlpack
s1998 has joined #mlpack
s1998 has quit [Ping timeout: 246 seconds]
s1998 has joined #mlpack
s1998 has left #mlpack []
Trion has joined #mlpack
shikhar has joined #mlpack
sgupta has joined #mlpack
s1998 has joined #mlpack
vivekp has quit [Ping timeout: 240 seconds]
vivekp has joined #mlpack
Trion has quit [Ping timeout: 246 seconds]
Trion has joined #mlpack
Trion has quit [Remote host closed the connection]
Trion has joined #mlpack
Trion has quit [Quit: Have to go, see ya!]
kris1 has quit [Remote host closed the connection]
aneiman has joined #mlpack
< aneiman>
hello
< aneiman>
I'm trying to run the decision tree on the simplest example, and it's not clear to me where to find the result
< aneiman>
my training file has 1 feature and 10 lines: the first 5 lines contain the number 2.33, the remaining lines the number 3.33
< aneiman>
The labels file has 10 lines: the first 5 contain the number 0, the rest the number 1
< aneiman>
I expect the output model will somehow express "if feature = 2.33 => label = 0; else if feature = 3.33 => label = 1"
< aneiman>
the output model includes the following: <model class_id="0" tracking_level="0" version="0"> <tree class_id="1" tracking_level="0" version="0"> <numChildren>0</numChildren> <splitDimension>140237688660736</splitDimension> <dimensionTypeOrMajorityClass>0</dimensionTypeOrMajorityClass> <classProbabilities class_id="2" tracking_level="0" version="0"> <n_rows>2</n_rows> <n_cols>1</n_cols> <n_elem>2</n_elem> <vec_state>1</v
< aneiman>
How can I see the relation between the feature and label value ?
< rcurtin>
aneiman: unfortunately the model files aren't really meant to be readable
< rcurtin>
to me that output looks like the decision tree didn't split at all, and the majority class is class 0
< aneiman>
Ok, so which file will show me this relation: "if feature = 2.33 => label = 0; else if feature = 3.33 => label = 1"?
< aneiman>
what is the majority class? The label value? In my case it is 50% 0, 50% 1
nish21 has quit [Ping timeout: 260 seconds]
< rcurtin>
aneiman: sorry for the slow response
< rcurtin>
in what you pasted I can't see what the class probabilities are (they should be 0.5 0.5)
sgupta has quit [Ping timeout: 260 seconds]
< rcurtin>
I don't think any output file will show you that relation specifically, instead you would probably have to either parse the XML to extract that, or write some C++ and work with the DecisionTree<> object itself
< rcurtin>
in your case, the tree doesn't split because there aren't enough samples
< rcurtin>
by default, the --minimum_leaf_size parameter is set to 20
< rcurtin>
meaning that a node won't split if it has fewer than 20 points in it
< rcurtin>
and in your case, there are only 10 points, so no splitting
< rcurtin>
you could try --minimum_leaf_size with some smaller value (or make a larger training set)
< rcurtin>
and then when you looked at my_dec_tree_model.xml, you would probably see that the tree had three nodes
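As a concrete sketch of the C++ route mentioned above (and of the minimum leaf size fix), here is a minimal example; it assumes the DecisionTree<> API from mlpack 2.2's decision_tree.hpp, so the constructor signature and namespace should be checked against the installed version.

    #include <mlpack/core.hpp>
    #include <mlpack/methods/decision_tree/decision_tree.hpp>

    using namespace mlpack::tree;

    int main()
    {
      // 10 one-dimensional points: five at 2.33 (class 0), five at 3.33 (class 1).
      arma::mat data(1, 10);
      data.cols(0, 4).fill(2.33);
      data.cols(5, 9).fill(3.33);
      arma::Row<size_t> labels(10);
      labels.subvec(0, 4).fill(0);
      labels.subvec(5, 9).fill(1);

      // A minimum leaf size of 1 allows the tree to split even on 10 points
      // (the C++ default is smaller than the command-line default of 20).
      DecisionTree<> tree(data, labels, 2 /* numClasses */, 1 /* minimumLeafSize */);

      // Classify the training points; expected output: 0 0 0 0 0 1 1 1 1 1.
      arma::Row<size_t> predictions;
      tree.Classify(data, predictions);
      predictions.print("predictions");

      return 0;
    }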
< aneiman>
The probabilities are visible in the output model :<classProbabilities class_id="2" tracking_level="0" version="0"> <n_rows>2</n_rows> <n_cols>1</n_cols> <n_elem>2</n_elem> <vec_state>1</vec_state> <item>0.5</item> <item>0.5</item> </classProbabilities>
< rcurtin>
ah, there it is---the two '0.5' items mean that the probability of each class is 0.5, which is correct when the tree doesn't split
< aneiman>
yes, the probabilities are correct
Trion has quit [Ping timeout: 240 seconds]
< aneiman>
I extended the example to 20 lines - just copy-paste - but the result is the same; I don't see "if feature = 2.33 => label = 0; else if feature = 3.33 => label = 1"
< rcurtin>
try 22 lines, maybe you need just a few more to get it to split
< aneiman>
You wrote about parsing the XML to get the relation between features and labels. Which fields of the XML express it?
< rcurtin>
each node has four features:
< rcurtin>
- numChildren
< rcurtin>
- splitDimension
< rcurtin>
- dimensionTypeOrMajorityClass
< rcurtin>
- classProbabilities
< rcurtin>
(I guess if the node has children it also has a 'children' feature)
< rcurtin>
the splitDimension, if it's not size_t(-1) (the really large number), means that the node is not a leaf, and that field gives the dimension used for splitting
< rcurtin>
dimensionTypeOrMajorityClass represents the majority class, if the node is a leaf, and the type of dimension (categorical or numeric) otherwise
< rcurtin>
if the node is not a leaf node, then the classProbabilities vector will hold the actual split value
< rcurtin>
I know that might seem somewhat confusing (and it is) but the data members are compressed in this way to save space
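To make that packing concrete, here is a hedged sketch of walking a trained tree from C++ instead of parsing the XML. It assumes the DecisionTree<> class exposes NumChildren(), Child(), and SplitDimension() accessors; verify them against the decision_tree.hpp of the installed mlpack version.

    #include <iostream>
    #include <string>
    #include <mlpack/methods/decision_tree/decision_tree.hpp>

    using namespace mlpack::tree;

    // Recursively print the structure of a trained tree.  In a leaf,
    // dimensionTypeOrMajorityClass holds the majority class and
    // classProbabilities holds the class probabilities; in an internal
    // node, splitDimension is valid and classProbabilities is reused to
    // hold the split value.
    void PrintTree(const DecisionTree<>& node, const size_t depth = 0)
    {
      const std::string indent(2 * depth, ' ');
      if (node.NumChildren() == 0)
      {
        std::cout << indent << "leaf" << std::endl;
      }
      else
      {
        std::cout << indent << "split on dimension " << node.SplitDimension()
            << std::endl;
        for (size_t i = 0; i < node.NumChildren(); ++i)
          PrintTree(node.Child(i), depth + 1);
      }
    }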
< aneiman>
So dimensionTypeOrMajorityClass is the feature value? And what is the label value?
< aneiman>
I tried 22 lines - the same result
< rcurtin>
can you paste your data into pastebin or something?
< aneiman>
But still I don't see the feature value in the output model.
< aneiman>
Please, confirm that dimensionTypeOrMajorityClass is the label value
vivekp has quit [Ping timeout: 255 seconds]
< aneiman>
And an additional question:
< aneiman>
why is the probability (at the end) 2.8300000000000001? It seems it should be 0.5-0.5
vivekp has joined #mlpack
< rcurtin>
XML is nested, so the probabilities at the end correspond to the root node
< rcurtin>
and like I said, if it is not a leaf node in the tree, the 'classProbabilities' member holds the split value of that node
< rcurtin>
so the root node splits on dimension 0, and splits on value 2.83
< rcurtin>
for the two children, the dimensionTypeOrMajorityClass value holds the majority class, which is 0 for the first node and 1 for the second node
< aneiman>
Is the majority class the most common value of the feature?
< aneiman>
I see that 2.83 is the average of the feature values, but where can I see the feature values related to the labels?
< aneiman>
I'm sorry for so many questions, but I'm evaluating mlpack for use in a new project, so I need to try to understand the results
< rcurtin>
I don't understand what you mean when you say 'the feature values'
< rcurtin>
and don't worry about the questions, I am happy to try and help where I can :)
< rcurtin>
I am doing some other work too though, so sometimes my responses may be a little slow
< aneiman>
In my case the feature values are 2.33 and 3.33 (in the training file), and if the feature value is 2.33 then the label value is 0; if the feature value is 3.33 then the label value is 1
< aneiman>
I'm trying to find out how to retrieve that information from the decision tree output
< rcurtin>
the tree does not store the values of the features that it was trained on
< rcurtin>
instead, you would have to make a file of test points, and then use the tree to classify them
< rcurtin>
i.e. mlpack_decision_tree -T test_points.csv -m model.xml --predictions_file predicted_classes.csv
< rcurtin>
however, there was a bug in mlpack 2.2.1 and 2.2.2 that caused --predictions_file to not work correctly, so you probably need to upgrade to mlpack 2.2.3
< aneiman>
Is mlpack 2.2.3 a stable version?
< aneiman>
And in this case test_points.csv should include the 2 values 2.33 and 3.33. Is that correct?
< rcurtin>
yes, 2.2.3 is stable
< rcurtin>
and yes, you could put 2.33 and 3.33 into test_points.csv
< rcurtin>
you could actually put any value into test_points.csv; based on the way the tree was trained, anything below 2.83 will be predicted to have label value 0 and anything greater than 2.83 will be predicted to have label value 1
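A quick sketch of checking that threshold behavior from C++, assuming the single-point Classify() overload and the 'tree' object trained in the earlier sketch:

    // Points on either side of the learned split value 2.83.
    arma::vec low("1.0"), mid("2.5"), high("100.0");
    std::cout << tree.Classify(low) << std::endl;   // expected: 0
    std::cout << tree.Classify(mid) << std::endl;   // expected: 0 (2.5 < 2.83)
    std::cout << tree.Classify(high) << std::endl;  // expected: 1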
< aneiman>
I'll try it tomorrow. Thank you very much for your help!
< rcurtin>
sure, happy to try and help out :)
< sgupta>
rcurtin: hi! I guess we have to install docker-squash on the server.
< rcurtin>
sgupta: seems like upgrading to Docker 1.13 could also provide --squash functionality:
< rcurtin>
let me see if I can get 1.13 on masterblaster
< sgupta>
rcurtin: sure. That'll help
< rcurtin>
hm, not available in Ubuntu yet, so let's just do the best we can without the --squash option for now, and add the flag when the package becomes available
< rcurtin>
I can see 1.13.1 is available in debian sid, so it's presumably just a matter of time until the Ubuntu versions are upgraded
< rcurtin>
I guess, do you know how much of a difference squashing would make?
< rcurtin>
if it is really a huge improvement I can go out of my way to get 1.13 there :)
< sgupta>
rcurtin: well! The examples that I looked at showed great improvement
< sgupta>
rcurtin: but we require a large number of libraries to run mlpack. So, not sure whether the improvement would be huge.
< rcurtin>
sgupta: ok, I'll go ahead and set up docker's repos then to get the new version and we can see how it does
< rcurtin>
hang on...
< sgupta>
rcurtin: sure :)
< rcurtin>
I need to bring down the docker service, can I stop all the containers you are running on masterblaster?
sumedhghaisas has joined #mlpack
< sgupta>
rcurtin: yes sure. Please go ahead.
< rcurtin>
ok, now it's version 17.03.1-ce
< rcurtin>
that should have the --squash option, let me know if there are any issues
shikhar has quit [Read error: Connection reset by peer]
< sgupta>
rcurtin: the squash flag is not there. I guess it was just experimental and was removed in production.
shikhar_ has joined #mlpack
shikhar_ has quit [Read error: Connection reset by peer]
shikhar_ has joined #mlpack
s1998 has quit [Read error: Connection reset by peer]
s1998 has joined #mlpack
shikhar_ has quit [Ping timeout: 246 seconds]
Trion has quit [Quit: Have to go, see ya!]
shikhar_ has joined #mlpack
kris1 has joined #mlpack
< kris1>
Just a simple question: why in the linear layer are we doing this: '//! Modify the parameters.'
< kris1>
Shouldn't we also return the bias parameters?
< rcurtin>
kris1: take a look at the Reset() function---the matrices 'weight' and 'bias' are embedded in the 'weights' matrix
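Roughly, Reset() makes 'weight' and 'bias' into views of the single flat 'weights' matrix. A minimal standalone sketch of the trick (a stand-in struct for illustration, not the actual Linear<> code):

    #include <armadillo>

    // Stand-in for mlpack's Linear<> layer, showing how 'weight' (outSize x
    // inSize) and 'bias' (outSize x 1) are laid out inside the flat 'weights'
    // parameter matrix.  The 'false, false' arguments ask Armadillo to reuse
    // the given memory rather than copy it, so updating 'weights' updates
    // 'weight' and 'bias' as well.
    struct LinearSketch
    {
      size_t inSize, outSize;
      arma::mat weights, weight, bias;

      LinearSketch(const size_t in, const size_t out) :
          inSize(in), outSize(out),
          weights(out * in + out, 1, arma::fill::randu)
      {
        Reset();
      }

      void Reset()
      {
        weight = arma::mat(weights.memptr(), outSize, inSize, false, false);
        bias = arma::mat(weights.memptr() + weight.n_elem, outSize, 1,
            false, false);
      }
    };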
< kris1>
Ohhh okay
< kris1>
Thanks. Just one more thing: 'InputDataType inputParameter;' is never actually set in either linear.hpp or linear_impl.hpp
s1998 has quit [Read error: Connection reset by peer]
shikhar_ has quit [Quit: WeeChat 1.7]
< rcurtin>
sgupta: when I try to use --squash, I get:
< rcurtin>
Error response from daemon: squash is only supported with experimental mode
< rcurtin>
I can restart the docker daemon in experimental mode if you like
< sgupta>
rcurtin: yes sure
< rcurtin>
sgupta: ok, enabled now, try again
< sgupta>
rcurtin: okay
< zoq>
kris1: inputParameter is used to transfer/store the input between layers, so even if it's not used internally, it might be used by the previous or next layer.
< zoq>
rcurtin: Any idea about the core.hpp issues?
< rcurtin>
looking now...
< zoq>
thanks!
< rcurtin>
I suspect that the linter is getting confused, as the first error warns might happen, but I can't see any reason why
< rcurtin>
but I'm not sure what the parse error is
< rcurtin>
I'd be fine just leaving core.hpp out of the style check
< rcurtin>
hm, so I did a comparison with TensorFlow just now
< rcurtin>
I built a three-layer ReLU FFN with mlpack and with Keras, using RMSprop for the optimizer and the pokerhand dataset (700k 10-d training points)
< rcurtin>
I found that training with mlpack took 14 minutes while TensorFlow (via Keras) took 11.5 minutes
< rcurtin>
but then predictions for the test set (300k points) only took 1.5 seconds with mlpack but 5.3 seconds with TensorFlow
< rcurtin>
this is really good, because I am in the process of negotiating internally at Symantec to get the mlpack neural network code in use on Symantec endpoint software (like the virus scanners that run on people's systems)
< rcurtin>
now I have a data point strongly supporting the use of mlpack over TF :) (I'll need more, but this is a start!)
< rcurtin>
ah I should also say, this was all CPU-only testing; mlpack was using OpenBLAS, TF using whatever defaults
chenzhe has joined #mlpack
< zoq>
I can also think about some ideas to speed it up.
< rcurtin>
chenzhe: I took a look into it, solve() uses LAPACK's dgels() and dgelsd(), which find the least-squares solution for an overdetermined system and the minimum norm solution for an underdetermined system
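A minimal Armadillo sketch of the overdetermined case, comparing solve() against the explicit normal-equations solution:

    #include <armadillo>
    #include <iostream>

    int main()
    {
      // Overdetermined system: 10 equations, 3 unknowns (tall thin A).
      arma::mat A = arma::randu<arma::mat>(10, 3);
      arma::vec b = arma::randu<arma::vec>(10);

      // solve() returns the least-squares solution (via LAPACK's dgels()).
      arma::vec x = arma::solve(A, b);

      // The same solution via the normal equations: x = (A^T A)^{-1} A^T b.
      arma::vec xNormal = arma::solve(A.t() * A, A.t() * b);

      std::cout << arma::norm(x - xNormal) << std::endl;  // close to zero

      return 0;
    }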
< rcurtin>
oh, nice
< rcurtin>
I was playing with CNNs too, it seems like there could be a lot of speedup there
< zoq>
yes, the conv operation is super slow
< rcurtin>
trying to implement the MNIST Keras tutorial in mlpack I found that a single epoch takes about 100 minutes with mlpack :)
< rcurtin>
vs. 90 seconds with Keras/TF on the CPU
< zoq>
oh
aneiman has quit [Ping timeout: 260 seconds]
< rcurtin>
I think there are some unnecessary copies going on; I've been playing with it
< rcurtin>
I've got about a 10% speedup so far, but for today I'm out of time to dig deeper
< chenzhe>
rcurtin: Cool! Thanks Ryan~
< zoq>
definitely
< rcurtin>
I think that in the Convolution<> class there is some copying going on with outputTmp, but I don't think I've pinned everything down yet
< rcurtin>
switching NaiveConvolution<> to use arma::conv2() actually slowed things down by a factor of about 2
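For reference, a minimal sketch of the arma::conv2() call under discussion (assuming a reasonably recent Armadillo); conv2() computes a full 2-D convolution by default, and the 'valid' region a Forward() pass needs is the centre of that result:

    #include <armadillo>

    int main()
    {
      arma::mat image = arma::randu<arma::mat>(28, 28);
      arma::mat kernel = arma::randu<arma::mat>(3, 3);

      // Full convolution: output is (28 + 3 - 1) x (28 + 3 - 1) = 30 x 30.
      arma::mat full = arma::conv2(image, kernel);

      // The "valid" part (no zero padding) is the 26 x 26 centre of the
      // full result.
      arma::mat valid = full.submat(kernel.n_rows - 1, kernel.n_cols - 1,
                                    image.n_rows - 1, image.n_cols - 1);

      // conv2() can also produce a same-size output directly.
      arma::mat same = arma::conv2(image, kernel, "same");

      return 0;
    }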
< zoq>
what kernel did you use for the test?
< zoq>
I think something like 4x4 would be hard to beat without Winograd.
< rcurtin>
I was only testing Forward(), which uses ValidConvolution
< rcurtin>
or, sorry... I was only using Forward() in later tests where I was just testing the calculation of Evaluate()
< rcurtin>
for the full test, I used the default, so for Forward() that's ValidConvolution and for Backward() that's FullConvolution
< rcurtin>
FullConvolution looks to have a copy in it; I didn't look too hard into how to avoid it, I think it will be tricky
< rcurtin>
do you think I should try FFTConvolution instead of NaiveConvolution?
< rcurtin>
I am not too familiar with the state of the art in CNNs, so I'm not sure exactly how TF will do the CNN calculation under the hood
< zoq>
Depends on the kernel size, but I wouldn't expect a huge speedup.
< rcurtin>
this was just 3x3
< zoq>
hm, okay, probably not the best size for fft
< rcurtin>
I'll look into it more tomorrow---I think there are some optimizations that can be done to produce significant speedups via avoiding copies
< zoq>
yes
< zoq>
Not sure what Conrad's plans are, but I think it would be somewhat straightforward to use cuDNN for the conv operator.
< rcurtin>
I'm working with Conrad to try and assemble a library called 'bandicoot', the idea being that it's a GPU matrix library
< rcurtin>
I hope that eventually it will be usable just like arma::mat or arma::sp_mat
< rcurtin>
but I think it will be a while until it gets there, probably many months
< rcurtin>
I only have a few hours a week to contribute, so I think he is doing the majority of the work by far
< zoq>
Would be nice to just build against bandicoot instead of armadillo.
< rcurtin>
I think it would be really tricky to integrate cuDNN as-is into Armadillo, since arma::mat has no way to store gpu-specific memory
< rcurtin>
yeah; right now bandicoot doesn't really do much yet though
< zoq>
Right, maybe not that straightforward.
chenzhe1 has joined #mlpack
chenzhe has quit [Ping timeout: 246 seconds]
chenzhe1 is now known as chenzhe
chenzhe1 has joined #mlpack
chenzhe has quit [Read error: Connection reset by peer]
chenzhe1 is now known as chenzhe
aashay has quit [Quit: Connection closed for inactivity]
< rcurtin>
zoq: I think #1019 is ready to merge, if you agree go ahead and merge it
sgupta has quit [Ping timeout: 240 seconds]
sgupta has joined #mlpack
mikeling has quit [Quit: Connection closed for inactivity]
< zoq>
rcurtin: okay, merged
sumedhghaisas has quit [Ping timeout: 255 seconds]
travis-ci has joined #mlpack
< travis-ci>
mlpack/mlpack#2509 (master - 09d8793 : Marcus Edel): The build was fixed.
< kris1>
zoq: Can you tell me why you are doing this: boost::apply_visitor(SetInputWidthVisitor(width), network[i]); when executing a forward pass through an FFN network with, let's say, linear layers?
< kris1>
and does the input width indicate the number of features in the data points, and the output width the number of points?
< zoq>
kris1: Actually, it only does anything for the Convolution layer, since it implements the InputWidth() function. Suppose you have some 3rd-order tensor (an RGB image): some layers in the network need the input size (e.g. the width and height of the input image or of the previous layer). All this function does is propagate that information through the network. I think in your case you don't have to do this, unless you'd like to
< zoq>
use e.g. the Conv layer.
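A toy illustration of the mechanism zoq describes, with hypothetical stand-in layer types; the real SetInputWidthVisitor in mlpack detects an InputWidth() member via SFINAE rather than explicit overloads:

    #include <boost/variant.hpp>
    #include <cstddef>
    #include <vector>

    // Toy stand-in layer types (hypothetical, for illustration only):
    // only ConvLayer actually stores an input width.
    struct LinearLayer { };
    struct ConvLayer { std::size_t inputWidth = 0; };

    using LayerTypes = boost::variant<LinearLayer*, ConvLayer*>;

    // Mimics the idea of SetInputWidthVisitor: set the width only on layer
    // types that have one, and do nothing for the rest.
    struct SetInputWidthVisitor : public boost::static_visitor<void>
    {
      std::size_t width;
      explicit SetInputWidthVisitor(const std::size_t w) : width(w) { }

      void operator()(LinearLayer* /* layer */) const { /* nothing to set */ }
      void operator()(ConvLayer* layer) const { layer->inputWidth = width; }
    };

    // Frees the heap-allocated layers at the end of the example.
    struct DeleteVisitor : public boost::static_visitor<void>
    {
      template<typename T>
      void operator()(T* layer) const { delete layer; }
    };

    int main()
    {
      std::vector<LayerTypes> network = { new LinearLayer(), new ConvLayer() };

      // Propagate an input width of 50 through the whole network.
      for (LayerTypes& layer : network)
        boost::apply_visitor(SetInputWidthVisitor(50), layer);

      for (LayerTypes& layer : network)
        boost::apply_visitor(DeleteVisitor(), layer);

      return 0;
    }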
sumedhghaisas has joined #mlpack
< kris1>
Yes
< rcurtin>
kris1: have you had a chance to write a blog post yet? let me know if there are any issues pushing to the blog repository
< kris1>
rcurtin: Actually no, I would start writing tomorrow, I guess.
< zoq>
sgupta: Excited to read your next report, ubuntu -> alpine -> busybox -> ... -> debian ----> FreeBSD; that is gonna make one hell of a story :)
< rcurtin>
kris1: ok, sounds good
chenzhe has quit [Quit: chenzhe]
< kris1>
rcurtin: I just completed the blog post and pushed it to the repo. Have a look :)
sgupta has quit [Ping timeout: 240 seconds]
sgupta has joined #mlpack
sumedhghaisas has quit [Remote host closed the connection]