#mlpack on 2017-06-06 — irc logs at libera.irclog.whitequark.org

2015-01-15 23:05 verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/

00:48 sgupta has quit [Ping timeout: 260 seconds]

02:19 vivekp has quit [Ping timeout: 240 seconds]

02:20 vivekp has joined #mlpack

02:28 mikeling has joined #mlpack

03:00 chenzhe has joined #mlpack

03:05 < chenzhe> a simple question: In Armadillo, when we call solve(A, b) for a short fat matrix A, does it automatically give the least-squares solution? The example on Armadillo's Wikipedia page seems to suggest so~

03:08 < chenzhe> sorry, I mean, tall thin matrix A~

03:28 < rcurtin> chenzhe: I think it does, I will check for sure tomorrow

03:29 < rcurtin> the source will be something like auxlib::solve() in armadillo_bits/auxlib_meat.hpp

03:29 < rcurtin> but that may be a hard file to read, Armadillo internals can be kinda crazy :)

03:38 < chenzhe> rcurtin: Thanks a lot Ryan!

03:49 chenzhe has quit [Ping timeout: 246 seconds]

03:58 chenzhe has joined #mlpack

04:00 chenzhe has quit [Read error: Connection reset by peer]

04:01 chenzhe has joined #mlpack

04:05 chenzhe has quit [Ping timeout: 255 seconds]

04:28 kris1 has joined #mlpack

05:02 s1998 has joined #mlpack

05:58 s1998 has quit [Ping timeout: 246 seconds]

06:04 s1998 has joined #mlpack

06:11 s1998 has left #mlpack []

06:24 Trion has joined #mlpack

07:25 shikhar has joined #mlpack

07:27 sgupta has joined #mlpack

07:52 s1998 has joined #mlpack

08:09 vivekp has quit [Ping timeout: 240 seconds]

08:11 vivekp has joined #mlpack

08:41 Trion has quit [Ping timeout: 246 seconds]

08:48 Trion has joined #mlpack

09:22 Trion has quit [Remote host closed the connection]

09:22 Trion has joined #mlpack

11:21 Trion has quit [Quit: Have to go, see ya!]

12:05 kris1 has quit [Remote host closed the connection]

12:38 aneiman has joined #mlpack

12:38 < aneiman> hello

12:39 < aneiman> I am trying to run the decision tree on the simplest example, and it is not clear to me where to find the expected result

12:40 < aneiman> my training file includes 1 feature and 10 lines: the first 5 lines contain the number 2.33, the remaining lines the number 3.33

12:41 < aneiman> The labels file includes 10 lines: the first 5 lines contain the number 0, the rest the number 1

12:42 < aneiman> I expect the output model will somehow express "if feature = 2.33 => label = 0; else if feature = 3.33 => label = 1"

12:43 < aneiman> the output model includes following :<model class_id="0" tracking_level="0" version="0"> <tree class_id="1" tracking_level="0" version="0"> <numChildren>0</numChildren> <splitDimension>140237688660736</splitDimension> <dimensionTypeOrMajorityClass>0</dimensionTypeOrMajorityClass> <classProbabilities class_id="2" tracking_level="0" version="0"> <n_rows>2</n_rows> <n_cols>1</n_cols> <n_elem>2</n_elem> <vec_state>1</v

12:44 < aneiman> How can I see the relation between the feature and label value ?

12:44 < aneiman> Thanks in advance

12:54 nish21 has joined #mlpack

12:55 < aneiman> my command line:

12:55 < aneiman> ~/mlpack-2.2.1/build/bin/mlpack_decision_tree -t my_tree_example.csv --labels_file my_tree_labels.csv --output_model_file /tmp/my_dec_tree_model.xml

12:57 Trion has joined #mlpack

13:09 < rcurtin> aneiman: unfortunately the model files aren't really meant to be readable

13:09 < rcurtin> to me that output looks like the decision tree didn't split at all, and the majority class is class 0

13:15 < aneiman> Ok, so which file will show me this relation :" if feature = 2.33 => label =0 ; else if feature = 3.33=> label = 1" ?

13:16 < aneiman> what is the majority class? The label value? In my case it is 50% - 0, 50% - 1

13:22 nish21 has quit [Ping timeout: 260 seconds]

13:30 < rcurtin> aneiman: sorry for the slow response

13:30 < rcurtin> in what you pasted I can't see what the class probabilities are (they should be 0.5 0.5)

13:31 sgupta has quit [Ping timeout: 260 seconds]

13:31 < rcurtin> I don't think any output file will show you that relation specifically, instead you would probably have to either parse the XML to extract that, or write some C++ and work with the DecisionTree<> object itself

13:31 < rcurtin> in your case, the tree doesn't split because there aren't enough samples

13:32 < rcurtin> by default, the --minimum_leaf_size parameter is set to 20

13:32 < rcurtin> meaning that a node won't split if it has less than 20 points in it

13:32 < rcurtin> and in your case, there are only 10 points, so no splitting

13:32 < rcurtin> you could try --minimum_leaf_size with some smaller value (or make a larger training set)

13:33 < rcurtin> and then when you looked at my_dec_tree_model.xml, you would probably see that the tree had three nodes

13:33 < aneiman> The probabilities are visible in the output model :<classProbabilities class_id="2" tracking_level="0" version="0"> <n_rows>2</n_rows> <n_cols>1</n_cols> <n_elem>2</n_elem> <vec_state>1</vec_state> <item>0.5</item> <item>0.5</item> </classProbabilities>

13:34 < rcurtin> ah, there it is---the two '0.5' items mean that the probability of each class is 0.5, which is correct when the tree doesn't split

13:36 < aneiman> yes, the probabilities are correct

13:36 Trion has quit [Ping timeout: 240 seconds]

13:37 < aneiman> I extended the example to 20 lines - just copy-paste, but the result is the same - I don't see " if feature = 2.33 => label =0 ; else if feature = 3.33=> label = 1"

13:38 < rcurtin> try 22 lines, maybe you need just a few more to get it to split

13:39 < aneiman> You wrote about parsing the XML to get the relation between feature and labels. Which fields of the XML express it?

13:39 < rcurtin> each node has four features:

13:39 < rcurtin> - numChildren

13:39 < rcurtin> - splitDimension

13:39 < rcurtin> - dimensionTypeOrMajorityClass

13:39 < rcurtin> - classProbabilities

13:40 < rcurtin> (I guess if the node has children it also has a 'children' feature)

13:40 < rcurtin> the splitDimension, if it's not size_t(-1) (the really large number) means that the node is not a leaf, and that field represents the dimension for splitting

13:40 < rcurtin> dimensionTypeOrMajorityClass represents the majority class, if the node is a leaf, and the type of dimension (categorical or numeric) otherwise

13:41 < rcurtin> if the node is not a leaf node, then the classProbabilities vector will hold the actual split value

13:41 < rcurtin> I know that might seem somewhat confusing (and it is) but the data members are compressed in this way to save space

13:41 < aneiman> So dimensionTypeOrMajorityClass is the feature value ? And what is the label value ?

13:42 < aneiman> I tried 22 lines - the same result

13:42 < rcurtin> can you paste your data into pastebin or something?

13:43 < aneiman> what is pastebin?

13:43 < rcurtin> https://pastebin.com/

13:43 < rcurtin> I want to see the contents of the files, it seems that something is wrong here

13:44 Trion has joined #mlpack

13:47 < aneiman> I pasted, do you see it ?

13:52 < rcurtin> you have to send me the link

13:54 sgupta has joined #mlpack

14:00 < aneiman> Please, find below:

14:00 < aneiman> https://pastebin.com/FrSGqnjZ

14:06 < rcurtin> right, sorry, I misspoke, try that again with --minimum_leaf_size=10
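One reading that fits what happened above, sketched as a small check. It assumes (the log does not confirm this) that a split is only accepted when each child would keep at least `minimum_leaf_size` points, so a node needs at least twice that many points before it can split. `can_split` is a hypothetical helper for illustration, not an mlpack function.

```python
def can_split(num_points: int, minimum_leaf_size: int) -> bool:
    """Assumed rule: both children must hold at least minimum_leaf_size points."""
    return num_points >= 2 * minimum_leaf_size


# The attempts from the conversation: 10, 20 and 22 points with the default
# of 20, then 22 points with --minimum_leaf_size=10.
for num_points, minimum_leaf_size in [(10, 20), (20, 20), (22, 20), (22, 10)]:
    verdict = "may split" if can_split(num_points, minimum_leaf_size) else "stays a leaf"
    print(f"{num_points} points, minimum_leaf_size={minimum_leaf_size}: {verdict}")
```

Under that assumption only the last attempt is allowed to split, which matches the order of events in the log.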

14:11 < aneiman> Yes, it changes. I pasted the xml file to https://pastebin.com/QUwyUjHp

14:11 < aneiman> But still I don't see the feature value in the output model.

14:12 < aneiman> Please, confirm that dimensionTypeOrMajorityClass is the label value

14:13 vivekp has quit [Ping timeout: 255 seconds]

14:15 < aneiman> And an additional question:

14:18 < aneiman> why is the probability (at the end) 2.8300000000000001? It seems it should be 0.5-0.5

14:21 vivekp has joined #mlpack

14:23 < rcurtin> XML is nested, so the probabilities at the end correspond to the root node

14:24 < rcurtin> and like I said, if it is not a leaf node in the tree, the 'classProbabilities' member holds the split value of that node

14:24 < rcurtin> so the root node splits on dimension 0, and splits on value 2.83

14:24 < rcurtin> for the two children, the dimensionTypeOrMajorityClass value holds the majority class, which is 0 for the first node and 1 for the second node

14:28 < aneiman> Is the majority class the major value of feature ?

14:29 < aneiman> I see that 2.83 is the average of the feature, but where can I see the features values, related to labels ?

14:30 < aneiman> I'm sorry for so many questions, but I am evaluating mlpack for use in a new project, so I need to try it and understand the results

14:32 < rcurtin> I don't understand what you mean when you say 'the feature values'

14:32 < rcurtin> and don't worry about the questions, I am happy to try and help where I can :)

14:33 < rcurtin> I am doing some other work too though, so sometimes my responses may be a little slow

14:35 < aneiman> In my case the feature values are 2.33 and 3.33 (in the training file), and if the feature value is 2.33 then the label value is 0; if the feature value is 3.33 then the label value is 1

14:36 < aneiman> I am trying to find out how to retrieve such information from the decision tree output

14:36 < rcurtin> the tree does not store the values of the features that it was trained on

14:36 < rcurtin> instead, you would have to make a file of test points, and then use the tree to classify them

14:37 < rcurtin> i.e. mlpack_decision_tree -T test_points.csv -m model.xml --predictions_file predicted_classes.csv

14:37 < rcurtin> however, there was a bug in mlpack 2.2.1 and 2.2.2 that caused --predictions_file to not work correctly, so you probably need to upgrade to mlpack 2.2.3 for that to work correctly

14:38 < aneiman> Is mlpack 2.2.3 a stable version?

14:39 < aneiman> And in this case test_points.csv should include 2 values 2.33 and 3.33. Is it correct ?

14:40 < rcurtin> yes, 2.2.3 is stable

14:40 < rcurtin> and yes, you could put 2.33 and 3.33 into test_points.csv

14:40 < rcurtin> you could actually put any value into test_points.csv; based on the way that tree has trained, anything below 2.83 will be predicted to have label value 0 and anything greater than 2.83 will be predicted to have label value 1
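The field meanings rcurtin describes can be put together in a small Python sketch. This is not mlpack code: the `Node` class, `classify` and `describe` are hypothetical names, the leaf probabilities and the root's dimension-type value are illustrative placeholders, and sending a value exactly equal to the split to the second child is an assumption (the log only covers "below" and "greater than").

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Field names mirror the XML tags discussed in the log.
    num_children: int
    split_dimension: int                   # only meaningful for non-leaf nodes
    dimension_type_or_majority_class: int  # majority class if leaf, else dimension type
    class_probabilities: List[float]       # probabilities if leaf, else [split value]
    children: List["Node"] = field(default_factory=list)


def is_leaf(node: Node) -> bool:
    return node.num_children == 0


def classify(node: Node, point: List[float]) -> int:
    """Walk from the root to a leaf and return that leaf's majority class."""
    while not is_leaf(node):
        split_value = node.class_probabilities[0]
        # Assumption: ties go to the second child.
        index = 0 if point[node.split_dimension] < split_value else 1
        node = node.children[index]
    return node.dimension_type_or_majority_class


def describe(node: Node, indent: str = "") -> List[str]:
    """Render the tree as readable if/else lines."""
    if is_leaf(node):
        return [f"{indent}label = {node.dimension_type_or_majority_class}"]
    split_value = node.class_probabilities[0]
    lines = [f"{indent}if feature[{node.split_dimension}] < {split_value}:"]
    lines += describe(node.children[0], indent + "    ")
    lines.append(f"{indent}else:")
    lines += describe(node.children[1], indent + "    ")
    return lines


# The tree from the conversation: the root splits dimension 0 at 2.83,
# and its two children are leaves with majority classes 0 and 1.
left = Node(0, 0, 0, [1.0, 0.0])   # probabilities are illustrative
right = Node(0, 0, 1, [0.0, 1.0])  # probabilities are illustrative
root = Node(2, 0, 0, [2.83], [left, right])

print("\n".join(describe(root)))
print(classify(root, [2.33]), classify(root, [3.33]))
```

Printing the tree this way gives the kind of "if feature ... => label ..." rule aneiman was looking for, with the caveat that the rule is a threshold on the feature rather than a match on the exact training values.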

14:40 < aneiman> I'll try it tomorrow. Thank you very much for your help!

14:41 < rcurtin> sure, happy to try and help out :)

14:45 < sgupta> rcurtin: hi! I guess we have to install docker-squash on the server.

14:47 < rcurtin> sgupta: seems like upgrading to Docker 1.13 could also provide --squash functionality:

14:47 < rcurtin> https://blog.docker.com/2017/01/whats-new-in-docker-1-13/

14:47 < sgupta> rcurtin: yes I read that too!

14:48 < rcurtin> let me see if I can get 1.13 on masterblaster

14:49 < sgupta> rcurtin: sure. That'll help

14:50 < rcurtin> hm, not available in Ubuntu yet, so let's just do the best we can without the --squash option for now, and then when the package becomes available we can add the flag then

14:51 < rcurtin> I can see 1.13.1 is available in debian sid, so it's presumably just a matter of time until the Ubuntu versions are upgraded

14:52 < rcurtin> I guess, do you know how much of a difference squashing would make?

14:52 < rcurtin> if it is really a huge improvement I can go out of my way to get 1.13 there :)

15:22 < sgupta> rcurtin: well! The examples that I looked at showed great improvement

15:22 < sgupta> rcurtin: but we require a large number of libraries to run mlpack. So, not sure whether the improvement would be huge.

15:24 < rcurtin> sgupta: ok, I'll go ahead and set up docker's repos then to get the new version and we can see how it does

15:24 < rcurtin> hang on...

15:24 < sgupta> rcurtin: sure :)

15:26 < rcurtin> I need to bring down the docker service, can I stop all the containers you are running on masterblaster?

15:36 sumedhghaisas has joined #mlpack

15:40 < sgupta> rcurtin: yes sure. Please go ahead.

15:42 < rcurtin> ok, now it's version 17.03.1-ce

15:42 < rcurtin> that should have the --squash option, let me know if there are any issues

15:45 shikhar has quit [Read error: Connection reset by peer]

15:51 < sgupta> rcurtin: the squash flag is not there. I guess it was just experimental and they removed it in production.

15:52 shikhar_ has joined #mlpack

15:58 shikhar_ has quit [Read error: Connection reset by peer]

15:58 shikhar_ has joined #mlpack

16:01 s1998 has quit [Read error: Connection reset by peer]

16:04 s1998 has joined #mlpack

16:10 shikhar_ has quit [Ping timeout: 246 seconds]

16:13 Trion has quit [Quit: Have to go, see ya!]

16:16 shikhar_ has joined #mlpack

16:23 kris1 has joined #mlpack

17:09 < kris1> Just a simple question: why in the linear layer are we doing this: //! Modify the parameters.

17:09 < kris1> OutputDataType& Parameters() { return weights; }

17:09 < kris1> Should we also not return the bias parameters?

17:12 < rcurtin> kris1: take a look at the Reset() function---the matrices 'weight' and 'bias' are embedded in the 'weights' matrix
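A rough Python illustration of what "embedded" can mean here: one flat buffer holds every trainable parameter, and the weight and bias are views into it, so returning the buffer returns both. The layout used below (weight first in column-major order, then bias) and the names `in_size`, `out_size` and `forward` are assumptions for illustration; the log only says the two matrices live inside `weights`.

```python
from array import array

in_size, out_size = 3, 2

# One flat buffer plays the role of 'weights' (all trainable parameters).
weights = array("d", [0.0] * (out_size * in_size + out_size))

# Views into the same memory; nothing is copied.
view = memoryview(weights)
weight = view[: out_size * in_size]  # out_size x in_size, column-major (assumed)
bias = view[out_size * in_size:]     # out_size x 1

# Writing through the views changes the shared buffer.
for k in range(len(weight)):
    weight[k] = float(k + 1)
bias[0], bias[1] = 10.0, 20.0


def forward(x):
    """Compute weight * x + bias using the column-major layout."""
    return [
        sum(weight[col * out_size + row] * x[col] for col in range(in_size)) + bias[row]
        for row in range(out_size)
    ]


print(list(weights))
print(forward([1.0, 1.0, 1.0]))
```

An optimizer that only ever sees the flat buffer still updates the bias, because the bias is just the tail of that buffer.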

17:12 < kris1> Ohhh okay

17:30 < kris1> Thanks, just one more thing: InputDataType inputParameter; is never actually set in either linear.hpp or linear_impl.hpp

17:30 s1998 has quit [Read error: Connection reset by peer]

17:37 shikhar_ has quit [Quit: WeeChat 1.7]

17:53 < rcurtin> sgupta: when I try to use --squash, I get:

17:53 < rcurtin> Error response from daemon: squash is only supported with experimental mode

17:53 < rcurtin> I can restart the docker daemon in experimental mode if you like

17:54 < sgupta> rcurtin: yes sure

17:58 < rcurtin> sgupta: ok, enabled now, try again

17:58 < sgupta> rcurtin: okay

18:19 < zoq> kris1: inputParameter is used to transfer/store the input between layer, so even if it's not internally used, it might be used for the previous or next layer.

18:23 < zoq> rcurtin: Any idea about the core.hpp issues?

18:25 < rcurtin> looking now...

18:25 < zoq> thanks!

18:26 < rcurtin> I suspect that the linter is getting confused like the first error warns might happen, but I can't see any reason why

18:27 < rcurtin> but I'm not sure what the parse error is

18:27 < rcurtin> I'd be fine just leaving core.hpp out of the style check

18:29 < rcurtin> hm, so I did a comparison with TensorFlow just now

18:30 < rcurtin> I built a three-layer ReLU FFN with mlpack and with Keras, using RMSprop for the optimizer and the pokerhand dataset (700k 10-d training points)

18:30 < rcurtin> I found that training with mlpack took 14 minutes while TensorFlow (via Keras) took 11.5 minutes

18:30 < rcurtin> but then predictions for the test set (300k points) only took 1.5 seconds with mlpack but 5.3 seconds with TensorFlow

18:31 < rcurtin> this is really good, because I am in the process of negotiating internally at Symantec to get the mlpack neural network code in use on Symantec endpoint software (like the virus scanners that run on people's systems)

18:31 < rcurtin> now I have a data point strongly supporting the use of mlpack over TF :) (I'll need more, but this is a start!)

18:32 < rcurtin> ah I should also say, this was all CPU-only testing; mlpack was using OpenBLAS, TF using whatever defaults

18:32 chenzhe has joined #mlpack

18:33 < zoq> I can also think about some ideas to speed it up.

18:33 < rcurtin> chenzhe: I took a look into it, solve() uses LAPACK's dgels() and dgelsd(), which find the least-squares solution for an overdetermined system and the minimum norm solution for an underdetermined system
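To make "least-squares solution for an overdetermined system" concrete, here is a pure-Python sketch that fits a line to four points (a tall thin 4x2 matrix A). It solves the normal equations A^T A x = A^T b, which is a stand-in chosen for brevity and not what the LAPACK routines do internally; they rely on matrix factorizations, which are numerically more robust. `least_squares_2col` is a hypothetical helper.

```python
def least_squares_2col(A, b):
    """Minimise ||A x - b||_2 for a tall matrix A with exactly two columns."""
    # Entries of the 2x2 matrix A^T A and of the 2-vector A^T b.
    s00 = sum(row[0] * row[0] for row in A)
    s01 = sum(row[0] * row[1] for row in A)
    s11 = sum(row[1] * row[1] for row in A)
    t0 = sum(row[0] * y for row, y in zip(A, b))
    t1 = sum(row[1] * y for row, y in zip(A, b))

    det = s00 * s11 - s01 * s01
    if det == 0:
        raise ValueError("columns of A are linearly dependent")

    # Solve the 2x2 system with Cramer's rule.
    return [(t0 * s11 - s01 * t1) / det, (s00 * t1 - s01 * t0) / det]


# Four equations, two unknowns: no exact solution exists, so we get the best fit.
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [1.0, 3.0, 5.0, 8.0]
intercept, slope = least_squares_2col(A, b)
print(intercept, slope)
```

The four points do not lie on one line, so the result is the intercept and slope that minimise the sum of squared residuals rather than an exact solution.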

18:33 < rcurtin> oh, nice

18:33 < rcurtin> I was playing with CNNs too, it seems like there could be a lot of speedup there

18:33 < zoq> yes, the conv operation is super slow

18:34 < rcurtin> trying to implement the MNIST Keras tutorial in mlpack I found that a single epoch takes about 100 minutes with mlpack :)

18:34 < rcurtin> vs. 90 seconds with Keras/TF on the CPU

18:34 < zoq> oh

18:34 aneiman has quit [Ping timeout: 260 seconds]

18:34 < rcurtin> I think there are some unnecessary copies going on; I've been playing with it

18:34 < rcurtin> I've got about a 10% speedup so far, but for today I'm out of time to dig deeper

18:34 < chenzhe> rcurtin: Cool! Thanks Ryan~

18:34 < zoq> definitely

18:34 < rcurtin> I think that in the Convolution<> class there is some copying going on with outputTmp, but I don't think I've pinned everything down yet

18:35 < rcurtin> switching NaiveConvolution<> to use arma::conv2() actually slowed things down by a factor of about 2

18:46 < zoq> what kernel did you use for the test?

18:49 < zoq> I think something like 4x4 would be hard to beat without winograd.

18:49 < rcurtin> I was only testing Forward(), which uses ValidConvolution

18:50 < rcurtin> or, sorry... I was only using Forward() in later tests where I was just testing the calculation of Evaluate()

18:50 < rcurtin> for the full test, I used the default, so for Forward() that's ValidConvolution and for Backward() that's FullConvolution

18:50 < rcurtin> FullConvolution looks to have a copy in it; I didn't look too hard into how to avoid it, I think it will be tricky

18:51 < rcurtin> do you think I should try FFTConvolution not NaiveConvolution?

18:51 < rcurtin> I am not too familiar with the state of the art in CNNs, so I'm not sure exactly how TF will do the CNN calculation under the hood

18:53 < zoq> Depends on the kernel size, but I wouldn't expect a huge speedup.

18:53 < rcurtin> this was just 3x3

18:54 < zoq> hm, okay, probably not the best size for fft

18:54 < rcurtin> I'll look into it more tomorrow---I think there are some optimizations that can be done to produce significant speedups via avoiding copies

18:55 < zoq> yes

18:56 < zoq> Not sure what Conrad's plans are but I think it would be somewhat straightforward to use cudnn for the conv operator.

18:57 < rcurtin> I'm working with Conrad to try and assemble a library called 'bandicoot', the idea being that it's a GPU matrix library

18:57 < rcurtin> I hope that eventually it will be usable just like arma::mat or arma::sp_mat

18:57 < rcurtin> but I think it will be a while until it gets there, probably many months

18:57 < rcurtin> I only have a few hours a week to contribute, so I think he is doing the majority of the work by far

18:59 < zoq> Would be nice to just build against bandicoot instead of armadillo.

18:59 < rcurtin> I think it would be really tricky to integrate cudnn as-is into Armadillo, since arma::mat has no way to store gpu-specific memory

18:59 < rcurtin> yeah; right now bandicoot doesn't really do much yet though

19:00 < zoq> Right maybe not that straightforward.

19:05 chenzhe1 has joined #mlpack

19:06 chenzhe has quit [Ping timeout: 246 seconds]

19:06 chenzhe1 is now known as chenzhe

19:31 chenzhe1 has joined #mlpack

19:32 chenzhe has quit [Read error: Connection reset by peer]

19:32 chenzhe1 is now known as chenzhe

19:57 aashay has quit [Quit: Connection closed for inactivity]

20:09 < rcurtin> zoq: I think #1019 is ready to merge, if you agree go ahead and merge it

20:43 sgupta has quit [Ping timeout: 240 seconds]

20:44 sgupta has joined #mlpack

20:47 mikeling has quit [Quit: Connection closed for inactivity]

20:56 < zoq> rcurtin: okay, merged

20:59 sumedhghaisas has quit [Ping timeout: 255 seconds]

21:33 travis-ci has joined #mlpack

21:33 < travis-ci> mlpack/mlpack#2509 (master - 09d8793 : Marcus Edel): The build was fixed.

21:33 < travis-ci> Change view : https://github.com/mlpack/mlpack/compare/d9b4b59039d5...09d8793d1426

21:33 < travis-ci> Build details : https://travis-ci.org/mlpack/mlpack/builds/240132229

21:33 travis-ci has left #mlpack []

21:35 < kris1> zoq: Can you tell me why you are doing this: boost::apply_visitor(SetInputWidthVisitor(width), network[i]); when executing a forward pass through an FFN network with, let's say, linear layers

21:38 < kris1> and does input width indicate the number of features in the data points, and output width the number of points?

21:42 < zoq> kris1: Actually, I can do it for the Convolution layer, since it implements the InputWidth() function, so suppose you have some 3rd order tensor (RGB image), some layers in the network need the input size (e.g. width and height of the input image or previous layer). So all this function does is to propagate that information through the network. I think in your case, you don't have to do this unless you would like to use e.g. a Conv layer.

21:45 sumedhghaisas has joined #mlpack

21:47 < kris1> Yes

22:02 < rcurtin> kris1: have you had a chance to write a blog post yet? let me know if there are any issues pushing to the blog repository

22:08 < kris1> rcurtin: Actually no, i would start writing tomorrow i guess.

22:09 < zoq> sgupta: Excited to read your next report, ubuntu -> alpine -> busybox -> ... -> debian ----> FreeBSD; that is gonna make one hell of a story :)

22:56 < rcurtin> kris1: ok, sounds good

23:05 chenzhe has quit [Quit: chenzhe]

23:19 < kris1> rcurtin: I just completed the blog. I have pushed to the repo. Have a look :)

23:29 sgupta has quit [Ping timeout: 240 seconds]

23:29 sgupta has joined #mlpack

23:32 sumedhghaisas has quit [Remote host closed the connection]

23:41 chenzhe has joined #mlpack