#mlpack on 2021-07-28 — irc logs at libera.irclog.whitequark.org

2021-07-27 15:44 rcurtin_irc changed the topic of #mlpack to: mlpack: a scalable machine learning library (https://www.mlpack.org/) -- channel logs: https://libera.irclog.whitequark.org/mlpack -- NOTE: messages sent here might not be seen by bridged users on matrix, gitter, or slack

08:23 < RishabhGarg108Ri> Hey @ryan:ratml.org, I wanted to know why we are taking the square root in this line: https://github.com/mlpack/mlpack/blob/c8647b2bfef68dfd29a5a8612813f92cf2b79de0/src/mlpack/methods/decision_tree/multiple_random_dimension_select.hpp#L44

08:24 < RishabhGarg108Ri> What is the intuition behind it? Why not `numDimensions = dimensions` or `numDimensions = dimensions / 2`? Is taking the square root a heuristic?

16:04 < rcurtin[m]> RishabhGarg108 (RishabhGarg108): sorry for the slow response. I went digging for the source this morning, but I think the original Random Forests paper by Leo Breiman suggested using sqrt(d) here

16:11 < RishabhGarg108Ri> Got it. Thanks for the clarification!

16:13 < rcurtin[m]> Wikipedia suggests it's page 592 in "The Elements of Statistical Learning": https://en.wikipedia.org/wiki/Random_forest (see "From bagging to random forests")

16:15 < RishabhGarg108Ri> I checked the book. It says `floor(sqrt(p))` for classification and `floor(p / 3)` for regression.
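
For reference, the two defaults quoted from the book can be sketched as a small helper. This is only an illustration in Python, not mlpack code: the function name `suggested_num_dimensions`, the `task` argument, and the clamp to at least one dimension are all assumptions made for the example.

```python
import math


def suggested_num_dimensions(p: int, task: str = "classification") -> int:
    """Return how many dimensions to sample per split, given p total dimensions.

    Hypothetical helper: the name, the `task` argument, and the clamp to a
    minimum of 1 are assumptions for illustration, not part of mlpack.
    """
    if p < 1:
        raise ValueError("p must be a positive number of dimensions")

    if task == "classification":
        # floor(sqrt(p)); math.isqrt computes the integer square root exactly.
        m = math.isqrt(p)
    elif task == "regression":
        # floor(p / 3) via integer division.
        m = p // 3
    else:
        raise ValueError("task must be 'classification' or 'regression'")

    # Assumption: never return 0, so at least one dimension is considered
    # (p // 3 is 0 when p is 1 or 2).
    return max(1, m)


if __name__ == "__main__":
    for p in (4, 10, 100):
        print(p,
              suggested_num_dimensions(p, "classification"),
              suggested_num_dimensions(p, "regression"))
```

With p = 100 this gives 10 dimensions for classification and 33 for regression, so the regression default considers noticeably more dimensions per split than the square-root rule does.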

16:16 < rcurtin[m]> Nice to note that regression is different---want to point that out in #2619?

16:17 < RishabhGarg108Ri> Yup. I will put the direct quotes from the book :+1:
