#mlpack on 2016-04-20 — irc logs at libera.irclog.whitequark.org

2015-01-15 23:05 verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/

03:51 awhitesong has quit [Ping timeout: 250 seconds]

04:35 awhitesong has joined #mlpack

04:37 Nilabhra has joined #mlpack

04:49 awhitesong has quit [Ping timeout: 250 seconds]

06:02 mentekid has joined #mlpack

06:42 Mathnerd314 has quit [Ping timeout: 268 seconds]

07:23 slardar has quit [Ping timeout: 260 seconds]

07:24 slardar has joined #mlpack

10:18 mentekid has quit [Remote host closed the connection]

10:19 mentekid has joined #mlpack

11:02 awhitesong has joined #mlpack

11:31 Nilabhra has quit [Remote host closed the connection]

12:09 Nilabhra has joined #mlpack

12:43 Nilabhra has quit [Remote host closed the connection]

13:01 sohail has joined #mlpack

13:05 < sohail> hey guys, I'm new to machine learning (taking a course). This is not for my course, this is my own work. Anyway, I have a binary classifier with a couple of features which are correlated pretty strongly with the output. I don't really know what the relationship is except it is guaranteed to be something like h(x) = 1 iff x1_low <= x1 <= x1_high and x2_low <= x2 <= x2_high

13:06 < sohail> it seems like overkill to use machine learning for this... perhaps I should just drop outliers on either end? But if there was a simple algorithm that could learn (the user will help with the learning), that would be preferred to trying to figure out the exact relationship

13:07 < sohail> btw, when I say "it is guaranteed", I mean "it is probably..."
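
A minimal sketch of the rule sohail describes, assuming the bounds are simply taken as the min and max of each feature over the positive examples. The names `fit_box` and `predict_box` are hypothetical, and trimming outliers before fitting (as sohail suggests) is not shown.

```python
def fit_box(points, labels):
    """Learn per-feature [low, high] bounds from the positive examples only."""
    positives = [p for p, y in zip(points, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive example")
    dims = len(positives[0])
    lows = [min(p[d] for p in positives) for d in range(dims)]
    highs = [max(p[d] for p in positives) for d in range(dims)]
    return lows, highs


def predict_box(box, point):
    """h(x) = 1 iff every feature lies inside its learned interval."""
    lows, highs = box
    return int(all(lo <= x <= hi for lo, x, hi in zip(lows, point, highs)))


# Toy data (made up for illustration): three positives, two negatives.
points = [(1, 1), (2, 3), (3, 2), (10, 10), (0, 5)]
labels = [1, 1, 1, 0, 0]
box = fit_box(points, labels)
```

Because the bounds come straight from the extremes, a single mislabeled positive widens the box, which is why dropping outliers first may matter here.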

13:51 Bartek has joined #mlpack

13:54 < zoq> sohail: Hello, sounds like a simple decision tree would work.

13:58 < sohail> zoq: I think you're right

13:59 < zoq> sohail: You could test if the decision stump is sufficient for your data.

13:59 < sohail> zoq: the dataset looks like this: https://i.imgur.com/SzCYDdT.gifv

13:59 < sohail> that's a 2d histogram

14:01 < rcurtin> yeah, a decision stump could work if the relationship is simple

14:01 < rcurtin> if it's more complex a decision tree might be more useful, but mlpack only has "streaming decision trees" (Hoeffding trees) at the moment

14:02 < sohail> how would I know whether I need something more complex?

14:03 < rcurtin> you could just test a decision stump and see how accurate it is, and if it's not accurate enough, you could try something more complex
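
The "test a stump and check its accuracy" suggestion can be sketched as follows. This is a plain-Python illustration, not mlpack's decision stump; `fit_stump`, `predict_stump`, and `accuracy` are hypothetical names, and the stump here minimises training errors by brute force over one feature and one threshold.

```python
def fit_stump(X, y):
    """Return (feature, threshold, label_if_above) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for above in (0, 1):
                # Predict `above` when row[f] >= t, otherwise the other class.
                errors = sum((above if row[f] >= t else 1 - above) != label
                             for row, label in zip(X, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    return best[1:]


def predict_stump(stump, row):
    f, t, above = stump
    return above if row[f] >= t else 1 - above


def accuracy(stump, X, y):
    """Fraction of rows the stump labels correctly."""
    return sum(predict_stump(stump, r) == l for r, l in zip(X, y)) / len(y)


# Toy data (made up): the label depends only on the first feature.
X_train = [[1, 5], [2, 1], [3, 4], [6, 2], [7, 5], [8, 1]]
y_train = [0, 0, 0, 1, 1, 1]
stump = fit_stump(X_train, y_train)
```

Evaluating `accuracy(stump, X_test, y_test)` on a separate test set gives the number to judge against. Note that a single threshold cannot express a two-sided rule such as low <= x <= high, so low held-out accuracy on interval-shaped data would be a sign that something more complex is needed.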

14:03 < sohail> I actually have another dataset I can use as the test dataset, so I'll try that

14:06 daidaeee has joined #mlpack

14:07 < daidaeee> in lars, how could we do cross-validation?

14:08 < daidaeee> why, in lars/lasso, do we have to explicitly set lambda1? normally, I guess, lars could give the full path of lambda during the computation

14:09 < rcurtin> daidaeee: lambda is the penalty parameter, I thought this was set to only one value throughout the course of the algorithm

14:12 Bartek has quit [Ping timeout: 260 seconds]

14:38 < sohail> zoq, rcurtin: I found another interpretation of the data which results in something MUCH simpler. What do you think of this?

14:38 < sohail> https://i.imgur.com/CJldG1x.png

14:44 < zoq> sohail: Without any labels, it's hard to say something about the representation.

14:45 < sohail> zoq: the x label is the "segment of the week" - broken up into 15 minute increments, the y is the number of times something occurred in that week segment

14:47 < sohail> an example of how it changes over time (howlonago = days) https://i.imgur.com/ybrHuzm.png

14:47 < sohail> the "count" label should be "segment"

14:50 < zoq> sohail: looks interesting, I guess you'd like to predict the number of times something occurred in a segment?

14:50 < sohail> zoq: even simpler - will something happen in a segment
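
A minimal sketch of the "segment of the week" feature, assuming 15-minute segments counted from Monday 00:00. The week's starting day and the name `week_segment` are assumptions; the log does not say how sohail indexes segments.

```python
from datetime import datetime

SEGMENTS_PER_DAY = 24 * 4                  # 96 fifteen-minute segments per day
SEGMENTS_PER_WEEK = 7 * SEGMENTS_PER_DAY   # 672 segments per week


def week_segment(ts):
    """Index of the 15-minute segment within the week (Monday 00:00 is segment 0)."""
    return ts.weekday() * SEGMENTS_PER_DAY + ts.hour * 4 + ts.minute // 15


def occupied_segments(timestamps):
    """Binary target: the set of segments in which at least one event occurred."""
    return {week_segment(ts) for ts in timestamps}
```

With this encoding, "will something happen in a segment" becomes a yes/no label per segment index, which is the binary decision discussed above.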

14:51 < zoq> sohail: Ah right, you said it's a binary decision.

14:56 < sohail> Perhaps a simple logistic regression could help

14:56 < sohail> let me try that with the test data

15:30 sumedhghaisas has joined #mlpack

15:49 Bartek has joined #mlpack

16:00 < daidaeee> rcurtin: I guess for the LARS implementation, lambda1 is not static; in the LARS class, there's a function called LambdaPath: "Access the set of values for lambda1 after each iteration; the solution is the last element"

16:01 < rcurtin> daidaeee: you're right, I see what you mean

16:01 < rcurtin> the way the lambda1 parameter is used is to determine when the algorithm is complete

16:01 < rcurtin> see lars.cpp:96 -- the algorithm terminates when the maximum correlation among dimensions is less than lambda1

16:04 < daidaeee> I plan to sweep lambda1 from 10^-3 to 10^3 outside the function and cross-validate to determine the best lambda1. Is that an efficient way?

16:05 < daidaeee> or is there a better way to determine the best lambda1 inside the function?

16:11 < daidaeee> or just give a very very small lambda1, say 1e-9, then get LambdaPath and BetaPath, then I just need to determine the best beta out of the BetaPath using cross-validation
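
The last idea (run once with a tiny lambda1, then pick the best point on the path) might look like the sketch below. It assumes the path has already been exported as plain lists: `beta_path` and `lambda_path` are hypothetical stand-ins for whatever the LARS object returns, and `best_on_path` is not an mlpack function. It scores each beta on a single held-out set with no intercept term.

```python
def mse(beta, X, y):
    """Mean squared error of the linear model y ~ X . beta (no intercept)."""
    total = 0.0
    for row, target in zip(X, y):
        pred = sum(b * x for b, x in zip(beta, row))
        total += (pred - target) ** 2
    return total / len(y)


def best_on_path(beta_path, lambda_path, X_val, y_val):
    """Return (index, lambda1, error) of the path entry with the lowest validation error."""
    errors = [mse(beta, X_val, y_val) for beta in beta_path]
    i = min(range(len(errors)), key=errors.__getitem__)
    return i, lambda_path[i], errors[i]


# Toy path (made up): three steps from the all-zero solution towards the exact fit.
beta_path = [[0.0, 0.0], [1.0, 1.0], [2.0, 3.0]]
lambda_path = [10.0, 1.0, 0.1]
X_val = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y_val = [2.0, 3.0, 5.0]
result = best_on_path(beta_path, lambda_path, X_val, y_val)
```

This uses one held-out set only. Full k-fold cross-validation would need the paths from different folds aligned on a common lambda grid, since each fold produces its own breakpoints; that step is not shown.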

16:20 daidaeee has quit [Quit: Page closed]

16:22 daidaeee has joined #mlpack

16:22 daidaeee has quit [Client Quit]

16:24 Bartek has quit [Ping timeout: 268 seconds]

16:29 Mathnerd314 has joined #mlpack

16:47 ranajn123 has joined #mlpack

17:05 Bartek has joined #mlpack

17:05 daidaeee has joined #mlpack

17:05 daidaeee has quit [Client Quit]

17:10 Bartek has quit [Ping timeout: 276 seconds]

17:36 Bartek has joined #mlpack

17:38 ranajn123 has quit [Quit: Page closed]

17:46 Bartek has quit [Ping timeout: 260 seconds]

18:11 sumedhghaisas has quit [Ping timeout: 260 seconds]

18:13 Bartek has joined #mlpack

18:18 Bartek has quit [Ping timeout: 276 seconds]

18:30 Bartek has joined #mlpack

18:36 ank_95_ has joined #mlpack

18:37 Bartek has quit [Ping timeout: 246 seconds]

19:17 wasiq has quit [Ping timeout: 250 seconds]

19:19 Bartek has joined #mlpack

19:30 pin3da has joined #mlpack

19:32 wasiq has joined #mlpack

19:34 pin3da has quit [Client Quit]

19:51 awhitesong has left #mlpack []

20:02 Bartek has quit [Ping timeout: 240 seconds]

20:04 wasiq has quit [Ping timeout: 276 seconds]

20:08 Darcy has joined #mlpack

20:09 Darcy is now known as Guest21334

20:10 Guest21334 has quit [Client Quit]

20:49 Bartek has joined #mlpack

21:53 Bartek has quit [Ping timeout: 260 seconds]

22:28 Bartek has joined #mlpack

22:36 mentekid has quit [Ping timeout: 240 seconds]

22:52 Bartek has quit [Remote host closed the connection]

23:35 ank_95_ has quit [Quit: Connection closed for inactivity]