verne.freenode.net changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
awhitesong has quit [Ping timeout: 250 seconds]
awhitesong has joined #mlpack
Nilabhra has joined #mlpack
awhitesong has quit [Ping timeout: 250 seconds]
mentekid has joined #mlpack
Mathnerd314 has quit [Ping timeout: 268 seconds]
slardar has quit [Ping timeout: 260 seconds]
slardar has joined #mlpack
mentekid has quit [Remote host closed the connection]
mentekid has joined #mlpack
awhitesong has joined #mlpack
Nilabhra has quit [Remote host closed the connection]
Nilabhra has joined #mlpack
Nilabhra has quit [Remote host closed the connection]
sohail has joined #mlpack
< sohail> hey guys, I'm new to machine learning (taking a course). This is not for my course, this is my own work. Anyway, I have a binary classifier with a couple of features which are correlated pretty strongly with the output. I don't really know what the relationship is except it is guaranteed to be something like h(x) = 1 iff x1_low <= x1 <= x1_high and x2_low <= x2 <= x2_high
< sohail> it seems like overkill to use machine learning for this... perhaps I should just drop outliers on either end? But if there was a simple algorithm that could learn (the user will help with the learning), that would be preferred to trying to figure out the exact relationship
< sohail> btw, when I say "it is guaranteed", I mean "it is probably..."
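The rule described above (h(x) = 1 iff every feature lies between a low and a high bound) can be learned directly from labelled examples, including the "drop outliers on either end" idea. The following is a minimal sketch in Python, not mlpack code; the names `fit_box`, `predict_box`, and `trim` are hypothetical, and it assumes the positive examples really do fall in one axis-aligned box.

```python
def fit_box(rows, labels, trim=0.05):
    """Learn per-feature (low, high) bounds from the positive examples.

    trim is the fraction of positive points ignored at each end of every
    feature, which acts as a crude outlier filter.
    """
    if not 0 <= trim < 0.5:
        raise ValueError("trim must be in [0, 0.5)")
    positives = [r for r, y in zip(rows, labels) if y == 1]
    if not positives:
        raise ValueError("need at least one positive example")
    bounds = []
    for j in range(len(positives[0])):
        col = sorted(r[j] for r in positives)
        k = int(trim * len(col))  # points dropped at each end
        bounds.append((col[k], col[len(col) - 1 - k]))
    return bounds


def predict_box(bounds, row):
    """Return 1 if every feature is inside its learned interval, else 0."""
    return int(all(lo <= x <= hi for (lo, hi), x in zip(bounds, row)))
```

When the user corrects a prediction, the corrected example can be appended to the training rows and `fit_box` called again, which is one simple way to let the user help with the learning.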
Bartek has joined #mlpack
< zoq> sohail: Hello, sounds like a simple decision tree would work.
< sohail> zoq: I think you're right
< zoq> sohail: You could test if the decision stump is sufficient for your data.
< sohail> zoq: the dataset looks like this: https://i.imgur.com/SzCYDdT.gifv
< sohail> that's a 2d histogram
< rcurtin> yeah, a decision stump could work if the relationship is simple
< rcurtin> if it's more complex a decision tree might be more useful, but mlpack only has "streaming decision trees" (Hoeffding trees) at the moment
< sohail> how would I know whether the relationship is simple enough, or whether I need something more complex?
< rcurtin> you could just test a decision stump and see how accurate it is, and if it's not accurate enough, you could try something more complex
< sohail> I actually have another dataset I can use as the test dataset, so I'll try that
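The "fit a stump, then check accuracy on a separate test set" advice can be sketched as below. This is a plain-Python illustration of a single-threshold stump, not mlpack's decision stump implementation; `fit_stump`, `predict_stump`, and `accuracy` are hypothetical names. Note that one threshold on one feature cannot express a band such as low <= x <= high, so a low test accuracy here would be a hint that a deeper tree is needed.

```python
def fit_stump(rows, labels):
    """Pick the (feature, threshold, label-above) with fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
        for t in thresholds:
            for above in (0, 1):
                errors = sum(
                    (above if r[j] > t else 1 - above) != y
                    for r, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, above)
    _, j, t, above = best
    return j, t, above


def predict_stump(stump, row):
    j, t, above = stump
    return above if row[j] > t else 1 - above


def accuracy(stump, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(predict_stump(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(labels)
```

Fitting on the training data and calling `accuracy` on the held-out dataset gives the number to compare against whatever "accurate enough" means for the application.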
daidaeee has joined #mlpack
< daidaeee> in LARS, how could we do cross-validation?
< daidaeee> why do we have to explicitly set lambda1 in LARS/lasso? Normally, I guess, LARS could give the full path of lambda during the computation
< rcurtin> daidaeee: lambda is the penalty parameter, I thought this was set to only one value throughout the course of the algorithm
Bartek has quit [Ping timeout: 260 seconds]
< sohail> zoq, rcurtin: I found another interpretation of the data which results in something MUCH simpler. What do you think of this?
< zoq> sohail: Without any labels, it's hard to say something about the representation.
< sohail> zoq: the x label is the "segment of the week" - broken up into 15 minute increments, the y is the number of times something occurred in that week segment
< sohail> an example of how it changes over time (howlonago = days) https://i.imgur.com/ybrHuzm.png
< sohail> the "count" label should be "segment"
< zoq> sohail: looks interesting, I guess you'd like to predict the number of times something occurred in a segment?
< sohail> zoq: even simpler - will something happen in a segment
< zoq> sohail: Ah right, you said it's a binary decision.
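The "segment of the week" feature and the binary "did something happen in this segment" target can be derived from raw timestamps roughly as follows. This is a sketch under assumptions not stated in the discussion: the week is taken to start on Monday at 00:00 (Python's `weekday()` convention), and `week_segment` and `occupancy` are hypothetical helper names.

```python
from datetime import datetime

SEGMENT_MINUTES = 15
SEGMENTS_PER_WEEK = 7 * 24 * 60 // SEGMENT_MINUTES  # 672 segments


def week_segment(ts):
    """Index of the 15-minute segment of the week that contains ts."""
    minutes = ts.weekday() * 24 * 60 + ts.hour * 60 + ts.minute
    return minutes // SEGMENT_MINUTES


def occupancy(timestamps):
    """Binary target per segment: 1 if any event fell in it, else 0."""
    seen = [0] * SEGMENTS_PER_WEEK
    for ts in timestamps:
        seen[week_segment(ts)] = 1
    return seen
```

Counting events per segment instead of setting a flag would give the histogram form of the data, while the 0/1 vector matches the simpler binary question.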
< sohail> Perhaps a simple logistic regression could help
< sohail> let me try that with the test data
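For reference, a logistic regression fit can be sketched in a few lines of plain Python with batch gradient descent. This is an illustration only, not mlpack's implementation; `fit_logistic`, `predict_proba`, the learning rate, and the epoch count are all assumptions chosen for the example. A linear model on the raw segment index can only place one cut point, so if the events occupy a band of segments, the features would need to be recoded (for instance one indicator per segment) before this helps.

```python
import math


def sigmoid(z):
    # Written in two branches to stay numerically stable for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, labels, lr=0.5, epochs=200):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(rows[0])
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for row, y in zip(rows, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - y
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, row):
    """Estimated probability that the label is 1 for this row."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b)
```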
sumedhghaisas has joined #mlpack
Bartek has joined #mlpack
< daidaeee> rcurtin: I guess for the LARS implementation, lambda1 is not static; in the LARS class there's a function called LambdaPath: "Access the set of values for lambda1 after each iteration; the solution is the last element"
< rcurtin> daidaeee: you're right, I see what you mean
< rcurtin> the way the lambda1 parameter is used is to determine when the algorithm is complete
< rcurtin> see lars.cpp:96 -- the algorithm terminates when the maximum correlation among dimensions is less than lambda1
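The stopping rule described here (stop once the largest absolute correlation between any dimension and the current residual drops below lambda1) can be written out as a small check. This is a plain-Python sketch of the criterion as stated in the discussion, not a transcription of lars.cpp; `max_abs_correlation` and `should_stop` are hypothetical names, and each entry of `columns` is assumed to hold one feature's values across all points.

```python
def max_abs_correlation(columns, residual):
    """Largest |x_j . r| over all feature columns x_j."""
    return max(
        abs(sum(x * r for x, r in zip(col, residual)))
        for col in columns
    )


def should_stop(columns, residual, lambda1):
    """True once no feature is correlated with the residual by lambda1 or more."""
    return max_abs_correlation(columns, residual) < lambda1
```

Seen this way, a smaller lambda1 lets the algorithm take more steps along the path before it stops, which is why the values recorded after each iteration form a decreasing sequence ending near lambda1.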
< daidaeee> I plan to sweep lambda1 from 10^-3 to 10^3 outside the function to do cross-validation in order to determine the best lambda1; is that an efficient way?
< daidaeee> or is there a better way to determine the best lambda1 inside the function?
< daidaeee> or just give a very, very small lambda1, say 1e-9, then get the lambda path and beta path, so that I just need to determine the best beta out of the beta path using cross-validation
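The last idea (run once with a tiny lambda1, then pick the best point on the path) might look like the sketch below. It is plain Python rather than mlpack code: `beta_path` and `lambda_path` stand for the per-iteration coefficient vectors and lambda1 values, `mse` and `best_on_path` are hypothetical helpers, and a single held-out validation split is assumed. With several folds, each fold's path generally has its own lambda values, so the per-fold errors would need to be compared on a common grid.

```python
def mse(beta, rows, targets):
    """Mean squared error of the linear prediction row . beta."""
    total = 0.0
    for row, y in zip(rows, targets):
        pred = sum(b * x for b, x in zip(beta, row))
        total += (pred - y) ** 2
    return total / len(targets)


def best_on_path(beta_path, lambda_path, val_rows, val_targets):
    """Return (index, lambda1, validation MSE) of the best path entry."""
    scores = [mse(beta, val_rows, val_targets) for beta in beta_path]
    i = min(range(len(scores)), key=scores.__getitem__)
    return i, lambda_path[i], scores[i]


# Log-spaced grid for the alternative "sweep from outside" approach.
lambda_grid = [10.0 ** e for e in range(-3, 4)]
```

The sweep over `lambda_grid` refits the model once per value, while the path approach fits once and only re-evaluates validation error, so the second is usually the cheaper of the two.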
daidaeee has quit [Quit: Page closed]
daidaeee has joined #mlpack
daidaeee has quit [Client Quit]
Bartek has quit [Ping timeout: 268 seconds]
Mathnerd314 has joined #mlpack
ranajn123 has joined #mlpack
Bartek has joined #mlpack
daidaeee has joined #mlpack
daidaeee has quit [Client Quit]
Bartek has quit [Ping timeout: 276 seconds]
Bartek has joined #mlpack
ranajn123 has quit [Quit: Page closed]
Bartek has quit [Ping timeout: 260 seconds]
sumedhghaisas has quit [Ping timeout: 260 seconds]
Bartek has joined #mlpack
Bartek has quit [Ping timeout: 276 seconds]
Bartek has joined #mlpack
ank_95_ has joined #mlpack
Bartek has quit [Ping timeout: 246 seconds]
wasiq has quit [Ping timeout: 250 seconds]
Bartek has joined #mlpack
pin3da has joined #mlpack
wasiq has joined #mlpack
pin3da has quit [Client Quit]
awhitesong has left #mlpack []
Bartek has quit [Ping timeout: 240 seconds]
wasiq has quit [Ping timeout: 276 seconds]
Darcy has joined #mlpack
Darcy is now known as Guest21334
Guest21334 has quit [Client Quit]
Bartek has joined #mlpack
Bartek has quit [Ping timeout: 260 seconds]
Bartek has joined #mlpack
mentekid has quit [Ping timeout: 240 seconds]
Bartek has quit [Remote host closed the connection]
ank_95_ has quit [Quit: Connection closed for inactivity]