naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
sumedh_ has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
< jenkins-mlpack>
Ryan Curtin: Fix #334 by ensuring vector accesses don't go out of bounds.
udit_s has quit [Ping timeout: 264 seconds]
udit_s has joined #mlpack
sumedh_ has quit [Quit: Leaving]
sumedhghaisas has joined #mlpack
govg has quit [Ping timeout: 252 seconds]
< sumedhghaisas>
naywhayare: hey ryan, you there??
govg has joined #mlpack
sumedhghaisas has quit [Ping timeout: 240 seconds]
udit_s has quit [Ping timeout: 245 seconds]
udit_s has joined #mlpack
udit_s has quit [Ping timeout: 240 seconds]
udit_s has joined #mlpack
govg has quit [Ping timeout: 272 seconds]
govg has joined #mlpack
govg has quit [Changing host]
govg has joined #mlpack
udit_s has quit [Quit: Leaving]
govg has quit [Ping timeout: 240 seconds]
govg has joined #mlpack
oldbeardo has joined #mlpack
< oldbeardo>
naywhayare: just sent you a mail
govg has quit [Quit: leaving]
< naywhayare>
oldbeardo: great! I will take a look shortly
< oldbeardo>
naywhayare: good thing I could get it done before your vacation :)
udit_s has joined #mlpack
andrewmw94 has joined #mlpack
Anand_ has joined #mlpack
< naywhayare>
udit_s: ok, so perceptrons
< naywhayare>
I'm looking at the API you wrote in your proposal
< udit_s>
I think I'll probably have to change it, though the main skeleton will remain the same. So, like always, it takes input and inputLabels.
< udit_s>
Make a weight vector matrix.
< naywhayare>
right; and in this case, input should be real-valued (not categorical) and inputLabels can only handle two classes; is that right?
< udit_s>
We could go for multi-class perceptron. And yeah, the input will be real-valued, like in decision_stumps.
< naywhayare>
it's your call on whether to do multi-class perceptron
< udit_s>
multiple classes can be handled quite easily too.
< udit_s>
just differently.
< naywhayare>
I'm trying to understand the differences between the binary perceptron and the multi-class perceptron now
< udit_s>
and because we're handling multiple classes in the decision stump, keeping that aspect consistent would be better.
< naywhayare>
right, I agree
< udit_s>
should I send you some links ?
< naywhayare>
no, I think I get it. the wikipedia article section, I think, is a bit confusing
< naywhayare>
my understanding is that the basic idea is to construct several perceptrons, each of which recognizes an individual class
< naywhayare>
then, to do classification, you run your input vector through all of these perceptrons, and whichever one outputs the highest prediction is taken to be the class label
< naywhayare>
does that seem about right?
< udit_s>
almost. Instead of multiple perceptrons, you just have multiple weight vectors, one for each class.
< naywhayare>
oh, ok. I see now
< naywhayare>
that seems like a straightforward generalization then
< udit_s>
yeah.
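(A minimal sketch of that classification step, assuming the per-class weight vectors are stored as columns of one Armadillo matrix; the function and variable names here are hypothetical, not mlpack's actual perceptron API.)

    #include <armadillo>

    // Score each class with its own weight vector (a column of `weights`)
    // and take the class with the highest score.
    size_t ClassifyPoint(const arma::mat& weights, // dimensions x numClasses
                         const arma::vec& point)   // dimensions x 1
    {
      arma::vec scores = weights.t() * point;
      arma::uword predictedClass;
      scores.max(predictedClass); // index of the largest score
      return (size_t) predictedClass;
    }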
Anand__ has joined #mlpack
Anand__ has quit [Client Quit]
Anand__ has joined #mlpack
< udit_s>
now,
< udit_s>
about our update criterion, and the bias ...
Anand_ has quit [Ping timeout: 246 seconds]
< udit_s>
while reading up, I came across different implementations,
< udit_s>
some use a bias weight with value 1.
< udit_s>
others don't, or ignore it.
< naywhayare>
it seems to me like maybe the bias should be a user parameter given in the constructor
< udit_s>
and I've also come across multiple update criteria.
< oldbeardo>
naywhayare: multilayer perceptron sounds like softmax regression
< udit_s>
will the user input be an option for bias, or the value of the bias?
< naywhayare>
oldbeardo: yes, but the focus here is single-layer perceptron :)
< naywhayare>
udit_s: I think the value of the bias; then if the user doesn't want bias, they just specify 0
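(A hypothetical constructor along those lines; only the idea of passing the bias as a value comes from the discussion, and the class and parameter names are illustrative.)

    #include <armadillo>

    class Perceptron
    {
     public:
      // `bias` is the value of the bias term; passing 0 disables it.
      Perceptron(const arma::mat& data,
                 const arma::Row<size_t>& labels,
                 const double bias = 1.0);
      // ...
    };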
< udit_s>
okay. then the update criterion. apparently, there are several update criteria, with no information on which one converges the earliest.
< naywhayare>
we can do this just like the AMF/NMF code then -- make it a template parameter, and implement one (or a few) update criteria
< naywhayare>
you have it written like that in your proposal
< udit_s>
okay.
< udit_s>
also, I'm a bit confused as to how the update will proceed.
< naywhayare>
assuming that each of these algorithms is iterative, then the update only needs to provide an updated weight vector
< naywhayare>
i.e. UpdateRule::Update(weights, ...) (where ... is whatever other parameters the update rule needs) should just take the existing weights vector and update its value using a single application of the update rule
< udit_s>
say a 'run' is going through the input set once. so in one run, for *each* input vector, you update the weight matrix if required, and then restart the run; am I correct? also, I think there should be a lower and upper limit to the number of runs.
< udit_s>
or should the perceptron stop only on convergence ?
< naywhayare>
a limit on the number of runs (or iterations) should be a parameter, yeah
< naywhayare>
I would write UpdateRule::Update() to perform the update for every input vector, not just one
< naywhayare>
because there may be some update algorithms that don't use that type of loop-over-every-input-vector approach (which is kind of like stochastic gradient descent)
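(A rough sketch of such an update policy, meant to be plugged in as a template parameter in the same spirit as the AMF/NMF code; the rule shown is the standard multi-class perceptron update, and the class and function names are hypothetical.)

    #include <armadillo>

    class SimpleWeightUpdate
    {
     public:
      // One sweep over the whole dataset: update the weight matrix in
      // place, but only on misclassified points.
      static void Update(arma::mat& weights,
                         const arma::mat& data,
                         const arma::Row<size_t>& labels)
      {
        for (size_t i = 0; i < data.n_cols; ++i)
        {
          arma::vec scores = weights.t() * data.col(i);
          arma::uword predicted;
          scores.max(predicted);

          if (predicted != labels[i])
          {
            // Reward the correct class's weight vector, penalize the
            // wrongly predicted one.
            weights.col(labels[i]) += data.col(i);
            weights.col(predicted) -= data.col(i);
          }
        }
      }
    };

A Perceptron<SimpleWeightUpdate> class could then call Update() once per iteration, up to the iteration limit.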
< naywhayare>
I'd like to grab some lunch... can we continue this in a few hours? (I think we've covered most everything?)
< udit_s>
okay, so update weights every time you go over an incorrect classification while training, in one run, and then repeat.
< naywhayare>
yeah, that seems reasonable to me
< udit_s>
yeah, I'll have dinner as well. I'll see if anything else comes up.
< naywhayare>
ok. I'll be back in about two hours
< udit_s>
okay.
Anand__ has quit [Ping timeout: 246 seconds]
< oldbeardo>
naywhayare: you saw the code?
udit_s has quit [Ping timeout: 255 seconds]
Anand_ has joined #mlpack
udit_s has joined #mlpack
< Anand_>
Marcus : I am adding metrics to weka today. I will try to follow the same design as scikit.
andrewmw94 has left #mlpack []
< marcus_zoq>
Anand_: Great, sounds like a plan.
< jenkins-mlpack>
Starting build #1947 for job mlpack - svn checkin test (previous build: STILL UNSTABLE -- last SUCCESS #1944 1 day 9 hr ago)
< Anand_>
Marcus : How will I get the predicted labels in weka? I do not find any function that predicts the class of an instance.
< Anand_>
Also, I will need to include the weka src path into the file to call weka functions. Right?
< marcus_zoq>
Anand_: And the function you need to use is called: 'distributionForInstance(Instance instance)'.
< Anand_>
'distributionForInstance(Instance instance)' returns the probabilities. It is not the actual classifier function like the predict function we used in scikit
< Anand_>
I need to use the classifyInstance for this
< Anand_>
Maybe I will add functions to NBC.java to return the probabilities and the predicted labels
< marcus_zoq>
Anand_: Right, to get the actual classes, it's line 86. So you need to save the results in a file to use them.
< marcus_zoq>
Anand_: I think there isn't another way. Because you can't use weka with python?
< marcus_zoq>
Anand_: You can compile the source code with 'make scripts WEKA_CLASSPATH=<location to the weka.jar file>'.
< Anand_>
So, I will need the weka jar?
< Anand_>
And where will the .class files be?
< marcus_zoq>
Anand_: methods/weka methods/weka/src/; I use the following command on the build server: make scripts WEKA_CLASSPATH=".:/opt/weka/weka-3-6-9:/opt/weka/weka-3-6-9/weka.jar"
< marcus_zoq>
Anand_: And right you need the weka.jar file.
< Anand_>
ok
Anand_ has quit [Ping timeout: 246 seconds]
sumedhghaisas has joined #mlpack
sumedhghaisas has quit [Client Quit]
sumedhghaisas has joined #mlpack
< sumedhghaisas>
naywhayare: hey ryan... you there?
< naywhayare>
sumedhghaisas: yes, I am here now
< sumedhghaisas>
the WH matrix and the original seem to differ... even though the residue is very small
< oldbeardo>
naywhayare: any feedback on the code?
< jenkins-mlpack>
Starting build #1948 for job mlpack - svn checkin test (previous build: STILL UNSTABLE -- last SUCCESS #1944 1 day 11 hr ago)
< naywhayare>
sumedhghaisas: in what way do they differ?
< naywhayare>
oldbeardo: I am solving a bug with the NMF tests first
< oldbeardo>
naywhayare: it would be great if you could give me feedback today, since you would be sort of unavailable for the next 10 days
< naywhayare>
oldbeardo: yes, it will be done before I leave, don't worry
< sumedhghaisas>
naywhayare: I mean the entries don't seem to match... should I paste the output here?
< oldbeardo>
naywhayare: okay, thanks
< sumedhghaisas>
naywhayare: you unavailable for next 10 days??
< naywhayare>
sumedhghaisas: show me the code you are using, not the output
< sumedhghaisas>
okay, I will just send you the code by mail... okay?
< naywhayare>
well, sort of; I will do my best to be unavailable. I sent you (and everyone) an email about it
< naywhayare>
sure, email is fine
< sumedhghaisas>
naywhayare: Okay I have sent you the amf_impl.hpp...
< sumedhghaisas>
by the way... What bug with NMF tests??
< naywhayare>
the tests are written poorly -- NMF isn't guaranteed to return a unique factorization
< naywhayare>
so I'm rewriting the tests to not check the individual values of the W and H matrices since they aren't guaranteed to be the same for two different runs of NMF
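(For illustration, a check of that form: given W and H from any run of the factorization, test the reconstruction error rather than the individual entries; the function name and tolerance here are arbitrary.)

    #include <armadillo>
    #include <cassert>

    void CheckReconstruction(const arma::mat& V,
                             const arma::mat& W,
                             const arma::mat& H)
    {
      // W and H are not unique across runs, but their product should still
      // be close to V, so test the relative reconstruction error instead.
      const double residue =
          arma::norm(V - W * H, "fro") / arma::norm(V, "fro");
      assert(residue < 0.1);
    }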
< naywhayare>
anyway, you sent me amf_impl.hpp, but where is the code that shows the entries don't seem to match?
< sumedhghaisas>
run the code; it will print some values at the end... basically test values... entries for the matrices V and WH...
< sumedhghaisas>
yes... this will print the real value and computed value...
< naywhayare>
the RMSE calculation seems okay, and it's definitely true that some of the WH values will be different than the V values
< naywhayare>
the residue can't take the test points into account, so the residue might be tiny but the difference between the V and WH values for the test points may be larger
< naywhayare>
how is the RMSE performance?
< sumedhghaisas>
RMSE is really bad...
< sumedhghaisas>
wait I will just paste the values here...
< sumedhghaisas>
2683 2192.77
< sumedhghaisas>
186 1829.49
< sumedhghaisas>
2102 3336.24
< sumedhghaisas>
1663 963.512
< sumedhghaisas>
595 2486.42
< sumedhghaisas>
1892 649.105
< sumedhghaisas>
3 0.602375
< sumedhghaisas>
4 1.40428
< sumedhghaisas>
1 1.93011
< sumedhghaisas>
1.11958e+06
< naywhayare>
you should be taking the sqrt of the RMSE, too, don't forget that part
< naywhayare>
I don't know what the values you printed mean
< sumedhghaisas>
the first column is actual entries and second is computed entries...
< sumedhghaisas>
the last value is RMSE...
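(For reference, the RMSE with the square root included might be computed like this; the argument names are hypothetical.)

    #include <armadillo>
    #include <cmath>

    double RMSE(const arma::vec& actual, const arma::vec& predicted)
    {
      // Mean of the squared errors, then the square root.
      return std::sqrt(arma::accu(arma::square(actual - predicted)) /
                       actual.n_elem);
    }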
< naywhayare>
I thought the movielens dataset was full of ratings from 1 to 5, but you are showing values as high as 2683
< naywhayare>
hang on, you are setting 'r_test(i, count) = temp' but you should do 'r_test(i, count) = V(i, temp)'
< naywhayare>
and then V(i, temp) = 0
< naywhayare>
you should be able to do this code entirely without the I matrix; I think that's the problem
< naywhayare>
instead of checking if(I(i, temp) == 1) you can just do if(V(i, temp) != 0)
< naywhayare>
remember that the I matrix is only representing which values of V are not zero -- which is information you can already get directly from V, so there's not much reason to maintain I at all
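(A sketch of that approach, reusing the chat's i/temp names and inventing the rest: the held-out rating is read straight out of V and then zeroed there, so no indicator matrix I is needed.)

    #include <armadillo>

    void ExtractTestRatings(arma::mat& V,
                            const arma::umat& testIndices, // 2 x n: (i, temp)
                            arma::vec& testRatings)
    {
      testRatings.set_size(testIndices.n_cols);
      for (size_t count = 0; count < testIndices.n_cols; ++count)
      {
        const arma::uword i = testIndices(0, count);
        const arma::uword temp = testIndices(1, count);

        // A nonzero entry of V already tells us the rating exists.
        if (V(i, temp) != 0)
        {
          testRatings[count] = V(i, temp); // keep the true rating for testing...
          V(i, temp) = 0;                  // ...and remove it from the training matrix.
        }
      }
    }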
< sumedhghaisas>
Ohh, for testing I have just passed I into the update rule...
< sumedhghaisas>
I think there is some problem in my main... hang on...
< sumedhghaisas>
naywhayare: ohh my god... I am using the raw csv file directly as the matrix... I have to compute the matrix from the raw csv, right?
< sumedhghaisas>
I did that for GroupLens...
< naywhayare>
yeah, take a look at how it is done in cf_test.cpp
< sumedhghaisas>
yeah I am using that code only...
udit_s has quit [Read error: Connection reset by peer]
< jenkins-mlpack>
saxena.udit: Fixed armadillo issues, along with removing uninitialized and unused variables
< naywhayare>
ok, if you are using that code, then that should be fine
< naywhayare>
but either way, I don't think you need to have an I matrix at all, and I think that is the source of your problems
< sumedhghaisas>
no... I compute the matrix and store it for later use... I forgot to do that for MovieLens... I will compute it now...
sumedhghaisas has quit [Ping timeout: 264 seconds]
udit_s has joined #mlpack
< naywhayare>
udit_s: thanks for fixing the build :)
< udit_s>
Awesome.
< udit_s>
:)
udit_s has quit [Quit: Leaving]
< oldbeardo_>
naywhayare: you there?
< naywhayare>
oldbeardo_: yeah, I was going to leave in a few minutes, but I'm here for now
< naywhayare>
I can hang around as long as you need (well... for a few hours :))
< oldbeardo_>
heh, it won't take that long :)
< oldbeardo_>
so, I saw your mail, the part where you talk about ExtractSVD(), it's not test code, that is how it is in the algorithm
< oldbeardo_>
also about the columns > rows part, I agree
< naywhayare>
oh... I see, ok. I misunderstood the algorithm
< naywhayare>
so you build the CosineTree to basically get a smaller basis
< naywhayare>
then run actual SVD on the smaller basis
< oldbeardo_>
yes, that's right
< naywhayare>
you should include a flag that allows the user to specify whether or not to use arma::svd() or arma::svd_econ(), then
< naywhayare>
because sometimes arma::svd() can be slow, but svd_econ() produces approximate results in much less time (if I remember right)
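(A minimal sketch of such a flag; only the ExtractSVD name comes from the discussion, the signature and flag name are hypothetical.)

    #include <armadillo>

    // Let the caller choose between the full and the economical SVD of the
    // smaller basis.
    void ExtractSVD(const arma::mat& basis,
                    arma::mat& U, arma::vec& s, arma::mat& V,
                    const bool useEconSvd = false)
    {
      if (useEconSvd)
        arma::svd_econ(U, s, V, basis); // faster; reduced-size factors
      else
        arma::svd(U, s, V, basis);      // full decomposition
    }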
< oldbeardo_>
okay, will do that, otherwise the new code looks fine right?
< naywhayare>
seems fine to me, as long as it passes the tests
< naywhayare>
I don't dig into code too hard until we have tests that it passes; then I can start trying to make simple speed modifications, avoiding temporaries, etc.
< oldbeardo_>
right, will you be available tomorrow and the day after?