naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Anand_ has joined #mlpack
< Anand_> Marcus : I have added the code to get the files in NBC.java. Please have a look. I have not been able to test the code as I don't know which weka jar to use. The current jar that I downloaded is giving a lot of errors while compiling. Can you throw some light on the jar version and source?
< jenkins-mlpack> Yippie, build fixed!
< jenkins-mlpack> Project mlpack - nightly matrix build build #485: FIXED in 1 hr 50 min: http://big.cc.gt.atl.ga.us:8080/job/mlpack%20-%20nightly%20matrix%20build/485/
< jenkins-mlpack> * saxena.udit: Fixed armadillo issues, along with removing uninitialized and unused variables
< jenkins-mlpack> * andrewmw94: More bug fixes.
Anand_ has quit [Ping timeout: 246 seconds]
Anand_ has joined #mlpack
< marcus_zoq> Anand_: Okay, I'll take a look in a few minutes.
< Anand_> Ok. Tell me the jar version and source too or share the jar you are using.
< marcus_zoq> Anand_: I use weka-3-6-11 you can download the jar file here: http://prdownloads.sourceforge.net/weka/weka-3-6-11.zip. It's the newest version from the weka website.
< marcus_zoq> Anand_: I use the following command to compile the source code: make scripts WEKA_CLASSPATH=".:/home/marcus/Downloads/weka-3-6-11/weka.jar".
< marcus_zoq> Anand_: You can also use: javac -cp ".:/home/marcus/Downloads/weka-3-6-11/weka.jar" -d methods/weka methods/weka/src/Timers.java methods/weka/src/NBC.java to compile the code.
< marcus_zoq> Anand_: I've fixed some lines so that the code compiles. If you run the code you can find the file in the repo root. This is my config file to test the code: https://gist.github.com/zoq/b0a3d58738b118473c80.
< Anand_> Ok. So, the files are getting generated right?
< Anand_> Did you push some changes?
< marcus_zoq> Anand_: It's in your branch.
< Anand_> Ok. Thanks a lot!
< marcus_zoq> Anand_: No problem.
< Anand_> Why is it trying to compile the shogun KMeans
< Anand_> ?
< Anand_> My compilation terminated with this error
< Anand_> g++ -O0 methods/shogun/src/kmeans.cpp -o methods/shogun/kmeans -I""/include -L""/lib -lshogun methods/shogun/src/kmeans.cpp:13:30: fatal error: shogun/base/init.h: No such file or directory compilation terminated. make: *** [.scripts] Error 1
< Anand_> It has compiled the weka methods, though!
< Anand_> Is it ok/
< Anand_> ?
< marcus_zoq> Anand_: It's because the command make scripts builds all source files to run the timings.
< Anand_> So, is some file missing from shogun?
< marcus_zoq> Anand_: Not really, the problem is, for the shogun kmeans method there is no default option to set initial centroids. So we used a workaround in the C++ code to set initial centroids.
< Anand_> Alright! Shouldn't be a problem then. Now, I am adding the metrics to nbc.py in weka. How do I go about changing the code structure in that file? I have all the required datasets now. Just the separation between timing and other metrics is required, I guess
< Anand_> The files will be generated when we run the code, right?
< marcus_zoq> Anand_: Yeah, it shouldn't affect you if you can't compile the modified shogun kmeans method.
< marcus_zoq> Anand_: Right, the files are in the repo root.
< marcus_zoq> Anand_: You need to add the RunMetrics method
< Anand_> Yeah, right.
< Anand_> But will that be all?
< marcus_zoq> Anand_: I think so.
< Anand_> Will the config file take care of the required separation?
< marcus_zoq> Anand_: Separation between timing and metrics?
< Anand_> Yes
< marcus_zoq> Anand_: Yeah, I think you need to add 'metric' to the run: ['timing'] block in the config file.
< Anand_> Ok.
< Anand_> Marcus : I am not able to run the weka method using the small_config file. The following error comes up :
< Anand_> [FATAL] Could not execute command: ['java', '-classpath', ':methods/weka', 'NBC', '-t', 'datasets/iris_train.csv', '-T', 'datasets/iris_test.csv'] weka iris failure
< Anand_> I have added the metrics to weka too!
< marcus_zoq> Anand_: You need to set the WEKA_CLASSPATH.
< Anand_> Where?
< marcus_zoq> Anand_: You can set the location in the makefile or you can use something like: make run LOG=false CONFIG=small_config.yaml WEKA_CLASSPATH=".:/home/marcus/Downloads/weka-3-6-11/weka.jar"
< Anand_> Ok. Got it
< Anand_> Seems like there is some problem in writing to the files. Both the files contain only one line
< Anand_> I guess we are overwriting
< marcus_zoq> Anand_: Ah, okay.
< Anand_> Let me see
< Anand_> Marcus : I have modified the code a bit and it seems good to me. Still, the file generation is not correct. The probabilities file contains 447 lines while there are 150 instances in my testData and the predicted file contains only 3 entries.
< Anand_> I am not able to figure out yet!
< Anand_> Can you have a look? I have pushed the code
< Anand_> I have also written the counts at the end of file now
< Anand_> They are correct
< marcus_zoq> Anand_: Sorry for the slow response, can you wait a little bit longer?
< Anand_> Marcus : yeah, ok. I still haven't figured out what's wrong.
< marcus_zoq> Anand_: You missed some brackets around the for loop. I've added the missing {} in the last commit.
< marcus_zoq> Anand_: And you should use FileWriter(src, false) instead of FileWriter(src, true), so that we overwrite the existing file.
< Anand_> That will overwrite each time we write something and the files will always have a single entry
< Anand_> I think I will make a dummy write before actually writing the data to empty the existing file.
< Anand_> That's a workaround
< marcus_zoq> Anand_: I've tested the code, and the file contains 150 lines.
< Anand_> Ok. Yeah, I saw it. I used it to append things. It is working now, though!
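[Editor's note] The append-versus-overwrite point discussed above can be illustrated with a minimal sketch. It assumes the writer is opened once per run and every line goes through that same writer; the class name PredictionWriter and the method writeLines are hypothetical and are not taken from NBC.java. Opening with FileWriter(path, false) truncates whatever an earlier run left behind, while the loop body (note the braces) writes one line per entry, so no dummy write is needed to empty the file.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not the actual NBC.java code.
class PredictionWriter {
    // Opens the file once with append=false, which truncates old content,
    // then writes every line through the same writer.
    static void writeLines(String path, List<String> lines) throws IOException {
        try (BufferedWriter out = new BufferedWriter(new FileWriter(path, false))) {
            for (String line : lines) {
                // Both statements belong to the loop body, hence the braces.
                out.write(line);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("predictions", ".csv");
        writeLines(file.toString(), Arrays.asList("0", "1", "2"));
        // A second run replaces the first run's output instead of appending to it.
        writeLines(file.toString(), Arrays.asList("2", "1", "0"));
        System.out.println(Files.readAllLines(file).size());
    }
}
```

With FileWriter(path, true) instead, the second call would append, and the file would grow on every run. Opening the writer inside the loop with append=false would have the opposite problem and leave only the last entry.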
< Anand_> So, currently, the weka and scikit are good to go with the metrics, it seems
< marcus_zoq> Anand_: Great!
< Anand_> Sorry for the bugs though! :) What should we take next?
< marcus_zoq> Anand_: Maybe shogun?
< Anand_> Ok. I will start looking at it. Any pointers?
< marcus_zoq> Anand_: There should be a function.
< Anand_> apply_multiclass, I guess is one of the functions
< marcus_zoq> Anand_: Currently we use apply to get the prediction for the classes.
< Anand_> Ok.
< Anand_> And probabilities?
< marcus_zoq> Anand_: Maybe there isn't a function to get the probabilities.
< Anand_> Yes, there isn't
< marcus_zoq> Okay, in this case there are two options: 1. We modify the C++ code. 2. We move on to the next library/method.
< Anand_> if other libraries have the required methods, let us cover them first and then come back to shogun
< Anand_> else, we will modify the c__ code
< Anand_> *c++
< marcus_zoq> Anand_: Okay, I think, the last one is the matlab code.
< Anand_> Which one?
< Anand_> Oh, ok matlab
< Anand_> let us see if it has all the required methods
< Anand_> Does it?
< marcus_zoq> Anand_: Yeah, should be straightforward.
< marcus_zoq> Anand_: Do you know matlab?
< Anand_> No
< Anand_> But I can go through the code and with your guidance I will do it
< marcus_zoq> Anand_: Sure let us do that :)
< Anand_> Ok
< Anand_> I can see the nbc.m and it has the predicted labels
< Anand_> probabilities?
< marcus_zoq> I think we can use something like: fit(data, 'Prior')
< marcus_zoq> And save the results with csvwrite('results.csv', results)
< Anand_> What does that do?
< marcus_zoq> In line 30, we create the model and then we can use the model to predict the classes.
< marcus_zoq> The default option returns the classes and if we add 'prior' to the fit function it should return the probabilities.
< Anand_> ok
< Anand_> And we need both
< Anand_> Ok, I will add the required call and save both to files using csvwrite, correct?
< Anand_> And then to nbc.py, I will add the metrics as usual
< marcus_zoq> Anand_: Right.
< Anand_> Alright, should be done today!
< Anand_> I will be back in a while.
< Anand_> Marcus : I have made the required changes. Have a look and let me know if I did it right
< Anand_> Marcus : I won't be available tomorrow. I am going out with my family. Let me know if any changes are to be made. I will do it on Monday. :)
< Anand_> And then we will start with Shogun
< marcus_zoq> Anand_: Okay, I'll take a look at the code tonight, okay?
< Anand_> Okay, cool!
Anand_ has quit [Ping timeout: 246 seconds]
oldbeardo has joined #mlpack
< oldbeardo> naywhayare: you there?
oldbeardo has quit [Ping timeout: 246 seconds]