naywhayare changed the topic of #mlpack to: http://www.mlpack.org/ -- We don't respond instantly... but we will respond. Give it a few minutes. Or hours. -- Channel logs: http://www.mlpack.org/irc/
Anand_ has joined #mlpack
< Anand_>
Marcus : I have added the code to get the files in NBC.java. Please have a look. I have not been able to test the code as I don't know which weka jar to use. The current jar that I downloaded is giving a lot of errors while compiling. Can you shed some light on the jar version and source?
< marcus_zoq>
Anand_: I use the following command to compile the source code: make scripts WEKA_CLASSPATH=".:/home/marcus/Downloads/weka-3-6-11/weka.jar".
< marcus_zoq>
Anand_: You can also use: javac -cp ".:/home/marcus/Downloads/weka-3-6-11/weka.jar" -d methods/weka methods/weka/src/Timers.java methods/weka/src/NBC.java to compile the code.
< marcus_zoq>
Anand_: I've fixed some lines so that the code compiles. If you run the code you can find the file in the repo root. This is my config file to test the code: https://gist.github.com/zoq/b0a3d58738b118473c80.
< Anand_>
Ok. So, the files are getting generated right?
< Anand_>
Why is it trying to compile the shogun KMeans
< Anand_>
?
< Anand_>
My compilation terminated with this error
< Anand_>
g++ -O0 methods/shogun/src/kmeans.cpp -o methods/shogun/kmeans -I""/include -L""/lib -lshogun methods/shogun/src/kmeans.cpp:13:30: fatal error: shogun/base/init.h: No such file or directory compilation terminated. make: *** [.scripts] Error 1
< Anand_>
It has compiled the weka methods, though!
< Anand_>
Is it ok/
< Anand_>
?
< marcus_zoq>
Anand_: It's because the command make scripts builds all source files to run the timings.
< Anand_>
So, is some file missing from shogun?
< marcus_zoq>
Anand_: Not really, the problem is, for the shogun kmeans method there is no default option to set initial centroids. So we used a workaround in the C++ code to set initial centroids.
< Anand_>
Alright! Shouldn't be a problem then. Now, I am adding the metrics to nbc.py in weka. How do I go about changing the code structure in that file? I have all the required datasets now. Just the separation between timing and other metrics is required, I guess
< Anand_>
The files will be generated when we run the code, right?
< marcus_zoq>
Anand_: Yeah, it shouldn't affect you if you can't compile the modified shogun kmeans method.
< marcus_zoq>
Anand_: Right, the files are in the repo root.
< marcus_zoq>
Anand_: You need to add the RunMetrics method
< Anand_>
Yeah, right.
< Anand_>
But will that be all?
< marcus_zoq>
Anand_: I think so.
< Anand_>
Will the config file take care of the required separation?
< marcus_zoq>
Anand_: Separation between timing and metrics?
< Anand_>
Yes
< marcus_zoq>
Anand_: Yeah, I think you need to add 'metric' to the run: ['timing'] block in the config file.
< Anand_>
Ok.
< Anand_>
Marcus : I am not able to run the weka method using the small_config file. The following error comes up :
< Anand_>
[FATAL] Could not execute command: ['java', '-classpath', ':methods/weka', 'NBC', '-t', 'datasets/iris_train.csv', '-T', 'datasets/iris_test.csv'] weka iris failure
< Anand_>
I have added the metrics to weka too!
< marcus_zoq>
Anand_: You need to set the WEKA_CLASSPATH.
< Anand_>
Where?
< marcus_zoq>
Anand_: You can set the location in the makefile or you can use something like: make run LOG=false CONFIG=small_config.yaml WEKA_CLASSPATH=".:/home/marcus/Downloads/weka-3-6-11/weka.jar"
< Anand_>
Ok. Got it
< Anand_>
Seems like there is some problem in writing to the files. Both the files contain only one line
< Anand_>
I guess we are overwriting
< marcus_zoq>
Anand_: Ah, okay.
< Anand_>
Let me see
< Anand_>
Marcus : I have modified the code a bit and it seems good to me. Still, the file generation is not correct. The probabilities file contains 447 lines while there are 150 instances in my testData and the predicted file contains only 3 entries.
< Anand_>
I am not able to figure out yet!
< Anand_>
Can you have a look? I have pushed the code
< Anand_>
I have also written the counts at the end of file now
< Anand_>
They are correct
< marcus_zoq>
Anand_: Sorry for the slow response, can you wait a little bit longer?
< Anand_>
Marcus : yeah, ok. I still haven't figured out what's wrong.
< marcus_zoq>
Anand_: You missed some brackets around the for loop. I've added the missing {} in the last commit.
< marcus_zoq>
Anand_: And you should use FileWriter(src, false) instead of FileWriter(src, true), so that we overwrite the existing file.
< Anand_>
That will overwrite each time we write something and the files will always have a single entry
< Anand_>
I think I will make a dummy write before actually writing the data to empty the existing file.
< Anand_>
That's a workaround
< marcus_zoq>
Anand_: I've tested the code, and the file contains 150 lines.
< Anand_>
Ok. Yeah, I saw it. I used it to append things. It is working now, though!
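The append flag discussed above only matters at the moment the FileWriter is opened: with false the file is truncated once, and every later write on that same writer lands after the previous one. A minimal sketch of that behaviour follows, assuming the writer is opened once and kept open for the whole loop. The class and method names (PredictionFileSketch, writeLines) are hypothetical and are not taken from NBC.java.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

public class PredictionFileSketch {
    // Hypothetical helper, not part of NBC.java: writes one entry per line
    // and replaces whatever the file contained before.
    static void writeLines(Path target, List<String> lines) throws IOException {
        // append=false truncates the file once, when the writer is opened.
        // The writes inside the loop all go through the same open writer,
        // so they accumulate instead of overwriting each other.
        try (BufferedWriter out =
                 new BufferedWriter(new FileWriter(target.toFile(), false))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("predictions", ".csv");
        writeLines(target, Arrays.asList("0", "1", "2"));
        // A second run replaces the first run's output instead of adding to it.
        writeLines(target, Arrays.asList("2", "1"));
        System.out.println(Files.readAllLines(target).size());
    }
}
```

Opening a new FileWriter(src, false) inside the loop would instead truncate the file on every iteration and leave a single entry, while FileWriter(src, true) would keep the output of earlier runs in the file.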
< Anand_>
So, currently, the weka and scikit are good to go with the metrics, it seems
< marcus_zoq>
Anand_: Great!
< Anand_>
Sorry for the bugs though! :) What should we take next?
< marcus_zoq>
Anand_: Maybe shogun?
< Anand_>
Ok. I will start looking at it. Any pointers?