--- Log opened Sun Dec 11 00:00:19 2011 | ||
-!- Ram108 [~amma@14.99.61.176] has joined #shogun | 05:04 | |
-!- ishaanmlhtr [~chatzilla@14.98.111.202] has joined #shogun | 05:20 | |
ishaanmlhtr | Hi I am new here..Is there anyone whom i could talk to to get started? | 05:35 |
ishaanmlhtr | I am interested in working on shogun machine learning toolbox. So, can anyone please let me know whom to talk to? | 05:40 |
-!- ishaanmlhtr [~chatzilla@14.98.111.202] has quit [Read error: Connection reset by peer] | 06:11 | |
-!- Ram108 [~amma@14.99.61.176] has quit [Ping timeout: 268 seconds] | 06:22 | |
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has joined #shogun | 06:33 | |
-!- Ram108 [~amma@14.96.137.67] has joined #shogun | 06:45 | |
-!- puneetgoyal [~puneet@115.242.21.16] has joined #shogun | 07:06 | |
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has left #shogun [] | 07:32 | |
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has joined #shogun | 07:49 | |
blackburn | (01:30:43 AM) sonney2k: blackburn, btw how can you be sure that elections were manipulated and not that you are being manipulated to believe they were? -- believe me they were ;) we have a lot of evidence of dirty cheating | 10:24 |
blackburn | hmm three guys wanting to work on shogun the same time? | 10:49 |
Ram108 | lol | 10:50 |
ishaanmlhtr | Ya..I wanted to start working on shogun. Can work on its interface with matlab or python. Any of the two. | 10:52 |
Ram108 | same here :) matlab cpp or python | 10:52 |
Ram108 | :) | 10:52 |
Ram108 | wen i run any of those py files all i get is the name of the algorithm | 10:53 |
Ram108 | do i have to create the data set myself? | 10:54 |
Ram108 | or is there any documentation? | 10:54 |
blackburn | Ram108: yes, it means all is ok | 10:54 |
blackburn | ishaanmlhtr: what is the field you are interested in? | 10:54 |
Ram108 | ah but i really want to see it segregate something into groups | 10:55 |
Ram108 | lol | 10:55 |
ishaanmlhtr | blackburn : I haven't done much in machine learning but want to start off . I have full two months free ahead of me. | 10:55 |
blackburn | guys do you know each other? | 10:55 |
blackburn | :D | 10:55 |
Ram108 | oh no not really | 10:55 |
ishaanmlhtr | blackburn: right now i was just going through the abstracts of shogun | 10:55 |
ishaanmlhtr | no,i dont know Ram108 at all | 10:56 |
blackburn | that's crazy.. ok | 10:56 |
Ram108 | hi ishaan :) | 10:56 |
ishaanmlhtr | hi Ram108 | 10:56 |
ishaanmlhtr | :) | 10:56 |
Ram108 | call me ram :) | 10:56 |
ishaanmlhtr | ok sure | 10:56 |
blackburn | we have had no one wanting to work on shogun for a while | 10:56 |
blackburn | and now three :D | 10:56 |
ishaanmlhtr | blackburn : ok,so could you help me get started with some aim | 10:56 |
blackburn | sure I'll try | 10:57 |
ishaanmlhtr | blackburn: thats cool ,right? | 10:57 |
blackburn | yes, kind of cool | 10:57 |
Ram108 | yeah i guess :P | 10:57 |
Ram108 | so can we get started out with something..... perhaps a small task maybe? | 10:58 |
Ram108 | am asking.... | 10:58 |
blackburn | hmm ok :) | 10:58 |
Ram108 | :) | 10:58 |
Ram108 | well? who is the third guy? | 10:59 |
blackburn | puneetgoyal is | 10:59 |
Ram108 | oh hmmm | 10:59 |
Ram108 | rrenaud? | 10:59 |
blackburn | rrenaud was asking some things about VW | 11:00 |
blackburn | not a volkswagen but vowpal wabbit | 11:00 |
Ram108 | hmmm | 11:00 |
blackburn | :D | 11:00 |
Ram108 | lol yeah i know.... i did ask if u implemented that yesterday | 11:00 |
blackburn | ok what is the interface you want to work with? I suggest python for sure | 11:00 |
Ram108 | yeah python is fine :) | 11:01 |
ishaanmlhtr | blackburn : first of all i wanted to have an experience working with shogun,so i was thinking of going through the tuts..would that be helpful? | 11:01 |
ishaanmlhtr | and yeah python is fine for me too. | 11:01 |
blackburn | so you want to learn some basics right? | 11:02 |
Ram108 | yep | 11:02 |
ishaanmlhtr | yes | 11:03 |
blackburn | ok I could give you a dataset and suggest you try different classifiers on this data | 11:03 |
Ram108 | okay :) | 11:04 |
blackburn | you know basics of ML, like trainset/testset, etc? | 11:04 |
ishaanmlhtr | hmm..alright.. | 11:04 |
ishaanmlhtr | ya,sort of..i could study more if need be | 11:04 |
Ram108 | i could study too | 11:05 |
blackburn | http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data | 11:10 |
blackburn | lets try on this data | 11:10 |
Ram108 | hmmm okay | 11:10 |
blackburn | let me describe a little | 11:10 |
blackburn | you have to read each row as a vector | 11:10 |
blackburn | and store it in numpy matrix column wise | 11:10 |
Ram108 | k | 11:10 |
Ram108 | hmmm | 11:11 |
blackburn | matrix [vector1 vector2 ...] is called a feature matrix | 11:11 |
Ram108 | yes | 11:11 |
blackburn | meanwhile you have to construct labels vector | 11:11 |
Ram108 | labels vector ..... hmmm | 11:11 |
ishaanmlhtr | ok.. | 11:11 |
blackburn | i-th label is an 'answer' on i-th vector | 11:11 |
blackburn | we do not support complex labels like 'Iris-virginica' | 11:12 |
blackburn | but 0, 1, 2.. | 11:12 |
Ram108 | oh | 11:12 |
Ram108 | hmmmm | 11:12 |
blackburn | http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.names that's description | 11:12 |
blackburn | take a look at examples/undocumented/python_modular/classifier_knn_modular.py | 11:13 |
blackburn | when you have feature matrix for the train set | 11:13 |
blackburn | and for the test set | 11:13 |
blackburn | you have to create RealFeatures() shogun object | 11:13 |
Ram108 | hmmm okay | 11:13 |
ishaanmlhtr | alright. | 11:14 |
blackburn | in case of kNN classifier distance is used | 11:14 |
Ram108 | k | 11:14 |
blackburn | cause it is a distance-based classifier | 11:14 |
Ram108 | hmmm | 11:14 |
blackburn | so just try to read the data, separate train/test sets somehow, create features, and train, say, an LDA (or any other) classifier | 11:15 |
ishaanmlhtr | ok..fine.,. | 11:15 |
blackburn | then try to apply it to test data and compare results | 11:15 |
Ram108 | oh okay | 11:16 |
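Comparing results here comes down to checking the predicted labels against the true labels of the held-out test vectors. A minimal sketch in plain Python follows; the function name `accuracy` and the sample label lists are made up for illustration and are not part of shogun.

```python
def accuracy(predicted, expected):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(expected):
        raise ValueError("label lists must have the same length")
    if not expected:
        raise ValueError("need at least one label")
    hits = sum(1 for p, e in zip(predicted, expected) if p == e)
    return hits / len(expected)


# Hypothetical classifier output versus the true test labels
# (0, 1, 2 stand for the three iris classes).
true_labels = [0, 0, 1, 1, 2, 2]
predicted_labels = [0, 0, 1, 2, 2, 2]
score = accuracy(predicted_labels, true_labels)
print(score)
```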
Ram108 | what u gave us is the train..... we make up our own test right? | 11:16 |
blackburn | no | 11:17 |
Ram108 | oops... | 11:17 |
Ram108 | ? | 11:17 |
blackburn | well just select a few for test | 11:17 |
blackburn | and separate | 11:17 |
Ram108 | from the train itself.... hmmm okay | 11:17 |
blackburn | just cherry-pick say 5 each class | 11:19 |
Ram108 | ok | 11:20 |
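One way to set aside 5 vectors per class for testing is sketched below in plain Python. The helper name `split_per_class` and the toy data are assumptions for illustration; taking the first 5 of each class is the simplest choice, and a random pick would work just as well.

```python
from collections import defaultdict


def split_per_class(samples, labels, n_test_per_class=5):
    """Move the first n_test_per_class samples of every class into the test set."""
    taken = defaultdict(int)  # how many test samples each class has so far
    train, test = [], []
    for sample, label in zip(samples, labels):
        if taken[label] < n_test_per_class:
            taken[label] += 1
            test.append((sample, label))
        else:
            train.append((sample, label))
    return train, test


# Toy stand-in for the iris rows: 3 classes with 8 one-dimensional samples each.
toy_labels = [c for c in (0, 1, 2) for _ in range(8)]
toy_samples = [[float(i)] for i in range(len(toy_labels))]
train_set, test_set = split_per_class(toy_samples, toy_labels)
print(len(train_set), len(test_set))
```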
Ram108 | each vector is a list right? say "vector1=[5.0,3.4,1.5,0.2]" and the total matrix is of the form "matrix=[vector1, vector 2, vector3......] did i get that correctly? | 11:53 |
blackburn | could be list but feature matrix should be numpy array | 11:54 |
blackburn | if vector1 = [1,2,3] | 11:54 |
blackburn | and vector2 = [4,5,6] | 11:54 |
blackburn | and vector3 = [7,8,9] | 11:54 |
blackburn | then feature matrix is | 11:54 |
blackburn | [[1,4,7], | 11:54 |
blackburn | [2,5,8], | 11:54 |
blackburn | [3,6,9]] | 11:54 |
blackburn | column-wise | 11:54 |
Ram108 | oh got it | 11:55 |
blackburn | i.e. each column is a feature vector | 11:55 |
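The column-wise layout described above can be reproduced with numpy. This is a small sketch using the same three example vectors; `numpy.column_stack` is one of several ways to do it.

```python
import numpy as np

vector1 = [1, 2, 3]
vector2 = [4, 5, 6]
vector3 = [7, 8, 9]

# column_stack places each 1-D vector in its own column.
feature_matrix = np.column_stack([vector1, vector2, vector3]).astype(float)
print(feature_matrix)
print(feature_matrix.shape)  # (dim, N)
```

Selecting `feature_matrix[:, 1]` gives back `vector2`, which is a quick way to check that vectors really sit in columns.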
Ram108 | i have a question...... vector1=[5.0,3.4,1.5,0.2,0], vector2=[4.4,2.9,1.4,0.2,0], vector3=[6.5,2.8,4.6,1.5,1]..... the feature matrix is[[5.0,3.4,1.5,0.2,0],[4.4,2.9,1.4,0.2,0],[6.5,2.8,4.6,1.5,1]] right? | 12:12 |
Ram108 | where iris-setosa=0 | 12:13 |
Ram108 | Iris-versicolor=1 | 12:13 |
Ram108 | Iris-virginica=2 | 12:13 |
Ram108 | we form that for all the given data and form the test/ train by picking some from the given data...... | 12:14 |
Ram108 | am i correct? | 12:14 |
Ram108 | hence the handpicked set of test or train vectors as numpy columns forms the feature set | 12:16 |
-!- puneetgoyal [~puneet@115.242.21.16] has quit [Quit: Leaving] | 12:42 | |
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has left #shogun [] | 12:44 | |
blackburn | Ram108: sorry was away | 13:14 |
blackburn | Ram108: no, your feature matrix is transposed, feature matrix is a matrix of dim rows and N columns | 13:15 |
blackburn | and vectors are stored by column | 13:15 |
blackburn | you are right about labels | 13:16 |
blackburn | 0,1,2 will be enough | 13:16 |
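To make the correction concrete, here is a sketch with the three rows from the question: the last entry of each row is split off into a labels vector, and the remaining values are transposed so the matrix has dim rows and N columns. Variable names are illustrative only.

```python
import numpy as np

# The three rows from the question above; the last entry of each is the class label.
rows = np.array([
    [5.0, 3.4, 1.5, 0.2, 0],
    [4.4, 2.9, 1.4, 0.2, 0],
    [6.5, 2.8, 4.6, 1.5, 1],
])

labels = rows[:, -1].copy()             # one label per sample
feature_matrix = rows[:, :-1].T.copy()  # transpose: dim rows, N columns
print(feature_matrix.shape)
print(labels)
```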
-!- puneetgoyal [~puneet@115.242.7.139] has joined #shogun | 13:59 | |
Ram108 | am really sorry i had to leave too.... had to visit the doctor | 15:14 |
blackburn | hey, you don't have to be sorry ;) | 15:16 |
-!- Ram108 [~amma@14.96.137.67] has quit [Ping timeout: 240 seconds] | 15:20 | |
-!- Ram108 [~amma@115.117.251.142] has joined #shogun | 15:38 | |
-!- blackburn [~blackburn@188.168.4.87] has quit [Ping timeout: 240 seconds] | 15:57 | |
-!- blackburn [~blackburn@188.168.5.195] has joined #shogun | 16:53 | |
Ram108 | hi can i email u the code? | 17:29 |
blackburn | Ram108: sure | 17:31 |
Ram108 | done :) | 17:33 |
Ram108 | well i hope i have generated the feature matrix correctly...... | 17:34 |
-!- puneetgoyal [~puneet@115.242.7.139] has quit [Ping timeout: 252 seconds] | 17:34 | |
Ram108 | pls feel free to tell me any modifications necessary | 17:34 |
Ram108 | now what do i have to do? how do i generate test vectors from it? | 17:41 |
blackburn | Ram108: you could do it in a simpler way | 17:46 |
blackburn | much more | 17:46 |
blackburn | f = open('yourdatafile') | 17:47 |
Ram108 | oh u mean without using an intermediate file? | 17:47 |
blackburn | for line in f.readlines(): | 17:47 |
blackburn | dataline = line.split(',') | 17:47 |
blackburn | and then you will get a list of numbers and Iris-* | 17:47 |
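A fuller version of the read-and-split loop might look like the sketch below. The sample text is inlined (in the comma-separated layout of iris.data) so it runs without a download, and the 0/1/2 mapping is the one proposed earlier in this log; the names `CLASS_IDS`, `vectors` and `labels` are assumptions.

```python
import io

# Three lines in the same comma-separated layout as iris.data, inlined so the
# sketch runs without downloading anything.
sample_text = (
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "\n"
)

CLASS_IDS = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

vectors, labels = [], []
with io.StringIO(sample_text) as f:  # swap in open('yourdatafile') for a real file
    for line in f:
        line = line.strip()
        if not line:  # skip blank lines, which a data file may end with
            continue
        dataline = line.split(",")
        vectors.append([float(x) for x in dataline[:-1]])  # the numbers
        labels.append(CLASS_IDS[dataline[-1]])             # the Iris-* name

print(vectors)
print(labels)
```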
Ram108 | oh hmmm okay sure i ll get to it nw | 17:48 |
Ram108 | by the way are these for classifying the flowers lol :P | 17:48 |
blackburn | yes | 17:48 |
blackburn | :D | 17:48 |
Ram108 | :) | 17:48 |
-!- puneetgoyal [~puneet@115.242.82.82] has joined #shogun | 17:57 | |
blackburn | Ram108: once you got matrix | 18:00 |
Ram108 | yes? | 18:00 |
blackburn | btw you can create zero matrix with numpy.zeros([dim,N]) | 18:00 |
blackburn | once you got this matrix you have to create RealFeatures(matrix) | 18:00 |
Ram108 | oh hmmm thanks :) | 18:00 |
Ram108 | hw do i do that? | 18:00 |
blackburn | and then it is similar to the examples in the shogun sources | 18:00 |
Ram108 | how do i create real features? | 18:01 |
blackburn | Ram108: features = RealFeatures(matrix) | 18:03 |
blackburn | and 'from modshogun import *' before | 18:04 |
Ram108 | oh hmmm thanks :) | 18:04 |
Ram108 | okay :) | 18:04 |
blackburn | Ram108: it could be much shorter still but continue with creating features and training the classifier | 18:25 |
Ram108 | yeah sure..... | 18:25 |
Ram108 | why is my data in curly braces? lol i guess RealFeatures function did somethn to it | 18:34 |
blackburn | ehm? | 18:34 |
Ram108 | now that i have the feature matrix what do i do :) | 18:34 |
blackburn | take a look at any classifier_* python example | 18:35 |
Ram108 | lol nothn.... was just fooling arnd with the print statement after calling RealFeatures() | 18:35 |
Ram108 | okay :) | 18:35 |
Ram108 | i ll have to create "train data.dat, test data.dat, label_train_twoclass.dat" right? | 18:37 |
Ram108 | am sorry am asking a lot of stupid questions..... :( | 18:38 |
blackburn | Ram108: no, you can use your files | 18:40 |
blackburn | Ram108: you need to create two RealFeatures instances | 18:41 |
blackburn | for train and test data | 18:41 |
blackburn | then create labels for train data | 18:41 |
blackburn | and create/train/etc classifier | 18:42 |
Ram108 | oh hmmm | 18:42 |
Ram108 | label_traindat = lm.load_labels('../data/label_train_twoclass.dat') whats that? | 18:46 |
Ram108 | i ll have to write the feature matrix obtained after "features=RealFeatures(new_mat)" onto a file named train and test separately right? | 18:49 |
Ram108 | and what do u mean by create labels for train data? | 18:49 |
blackburn | Ram108: load_labels() loads labels from file, you do not have to use it, | 18:54 |
blackburn | Ram108: labels is vector of 'answers' | 18:54 |
blackburn | just set 0,1,2 to corresponding positions | 18:54 |
Ram108 | oh hmmm | 18:54 |
Ram108 | so i ll just neglect that | 18:55 |
Ram108 | and create the test and train files by writing the "features" obtained onto a file and setting the path correct on classifier_*.py | 18:56 |
Ram108 | am i correct? | 18:56 |
-!- ishaanmlhtr [~chatzilla@14.98.155.181] has joined #shogun | 18:57 | |
blackburn | Ram108: no you do not have to do it | 18:57 |
blackburn | Ram108: just read your data file into the matrix | 18:58 |
blackburn | and create RealFeatures(matrix) | 18:58 |
Ram108 | u mean set the path? | 18:58 |
blackburn | no, just write your own with available example | 18:58 |
-!- ishaanmlhtr [~chatzilla@14.98.155.181] has quit [Client Quit] | 18:58 | |
Ram108 | okay sorry am missing something here | 18:58 |
Ram108 | well i ll have to first write the RealFeatures matrix onto two files as it is, right? | 18:59 |
blackburn | Ram108: | 19:00 |
blackburn | 1) read real valued columns of your file into the numpy matrix, and for each ith vector store 0 for iris-setosa, 1 for iris-*, 2 for iris-* in the labels vector | 19:00 |
blackburn | 2) create RealFeatures(matrix) and Labels(labels) | 19:00 |
blackburn | 3) train classifier | 19:00 |
Ram108 | oh okay | 19:01 |
blackburn | after step 1) you will have some matrix | 19:02 |
Ram108 | yeah got it :) | 19:02 |
blackburn | [[x1, y1 ...], | 19:02 |
blackburn | [x2, y2 ...], | 19:02 |
blackburn | [x3, y3 ...]] | 19:02 |
blackburn | and labels | 19:02 |
Ram108 | now i have 2 matrices RealFeature matrix and label matrix :) | 19:02 |
Ram108 | yeah :) | 19:02 |
blackburn | yes | 19:02 |
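Step 1 can be sketched with the `numpy.zeros([dim,N])` trick mentioned earlier: allocate the matrix, then fill one column and one label per sample. The `parsed` list is hypothetical stand-in data for rows already read from the file.

```python
import numpy as np

# Hypothetical parsed rows: four measurements plus a class name each.
parsed = [
    ([5.1, 3.5, 1.4, 0.2], "Iris-setosa"),
    ([7.0, 3.2, 4.7, 1.4], "Iris-versicolor"),
    ([6.3, 3.3, 6.0, 2.5], "Iris-virginica"),
]
class_ids = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}

dim, N = len(parsed[0][0]), len(parsed)
matrix = np.zeros([dim, N])  # dim rows, N columns
labels = np.zeros(N)

for i, (values, name) in enumerate(parsed):
    matrix[:, i] = values        # i-th column is the i-th feature vector
    labels[i] = class_ids[name]  # i-th label is the 'answer' for the i-th vector

print(matrix)
print(labels)
```

These two arrays are what step 2 passes on as `RealFeatures(matrix)` and `Labels(labels)`; those calls are left out of the sketch because they need a shogun installation.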
Ram108 | i have created the RealFeatures(matrix) and Labels(labels) instances | 19:08 |
Ram108 | how do i link it with the classifier function? | 19:09 |
blackburn | for example lda = LDA(1.0, features, labels) | 19:10 |
Ram108 | oh hmmm :) | 19:11 |
Ram108 | yeah i did that.... | 19:12 |
Ram108 | then what? | 19:12 |
blackburn | lda.train() | 19:12 |
Ram108 | i got that error | 19:14 |
blackburn | oh sure | 19:14 |
blackburn | LDA is two-class classifier | 19:14 |
Ram108 | hmmm okay..... | 19:14 |
blackburn | well then try KNN | 19:15 |
Ram108 | god i really need to catch up on a lot of theory...... i ll read it up..... which othr classifier wud u suggest | 19:15 |
Ram108 | oh okay | 19:15 |
blackburn | just do it as in classifier_knn_modular.py | 19:15 |
Ram108 | k | 19:16 |
Ram108 | knn requires distance as a parameter....... | 19:18 |
blackburn | just create it | 19:20 |
Ram108 | i did it threw up an error again :( | 19:20 |
Ram108 | am sorry am really bugging u nw | 19:20 |
blackburn | copy-paste it here | 19:20 |
Ram108 | Traceback (most recent call last): | 19:21 |
Ram108 | File "ammended.py", line 44, in <module> | 19:21 |
Ram108 | knn=KNN(1.0, features, labels) | 19:21 |
Ram108 | File "/usr/local/lib/python2.6/dist-packages/modshogun.py", line 20997, in __init__ | 19:21 |
Ram108 | this = _modshogun.new_KNN(*args) | 19:21 |
Ram108 | NotImplementedError: Wrong number of arguments for overloaded function 'new_KNN'. | 19:21 |
Ram108 | Possible C/C++ prototypes are: | 19:21 |
Ram108 | shogun::CKNN() | 19:21 |
Ram108 | shogun::CKNN(int32_t,shogun::CDistance *,shogun::CLabels *) | 19:21 |
blackburn | yes, you should do it e.g. KNN(3,distance,labels) | 19:21 |
blackburn | not features | 19:21 |
Ram108 | oh hmmm ok | 19:22 |
Ram108 | well i guess u r kind of exhausted with this..... do u want me to meet u later? | 19:22 |
Ram108 | am really sorry...... | 19:22 |
blackburn | Ram108: not really, just trying to do two things in parallel ;) | 19:24 |
Ram108 | :) | 19:24 |
-!- puneetgoyal [~puneet@115.242.82.82] has quit [Ping timeout: 268 seconds] | 19:29 | |
Ram108 | how do i see the op? the last bit of implementation of the code is on the screenshot. hope i have done it correctly | 19:37 |
blackburn | you should not include labels in features | 19:38 |
blackburn | use new_mat[:4,:] (the four feature rows, without the label row) | 19:38 |
blackburn | and for training you should init the distance with (train, train) features | 19:38 |
Ram108 | k | 19:38 |
Ram108 | oh hmmm | 19:39 |
blackburn | well, train does pretty much nothing for KNN | 19:39 |
Ram108 | oh gee! hmmm | 19:39 |
Ram108 | well yeah i got a fair idea of how it works :) | 19:39 |
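For intuition only, here is a small numpy sketch of what a k-nearest-neighbour classifier does: it keeps the training columns and, for each test column, takes a majority vote among the k closest training vectors. This is not shogun's implementation or API; `knn_predict` and the toy data are made up for illustration.

```python
import numpy as np
from collections import Counter


def knn_predict(train, train_labels, test, k=3):
    """Classify each column of `test` by majority vote among its k nearest
    training columns (Euclidean distance). Matrices are dim x N."""
    predictions = []
    for j in range(test.shape[1]):
        diffs = train - test[:, j:j + 1]           # broadcast over training columns
        dists = np.sqrt((diffs ** 2).sum(axis=0))  # one distance per training vector
        nearest = np.argsort(dists)[:k]            # indices of the k closest
        votes = Counter(int(train_labels[i]) for i in nearest)
        predictions.append(votes.most_common(1)[0][0])
    return predictions


# Made-up 2-D toy data: two tight clusters, one per class.
train = np.array([[0.0, 0.1, 0.2, 5.0, 5.1, 5.2],
                  [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]])
train_labels = np.array([0, 0, 0, 1, 1, 1])
test = np.array([[0.05, 5.05],
                 [0.05, 5.05]])
predicted = knn_predict(train, train_labels, test, k=3)
print(predicted)
```

There is no fitting step in the sketch, which matches the remark that training does very little for KNN: all the work happens at prediction time.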
blackburn | you will catch everything, it will become pretty easy | 19:40 |
Ram108 | hmmm sure hoping forward to see that day...... | 19:41 |
Ram108 | i guess i ll finish it off tomorrow | 19:42 |
Ram108 | been a tiring day and am exhausted | 19:42 |
Ram108 | goodbye :) | 19:42 |
blackburn | okay | 19:43 |
blackburn | see you | 19:43 |
Ram108 | thanks a lot for your help :) really appreciate it :) | 19:43 |
Ram108 | c u tomorrow :) | 19:43 |
Ram108 | bye :) | 19:44 |
-!- Ram108 [~amma@115.117.251.142] has quit [Quit: Ex-Chat] | 19:44 | |
-!- puneetgoyal [~puneet@115.240.60.197] has joined #shogun | 19:47 | |
@sonney2k | blackburn, any news? | 20:06 |
blackburn | sonney2k: news on? | 20:07 |
@sonney2k | any | 20:07 |
blackburn | ehmm | 20:07 |
blackburn | no idea :) | 20:08 |
blackburn | sonney2k: what kind of news do you want? | 20:09 |
blackburn | sonney2k: well we have THREE 'newcomers' | 20:10 |
blackburn | I'll try to take care of that :) | 20:11 |
blackburn | I did suggest to implement t-SNE technique to Eugeniy aka gsomix | 20:12 |
blackburn | and we are pretty near to release I guess | 20:12 |
puneetgoyal | hello, I am spending a little more time on studying various things about svm...about how its working and all | 20:53 |
puneetgoyal | is it ok? or I should concentrate more on implementing it? | 20:53 |
blackburn | puneetgoyal: ehm, implementing what? | 20:58 |
puneetgoyal | blackburn: implementing svm, first to learn how it classifies the data and then will try to use it to detect spams | 20:59 |
blackburn | it would be pretty difficult to implement svm once again | 20:59 |
blackburn | and hey we have much already :D | 21:00 |
blackburn | just use it | 21:00 |
puneetgoyal | so I should use the one included in shogun as an object | 21:00 |
blackburn | it makes sense to use what is already implemented | 21:01 |
blackburn | cause implementing svm solver is a hard task | 21:01 |
@sonney2k | puneetgoyal, it will be enough work to prepare data, use the svm in shogun and later on to write some more optimized 'features' | 21:02 |
@sonney2k | blackburn, nice that we have new interested students and great that you take care :) | 21:03 |
blackburn | sure | 21:04 |
puneetgoyal | sonney2k: I am sorry if I got it wrong, but wont the svm given in svm will map the data onto some specific features?...so It means I would edit the existing code to write some more optimized features? | 21:04 |
puneetgoyal | given in shogun* | 21:04 |
@sonney2k | puneetgoyal, you will need to do this mapping at some point to get best results | 21:05 |
puneetgoyal | ok great, I would get back in a short while after implementing some using shogun's module | 21:07 |
shogun-buildbot | build #83 of nightly_all is complete: Success [build successful] Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_all/builds/83 | 21:13 |
@sonney2k | for spam you can currently use the CommWordStringKernel - blackburn did show you an embedding recently - that is how it works :) | 21:13 |
@sonney2k | blackburn, any plan on the release? any trac -> github migration started? | 21:16 |
@sonney2k | My concern is that we are both very busy and w/o a plan won't make this release. | 21:16 |
@sonney2k | So I could either just type make release now or we have to make a concrete plan | 21:16 |
blackburn | sonney2k: make release now | 21:34 |
blackburn | :) | 21:34 |
blackburn | if you have no really hard concerns about current status | 21:35 |
blackburn | sonney2k: I'll continue with migrating issues soon | 21:36 |
@sonney2k | blackburn, have you updated NEWS? | 21:39 |
blackburn | is it outdated? let me check | 21:40 |
blackburn | yes, updating | 21:43 |
15SAAI18M | shogun: Sergey Lisitsyn master * r55984f5 / src/NEWS : Updated news - http://git.io/q1jo8A | 22:02 |
blackburn | sonney2k: I guess we can start release? | 22:02 |
shogun-buildbot | build #370 of python_static is complete: Failure [failed configure] Build details are at http://www.shogun-toolbox.org/buildbot/builders/python_static/builds/370 blamelist: blackburn91@gmail.com | 22:06 |
@sonney2k | too tired now will try to do sth hopefully tomorrow... | 22:09 |
blackburn | hah | 22:09 |
blackburn | hm ok | 22:09 |
-!- puneetgoyal [~puneet@115.240.60.197] has quit [Quit: Leaving] | 22:23 | |
-!- Netsplit *.net <-> *.split quits: @sonney2k, rrenaud, sonne|work, blackburn, shogun-buildbot | 23:18 | |
-!- Netsplit over, joins: blackburn, shogun-buildbot, @sonney2k | 23:18 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 23:20 | |
-!- rrenaud [~rrenaud@cpe-66-108-112-118.nyc.res.rr.com] has joined #shogun | 23:25 | |
-!- puneetgoyal [~puneet@115.241.195.173] has joined #shogun | 23:26 | |
--- Log closed Mon Dec 12 00:00:19 2011 |