IRC logs of #shogun for Sunday, 2011-12-11

--- Log opened Sun Dec 11 00:00:19 2011
-!- Ram108 [~amma@14.99.61.176] has joined #shogun05:04
-!- ishaanmlhtr [~chatzilla@14.98.111.202] has joined #shogun05:20
ishaanmlhtrHi I am new here..Is there anyone whom i could talk to to get started?05:35
ishaanmlhtrI am interested in working on shogun machine learning toolbox.  So, can anyone please let me know whom to talk to?05:40
-!- ishaanmlhtr [~chatzilla@14.98.111.202] has quit [Read error: Connection reset by peer]06:11
-!- Ram108 [~amma@14.99.61.176] has quit [Ping timeout: 268 seconds]06:22
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has joined #shogun06:33
-!- Ram108 [~amma@14.96.137.67] has joined #shogun06:45
-!- puneetgoyal [~puneet@115.242.21.16] has joined #shogun07:06
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has left #shogun []07:32
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has joined #shogun07:49
blackburn(01:30:43 AM) sonney2k: blackburn, btw how can you be sure that elections were manipulated and not that you are being manipulated to believe they were? -- believe me they were ;) we have a lot of evidences of dirty cheating10:24
blackburnhmm three guys wanting to work on shogun the same time?10:49
Ram108lol10:50
ishaanmlhtrYa..I wanted to start working on shogun. Can work on its interface with matlab or python. Any of the two.10:52
Ram108same here :) matlab cpp or python10:52
Ram108:)10:52
Ram108wen i run any of those py files all i get is the name of the algorithm10:53
Ram108do i have to create the data set myself?10:54
Ram108or is there any documentation?10:54
blackburnRam108: yes, it means all is ok10:54
blackburnishaanmlhtr: what is the field you are interested in?10:54
Ram108ah but i really want to see it segregate something into groups10:55
Ram108lol10:55
ishaanmlhtrblackburn : I haven't done much in machine learning but want to start off . I have full two months free ahead of me.10:55
blackburnguys do you know each other?10:55
blackburn:D10:55
Ram108oh no not really10:55
ishaanmlhtrblackburn: right now i was just going through the abstracts of shogun10:55
ishaanmlhtrno,i dont know Ram108 at all10:56
blackburnthat's crazy.. ok10:56
Ram108hi ishaan :)10:56
ishaanmlhtrhi Ram10810:56
ishaanmlhtr:)10:56
Ram108call me ram :)10:56
ishaanmlhtrok sure10:56
blackburnwe had no one wanting to work on shogun for a while10:56
blackburnand now three :D10:56
ishaanmlhtrblackburn : ok,so could you help me get started with some aim10:56
blackburnsure I'll try10:57
ishaanmlhtrblackburn: thats cool ,right?10:57
blackburnyes, kind of cool10:57
Ram108yeah i guess :P10:57
Ram108so can we get started out with something..... perhaps a small task maybe?10:58
Ram108am asking....10:58
blackburnhmm ok :)10:58
Ram108:)10:58
Ram108well? who is the third guy?10:59
blackburnpuneetgoyal is10:59
Ram108oh hmmm10:59
Ram108rrenaud?10:59
blackburnrrenaud was asking some things about VW11:00
blackburnnot a volkswagen but vowpal wabbit11:00
Ram108hmmm11:00
blackburn:D11:00
Ram108lol yeah i know.... i did ask if u implemented that yesterday11:00
blackburnok what is the interface you want to work with? I suggest python for sure11:00
Ram108yeah python is fine :)11:01
ishaanmlhtrblackburn : first of all i wanted to have an experience working with shogun,so i was thinking of going through the tuts..would that be helpful?11:01
ishaanmlhtrand yeah python is fine for me too.11:01
blackburnso you want to learn some basics right?11:02
Ram108yep11:02
ishaanmlhtryes11:03
blackburnok I could give you some dataset and suggest you to try different classifiers on these data11:03
Ram108okay :)11:04
blackburnyou know basics of ML, like trainset/testset, etc?11:04
ishaanmlhtrhmm..alright..11:04
ishaanmlhtrya,sort of..i could study more if need be11:04
Ram108i could study too11:05
blackburnhttp://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data11:10
blackburnlets try on this data11:10
Ram108hmmm okay11:10
blackburnlet me describe a little11:10
blackburnyou have to read each row as a vector11:10
blackburnand store it in numpy matrix column wise11:10
Ram108k11:10
Ram108hmmm11:11
blackburnmatrix [vector1 vector2 ...] is called a feature matrix11:11
Ram108yes11:11
blackburnmeanwhile you have to construct labels vector11:11
Ram108labels vector ..... hmmm11:11
ishaanmlhtrok..11:11
blackburni-th label is an 'answer' on i-th vector11:11
blackburnwe do not support complex labels like 'Iris-virginica'11:12
blackburnbut 0, 1, 2..11:12
Ram108oh11:12
Ram108hmmmm11:12
blackburnhttp://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.names that's description11:12
blackburntake a look at examples/undocumented/python_modular/classifier_knn_modular.py11:13
blackburnwhen you have feature matrix for the train set11:13
blackburnand for the test set11:13
blackburnyou have to create RealFeatures() shogun object11:13
Ram108hmmm okay11:13
ishaanmlhtralright.11:14
blackburnin case of kNN classifier distance is used11:14
Ram108k11:14
blackburncause it is a distance-based classifier11:14
Ram108hmmm11:14
blackburnso just try to read the data, separate a train/test set somehow, create features, and train, say, an LDA (or any other) classifier11:15
ishaanmlhtrok..fine.,.11:15
blackburnthen try to apply it to test data and compare results11:15
Ram108oh okay11:16
Ram108what u gave us is the train..... we make up our own test right?11:16
blackburnno11:17
Ram108oops...11:17
Ram108?11:17
blackburnwell just select a few for test11:17
blackburnand separate11:17
Ram108from the train itself.... hmmm okay11:17
blackburnjust cherry-pick say 5 each class11:19
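A minimal sketch of holding out 5 samples per class for testing. It picks them at random with a fixed seed rather than by hand, and the helper name `holdout_per_class`, the toy data, and the 50-samples-per-class layout are assumptions for illustration only. Matrices are stored as dim x N, one sample per column.

```python
import numpy as np

def holdout_per_class(labels, n_test=5, seed=0):
    """Return (train_idx, test_idx), holding out n_test samples of every class."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    test_idx = []
    for cls in np.unique(labels):
        cls_idx = np.flatnonzero(labels == cls)  # positions of this class
        test_idx.extend(rng.choice(cls_idx, size=n_test, replace=False))
    test_idx = np.sort(np.array(test_idx))
    # Everything that was not held out stays in the training set.
    train_idx = np.setdiff1d(np.arange(labels.size), test_idx)
    return train_idx, test_idx

# Toy stand-in for the data set: 3 classes, 50 samples each, 4 features.
labels = np.repeat([0.0, 1.0, 2.0], 50)
features = np.arange(4 * 150, dtype=float).reshape(4, 150)  # dim x N

train_idx, test_idx = holdout_per_class(labels)
train_feat, test_feat = features[:, train_idx], features[:, test_idx]
train_lab, test_lab = labels[train_idx], labels[test_idx]
```

Splitting by column index keeps each feature vector aligned with its label in both subsets.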
Ram108ok11:20
Ram108each vector is a list right? say "vector1=[5.0,3.4,1.5,0.2]" and the total matrix is of the form "matrix=[vector1, vector 2, vector3......] did i get that correctly?11:53
blackburncould be list but feature matrix should be numpy array11:54
blackburnif vector1 = [1,2,3]11:54
blackburnand vector2 = [4,5,6]11:54
blackburnand vector3 = [7,8,9]11:54
blackburnthen feature matrix is11:54
blackburn[[1,4,7],11:54
blackburn [2,5,8],11:54
blackburn [3,6,9]]11:54
blackburncolumn-wise11:54
Ram108oh got it11:55
blackburni.e. each column is a feature vector11:55
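The column-wise layout described above can be sketched with plain NumPy. Using `np.column_stack` is just one way to build it, and the float conversion is an assumption (real-valued features are what `RealFeatures` is named for).

```python
import numpy as np

vector1 = [1, 2, 3]
vector2 = [4, 5, 6]
vector3 = [7, 8, 9]

# column_stack places each 1-D vector in its own column.
feature_matrix = np.column_stack([vector1, vector2, vector3]).astype(float)

print(feature_matrix)
# [[1. 4. 7.]
#  [2. 5. 8.]
#  [3. 6. 9.]]
```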
Ram108i have a question...... vector1=[5.0,3.4,1.5,0.2,0], vector2=[4.4,2.9,1.4,0.2,0], vector3=[6.5,2.8,4.6,1.5,1]..... the feature matrix is[[5.0,3.4,1.5,0.2,0],[4.4,2.9,1.4,0.2,0],[6.5,2.8,4.6,1.5,1]] right?12:12
Ram108where iris-setosa=012:13
Ram108Iris-versicolor=112:13
Ram108Iris-virginica=212:13
Ram108we form that for all the given data and form the test/ train by picking some from the given data......12:14
Ram108am i correct?12:14
Ram108hence the handpicked set of test or train vectors as numpy columns forms the feature set12:16
-!- puneetgoyal [~puneet@115.242.21.16] has quit [Quit: Leaving]12:42
-!- ishaanmlhtr [~chatzilla@14.98.12.160] has left #shogun []12:44
blackburnRam108: sorry was away13:14
blackburnRam108: no, your feature matrix is transposed; the feature matrix has dim rows (one per feature) and N columns (one per vector)13:15
blackburnand vectors are stored by column13:15
blackburnyou are right about labels13:16
blackburn0,1,2 will be enough13:16
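A small sketch of turning the row-wise matrix from the question into the dim x N layout, assuming the last entry of each row is the class label and that the labels are kept in a separate vector. The variable names are illustrative only.

```python
import numpy as np

# Rows as written in the question: four measurements followed by the class label.
rows = np.array([
    [5.0, 3.4, 1.5, 0.2, 0],
    [4.4, 2.9, 1.4, 0.2, 0],
    [6.5, 2.8, 4.6, 1.5, 1],
])

labels = rows[:, -1].copy()      # one label per sample
feature_matrix = rows[:, :-1].T  # shape (dim, N) = (4, 3), one sample per column
```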
-!- puneetgoyal [~puneet@115.242.7.139] has joined #shogun13:59
Ram108am really sorry i had to leave too.... had to visit the doctor15:14
blackburnhey, you don't have to be sorry ;)15:16
-!- Ram108 [~amma@14.96.137.67] has quit [Ping timeout: 240 seconds]15:20
-!- Ram108 [~amma@115.117.251.142] has joined #shogun15:38
-!- blackburn [~blackburn@188.168.4.87] has quit [Ping timeout: 240 seconds]15:57
-!- blackburn [~blackburn@188.168.5.195] has joined #shogun16:53
Ram108hi can i email u the code?17:29
blackburnRam108: sure17:31
Ram108done :)17:33
Ram108well i hope i have generated the feature matrix correctly......17:34
-!- puneetgoyal [~puneet@115.242.7.139] has quit [Ping timeout: 252 seconds]17:34
Ram108pls feel free to tell me any modifications necessary17:34
Ram108now what do i have to do? how do i generate test vectors from it?17:41
blackburnRam108: you could do it in a more simple way17:46
blackburnmuch more17:46
blackburnf = open('yourdatafile')17:47
Ram108oh u mean without using an intermediate file?17:47
blackburnfor line in f.readlines():17:47
blackburn   dataline = line.split(',')17:47
blackburnand then you will get a list of numbers and Iris-*17:47
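The reading loop above could be completed roughly as follows. This is a sketch under assumptions: the function name `load_iris` and the `CLASS_IDS` mapping are made up for illustration (the 0/1/2 assignment follows the one proposed earlier in the log), and an in-memory string stands in for the real data file.

```python
import io
import numpy as np

CLASS_IDS = {"Iris-setosa": 0.0, "Iris-versicolor": 1.0, "Iris-virginica": 2.0}

def load_iris(fileobj):
    """Parse iris.data-style lines into a (dim, N) matrix and a label vector."""
    samples, labels = [], []
    for line in fileobj:
        line = line.strip()
        if not line:  # skip blank lines, e.g. at the end of the file
            continue
        *values, name = line.split(",")
        samples.append([float(v) for v in values])
        labels.append(CLASS_IDS[name])
    # Each parsed sample is a row here; transpose so samples become columns.
    return np.array(samples).T, np.array(labels)

# A few lines in the iris.data format stand in for open('yourdatafile').
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "7.0,3.2,4.7,1.4,Iris-versicolor\n"
    "6.3,3.3,6.0,2.5,Iris-virginica\n"
    "\n"
)
feature_matrix, labels = load_iris(sample)
```

With a real file, `load_iris(open('yourdatafile'))` would be the equivalent call; the matrix could then be passed to `RealFeatures(...)` as discussed in the log.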
Ram108oh hmmm okay sure i ll get to it nw17:48
Ram108by the way are these for classifying the flowers lol :P17:48
blackburnyes17:48
blackburn:D17:48
Ram108:)17:48
-!- puneetgoyal [~puneet@115.242.82.82] has joined #shogun17:57
blackburnRam108: once you got matrix18:00
Ram108yes?18:00
blackburnbtw you can create zero matrix with numpy.zeros([dim,N])18:00
blackburnonce you got this matrix you have to create RealFeatures(matrix)18:00
Ram108oh hmmm thanks :)18:00
Ram108hw do i do that?18:00
blackburnand then it is similar to the examples in the shogun sources18:00
Ram108how do i create real features?18:01
blackburnRam108: features = RealFeatures(matrix)18:03
blackburnand 'from modshogun import *' before18:04
Ram108oh hmmm thanks :)18:04
Ram108okay :)18:04
blackburnRam108: it could be much shorter still but continue with creating features and training the classifier18:25
Ram108yeah sure.....18:25
Ram108why is my data in curly braces? lol i guess RealFeatures function did somethn to it18:34
blackburnehm?18:34
Ram108now that i have the feature matrix what do i do :)18:34
blackburntake a look at any classifier_* python example18:35
Ram108lol nothn.... was just fooling arnd with the print statement after calling RealFeatures()18:35
Ram108okay :)18:35
Ram108i ll have to create "train data.dat, test data.dat, label_train_twoclass.dat" right?18:37
Ram108am sorry am asking a lot of stupid questions..... :(18:38
blackburnRam108: no, you can use your files18:40
blackburnRam108: you need to create two RealFeatures instances18:41
blackburnfor train and test data18:41
blackburnthen create labels for train data18:41
blackburnand create/train/etc classifier18:42
Ram108oh hmmm18:42
Ram108label_traindat = lm.load_labels('../data/label_train_twoclass.dat') whats that?18:46
Ram108i ll have to write the feature matrix obtained after "features=RealFeatures(new_mat)" onto a file named train and test seperately right?18:49
Ram108and what do u mean by create labels for train data?18:49
blackburnRam108: load_labels() loads labels from file, you do not have to use it,18:54
blackburnRam108: labels is vector of 'answers'18:54
blackburnjust set 0,1,2 to corresponding positions18:54
Ram108oh hmmm18:54
Ram108so i ll just neglect that18:55
Ram108and create the test and train files by writing the "features" obtained onto a file and setting the path correct on classifier_*.py18:56
Ram108am i correct?18:56
-!- ishaanmlhtr [~chatzilla@14.98.155.181] has joined #shogun18:57
blackburnRam108: no you do not have to do it18:57
blackburnRam108: just read your data file into the matrix18:58
blackburnand create RealFeatures(matrix)18:58
Ram108u mean set the path?18:58
blackburnno, just write your own with available example18:58
-!- ishaanmlhtr [~chatzilla@14.98.155.181] has quit [Client Quit]18:58
Ram108okay sorry am missing something here18:58
Ram108well i ll have to first write the RealFeatures matrix onto two files as it is, right?18:59
blackburnRam108:19:00
blackburn1) read real valued columns of your file into the numpy matrix, and for each ith vector store 0 for iris-setosa, 1 for iris-*, 2 for iris-* in the labels vector19:00
blackburn2) create RealFeatures(matrix) and Labels(labels)19:00
blackburn3) train classifier19:00
Ram108oh okay19:01
blackburnafter step 1) you will have some matrix19:02
Ram108yeah got it :)19:02
blackburn[[x1, y1 ...],19:02
blackburn [x2, y2 ...],19:02
blackburn [x3, y3 ...]]19:02
blackburnand labels19:02
Ram108now i have 2 matrices RealFeature matrix and label matrix :)19:02
Ram108yeah :)19:02
blackburnyes19:02
Ram108i have created the RealFeatures(matrix) and Labels(labels) instances19:08
Ram108how do i link it with the classifier function?19:09
blackburnfor example lda = LDA(1.0, features, labels)19:10
Ram108oh hmmm :)19:11
Ram108yeah i did that....19:12
Ram108then what?19:12
blackburnlda.train()19:12
Ram108i got that error19:14
blackburnoh sure19:14
blackburnLDA is two-class classifier19:14
Ram108hmmm okay.....19:14
blackburnwell then try KNN19:15
Ram108god i really need to catch up on a lot of theory...... i ll read it up..... which othr classifier wud u suggest19:15
Ram108oh okay19:15
blackburnjust do it as in classifier_knn_modular.py19:15
Ram108k19:16
Ram108knn requires distance as a parameter.......19:18
blackburnjust create it19:20
Ram108i did it threw up an error again :(19:20
Ram108am sorry am really bugging u nw19:20
blackburncopy-paste it here19:20
Ram108Traceback (most recent call last):19:21
Ram108  File "ammended.py", line 44, in <module>19:21
Ram108    knn=KNN(1.0, features, labels)19:21
Ram108  File "/usr/local/lib/python2.6/dist-packages/modshogun.py", line 20997, in __init__19:21
Ram108    this = _modshogun.new_KNN(*args)19:21
Ram108NotImplementedError: Wrong number of arguments for overloaded function 'new_KNN'.19:21
Ram108  Possible C/C++ prototypes are:19:21
Ram108    shogun::CKNN()19:21
Ram108    shogun::CKNN(int32_t,shogun::CDistance *,shogun::CLabels *)19:21
blackburnyes, you should do it e.g. KNN(3,distance,labels)19:21
blackburnnot features19:21
Ram108oh hmmm ok19:22
Ram108well i guess u r kind of exhausted with this..... do u want me to meet u later?19:22
Ram108am really sorry......19:22
blackburnRam108: not really, just trying to do two things in parallel ;)19:24
Ram108:)19:24
-!- puneetgoyal [~puneet@115.242.82.82] has quit [Ping timeout: 268 seconds]19:29
Ram108how do i see the op? the last bit of implementation of the code is on the screenshot. hope i have done it correctly19:37
blackburnyou should not include labels in features19:38
blackburnuse new_mat[:3,:]19:38
blackburnand you should init the distance with (train, train) features for training19:38
Ram108k19:38
Ram108oh hmmm19:39
blackburnwell, train does pretty much nothing for KNN19:39
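To illustrate why a distance-based classifier like kNN has almost nothing to do at training time, here is a plain NumPy sketch. It is not Shogun's `KNN` class; the function `knn_predict` and the toy data are assumptions for illustration. All of the work happens at prediction, when distances from each test vector to the stored training vectors are computed.

```python
import numpy as np

def knn_predict(train_feat, train_lab, test_feat, k=3):
    """Classify each column of test_feat by majority vote among its k nearest
    training columns (squared Euclidean distance). Matrices are (dim, N)."""
    # Pairwise squared distances, shape (N_test, N_train).
    diff = test_feat.T[:, None, :] - train_feat.T[None, :, :]
    dist = np.sum(diff ** 2, axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]  # indices of the k closest
    votes = train_lab[nearest].astype(int)     # labels of those neighbours
    return np.array([np.bincount(row).argmax() for row in votes])

# Two well-separated toy classes, one sample per column.
train_feat = np.array([[0.0, 0.1, 0.2, 5.0, 5.1, 5.2],
                       [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]])
train_lab = np.array([0, 0, 0, 1, 1, 1])
test_feat = np.array([[0.05, 5.05],
                      [0.05, 5.05]])

predicted = knn_predict(train_feat, train_lab, test_feat, k=3)
```

Comparing `predicted` with held-out test labels gives the accuracy check suggested earlier in the log.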
Ram108oh gee! hmmm19:39
Ram108well yeah i got a fair idea of how it works :)19:39
blackburnyou will catch everything, it will become pretty easy19:40
Ram108hmmm sure hoping forward to see that day......19:41
Ram108i guess i ll finish it off tomorrow19:42
Ram108been a tiring day and am exhausted19:42
Ram108goodbye :)19:42
blackburnokay19:43
blackburnsee you19:43
Ram108thanks a lot for your help :) really appreciate it :)19:43
Ram108c u tomorrow :)19:43
Ram108bye :)19:44
-!- Ram108 [~amma@115.117.251.142] has quit [Quit: Ex-Chat]19:44
-!- puneetgoyal [~puneet@115.240.60.197] has joined #shogun19:47
@sonney2kblackburn, any news?20:06
blackburnsonney2k: news on?20:07
@sonney2kany20:07
blackburnehmm20:07
blackburnno idea :)20:08
blackburnsonney2k: what kind of news do you want?20:09
blackburnsonney2k: well we have THREE 'newcomers'20:10
blackburnI'll try to take care on that :)20:11
blackburnI did suggest to implement t-SNE technique to Eugeniy aka gsomix20:12
blackburnand we are pretty near to release I guess20:12
puneetgoyalhello, I am spending a little more time on studying various things about svm...about how its working and all20:53
puneetgoyalis it ok? or I should concentrate more on implementing it?20:53
blackburnpuneetgoyal: ehm, implementing what?20:58
puneetgoyalblackburn: implementing svm, first to learn how it classifies the data and then will try to use it to detect spams20:59
blackburnit would be pretty difficult to implement svm once again20:59
blackburnand hey we have much already :D21:00
blackburnjust use it21:00
puneetgoyalso I should use the one included in shogun as an object21:00
blackburnit makes sense to use what is already implemented21:01
blackburncause implementing svm solver is a hard task21:01
@sonney2kpuneetgoyal, it will be enough work to prepare data, use the svm in shogun and later on to write some more optimized 'features'21:02
@sonney2kblackburn, nice that we have new interested students and great that you take care :)21:03
blackburnsure21:04
puneetgoyalsonney2k: I am sorry if I got it wrong, but wont the svm given in svm will map the data onto some specific features?...so It means I would edit the existing code to write some more optimized features?21:04
puneetgoyalgiven in shogun*21:04
@sonney2kpuneetgoyal, you will need to do this mapping at some point to get best results21:05
puneetgoyalok great, I would get back in a short while after implementing some using shogun's module21:07
shogun-buildbotbuild #83 of nightly_all is complete: Success [build successful]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/nightly_all/builds/8321:13
@sonney2kfor spam you can currently use the CommWordStringKernel - blackburn did show you an embedding recently - that is how it works :)21:13
@sonney2kblackburn, any plan on the release? any trac -> github migration started?21:16
@sonney2kMy concern is that we are both very busy and w/o a plan won't make this release.21:16
@sonney2kSo I could either just type make release now or we have to make a concrete plan21:16
blackburnsonney2k: make release now21:34
blackburn:)21:34
blackburnif you have no really hard concerns about current status21:35
blackburnsonney2k: I'll continue with migrating issues soon21:36
@sonney2kblackburn, have you updated NEWS?21:39
blackburnis it outdated? let me check21:40
blackburnyes, updating21:43
15SAAI18Mshogun: Sergey Lisitsyn master * r55984f5 / src/NEWS : Updated news - http://git.io/q1jo8A22:02
blackburnsonney2k: I guess we can start release?22:02
shogun-buildbotbuild #370 of python_static is complete: Failure [failed configure]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/python_static/builds/370  blamelist: blackburn91@gmail.com22:06
@sonney2ktoo tired now will try to do sth hopefully tomorrow...22:09
blackburnhah22:09
blackburnhm ok22:09
-!- puneetgoyal [~puneet@115.240.60.197] has quit [Quit: Leaving]22:23
-!- Netsplit *.net <-> *.split quits: @sonney2k, rrenaud, sonne|work, blackburn, shogun-buildbot23:18
-!- Netsplit over, joins: blackburn, shogun-buildbot, @sonney2k23:18
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun23:20
-!- rrenaud [~rrenaud@cpe-66-108-112-118.nyc.res.rr.com] has joined #shogun23:25
-!- puneetgoyal [~puneet@115.241.195.173] has joined #shogun23:26
--- Log closed Mon Dec 12 00:00:19 2011
