--- Log opened Mon Mar 05 00:00:19 2012 | ||
n4nd0 | in high school, people that go into the humanities may study Latin and/or Greek | 00:00 |
blackburn | I knew about 200-300 words I guess | 00:01 |
blackburn | it pays off sometimes: you can understand a word you have never seen because it is similar to some Latin | 00:02 |
n4nd0 | yeah | 00:02 |
n4nd0 | I had a teacher of Swedish who had some knowledge of Spanish and Italian | 00:03 |
n4nd0 | not enough for a fluent conversation but he could read quite a bit | 00:03 |
n4nd0 | he claimed that he never studied those, just Latin | 00:03 |
n4nd0 | really curious! | 00:04 |
blackburn | like meta-language | 00:04 |
blackburn | :) | 00:04 |
blackburn | uh 3 am | 00:05 |
blackburn | I guess I have to sleep a little :) | 00:05 |
n4nd0 | oh that's late | 00:05 |
n4nd0 | "just" 12 here | 00:05 |
n4nd0 | good night then | 00:06 |
blackburn | I wish it was 12 here | 00:06 |
blackburn | :) | 00:06 |
blackburn | good night | 00:06 |
-!- blackburn [~qdrgsm@31.28.32.139] has quit [Quit: Leaving.] | 00:06 | |
-!- n4nd0 [~nando@s83-179-44-135.cust.tele2.se] has quit [Ping timeout: 276 seconds] | 01:08 | |
-!- axitkhurana [~akshit@14.98.227.233] has joined #shogun | 01:47 | |
-!- axitkhurana [~akshit@14.98.227.233] has left #shogun [] | 01:47 | |
-!- vikram360 [~vikram360@117.192.171.117] has quit [Read error: Connection reset by peer] | 02:12 | |
-!- n4nd0 [~nando@s83-179-44-135.cust.tele2.se] has joined #shogun | 08:29 | |
CIA-64 | shogun: Soeren Sonnenburg master * rd3f6438 / (4 files in 2 dirs): | 09:07 |
CIA-64 | shogun: Mahalanobis distance fixes | 09:07 |
CIA-64 | shogun: - use mean of all examples | 09:07 |
CIA-64 | shogun: - improve documentation | 09:07 |
CIA-64 | shogun: - serialization support - http://git.io/0kJS3w | 09:07 |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 09:10 | |
sonne|work | n4nd0: please have a look at my mahalanobis commit | 09:11 |
sonne|work | this is what I meant - but I didn't have time to check it thoroughly would be great if you could do it | 09:11 |
sonne|work | thanks! | 09:11 |
n4nd0 | sonne|work: sure I will check it, give me some minutes | 09:20 |
sonne|work | n4nd0: you basically did it like I had in mind but missed computing the mean over both lhs/rhs, and there were some minor issues (serialization / documentation) | 09:21 |
n4nd0 | sonne|work: I will take a look at it so I can do it better next time | 09:22 |
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun | 09:23 | |
sonne|work | n4nd0: keep in mind that not everything I do is correct so have a critical eye on it - I am open for discussion :) | 09:23 |
-!- wiking [~wiking@huwico/staff/wiking] has quit [Quit: wiking] | 09:30 | |
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun | 09:30 | |
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 260 seconds] | 09:35 | |
n4nd0 | sonne|work: oops! is the current build working fine? I just pulled and compiled but the linker is complaining in a lot of places | 10:26 |
n4nd0 | multiple definitions of a lot of methods in shogun::MulticlassMachine | 10:27 |
sonne|work | n4nd0: yes - do a git clean -dfx to erase all files not in the repository (warning...) | 10:28 |
n4nd0 | I deleted .o files in multiclass and it worked | 10:29 |
n4nd0 | thank you :) | 10:29 |
-!- blackburn [5bdfb203@gateway/web/freenode/ip.91.223.178.3] has joined #shogun | 10:34 | |
n4nd0 | sonne|work: so one thing in your commit is the use of (l == r) | 10:35 |
sonne|work | n4nd0: yeah that's sufficient | 10:36 |
n4nd0 | sonne|work: that means they are considered different even if they have the same values, when MahalanobisDistance is instantiated with different CSimpleFeatures objects | 10:36 |
n4nd0 | sonne|work: ah ok, no problem then | 10:36 |
blackburn | sonne|work: I answered ;) | 10:38 |
blackburn | sonne|work: about ocas - it is working | 10:39 |
sonne|work | you fixed it? | 10:39 |
sonne|work | blackburn: ^ | 10:39 |
blackburn | nope, but for simple examples it was ok | 10:40 |
blackburn | I have to check my code | 10:40 |
blackburn | sonne|work: well, the test we have says it is ok | 10:40 |
blackburn | from tester_all I mean | 10:40 |
sonne|work | I had another report from somebody else who also complained that it didn't work | 10:40 |
sonne|work | blackburn: our oversight then | 10:41 |
blackburn | probably, I'll check later | 10:41 |
blackburn | sonne|work: about mc-liblinear - yes it works | 10:43 |
blackburn | I even got better results on my data than with simple OvR liblinear | 10:43 |
sonne|work | so 97 again now? | 10:43 |
blackburn | sonne|work: 96.8 but I didn't do model selection very well :) | 10:46 |
blackburn | pretty good anyway | 10:46 |
blackburn | such an exact homogeneous map works well and I like it a lot :) much better to work in linear spaces | 10:47 |
n4nd0 | sonne|work: I tested the results, they are right | 10:52 |
n4nd0 | sonne|work: it actually makes sense to use the whole data when l != r and take the mean over both distributions; sorry I didn't get you :S | 10:54 |
sonne|work | blackburn: yeah it is really fast later on! | 10:54 |
sonne|work | n4nd0: yeah - I thought it is the same as for the covariance: one should use lhs and rhs for the mean too, if available | 10:55 |
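For reference, the distance under discussion written out (a sketch of the convention described above, not the exact commit semantics): when the lhs set L and the rhs set R differ, the mean and covariance are estimated over their union.

```latex
\[
\mu = \frac{1}{|L| + |R|} \sum_{z \in L \cup R} z, \qquad
d_{\Sigma}(x, y) = \sqrt{(x - y)^{\top} \Sigma^{-1} (x - y)}
\]
```

where \Sigma is the covariance estimated around \mu from the same pooled set.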
blackburn | sonne|work: btw, I've added rejection strategies class | 10:55 |
blackburn | I can't think of any rejection strategy that is not threshold based, but it is ok to keep it modular I think | 10:56 |
sonne|work | add an example to show how it works... | 10:56 |
blackburn | yeah, I'll do it gradually, I'm just in a bit of a rush | 10:56 |
blackburn | sonne|work: rejects are particularly important for me (e.g. actual accuracy can be measured w/o rejects and it should be ~1.0) | 10:58 |
blackburn | I have seen some SVMs that train with a reject option, but it would take time to implement it.. | 10:58 |
sonne|work | it is unclear though if you can gain a lot using this. I would suspect your simple thresholding works well enough for most cases :) | 11:06 |
blackburn | sonne|work: maybe under the assumption that the training set contains vectors to be rejected, which should not turn the hyperplane around ;) | 11:09 |
sonne|work | yeah but you can control that already by giving different Cs to examples | 11:10 |
blackburn | true | 11:11 |
sonne|work | of course you would need to know which examples could be problematic | 11:11 |
sonne|work | probably the ones misclassified in a previous run :D | 11:12 |
blackburn | sonne|work: I had some idea (unrelated to classification) - can you imagine some python object that delegates some ops to lambdas? | 11:12 |
blackburn | some example: | 11:12 |
blackburn | PythonFeatures with get_feature_vector implemented in python | 11:13 |
blackburn | I could not come up with *any* idea of how to get it done.. | 11:13 |
sonne|work | ? | 11:13 |
blackburn | sonne|work: imagine a Features instance with get_feature_vector/get_dim_feature_space/etc set to lambdas | 11:14 |
blackburn | I think it is impossible.. | 11:14 |
blackburn | I mean it could be custom then | 11:14 |
sonne|work | to lambda? | 11:14 |
blackburn | yeah to functions | 11:15 |
sonne|work | I don't understand what you want to say? | 11:15 |
blackburn | e.g. get_feature_vector = lambda x: some-sql-select | 11:15 |
sonne|work | autogenerated features? | 11:15 |
blackburn | no, custom | 11:15 |
sonne|work | formulas | 11:15 |
blackburn | where you can set operations | 11:15 |
sonne|work | custom!?! | 11:15 |
sonne|work | like you provide some python script? | 11:15 |
blackburn | yes-yes | 11:16 |
sonne|work | that's easy | 11:16 |
blackburn | how? | 11:16 |
sonne|work | just overload the get_feature_vector functions etc | 11:16 |
sonne|work | (from python) | 11:16 |
blackburn | really? | 11:16 |
blackburn | will it work?? | 11:16 |
sonne|work | for this to work you have to enable directors for swig though | 11:16 |
blackburn | do you find it useful? I do.. | 11:16 |
sonne|work | well I accidentally did that in the first swig based releases | 11:16 |
sonne|work | things become very slow then | 11:17 |
blackburn | that's bad | 11:17 |
sonne|work | so I would rather want a separate class just for that | 11:17 |
sonne|work | then only this class gets director functionality | 11:17 |
sonne|work | and get/set * can be overridden from $LANG | 11:17 |
blackburn | damn I thought it is impossible | 11:17 |
sonne|work | welcome to swig | 11:18 |
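A minimal Python sketch of the pattern sonne|work describes: subclass a director-enabled class and override its virtuals from Python. FeaturesBase below is a plain-Python stand-in so the sketch runs on its own; which shogun class actually gets director functionality, and its exact method names, are left open above.

```python
import numpy as np

# Stand-in for a SWIG director-enabled base class (hypothetical; with
# directors enabled, a C++ virtual like get_feature_vector could be
# overridden from Python in exactly this way).
class FeaturesBase:
    def get_feature_vector(self, idx):
        raise NotImplementedError

class LambdaFeatures(FeaturesBase):
    """Delegates feature access to a user-supplied function."""
    def __init__(self, fetch, dim):
        self.fetch = fetch  # e.g. lambda idx: row fetched via an SQL select
        self.dim = dim

    def get_dim_feature_space(self):
        return self.dim

    def get_feature_vector(self, idx):
        return np.asarray(self.fetch(idx), dtype=np.float64)

# usage: every access goes through the Python callable
feats = LambdaFeatures(lambda idx: [idx, idx ** 2], dim=2)
print(feats.get_feature_vector(3))  # [3. 9.]
```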
blackburn | sonne|work: another issue (have you 2 mins more?) | 11:18 |
sonne|work | you can overload a C++ method from $LANG | 11:18 |
sonne|work | no | 11:18 |
blackburn | bad, ok then later | 11:18 |
blackburn | hmm nevermind, useless suggestion (I thought of integrating lapack into shogun code) | 11:19 |
n4nd0 | blackburn: hey there! hope you are not too angry after the results in the elections | 11:30 |
n4nd0 | blackburn: I wanted to ask you one thing about QDA | 11:33 |
n4nd0 | blackburn: LDA in shogun is implemented with regularization, so I suppose that we are interested in regularized QDA, right? | 11:33 |
blackburn | n4nd0: not angry at all - let these people live with this guy ;) | 11:35 |
blackburn | n4nd0: is the regularization there something like X + delta*I? | 11:35 |
n4nd0 | blackburn: do you mean in QDA or LDA? | 11:36 |
blackburn | both? :) | 11:37 |
blackburn | I just don't know what is the regularization there | 11:37 |
blackburn | as for your question - I just meant that it would possibly be pretty easy to make it regularized | 11:37 |
blackburn | or not? | 11:37 |
n4nd0 | I am not really sure right now | 11:38 |
n4nd0 | I am still reading documentation about it | 11:38 |
n4nd0 | but it seems to me that the method changes more than just a little when regularization is used | 11:39 |
blackburn | really? | 11:41 |
blackburn | n4nd0: I think the easiest way is to implement it just as it is in scikits ;) | 11:42 |
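The "X + delta*I" question above refers to covariance shrinkage; one common form (and, as far as I can tell, essentially what scikit-learn's QDA reg_param does; treat that as an assumption) is:

```latex
\[
\hat{\Sigma}_k(\gamma) = (1 - \gamma)\, \hat{\Sigma}_k + \gamma I,
\qquad \gamma \in [0, 1]
\]
```

which keeps each per-class covariance invertible even when the class has few samples.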
n4nd0 | blackburn: haha ok | 11:42 |
n4nd0 | I took a look there | 11:43 |
n4nd0 | but I didn't find documentation about how they do it | 11:44 |
n4nd0 | there is an example showing a couple of plots, and the code of course | 11:44 |
blackburn | looks pretty straightforward.. | 11:49 |
blackburn | what makes you unhappy? ;) | 11:49 |
n4nd0 | nothing :P | 11:53 |
-!- sonne|work [~sonnenbu@194.78.35.195] has quit [Ping timeout: 276 seconds] | 11:53 | |
blackburn | oh we lost colonel sonnenburg | 11:54 |
blackburn | :D | 11:54 |
blackburn | n4nd0: http://s1-05.twitpicproxy.com/photos/large/531249569.png?key=890213 | 11:57 |
blackburn | it is for real ;) | 11:59 |
n4nd0 | oh | 12:00 |
n4nd0 | I saw some percentages but they were not that high | 12:01 |
n4nd0 | I saw something like 60-something percent for Putin out of around 70 percent total turnout | 12:01 |
blackburn | ah it is in chechnya | 12:01 |
blackburn | local region | 12:01 |
n4nd0 | haha it is a big local region | 12:02 |
n4nd0 | it could almost be the capital of Sweden in terms of population | 12:02 |
blackburn | small republic | 12:02 |
n4nd0 | I am guessing those numbers in black are # voters | 12:03 |
blackburn | yes | 12:03 |
n4nd0 | ah fuck I didn't recognize the name at first sight | 12:03 |
n4nd0 | I know it as "Chechenia" | 12:03 |
n4nd0 | it is how we pronounce it in Spanish | 12:03 |
blackburn | there was a war as you may probably know :) | 12:04 |
n4nd0 | yeah, that's why I remember the name | 12:05 |
n4nd0 | it appeared a lot in the news | 12:05 |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 12:09 | |
-!- vikram360 [~vikram360@117.192.190.106] has joined #shogun | 12:16 | |
vikram360 | blackburn : and putin wins | 12:16 |
blackburn | surprise? | 12:17 |
blackburn | :) | 12:17 |
-!- n4nd0 [~nando@s83-179-44-135.cust.tele2.se] has quit [Ping timeout: 252 seconds] | 12:31 | |
vikram360 | nope.. but the media seems to be having a field day. 3000 official complaints about the voting. | 12:35 |
sonne|work | blackburn: isn't QDA the same as LDA but just on quadratic features? | 13:17 |
blackburn | sonne|work: what are quadratic features? | 13:19 |
sonne|work | all monomials of degree 2 | 13:19 |
blackburn | you probably know better? ;) | 13:20 |
sonne|work | x_1*x_2 x_1^2 x_2^2 | 13:20 |
sonne|work | for 2d input vectors | 13:20 |
blackburn | sonne|work: well, we don't have such features, do we? | 13:20 |
sonne|work | polynomialdotfeatures? | 13:20 |
sonne|work | or sth? | 13:20 |
sonne|work | PolyFeatures | 13:21 |
blackburn | ah | 13:21 |
blackburn | sonne|work: well I don't know then, do you think QDA is useless? | 13:21 |
sonne|work | anyway it makes sense to make things explicit, i.e., if it is the same, use LDA on simplefeatures? | 13:22 |
sonne|work | err, implement QDA on simplefeatures by using PolyFeatures internally | 13:23 |
blackburn | yeah i got it | 13:23 |
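Spelled out for 2-d inputs, the expansion under discussion (whether the degree-1 terms are kept depends on the feature class; this is just the generic degree-2 map):

```latex
\[
\phi(x_1, x_2) = \left( x_1,\; x_2,\; x_1^2,\; x_2^2,\; x_1 x_2 \right)
\]
```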
-!- vikram360 [~vikram360@117.192.190.106] has quit [Read error: Connection reset by peer] | 13:24 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has quit [Ping timeout: 276 seconds] | 13:30 | |
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun | 13:38 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 13:45 | |
blackburn | sonne|work: it took me one year to finally understand how features work in shogun hahahah | 13:47 |
sonne|work | blackburn: anyway better check that LDA on squared features is QDA - could be that there is something else to it :) | 14:04 |
blackburn | sonne|work: n4ndo will do probably ;) | 14:04 |
blackburn | sonne|work: I have seen an interesting thing in your talk | 14:04 |
blackburn | optimizing svm with auprc | 14:04 |
blackburn | did you try to train svm this way? | 14:06 |
sonne|work | blackburn: doesn't help - look at T. Joachims' paper (best paper award at ICML) - gives you like 0.00000001% :) | 14:06 |
sonne|work | blackburn: which talk? | 14:06 |
blackburn | sonne|work: http://sonnenburgs.de/soeren/talks/2006-05-02-perf-measures.pdf | 14:06 |
sonne|work | ohh that crap | 14:07 |
sonne|work | probably all wrong | 14:07 |
blackburn | ah I see | 14:07 |
blackburn | :D | 14:07 |
sonne|work | I guess best is to look at this one page in my thesis - there are all the perf measures I know of (and in shogun) in there | 14:07 |
blackburn | I was interested in svm on last page | 14:08 |
blackburn | sonne|work: I can't understand why in mc svms there is (<w_n,x>+b_n) - (<w_m,x>+b_m) >= 1 - \xi_m | 14:08 |
sonne|work | yeah it is some paper by Thorsten Joachims doing that fast but it doesn't help | 14:08 |
blackburn | why 1? :) | 14:08 |
sonne|work | margin fixed to 1 | 14:08 |
sonne|work | like in svm | 14:08 |
blackburn | I am thinking about ECOC training of svm | 14:09 |
sonne|work | like in mc-svm, like in structured output learning | 14:09 |
blackburn | and don't know how to formulate this boundary | 14:09 |
blackburn | something makes me think there won't be 1 :) | 14:09 |
sonne|work | in words: f(good_x) - f(other_x) > 1 - sth | 14:10 |
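The constraint in its usual multiclass SVM form; the 1 is just a normalization, since jointly rescaling all (w, b) rescales the margin, so it can be fixed to 1 without loss of generality:

```latex
\[
(\langle w_{y_i}, x_i \rangle + b_{y_i}) - (\langle w_m, x_i \rangle + b_m)
\;\ge\; 1 - \xi_i
\qquad \text{for all } m \neq y_i, \quad \xi_i \ge 0
\]
```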
sonne|work | anyway back to work | 14:10 |
blackburn | it looks like you work in an iron mine | 14:10 |
blackburn | :) | 14:11 |
-!- faktotum [~cronor@fb.ml.tu-berlin.de] has joined #shogun | 14:45 | |
faktotum | Hello! | 14:46 |
blackburn | hi | 14:46 |
faktotum | is there a way to set a custom sparse kernel? | 14:46 |
faktotum | I know there are sparse kernels and that you can set custom kernels. but how do you set custom sparse kernels? | 14:46 |
faktotum | I'm using the python module, if that is of interest | 14:46 |
blackburn | I see.. I guess it is not yet implemented | 14:46 |
blackburn | but I think it is pretty straightforward to implement | 14:47 |
faktotum | my current workaround is to do a Cholesky decomposition K = LL* and then use L as sparse features, but that is not tractable with bigger matrices | 14:47 |
blackburn | I am not sure I understood why you do Cholesky | 14:48 |
sonne|work | faktotum: sounds like an easy task to add - patches welcome :) | 14:48 |
faktotum | i will try it tonight | 14:48 |
faktotum | blackburn: if I have K = LL* I can set my sparse features to L and then use a linear kernel. then I would end up with K as a custom sparse kernel | 14:50 |
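faktotum's workaround as a minimal NumPy sketch (assumes K is symmetric positive definite so the factorization exists; the 15k x 15k case mentioned below is where this dense version stops being practical):

```python
import numpy as np

# Build a small symmetric positive definite kernel matrix for the demo.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
K = A @ A.T + 1e-6 * np.eye(5)

# Factor K = L L^T; the rows of L then serve as explicit features,
# and a plain linear kernel on them reproduces K exactly.
L = np.linalg.cholesky(K)
assert np.allclose(K, L @ L.T)
```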
blackburn | whoa I see | 14:50 |
sonne|work | faktotum: depending on how sparse things are you could just use SGSparseMatrix | 14:51 |
sonne|work | but it is not fast enough I guess.. | 14:51 |
faktotum | oh that was my idea, why is it not fast enough? | 14:53 |
blackburn | faktotum: it would probably be slow in terms of checking whether k_ij is zero | 15:02 |
faktotum | ok, but doesn't the kernel created from sparse real features have the same problem? | 15:04 |
blackburn | it has as well.. | 15:05 |
blackburn | I guess some hash map should be used here | 15:05 |
blackburn | faktotum: anyway, Cholesky of, say, a 4000x4000 matrix is pretty slow ;) | 15:17 |
faktotum | ha! don't try 15k x 15k! | 15:23 |
blackburn | 15k x 15k?! | 15:23 |
blackburn | that probably takes a lot of memory :) | 15:24 |
faktotum | chompack has a sparse cholesky decomposition implemented | 15:24 |
blackburn | ah | 15:24 |
blackburn | I guess dense 15K would never finish | 15:25 |
-!- vikram360 [~vikram360@117.192.190.106] has joined #shogun | 16:35 | |
-!- blackburn [5bdfb203@gateway/web/freenode/ip.91.223.178.3] has quit [Quit: Page closed] | 16:38 | |
sonne|work | faktotum: maybe it is good enough: basically finding the kernel row is fast but not finding the column | 16:43 |
sonne|work | if it is really sparse, some kind of hashmap of tuples or whatever could be faster... | 16:44 |
sonne|work | but a lot of overhead then | 16:44 |
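The hashmap-of-tuples idea sketched in plain Python (one possible layout, not an existing shogun structure): store only nonzero entries keyed by (i, j), exploiting symmetry.

```python
# Sparse symmetric kernel as a dict; absent entries are implicitly zero.
# Store only i <= j and mirror on lookup.
entries = {(0, 0): 1.0, (0, 3): 0.2, (2, 2): 1.0}

def kernel(i, j):
    key = (i, j) if i <= j else (j, i)
    return entries.get(key, 0.0)

print(kernel(3, 0))  # 0.2, found via symmetry
print(kernel(1, 2))  # 0.0, never stored
```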
sonne|work | faktotum: so please go ahead with the sparse matrix idea - should do the job | 16:49 |
-!- in3xes [~in3xes@180.149.49.227] has joined #shogun | 16:50 | |
-!- cronor [~cronor@141.23.80.206] has joined #shogun | 16:52 | |
-!- faktotum [~cronor@fb.ml.tu-berlin.de] has quit [Ping timeout: 260 seconds] | 16:56 | |
-!- cronor [~cronor@141.23.80.206] has quit [Remote host closed the connection] | 17:00 | |
-!- cronor [~cronor@fb.ml.tu-berlin.de] has joined #shogun | 17:00 | |
-!- cronor [~cronor@fb.ml.tu-berlin.de] has quit [Quit: cronor] | 17:07 | |
-!- cronor [~cronor@fb.ml.tu-berlin.de] has joined #shogun | 17:17 | |
-!- in3xes [~in3xes@180.149.49.227] has quit [Quit: Leaving] | 17:19 | |
-!- cronor_ [~cronor@141.23.80.206] has joined #shogun | 17:21 | |
-!- cronor [~cronor@fb.ml.tu-berlin.de] has quit [Ping timeout: 260 seconds] | 17:23 | |
-!- cronor_ is now known as cronor | 17:23 | |
-!- n4nd0 [~nando@s83-179-44-135.cust.tele2.se] has joined #shogun | 17:28 | |
-!- wiking [~wiking@huwico/staff/wiking] has quit [Remote host closed the connection] | 17:50 | |
-!- wiking [~wiking@huwico/staff/wiking] has joined #shogun | 17:50 | |
vikram360 | I know this is probably a n00b question but there seems to be very little information about it: In what way is the C5.0 algorithm better than the C4.5? | 17:53 |
-!- wiking_ [~wiking@huwico/staff/wiking] has joined #shogun | 18:01 | |
-!- wiking [~wiking@huwico/staff/wiking] has quit [Ping timeout: 244 seconds] | 18:02 | |
-!- wiking_ is now known as wiking | 18:02 | |
-!- cronor [~cronor@141.23.80.206] has quit [Quit: cronor] | 18:52 | |
-!- axitkhurana [~akshit@14.98.55.250] has joined #shogun | 19:29 | |
-!- axitkhurana [~akshit@14.98.55.250] has left #shogun [] | 19:29 | |
-!- blackburn [~qdrgsm@188.168.4.3] has joined #shogun | 19:47 | |
-!- blackburn [~qdrgsm@188.168.4.3] has quit [Quit: Leaving.] | 20:12 | |
@sonney2k | vikram360, it is not clear to me either - all I know is that there were papers showing that it is better... | 22:46 |
@sonney2k | there just was no open source impl. of c5.0 around | 22:46 |
@sonney2k | and for c4.5 only some free-for-academic-use thingy | 22:47 |
@sonney2k | so people tried c4.5 if they could but that's it | 22:48 |
@sonney2k | ahh btw weka has a java version of c4.5 (iirc called J48) that probably has much cleaner code | 22:48 |
n4nd0 | sonney2k: hey! I read before you talked with blackburn about QDA | 23:05 |
n4nd0 | sonney2k: I have been reading into it so I could implement it in shogun | 23:06 |
n4nd0 | sonney2k: but I am not really sure how to relate what I have read about it with what you said before | 23:06 |
n4nd0 | sonney2k: so it seems that QDA and LDA are similar in that they assume that the feature vectors follow a normal distribution, but LDA assumes that the distributions for all the classes have the same covariances while QDA doesn't make that assumption | 23:08 |
n4nd0 | sonney2k: is that right this far? | 23:08 |
@sonney2k | I guess so - at least for LDA, when the cov matrices are considered the same, the problem becomes linear | 23:17 |
n4nd0 | sonney2k: ok, so I understand that | 23:21 |
n4nd0 | sonney2k: but is it then equivalent to use LDA using polynomial features? | 23:21 |
n4nd0 | I mean, can we just make polynomial features from the original ones (e.g. if we have at the beginning x1 and x2, we expand the feature vectors so they also contain x1^2, x2^2 and x1*x2) | 23:23 |
n4nd0 | would solving that with LDA be equivalent to QDA? | 23:23 |
@sonney2k | n4nd0, it must be very close but I am not sure if it is exactly the same | 23:26 |
@sonney2k | best description about LDA/QDA I found is https://onlinecourses.science.psu.edu/stat857/book/export/html/17 | 23:27 |
n4nd0 | sonney2k: cool, thank you very much, I was using this reference http://www.slac.stanford.edu/cgi-wrap/getdoc/slac-pub-4389.pdf | 23:31 |
n4nd0 | I have some trouble when it gets into the regularization part | 23:32 |
@sonney2k | hmmh, seems like QDA / LDA results on quad features differ, but it is always mentioned that one can use it to get a quadratic classifier ... | 23:34 |
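The textbook QDA decision functions make the gap concrete: each class carries its own covariance and a log-determinant term, while LDA on degree-2 features still fits one shared covariance in the expanded space, so it yields a quadratic boundary but not, in general, the same one:

```latex
\[
\delta_k(x) = -\tfrac{1}{2} \log \lvert \Sigma_k \rvert
 - \tfrac{1}{2} (x - \mu_k)^{\top} \Sigma_k^{-1} (x - \mu_k)
 + \log \pi_k
\]
```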
n4nd0 | so do you think it would be interesting to add QDA to shogun? and if so, how? | 23:39 |
n4nd0 | something similar to the LDA that is already implemented, using regularization? | 23:39 |
-!- wiking [~wiking@huwico/staff/wiking] has quit [Quit: wiking] | 23:42 | |
--- Log closed Tue Mar 06 00:00:19 2012 |