--- Log opened Wed Apr 13 00:00:36 2011 | ||
-!- skydiver [c255a037@gateway/web/freenode/ip.194.85.160.55] has quit [Quit: Page closed] | 00:22 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 03:38 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 04:53 | |
-!- oiwah [~oiwah@i114-181-177-143.s04.a013.ap.plala.or.jp] has joined #shogun | 06:13 | |
-!- siddharth_ [~siddharth@117.211.88.150] has quit [Read error: Connection reset by peer] | 06:32 | |
@sonney2k | Hmmhh no news yet | 06:34 |
-!- siddharth [~siddharth@117.211.88.150] has joined #shogun | 06:38 | |
serialhex | you're up early sonney2k | 06:48 |
alesis-novik | sonney2k, I was wondering if you took a look at the PDF function I sent a while back? | 06:56 |
-!- oiwah [~oiwah@i114-181-177-143.s04.a013.ap.plala.or.jp] has quit [Quit: Leaving...] | 07:14 | |
@sonney2k | alesis-novik, I must have missed that email(?) | 07:33 |
@sonney2k | serialhex, too exciting... but no news still | 07:34 |
alesis-novik | sonney2k, I sent an email quite a while back, given that the code had more to do with my potential project than with anything else in shogun | 07:34 |
serialhex | yeah, i'm going to sleep in a few so hopefully there will be news when i wake! | 07:35 |
@sonney2k | alesis-novik, then I certainly must have missed it | 07:35 |
@sonney2k | hmmhh people at google seem to have as long working hours as I do ;-) | 07:40 |
alesis-novik | Well, going to sleep | 07:41 |
@sonney2k | alesis-novik, what was the subject of your email? | 07:41 |
alesis-novik | The chain was "GSoC and Shogun questions" | 07:41 |
@sonney2k | alesis-novik, hmmhh I just have one email from you then - no pdfs though | 07:43 |
@sonney2k | wait now | 07:43 |
@sonney2k | found | 07:43 |
@sonney2k | alesis-novik, want to hear comments now or go to bed? | 07:44 |
alesis-novik | bed can wait | 07:44 |
@sonney2k | Could you derive your pdf from CDistribution? | 07:47 |
@sonney2k | was used for discrete data but still... | 07:48 |
@sonney2k | if you do this and implement the interface then e.g. computing the fisher / top kernel from your pdf becomes possible | 07:49 |
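The Fisher kernel idea mentioned here can be illustrated in a few lines: take the gradient of the log-density with respect to the model parameters (the Fisher score) and use its inner product as the kernel. A toy 1-D Gaussian sketch in plain Python, with the Fisher information matrix taken as the identity for simplicity; names are illustrative, not shogun code:

```python
def fisher_score(x, mean, var):
    """Gradient of log N(x; mean, var) w.r.t. (mean, var).

    d/dmean log p = (x - mean) / var
    d/dvar  log p = ((x - mean)^2 - var) / (2 var^2)
    """
    d = x - mean
    return (d / var, (d * d - var) / (2.0 * var * var))

def fisher_kernel(x, y, mean=0.0, var=1.0):
    # inner product of Fisher scores; the Fisher information matrix
    # is approximated by the identity here for simplicity
    ux = fisher_score(x, mean, var)
    uy = fisher_score(y, mean, var)
    return ux[0] * uy[0] + ux[1] * uy[1]

print(fisher_kernel(1.0, 1.0))  # score (1, 0) with itself -> 1.0
```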
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has joined #shogun | 07:50 | |
alesis-novik | I'm not sure how the Gaussian PDF should build on CDistribution | 07:51 |
@sonney2k | http://www.shogun-toolbox.org/doc/classshogun_1_1CDistribution.html | 07:53 |
@sonney2k | I mean if I understand correctly you assume the covariance matrix / mean to be given | 07:53 |
@sonney2k | and you supply a function to compute p(x) | 07:53 |
@sonney2k | compute_PDF | 07:54 |
alesis-novik | yes | 07:54 |
@sonney2k | that would be the get_likelihood_example one in CDistribution | 07:54 |
@sonney2k | CFeatures would then be some CDotFeatures and you would have to call get_feature_vector to actually get the vector | 07:55 |
@sonney2k | returning model parameters is not so difficult either | 07:55 |
@sonney2k | just the covariance matrix and mean flattened | 07:55 |
@sonney2k | (I mean as long vector) | 07:55 |
@sonney2k | and then only derivatives and you are done | 07:56 |
@sonney2k | (ok training is missing but if you assume that it is done outside - it is ok) | 07:56 |
@sonney2k | at least for now. | 07:58 |
@sonney2k | alesis-novik, patches in this direction are highly welcome :) | 07:59 |
alesis-novik | ok, I'll adapt that tomorrow. So then I don't really need the general PDF class | 07:59 |
@sonney2k | yes - sounds good | 07:59 |
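A rough sketch of what such a CDistribution-style Gaussian could look like, in plain Python rather than shogun's C++ (all names hypothetical; the mean and covariance are taken as given, matching the discussion above, and only the 2-D case is handled):

```python
import math

class GaussianPDF:
    """Toy stand-in for a CDistribution-derived Gaussian (2-D case).

    Mean and covariance are assumed given (no training), as discussed
    above; the method names only mimic the CDistribution interface.
    """

    def __init__(self, mean, cov):
        self.mean = mean          # [m1, m2]
        self.cov = cov            # [[a, b], [b, c]], symmetric 2x2
        a, b, c = cov[0][0], cov[0][1], cov[1][1]
        self.det = a * c - b * b
        assert a > 0 and self.det > 0, "covariance must be positive definite"
        # inverse of a symmetric 2x2 matrix
        self.inv = [[c / self.det, -b / self.det],
                    [-b / self.det, a / self.det]]

    def get_model_parameters(self):
        # covariance matrix and mean flattened into one long vector
        return [v for row in self.cov for v in row] + list(self.mean)

    def get_log_likelihood_example(self, x):
        # log p(x) for a single feature vector x
        d = [x[0] - self.mean[0], x[1] - self.mean[1]]
        quad = sum(d[i] * self.inv[i][j] * d[j]
                   for i in range(2) for j in range(2))
        return -0.5 * (quad + math.log(self.det) + 2 * math.log(2 * math.pi))

p = GaussianPDF([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
print(round(p.get_log_likelihood_example([0.0, 0.0]), 4))
```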
yayo3 | by the way | 08:06 |
yayo3 | shogun is supposed to include logistic regression, according to feature matrix from web page | 08:07 |
yayo3 | but I can't find it. there must be some trick :) | 08:07 |
yayo3 | (or I'm blind, anyway) | 08:07 |
alesis-novik | When in doubt, grep it | 08:08 |
alesis-novik | Anyway, goodnight | 08:08 |
serialhex | sleepz iz teh r0x0rz!!! nite all! | 08:13 |
@sonney2k | serialhex, good nite | 08:14 |
@sonney2k | yayo3, it is hidden in liblinear | 08:15 |
@sonney2k | you need to use the logistic loss then | 08:15 |
yayo3 | yeah I used the grep :) but LR can predict probability of each class, and I don't think there's an interface for that. at least I can't find one | 08:32 |
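What yayo3 is after, class probabilities from a logistic-loss linear model, is just the sigmoid of the raw linear output; a minimal stand-alone illustration (hypothetical trained weights, not shogun's interface):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical trained linear model: the raw output is w.x + b
w, b = [2.0, -1.0], 0.5

def predict_proba(x):
    # P(y = +1 | x) under the logistic model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(score)

print(predict_proba([0.0, 0.5]))  # score is 0, so probability is 0.5
```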
yayo3 | anyway I'm off to school. see ya | 08:32 |
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has left #shogun [] | 08:32 | |
-!- blackburn [~qdrgsm@188.168.4.116] has joined #shogun | 08:36 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 09:05 | |
blackburn | sonney2k: waiting for ya :) | 09:06 |
@sonney2k | blackburn, morning! | 09:06 |
CIA-8 | shogun: Soeren Sonnenburg master * r6e53c49 / src/libshogun/kernel/InverseMultiQuadricKernel.cpp : don't assert anything when registering parameters - http://bit.ly/dWPVrw | 09:07 |
CIA-8 | shogun: Soeren Sonnenburg master * rc010bae / (4 files in 2 dirs): Introduce Multiquadric kernel. Thanks Joanna Stocka for the patch. - http://bit.ly/i8TFEo | 09:07 |
CIA-8 | shogun: Soeren Sonnenburg master * rb0e729e / src/libshogun/kernel/Kernel.cpp : Fix duplicate case entry for MultiQuadric Kernel - http://bit.ly/fVy4s6 | 09:07 |
CIA-8 | shogun: Soeren Sonnenburg master * r388d634 / (17 files): minor code reshuffling and verious coding style fixes in the recently added kernels - http://bit.ly/eXS1KE | 09:07 |
blackburn | Morning :) | 09:07 |
@sonney2k | blackburn, If you'd like to know sth about #slots - no news yet | 09:07 |
-!- Tanmoy [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 09:08 | |
@sonney2k | Hmmhh I had to do quite some more polishing of all the new kernels - reviewing code is quite some work (I know now) | 09:09 |
blackburn | I see, commit of 17 files :D | 09:09 |
@sonney2k | anyways - only very minor stuff | 09:09 |
blackburn | how many new kernels do you have now? | 09:10 |
@sonney2k | I lost track | 09:10 |
@sonney2k | and actually until we have examples in all languages - I won't bother counting | 09:10 |
@sonney2k | these examples are used as test-suite btw (at least for the python interfaces) | 09:11 |
@sonney2k | (which makes life easy - just write one example and the testsuite will use it :) | 09:12 |
Tanmoy | sonney2k, I was just looking at the list of kernels; many of them follow from here | 09:12 |
Tanmoy | http://crsouza.blogspot.com/2010/03/kernel-functions-for-machine-learning.html#circular | 09:12 |
Tanmoy | with a few exceptions | 09:12 |
@sonney2k | Tanmoy, yes - blackburn's fault :D | 09:12 |
@bettyboo | he!! sonney2k | 09:12 |
Tanmoy | ill send in a list of mine | 09:13 |
Tanmoy | though some of them need to be checked... had some doubts about the multiquadric and circular kernels | 09:13 |
blackburn | eh? | 09:13 |
@sonney2k | similarly easy would be to program distributions derived from CDistributions and also performance measures like accuracy etc | 09:14 |
@sonney2k | blackburn, no don't look so innocent - I mean this in a strictly positive way :) | 09:15 |
blackburn | not quite sure what I did wrong :) | 09:15 |
@sonney2k | Tanmoy, I think some of them are actually not positive | 09:15 |
@sonney2k | but hey - I have been using SVMs for 10 years now and never needed them... well, I didn't know that I need them. | 09:16 |
-!- skydiver [c255a037@gateway/web/freenode/ip.194.85.160.55] has joined #shogun | 09:16 | |
blackburn | ah, you mean that some kernels are 'bad'? sure it is | 09:16 |
@sonney2k | blackburn, well they are not positive definite | 09:16 |
@sonney2k | but empirically they probably still are | 09:16 |
Tanmoy | well, theoretically, should they be +ve def or need they not be? | 09:17 |
blackburn | aha, Mercer's theorem, etc | 09:17 |
@sonney2k | blackburn, yes... an easy way to check is - just to generate some random vectors and then compute eigenvalues - checking if they are positive... | 09:17 |
Tanmoy | x^T A x | 09:18 |
blackburn | sonney2k: btw, about tests, may be I'll write it now | 09:18 |
@sonney2k | (I had negative ones even with gaussian kernels - welcome to numerics) | 09:18 |
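The empirical check described above can also be done without an eigensolver by testing the quadratic form x^T K x directly on random coefficient vectors: a clearly negative value proves the Gram matrix is not positive semi-definite, while non-negative values are only evidence, not proof. A sketch in plain Python (helper names are hypothetical):

```python
import math
import random

def gaussian_kernel(a, b, width=1.0):
    d = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d / width)

def quadratic_form_check(kernel, dim=3, n=8, trials=200, seed=0):
    """Return the smallest x^T K x seen over random coefficient vectors x.

    K is the Gram matrix of `kernel` on n random points. If the minimum
    is clearly negative, the kernel cannot be positive semi-definite;
    a non-negative minimum is only evidence, not a proof.
    """
    rng = random.Random(seed)
    pts = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
    K = [[kernel(p, q) for q in pts] for p in pts]
    worst = float("inf")
    for _ in range(trials):
        x = [rng.gauss(0, 1) for _ in range(n)]
        qf = sum(x[i] * K[i][j] * x[j] for i in range(n) for j in range(n))
        worst = min(worst, qf)
    return worst

print(quadratic_form_check(gaussian_kernel))  # should not be clearly negative
```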
@sonney2k | blackburn, for the new kernels or what? | 09:18 |
blackburn | yeap, new kernels | 09:18 |
@sonney2k | blackburn, that would be nice | 09:18 |
Tanmoy | i was thinking of extending kernels for Graphs | 09:18 |
Tanmoy | like diffusion kernel | 09:18 |
blackburn | sonney2k: examples/undocumented/ , right? | 09:18 |
blackburn | but only for python :D | 09:19 |
@sonney2k | blackburn, yes - put a *short* description in examples/description - it is automatically prepended | 09:19 |
@sonney2k | blackburn, yeah python / python_modular is sufficient | 09:19 |
@sonney2k | that will test both interfaces (static and modular) and that is sufficient to show that it works correctly | 09:20 |
blackburn | sonney2k: eh, static? | 09:20 |
blackburn | thought that we don't have static for our new kernels | 09:20 |
blackburn | only siddharth wrote one | 09:21 |
@sonney2k | blackburn, configure --interfaces=libshogun,libshogunui,python for example | 09:21 |
@sonney2k | blackburn, I can add that in 15 minutes or so | 09:21 |
blackburn | ok | 09:21 |
blackburn | 310-450, so we have 15 new kernels now | 09:22 |
blackburn | 1/15 :D | 09:27 |
blackburn | sonney2k: btw, is it good practice? CWaveletKernel(CDotFeatures* l, CDotFeatures* r, int32_t size,float64_t Wdilation, float64_t Wtranslation); | 09:27 |
blackburn | int32_t size | 09:28 |
blackburn | not sure why I should set the cache size when using it in python | 09:28 |
blackburn | 4 tested so far.. | 09:37 |
-!- Tanmoy [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 09:59 | |
@sonney2k | blackburn, true - not needed | 10:05 |
@sonney2k | remove that cache size arg | 10:05 |
blackburn | okay | 10:05 |
blackburn | sonney2k: will it be okay if I commit only the modular tests this time? | 10:08 |
@sonney2k | blackburn, sure | 10:08 |
@sonney2k | blackburn, I am already grateful that you are doing this | 10:08 |
blackburn | qdrgsm@blackburn-R519:~/Documents/GSoC-SHOGUN/shogun_myfork/shogun/examples/undocumented/python_modular$ python kernel_exponential_modular.py | 10:08 |
blackburn | Exponential | 10:08 |
blackburn | Segmentation fault | 10:08 |
@sonney2k | heh :) | 10:09 |
@bettyboo | yeah ;D | 10:09 |
@sonney2k | so serialization doesn't work or so | 10:09 |
blackburn | sonney2k: there is no distance in CExponentialKernel constructors | 10:10 |
blackburn | but for some reason it is used | 10:10 |
@sonney2k | blackburn, I am not even sure if any of these 'kernels' is a valid kernel when the distance is not euclidean | 10:10 |
@sonney2k | but hey - do you want to check or shall I | 10:11 |
@sonney2k | ? | 10:11 |
blackburn | sonney2k: sure it will not be, e.g. with the hamming distance | 10:11 |
blackburn | sonney2k: I'm talking about the code of ExponentialKernel, not about its math | 10:11 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun | 10:11 | |
blackburn | sonney2k: I think it will be valid in the case of the Minkowski metric, but not sure at all | 10:12 |
@sonney2k | blackburn, I think we also have sth like kerneldistance that tries to make a distance from a kernel - so one can do this infinitely long :D | 10:14 |
blackburn | anyway, I will fix ExponentialKernel because it doesn't work :) | 10:14 |
blackburn | WHAT THE F--K | 10:18 |
blackburn | CExponentialKernel::CExponentialKernel(int32_t size, float64_t w) | 10:18 |
blackburn | : CDotKernel(size) | 10:18 |
blackburn | { | 10:18 |
blackburn | init(); | 10:18 |
blackburn | width=w; | 10:18 |
blackburn | ASSERT(distance); | 10:18 |
blackburn | SG_REF(distance); | 10:18 |
blackburn | } | 10:18 |
blackburn | why do it asserting and referencing NULL?! | 10:18 |
-!- blackburn was kicked from #shogun by bettyboo [flood] | 10:18 | |
-!- blackburn [~qdrgsm@188.168.4.116] has joined #shogun | 10:18 | |
@sonney2k | blackburn, yeah ok... bug | 10:20 |
blackburn | sonney2k: not yours, I think :) | 10:21 |
blackburn | sonney2k: removed that constructor | 10:21 |
blackburn | sonney2k: 22 forks so far | 10:24 |
@sonney2k | let's see if this trend continues once we know the number of slots | 10:25 |
blackburn | sonney2k: btw, may be later we should refactor these kernels to not use distance | 10:26 |
blackburn | it seems to be a hard job proving its 'Mercerness' :D for all the distances | 10:26 |
@sonney2k | maybe... but hey these kernels are so exotic... if someone uses them he should really know what he is doing | 10:27 |
blackburn | hehe | 10:28 |
blackburn | i broke something.. | 10:28 |
@sonney2k | so I'm not the only one who can do that :D | 10:28 |
blackburn | ImportError: /usr/local/lib/python2.6/dist-packages/shogun/_Kernel.so: undefined symbol: _ZN6shogun18CExponentialKernelC1Eid | 10:28 |
blackburn | ehhh | 10:30 |
@sonney2k | forgot make install? | 10:30 |
@sonney2k | (or setting LD_LIBRARY_PATH) | 10:30 |
blackburn | nope | 10:30 |
blackburn | hehe | 10:30 |
@bettyboo | yeah ;D | 10:30 |
blackburn | my kernels become unavailable | 10:30 |
blackburn | hmm | 10:31 |
blackburn | my shogun became unavailable, funny | 10:31 |
blackburn | sonney2k: what LD_LIBRARY_PATH is? | 10:32 |
@sonney2k | blackburn, the path the dynamic linker searches for libraries (like libshogun.so*) | 10:32 |
blackburn | ah | 10:33 |
blackburn | seems it is working | 10:33 |
blackburn | but shogun is not, for some reason :) | 10:33 |
blackburn | will try to reconfig it and make another install | 10:34 |
blackburn | sonney2k: why haven't the googlers informed you about the slots? | 10:35 |
@sonney2k | well they have 175 orgs to go through | 10:35 |
@sonney2k | that takes time | 10:35 |
blackburn | ah | 10:35 |
blackburn | when will they? today or maybe tomorrow? | 10:35 |
@sonney2k | when they are done | 10:36 |
blackburn | ok :) | 10:36 |
@sonney2k | this is the official statement over at #gsoc: <gsocbot> sonney2k: "slots" is Slot allocation is done manually by Chris DiBona and Carol Smith, be a good org, play nice on the mentor list and #gsoc, ask for a non-crazy-high number of slots, and you'll probably get what you ask. Note that non-crazy-high for new orgs is around 1 or 2. | 10:36 |
blackburn | ahaha | 10:37 |
blackburn | 1 or 2, nice, with 60 candidates | 10:37 |
@sonney2k | I mean they will see that and they will see that we have about 15 mentors assigned so they might be nice to us *I hope* | 10:38 |
blackburn | sonney2k: if they give you 2 slots, ask them to choose which two :D | 10:38 |
@sonney2k | but we have to understand them: I mean we are a new org and they just have no data about us so we could potentially screw up completely | 10:40 |
blackburn | sonney2k: yeap | 10:40 |
blackburn | but there are a few reasons to believe in you too | 10:40 |
* sonney2k wonders what these could be | 10:41 | |
blackburn | e.g. you really have a choice with so many candidates available | 10:41 |
blackburn | and I think your mentors have some authority (more than some php-ers from some open-source project), but I don't know at all | 10:43 |
blackburn | just my (may be silly) opinion | 10:43 |
heiko | hello | 10:44 |
@sonney2k | we are all above 18 too ;-) | 10:44 |
@sonney2k | hi | 10:44 |
blackburn | sonney2k: haha, all possible gsocers are | 10:45 |
heiko | sonney2k: i would like to talk about the cross-val framework a bit, do you have a moment? | 10:45 |
@sonney2k | shoot | 10:45 |
heiko | ok, as you mentioned there at least two new classes needed: | 10:46 |
heiko | one to store parameters of a particular learning machine, like kernel, data, C-param, etc | 10:46 |
heiko | then there is another one that gets the overall cross-val parameters and generates multiple instances of the first one | 10:47 |
heiko | at last, there has to be a class which gets the overall cross-val params and uses the second class to create multiple machines | 10:48 |
heiko | and runs these in an order based on the cross-val params | 10:48 |
heiko | evaluates them, etc | 10:48 |
@sonney2k | exactly | 10:48 |
heiko | ok | 10:48 |
heiko | and parameters for cross-val are like | 10:48 |
@sonney2k | the problem I see so far is where to draw the line / what is a parameter. | 10:49 |
heiko | ok | 10:49 |
heiko | i thought of: | 10:49 |
@sonney2k | I mean in the usual setup one fixes the data set | 10:50 |
@sonney2k | and then trains on subsets of it | 10:50 |
heiko | fold (n), split type (stratified, normal etc), optimizeby (performance measures), parameters to optimize, and the search type | 10:50 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has left #shogun [] | 10:50 | |
heiko | you mean whether the subset of data that is used for a fold is a parameter? | 10:51 |
@sonney2k | yes, like e.g. a kernel parameter | 10:52 |
blackburn | sonney2k: can RealFeatures (in python) be used as SimpleFeatures (internally)? | 10:52 |
@sonney2k | I think we should assume to have a big data set that is fixed (data is no parameter) | 10:52 |
@sonney2k | blackburn, in C++ you mean - yes | 10:53 |
heiko | and then the CParameterSetting class just stores indices of the data to use for training/test? | 10:53 |
blackburn | sonney2k: ah, my fault! | 10:53 |
@sonney2k | heiko, yes. | 10:54 |
@sonney2k | heiko, one would need to have specific features for that though that don't exist just yet | 10:54 |
@sonney2k | : | 10:55 |
heiko | specific features? | 10:55 |
@sonney2k | Not sure how to call them... Features that take a feature object as input and then just access vectors (etc) as subset | 10:56 |
@sonney2k | or alternatively the feature framework needs to be extended such that one can set a permutation | 10:56 |
heiko | ah ok I understand | 10:57 |
@sonney2k | otherwise it will not be transparent to the learning algorithm which features to train / predict on | 10:57 |
heiko | do you think the permutation should be done in the feature framework? | 10:57 |
heiko | wouldn't it be better to have, like, a new class that extracts subsets from a feature object? | 10:58 |
@sonney2k | heiko, the problem then is that you have to write it for all kinds of features, like stringfeatures, sparse,simple(aka dense) .... | 10:58 |
heiko | I will have a look, one minute | 10:59 |
blackburn | someone forgot to integrate CircularKernel in Kernel.i :D | 10:59 |
@sonney2k | blackburn, these bastards ;-) | 10:59 |
blackburn | 2 kernels left | 11:00 |
blackburn | all but Exponential and Circular are (at least) working | 11:00 |
@sonney2k | pretty good | 11:02 |
blackburn | but now working | 11:02 |
blackburn | ahhahah | 11:03 |
blackburn | InverseMultiquadric | 11:03 |
blackburn | oh, it seems to be 1/InverseMultiquadric | 11:03 |
blackburn | sonney2k: you now have really exotic kernels :D | 11:03 |
blackburn | sonney2k: may I commit all tests in one? | 11:04 |
@sonney2k | yes | 11:05 |
@sonney2k | http://www.shogun-toolbox.org/doc/classshogun_1_1CFeatures.html | 11:05 |
@sonney2k | heiko, ^^ | 11:06 |
heiko | already there :) | 11:06 |
heiko | I see, the actual data is stored in the subclasses | 11:07 |
@sonney2k | heiko, yes. | 11:07 |
-!- skydiver [c255a037@gateway/web/freenode/ip.194.85.160.55] has quit [Quit: Page closed] | 11:08 | |
heiko | so either an extractor class for each feature type, or one for all, or extend the feature classes | 11:08 |
blackburn | sonney2k: could you name some example of a string kernel (I need a template now to test heiko's kernel)? | 11:09 |
heiko | you are testing my kernel? :) | 11:09 |
blackburn | heiko: I'm writing tests for all new kernels now | 11:09 |
blackburn | including yours :) | 11:10 |
heiko | nice | 11:10 |
@sonney2k | blackburn, I am not sure what heikos kernel does (does it work on DNA?) if so look at the weighteddegreekernel | 11:10 |
@sonney2k | heiko^^? | 11:10 |
blackburn | sonney2k: thank you | 11:10 |
heiko | it does work on any kind of strings | 11:10 |
heiko | alphabet does not play a role | 11:10 |
heiko | also on histograms of SIFT-features in computer-vision :) | 11:11 |
@sonney2k | ok then blackburn will work | 11:11 |
@sonney2k | heh | 11:11 |
blackburn | just in a rush since I need to study numerical methods and want to finish this earlier :) | 11:11 |
blackburn | heiko: you forgot to integrate your kernel in Kernel.i, I will do it now | 11:12 |
heiko | oh did not know, thanks | 11:12 |
@sonney2k | heiko, I have one more idea regarding feature splitting | 11:14 |
heiko | yes? | 11:14 |
blackburn | heiko: what should I use as delta and theta? | 11:14 |
@sonney2k | preprocs | 11:14 |
blackburn | will (1, 2) and (2, 1) be good? | 11:14 |
@sonney2k | heiko, forget it - preprocs cannot change the index | 11:15 |
heiko | yes, but then the kernel will not perform well :) | 11:15 |
blackburn | just say proper :) | 11:15 |
@sonney2k | :) | 11:15 |
heiko | blackburn: 5 and 5 is ok, this strongly depends on the data | 11:15 |
blackburn | okay | 11:16 |
blackburn | 5,5 and 6,6, is it ok? | 11:16 |
heiko | sonney2k, does the index really have to be changed, i mean we are talking about sets | 11:16 |
heiko | blackburn, yes, but as said, depends on data, these are legal at least :) | 11:16 |
@sonney2k | heiko, so I don't see any other option than extending dotfeatures/combinedfeatures/Stringfeatures | 11:16 |
blackburn | heiko: thank you | 11:17 |
@sonney2k | heiko, the problem is that you want to e.g. train an SVM on a certain subset of the data. | 11:17 |
heiko | blackburn, ah one thing i forgot, do not set these larger than your word lengths | 11:17 |
@sonney2k | how does the svm know what this subset is? | 11:17 |
@sonney2k | it cannot... | 11:17 |
heiko | yes I understand | 11:18 |
@sonney2k | I mean there are only 2 options: tell the learning machine which data indices to use or the learning machine always uses all features | 11:18 |
heiko | I always just generated custom kernel matrices and extracted the corresponding lines, but the matrices were precomputed :) | 11:18 |
@sonney2k | then one needs data splits | 11:19 |
@sonney2k | great example showing that this idea won't work for custom kernels | 11:19 |
@sonney2k | however, I guess when you craft a learner that does not need training examples then one simply cannot do any data splitting anyways | 11:21 |
blackburn | oh ich bin müde [I am tired] with all these examples | 11:21 |
@sonney2k | blackburn, daway daway rabotatch [come on, come on, get to work] :) | 11:21 |
@bettyboo | lol ;D | 11:21 |
heiko | ;) | 11:21 |
blackburn | sonney2k: da rabotayu ya [I am working] | 11:21 |
blackburn | sonney2k: vse uzhe dodelal pochti [already finished almost everything] | 11:22 |
blackburn | sonney2k: and you're saying you forgot Russian?! :) | 11:22 |
@sonney2k | heiko, one would then probably use different kernels as parameter inputs | 11:22 |
blackburn | DONE | 11:22 |
@sonney2k | I cannot read latin written russian | 11:22 |
blackburn | а такой можешь? [and can you read this kind?] | 11:23 |
@sonney2k | but I cannot produce these cyrillic letters though | 11:23 |
heiko | brb | 11:24 |
@sonney2k | blackburn, yes | 11:24 |
@sonney2k | blackburn, thanks | 11:24 |
@sonney2k | I will now catch the train to go to work | 11:24 |
blackburn | sonney2k: thanks for ..? | 11:24 |
@sonney2k | for the kernel examples | 11:24 |
blackburn | aha, I will do a pull request | 11:24 |
@sonney2k | it is not a good idea to start work early and then arrive late at the place where you are supposed to work. | 11:25 |
blackburn | :) | 11:25 |
@sonney2k | anyway l8r | 11:26 |
blackburn | see you | 11:26 |
heiko | re | 11:27 |
heiko | good ride | 11:27 |
heiko | hey blackburn, does the current git compile for you? | 11:29 |
blackburn | heiko: seems so, but I'm not quite sure I have all of these last commits | 11:30 |
blackburn | I'll commit my tests and check it | 11:30 |
heiko | my python_modular does not compile | 11:30 |
blackburn | what is the error? | 11:31 |
heiko | no rule to create target Classifier.h, needed by Classifier_wrap.cxx | 11:32 |
blackburn | eh | 11:32 |
blackburn | do you have classifier/Classifier.h file? | 11:32 |
blackburn | you could do | 11:33 |
heiko | i have one | 11:33 |
blackburn | git checkout libshogun/classifier/Classifier.h | 11:33 |
blackburn | hmm | 11:33 |
blackburn | ok, I have up-to-date repo and it is compiling for me | 11:34 |
blackburn | with python_modular enabled | 11:34 |
heiko | mmh | 11:34 |
heiko | I have problems with git anyway, perhaps that is the reason | 11:35 |
heiko | git fetch upstream just does not update my local copy | 11:35 |
blackburn | ah | 11:35 |
blackburn | did you do | 11:35 |
blackburn | git merge upstream/master | 11:35 |
blackburn | ? | 11:35 |
heiko | yes | 11:36 |
heiko | says already up to date | 11:36 |
blackburn | hehe | 11:36 |
blackburn | strange | 11:36 |
heiko | yes | 11:36 |
heiko | indeed | 11:36 |
heiko | how do you checkout the repo? | 11:37 |
heiko | git clone git@github.com:karlnapf/shogun.git | 11:38 |
heiko | git remote add upstream git clone git@github.com:karlnapf/shogun.git | 11:38 |
blackburn | ehhh | 11:38 |
heiko | i mean git remote add upstream git clone git@github.com:karlnapf/shogun.git | 11:38 |
blackburn | what did you do with this line? | 11:38 |
heiko | argh | 11:38 |
heiko | git remote add upstream git@github.com:karlnapf/shogun.git | 11:38 |
blackburn | you should git remote add upstream https://github.com/shogun-toolbox/shogun.git | 11:38 |
heiko | mmh got the line from github ... i will try | 11:39 |
heiko | ah man, there we go | 11:40 |
blackburn | with that line you add remote for main repo | 11:40 |
heiko | i added my fork as remote | 11:40 |
blackburn | aha | 11:40 |
blackburn | you did | 11:40 |
heiko | well , thanks :) | 11:40 |
blackburn | np :) | 11:40 |
@bettyboo | blackburn, he | 11:40 |
blackburn | bettyboo: make doc | 11:40 |
@bettyboo | blackburn: over 60 students applied so far - there is now way to make a fair ranking - and we seriously cannot handly more than 10 of you :`-( | 11:40 |
blackburn | bettyboo: will I go thru? | 11:41 |
@bettyboo | blackburn: however, I guess when you craft a learner that does not need training examples then one simply cannot do any data splitting anyways | 11:41 |
blackburn | bettyboo: do you think so? | 11:41 |
@bettyboo | blackburn: why do you think so? | 11:41 |
blackburn | hehehe | 11:41 |
heiko | hey bettyboo, are you also a mentor ? | 11:46 |
@bettyboo | heiko: since you have so many people interested, you could organize a mentoring project outside the GSoC framework | 11:46 |
heiko | what? | 11:47 |
heiko | since I have so many people interested? | 11:47 |
blackburn | heiko: it is a bot :) | 11:47 |
heiko | oh, hehe :) | 11:48 |
heiko | it just succeded the turing test :D | 11:48 |
heiko | bettyboo, what is your purpose? | 11:48 |
@bettyboo | heiko: @sonney2k well the reason after scanning thru wat i felt is because the sparsity DS which u have defined is strong | 11:48 |
blackburn | bettyboo: SVM kNN sparse dense int main | 11:49 |
@bettyboo | blackburn: SVMs up until 97 from what I remember dealt with STRICTLY linear seperable data | 11:49 |
blackburn | bettyboo: SVM data | 11:49 |
@bettyboo | blackburn: the problem is that you want to e.g. train an SVM on a certain subset of the data. | 11:49 |
@bettyboo | sure | 11:49 |
heiko | bettyboo: svms are smarter than you | 11:49 |
@bettyboo | heiko: you dare killing our CIA bot! | 11:49 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has quit [Quit: Leaving.] | 11:52 | |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun | 11:52 | |
@sonney2k | Seems like slot notifications are available; too bad I don't have access to melange right now. | 11:56 |
blackburn | crap! | 11:56 |
blackburn | sonney2k: when will you have access to it? :) | 11:58 |
@sonney2k | I am just reading on # gsoc that other new orgs got 2 slots though they had 125 applications | 11:59 |
heiko | oh mann | 12:00 |
blackburn | fck | 12:00 |
siddharth | zzzz....sad | 12:01 |
siddharth | maybe they had less mentors | 12:02 |
@sonney2k | 5 slots | 12:02 |
@sonney2k | puh at least | 12:02 |
blackburn | sonney2k: you have 5 slots? | 12:02 |
@sonney2k | better than 2 right - seems to be an exceptionally high number for new orgs | 12:03 |
@sonney2k | yes | 12:03 |
blackburn | oh | 12:03 |
blackburn | sonney2k: IIRC you may request some more if you want | 12:03 |
blackburn | :D | 12:03 |
siddharth | yeah :P | 12:03 |
heiko | hehe :) | 12:04 |
@sonney2k | is 7% acceptance rate not a good deal o_O ? | 12:04 |
blackburn | поехали [here we go] | 12:05 |
blackburn | :D | 12:05 |
@sonney2k | with only 5 slots to choose from we have to be dead sure that the students that get it are 100% successful. If they are we might get more slots next year | 12:05 |
@sonney2k | I guess that is what google wants | 12:06 |
* blackburn wonders what to do to rank higher :) | 12:07 | |
* siddharth will have to finish SGD-QN fast :P | 12:08 | |
blackburn | sonney2k: is it a good practice to start working on our projects | 12:09 |
blackburn | ? | 12:09 |
@sonney2k | next monday the mentors will have a phone conf deciding about candidates | 12:09 |
blackburn | it seems no, but asking :) | 12:09 |
@sonney2k | blackburn, it is a good thing if you intend to finish a usable chunk of it no matter whether you get a slot from google or not | 12:10 |
blackburn | sonney2k: okay, will not do it, better improve something | 12:11 |
@sonney2k | in the end google is doing this to get long term contributors into the projects | 12:11 |
@sonney2k | blackburn, have a look at performance measures if you like | 12:11 |
@sonney2k | the computation of the auPRC is highly suboptimal | 12:11 |
blackburn | sonney2k: okay, will do it if have sufficient time, thank you | 12:11 |
heiko | any other suboptimal stuff ? :) | 12:12 |
@sonney2k | heiko, still there? | 12:12 |
@bettyboo | hihi | 12:12 |
heiko | yes | 12:12 |
@sonney2k | How about doing the subset thing? | 12:12 |
heiko | yes thought of it. | 12:12 |
heiko | for every feature class? | 12:12 |
siddharth | sonney2k, and should I work on SGD-QN? | 12:12 |
blackburn | sonney2k: will you say something about us after that next monday? | 12:12 |
blackburn | or we have to wait until 25, April? | 12:13 |
@sonney2k | heiko, I think you can put the code for that in CFeatures and then just call it from whatever sub-class. | 12:13 |
@sonney2k | we were asked to to say anything. | 12:14 |
blackburn | to not*? | 12:14 |
@sonney2k | it also depends whether any of you applied at some other organization | 12:14 |
@sonney2k | to not | 12:14 |
@sonney2k | yes | 12:14 |
blackburn | okay | 12:14 |
blackburn | it will be awful week :D | 12:14 |
@sonney2k | heiko, I think you only need an int32_t* subset array and implement get/set function for that | 12:15 |
heiko | like | 12:16 |
heiko | get_feature_subset(int32_t* inds) ? | 12:16 |
@sonney2k | heiko, and then whenever sth calls get_feature_vector() do the subset magic / change the number of available vectors virtually and give a warning when someone wants to access the whole feature matrix | 12:16 |
heiko | ok | 12:17 |
@sonney2k | set_feature_subset(int32_t* inds, int32_t num_inds); | 12:17 |
@sonney2k | siddharth, yes of course | 12:18 |
blackburn | sonney2k: have to go now, tests are in my pull request :) | 12:19 |
blackburn | see you | 12:19 |
@sonney2k | blackburn, thanks will look at them | 12:20 |
heiko | sonney2k: thanks for the tipp, I will start now and probably bother you again in the next time :) | 12:20 |
@bettyboo | :> | 12:20 |
heiko | blackburn, bye | 12:20 |
-!- blackburn [~qdrgsm@188.168.4.116] has quit [Quit: Leaving.] | 12:20 | |
@sonney2k | the more we see that you have a plan, the more likely you will be in... | 12:20 |
* siddharth speeding up things | 12:21 | |
siddharth | sonney2k, can you tell what does Fvector and Svector refer to in this code? | 12:38 |
siddharth | Fvector==feature vector? | 12:38 |
@sonney2k | not sure | 12:39 |
@sonney2k | maybe dense and sparse feature vector? | 12:39 |
@sonney2k | siddharth, antoine should know | 12:39 |
@sonney2k | ask him | 12:39 |
siddharth | ok...where can I find his email address? | 12:39 |
@sonney2k | there is a link to his homepage on the ideas list | 12:40 |
siddharth | ya got it | 12:41 |
siddharth | he has protected his email from spam :P | 12:41 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 12:57 | |
heiko | sonney2k: are you there? | 13:07 |
heiko | currently changing StringFeatures. besides get_feature_vector, all the other functions related to feature vectors have to be changed too, right? | 13:09 |
heiko | (get_num_vectors, get_max_vector_length ...) | 13:09 |
heiko | I think it's best to define get_num_vectors like this and add a new virtual method: | 13:14 |
heiko | virtual int32_t get_num_vectors() { return subset_inds == NULL ? get_num_vectors_all() : num_subset_inds; } | 13:14 |
heiko | virtual int32_t get_num_vectors_all()=0;, which has to be implemented in all feature classes | 13:14 |
heiko | the question is what should happen when set_feature_subset is called | 13:42 |
heiko | should ALL following actions be working on the subset? Or only get_feature_vector | 13:42 |
heiko | to name a few: get_transposed, get_features, copy_features, set_feature_vector ... | 13:43 |
heiko | I think they should ALL be applied to the subset | 13:43 |
heiko | then the subset option may be removed with reset_feature_subset | 13:43 |
heiko | what do you think? | 13:43 |
heiko | however, certain functions, like cleanup() do have to stay the same (work on all features, not only on the subset) | 13:46 |
heiko | ok i stop spamming now :) | 13:47 |
@sonney2k | heiko, yes you are right. I would for now add SG_NOTIMPLEMENTED; for functions other than get_feature_vector()/ free_feature_vector() when the index array is set. | 14:08 |
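heiko's get_num_vectors()/get_num_vectors_all() split, combined with sonney2k's advice to fail loudly in accessors that are not yet subset-aware, could look like this toy Python sketch. `SubsetAwareFeatures` is a made-up name, and raising `NotImplementedError` merely stands in for Shogun's SG_NOTIMPLEMENTED macro:

```python
class SubsetAwareFeatures:
    """Sketch: subset-aware vector count plus a guard for operations
    that have not been made subset-safe yet."""

    def __init__(self, num_vectors):
        self._num_vectors = num_vectors
        self._subset = None

    def set_feature_subset(self, inds):
        self._subset = list(inds)

    def get_num_vectors_all(self):
        # always the size of the underlying data
        return self._num_vectors

    def get_num_vectors(self):
        # subset-aware count, mirroring heiko's one-liner above
        return self.get_num_vectors_all() if self._subset is None else len(self._subset)

    def get_transposed(self):
        # stand-in for SG_NOTIMPLEMENTED while a subset is active
        if self._subset is not None:
            raise NotImplementedError("get_transposed is not subset-aware yet")
        return "transposed-features"
```

Functions like cleanup() would keep calling get_num_vectors_all(), so they continue to operate on all features regardless of any active subset.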
CIA-8 | shogun: Sergey Lisitsyn master * r42ec6c9 / (2 files): Fixed ExponentialKernel - http://bit.ly/i3Ktma | 14:26 |
CIA-8 | shogun: Sergey Lisitsyn master * rcce0b5d / (14 files): Added python_modular examples for kernels introduced earlier - http://bit.ly/fgrUmi | 14:26 |
CIA-8 | shogun: Sergey Lisitsyn master * re805a2c / src/modular/Kernel.i : Integrated some kernels to Kernel.i - http://bit.ly/gGmyTd | 14:26 |
CIA-8 | shogun: Soeren Sonnenburg master * r1371376 / (2 files): rename width to m_width too - http://bit.ly/fYsv1u | 14:26 |
CIA-8 | shogun: Soeren Sonnenburg master * r8725706 / (2 files): Draft a histogram function. - http://bit.ly/gaFY4y | 14:26 |
-!- yayo3 [9320e890@gateway/web/freenode/ip.147.32.232.144] has joined #shogun | 14:37 | |
-!- skydiver [4deac315@gateway/web/freenode/ip.77.234.195.21] has joined #shogun | 14:58 | |
-!- yayo3 [9320e890@gateway/web/freenode/ip.147.32.232.144] has quit [Quit: Page closed] | 15:03 | |
-!- skydiver [4deac315@gateway/web/freenode/ip.77.234.195.21] has quit [Ping timeout: 252 seconds] | 15:24 | |
-!- gxr_ [c07c1afa@gateway/web/freenode/ip.192.124.26.250] has joined #shogun | 17:17 | |
gxr_ | #topic | 17:18 |
-!- gxr_ [c07c1afa@gateway/web/freenode/ip.192.124.26.250] has quit [Client Quit] | 17:18 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 17:27 | |
-!- blackburn [~qdrgsm@188.168.5.124] has joined #shogun | 17:39 | |
blackburn | hi, how's it going there? | 17:39 |
-!- siddharth [~siddharth@117.211.88.150] has quit [Remote host closed the connection] | 17:54 | |
-!- siddharth [~siddharth@117.211.88.150] has joined #shogun | 17:59 | |
-!- ChanServ changed the topic of #shogun to: Shogun Machine Learning Toolbox | We have been accepted for GSoC 2011 with 5 slots | GSoC Timeline http://bit.ly/gy7Pdi | This channel is logged. | 18:18 | |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has left #shogun [] | 18:45 | |
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has joined #shogun | 19:17 | |
* serialhex watches a tumbleweed pass by | 20:10 | |
-!- skydiver [c255a037@gateway/web/freenode/ip.194.85.160.55] has joined #shogun | 20:33 | |
@sonney2k | :) | 21:09 |
blackburn | sonney2k: is there something new? :) | 21:11 |
blackburn | sonney2k: now looking at PerformanceMeasures, it's so intricate! | 21:12 |
@sonney2k | blackburn, yes it is unnecessarily complex | 21:15 |
@sonney2k | blackburn, new in what sense? | 21:15 |
blackburn | sonney2k: gsoc :) or there will not be any changes? | 21:16 |
blackburn | not only about slots, maybe something else too | 21:16 |
@sonney2k | no - there won't be any news for a while I suspect | 21:17 |
serialhex | well fie on them!!! fie on google for not giving us information!!! fie fie fie!!! | 21:18 |
@sonney2k | sorry, but what more information do you need? | 21:21 |
@sonney2k | ^^ #topic | 21:21 |
blackburn | may be they will say about the matter of life | 21:22 |
blackburn | or what happened with elvis | 21:23 |
@sonney2k | look we are really lucky to get 5 slots - other new orgs with even more applications got just 2 | 21:23 |
alesis-novik | sonney2k, is there code in shogun for calculating means and covariances? Because that could be in the training for my Gaussian thing | 21:23 |
blackburn | sonney2k: we are the lucky one :) | 21:24 |
@sonney2k | alesis-novik, partially in CMath::mean ... please put CMath::cov there too | 21:24 |
alesis-novik | then training for it would be just computing the mean and covariance, because that's what the parameters really are | 21:25 |
@sonney2k | alesis-novik, yes - makes a lot of sense :) | 21:27 |
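Training the Gaussian alesis-novik describes then reduces to estimating the mean and covariance of the samples. A minimal pure-Python sketch of those two estimators (Shogun would put them in CMath::mean / CMath::cov, whose actual signatures may differ):

```python
def mean(samples):
    """Column-wise mean of a list of equal-length sample vectors."""
    n = len(samples)
    dim = len(samples[0])
    return [sum(s[d] for s in samples) / n for d in range(dim)]

def cov(samples):
    """Sample covariance matrix (normalised by n) of the sample vectors."""
    n = len(samples)
    mu = mean(samples)
    dim = len(mu)
    return [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
             for j in range(dim)]
            for i in range(dim)]
```

Fitting the Gaussian amounts to storing `mean(X)` and `cov(X)`, since those two quantities are exactly its parameters.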
@bettyboo | :> | 21:27 |
@sonney2k | ahh betty is back too | 21:27 |
@bettyboo | sonney2k: betty is always there for us :) | 21:27 |
@sonney2k | well said | 21:27 |
alesis-novik | bettyboo, achieved singularity yet? | 21:28 |
@bettyboo | alesis-novik: ah, joking, see now :) | 21:28 |
yayo3 | what does "fie" stand for? | 21:29 |
blackburn | we need to do something with regression | 21:30 |
yayo3 | or agression | 21:31 |
@sonney2k | yayo3, fie? | 21:31 |
blackburn | or clustering | 21:31 |
blackburn | we have a strict framework only for classification | 21:31 |
yayo3 | sonney2k: serialhex said that | 21:32 |
@sonney2k | blackburn, yes the naming is confusing | 21:32 |
yayo3 | blackburn: well there's at least clustering interface. not a regression one I think | 21:32 |
@sonney2k | and I don't want to derive from more than one base class either | 21:32 |
blackburn | yayo3: there is no clustering interface | 21:33 |
serialhex | i dont know what 'fie' really means, i just use it to cuss at people when i'm trying not to cuss :P | 21:33 |
@sonney2k | the regression one is derived from classifier - it just uses the same names | 21:33 |
yayo3 | also it would be nice to have ProbabilisticClassifier (or something) that has methods for returning class probabilities | 21:34 |
blackburn | sonney2k: now thinking about dimreduction domain | 21:34 |
serialhex | yayo3: fie | 21:34 |
serialhex | (archaic) Used to express distaste, disgust, or outrage. | 21:34 |
serialhex | Fie upon you, you devilish fool! | 21:34 |
yayo3 | that's important to some applications. however, since I was pretty sure I saw a clustering interface, it's possible I just missed it | 21:34 |
blackburn | sonney2k: do you like the way it's used now? | 21:34 |
@sonney2k | blackburn, no I don't | 21:35 |
@sonney2k | shogun all started with a HMM I implemented a decade back and then an SVM | 21:35 |
@sonney2k | so only distributions and classifier are somehow well represented | 21:35 |
@sonney2k | but there are problems like: we have CDistanceMachine and CKernelMachine - IIRC they both derive from classifier | 21:36 |
blackburn | sonney2k: aha, so we will think about refactoring it | 21:36 |
@sonney2k | blackburn, I am very open for suggestions | 21:36 |
@sonney2k | it is not as easy though | 21:36 |
blackburn | sonney2k: the only thing I know should stay - it should be preproc | 21:37 |
blackburn | or not? :D | 21:37 |
@bettyboo | ;D | 21:37 |
yayo3 | that's hard | 21:37 |
@sonney2k | for example if you have a kernel machine and you derive a SVM from it - kernel machine has to be derived from classifier | 21:37 |
yayo3 | one method can be probabilistic and nonprobabilistic, kernel and regression at the same time | 21:38 |
@sonney2k | at the same time support vector regression / kPCA etc should derive from kernel machine | 21:38 |
@sonney2k | yes that is the problem | 21:38 |
blackburn | sonney2k: eh, do you think it is possible? | 21:38 |
blackburn | sonney2k: I see no way to do it without multiple inheritance | 21:39 |
yayo3 | yeah | 21:39 |
@sonney2k | blackburn, well I don't know - but I don't want multiple inheritance in shogun | 21:39 |
@sonney2k | that is creating all sorts of other problems | 21:39 |
yayo3 | if the dependencies are not a tree in the real world, it's pretty much impossible to make a model without them | 21:40 |
alesis-novik | multiple inheritance is really difficult to manage properly in C++ | 21:40 |
@sonney2k | well we could do one thing: we could name it in a more generic way | 21:41 |
@sonney2k | say Method | 21:41 |
@sonney2k | and then KernelMethod | 21:41 |
@sonney2k | or Machine and KernelMachine etc | 21:41 |
@sonney2k | then it does not need to be named Classifier | 21:41 |
@sonney2k | or Regressor or Clusterer or so | 21:42 |
alesis-novik | Well, someone will probably be implementing EM, so they will have to deal with clustering | 21:42 |
@sonney2k | then we name the train function train() / apply() very generally | 21:42 |
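The generic naming sonney2k proposes (a Machine base with train()/apply() instead of Classifier/classify()) can be sketched as below. `MeanPredictor` is a made-up toy subclass for illustration, not a Shogun class:

```python
class Machine:
    """Generic base: the names carry no classifier/regressor/clusterer bias."""

    def train(self, features, labels=None):
        raise NotImplementedError

    def apply(self, features):
        raise NotImplementedError


class MeanPredictor(Machine):
    """Tiny regression-flavoured example living happily under the generic names."""

    def train(self, features, labels=None):
        # "training" is just remembering the mean label
        self._value = sum(labels) / len(labels)
        return self

    def apply(self, features):
        # predict the stored mean for every input vector
        return [self._value for _ in features]
```

The same two names would serve a classifier (apply returns labels) or a clusterer (train is unsupervised, labels=None), which is exactly why they are less confusing than classify().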
blackburn | KernelMultipleKungFuClassifierMachine | 21:42 |
@sonney2k | blackburn, your proposed name is much preferred :D | 21:43 |
@bettyboo | sonney2k, funny | 21:43 |
blackburn | sonney2k: and we have one more proposal | 21:43 |
@sonney2k | bettyboo, these Russians are at times | 21:43 |
@bettyboo | sonney2k: And sometimes I like more explanation and less brevity | 21:43 |
blackburn | we should rename all constructors to Lenin | 21:43 |
alesis-novik | or we could go for the yesterday discussed one | 21:43 |
alesis-novik | blackburn, yes | 21:43 |
blackburn | and destructors for Stalin | 21:43 |
yayo3 | what about something similar to Java interfaces? | 21:43 |
blackburn | sonney2k: if you will agree, we will start just now | 21:44 |
alesis-novik | The general class would be CCCP, other things (like Ukraine or Belarus) would derive from it | 21:44 |
yayo3 | you can use inheritance as a way to mark what the method provides, but inherit no code | 21:44 |
blackburn | sonney2k: I'm pretty sure I agree with yayo3 (being acquainted with Java SE and EE too) :) | 21:45 |
blackburn | a Kernel interface could declare only some compute functions, and there would be no problem with multiple inheritance | 21:46 |
yayo3 | say, can it do regression? inherit Regressor. can it do classification, inherit Classifier | 21:46 |
blackburn | yayo3: why we should? | 21:46 |
* sonney2k is waiting for the finalized proposal | 21:47 | |
yayo3 | blackburn: well it's generally a non-crazy way of using multiple inheritance | 21:47 |
alesis-novik | well, interfaces in Java are something like C++ abstract classes with no fields | 21:47 |
blackburn | yayo3: I mean I don't know why we should discriminate between regression/classification/etc | 21:47 |
yayo3 | blackburn: not discriminate | 21:48 |
blackburn | sonney2k: the final proposal is to rename the whole shogun | 21:48 |
serialhex | it almost sounds like modules in ruby... | 21:48 |
yayo3 | blackburn: it can do both and therefore inherit both. | 21:48 |
blackburn | yayo3: we could use 'interfaces' only for kernel or etc | 21:48 |
alesis-novik | Well, if you start doing it differently for different parts of shogun, it will get messy and hard to get in to | 21:49 |
* serialhex just noticed that shogun got 5 slots *SQUEE* | 21:49 | |
yayo3 | or that. also I think it would be nice to have the inheritance tree as flat as possible | 21:49 |
blackburn | we have to do some umls :D | 21:49 |
* sonney2k dies | 21:50 | |
serialhex | umls?? | 21:50 |
* serialhex is lost | 21:50 | |
alesis-novik | UMLs | 21:50 |
blackburn | yeap | 21:50 |
* serialhex is googling | 21:50 | |
alesis-novik | unified modelling language, was it? | 21:51 |
blackburn | sonney2k: we demand stalin and UMLs! | 21:51 |
blackburn | alesis-novik: yeap, I mean the class diagram could be more impressive than words :) | 21:51 |
@bettyboo | blackburn, ;> | 21:51 |
yayo3 | I'll make a quick example of what I have in mind and drop a link | 21:51 |
alesis-novik | blackburn, I was just remembering what UML actually stands for :) | 21:51 |
serialhex | ooh, umls look like fun! (NOT!!) | 21:51 |
* sonney2k feels like being in stalingrad a few decades back... | 21:52 | |
blackburn | just be happy that I didn't propose to use Rational Unified Process :D | 21:52 |
serialhex | lol | 21:52 |
alesis-novik | I'm a bad software engineer, if I'm required to present UMLs I first code everything and then generate UMLs from it | 21:52 |
yayo3 | alesis-novik: doesn't that mean you're good software engineer? :) | 21:52 |
alesis-novik | that means I'm a (good or not) coder | 21:53 |
blackburn | I'm pretty sure that UMLs could be created prior :) | 21:53 |
alesis-novik | The whole engineering process of designing architectures is bleh for me | 21:53 |
blackburn | some of the UMLs doesn't make sense with code, state diagrams, etc | 21:54 |
alesis-novik | you SHOULD create UMLs prior in theory, I just can't be bothered to | 21:54 |
serialhex | umm... i usually just make scribbly notes on a piece of coffee-stained paper... with bubbles and arrows and notes on how it should be done | 21:54 |
blackburn | alesis-novik: yeap it ain't so funny at all :) | 21:54 |
serialhex | ...tho i've lost those sheets of paper more often than not :P | 21:55 |
blackburn | serialhex: oh, I'd like to see that kind of scheme for some boring, difficult, tangled enterprise-level system :D | 21:55 |
alesis-novik | I had a course of software engineering in my undergrad where we basically had to come up with an architecture for a business | 21:55 |
serialhex | OOH!!!! i'd need lots of markers and posterboard!!! | 21:55 |
alesis-novik | One of the most boring courses for me | 21:56 |
serialhex | alesis-novik, i can imagine! | 21:56 |
blackburn | and balloons | 21:56 |
blackburn | you probably will need balloons | 21:56 |
serialhex | OOH!!! I GET TO USE REAL BALLOONS!?!?!?? | 21:56 |
alesis-novik | I like coding and thinking about algorithms, not thinking about SOA and stuff like that | 21:56 |
blackburn | alesis-novik: for some reasons software engineering is important too. may be for some things we discussed earlier :) | 21:57 |
serialhex | ok, i'm gonna need some big ones, some little ones, and some of those that the clowns use to make dogs and hats with! :P | 21:57 |
@bettyboo | serialhex, yep | 21:57 |
alesis-novik | No, I know it's important, it's just not something I enjoy | 21:57 |
blackburn | serialhex: and always use stalin | 21:58 |
blackburn | I have to stop joking about stalin :D | 21:58 |
serialhex | yeah, thats going to be the base class of 'all-that-is-evil-and-wrong-with-this-world' | 21:58 |
blackburn | serialhex: nope, just destruction scheme :D | 21:59 |
blackburn | 'here,guys, we will use stalin for garbage collecting' | 21:59 |
serialhex | well evil things usually destroy stuff... but they also screw it up in the process | 21:59 |
serialhex | :P | 21:59 |
@bettyboo | ;D | 21:59 |
blackburn | sonney2k: what about renaming shogun? | 22:00 |
blackburn | we really demand it! | 22:00 |
serialhex | so i'v noticed that bettyboo's responses have gotten much better recently | 22:00 |
@bettyboo | serialhex: I've only gotten around to reading the paper on it and trying to understand the maths | 22:00 |
@sonney2k | blackburn, into bettyboo? | 22:00 |
@bettyboo | sonney2k: nope | 22:00 |
serialhex | HAHAHAHAHA!!!!!! | 22:00 |
@sonney2k | hah! | 22:00 |
blackburn | ^^ | 22:00 |
blackburn | damn bingo | 22:00 |
serialhex | OMFG THAT WAS AWSOME!!!! | 22:00 |
alesis-novik | yet disturbing | 22:01 |
serialhex | whoever coded her needs a raise! | 22:01 |
@sonney2k | so blackburn anything against renaming Classifier to Machine? | 22:01 |
alesis-novik | Hail bettyboo - future ruler of mankind | 22:01 |
@sonney2k | and classify to apply() | 22:01 |
@bettyboo | alesis-novik: says april 25 / april 22 is a conflict resolution | 22:01 |
@sonney2k | ? | 22:01 |
serialhex | lol | 22:01 |
yayo3 | sonney2k: I'm working on example of the stuff we talked about earlier | 22:01 |
blackburn | sonney2k: I'm pretty sure I will have no difference :D | 22:01 |
@sonney2k | blackburn, ? | 22:02 |
blackburn | sonney2k: I mean there is no such difference | 22:02 |
@sonney2k | you mean it does not matter? or ? | 22:03 |
blackburn | yeap, that is a product of my intricate mind; I think it doesn't matter | 22:03 |
blackburn | sonney2k: you just want to make scheme train() / apply(), right? | 22:04 |
yayo3 | that still doesn't make sense in clustering | 22:05 |
blackburn | yayo3: we could just use apply() | 22:05 |
CIA-8 | shogun: Soeren Sonnenburg master * rf6f47f3 / src/libshogun/features/SNPFeatures.cpp : Get histogram to reliably work in SNPFeatures - http://bit.ly/hEex7c | 22:05 |
@sonney2k | http://www.shogun-toolbox.org/doc/classshogun_1_1CClassifier.html | 22:05 |
blackburn | sonney2k: why classifier should be machine? | 22:06 |
alesis-novik | I feel machine would be less intuitive | 22:06 |
@sonney2k | if you look at this you will see that what is called classifier now is just a general Method or Machine | 22:06 |
blackburn | sonney2k: it is, but why machine? what should it change? | 22:07 |
@sonney2k | because it currently is a classifier / clustering method / regression method | 22:07 |
blackburn | ah | 22:07 |
@sonney2k | it would fix the confusion that people have | 22:07 |
blackburn | sonney2k: I prefer supervised/unsupervised | 22:07 |
blackburn | but it will have own problems | 22:07 |
@sonney2k | semisupervised :) | 22:08 |
@sonney2k | what problems? | 22:08 |
serialhex | how do those algorithms work?? | 22:08 |
blackburn | sonney2k: for example multiple inheritance again | 22:08 |
@sonney2k | blackburn, ahh kernelmachine can be supervised or unsupervised | 22:09 |
@sonney2k | exactly | 22:09 |
@sonney2k | So Machine + apply is the only thing so far that is name wise not confusing | 22:09 |
blackburn | sonney2k: we really should draw a scheme | 22:09 |
serialhex | why not supervised, unsupervised & bi_supervised?? | 22:09 |
alesis-novik | semisupervised is having some data labelled and some labelled as far as I know | 22:10 |
alesis-novik | unlabelled* | 22:10 |
serialhex | i got ya the first time alesis-novik | 22:10 |
blackburn | sonney2k: so, don't you like multiple inheritance in shogun at all? no way? | 22:11 |
@sonney2k | blackburn, no way | 22:11 |
yayo3 | there's probably a way to make it non-crazy | 22:11 |
@sonney2k | big problems with templates, diamonds and possibly swig wrapped interfaces | 22:11 |
alesis-novik | semi-supervised methods seem to be a good path to research in ML, because labelling everything is expensive | 22:12 |
blackburn | sad :) we could realize some of these ideas with inheritance | 22:12 |
serialhex | and while i dont know much about how most of this stuff works (in practice) why not have a superclass for both supervised & unsupervised... and anything else (like semi-supervised) | 22:12 |
yayo3 | well, the "java-like inheritance" example is here: https://gist.github.com/918302 | 22:12 |
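In the spirit of yayo3's suggestion (this sketch does not reproduce the linked gist): capability interfaces that carry declarations but no implementation, so multiple inheritance stays tame. A Python sketch with hypothetical class names:

```python
from abc import ABC, abstractmethod

class Classifier(ABC):
    """Marker-style interface: declares a capability, inherits no code."""
    @abstractmethod
    def classify(self, x): ...

class Regressor(ABC):
    @abstractmethod
    def predict(self, x): ...

class ThresholdModel(Classifier, Regressor):
    """One method can provide both capabilities by inheriting both interfaces."""

    def __init__(self, threshold=0.0):
        self.threshold = threshold

    def predict(self, x):
        # toy real-valued output
        return sum(x)

    def classify(self, x):
        # the classification capability reuses the regression output
        return 1 if self.predict(x) > self.threshold else -1
```

Because the interfaces hold no state and no code, the usual diamond-inheritance problems that worry sonney2k (templates, SWIG wrapping) largely do not arise; the bases only mark what the method can do.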
@sonney2k | my suggestion would be to keep things as they are for the time being and at the end of the summer when you are all experts do a reasonable renaming | 22:12 |
alesis-novik | I can see the possible problems sonney2k, things like this should be thought of when designing the thing, so if the decision was no multiple inheritance, I can see how that might cause problems | 22:13 |
serialhex | yayo3: that looks interesting | 22:13 |
blackburn | sonney2k: nice plan, and we have to do some schemes, no way to understand all the details in code | 22:13 |
yayo3 | serialhex: that's the java way. which can be good or bad | 22:13 |
blackburn | yayo3: sure it's good :D | 22:14 |
@bettyboo | yep | 22:14 |
blackburn | betty agrees | 22:14 |
@bettyboo | blackburn: Getting comfortable with ML concepts most definitely does not happen overnight. | 22:14 |
serialhex | yayo3: yeah, ruby could do this in a different way, using modules & mixins instead of multiple inheritance (as there is none of that in ruby) | 22:14 |
blackburn | oh, sorry, I thought I would learn everything tonight :( | 22:14 |
* serialhex is getting afraid of bettyboo's learning rate... | 22:15 | |
serialhex | by the end of the summer we might just ask her for a few suggestions :P | 22:15 |
yayo3 | now that's some machine learning :P | 22:15 |
@bettyboo | yeah ;D | 22:15 |
blackburn | bettyboo: make examples | 22:16 |
serialhex | see what i mean? | 22:16 |
@bettyboo | blackburn: addpath('../examples/documented/matlab/graphical') | 22:16 |
serialhex | bettyboo: make vodka | 22:16 |
@bettyboo | serialhex: you just want to make scheme train() / apply(), right? | 22:16 |
serialhex | no no not that bettyboo!! make vodka! | 22:16 |
@bettyboo | serialhex: just see your email. yes, makes perfectly sense. I tend to even say that the dim reduction techniques open up a new field inside shogun - so I would even consider this proposal to be stronger | 22:16 |
blackburn | bettyboo: like vodka? | 22:16 |
@bettyboo | blackburn: likes :"Features.i:303: Warning 453: Can't apply (shogun::TSparse< uint64_t > **ARGOUT_SPARSE,int32_t *DIM1,int32_t *DIM2,int64_t *NNZ). No typemaps are defined. | 22:16 |
blackburn | bettyboo: make doc and vodka | 22:17 |
@bettyboo | blackburn: over 60 students applied so far - there is now way to make a fair ranking - and we seriously cannot handly more than 10 of you :`-( | 22:17 |
blackburn | bettyboo: vodka | 22:17 |
@bettyboo | blackburn: we should celebrate what do you want?? vodka? yager? beer?? | 22:17 |
blackburn | JUST SAY IT! | 22:17 |
serialhex | ...she's hiding her true intelligence from us... | 22:17 |
serialhex | she knows we know... lets just hope she's a benevolent ruler | 22:18 |
blackburn | "just see your email. yes, makes perfectly sense. I tend to even say that the dim reduction techniques open up a new field inside shogun - so I would even consider this proposal to be stronger" | 22:18 |
blackburn | hehe have I some competitors? | 22:18 |
@sonney2k | blackburn, one more pressing design decision is how to properly do a cross-validation framework in shogun. Is data a parameter or not, how can I split up the data into several parts (other than just hacking everything into the feature base classes) and what can we do when there are no training data (just a distance or kernel matrix is available) etc... | 22:18 |
blackburn | sonney2k: oh, you wrote me a letter :) | 22:18 |
yayo3 | sonney2k: is there any serious code reuse from inheritance now? | 22:18 |
blackburn | sonney2k: about CV you will have huuuuuuge problems with design | 22:19 |
serialhex | yayo3: it seems to be the case from looking at the inheritnce pictures & functions in the shogun docs | 22:19 |
@sonney2k | yayo3, sure there is | 22:19 |
@sonney2k | yayo3, which letter? | 22:20 |
@sonney2k | blackburn, in which way? | 22:20 |
blackburn | sonney2k: in which way what? | 22:20 |
@sonney2k | problem with design | 22:20 |
@sonney2k | for CV? | 22:21 |
blackburn | in the way you described, you will have to think hard to do it with 'beauty' | 22:21 |
yayo3 | sonney2k: so that would make the "don't inherit implementation, only interfaces" way rather nonworking, right | 22:21 |
yayo3 | sonney2k: and I never got any letter :( | 22:21 |
@sonney2k | blackburn, yes that is true | 22:21 |
@sonney2k | blackburn, which letter? | 22:21 |
@sonney2k | yayo3, sorry was supposed for blackburn | 22:21 |
blackburn | sonney2k: just copied bettyboo answer :) | 22:22 |
@bettyboo | blackburn: answering* | 22:22 |
@bettyboo | yep! | 22:22 |
blackburn | just interested to whom it was addressed because want to work on dim.reduction :) | 22:22 |
yayo3 | the problem with CV is that many methods themselves split data into training and testing data. | 22:23 |
alesis-novik | sonney2k, you also need to shuffle the training data, so the whole CV would be a beast | 22:23 |
yayo3 | well, one of the problems with CV anyway. | 22:23 |
@sonney2k | yayo3, none here - all methods in shogun just get one fixed data set to operate on | 22:24 |
@sonney2k | alesis-novik, exactly | 22:24 |
@sonney2k | so I think the most reasonable approach is to store some extra index array in the features | 22:24 |
@sonney2k | then look up which indices to use | 22:24 |
blackburn | sonney2k: you really have to make some diagrams when discussing design, it would be easier _a lot_ | 22:24 |
alesis-novik | sonney2k, I think that would save space and time for large-dimensionality datasets | 22:25 |
yayo3 | some CV implementations just take random data. generate random indices and take data from them | 22:27 |
yayo3 | or I think I saw it somewhere | 22:27 |
blackburn | yayo3: anyway you have to remember what you chose | 22:27 |
@sonney2k | blackburn, probably. | 22:27 |
@sonney2k | alesis-novik, that is the intention | 22:28 |
alesis-novik | yayo3, how would random data make sense when training? Unless it's generated from a specified distribution | 22:28 |
blackburn | alesis-novik: in CV makes | 22:28 |
@sonney2k | the problem is of course a) it makes the code difficult to read (indirect access in all the feature functions) and b) it is intrusive and has to be done in all feature classes | 22:29 |
blackburn | alesis-novik: sorry, not random data, random indices make sense | 22:29 |
alesis-novik | blackburn, random sampling of data or the actual random data? | 22:29 |
@sonney2k | heiko drafted that btw already https://github.com/shogun-toolbox/shogun/pull/34/files | 22:29 |
alesis-novik | blackburn, random sampling makes sense and is encouraged, I haven't heard about random data CV | 22:31 |
@sonney2k | ahh and forgot to say - one needs to take a subset of the data too, i.e. an artificial limit on the available data | 22:31 |
blackburn | alesis-novik: I didn't understand you right; random data CV doesn't make sense | 22:31 |
alesis-novik | blackburn, that's why I was asking what yayo3 meant | 22:32 |
@sonney2k | I wish this could be done without hacking up each feature class separately | 22:32 |
blackburn | alesis-novik: aha, he meant sampling | 22:32 |
yayo3 | yeah, sampling. sorry for being obtuse | 22:33 |
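The index-sampling approach to cross-validation being discussed (shuffle indices once, cut them into folds, then hand each fold to the features as a subset) could be sketched as below; the helper names are illustrative, not an existing Shogun API:

```python
import random

def kfold_indices(num_vectors, num_folds, seed=0):
    """Shuffle 0..num_vectors-1 once and cut it into num_folds disjoint folds."""
    rng = random.Random(seed)
    inds = list(range(num_vectors))
    rng.shuffle(inds)
    # fold k takes every num_folds-th shuffled index, starting at k
    return [inds[k::num_folds] for k in range(num_folds)]

def train_test_split(folds, test_fold):
    """Fold test_fold is the validation subset; the rest form the training subset."""
    test = folds[test_fold]
    train = [i for k, fold in enumerate(folds) if k != test_fold for i in fold]
    return train, test
```

Each (train, test) index pair would then be passed to something like set_feature_subset(), so the learning algorithm never knows it sees only part of the data and no feature matrix is copied or reshuffled in memory.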
@sonney2k | but an extra IndirectIndexFeature class won't work - the basic feature class has no notion of get_vector or so | 22:33 |
alesis-novik | sonney2k, you can always create a new class and just take the data from CFeatures, question is the overhead and general usefulness | 22:33 |
@sonney2k | alesis-novik, yes but all the algorithms need to continue to work without knowing that they operate on a subset of the data | 22:34 |
alesis-novik | sonney2k, so there could be a dedicated class/algorithm for splitting data into CV folds. I worry about the overhead though | 22:36 |
yayo3 | that's hard. reshuffling features would be easier, but you'd need much more memory | 22:36 |
@sonney2k | not something I'd like to have... I am not so rarely training on datasets that barely fit in memory | 22:37 |
yayo3 | might be a good idea to do both: indirect features and creating new "classic" features, because what's faster depends on available memory and other things | 22:37 |
@sonney2k | ohh well... | 22:39 |
* sonney2k reviews this patch and hopes that it does not ruin all the readability in shogun's features | 22:39 | |
* blackburn wonders why all are discussing one specific 'project' | 22:42 | |
alesis-novik | blackburn, what project did you apply to (I forget these things) | 22:43 |
blackburn | alesis-novik: dim reduction | 22:43 |
alesis-novik | good :D | 22:43 |
blackburn | LLE, MDS, ISOMAP and SNE | 22:44 |
blackburn | (as I hope) :) | 22:44 |
alesis-novik | Actually, I'm trying to (after I'm done with the Gaussian patch) get the PCA and kPCA to work | 22:45 |
alesis-novik | Bach in a bit | 22:45 |
@sonney2k | blackburn, KFA? | 22:46 |
@mlsec | Nested CV would also be cool | 22:47 |
@sonney2k | heh | 22:55 |
@sonney2k | seems like heiko has some good plans for that.... | 22:56 |
-!- skydiver [c255a037@gateway/web/freenode/ip.194.85.160.55] has quit [Quit: Page closed] | 22:56 | |
yayo3 | hmm. the .mainpage files are generated, right? | 22:56 |
@sonney2k | yayo3, some of them yes | 23:05 |
yayo3 | they're pretty annoying :) | 23:06 |
yayo3 | I sent a little pull request (really minor stuff, I should probably hold on to them and send them when there's more) | 23:08 |
blackburn | sonney2k: was away, sorry, not KFA | 23:08 |
yayo3 | and good night | 23:08 |
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has quit [Quit: leaving] | 23:09 | |
blackburn | sonney2k: thought you know my proposal :D | 23:09 |
@sonney2k | yayo3, you mean because they are not in git ignore | 23:09 |
@sonney2k | too late | 23:09 |
blackburn | he could make a commit, reverting some changes | 23:10 |
@sonney2k | blackburn, now we can accept only 1/5 of the students we wanted so you would of course have to do twice as much work ;-) | 23:10 |
blackburn | I did one when there was a conflict with some header | 23:10 |
blackburn | sonney2k: oh! do you think I'm doing not much work? :) | 23:11 |
@bettyboo | blackburn, ;D | 23:11 |
@sonney2k | blackburn, is that a trap? shall I answer yes or no? | 23:12 |
blackburn | not a trap, just interesting :) | 23:12 |
@sonney2k | then I say no - lets see what your reactions are :D | 23:13 |
@bettyboo | :> | 23:13 |
* sonney2k does not know if no means yes or yes means no. | 23:13 | |
blackburn | ahahah | 23:13 |
blackburn | so what is the answer? | 23:14 |
blackburn | at least I'm trying to do all that I want/have to :) SergeyLisitsyn (32 commits, 1710 additions, 78 deletions) | 23:14 |
blackburn | sonney2k (2191 commits, 665702 additions, 6929 deletions) | 23:14 |
blackburn | oh, 665702 hehe | 23:14 |
blackburn | but really (not a trap or nearly), do you think I'm not doing much? | 23:15 |
@sonney2k | blackburn, I don't complain - not at all | 23:15 |
blackburn | how polite it is :D | 23:16 |
@sonney2k | I am very happy actually | 23:16 |
blackburn | I just don't want to start working on dim reduction | 23:16 |
blackburn | that's why I am not doing something very 'useful' | 23:17 |
@sonney2k | blackburn, I don't really understand why you don't want to start on dim reduction methods.... | 23:19 |
blackburn | just like heiko or etc | 23:19 |
blackburn | sonney2k: because I want to mind it a little more | 23:19 |
@sonney2k | as long as you don't have to work for days to do some changes it is ok | 23:19 |
@sonney2k | ok | 23:19 |
@sonney2k | makes sense | 23:19 |
blackburn | sonney2k: my proposal includes searching for ideas in up-to-date articles etc. | 23:20 |
blackburn | that is what I want to do in may | 23:20 |
@sonney2k | heiko has probably chosen one of the most difficult parts - it will really take time and potentially partial rewrites to do it nicely | 23:21 |
blackburn | but if the result is positive for me, I could start the design of dim reduction then | 23:21 |
blackburn | hehe there is a malloc in performancemeasures | 23:34 |
@sonney2k | blackburn, sometimes there has to be one, to interact with external languages that assume malloc'd memory | 23:36 |
blackburn | float64_t** det=(float64_t**) malloc(sizeof(float64_t**)); | 23:37 |
@sonney2k | blackburn, yeah - maybe at some point we rewrite the swig interfaces to use references instead of pointers | 23:40 |
* sonney2k yawns | 23:41 | |
@sonney2k | off to bed - cu all tomorrow | 23:42 |
blackburn | leaving too, see you | 23:46 |
-!- blackburn [~qdrgsm@188.168.5.124] has quit [Quit: Leaving.] | 23:46 | |
-!- alesis-novik [~alesis@188.74.87.84] has quit [Quit: I'll be Bach] | 23:53 | |
--- Log closed Thu Apr 14 00:00:36 2011 |
Generated by irclog2html.py 2.10.0 by Marius Gedminas - find it at mg.pov.lt!