--- Log opened Thu Apr 14 00:00:36 2011 | ||
-!- alesis-novik [~alesis@188.74.87.84] has joined #shogun | 02:12 | |
-!- josip [~josip@unaffiliated/josip] has joined #shogun | 02:13 | |
-!- serialhex [~quassel@99-101-149-136.lightspeed.wepbfl.sbcglobal.net] has quit [Remote host closed the connection] | 06:16 | |
-!- serialhex [~quassel@99-101-149-136.lightspeed.wepbfl.sbcglobal.net] has joined #shogun | 06:17 | |
-!- siddharth [~siddharth@117.211.88.150] has quit [Read error: Connection reset by peer] | 07:08 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 07:14 | |
-!- siddharth [~siddharth@117.211.88.150] has joined #shogun | 07:16 | |
-!- Tanmoy [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 08:32 | |
-!- blackburn [~qdrgsm@188.168.4.102] has joined #shogun | 08:50 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 09:39 | |
alesis-novik | Damn it, I totally forgot that sleeping is kind of important! | 09:49 |
---|---|---|
blackburn | :D | 09:51 |
* sonney2k just noticed the same thing | 09:52 | |
alesis-novik | I'll continue my work..... today as the case may be. | 09:52 |
@sonney2k | Endlessly yawning was never part of the plan | 09:52 |
alesis-novik | Good...morning? Time for some rest | 09:53 |
alesis-novik | See you all later | 09:53 |
blackburn | see you | 09:53 |
blackburn | sonney2k: why do you use train? do you work far from home? | 09:54 |
blackburn | just remembered that and interested :) | 09:55 |
-!- Tanmoy [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 09:58 | |
siddharth | sonney2k, is there any matrix multiplication function in shogun? | 10:01 |
siddharth | like there is dot product for vectors | 10:03 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 11:22 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Ping timeout: 252 seconds] | 11:40 | |
-!- josip [~josip@unaffiliated/josip] has quit [Ping timeout: 258 seconds] | 11:54 | |
-!- josip [~josip@212.201.44.245] has joined #shogun | 11:55 | |
-!- josip [~josip@212.201.44.245] has quit [Ping timeout: 260 seconds] | 12:00 | |
-!- josip [~josip@212.201.44.245] has joined #shogun | 12:00 | |
-!- josip [~josip@212.201.44.245] has quit [Changing host] | 12:00 | |
-!- josip [~josip@unaffiliated/josip] has joined #shogun | 12:00 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 12:06 | |
blackburn | sonney2k: for some reason there is a EPreProcType in Preproc.h and it is _not_ used, but 'FOUR letter id' is used | 12:27 |
blackburn | am I right I should refactor it to use EPreProcType? | 12:28 |
@sonney2k | Yes makes sense. | 12:29 |
blackburn | sonney2k: what do you think about a new abstract class CDimReductionPreproc? | 12:30 |
@sonney2k | The 4 letter stuff is legacy code to dump the object. This can go away | 12:30 |
@sonney2k | No. You could reuse code this way too | 12:31 |
blackburn | I mean it could be inherited from CSimplePreproc | 12:32 |
blackburn | doesn't make sense to you? | 12:36 |
@sonney2k | blackburn, was typing the above from my mobile... | 12:44 |
@sonney2k | I was thinking it might make sense to have a general preproc for DotFeatures | 12:45 |
blackburn | hm.. | 12:45 |
@sonney2k | but then again one cannot preprocess a feature matrix (only for simple features) | 12:45 |
blackburn | I see, will keep it in mind | 12:46 |
@sonney2k | so it makes sense to have something on top | 12:46 |
@sonney2k | but dim reduction methods in the end give you some transformation function to extract or compute a subset of features | 12:47 |
@sonney2k | (since you yesterday were so upset about KFA - it is also just a dim reduction method) | 12:48 |
blackburn | ehh, was I upset about kfa? :) | 12:48 |
blackburn | I just said I'm not applying on KFA's task | 12:48 |
@sonney2k | blackburn, true. but you are applying to do dim reduction methods and KFA is also one heh | 12:50 |
blackburn | sonney2k: if things go fast maybe I could implement KFA too (if there is no student doing it), no problem | 12:50 |
blackburn | just wondering about I was upset about KFA :) | 12:50 |
* sonney2k needs to take lessons in non-offensive wording | 12:51 | |
blackburn | I really don't remember I talked about KFA! :D | 12:51 |
blackburn | the only thing was your question 'KFA?' and I answered something about I'm applying to other idea | 12:54 |
blackburn | ah, whatever :) | 12:54 |
* sonney2k is going through the most recent pull requests | 12:56 | |
* blackburn will make a pull request in 5 minutes in case everything is compiling now | 12:58 | |
-!- heiko1 [~heiko@infole-06.uni-duisburg.de] has joined #shogun | 13:02 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Ping timeout: 252 seconds] | 13:05 | |
@sonney2k | heiko1, nice work. I have a few more comments though (just going through the patch) | 13:22 |
heiko1 | hey there | 13:22 |
@bettyboo | hey heiko1 | 13:22 |
heiko1 | ok | 13:22 |
heiko1 | I also just found some errors | 13:22 |
heiko1 | python_modular does not compile with the current | 13:23 |
heiko1 | But i only compile libshogun at the moment because it takes so long on my slow machine :) | 13:23 |
-!- heiko1 is now known as heiko | 13:23 | |
@sonney2k | I thought it compiles | 13:24 |
* sonney2k checks | 13:24 | |
heiko | sorry only for libshogun | 13:24 |
heiko | there is a problem with the templates for SGInterface | 13:24 |
@sonney2k | mine compiles... | 13:24 |
heiko | no I meant my pull request repo | 13:24 |
blackburn | I have a problem with 'Kernel.py', do you have one? | 13:24 |
@sonney2k | heiko, I see | 13:25 |
@sonney2k | blackburn, that doesn't exist in the source | 13:25 |
blackburn | Unable to open file Kernel.py: Permission denied | 13:25 |
blackburn | make[1]: *** [Kernel_wrap.cxx] Error 1 | 13:25 |
blackburn | make[1]: Leaving directory `/home/qdrgsm/Documents/GSoC-SHOGUN/shogun_myfork/shogun/src/python_modular' | 13:25 |
blackburn | sonney2k: aha, but for some reason Kernel_wrap.cxx 'want it' | 13:25 |
@sonney2k | then erase Kernel_wrap.cxx | 13:25 |
blackburn | ok | 13:26 |
heiko | sonney2k I dont understand why you want virtual int32_t get_num_vectors_all() {return get_num_vectors();} | 13:26 |
@sonney2k | I thought that num_vectors == num_vectors_total if no subset exists. | 13:27 |
@sonney2k | and you will be overloading that function for all feature classes that support sub-indexing if I understand correctly | 13:28 |
@sonney2k | what do you propose? | 13:28 |
heiko | yes, but what if one wants to have the number of all vecs if a subset exists | 13:29 |
* sonney2k just now wants to change all of typemaps in shogun to actually create copies of the data themselves | 13:29 | |
@sonney2k | heiko, ahh, it will be overloaded in the classes that support it to return a variable num_vec_total | 13:30 |
heiko | yes indeed | 13:30 |
heiko | and it stays like your version in the classes that do not support subsets yet | 13:30 |
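The arrangement agreed on above (a base-class default where the total equals the visible count, overridden in classes that support subsets) can be sketched roughly as follows. This is a minimal illustration, not shogun's code: the class names `FeaturesSketch` and `SubsetFeaturesSketch` and the member `num_vec_total_` are hypothetical stand-ins; only the method names `get_num_vectors`, `get_num_vectors_all` and `set_feature_subset` come from the conversation.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a feature base class (not the real shogun class).
class FeaturesSketch {
public:
    virtual ~FeaturesSketch() = default;

    // Number of vectors currently visible (respects an active subset).
    virtual int32_t get_num_vectors() const = 0;

    // Default for classes without subset support: visible count == total count.
    virtual int32_t get_num_vectors_all() const { return get_num_vectors(); }
};

// Hypothetical class that supports sub-indexing and therefore overrides both.
class SubsetFeaturesSketch : public FeaturesSketch {
public:
    explicit SubsetFeaturesSketch(int32_t num_vec_total)
        : num_vec_total_(num_vec_total)
    {
        assert(num_vec_total >= 0);
    }

    // Restrict the visible vectors to the given indices.
    void set_feature_subset(std::vector<int32_t> idx)
    {
        subset_idx_ = std::move(idx);
        has_subset_ = true;
    }

    int32_t get_num_vectors() const override
    {
        return has_subset_ ? static_cast<int32_t>(subset_idx_.size())
                           : num_vec_total_;
    }

    // Always reports the full count, even while a subset is active.
    int32_t get_num_vectors_all() const override { return num_vec_total_; }

private:
    int32_t num_vec_total_;
    bool has_subset_ = false;
    std::vector<int32_t> subset_idx_;
};
```

With this shape, callers that need the full count while a subset is active ask `get_num_vectors_all()`, and classes without subset support need no changes.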
blackburn | heiko: do you have MANY warnings of StringFeatures.h? | 13:31 |
heiko | yes | 13:31 |
-!- josip [~josip@unaffiliated/josip] has quit [Quit: Lost terminal] | 13:33 | |
heiko | sonney2k, do you use any code formatter for coding style? | 13:33 |
heiko | I think it would be great to have one (at least for braces, white space, 80 symbols per line etc) | 13:35 |
@sonney2k | heiko, I am through the patch now | 13:38 |
@sonney2k | heiko, no unfortunately no | 13:38 |
@sonney2k | tab size = 4 / no spaces | 13:38 |
@sonney2k | and what I wrote in README.Developer | 13:38 |
blackburn | oh, damn, it compiles slow with all these warnings | 13:39 |
@sonney2k | heiko, I think you still need this num_vectors_total function and then ok | 13:39 |
@sonney2k | blackburn, which warnings? | 13:39 |
heiko | I read this, but it's hard to change your habits :) sometimes I overlook things because I am used to other styles, however ;) | 13:39 |
blackburn | sonney2k: ../shogun/features/StringFeatures.h:1267: warning: ignoring return value of ‘size_t fread(void*, size_t, size_t, FILE*)’, declared with attribute warn_unused_result | 13:39 |
@sonney2k | heiko, I agree difficult | 13:39 |
@sonney2k | blackburn, hmmhh | 13:39 |
blackburn | sonney2k: about 20 similar warnings for different lines | 13:40 |
blackburn | ha! even more | 13:40 |
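The `warn_unused_result` warning quoted above appears whenever the return value of `fread` is discarded. One common way to address it is to compare the returned item count with the requested one. The helper below is only a sketch of that pattern; the name `read_exact` is hypothetical and is not taken from StringFeatures.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: read exactly `count` bytes from `f` into `dst`.
// Returns false on a short read (end of file or I/O error), so the
// result of fread is never silently ignored.
bool read_exact(std::FILE* f, void* dst, std::size_t count)
{
    assert(f != nullptr && dst != nullptr);
    const std::size_t got = std::fread(dst, 1, count, f);
    return got == count;
}
```

A caller would then report an error when `read_exact` returns false instead of continuing with a partially filled buffer.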
@sonney2k | is this in current code? | 13:40 |
blackburn | sonney2k: seems so, because heiko has them too | 13:40 |
@sonney2k | blackburn, I need your opinion on sth | 13:40 |
@sonney2k | have a bit of time? | 13:40 |
blackburn | of course | 13:40 |
@sonney2k | and now for sth completely different | 13:41 |
@sonney2k | we have all these swig wrapped interfaces and typemaps for them | 13:41 |
@sonney2k | Just now when heiko was implementing this sub-setting I recognize that it has a big issue: | 13:41 |
@sonney2k | it assumes that whenever a function set_vector(float64_t* vec, int32_t len) is wrapped that this function actually copies the data | 13:42 |
@sonney2k | now that is not something one wants to have inside C++ | 13:42 |
@sonney2k | wasting memory / cycles | 13:43 |
@sonney2k | so I think that this part of the code should be in the interface code only | 13:43 |
@sonney2k | blackburn, would you agree? | 13:43 |
blackburn | I'm not quite sure I understood the problem | 13:43 |
blackburn | you mean that if we set_vector in subset | 13:44 |
blackburn | we will have to copy it? | 13:44 |
@sonney2k | blackburn, for all such functions not just subset | 13:44 |
@sonney2k | e.g. in simplefeatures there is a set_feature_matrix and also a copy_feature_matrix function | 13:44 |
blackburn | hm, why it should copy data? | 13:44 |
@sonney2k | blackburn, when we come from python/octave/R/... we cannot easily borrow a pointer to that data | 13:45 |
@sonney2k | it might get destroyed | 13:45 |
blackburn | ah | 13:45 |
blackburn | see now | 13:45 |
@sonney2k | also data representation might be totally different | 13:45 |
blackburn | so when we try to use subset we have to keep a set? | 13:45 |
@sonney2k | blackburn, same issue when returning a vector | 13:45 |
blackburn | and 10 subsets of one set will produce 10 sets, right? | 13:46 |
@sonney2k | blackburn, forget about the subset thing, just think in general terms like setting a vector and getting a vector for example. | 13:46 |
blackburn | ah | 13:46 |
blackburn | okay | 13:46 |
-!- dvevre [b49531e3@gateway/web/freenode/ip.180.149.49.227] has joined #shogun | 13:47 | |
@sonney2k | the problem is that now each function in libshogun has to have a copy_vector / set_vector function | 13:47 |
@sonney2k | hi dvevre, how is your online feature framework progressing ? | 13:47 |
@bettyboo | sonney2k, morning | 13:47 |
* sonney2k smacks betty | 13:47 | |
dvevre | sonney2k: hi. just got back today. | 13:48 |
blackburn | sonney2k: it is a non-proper way to do it, we have to make it encapsulated | 13:48 |
@sonney2k | dvevre, sry I forgot | 13:48 |
dvevre | i've made some basic stuff, hope to have something ready by tomorrow or so | 13:48 |
blackburn | I have to go for a 10-15 minutes | 13:48 |
dvevre | implementing parser and training in two separate threads | 13:48 |
@sonney2k | blackburn, I think the proper fix would be to do this in the typemaps | 13:49 |
@sonney2k | dvevre, sounds good | 13:49 |
blackburn | sonney2k: I'll mind it, ok? and will say something :) | 13:49 |
@sonney2k | sonney2k, yes talk to you when you are back l8r | 13:49 |
@sonney2k | dvevre, I hope you succeed :) | 13:50 |
@bettyboo | ;> | 13:50 |
dvevre | sonney2k: me too :) | 13:50 |
* blackburn is here | 13:57 | |
blackburn | sonney2k: but why we should use get/set for vectors in interfaces? | 13:57 |
@sonney2k | blackburn, if you are in python and have a numpy matrix, you would want it to become a shogun simplefeature object right? | 13:58 |
blackburn | aha | 13:58 |
blackburn | and when numpy matrix becomes simplefeature it is copied, right? | 13:59 |
blackburn | sonney2k: but is there any another way? | 14:01 |
blackburn | any other* :D | 14:02 |
@sonney2k | blackburn, consider that we support several languages ... octave,r,python, and more to come - so I suspect not generally no | 14:02 |
@sonney2k | for python *maybe* one can set a feature object to read only or so | 14:02 |
@sonney2k | but though | 14:02 |
@sonney2k | tough | 14:02 |
heiko | sonney2k, a question. there is a function get_features which returs a pointer to the feature array. I think this may not be supported with subsets, right? | 14:02 |
@sonney2k | heiko, no - throw an error | 14:03 |
heiko | ill put an assert | 14:03 |
@sonney2k | SG_ERROR("xxx") | 14:03 |
@sonney2k | ok | 14:03 |
blackburn | sonney2k: and what can we do for solving it? | 14:04 |
heiko | oh another thing, there is a method copy_features .. here it might make sense to only copy the subset? | 14:04 |
@sonney2k | heiko, yes | 14:05 |
@sonney2k | blackburn, well what I would propose is to copy the data once but only in the interface code - not in libshogun/ code | 14:06 |
blackburn | sonney2k: yeap, seems to be better | 14:06 |
@sonney2k | blackburn, this way we only need one function set_features() not also copy_features() | 14:06 |
@sonney2k | in libshogun | 14:07 |
blackburn | understand now, if it is possible it would be definitely better | 14:07 |
@sonney2k | blackburn, we have the same problem with get_features() though | 14:07 |
@sonney2k | this should return just the pointer and never a copy | 14:07 |
@sonney2k | but currently (because of my badly designed typemaps it copies) | 14:08 |
@sonney2k | so I would do the same thing there too | 14:08 |
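The proposal above (copy once in the interface code, never inside libshogun) can be illustrated with a small sketch. Everything here is an assumption made for illustration: `VectorHolderSketch` stands in for a library class, `InterfaceGlueSketch` for typemap-generated glue, `double` for shogun's `float64_t`, and the ownership arrangement (the glue owns the copy) is just one possible choice.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical library-side class: set/get only pass a pointer, never copy.
class VectorHolderSketch {
public:
    void set_vector(double* vec, int32_t len)
    {
        assert(len >= 0);
        vec_ = vec;
        len_ = len;
    }

    void get_vector(double*& vec, int32_t& len) const
    {
        vec = vec_;
        len = len_;
    }

private:
    double* vec_ = nullptr;
    int32_t len_ = 0;
};

// Hypothetical interface-side glue: the only place where data is copied.
class InterfaceGlueSketch {
public:
    // Copy the foreign buffer once, because the scripting language may
    // free or move its own array at any time.
    void set_from_foreign(const double* foreign, int32_t len)
    {
        owned_.assign(foreign, foreign + len);
        holder_.set_vector(owned_.data(), len);
    }

    // Copy on the way out as well, so the foreign side gets its own array.
    std::vector<double> get_for_foreign() const
    {
        double* v = nullptr;
        int32_t n = 0;
        holder_.get_vector(v, n);
        return std::vector<double>(v, v + n);
    }

private:
    std::vector<double> owned_;
    VectorHolderSketch holder_;
};
```

Under this split the library needs only one `set_*` function per field instead of a `set_*` and `copy_*` pair, and pure C++ callers pay no copying cost.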
-!- bettyboo [~bettyboo@bane.ml.tu-berlin.de] has quit [Read error: Operation timed out] | 14:08 | |
@sonney2k | this is a lot of work though - and will touch almost all code | 14:08 |
-!- bettyboo [~bettyboo@bane.ml.tu-berlin.de] has joined #shogun | 14:09 | |
blackburn | hm, why it will touch e.g. algorithms? | 14:09 |
-!- mode/#shogun [+o bettyboo] by ChanServ | 14:09 | |
@sonney2k | blackburn, algorithms return vectors too | 14:10 |
@sonney2k | e.g. svm -> alphas | 14:10 |
blackburn | ah | 14:10 |
@sonney2k | lets think of the most convenient signature for these functions first | 14:10 |
@sonney2k | void set_vector(float64_t* vec, int32_t len); | 14:11 |
@sonney2k | I think set_* functions are pretty optimal | 14:11 |
@sonney2k | now for the ones doing get_* | 14:11 |
@sonney2k | void get_vector(float64_t* &vec, int32_t &len); ? | 14:12 |
@sonney2k | (it currently is void get_vector(float64_t** vec, int32_t* len);) | 14:12 |
blackburn | yeap, float64_t* get_vector(int32_t &len) isn't so convenient | 14:12 |
blackburn | so we have no other way :) | 14:12 |
blackburn | sonney2k: don't you want to write a small wrapper to these vectors? | 14:13 |
@sonney2k | well void get_vector(float64_t* &vec, int32_t &len); | 14:13 |
@sonney2k | blackburn, like void get_vector(T_VEC<float64_t> &vec)? | 14:13 |
blackburn | yeap | 14:13 |
@sonney2k | afk | 14:14 |
blackburn | seems there will no serious overhead with using it | 14:14 |
blackburn | will be no* | 14:14 |
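The "small wrapper" idea floated above, `get_vector(T_VEC<float64_t> &vec)`, could look roughly like the non-owning view below. This is a sketch under assumptions: the names `VecViewSketch` and `AlphaHolderSketch` are hypothetical, `double` stands in for `float64_t`, and nothing here reflects how shogun actually implemented it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical non-owning (pointer, length) pair, in the spirit of T_VEC<T>.
template <class T>
struct VecViewSketch {
    T* data = nullptr;
    int32_t len = 0;

    // Bounds-checked element access (checked only in debug builds).
    T& operator[](int32_t i) const
    {
        assert(i >= 0 && i < len);
        return data[i];
    }
};

// Hypothetical class returning a vector, e.g. the alphas of an SVM.
class AlphaHolderSketch {
public:
    AlphaHolderSketch(double* alphas, int32_t n) : alphas_(alphas), n_(n) {}

    // One out-parameter instead of (double*& vec, int32_t& len); no copy is made.
    void get_vector(VecViewSketch<double>& vec) const
    {
        vec.data = alphas_;
        vec.len = n_;
    }

private:
    double* alphas_;
    int32_t n_;
};
```

Since the view holds only a pointer and an integer, passing it by reference should add no meaningful overhead, which matches the expectation voiced above.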
blackburn | sonney2k: pull request with some refactoring of preproc is in github | 14:17 |
-!- dvevre [b49531e3@gateway/web/freenode/ip.180.149.49.227] has quit [Quit: Page closed] | 14:34 | |
-!- dvevre [b49531e3@gateway/web/freenode/ip.180.149.49.227] has joined #shogun | 14:34 | |
-!- dvevre_ [b49531e3@gateway/web/freenode/ip.180.149.49.227] has joined #shogun | 14:49 | |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 14:50 | |
blackburn | sonney2k: I started to work on kPCA from scratch :) | 15:10 |
@bettyboo | yep | 15:10 |
blackburn | bettyboo: do you like it? | 15:10 |
blackburn | have to go now | 15:10 |
@bettyboo | blackburn: not something I'd like to have... I am not so rarely training on datasets that rarely once fit in memory | 15:10 |
blackburn | see you | 15:10 |
-!- blackburn [~qdrgsm@188.168.4.102] has quit [Quit: Leaving.] | 15:10 | |
@sonney2k | re | 15:16 |
CIA-8 | shogun: Sergey Lisitsyn master * ra9d9eaf / (15 files): Refactored preproc package for using EPreProcType enum - http://bit.ly/hODES8 | 15:19 |
CIA-8 | shogun: Heiko Strathmann master * r0bd912d / (2 files): | 15:21 |
CIA-8 | shogun: added getter for subset matrix that is swig compatible | 15:21 |
CIA-8 | shogun: small changed to std getter | 15:21 |
CIA-8 | shogun: method subset_idx_conversion() is now not pure virtual anymore | 15:21 |
CIA-8 | shogun: added comments - http://bit.ly/e1iCJa | 15:21 |
CIA-8 | shogun: Soeren Sonnenburg master * r166a5bc / (3 files): Merge branch 'master' of https://github.com/karlnapf/shogun - http://bit.ly/igIzav | 15:21 |
CIA-8 | shogun: Heiko Strathmann master * r601b8ab / src/libshogun/features/StringFeatures.h : | 15:21 |
CIA-8 | shogun: max_string_length adjustments with respect to subsets | 15:21 |
CIA-8 | shogun: copy_features() now works woth subsets | 15:21 |
CIA-8 | shogun: code cleanups - http://bit.ly/h7t6nU | 15:21 |
-!- CIA-8 was kicked from #shogun by bettyboo [flood] | 15:21 | |
-!- CIA-110 [~CIA@208.69.182.149] has joined #shogun | 15:21 | |
@sonney2k | mlsec, could betty please not kill our CIA bot? | 15:22 |
@bettyboo | sonney2k: betty boo is a bot | 15:22 |
dvevre_ | :) | 15:22 |
@sonney2k | who would have thought that | 15:22 |
@mlsec | what's the problem with cia and batty | 15:27 |
@sonney2k | mlsec, batty is using her bat | 15:27 |
@sonney2k | and kicking CIA | 15:27 |
@mlsec | CIA-8 kicked from #shogun by bettyboo: flood | 15:28 |
@bettyboo | mlsec: kicked and bored you | 15:28 |
@mlsec | i don't see a problem ;) "kicked and bored you" | 15:28 |
@mlsec | ok seriously, it's a flood protection. i will raise the limit accordingly | 15:29 |
@mlsec | so we can raise this to 30 messages in 60 seconds | 15:30 |
heiko | hi | 15:32 |
@bettyboo | yo heiko | 15:32 |
heiko | hi bettyboo ;) | 15:33 |
@bettyboo | heiko: machine learning rocks! {;)} | 15:33 |
-!- bettyboo [~bettyboo@bane.ml.tu-berlin.de] has quit [Remote host closed the connection] | 15:33 | |
heiko | sonney2k, what about the typemaps? any news? | 15:33 |
-!- siddharth [~siddharth@117.211.88.150] has quit [Ping timeout: 248 seconds] | 15:34 | |
@sonney2k | heiko, no not yet | 15:34 |
heiko | so, I need to change this setter? | 15:34 |
-!- bettyboo [~bettyboo@bane.ml.tu-berlin.de] has joined #shogun | 15:34 | |
-!- mode/#shogun [+o bettyboo] by ChanServ | 15:34 | |
heiko | because i want to build an example | 15:34 |
@sonney2k | not easy to do in a few minutes either | 15:34 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Ping timeout: 252 seconds] | 15:34 | |
@sonney2k | then yes change things for now - it will take me probably one full day to change the typemaps | 15:35 |
heiko | alright | 15:35 |
@sonney2k | and I cannot do this incrementally really - it is one big patch | 15:35 |
heiko | so the parameters of the setter have to be pointers to scalars/matrix-pointers right? | 15:36 |
heiko | and also, i was wondering what i have to do in this Features.i file to access the functions from python | 15:36 |
@sonney2k | heiko, look at how it is done in features/SimpleFeatures (for get and copy feature_matrix | 15:39 |
@sonney2k | ) | 15:39 |
@sonney2k | in the *.i file you have to add %apply (int32_t* IN_ARRAY1, int32_t DIM1) {(int32_t* subset_idx, int32_t subset_len)}; | 15:41 |
@sonney2k | then the set_feature_subset function will work | 15:41 |
heiko | ok so all functions with that signature then will work? | 15:41 |
@sonney2k | heiko, btw subset_idx_conversion should be virtual | 15:41 |
@sonney2k | yes | 15:42 |
@sonney2k | you want to override it later in stringfeatures right? | 15:42 |
@sonney2k | (in both Features.h and StringFeatures.h) | 15:42 |
heiko | ah yes, i forgot | 15:43 |
heiko | which .i file do i need? modular or python_modular? | 15:44 |
@sonney2k | heiko, same for set_feature_subset | 15:44 |
@sonney2k | in StringFeatures | 15:44 |
@sonney2k | modular/Features.i | 15:44 |
@sonney2k | there is only one | 15:44 |
@sonney2k | the ones in *_modular are symlinks | 15:44 |
heiko | ah ok, did not see that in eclipse :) | 15:45 |
heiko | just to be sure: | 15:49 |
heiko | there are two functions for setting a feature vector in StringFeatures: | 15:49 |
heiko | set_feature_vector(int32_t num, ST* string, int32_t len) | 15:49 |
heiko | set_feature_vector(ST* src, int32_t len, int32_t num) | 15:49 |
heiko | one copies, and one does not | 15:49 |
heiko | but the signature has to be different because of swig? | 15:50 |
@sonney2k | heiko, yes | 15:53 |
@sonney2k | that will go away when I will have done this part of the refactoring | 15:53 |
@sonney2k | no idea when though | 15:53 |
@sonney2k | heiko, if you do fixes like the virtual one etc - please always try to keep the patches small | 15:54 |
@sonney2k | it is much more readable for me this way (reviewing takes lots of time too ...) | 15:55 |
heiko | ok | 15:55 |
heiko | compile time is so long :( i always have to wait for ages to see if the changes compile smoothly :) | 15:55 |
@sonney2k | heiko, install ccache | 15:56 |
@sonney2k | and --enable-debug --disable-optimization | 15:56 |
* sonney2k recompiles | 15:56 | |
@sonney2k | . | 15:56 |
@sonney2k | . | 15:56 |
@sonney2k | . | 15:56 |
@sonney2k | done | 15:56 |
heiko | ok, i will try this | 15:57 |
heiko | oh i see there already is a line in the Features.i file :) | 16:06 |
@sonney2k | for subset_idx ? | 16:07 |
heiko | is the parameter name important? | 16:07 |
heiko | %apply (uint32_t* IN_ARRAY1, int32_t DIM1) {(uint32_t* src, int32_t len)}; | 16:07 |
@sonney2k | sure | 16:07 |
heiko | oh sorry | 16:07 |
heiko | (this stuff is new to me) | 16:07 |
@sonney2k | not really sth one wants to deal about when doing algorithm design | 16:08 |
@sonney2k | s/about/with | 16:09 |
heiko | hehe yes indeed :) | 16:09 |
heiko | oh, do you care about the int32_t uint32_t types? | 16:09 |
heiko | because i only used int32_t | 16:09 |
heiko | but only indices are stored | 16:10 |
@sonney2k | we (currently) use int32_t for indices | 16:10 |
heiko | ok nice | 16:10 |
heiko | this should be the line for the getter right? | 16:12 |
heiko | (int32_t** subset_idx, int32_t* subset_len) | 16:12 |
heiko | sorry | 16:12 |
heiko | %apply (int32_t** ARGOUT2, int32_t* DIM1) {(int32_t** subset_idx, int32_t* subset_len)}; | 16:12 |
@sonney2k | yes | 16:13 |
@sonney2k | wait argout1 | 16:14 |
@sonney2k | it is only a vector so 1-dimensional | 16:14 |
heiko | ah yes sorry | 16:14 |
heiko | pull request is out | 16:20 |
heiko | i am going home now | 16:20 |
heiko | have a nice evening | 16:20 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has left #shogun [] | 16:21 | |
@mlsec | whoah. every conversation is full of code snippets | 16:28 |
@mlsec | hopefully betty is tracking all this code for later fun | 16:30 |
@bettyboo | mlsec: anyway, I have no problem with doing this, may I ask you later about some details of it by email or in irc? | 16:30 |
@mlsec | yes, of course! | 16:31 |
* sonney2k wonders when betty will pass the turing test | 16:32 | |
@sonney2k | err did pass the turing test | 16:32 |
@mlsec | she passed already | 16:33 |
@mlsec | she convinced the committee with her profound knowledge of c++ coding and machine learning algorithms | 16:34 |
@mlsec | bettyboo, you are a c++ expert right? | 16:34 |
@bettyboo | mlsec: well it still needs to be found in the interfaces right? So you changed that too? | 16:34 |
@mlsec | hmm | 16:34 |
* sonney2k wonders what his well count is | 16:35 | |
@mlsec | well count? | 16:35 |
@sonney2k | nr of times I am saying 'well' | 16:36 |
@mlsec | is it a measure of confusion? | 16:37 |
@sonney2k | I suspect so | 16:40 |
@mlsec | bettyboo: did you know that any 2-class svm can be used as a one-class svm if only one class is provided and no bias term is learned? | 16:50 |
@bettyboo | mlsec: ah! http://help.github.com/fork-a-repo/ | 16:50 |
@mlsec | bettyboo: marius told me so, i need check this now. seems strange | 16:50 |
@bettyboo | mlsec: i think i saw at least one person on the mailing list talking about his PhD & working on this... | 16:50 |
@mlsec | ah, a phd student takes care of this already! | 16:51 |
@mlsec | !calc sqrt((0.73 * 0.73) * 2) | 17:03 |
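No result for the `!calc` request appears in the log. Worked by hand, the expression reduces to the base times the square root of two:

$$\sqrt{0.73^2 \cdot 2} = 0.73\sqrt{2} \approx 1.0324$$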
alesis-novik | Good morning | 17:13 |
alesis-novik | sonney2k, did you have a chance to view my pull request? | 17:14 |
* mlsec waiting for compilation of m2crypto and soapy to finish | 17:28 | |
-!- dvevre_ [b49531e3@gateway/web/freenode/ip.180.149.49.227] has quit [Quit: Page closed] | 17:48 | |
-!- dvevre_ [b49531e3@gateway/web/freenode/ip.180.149.49.227] has joined #shogun | 17:49 | |
-!- dvevre [b49531e3@gateway/web/freenode/ip.180.149.49.227] has quit [Ping timeout: 252 seconds] | 18:06 | |
alesis-novik | I'll be Bach later, any comments on the code will be appreciated | 18:17 |
-!- alesis-novik [~alesis@188.74.87.84] has quit [Quit: I'll be Bach] | 18:17 | |
@mlsec | see you, Bach! | 18:23 |
@mlsec | ^_^ | 18:23 |
-!- dvevre_ [b49531e3@gateway/web/freenode/ip.180.149.49.227] has quit [Quit: Page closed] | 18:37 | |
CIA-110 | shogun: Soeren Sonnenburg master * r5eebbac / src/configure : | 18:42 |
CIA-110 | shogun: Fix configure to disable any kind of optimization | 18:42 |
CIA-110 | shogun: ...when --disable-optimization is selected. - http://bit.ly/dMtcxE | 18:42 |
CIA-110 | shogun: Soeren Sonnenburg master * r67272f1 / src/libshogun/preproc/KernelPCACut.cpp : fix typo - it is KPCA*CUT* - http://bit.ly/h6ooDv | 18:42 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has joined #shogun | 18:43 | |
CIA-110 | shogun: Soeren Sonnenburg master * rd1be826 / src/libshogun/classifier/Perceptron.cpp : Merge branch 'master' of https://github.com/kisa12012/shogun - http://bit.ly/f14eiU | 18:49 |
serialhex | sonney2k: did you get my e-mail? | 18:52 |
CIA-110 | shogun: Soeren Sonnenburg master * r65d3d14 / src/libshogun/classifier/Perceptron.cpp : whitespace fix - http://bit.ly/gxb5xj | 18:53 |
@sonney2k | serialhex, I don't think so / when did you send it? | 18:54 |
@sonney2k | serialhex, ok I got it | 18:54 |
@sonney2k | didn't read it | 18:54 |
@sonney2k | ye | 18:54 |
@sonney2k | t | 18:54 |
@sonney2k | serialhex, now the variable name is variance right? | 18:55 |
@sonney2k | so just rename it into std_dev | 18:56 |
serialhex | ok... i thought so | 18:56 |
* serialhex makes changes to docs | 18:56 | |
@sonney2k | and to variable name I hope :) | 18:56 |
serialhex | yes, that too :D | 18:57 |
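The reason for the rename can be shown with a short sketch: when a standard normal draw is shifted and scaled, the multiplier is the standard deviation, not the variance. The function names below are hypothetical and are not the ones in Mathematics.h.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: map a standard normal value z to N(mean, std_dev^2).
// The multiplier is the standard deviation; passing the variance here would
// give the wrong spread whenever std_dev != 1.
double scale_standard_normal(double z, double mean, double std_dev)
{
    assert(std_dev >= 0.0);
    return mean + std_dev * z;
}

// Draw one sample from N(mean, std_dev^2) using the helper above.
double sample_normal(std::mt19937& rng, double mean, double std_dev)
{
    std::normal_distribution<double> standard(0.0, 1.0);
    return scale_standard_normal(standard(rng), mean, std_dev);
}
```

For example, a distribution with variance 9 needs `std_dev = 3`, so a standard normal value of 2 with mean 1 maps to 1 + 3 * 2 = 7.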
-!- dvevre_ [b49531e3@gateway/web/freenode/ip.180.149.49.227] has joined #shogun | 19:00 | |
serialhex | sonney2k: sent pull request | 19:03 |
serialhex | i figured that i'd also add the docs to the beginning of the float64_t version also, as that's how it is in the rest of the code | 19:09 |
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has joined #shogun | 19:24 | |
-!- blackburn [~qdrgsm@188.168.3.124] has joined #shogun | 19:34 | |
-!- alesis-novik [~alesis@188.74.87.84] has joined #shogun | 19:54 | |
alesis-novik | sonney2k, I saw the comments you did, I'll make the changes now | 19:54 |
yayo3 | hey, can someone test ExponentialKernel please? | 19:56 |
alesis-novik | Also, for the empty constructor the idea was to make a 1-dimensional 0 mean 1 variance Gaussian distribution | 19:56 |
blackburn | yayo3: what do you mean saying 'test'? | 19:56 |
yayo3 | I tested it from python modular interface and it seems to crash. I'm rebuilding it now to be sure it's not on my side | 19:56 |
blackburn | qdrgsm@blackburn-R519:~/Documents/GSoC-SHOGUN/shogun_myfork/shogun/examples/undocumented/python_modular$ python kernel_exponential_modular.py | 19:57 |
blackburn | Exponential | 19:57 |
blackburn | doesn't crash on my | 19:57 |
yayo3 | right. shogun rebuild then | 19:58 |
blackburn | anyway working | 20:00 |
alesis-novik | Do I need to add all fields for serialization? Or just the "important" ones, because some of the others are derived and stored for faster execution? | 20:05 |
-!- akhil_ [75d35896@gateway/web/freenode/ip.117.211.88.150] has quit [Quit: Page closed] | 20:13 | |
yayo3 | alesis-novik: I'd guess you need to either store everything, or recompute them on deserialization | 20:20 |
yayo3 | but I'm no sonney2k :) | 20:20 |
alesis-novik | I probably didn't search well enough, but if anyone can tell me how to serialize matrices of vectors (as in pointers), it would be helpful | 20:27 |
blackburn | just like vector, I guess | 20:29 |
alesis-novik | blackburn, as in? | 20:37 |
blackburn | hm.. | 20:38 |
blackburn | will look for some example | 20:38 |
blackburn | alesis-novik: there are methods of m_parameters: add_vector(..) and add_matrix(..) | 20:40 |
blackburn | just use it | 20:40 |
alesis-novik | many thanks blackburn | 20:40 |
blackburn | np :) | 20:41 |
yayo3 | hmm | 20:41 |
alesis-novik | blackburn, thoughts on adding everything or just essentials for serialization? | 20:41 |
yayo3 | it appears the testfiles link to ../data but need files in ../data/toy | 20:42 |
blackburn | yayo3: yeap, don't understand it too | 20:42 |
blackburn | alesis-novik: I'm not quite sure | 20:42 |
blackburn | alesis-novik: for example https://github.com/shogun-toolbox/shogun/blob/master/src/libshogun/preproc/RandomFourierGaussPreproc.cpp | 20:43 |
yayo3 | I think some guy (serialhex, maybe?) added them just a few days ago | 20:43 |
yayo3 | I wonder if he meant something by that | 20:43 |
blackburn | alesis-novik: 73-86 lines | 20:43 |
blackburn | alesis-novik: seems that everything is serializing there | 20:43 |
@mlsec | w | 20:45 |
@mlsec | oops wrong terminal | 20:45 |
@mlsec | ;) | 20:45 |
blackburn | wer ist das? :D | 20:48 |
@bettyboo | HA blackburn | 20:48 |
@mlsec | anyway, time for some lunch | 20:49 |
-!- yayo3 [~jakub@ip-78-45-113-245.net.upcbroadband.cz] has quit [Quit: leaving] | 21:06 | |
serialhex | mlsec: what time zone are you in? pacific?? | 21:13 |
serialhex | (though i just had breakfast & i'm on the east coast ~3:15pm here so maybe i shouldn't speak :P ) | 21:14 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun | 21:14 | |
heiko | hi, anybody here? | 21:19 |
@sonney2k | logo | 21:19 |
heiko | na logo :) | 21:19 |
heiko | wanted to ask about this registering of cross-val parameters: do you already have any clues how this should be done? | 21:20 |
heiko | do we actually have to talk in englisch btw? :) | 21:20 |
alesis-novik | That or Russian (majority rules) :D | 21:22 |
heiko | hehe :) what about french? ;) | 21:22 |
alesis-novik | serialhex, I had breakfast at 6 PM today | 21:22 |
blackburn | wo kann ich tanzen gehen? [German: where can I go dancing?] | 21:22 |
blackburn | koennen Sie ein Restaurant empfehlen? [German: can you recommend a restaurant?] | 21:23 |
blackburn | :D | 21:23 |
@bettyboo | ;> | 21:23 |
heiko | sehr gut | 21:23 |
blackburn | ich bin allergisch gegen Penizillin [German: I am allergic to penicillin] | 21:23 |
@sonney2k | blackburn, good to know | 21:24 |
@sonney2k | heiko, I haven't thought this part through | 21:24 |
blackburn | sonney2k: just opened german phrase book :D | 21:24 |
blackburn | es schmeckt ausgezeichnet [German: it tastes excellent] :D | 21:24 |
blackburn | wir sind Touristen [German: we are tourists] | 21:25 |
blackburn | I really don't sure that I talking hehehe | 21:25 |
blackburn | what* | 21:25 |
@sonney2k | heiko,what I would try is to use the framework we currently use for serialization and either add a flag there (I am a x-validation parameter) or have a new variable m_xval_parameter and register the variables in there the same way as m_parameter works | 21:27 |
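The second variant mentioned above (a separate `m_xval_parameter` registry filled the same way as `m_parameter`) might be sketched as follows. All types here are hypothetical simplifications: `ParameterListSketch` is not shogun's parameter class, only `double` values are handled, and the registry members are named `m_parameters` and `m_xval_parameters` purely for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry entry: a name plus a pointer to the member it describes.
struct ParamEntrySketch {
    std::string name;
    double* value;
};

// Hypothetical registry, loosely modelled on the idea of m_parameters.
class ParameterListSketch {
public:
    void add(double* value, const std::string& name)
    {
        assert(value != nullptr);
        entries_.push_back({name, value});
    }

    const std::vector<ParamEntrySketch>& entries() const { return entries_; }

private:
    std::vector<ParamEntrySketch> entries_;
};

// Hypothetical classifier: everything is serialized, but only C is a
// candidate for tuning during cross-validation.
class ClassifierSketch {
public:
    ClassifierSketch()
    {
        m_parameters.add(&C_, "C");
        m_parameters.add(&bias_, "bias");
        m_xval_parameters.add(&C_, "C");
    }

    // The registries store pointers to members, so copying is disabled.
    ClassifierSketch(const ClassifierSketch&) = delete;
    ClassifierSketch& operator=(const ClassifierSketch&) = delete;

    ParameterListSketch m_parameters;       // used for serialization
    ParameterListSketch m_xval_parameters;  // used by model selection

    double C_ = 1.0;
    double bias_ = 0.0;
};
```

A model-selection routine could then iterate over `m_xval_parameters.entries()` and write candidate values through the stored pointers without knowing the concrete class.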
blackburn | guys, give me a lesson of German :D | 21:27 |
@sonney2k | blackburn, Eierlegendewollmilchsau | 21:28 |
blackburn | sonney2k: nice, what does it mean? | 21:28 |
@sonney2k | blackburn, look in the shogun docs - the elwms interface :D | 21:28 |
blackburn | I saw it earlier | 21:28 |
heiko | hehe eggproducingwoolmilkpig :D | 21:29 |
blackburn | ehehe | 21:29 |
blackburn | sonney2k: человеконенавистничество [Russian: misanthropy] | 21:29 |
@sonney2k | heiko, does that sound like a plan? | 21:29 |
heiko | sonney2k: yes, i would prefer the second variant | 21:29 |
heiko | sonney2k, to have this variable in the uppermost cross-val class | 21:30 |
heiko | the CParameterSetting does not need the cross-val parameters, right? | 21:31 |
heiko | but CSettingGenerator and eventually some classes that use this one | 21:31 |
alesis-novik | sonney2k, submitted some of the changes | 21:31 |
alesis-novik | I kind of want to keep the empty constructor constructing the 0 mean 1 variance Gaussian | 21:32 |
@sonney2k | heiko, I think this variable should be in CSGObject | 21:32 |
heiko | sonney2k, really? why? most of the objects wont need it | 21:32 |
@sonney2k | heiko, well but at least in CClassifier (aka CMethod ) | 21:34 |
@sonney2k | but I am a bit afraid that this may not be enough | 21:36 |
@sonney2k | it could be that a preprocessor has some parameter | 21:36 |
@sonney2k | or even a feature vector for scaling | 21:36 |
heiko | how should the cross-validation take place? at which point are the subsets created? | 21:37 |
@sonney2k | Some ModelSelection class should get as argument what kind of data splitting, what kind of performance measure, which model | 21:40 |
blackburn | what's up with warnings at StringFeatures.h? | 21:43 |
heiko | still there | 21:44 |
blackburn | I see :) but why did it appear? | 21:44 |
CIA-110 | shogun: Justin Patera master * rc884b90 / src/libshogun/lib/Mathematics.h : Changed "variance" to "std_dev" as it actually uses the standard deviation to shift the normal distribution, not variance. - http://bit.ly/eAdNmY | 21:44 |
CIA-110 | shogun: Justin Patera master * r05a2219 / src/libshogun/lib/Mathematics.h : Small edit in comments - http://bit.ly/fmMqPL | 21:44 |
CIA-110 | shogun: Soeren Sonnenburg master * rf14d1d6 / src/libshogun/lib/Mathematics.h : Merge branch 'master' of https://github.com/serialhex/shogun - http://bit.ly/htnvpa | 21:44 |
heiko | i havent looked at it yet | 21:45 |
blackburn | IIRC for gaussian variance is standard deviation | 21:45 |
blackburn | or not? | 21:45 |
blackburn | not sure with english terminology | 21:45 |
blackburn | ah, variance is squared, ok | 21:46 |
heiko | sonney2k, i think i am missing something, why does the classifier class have to know the cross-val parameters? i thought multiple classifier instances are created, or one which is changed. But I thought the logic takes place in this ModelSelection class and the classifiers stay more or less the same | 21:47 |
@sonney2k | heiko, ehh could you please flip the int32_t subset_len and int32_t* subset_idx ? | 21:47 |
@sonney2k | or did you update that already in the patch? | 21:48 |
@sonney2k | I mean first the pointer then the length | 21:48 |
heiko | i think i have done this | 21:49 |
heiko | yes, just checked | 21:49 |
blackburn | * 16 of 60 contributed and seems there will be no more | 21:49 |
alesis-novik | blackburn, my patch is in the queue | 21:49 |
blackburn | alesis-novik: is it your first patch? | 21:50 |
blackburn | I count it including you too | 21:50 |
@sonney2k | heiko, line 287 in Features.cpp ? | 21:50 |
alesis-novik | I went unnoticed before the pull explanation, so I rewrote some things and added some things and changed some things | 21:50 |
blackburn | anyway, that 'contribution' is really easy way to terrify some weak candidates :D | 21:51 |
@sonney2k | heiko, or is that an additional function? | 21:51 |
heiko | sonney2k, yes, the swig function has a correct signature, line 296 | 21:51 |
alesis-novik | Next I'll patch PCA so we can choose the number of principal components or a percentage of variance explained | 21:51 |
@sonney2k | heiko, ahh so the other is for non-swig | 21:52 |
@sonney2k | ok | 21:52 |
* sonney2k really has to fix that | 21:52 | |
alesis-novik | And fix anything else sonney2k might not like in Gaussian | 21:52 |
CIA-110 | shogun: Heiko Strathmann master * r06a9d3e / (3 files in 3 dirs): Merge remote branch 'upstream/master' - http://bit.ly/fChcZh | 21:52 |
CIA-110 | shogun: Heiko Strathmann master * r2da8fc1 / src/libshogun/features/Features.cpp : remove unnecessary NULL check - http://bit.ly/h4Ko6N | 21:52 |
CIA-110 | shogun: Soeren Sonnenburg master * r698417c / (4 files in 2 dirs): Merge branch 'master' of https://github.com/karlnapf/shogun - http://bit.ly/i6gbdk | 21:52 |
CIA-110 | shogun: Heiko Strathmann master * r57b2e4f / src/modular/Features.i : corrected line for setter of subset indices of CStringFeatures - http://bit.ly/fmkz2T | 21:52 |
CIA-110 | shogun: Heiko Strathmann master * ra334ac8 / src/libshogun/features/Features.cpp : corrected signature for setter functions - http://bit.ly/h0vojC | 21:52 |
heiko | blackburn, your example for the distantsegments kernel needs the shogun-data package? because it produces an error when i start it without it | 21:53 |
heiko | blackburn, so what data is in your examples/undocumented/data dir? mine is empty | 21:54 |
@sonney2k | heiko, I finally understand your question. Well e.g. GaussianKernel needs to be able to register its parameters such that one can ask the gaussian kernel what parameters it has and also set them | 21:55 |
@sonney2k | blackburn, what are these warnings you are talking about? | 21:56 |
heiko | sonney2k, but aren't these parameters independent from cross-validation? | 21:58 |
heiko | i mean, they are fixed for one fold arent they? | 21:58 |
-!- dvevre_ is now known as dvevre | 21:58 | |
@sonney2k | heiko, we have a misunderstanding I think | 22:01 |
heiko | yes, indeed :) | 22:01 |
heiko | about the word: parameter | 22:02 |
@sonney2k | I mean you have to specify the parameters at some point I mean you have to assign e.g. kernel_width=1.3 to the gaussian parameter | 22:02 |
heiko | and these parameters, you want to add to an extra variable x-fold-parameters? | 22:03 |
heiko | (btw these warnings blackburn talks about come from ignored return values when using fread) | 22:04 |
@sonney2k | heiko, I want to register the 'kernel_width' yes but not the value | 22:06 |
heiko | so all classes register the parameters that have to be found by cross-val. then the ModelSelection class sees what parameters are registered and sets these during the search | 22:08 |
@sonney2k | heiko, well the model selection class generates a CParameterConfiguration or so that then is set via set_configuration(par) | 22:11 |
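A minimal sketch of the registration idea discussed above, assuming a plain name-to-pointer registry: each object registers the name of a tunable parameter (not its value), and an outside search can list the names and set values through them. `ParameterRegistry` and `ToyGaussianKernel` are hypothetical names for illustration and are not shogun classes; only `m_xval_parameter` and `kernel_width` come from the conversation.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical registry: maps a parameter name to the member it controls.
class ParameterRegistry
{
public:
    // Register a name together with the address of the member to tune.
    void add(const std::string& name, double* target) { m_params[name] = target; }

    // List the registered names so a search can discover them.
    std::vector<std::string> names() const
    {
        std::vector<std::string> out;
        for (const auto& kv : m_params)
            out.push_back(kv.first);
        return out;
    }

    // Write a value through the registered pointer.
    void set(const std::string& name, double value)
    {
        auto it = m_params.find(name);
        if (it == m_params.end())
            throw std::invalid_argument("unknown parameter: " + name);
        *it->second = value;
    }

private:
    std::map<std::string, double*> m_params;
};

// Toy kernel (an assumption, not the real CGaussianKernel) that registers
// its width by name in a member called m_xval_parameter.
class ToyGaussianKernel
{
public:
    ToyGaussianKernel() : m_width(1.0)
    {
        m_xval_parameter.add("kernel_width", &m_width);
    }

    // Copying would leave the registry pointing at the source object's member.
    ToyGaussianKernel(const ToyGaussianKernel&) = delete;
    ToyGaussianKernel& operator=(const ToyGaussianKernel&) = delete;

    double width() const { return m_width; }
    ParameterRegistry& xval_parameters() { return m_xval_parameter; }

private:
    double m_width;
    ParameterRegistry m_xval_parameter;
};
```

A model selection loop could then call `xval_parameters().names()` to see what is tunable and `set("kernel_width", 1.3)` before each training run, without knowing the concrete kernel type.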
alesis-novik | heiko, what search is this? | 22:12 |
CIA-110 | shogun: Soeren Sonnenburg master * r32b1b14 / (4 files in 2 dirs): Merge branch 'master' of https://github.com/yayo3/shogun - http://bit.ly/ig0hPd | 22:15 |
blackburn | sorry, I'm here now | 22:16 |
blackburn | (11:53:47 PM) heiko: blackburn, your example for the distantsegments kernel needs the shogun-data package? because it produces an error when i start it without it --- yes, it does | 22:16 |
blackburn | (11:54:45 PM) heiko: blackburn, so what data is in your examples/undocumented/data dir? mine is empty -- it should contain shogun-data/toy files | 22:16 |
blackburn | (11:56:26 PM) sonney2k: blackburn, what are these warnings you are talking about? --- ../shogun/features/StringFeatures.h:1305: warning: ignoring return value of ‘size_t fread(void*, size_t, size_t, FILE*)’, declared with attribute warn_unused_result | 22:17 |
blackburn | heiko, and btw, examples/undocumented/data is symlink to ../data | 22:17 |
blackburn | sonney2k: don't you have these warnings? | 22:18 |
* sonney2k compiles again | 22:18 | |
heiko | sonney2k, blackburn, i just removed these warnings in StringFeatures by adding an ASSERT for this ignored return data of fread() | 22:19 |
blackburn | heiko: is it in master now? | 22:20 |
heiko | sent pull request | 22:20 |
@sonney2k | heiko, just doing ASSERT(fread(xxx)) is a bit dangerous | 22:20 |
heiko | alesis-novik, search for optimal parameters in cross-validation | 22:21 |
@sonney2k | ASSERT might be defined as nothing | 22:21 |
@sonney2k | so the fread would not be there | 22:21 |
alesis-novik | heiko, so your project is creating this massive CV platform? | 22:21 |
@sonney2k | alesis-novik, yes | 22:22 |
@sonney2k | probably the project with the highest difficulty ... | 22:22 |
@sonney2k | level | 22:22 |
heiko | sonney, but if the output is wrong or of wrong length, the function fails anyway | 22:22 |
alesis-novik | Sounds like it | 22:22 |
alesis-novik | If nothing else, the engineering part sounds very difficult | 22:23 |
alesis-novik | what are the current thoughts on how to make the parameter search work? | 22:23 |
@sonney2k | heiko, yes but an ASSERT can be defined as nothing so ASSERT( some expression ) -> <nothing here> | 22:24 |
blackburn | sonney2k: what's up with problem on copying data vectors? | 22:25 |
@sonney2k | heiko, so the right fix would be to do if (!expr) SG_ERROR("bla") | 22:25 |
@sonney2k | blackburn, haven't done anything | 22:25 |
@sonney2k | yet | 22:25 |
heiko | ok, i will fix it | 22:25 |
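A minimal sketch of the ASSERT point made above, assuming a build in which the assertion macro expands to nothing. `RELEASE_ASSERT`, `read_inside_assert` and `read_checked` are hypothetical names for illustration; the real code would use shogun's own ASSERT and SG_ERROR, which are not reproduced here.

```cpp
#include <cstddef>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-in for an assertion macro that a release build
// defines as nothing: its argument is never evaluated.
#define RELEASE_ASSERT(expr) ((void)0)

// Risky: the fread call sits inside the macro argument, so it vanishes
// together with the check and the buffer is never filled.
inline std::size_t read_inside_assert(std::FILE* f, char* buf, std::size_t n)
{
    (void)f; (void)buf; (void)n; // unused once the macro expands to nothing
    std::size_t got = 0;
    RELEASE_ASSERT((got = std::fread(buf, 1, n, f)) == n);
    return got;
}

// Safer: always perform the read, then test the result explicitly.
inline std::size_t read_checked(std::FILE* f, char* buf, std::size_t n)
{
    const std::size_t got = std::fread(buf, 1, n, f);
    if (got != n)
        throw std::runtime_error("short read");
    return got;
}
```

With the macro disabled, `read_inside_assert` returns 0 and leaves the buffer untouched, while `read_checked` still reads the data and reports a short read in every build configuration.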
blackburn | sonney2k: I'd like to do some structuring in libshogun | 22:26 |
blackburn | sonney2k: have any ideas? | 22:26 |
@sonney2k | blackburn, lenin or stalin? | 22:26 |
blackburn | sonney2k: it is good at all but neither | 22:26 |
blackburn | today I'll not joke about vodka and stalin, it is an exceptional day :) | 22:27 |
@sonney2k | blackburn, what exactly do you want to do? | 22:27 |
blackburn | sonney2k: want to make the classes stricter, etc. maybe we could find weak design somewhere | 22:28 |
blackburn | oh! one idea | 22:29 |
blackburn | sonney2k: in preproc base class contains constructor with name and type | 22:29 |
@sonney2k | blackburn, the only idea I have is rename CClassifier into CMethod and change classify into apply(), same for CLinearClassifier -> CLinearMethod | 22:29 |
@sonney2k | or Machine :) | 22:30 |
blackburn | and we may do it similar way in classifiers | 22:30 |
alesis-novik | blackburn, I'd wait before the submission amount goes down a bit (which it will when the actual selected students are announced) | 22:31 |
blackburn | alesis-novik: why? | 22:31 |
blackburn | sonney2k: CPreProc(const char* name, EPreProcType type); | 22:31 |
blackburn | sonney2k: and may be Classifier(const char* name, EClassifierType type) will be a good practice too | 22:32 |
alesis-novik | Well, if you'll start changing the structure and not only names, there might be merging problems | 22:32 |
@sonney2k | I have mixed feelings about this too but it is difficult also when we do this too late | 22:32 |
blackburn | sonney2k: anyway, we should do it _same_ | 22:33 |
blackburn | because in classifiers we shadow virtual get_name() and get_type() | 22:33 |
blackburn | but in preproc just use this constructor | 22:33 |
alesis-novik | I'd stick to the idea of doing this after students are announced, unless the changes can't really break anything for other people | 22:33 |
@sonney2k | blackburn, that is true | 22:34 |
blackburn | alesis-novik: I think my proposal ^^ will not change anything | 22:34 |
blackburn | alesis-novik: just some internal 'design' issue | 22:34 |
alesis-novik | true | 22:35 |
alesis-novik | going away for a bit. Going to start on PCA if there are no comments for the current pull | 22:36 |
blackburn | sonney2k: the only reason why overloading is better - we don't have to store char* name in each object, just in class :) | 22:36 |
blackburn | alesis-novik: I have started working on kPCA from the blank today, may be we will exchange with some experience :) | 22:36 |
@sonney2k | blackburn, I think it is better to only have this in the virtual get_name function | 22:38 |
alesis-novik | blackburn, I was going to work on that next, but it has more to do with your project anyway. For PCA I just want to make more principal component selection options (number, percentage of variance explained) | 22:38 |
heiko | sonney2k, so i will start with adding a x-cross-validation parameter set to CSGObject and use the gaussian kernel as test for it | 22:39 |
blackburn | sonney2k: why? using constructor seems to me better.. | 22:39 |
@sonney2k | blackburn, otherwise we need to call the base constructor *everywhere* with the name | 22:40 |
@sonney2k | blackburn, when we have 3 constructors, we have to put the name there 3 times | 22:40 |
blackburn | sonney2k: sure, but virtual is duplicating code more | 22:41 |
heiko | @all going to bed, good night! | 22:41 |
blackburn | my opinion is we should try to reduce that kind of methods in concrete algos | 22:41 |
blackburn | it could be more 'readable' | 22:41 |
@sonney2k | heiko, yes | 22:42 |
@sonney2k | heiko, did you do the warning fix? | 22:42 |
heiko | yes | 22:42 |
@sonney2k | (before you leave) | 22:42 |
heiko | ;) | 22:42 |
@sonney2k | let me check | 22:42 |
heiko | yes ;) | 22:42 |
@sonney2k | heiko, yes thanks a lot - have a nice sleep :D | 22:43 |
@bettyboo | ;D | 22:43 |
heiko | np, good night! :) | 22:43 |
-!- heiko [~heiko@infole-06.uni-duisburg.de] has left #shogun [] | 22:43 | |
blackburn | you all have healthy sleep schedule :D | 22:44 |
@sonney2k | who? | 22:45 |
blackburn | hm, forgot you don't :) | 22:45 |
blackburn | I getting sleep at 2-3 with all that shoguning and javaeeing :) | 22:46 |
@sonney2k | blackburn, you are not alone | 22:46 |
@sonney2k | blackburn, look at CGaussianKernel.h / cpp | 22:47 |
@sonney2k | it has 3 constructors | 22:47 |
-!- dave [d8a57e6e@gateway/web/freenode/ip.216.165.126.110] has joined #shogun | 22:47 | |
@sonney2k | now how do you want the name to be registered there? | 22:47 |
-!- dave is now known as Guest79208 | 22:47 | |
@sonney2k | hmmhh heiko the warnings are now here too :) | 22:48 |
blackburn | CGaussianKernel::CGaussianKernel(int32_t size, float64_t w) | 22:48 |
blackburn | : CDotKernel("Gaussian",K_GAUSSIAN(??),size) | 22:48 |
blackburn | sonney2k: why not? | 22:48 |
@sonney2k | blackburn, yes but 3 times? | 22:49 |
blackburn | sonney2k: in preproc this practice is everywhere | 22:49 |
Guest79208 | Does anyone have any advice for how best to split the C values for a linear classifier with classes with unequal sizes? (e.g. 90% 1, 10% -1). Using the same value for both often results in a classifier predicting only the larger class. | 22:49 |
@sonney2k | Guest79208, optimize for auROC not accuracy and adjust the bias manually to the fpr-threshold you want | 22:50 |
@sonney2k | blackburn, yes but is this good or not? | 22:51 |
blackburn | sonney2k: seems that we should remove it in preproc, cause we overload it everywhere | 22:51 |
blackburn | sonney2k: may be it will cause some errors while development (when smb forgot to write it in constructor or mistype, etc) | 22:51 |
@sonney2k | blackburn, I wrote both things but I cannot tell what is better. | 22:51 |
@sonney2k | I am afraid of mistyping | 22:52 |
@sonney2k | it would be great if one could do the get_name() automatically | 22:52 |
blackburn | sonney2k: but it could be better for more 'compact' algorithms | 22:52 |
@sonney2k | yes | 22:52 |
Guest79208 | "optimize for auROC" - is this an option for training the svm, or do you mean manually adjusting the bias after training to optimize auROC? | 22:52 |
Guest79208 | sorry - dumb question - bias shouldn't change auROC. | 22:52 |
@sonney2k | Guest79208, but it changes accuracy, and that is what you seem to be (wrongly) optimizing for | 22:53 |
@sonney2k | accuracy is a useless measure when data is unbalanced | 22:53 |
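A small sketch of the point about unbalanced data, assuming labels of +1/-1 and real-valued classifier outputs. It compares plain accuracy (which depends on the bias) with auROC computed as the fraction of correctly ranked positive/negative pairs (which does not). The function names are hypothetical and the quadratic pair loop is chosen for clarity, not speed.

```cpp
#include <cstddef>
#include <vector>

// Fraction of examples where sign(score + bias) matches the +1/-1 label.
inline double accuracy(const std::vector<double>& scores,
                       const std::vector<int>& labels, double bias)
{
    std::size_t correct = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        const int predicted = (scores[i] + bias >= 0.0) ? 1 : -1;
        if (predicted == labels[i])
            ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(scores.size());
}

// auROC as the fraction of (positive, negative) pairs in which the positive
// example gets the higher score; ties count as half.
// Assumes both classes are present, otherwise there are no pairs to count.
inline double au_roc(const std::vector<double>& scores,
                     const std::vector<int>& labels)
{
    double wins = 0.0;
    std::size_t pairs = 0;
    for (std::size_t i = 0; i < scores.size(); ++i)
    {
        if (labels[i] != 1)
            continue;
        for (std::size_t j = 0; j < scores.size(); ++j)
        {
            if (labels[j] != -1)
                continue;
            ++pairs;
            if (scores[i] > scores[j])
                wins += 1.0;
            else if (scores[i] == scores[j])
                wins += 0.5;
        }
    }
    return wins / static_cast<double>(pairs);
}
```

On a 90%/10% split, a classifier that predicts only the larger class already reaches 0.9 accuracy, whatever its ranking quality. Adding a constant bias to every score leaves all pairwise orderings intact, so auROC stays the same while accuracy moves with the threshold.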
* blackburn feels himself as moron | 22:55 | |
blackburn | :D | 22:55 |
CIA-110 | shogun: Heiko Strathmann master * rbfcc8d3 / src/libshogun/features/StringFeatures.h : removed a bunch of warnings caused by an ignored return type by introducing asserts for the result of fread() calls. - http://bit.ly/i1CkhN | 22:55 |
CIA-110 | shogun: Heiko Strathmann master * r82d8e34 / (5 files in 3 dirs): Merge remote branch 'upstream/master' - http://bit.ly/hwjqIb | 22:55 |
CIA-110 | shogun: Heiko Strathmann master * r710667c / src/libshogun/features/StringFeatures.h : replaced ASSERT by an if-check and SG_ERROR - http://bit.ly/fEbZFs | 22:55 |
CIA-110 | shogun: Soeren Sonnenburg master * rcf8098f / src/libshogun/kernel/GaussianKernel.cpp : add space after if - http://bit.ly/i4uIMT | 22:55 |
CIA-110 | shogun: Soeren Sonnenburg master * r76148d1 / src/libshogun/features/StringFeatures.h : Really fix warnings in stringfeatures. - http://bit.ly/h0SDIO | 22:55 |
@sonney2k | alesis-novik, is haipengwang doing the same thing like you intend to do? GMMs? | 22:56 |
alesis-novik | sonney2k, I'd like to do EM for GMMs, yes | 22:57 |
@sonney2k | alesis-novik, https://github.com/shogun-toolbox/shogun/pulls | 22:57 |
@sonney2k | then him too | 22:57 |
blackburn | sudden | 22:58 |
alesis-novik | Competition I guess | 22:58 |
blackburn | I bet on Lithuania! | 22:59 |
blackburn | hey-hey Litva vpered! | 22:59 |
@sonney2k | Has haipengwang attended any IRC sessions so far? | 23:00 |
blackburn | sonney2k: he has too complex nick, we don't know | 23:00 |
@sonney2k | blackburn, *lol* | 23:00 |
blackburn | ah, it's a name, damn | 23:00 |
blackburn | btw I bet you all will not spell my last name right :D | 23:01 |
alesis-novik | I'd probably have an EM class above it. While GMM is the most popular one, there's no reason to limit ourselves in the long run | 23:01 |
@sonney2k | alesis-novik, hmmhh I think it makes more sense to model a single gaussian like you do derived from CDistribution. Then have another class CombinedDistribution that is a mixture of distributions | 23:02 |
@sonney2k | this way you could potentially do em-style updates for not just gaussian mixtures | 23:02 |
@sonney2k | blackburn, so will you change preprocs then? and why did you rewrite kpcapreproc? don't like it? | 23:03 |
blackburn | sonney2k: may be I will produce simpler code, just exercising | 23:04 |
blackburn | sonney2k: if you want I will | 23:04 |
alesis-novik | ** Singular Value Decomposition of Full Covariance Matrix. */ not sure what he meant here | 23:04 |
blackburn | sonney2k: anyway I haven't done it (rewritten kPCA) so far, I'm just planning to | 23:05 |
@sonney2k | blackburn, do you have any idea how we could avoid having to write the get_name function at all? | 23:05 |
@sonney2k | some fancy macro? | 23:05 |
@sonney2k | or is this a lost cause? | 23:05 |
blackburn | static const char* name = "GAUSSIAN SHIT"; | 23:05 |
blackburn | or I don't know | 23:06 |
blackburn | use constructors | 23:06 |
@sonney2k | but then we have to write it in constructors | 23:06 |
@sonney2k | same thing to me | 23:06 |
blackburn | may be we could just #define NAME "GAUSSIAN THING" | 23:06 |
blackburn | CDotKernel(NAME) | 23:06 |
blackburn | don't make sense at all | 23:06 |
alesis-novik | sonney2k, did you try my new patch and see if you run out of memory (hopefully there are no leaks) | 23:07 |
@sonney2k | won't look any more readable | 23:07 |
blackburn | sonney2k: will look less readable :D | 23:07 |
blackburn | warnings are out! let's drink some vodka | 23:08 |
blackburn | sonney2k: so, are we changing preproc to 'main style'? | 23:09 |
@sonney2k | alesis-novik, not yet - too many patches and whenever I think I am done with one there is another one popping up :) | 23:09 |
@bettyboo | ;D | 23:10 |
@sonney2k | blackburn, yup | 23:10 |
@sonney2k | mainstream music | 23:10 |
blackburn | so wait for another pull request :D | 23:10 |
@sonney2k | dostoprimetschatelnosti | 23:10 |
blackburn | nice | 23:11 |
blackburn | but why достопримечательности? | 23:11 |
* sonney2k copies this word to his clipboard | 23:11 | |
@sonney2k | that was my favorite russian word | 23:11 |
@sonney2k | alesis-novik, reviewing now | 23:12 |
blackburn | sonney2k: davay davay rabotatsh was so funny, will not forgot it :) | 23:12 |
alesis-novik | thanks sonney2k | 23:12 |
blackburn | btw, 'rabotay' | 23:12 |
alesis-novik | nada Fedia, nada | 23:12 |
alesis-novik | or rather nado | 23:13 |
alesis-novik | how would you type "drink" without Cyrillic | 23:13 |
blackburn | пьяный? | 23:13 |
blackburn | ah | 23:14 |
blackburn | without | 23:14 |
blackburn | oh f-ck | 23:14 |
blackburn | :D | 23:14 |
blackburn | pyaniy | 23:14 |
@sonney2k | blackburn, davay davay rabotatsh!!! | 23:14 |
@sonney2k | blackburn, this is not supposed to be fun :-P | 23:14 |
alesis-novik | not drunk, drink | 23:14 |
blackburn | пить | 23:14 |
blackburn | pit | 23:14 |
blackburn | pit' | 23:14 |
alesis-novik | Nado menshe pit', pit' menshe nado | 23:15 |
blackburn | sonney2k: rabotatsh :D | 23:15 |
@sonney2k | alesis-novik, rabotatch ...I can only do this in handwritten digits | 23:15 |
blackburn | not rabotatsh, rabotay! :D | 23:16 |
@bettyboo | hihi | 23:16 |
blackburn | sonney2k: ya zhe zadolbayus' vse eto perepisyvat | 23:18 |
blackburn | it is a difficult one :) | 23:18 |
alesis-novik | sonney2k, we can just comment them out I guess | 23:19 |
alesis-novik | or just delete them? | 23:19 |
@sonney2k | alesis-novik, save them in a local branch but drop them from master for now | 23:19 |
@sonney2k | alesis-novik, there is some problem with spacing in your patch (some lines are more indented than others) | 23:20 |
alesis-novik | sonney2k, any idea which ones or in which file? | 23:21 |
blackburn | sonney2k: oh, there is some overloading in preproc.. but not for all | 23:23 |
blackburn | sonney2k: that preproc 'package' is a bit mixed, I will stalinize it | 23:24 |
@sonney2k | alesis-novik, I made a comment | 23:24 |
alesis-novik | sonney2k, I think I'll rename the init into preproc(?) or something, because it's called with parameters from a few places, including train | 23:26 |
alesis-novik | the method is more for precomputing the constant and inverse cov than anything else | 23:27 |
@sonney2k | alesis-novik, I think you can safely assume that you are dealing with CSimpleFeatures<float64_t> | 23:28 |
@sonney2k | (RealFeatures in python) | 23:28 |
@sonney2k | I mean who wants to compute a cov matrix for several thousands dim. data? | 23:29 |
alesis-novik | sonney2k, does that make a difference though? | 23:29 |
@sonney2k | alesis-novik, yes. then one can use get_feature_vector() and free_feature_vector() | 23:29 |
@sonney2k | and these can be preprocessed etc | 23:29 |
@sonney2k | get_feature_vector() in dotfeatures returns *a copy* of the features | 23:30 |
alesis-novik | I guess that would be slightly more efficient | 23:30 |
@sonney2k | alesis-novik, you could use register_params() instead of init() | 23:31 |
@sonney2k | alesis-novik, I still haven't understood why you allocate mean and cov in the default constructor? | 23:31 |
alesis-novik | so change init to register_params and leave the other init as-is? | 23:31 |
@sonney2k | the other init then has no parameters | 23:31 |
alesis-novik | sonney2k, just an idea that if you don't define anything it creates 0 mean 1 variance, so basically a 1 dim Normal distribution | 23:32 |
@sonney2k | hmmhh, is that really useful? | 23:33 |
@sonney2k | I think the default constructor will only ever be called when the object is de-serialized | 23:33 |
@sonney2k | alesis-novik, would be cool if you could also do the derivatives | 23:35 |
alesis-novik | Wait, where do you suggest I set the mean/cov and where do I compute the other stuff | 23:35 |
@sonney2k | alesis-novik, shall I better use the github comments instead of mixing chat and comments? | 23:37 |
alesis-novik | ok | 23:37 |
@sonney2k | why not start with mean = NULL etc and just add ASSERT(mean && cov); when needed. | 23:37 |
@sonney2k | that is what I wrote in that comment | 23:37 |
* blackburn really doesn't know why preproc is so ambiguous | 23:41 | |
alesis-novik | sonney2k, is there a point in getting both rows and cols for cov? | 23:42 |
@sonney2k | alesis-novik, a matrix always has a certain number of rows and cols - I understand this is a square matrix but still - the swig based interfaces depend on that signature - otherwise you won't be able to set/get a cov from e.g. python | 23:43 |
alesis-novik | So I should set mean and covariance outside of preprocessing? | 23:47 |
@sonney2k | alesis-novik? | 23:49 |
alesis-novik | I'm still confused about what you suggest doing with init(mean, cov, dim) | 23:50 |
@sonney2k | call it only init() | 23:50 |
@sonney2k | no args | 23:50 |
@sonney2k | and assume that cov and mean (together with their sizes) are set as member variables | 23:51 |
@sonney2k | but check that cov / mean are there and that sizes match | 23:51 |
blackburn | sonney2k: refactored preproc.. again ;) | 23:51 |
alesis-novik | I don't store cov or mean sizes separately, just one param m_dim | 23:52 |
@sonney2k | then please do :) | 23:52 |
@sonney2k | you need that for serialization anyways | 23:52 |
alesis-novik | should I store separate ones for the inverse cov as well then? | 23:53 |
blackburn | why he should? | 23:53 |
@sonney2k | alesis-novik, yes | 23:54 |
@sonney2k | blackburn, I think having a matrix type with ptr to data, rows, cols would make this more explicit | 23:54 |
@sonney2k | since we have this all exploded everywhere ... | 23:54 |
@sonney2k | it is not | 23:55 |
alesis-novik | so then I might as well remove the m_dim | 23:56 |
alesis-novik | Since it's useless | 23:56 |
blackburn | okay :) | 23:56 |
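A minimal sketch of the argument-free `init()` suggested above, assuming the mean and covariance are stored as members together with their sizes and that `init()` only validates them before any precomputation. `ToyMatrix` and `ToyGaussian` are hypothetical types for illustration, not the classes in alesis-novik's patch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical matrix type: data plus explicit row and column counts.
struct ToyMatrix
{
    std::vector<double> data; // row-major, rows * cols entries
    std::size_t rows = 0;
    std::size_t cols = 0;
};

class ToyGaussian
{
public:
    void set_mean(const std::vector<double>& mean)
    {
        m_mean = mean;
        m_initialized = false;
    }

    void set_cov(const ToyMatrix& cov)
    {
        m_cov = cov;
        m_initialized = false;
    }

    // No arguments: checks that the members set earlier exist and agree in size.
    void init()
    {
        if (m_mean.empty() || m_cov.data.empty())
            throw std::logic_error("mean and cov must be set before init()");
        if (m_cov.rows != m_cov.cols)
            throw std::logic_error("cov must be square");
        if (m_cov.rows != m_mean.size())
            throw std::logic_error("cov and mean dimensions differ");
        if (m_cov.data.size() != m_cov.rows * m_cov.cols)
            throw std::logic_error("cov storage does not match rows * cols");
        // Precomputing the normalising constant and the inverse covariance
        // would go here; it is left out of this sketch.
        m_initialized = true;
    }

    bool initialized() const { return m_initialized; }

private:
    std::vector<double> m_mean;
    ToyMatrix m_cov;
    bool m_initialized = false;
};
```

Keeping rows and cols next to the data makes a separate `m_dim` redundant, and a default-constructed object (for example one created for de-serialization) simply stays uninitialized until the mean and covariance have been set.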
--- Log closed Fri Apr 15 00:00:36 2011 |