--- Log opened Sun Jul 31 00:00:59 2011 | ||
blackburn | sonney2k: no leaks I'd say | 00:01 |
---|---|---|
@sonney2k | ok then good... | 00:02 |
blackburn | sonney2k: tried with init and loaded features - no leak, jigsaw memory usage :D | 00:06 |
blackburn | 1.3 1.5 1.9 2.3 1.6 1.9 2.2 1.0 ... | 00:07 |
blackburn | 2M kernels | 00:07 |
blackburn | sonney2k: so the only thing to get to work - examples.. | 00:08 |
@sonney2k | yeah looks like | 00:09 |
@sonney2k | blackburn, I have a new suggestion for SGVector btw: | 00:10 |
@sonney2k | static SGVector get_vector(SGVector& src, bool own=true) | 00:10 |
@sonney2k | { | 00:10 |
@sonney2k | if (!own) | 00:10 |
@sonney2k | return src; | 00:10 |
@sonney2k | src.do_free=false; | 00:10 |
@sonney2k | return SGVector(src.vector, src.vlen); | 00:10 |
@sonney2k | } | 00:10 |
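A hypothetical use of the suggested helper; the ownership semantics (do_free) are inferred from the snippet above rather than verified against shogun's actual SGVector:

```cpp
// Sketch only: assumes the get_vector() helper above was added to SGVector.
SGVector<float64_t> src(10);  // allocates a length-10 buffer

// own=false: hand back the same struct, src keeps ownership of the buffer
SGVector<float64_t> view=SGVector<float64_t>::get_vector(src, false);

// own=true (default): src stops owning (do_free=false) and the returned
// wrapper points at the same buffer
SGVector<float64_t> taken=SGVector<float64_t>::get_vector(src);
```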
blackburn | better | 00:10 |
@sonney2k | I think I will go with this one | 00:11 |
blackburn | sonney2k: a week ago I made a tester for java | 00:12 |
blackburn | but can't decide if I should continue.. | 00:12 |
@sonney2k | blackburn, don't for now | 00:12 |
@sonney2k | this needs discussion | 00:12 |
@sonney2k | the question really is - how do we want to compare if things are the same in our test suite | 00:13 |
blackburn | sonney2k: I think we should split examples to some 'unit-tests' and complex examples | 00:14 |
@sonney2k | not a good idea | 00:14 |
blackburn | sonney2k: I think we shouldn't compare if things are the same | 00:14 |
@sonney2k | no one will maintain tests | 00:15 |
blackburn | sonney2k: unit tests should be autogenerated | 00:15 |
@sonney2k | ? | 00:15 |
blackburn | well all the kernels are the same.. | 00:16 |
@sonney2k | blackburn, so? | 00:16 |
@sonney2k | we had that before | 00:16 |
@sonney2k | the test suite now is 100% useless because no one updates it | 00:16 |
@sonney2k | it is enough work to 'just' update examples | 00:16 |
blackburn | I mean we can only write templates | 00:16 |
@sonney2k | doesn't help | 00:17 |
blackburn | why? | 00:17 |
@sonney2k | too much work | 00:17 |
@sonney2k | you need to do that for everything | 00:17 |
@sonney2k | and there is not that much that you can generalize | 00:17 |
@sonney2k | there are always exceptions | 00:17 |
blackburn | but I agree, we can't maintain tests for java, python, ... | 00:18 |
@sonney2k | it really is much easier to write examples for everything (which we have to have anyways) | 00:19 |
@sonney2k | and then return some reasonable number or so | 00:19 |
-!- in3xes [~in3xes@180.149.49.227] has quit [Quit: Leaving] | 00:19 | |
@sonney2k | that we use to compare results | 00:19 |
blackburn | I don't like the way of comparing results or so | 00:22 |
@sonney2k | blackburn, because? | 00:23 |
blackburn | sonney2k: well it looks strange to me.. | 00:23 |
@sonney2k | blackburn, yeah but why? | 00:23 |
blackburn | I think we should only test if no errors | 00:23 |
@sonney2k | what does no errors mean? | 00:24 |
blackburn | no compile-time errors, no runtime errors like segfaults.. | 00:24 |
@sonney2k | blackburn, but then that might mean we return just crap | 00:25 |
blackburn | return to? | 00:25 |
@sonney2k | I could replace train() with random() | 00:25 |
@sonney2k | and no one would recognize | 00:25 |
blackburn | can we recognize it now? | 00:25 |
@sonney2k | yes | 00:25 |
blackburn | how? | 00:26 |
@sonney2k | blackburn, because we know that at time point T1 everything was correct | 00:26 |
@sonney2k | now we develop sth else | 00:26 |
@sonney2k | and just compare whether result at T2 is the same as T1 | 00:26 |
blackburn | we don't use it at all.. | 00:27 |
@sonney2k | blackburn, yeah because no one runs it | 00:28 |
@sonney2k | and we have no build bot that does it automagically | 00:28 |
@sonney2k | but we also have a problem | 00:28 |
@sonney2k | because results are GaussianKernel etc | 00:29 |
@sonney2k | and we pickle.dump | 00:29 |
@sonney2k | and internally we changed formats so serialization results are different and we can no longer load results | 00:29 |
blackburn | bad bad | 00:30 |
@sonney2k | blackburn, yes. | 00:30 |
@sonney2k | the issue here is how we can keep the format constant or at least compatible | 00:31 |
CIA-87 | shogun: Soeren Sonnenburg master * r42595fa / src/interfaces/java_modular/swig_typemaps.i : use simple swig enums - https://github.com/shogun-toolbox/shogun/commit/42595fa7a75037216c096aeb9879f265c37fdbfe | 00:47 |
CIA-87 | shogun: Soeren Sonnenburg master * r95f11a0 / (5 files in 3 dirs): Hopefully fix compiler errors in GMM/Gaussian. Utilize destroy_*. - https://github.com/shogun-toolbox/shogun/commit/95f11a02929a4db4404c4bfdd0670d43cebf0610 | 00:47 |
blackburn | sonney2k: what do you think, is it better to use Gaussian things in GaussianNaiveBayes? | 00:52 |
@sonney2k | blackburn, not our biggest problem now - rather think about what we do to keep the serialization format compatible | 00:52 |
blackburn | it simply fits a gaussian for each class | 00:53 |
@sonney2k | maybe we need some kind of 'variable x is now y' mapping | 00:53 |
blackburn | I'm just talking about it because there is a little bug in GNB :D | 00:53 |
@sonney2k | or things like this | 00:53 |
blackburn | sonney2k: how does it look now? | 00:53 |
@sonney2k | blackburn, talk to alesis-novik about this :) | 00:53 |
@sonney2k | blackburn, well we basically register all member variables | 00:54 |
@sonney2k | the problem is that we introduced new ones now | 00:54 |
@sonney2k | and we renamed old ones or even changed types... | 00:54 |
blackburn | any automagic way ? | 00:55 |
blackburn | btw in java we would have a way hehe | 00:56 |
@sonney2k | how do you do that in java? | 00:56 |
@sonney2k | if you change an object and rename variables? | 00:56 |
blackburn | well it is possible to get variable types and names | 00:57 |
blackburn | in java* | 00:57 |
@sonney2k | blackburn, how does this help our problem? | 00:57 |
blackburn | 'nohow', we aren't using java :) | 00:57 |
@sonney2k | if you rename a variable - the serialized file would be different and so you couldn't load the serialized object | 00:58 |
@sonney2k | the old one I mean | 00:58 |
blackburn | ah i see | 00:58 |
@sonney2k | that is our problem atm | 00:58 |
@sonney2k | we need to extend the serialized format to store some transitioning information | 00:58 |
@sonney2k | like the version when this variable appeared | 00:59 |
@sonney2k | and when this thing vanished or whatever | 00:59 |
@sonney2k | or when it was renamed | 00:59 |
blackburn | sonney2k: didn't you say that it doesn't really matter now because we changed so many things already? | 00:59 |
@sonney2k | and functions to transition from version x to version y | 00:59 |
@sonney2k | blackburn, no | 01:00 |
@sonney2k | it is the only way to ensure testing... | 01:00 |
blackburn | do you want to make it still compatible? | 01:00 |
@sonney2k | the only other alternative is to check each method individually | 01:00 |
@sonney2k | by hand that is | 01:00 |
blackburn | okay time for some sleep | 01:15 |
blackburn | sonney2k: see you | 01:15 |
@sonney2k | cu | 01:15 |
-!- blackburn [~blackburn@188.122.224.26] has quit [Quit: Leaving.] | 01:15 | |
-!- f-x [~user@117.192.192.42] has joined #shogun | 07:29 | |
-!- f-x` [~user@117.192.222.125] has joined #shogun | 08:15 | |
-!- f-x [~user@117.192.192.42] has quit [Ping timeout: 260 seconds] | 08:17 | |
-!- f-x [~user@117.192.222.125] has joined #shogun | 08:49 | |
f-x | sonney2k: you here? | 10:00 |
-!- f-x` [~user@117.192.222.125] has left #shogun ["ERC Version 5.3 (IRC client for Emacs)"] | 10:00 | |
-!- f-x [~user@117.192.222.125] has quit [Quit: ERC Version 5.3 (IRC client for Emacs)] | 10:00 | |
-!- f-x [~user@117.192.222.125] has joined #shogun | 10:01 | |
-!- f-x is now known as Guest33029 | 10:01 | |
-!- Guest33029 is now known as f-x` | 10:01 | |
f-x` | sonney2k: should i use enums to identify the loss function for SVMSGD? | 10:02 |
@sonney2k | f-x`, no just use a loss member | 11:29 |
@sonney2k | then you can add some flag or so for this 'z < 1' comparison | 11:29 |
f-x` | sonney2k: i don't know.. what kind of flag do you suggest? | 11:35 |
-!- f-x` is now known as f-x | 11:37 | |
-!- heiko [~heiko@134.91.52.15] has joined #shogun | 11:41 | |
@sonney2k | f-x, for SGD/QN you don't really need that flag if you treat the learning algorithms differently | 12:21 |
@sonney2k | but if you don't want to do that then you could introduce some flag needs_extra_update or so that is set true for some... | 12:22 |
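In code, the flag idea might look like this (a sketch using the name floated above; not shogun's actual loss API):

```cpp
// Base loss class: losses that need the extra update step in SGD
// advertise it themselves; the default is no.
virtual bool needs_extra_update() const { return false; }

// A log-loss style class would override it:
// virtual bool needs_extra_update() const { return true; }
//
// and the learner checks the flag instead of hardcoding loss types:
// if (z < 1 || loss->needs_extra_update())
//     { /* perform the extra update */ }
```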
@sonney2k | heiko, hi... | 12:22 |
heiko | sonney2k, hi! | 12:22 |
@sonney2k | heiko, I was trying to stabilize things and came across one issue... | 12:23 |
heiko | which one? | 12:23 |
@sonney2k | that is our test suite no longer works due to all the variable additions (like subset) and renames and SGVector stuff | 12:23 |
@sonney2k | so heiko I was wondering if you would have time to work on this a bit ... | 12:24 |
heiko | yes, I can do this | 12:24 |
heiko | Think I will finish the KMeans stuff today | 12:25 |
heiko | but what exactly is the problem? | 12:25 |
@sonney2k | heiko, you did subset for string features right, but sparse is still missing? | 12:25 |
@sonney2k | and kernel / distance machine will work too - at least soon | 12:25 |
heiko | no sparse features already have subset | 12:26 |
heiko | an example is there too | 12:26 |
heiko | in c++ | 12:27 |
heiko | yes, kernel machines work | 12:27 |
heiko | but only with simple/string features | 12:27 |
heiko | no modelselection for sparse features currently | 12:27 |
@sonney2k | heiko, ok so then the todo was only some nicer python syntax typemaps and different sampling techniques | 12:27 |
heiko | yes | 12:27 |
@sonney2k | heiko, why not - how is it different? | 12:27 |
@sonney2k | ^sparse & ms | 12:27 |
heiko | cross validation needs a method of CFeatures that is not implemented for sparse | 12:28 |
heiko | copy_subset | 12:28 |
@sonney2k | heiko, btw when you use SGVector in a class, could you please use vector.destroy_vector() in destructor? | 12:28 |
@sonney2k | heiko, I just don't understand the difference to any other feature object ... | 12:29 |
heiko | sonney2k, yes, you changed this, i forgot | 12:29 |
@sonney2k | ahh ok | 12:29 |
@sonney2k | (I was just reading your pull request/patch) | 12:29 |
heiko | sonney2k, have a look at void CKernelMachine::store_model_features | 12:30 |
heiko | there, this method is called | 12:30 |
@sonney2k | heiko, yeah I understand | 12:30 |
heiko | sonney2k, pull request updated | 12:31 |
@sonney2k | heiko, but that copy* function should be trivial for sparse | 12:31 |
@sonney2k | it is the same like for strings... | 12:31 |
heiko | yes, should be simple | 12:31 |
@sonney2k | (more or less :) | 12:31 |
heiko | but has to be implemented :) | 12:31 |
@sonney2k | yes yes | 12:31 |
@sonney2k | ok then I would say finish distance, the sparse copy and then it would be very very good if you could help getting serialization more cross-version compatible | 12:32 |
@sonney2k | so here is the problem we have: | 12:32 |
@sonney2k | all these m_parameters->add() stuff registers the variables to be serialized | 12:33 |
@sonney2k | now that is all good and works | 12:33 |
@sonney2k | however in shogun version+1 we might add a new variable, like in this case subset :D | 12:33 |
heiko | yes | 12:34 |
@sonney2k | suddenly older objects can not really be loaded (well, they could, but would issue a warning) | 12:34 |
@sonney2k | so the plan would be to store a version in addition | 12:34 |
@sonney2k | so add a int version to the m_parameter->add() call | 12:35 |
heiko | and then to check version upon deserialization | 12:36 |
@sonney2k | and each object then has a separate version nr defined that is then just passed | 12:36 |
@sonney2k | heiko, yes, so it means we load only things of that specific version even if we are in a newer-version object | 12:36 |
@sonney2k | so that would solve the problem of *additions* | 12:36 |
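A minimal sketch of the versioned registration, assuming an extra version argument on the add() call (class and parameter names here are hypothetical):

```cpp
// Each parameter records the format version in which it first appeared.
// Loading an old file then skips parameters newer than the file's stored
// version and leaves them at their defaults, instead of warning/failing.
class CSomeFeatures : public CFeatures
{
    SGVector<float64_t> m_weights;  // present since version 1
    SGVector<index_t>   m_subset;   // added later, e.g. in version 2

    void register_parameters()
    {
        m_parameters->add(&m_weights, "weights", "feature weights", 1);
        m_parameters->add(&m_subset,  "subset",  "active subset",   2);
    }
};
```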
@sonney2k | now we have the problem of type changes and renames too | 12:37 |
heiko | yes | 12:37 |
@sonney2k | for example we have lots of changes that do double* vector, int len -> SGVector vec | 12:37 |
heiko | yes | 12:38 |
heiko | already an idea for that? | 12:38 |
@sonney2k | I am not 100% sure yet how to properly fix this but I think there is no other way than providing some transition table | 12:38 |
heiko | ok | 12:38 |
heiko | tricky | 12:38 |
@sonney2k | i.e. in that table there would be the old variable names registered and a transition function that returns the new one | 12:38 |
heiko | and this table has to be updated when someone changes a variable | 12:39 |
heiko | or should this go automatically? | 12:39 |
@sonney2k | so e.g. old_names = {vector, len} new_name= vector , transition function = transform_double_len_to_sgvector() | 12:39 |
@sonney2k | heiko, this cannot go automatically | 12:39 |
heiko | ok | 12:40 |
@sonney2k | we would have to update that for all classes until the whole test suite runs through again | 12:40 |
heiko | but these functions are only for serialization | 12:40 |
heiko | or deserialization of old data | 12:40 |
@sonney2k | (test suite currently is the tester.py in testsuite) | 12:40 |
@sonney2k | yes | 12:40 |
heiko | i will have a look | 12:40 |
@sonney2k | deserialization only | 12:40 |
@sonney2k | in serialization we write things out only in the newest format I would say... not sure if this is good - very much M$ Word style... | 12:41 |
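The transition table could be sketched like this, using the example names from the conversation; everything except SGVector is illustrative:

```cpp
// One entry per rename/retype: maps old serialized fields onto the new
// member when deserializing, via a transition function.
struct ParameterTransition
{
    const char** old_names;   // e.g. {"vector", "len"}
    const char*  new_name;    // e.g. "vector"
    // reads the old-format fields and fills the new-format member
    bool (*transform)(void* old_fields, CSGObject* dst);
};

// transform_double_len_to_sgvector() would read a double* plus an int32_t
// length from the old file and wrap them in an SGVector<float64_t>.
// Serialization always writes the newest format; transitions apply on
// deserialization only.
```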
heiko | puh, that is a lot of stuff | 12:41 |
@sonney2k | heiko, lets start with the low hanging fruits, that is additions | 12:41 |
heiko | ok | 12:42 |
heiko | and the version id | 12:42 |
@sonney2k | yeah I think this can be solved via the version id | 12:42 |
heiko | ok | 12:43 |
heiko | I will probably start on this tomorrow and then bother you with my problems :) | 12:43 |
@sonney2k | heiko, btw in line 146 you can use SGVector<float64_t>(k) | 12:44 |
@sonney2k | this will alloc a vector of len k | 12:44 |
@sonney2k | heiko, yeah | 12:44 |
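For reference, a tiny example of that constructor (usage follows the conventions visible elsewhere in the conversation):

```cpp
// Allocate a length-k vector instead of juggling a raw double* and length.
int32_t k=5;
SGVector<float64_t> dists(k);
for (int32_t i=0; i<k; i++)
    dists.vector[i]=0.0;    // raw buffer access via .vector
dists.destroy_vector();     // free it, matching the destructor advice above
```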
heiko | sonney2k, another question: | 12:45 |
heiko | there are many feature classes that do not support subset or model selection | 12:45 |
heiko | this is because of the inheritance structure | 12:45 |
heiko | basically there are only three classes, and the things work for these | 12:45 |
heiko | but for all these specializations, the methods are not implemented | 12:45 |
heiko | for example all dot features | 12:46 |
heiko | because in the class DotFeatures itself, it is not possible to implement the missing methods | 12:46 |
f-x | sonney2k: sorry for persisting, but the 'if (z < 1)' between the #if-#endif should function something like 'if (z < 1 || loss->is_log_loss())' right? i'm not understanding where this needs_extra_update flag would go | 12:47 |
@sonney2k | heiko, it should be do-able in dotfeatures too - problem is that this needs another change in the features beneath, like dot() would call compute_dot() with the right subset | 12:49 |
@sonney2k | heiko, so lets postpone that for now. | 12:50 |
heiko | ok | 12:50 |
heiko | this will be detected automatically when people encounter the "class XYZ is not ready for model-selection yet" error :) | 12:50 |
@bettyboo | heiko, haha | 12:50 |
heiko | (SG_ERRORS) | 12:51 |
@sonney2k | f-x, it is not just for LOGLOSS but also for LOGLOSSMARGIN | 12:51 |
f-x | sonney2k: right.. so these two loss functions should have some common property we should be able to check for | 12:52 |
f-x | or we could have enum types for all loss functions and check for those enums | 12:52 |
@sonney2k | f-x, that is why I was suggesting a needs_extra_update or so flag | 12:52 |
f-x | where would this be? in the SGD class? | 12:53 |
f-x | i didn't understand properly | 12:53 |
@sonney2k | f-x, in the losses | 12:53 |
@sonney2k | f-x, in the end you either create all losses in one file or multiple like you do (up to you) | 12:53 |
f-x | sonney2k: and they will be used only in SGD/SGD-QN? | 12:53 |
@sonney2k | if in one file they would be in mathematics/* | 12:53 |
f-x | (the flag) | 12:53 |
@sonney2k | (BTW there is already one loss thingy in there which should be modified/removed) | 12:54 |
@sonney2k | f-x, yes | 12:54 |
@sonney2k | if you do it in one file then you will have to use enums for selecting the loss | 12:55 |
@sonney2k | otherwise classes - which is what you do now. | 12:55 |
f-x | sonney2k: but it wouldn't be good to modify loss functions for the sake of learning algorithms right? | 12:56 |
f-x | or will this flag be of use generally as well? | 12:56 |
@sonney2k | f-x, how else would you solve that problem? | 12:57 |
@sonney2k | the only other chance I see is to change the learning algorithm completely depending on loss | 12:57 |
f-x | define a global list of enums for all loss functions in some header file | 12:57 |
@sonney2k | and then? | 12:57 |
f-x | and each loss function returns that enum | 12:57 |
f-x | check for that enum from SGD | 12:57 |
f-x | whether the enum is LOGLOSS or LOGLOSSMARGIN or whatever | 12:57 |
@sonney2k | yes sure that is also fine | 12:57 |
@sonney2k | this is what features/ preprocessors do | 12:58 |
@sonney2k | kernels / distances too btw | 12:58 |
f-x | sonney2k: so where should i add the loss function enums? | 12:58 |
@sonney2k | they all have an enum | 12:58 |
@sonney2k | in Loss.h | 12:58 |
f-x | hmm right | 12:58 |
f-x | sonney2k: ok. sounds good, i'll do that. | 12:59 |
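What the Loss.h enums could look like; LOGLOSS/LOGLOSSMARGIN are named in the conversation, the rest of the sketch is assumed:

```cpp
// Global loss identifiers, mirroring what features/kernels/distances do.
enum ELossType
{
    L_HINGELOSS,
    L_SQUAREDHINGELOSS,
    L_LOGLOSS,
    L_LOGLOSSMARGIN
};

class CLossFunction : public CSGObject
{
public:
    virtual ~CLossFunction() {}
    // each concrete loss returns its own identifier
    virtual ELossType get_loss_type() const=0;
};

// In SVMSGD the branch discussed earlier then becomes:
//   ELossType t=loss->get_loss_type();
//   if (z < 1 || t==L_LOGLOSS || t==L_LOGLOSSMARGIN)
//       { /* extra update */ }
```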
f-x | sonney2k: btw VW also adds a couple of methods to the loss functions it uses | 12:59 |
f-x | like get_update() and get_square_grad() | 12:59 |
f-x | (which are basically used mainly for VW) | 12:59 |
f-x | so i shouldn't put these into the loss function classes right? (coz they'll probably only be used by VW) | 13:00 |
@sonney2k | f-x, put them in the losses | 13:00 |
f-x | and also I don't know how they'd look for any general loss function - I know it only for those loss functions used in VW | 13:00 |
@sonney2k | they belong there because they do some extra stuff | 13:00 |
@sonney2k | then return SG_NOTIMPLEMENTED for the other losses there | 13:01 |
f-x | sonney2k: okay.. that's nice.. i'll add them too. | 13:01 |
@sonney2k | it is totally fine if not all losses support such functions | 13:01 |
@sonney2k | (or not implemented as in this case) | 13:01 |
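Extending the loss sketch above with the VW helpers (the method names come from the conversation; the signatures are assumed):

```cpp
// Defaults in the loss base class; only the VW-style losses override them.
virtual float64_t get_update(float64_t prediction, float64_t label,
                             float64_t eta_t, float64_t norm)
{
    SG_NOTIMPLEMENTED;  // not every loss has a closed-form update
    return 0;
}

virtual float64_t get_square_grad(float64_t prediction, float64_t label)
{
    SG_NOTIMPLEMENTED;
    return 0;
}
```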
f-x | great.. we could implement them later.. by solving a recurrence relation in john's paper.. but i don't think I'll do it now. | 13:02 |
f-x | SG_NOTIMPLEMENTED is the way to go | 13:02 |
f-x | sonney2k: is the compilation fixed? or was it compiling for you already? | 13:02 |
-!- in3xes [~in3xes@180.149.49.227] has joined #shogun | 13:02 | |
@sonney2k | f-x, I have an older gcc version so it always compiled here | 13:03 |
@sonney2k | but I hope I fixed it yes | 13:03 |
f-x | ok. thanks! mine's 4.6.1.. and i'll report if it doesn't work here | 13:03 |
CIA-87 | shogun: Soeren Sonnenburg master * rfd09670 / (2 files): | 13:06 |
CIA-87 | shogun: Merge pull request #253 from karlnapf/master | 13:06 |
CIA-87 | shogun: made KMeans serializable and SGVector replacement (+8 more commits...) - https://github.com/shogun-toolbox/shogun/commit/fd0967097e615f9f234f1a18c6269e89d57a2ab4 | 13:06 |
@sonney2k | alesis-novik, so can you avoid the memcpy stuff? | 13:07 |
-!- blackburn [~blackburn@188.122.224.26] has joined #shogun | 13:21 | |
heiko | sonney2k, I think it makes sense to implement apply for any DistanceMachine, i.e. move the implementation from KMeans to DistanceMachine | 13:28 |
heiko | but then, one has to ensure that every distance machine stores its cluster centers in the lhs of the underlying distance variable | 13:29 |
heiko | what do you think about this? | 13:29 |
heiko | then any distance machine would implement the apply method | 13:29 |
heiko | well, every distance machine that builds cluster centers in training | 13:30 |
heiko | KNN then would override apply by its own method | 13:30 |
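A sketch of that proposal, assuming the cluster centers sit in the distance's lhs; the CDistance calls mirror its public interface loosely and the details are illustrative:

```cpp
// Generic apply(): label each rhs vector with the index of its nearest
// center among the lhs vectors (the trained cluster centers).
CLabels* CDistanceMachine::apply()
{
    int32_t num_centers=distance->get_num_vec_lhs();
    int32_t num_points=distance->get_num_vec_rhs();
    CLabels* result=new CLabels(num_points);

    for (int32_t i=0; i<num_points; i++)
    {
        int32_t best=0;
        float64_t best_dist=distance->distance(0, i);
        for (int32_t j=1; j<num_centers; j++)
        {
            float64_t d=distance->distance(j, i);
            if (d<best_dist)
            {
                best_dist=d;
                best=j;
            }
        }
        result->set_label(i, best);
    }
    return result;
}
```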
blackburn | sonney2k: openmp? ;) | 13:54 |
@sonney2k | heiko, ok | 14:14 |
@sonney2k | blackburn, slow | 14:15 |
blackburn | sonney2k: why slow? | 14:15 |
blackburn | many things could be easily adapted for openmp because of #pragma notation.. | 14:16 |
blackburn | is it really slow? | 14:16 |
@sonney2k | blackburn, for simple things it is fast yes | 14:18 |
CIA-87 | shogun: Heiko Strathmann master * r3ac5c53 / (2 files): another SGVector replacement and usage of CMath::sqrt instead of std::sqrt - https://github.com/shogun-toolbox/shogun/commit/3ac5c53a62eec98dd1d0b68a2dd80453755f9a1d | 14:20 |
CIA-87 | shogun: Soeren Sonnenburg master * ra6586d5 / (2 files): | 14:20 |
CIA-87 | shogun: Merge pull request #254 from karlnapf/master | 14:20 |
CIA-87 | shogun: SGVector replacement - https://github.com/shogun-toolbox/shogun/commit/a6586d545c32c38ee414efd277d49f41bc8352a0 | 14:20 |
blackburn | sonney2k: and when it is slow? | 14:25 |
-!- heiko [~heiko@134.91.52.15] has quit [Ping timeout: 258 seconds] | 14:25 | |
@sonney2k | blackburn, in my attempts it was slow whenever I called functions inside the parallelized pragma region | 14:34 |
@sonney2k | blackburn, so plain for loops without functions should become faster... | 14:34 |
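Concretely: a loop like this, with no function calls in the body, is where the #pragma notation pays off (compile with -fopenmp):

```cpp
// Plain loop body, no calls into other translation units: OpenMP
// overhead stays low and the pragma is a one-line change.
void scale(float64_t* v, int32_t n, float64_t alpha)
{
    #pragma omp parallel for
    for (int32_t i=0; i<n; i++)
        v[i]*=alpha;
}
```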
blackburn | so what we might use for multithreading? | 14:35 |
@sonney2k | pthreads | 14:39 |
-!- mrsrikanth [~mrsrikant@59.92.22.26] has joined #shogun | 14:50 | |
-!- in3xes_ [~in3xes@210.212.58.111] has joined #shogun | 14:54 | |
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 240 seconds] | 14:57 | |
-!- mrsrikanth [~mrsrikant@59.92.22.26] has quit [Read error: Connection reset by peer] | 15:07 | |
-!- f-x [~user@117.192.222.125] has quit [Ping timeout: 260 seconds] | 15:31 | |
-!- in3xes_ is now known as in3xes | 15:34 | |
-!- srikanth [~mrsrikant@59.92.22.26] has joined #shogun | 16:21 | |
-!- srikanth [~mrsrikant@59.92.22.26] has quit [Quit: Leaving] | 17:04 | |
alesis-novik | sonney2k, around? | 18:03 |
alesis-novik | Well, I did what you asked, and valgrind doesn't seem to complain, so that's good. | 18:46 |
-!- f-x [~user@117.192.207.49] has joined #shogun | 18:58 | |
-!- in3xes_ [~in3xes@180.149.49.227] has joined #shogun | 19:04 | |
f-x | sonney2k: hey! what kind of objects as a rule do you think should inherit from CSGObject? | 19:04 |
-!- in3xes [~in3xes@210.212.58.111] has quit [Ping timeout: 240 seconds] | 19:07 | |
@sonney2k | f-x, all except those for which this would be too much overhead | 19:58 |
CIA-87 | shogun: Alesis Novik master * rf8fc62c / src/shogun/clustering/GMM.cpp : Removed copying - https://github.com/shogun-toolbox/shogun/commit/f8fc62c7b365df87859be00ef74bde3c6d2b7cdd | 19:59 |
CIA-87 | shogun: Soeren Sonnenburg master * r380af5a / (3 files in 2 dirs): | 19:59 |
CIA-87 | shogun: Merge pull request #252 from alesis/gmm | 19:59 |
CIA-87 | shogun: Memory problem fixes. - https://github.com/shogun-toolbox/shogun/commit/380af5acf4bba4d7cb226fd9dc90ab625b2ac149 | 19:59 |
@sonney2k | alesis-novik, well you are the master of your algorithm... as long as you don't destroy the vector under your feet you should be fine. | 20:01 |
alesis-novik | sonney2k, well, I think in this case we might still have a few variables floating around because the object isn't deleted. Nothing major though. | 20:05 |
-!- in3xes_ is now known as in3xes | 20:05 | |
@sonney2k | alesis-novik, ok... | 20:26 |
alesis-novik | sonney2k, found another potential memory problem, committing. | 20:30 |
blackburn | alesis-novik: do you know what gaussian naive bayes is? | 20:35 |
blackburn | I've been thinking about GNB+Gaussian integration | 20:35 |
@sonney2k | alesis-novik, I suggest you compile with --trace-memory-allocs and check for leaks too :) | 20:37 |
alesis-novik | sonney2k, will do | 20:37 |
alesis-novik | what did you have in mind blackburn | 20:37 |
blackburn | alesis-novik: well now it uses gaussian pdf | 20:38 |
blackburn | maybe it is even possible to fit gaussians for every class, not only with diag cov | 20:38 |
blackburn | one issue with GNB now is that sometimes it leads to underflow or so | 20:39 |
blackburn | I mean for every class the probability becomes so small that the decision is not correct | 20:39 |
alesis-novik | but how do you want to integrate it with Gaussian? | 20:48 |
blackburn | I'm not sure if CGaussian is exactly what I mean :) | 20:48 |
-!- in3xes [~in3xes@180.149.49.227] has quit [Quit: Leaving] | 21:05 | |
CIA-87 | shogun: Alesis Novik master * r2d2fbf8 / src/shogun/clustering/GMM.cpp : added SG_UNREF where needed - https://github.com/shogun-toolbox/shogun/commit/2d2fbf8433c9bfb0621a046dc7408aa49f15c2d8 | 21:09 |
CIA-87 | shogun: Soeren Sonnenburg master * ra514bca / src/shogun/clustering/GMM.cpp : | 21:09 |
CIA-87 | shogun: Merge pull request #255 from alesis/gmm | 21:09 |
CIA-87 | shogun: added SG_UNREF where needed - https://github.com/shogun-toolbox/shogun/commit/a514bca4f7200429d1696953ca9d3cadacb80a5f | 21:09 |
@sonney2k | alesis-novik, btw would it be possible to set the matrix of means etc for the gaussians in one go? | 21:09 |
@sonney2k | now it seems one has to set multiple vectors | 21:10 |
@sonney2k | alesis-novik, I mean you could just add a set_* function which takes an SGMatrix as argument and then call the respective SGVector functions multiple times... | 21:10 |
alesis-novik | sonney2k, that's because it's just calling the underlying CGaussian::set_mean(...) | 21:10 |
@sonney2k | alesis-novik, yes but you can emulate that right? | 21:11 |
@sonney2k | I mean split up the mean matrix etc | 21:11 |
alesis-novik | but what about the covariance one then? | 21:11 |
@sonney2k | alesis-novik, covariance is for every gaussian right? | 21:12 |
@sonney2k | I mean you have 1 per gaussian? | 21:12 |
alesis-novik | yes | 21:12 |
@sonney2k | so in your GMM you would have multiple cov matrices? | 21:13 |
@sonney2k | then it doesn't make sense indeed | 21:13 |
alesis-novik | Well, every Gaussian in the mixture model has a mean and cov. While making a bulk set_means makes sense, I don't really think that a bulk set_covs using SG* would make sense | 21:14 |
@sonney2k | alesis-novik, yes I agree so we keep it like it is then | 21:28 |
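For completeness, the bulk set_means that was considered (and dropped) would have been roughly this shape; the component accessor is hypothetical:

```cpp
#include <cstring>  // memcpy

// Split a dim x num_components means matrix into one column per Gaussian
// and forward each to the existing CGaussian::set_mean().
void CGMM::set_means(SGMatrix<float64_t> means)
{
    for (int32_t i=0; i<means.num_cols; i++)
    {
        SGVector<float64_t> mean(means.num_rows);
        memcpy(mean.vector, means.matrix+i*means.num_rows,
               means.num_rows*sizeof(float64_t));
        get_component(i)->set_mean(mean);  // hypothetical accessor
    }
}
```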
-!- f-x [~user@117.192.207.49] has quit [Remote host closed the connection] | 21:54 | |
-!- serialhex [~quassel@99-101-148-183.lightspeed.wepbfl.sbcglobal.net] has quit [Ping timeout: 250 seconds] | 22:04 | |
-!- serialhex [~quassel@99.101.148.183] has joined #shogun | 22:04 | |
--- Log closed Mon Aug 01 00:00:11 2011 |