IRC logs of #shogun for Wednesday, 2011-06-01

--- Log opened Wed Jun 01 00:00:48 2011
CIA-32shogun: Soeren Sonnenburg master * rcf2f5b2 / src/configure :00:31
CIA-32shogun: add jblas / ujmp detection (assumes that the jblas.jar / ujmp.jar are00:31
CIA-32shogun:  either in /usr/share/java/ or in CLASSPATH) - http://bit.ly/jMqzlm00:31
blackburnsleep time! see ya00:37
-!- blackburn [~qdrgsm@188.168.4.252] has left #shogun []00:38
@sonney2kn800:46
@sonney2kbut agreed00:46
* sonney2k stops doing shogunate maintenance 00:47
-!- serialhex [~quassel@99-101-148-183.lightspeed.wepbfl.sbcglobal.net] has quit [Quit: No Ping reply in 180 seconds.]01:53
-!- serialhex [~quassel@99-101-148-183.lightspeed.wepbfl.sbcglobal.net] has joined #shogun01:54
-!- alesis-novik [~alesis@188.74.87.84] has quit [Quit: I'll be Bach]02:33
-!- alesis-novik [~alesis@188.74.87.84] has joined #shogun02:43
-!- VojtechFranc [~quassel@gw-101.scnet.cz] has joined #shogun08:50
-!- VojtechFranc [~quassel@gw-101.scnet.cz] has quit [Remote host closed the connection]09:23
-!- VojtechFranc [~quassel@gw-101.scnet.cz] has joined #shogun09:24
-!- Daniel__ [3edb9704@gateway/web/freenode/ip.62.219.151.4] has joined #shogun09:29
Daniel__Hi, in what file is the class SGString<> defined?09:29
Daniel__It's not in the documentation ...09:33
-!- blackburn [~qdrgsm@188.168.4.255] has joined #shogun09:51
blackburnwow09:51
VojtechFrancalesis-novik, hi, can we talk?10:03
-!- Daniel__ [3edb9704@gateway/web/freenode/ip.62.219.151.4] has quit [Ping timeout: 252 seconds]10:13
-!- heiko [~heiko@infole-06.uni-duisburg.de] has joined #shogun10:21
@sonney2kin datatype.h10:36
@sonney2kI think we have to cancel the meeting today10:40
@sonney2kand do it on Monday next week10:40
@sonney2kblackburn, VojtechFranc, alesis-novik, mlsec, heiko would that fit better?10:41
VojtechFrancsonney2k, for me both options are OK10:42
@sonney2kI've had 50% complaining about the appointment today10:42
@sonney2knips deadline etc10:42
@sonney2kVojtechFranc, how is it going with alesis-novik? Did you already do things?10:43
heikosonney2k, ok, next monday is ok for me10:43
-!- blackburn [~qdrgsm@188.168.4.255] has quit [Ping timeout: 248 seconds]10:43
VojtechFrancsonney2k, we are just starting the real programming. Alesis finished exams the last week.10:44
-!- blackburn [~qdrgsm@188.168.4.255] has joined #shogun10:44
@sonney2k(I am asking here before sending another stupid meeting schedule email causing yet more chaos)10:44
@sonney2kVojtechFranc, so nothing yet - I guess?10:44
@sonney2kbut you have a plan I suppose :)10:44
@sonney2kblackburn, monday also good for you?10:45
heikosonney2k, when will the meeting be next Monday? I do not have time between 1330 and 1530 (German time)10:45
VojtechFrancsonney2k, yes, we are following a plan, don't worry :)10:46
heikothat's 1130 to 1330 utc i think10:46
@sonney2kheiko 13 UTC so 15:00 german time10:46
@sonney2kwe could schedule it to 13:30 UTC10:46
heikook, then I will be half an hour late10:46
heikook10:46
heikothis would be ok10:46
heikowhat about using a doodle?10:46
heikodo you know this? is quite handy for getting meetings organized on the internet10:47
@sonney2kheiko, yes I know but I am not really flexible10:47
VojtechFrancsonney2k, actually I already forced myself to program the algorithm in Matlab and Alesis is now trying to do it in Shogun10:47
@sonney2kheiko, I need to have childcare in that period...10:47
@sonney2kread grandparents10:48
@sonney2kVojtechFranc, very good10:48
@sonney2kbtw congrats on your ICML paper!10:48
heikosonney2k, ok :)10:48
@sonney2kthings will be better in july though10:48
heikowell, enjoy your family life. Currently I know what to do, so it's ok10:50
@sonney2kheiko in case you have things to discuss - ask. btw great job with the parameter setting generator!10:52
heikoI will ask - thanks :)10:52
@bettybooheiko, yeah10:52
@sonney2kheiko, I hope you can now rely on the CMachine infrastructure10:53
heikowas quite a lot of fiddling with new and delete :)10:53
@sonney2kand perf measures etc10:53
heikoI will check it out10:53
@sonney2kI mean now you can do the real thing just assuming that you do modsel for a machine10:53
heikoCurrently I am building a method that applies the ParameterCombination trees to an actual machine10:53
heikoyes10:53
heikothen I will build one simple splitting strategy and a simple grid search10:54
@sonney2kgood plan10:54
heikothen when everything works fine, I will finish the subset stuff etc10:55
heikoand then examples10:55
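
The "simple grid search" heiko plans above boils down to enumerating every combination of candidate parameter values. A minimal sketch of that enumeration, assuming a flat set of named numeric parameters: the types `ParamGrid` and `ParamSetting` and the function `enumerate_grid` are hypothetical names for illustration and are not Shogun's CParameterCombination trees.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; not part of Shogun.
using ParamGrid = std::map<std::string, std::vector<double>>;  // name -> candidate values
using ParamSetting = std::map<std::string, double>;            // name -> one chosen value

// Returns the Cartesian product of all candidate values, one setting per combination.
std::vector<ParamSetting> enumerate_grid(const ParamGrid& grid) {
    std::vector<ParamSetting> settings{ParamSetting{}};  // start with one empty setting
    for (const auto& entry : grid) {
        assert(!entry.second.empty());  // every parameter needs at least one candidate
        std::vector<ParamSetting> extended;
        for (const auto& partial : settings) {
            for (double value : entry.second) {
                ParamSetting s = partial;  // copy the partial setting and add one value
                s[entry.first] = value;
                extended.push_back(s);
            }
        }
        settings = std::move(extended);
    }
    return settings;
}
```

Each returned setting would then be applied to a machine and scored on the splits produced by the splitting strategy; that evaluation step is left out here.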
heikoJust to remind you: I will leave for some climbing over the long weekend till sunday this evening10:56
@sonney2kheiko, I remember - we still have 3 months. Given your progress it should be feasible if we don't stumble upon tons of stepping stones10:57
heikohope so :)10:58
heikodo you still have a minute?11:00
@sonney2kshoot11:00
heikojust encountered a thing ...11:00
heikofor the C parameter there is C1 and C211:00
heikofor libsvm11:00
heikoand also CSVM11:00
heikobut you only hand ONE C-value in the constructor, both C1 and C2 are then set to this value11:01
heikohowever, when I want to set a C parameter, it is not known, because there are only C1 and C211:01
heikobut when you want to specify the C parameter for model selection you only want to set ONE C parameter11:01
@sonney2kthere are two C's one for positive and one for negative class11:02
alesis-novikMorning11:02
VojtechFrancalesis-novik, hi11:03
heikoyes, but they are not used in the libsvm case when the svm is constructed by the constructor11:03
heikoand when I want only ONE c parameter to be modseled, I can't just use C but I have to use C1 or C211:04
alesis-novikSo, I've started working on the algorithm. I've decided to start by rewriting the Gaussian class to work with decomposed matrices and in log domain11:04
heikoAlso, for example CLibSvm does only use one:11:04
heikoparam.C = get_C1(); //is the only call11:04
@sonney2kalesis-novik, ok - just don't forget to write a weekly *short* email to the mailinglist keeping everyone updated11:06
alesis-noviksonney2k, will do.11:06
VojtechFrancalesis-novik, great. How can I see the code?11:06
alesis-novikI'll commit it to my branch when I get it to the state where it compiles11:07
VojtechFrancok11:07
alesis-novikWhen I do, it will be in https://github.com/alesis/shogun/tree/gmm11:08
VojtechFrancok, then please let me know when I can start testing11:09
alesis-novikNot to break the class too much, I want to keep the get/set covariance as well (maybe something else will find the class useful)11:09
@sonney2kheiko, checking11:10
@sonney2kheiko, it uses C1 and C211:10
@sonney2k    float64_t weights[2]={1.0,get_C2()/get_C1()};11:11
heikoyes11:11
heikosomething like to set multiple parameters with one value is needed here11:11
@sonney2kI think I/we have to modify code to make this work.11:11
@sonney2ke.g. C2 will always be 0.0 by default11:12
@sonney2kand if it is, only C1 is used11:12
alesis-novikVojtechFranc, so all in all, I'm planning to keep the structure as it is now, but rewrite the methods.11:12
heikook, I see11:12
heikoand then just change C1 in modsel if one wants the standard C11:13
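
The fallback sonney2k proposes above (C2 defaults to 0.0, and in that case only C1 counts) can be sketched as follows. This is an illustration under assumptions: `SvmCosts`, `effective_C2` and `class_weights` are hypothetical names, not Shogun's API; only the weight ratio mirrors the `weights[2]` line quoted earlier in the log.

```cpp
#include <cassert>

// Hypothetical stand-in for the SVM's two cost parameters; not Shogun's class.
struct SvmCosts {
    double C1 = 1.0;  // the "standard" C that model selection would vary
    double C2 = 0.0;  // 0.0 is treated as "unset"

    // Second cost actually used: C2 if it was set, otherwise C1.
    double effective_C2() const { return C2 > 0.0 ? C2 : C1; }
};

// Per-class weights in the style of the quoted line:
// weights[0] is fixed to 1.0 and weights[1] is the ratio C2/C1.
inline void class_weights(const SvmCosts& c, double weights[2]) {
    assert(c.C1 > 0.0);  // the ratio is undefined otherwise
    weights[0] = 1.0;
    weights[1] = c.effective_C2() / c.C1;
}
```

With this convention, model selection only has to vary C1 to get the usual single-C behaviour, and setting C2 explicitly remains possible for unbalanced classes.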
@sonney2kheiko, only alternative is to be able to set combinations of parameters11:13
heikodo you think this problem will appear at another place?11:13
heikoi would prefer the first way if this only happens here because it would be quite technical to implement the second one11:14
VojtechFrancalesis-novik, I'm not familiar with the current GMM class. I need to look at it.11:14
alesis-novikEssentially I use the Gaussian class within the GMM class to represent the mixture components11:15
@sonney2kheiko, I don't like the C1/C2 thing either - I just don't know how to do it better11:17
@sonney2kand I don't hope this happens more often11:17
@sonney2ker do hope11:17
heikommh11:18
VojtechFrancalesis-novik, in which files is the definition of the current GMM ?11:18
alesis-novikclustering/GMM.h11:18
alesis-novikand I guess distributions/Gaussian.h11:19
@sonney2kheiko, I mean we have cases where the SVM has just one C or one can set a vector of C's (example based)11:19
heikommmh perhaps it is best to implement this multiple parameters-one value thing11:20
@sonney2kheiko, or not and assume that we set a vector of Cs or just one C11:21
heikoI don't really understand this, could you explain it again?11:21
@sonney2kheiko, can't you set a vector of values?11:22
heikowhere? for the modsel parameters?11:23
@sonney2kheiko, yes I mean currently a double or SGObject11:23
@sonney2kbut a vector of doubles then11:23
heikoyes it is possible11:24
heikohowever, currently the modelSelectionParams only set single values or SGObject11:24
blackburnsonney2k: yeah today is fine too :)11:29
VojtechFrancalesis-novik, do you need at this point some help from my side?11:29
alesis-novikVojtechFranc, did you look at the definitions? If the current structure seems fine, then I don't really have any more questions11:30
alesis-novikI've rewritten the definition of Gaussian to have D and U instead of cov_matrix and added a "type" field11:31
alesis-novikSo now I'm rewriting the implementation of pdf methods11:32
alesis-novikI'm still thinking of keeping the "constant" part in, computing it when new covariances are set11:33
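
One way to read alesis-novik's plan above: store the covariance as cov = U diag(D) U^T, evaluate the density in the log domain, and recompute the "constant" part only when the covariance changes. A minimal sketch under those assumptions; the class name `DecomposedGaussian` and its layout (row `U[i]` holds the i-th eigenvector) are hypothetical and not Shogun's Gaussian class.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: covariance stored as U * diag(D) * U^T, U orthonormal.
class DecomposedGaussian {
public:
    DecomposedGaussian(std::vector<double> mean, std::vector<double> D,
                       std::vector<std::vector<double>> U)
        : mean_(std::move(mean)), D_(std::move(D)), U_(std::move(U)) {
        assert(D_.size() == mean_.size() && U_.size() == mean_.size());
        update_constant();
    }

    // log N(x | mean, cov) = log_const - 0.5 * sum_i (u_i^T (x - mean))^2 / D_i
    double log_pdf(const std::vector<double>& x) const {
        assert(x.size() == mean_.size());
        double quad = 0.0;
        for (std::size_t i = 0; i < D_.size(); ++i) {
            double proj = 0.0;  // projection of (x - mean) onto eigenvector i
            for (std::size_t j = 0; j < x.size(); ++j)
                proj += U_[i][j] * (x[j] - mean_[j]);
            quad += proj * proj / D_[i];
        }
        return log_const_ - 0.5 * quad;
    }

private:
    // The "constant" part: -0.5 * (d * log(2*pi) + log det(cov)),
    // where log det(cov) is the sum of the log eigenvalues.
    void update_constant() {
        const double two_pi = 6.283185307179586;
        double log_det = 0.0;
        for (double d : D_) {
            assert(d > 0.0);  // eigenvalues must be positive
            log_det += std::log(d);
        }
        log_const_ = -0.5 * (static_cast<double>(D_.size()) * std::log(two_pi) + log_det);
    }

    std::vector<double> mean_;
    std::vector<double> D_;               // eigenvalues
    std::vector<std::vector<double>> U_;  // U_[i] is the i-th eigenvector
    double log_const_ = 0.0;
};
```

A setter for new covariances would call `update_constant()` again, so `log_pdf` never has to invert a matrix or compute a determinant.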
VojtechFrancregarding the definitions of GMM class, maybe it would be better if the constructor of the class was not linked to the EM algorithm.11:36
VojtechFrancGMM parameters can be estimated by other methods11:36
VojtechFrancit is also reasonable to set the parameters of GMM manually11:36
VojtechFrancisn't it better to have a method train_em(max_iter,min_change) ?11:37
VojtechFrancI mean the GMM class should be mainly a data structure to represent a mixture of Gaussians. The EM algorithm is just a single way how to set up its parameters.11:40
alesis-novikOk, so the constructor should just have the number of components (or initial number of components) and then a train_em method of actual EM11:41
VojtechFrancyes, I think it is better11:42
VojtechFrancmaybe, there should be another method (or constructor) to set up the initial value of the GMM parameters11:43
alesis-novikShould that be in form of GMM(n, means, covs) or GMM(n, gaussians)?11:45
VojtechFrancto me both options are OK11:46
VojtechFranchowever, the constructor should have also another argument : cov_type11:46
alesis-novikah, if it's GMM(n, gaussians) I guess that wouldn't be needed, because cov_type would be in the gaussians already11:47
VojtechFrancI think the typical usage will be GMM(n_components,cov_type) then running the EM which sets the parameters from the training set11:47
alesis-novikI'll start from that then, adding additional features later11:48
VojtechFrancyes, I think it is good to start from simpler and then improve11:48
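
The design agreed on above (a GMM that is mainly a data structure, with EM as one separate training method) might look roughly like this. It is a deliberately simplified sketch for 1-D data, so the cov_type argument is omitted; `Gmm1D`, `set_params` and the accessors are hypothetical names, and only `train_em(max_iter, min_change)` follows the signature suggested in the chat, with the data passed as an extra argument.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical 1-D sketch; not Shogun's GMM class.
class Gmm1D {
public:
    explicit Gmm1D(std::size_t n_components)
        : weights_(n_components, 1.0 / static_cast<double>(n_components)),
          means_(n_components, 0.0), vars_(n_components, 1.0) {}

    // Parameters can be set by hand, independently of any training method.
    void set_params(std::vector<double> w, std::vector<double> m, std::vector<double> v) {
        assert(w.size() == weights_.size() && m.size() == w.size() && v.size() == w.size());
        weights_ = std::move(w); means_ = std::move(m); vars_ = std::move(v);
    }
    const std::vector<double>& weights() const { return weights_; }
    const std::vector<double>& means() const { return means_; }

    // EM is one way to fit the parameters. Stops after max_iter iterations or when
    // the average log-likelihood improves by less than min_change. Returns that value.
    double train_em(const std::vector<double>& x, int max_iter, double min_change) {
        assert(!x.empty());
        const std::size_t n = x.size(), k = weights_.size();
        std::vector<std::vector<double>> resp(n, std::vector<double>(k, 0.0));
        double prev = -std::numeric_limits<double>::infinity(), ll = prev;
        for (int it = 0; it < max_iter; ++it) {
            ll = 0.0;
            for (std::size_t i = 0; i < n; ++i) {  // E-step: responsibilities
                double total = 0.0;
                for (std::size_t c = 0; c < k; ++c) {
                    resp[i][c] = weights_[c] * pdf(x[i], means_[c], vars_[c]);
                    total += resp[i][c];
                }
                for (std::size_t c = 0; c < k; ++c) resp[i][c] /= total;
                ll += std::log(total);
            }
            ll /= static_cast<double>(n);
            if (ll - prev < min_change) break;
            prev = ll;
            for (std::size_t c = 0; c < k; ++c) {  // M-step: re-estimate parameters
                double nc = 0.0, mu = 0.0, var = 0.0;
                for (std::size_t i = 0; i < n; ++i) { nc += resp[i][c]; mu += resp[i][c] * x[i]; }
                mu /= nc;
                for (std::size_t i = 0; i < n; ++i) var += resp[i][c] * (x[i] - mu) * (x[i] - mu);
                weights_[c] = nc / static_cast<double>(n);
                means_[c] = mu;
                vars_[c] = var / nc + 1e-6;  // small floor keeps the variance positive
            }
        }
        return ll;
    }

private:
    static double pdf(double x, double m, double v) {
        const double two_pi = 6.283185307179586;
        return std::exp(-0.5 * (x - m) * (x - m) / v) / std::sqrt(two_pi * v);
    }
    std::vector<double> weights_, means_, vars_;
};
```

This sketch works with plain probabilities for brevity; a robust version would do the E-step in the log domain and guard against components that receive no responsibility.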
alesis-novikAh, I remembered another question: in gmm_em.m you use knnest to initially assign datapoints. Does that make a significant improvement over just iterating EM?11:50
VojtechFrancknnest is used to get initial parameters;  do you have a better way how to do it?11:52
VojtechFrancyou can either randomly initialize the parameters (Mean,Cov,Priors) or randomly initialize the p(y|x) assignments of data to clusters11:53
VojtechFrancknnest returns the crisp assignment of examples to clusters.11:54
alesis-novikThe current one does a static initialization of covariances and KMeans for means11:54
VojtechFrancthe m-code I sent you 1) finds n centers either randomly or by k-means, 2) assigns the points to these centers and 3) estimates means and covs for each cluster11:56
alesis-novikI guess I could try using KNN for parameter initialization, but I'm not sure if the current implementation in Shogun can do that.11:57
VojtechFrancI see, so the question is whether shogun has an implementation of KNN? well I don't know. we should ask Soeren11:58
alesis-novikShogun has an implementation of KNN, but I don't think I can define centers manually and it has to be trained. Essentially it is a proper KNN I guess.12:00
alesis-novikwouldn't the effect be similar to initializing the covariance to a spherical one? (not by type, but covariance itself)12:02
VojtechFrancit seems to be better if the initial estimate of covs fits the data, i.e. setting them to identity matrices is not generally a good idea12:03
blackburnI have some 'experience' with shogun's KNN and could adapt it if it is needed12:03
VojtechFrancI think it is not a big problem to do assignment of N points to K clusters...12:03
VojtechFranci.e. we need only the 1-NN12:04
blackburnjust fyi :)12:05
-!- blackburn [~qdrgsm@188.168.4.255] has quit [Quit: Leaving.]12:06
alesis-novikok, I'll see what can be done. I can just implement 1-NN assignment in the method itself if needed12:08
alesis-novikSince I won't really need to think about distances12:08
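
The initialization discussed above (take some centers, assign every point to its nearest center, then estimate priors, means and covariances per cluster) needs no KNN machinery. A minimal sketch, assuming the centers are already given (for example from k-means or random selection) and using squared Euclidean distance; `nearest_center`, `InitialParams` and `init_from_centers` are hypothetical names, not taken from Shogun or from gmm_em.m.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

// Index of the center closest to x in squared Euclidean distance (1-NN).
std::size_t nearest_center(const Vec& x, const Mat& centers) {
    std::size_t best = 0;
    double best_dist = -1.0;  // negative means "no candidate yet"
    for (std::size_t c = 0; c < centers.size(); ++c) {
        double dist = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            const double diff = x[j] - centers[c][j];
            dist += diff * diff;
        }
        if (best_dist < 0.0 || dist < best_dist) { best_dist = dist; best = c; }
    }
    return best;
}

// Hypothetical container for the initial GMM parameters.
struct InitialParams {
    Vec priors;             // fraction of points per cluster
    Mat means;              // one mean vector per cluster
    std::vector<Mat> covs;  // one full covariance matrix per cluster
};

// Crisp assignment of points to centers, then per-cluster estimates.
// Clusters that receive no points keep zero priors, means and covariances.
InitialParams init_from_centers(const Mat& data, const Mat& centers) {
    assert(!data.empty() && !centers.empty());
    const std::size_t n = data.size(), k = centers.size(), d = data[0].size();
    std::vector<std::size_t> label(n);
    Vec count(k, 0.0);
    InitialParams p{Vec(k, 0.0), Mat(k, Vec(d, 0.0)), std::vector<Mat>(k, Mat(d, Vec(d, 0.0)))};

    for (std::size_t i = 0; i < n; ++i) {  // assign points and accumulate sums
        label[i] = nearest_center(data[i], centers);
        count[label[i]] += 1.0;
        for (std::size_t j = 0; j < d; ++j) p.means[label[i]][j] += data[i][j];
    }
    for (std::size_t c = 0; c < k; ++c) {  // turn sums into means and priors
        if (count[c] == 0.0) continue;
        for (std::size_t j = 0; j < d; ++j) p.means[c][j] /= count[c];
        p.priors[c] = count[c] / static_cast<double>(n);
    }
    for (std::size_t i = 0; i < n; ++i) {  // maximum-likelihood covariance per cluster
        const std::size_t c = label[i];
        for (std::size_t a = 0; a < d; ++a)
            for (std::size_t b = 0; b < d; ++b)
                p.covs[c][a][b] += (data[i][a] - p.means[c][a]) *
                                   (data[i][b] - p.means[c][b]) / count[c];
    }
    return p;
}
```

Because the covariances are estimated from the assigned points, the starting values already reflect the shape of the data, which is the advantage VojtechFranc points out over identity matrices.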
VojtechFranciii[12:12
VojtechFrancyes, exactly12:14
VojtechFrancso, alesis I have to leave now.12:14
alesis-novikOk, I don't think I have any more questions now.12:15
alesis-novikIf I think of any, I'll e-mail you12:15
VojtechFrancOK, great. And please, don't forget to write a SHORT progress report as Soeren pointed out12:16
VojtechFrancbye12:16
alesis-novikSee you12:16
-!- VojtechFranc [~quassel@gw-101.scnet.cz] has quit [Remote host closed the connection]12:16
-!- heiko [~heiko@infole-06.uni-duisburg.de] has quit [Quit: Leaving.]15:10
CIA-32shogun: Heiko Strathmann master * re0842b9 / (2 files): added example of how to apply a set of CParameterCombination trees to a support vector machine (+9 more commits...) - http://bit.ly/kTeDKu20:00
CIA-32shogun: Soeren Sonnenburg master * r95e0dad / (src/configure src/java_modular/swig_typemaps.i): Merge branch 'master' of git://github.com/sploving/shogun - http://bit.ly/iCOdWt20:00
@sonney2kserialhex, can you do this minor change before I apply your patch?20:02
-!- blackburn [~qdrgsm@109.226.70.249] has joined #shogun20:15
blackburnsonney2k: hey it is some message we shouldn't see in logs :)21:19
-!- blackburn [~qdrgsm@109.226.70.249] has quit [Quit: Leaving.]21:28
-!- blackburn [~qdrgsm@109.226.70.249] has joined #shogun21:28
-!- blackburn [~qdrgsm@109.226.70.249] has quit [Read error: No route to host]21:31
-!- blackburn [~qdrgsm@109.226.70.249] has joined #shogun21:54
-!- blackburn [~qdrgsm@109.226.70.249] has quit [Quit: Leaving.]22:26
-!- alesis-novik [~alesis@188.74.87.84] has quit [Quit: I'll be Bach]22:44
serialhexsonney2k: which change?  the one in the msg on github??23:56
serialhexbtw, i msgd you back on that23:57
--- Log closed Thu Jun 02 00:00:30 2011
