IRC logs of #shogun for Sunday, 2011-07-03

--- Log opened Sun Jul 03 00:00:40 2011
@sonney2kf-x, does the templating we discussed yesterday work?00:09
f-xsonney2k: haven't done it yet.. lost track of a few things in that and decided to do SGD until I asked you about that again...00:10
f-xcan you please make it a bit more explicit?00:11
@sonney2kf-x, what?00:11
f-xsonney2k: your idea of doing it00:11
f-xa templated get_vector() calling the other functions in the base class00:11
@sonney2kf-x, you mean a templated function get_vector() etc?00:11
f-xsonney2k: yeah00:12
@sonney2kf-x, the idea was to have one templated function get_vector() in StreamingFile IIRC00:12
@sonney2kthis function then has special implementations for the different types00:12
f-xsonney2k: ok00:13
@sonney2kand within this just calls get_float64_vector etc00:13
@sonney2kthat's all00:13
f-xso i'd specialize the <int> to call get_int_vector() from the base class?00:13
f-x*derived00:13
@sonney2kyou do that all in the base class00:13
f-xi mean to call the get_*_vector function which will later be implemented in the derived class00:14
@sonney2kf-x, in the derived class you just call get_vector<T>()00:14
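A minimal sketch of the dispatch sonney2k describes here, not the actual Shogun code: one templated get_vector<T>() lives in the base class, its per-type specializations just forward to the matching typed virtual reader, and derived classes only override those readers. The class names StreamingFileSketch and ConstantFileSketch, the helper read_one, and the use of std::vector in place of Shogun's own containers are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical base class standing in for StreamingFile.
class StreamingFileSketch {
public:
    virtual ~StreamingFileSketch() {}

    // Typed readers; derived classes override the ones they support.
    virtual void get_int_vector(std::vector<int32_t>& out) { out.clear(); }
    virtual void get_float64_vector(std::vector<double>& out) { out.clear(); }

    // Single templated entry point; only the specializations below exist.
    template <class T> void get_vector(std::vector<T>& out);
};

// Each specialization forwards to the typed virtual reader of matching type.
template <>
inline void StreamingFileSketch::get_vector<int32_t>(std::vector<int32_t>& out) {
    get_int_vector(out);
}

template <>
inline void StreamingFileSketch::get_vector<double>(std::vector<double>& out) {
    get_float64_vector(out);
}

// Hypothetical derived class: it only implements the typed readers.
class ConstantFileSketch : public StreamingFileSketch {
public:
    void get_int_vector(std::vector<int32_t>& out) override { out = {1, 2, 3}; }
    void get_float64_vector(std::vector<double>& out) override { out = {0.5, 1.5}; }
};

// Templated caller, analogous to a templated feature class calling get_vector<T>().
template <class T>
std::vector<T> read_one(StreamingFileSketch& file) {
    std::vector<T> v;
    file.get_vector<T>(v);
    return v;
}
```

With this layout the templated caller never names a typed reader directly; the virtual call inside each specialization picks up whatever the derived class implemented.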
f-xsonney2k: okay, let me check out the code for a bit00:15
f-xi'll get back to you in a minute00:16
f-xsonney2k: so let's take the case of StreamingFileFromStringFeatures and StreamingFileFromSimpleFeatures00:18
f-xand say T is float64_t00:18
f-xin each of them i call get_vector<float64_t>(), right?00:19
f-xwhere get_vector<float64_t> is implemented in the base class, where it just calls the appropriate function (which is again supposed to be implemented in the derived class)00:20
@sonney2kf-x, no you always call get_vector<T>00:21
f-xsonney2k: i mean it indirectly calls get_vector<float64_t> in this case00:21
@sonney2kbecause these derived classes are templated00:21
@sonney2kyeah00:22
f-xsonney2k: so my question is:00:22
f-xsince get_vector<T> is specialized only for type T,00:22
f-xhow will it call a different function for SimpleFeatures and StringFeatures?00:23
f-xboth have same parameters get_*_vector(T* vector, int len)00:23
f-xbut the way of getting the vector would be different for String and Simple00:23
@sonney2kf-x, huh?00:24
@sonney2kyou don't use vector for strings00:24
@sonney2kbut SGString00:24
f-xsonney2k: okay, so if that's the case then things would work00:26
f-xbut isn't SGString another encapsulation of (T*, int/index)?00:26
@sonney2kthat is the case yes, dense vectors for dense matrices, sparse matrices -> sparse vectors, strings -> strings00:26
@sonney2kyes indeed00:27
f-xsonney2k: so instead of using (T*, int) for dense vectors, i should switch to SGVector<T>?00:27
@sonney2kyes00:28
@sonney2kthese should return SGVector00:29
f-xhmm.. if we take care to keep the function signatures different, things will work00:30
f-xbut now I'm seeing why feature-oriented class division is better than the present stream-oriented stuff00:30
blackburnhm i did nothing today00:31
@sonney2kblackburn, a black day for shogun :D00:32
blackburnsure00:32
blackburnah no I started writing article about IRLM with MDS00:32
@sonney2kf-x, I don't see that any longer00:33
f-xsonney2k: because if we want to read sparse vectors from an ascii file, how would we do it for different file formats? if we try to have that then we need to change the function names in StreamingAsciiFile.. (like get_svmlight_sparse_vector(SGSparseVectorEntry<T>*, int), and another get_vw_sparse_vector(SGSparseVectorEntry<T>*, int))00:33
@sonney2kactually, I also don't see the problem when stringfeatures always return get_vector stuff00:34
@sonney2kf-x, well it is not an asciifile then00:34
@sonney2kit is a svmlight file00:34
f-xsonney2k: oh00:34
f-xso ascii just refers to the native shogun format?00:34
blackburndoes vw stand for volkswagen? :D00:35
@sonney2kto some format ye00:35
@sonney2ks00:35
blackburnok joking I know that it is vowpal wabbit00:35
* sonney2k sent his harddrive back to seagate for replacement00:36
f-xsonney2k: and about the problem with stringfeatures: i can't think of a proper way to define get_vector<T> in the base class00:36
blackburnsonney2k: nice, have you backed up everything you wanted?00:36
@sonney2kf-x, I don't see the problem - please explain00:36
@sonney2kblackburn, at least it did copy without telling me that there are errors00:36
f-xsonney2k: i mean now i've used get_*_vector(T*, int) and get_*_string(T*, int)00:37
@sonney2kok00:37
@sonney2kbut?00:37
f-xso when i try to define get_vector<int>(), how should i define it?00:37
@sonney2kf-x, well it just calls get_*_vector of matching type00:38
@sonney2kit could even return a SGVector<T> for the beauty of it00:38
f-xsonney2k: but get_*_vector refers to the reading function for dense vectors00:39
f-xfor strings it is get_*_string00:39
@sonney2kf-x, and?00:39
@sonney2kyou also have get_*_string stuff00:39
@sonney2kso you need to define get_string<T> in the same way of course00:40
f-xsonney2k: ah - that's the thing i missed out!00:40
f-xi thought one function would do all the magic00:41
@sonney2kf-x, same for get_sparse_vector<T>00:41
@sonney2kf-x, no00:41
@sonney2kf-x, we used it as example only00:41
@sonney2kthe plan was to start with get_vector<T> and then do the others later00:41
blackburnI want to change Labels00:42
@sonney2kblackburn, ?00:42
@sonney2kto +100:42
f-xsonney2k: looking back at yesterday's discussion, it makes sense now.. took me a long time to get it properly :/00:42
@sonney2k:D00:42
bettyboo:*)00:42
blackburnI don't like that it provides no functionality for sth like 3 100000 2 5453 423400:43
blackburnthese min_labels at gaussian naive bayes and knn are pathetic!00:43
@sonney2kblackburn, it is user error to some extent though00:44
@sonney2kI mean why should we have to do the work to do proper label hashing00:44
blackburnwhy not? :)00:44
@sonney2kbecause it takes time and effort00:45
blackburnwell computing min and max label in KNN is O(n)00:45
blackburnhashing should be even faster00:45
@sonney2kI don't mind if you add such functions but not enabled automagically for now00:45
@sonney2kblackburn, are we talking about the same thing? hashing is a bit more expensive even00:46
@sonney2kf-x, so ok - then please give it a try00:46
blackburnhmm00:46
blackburnwell yes, may be it could be slower..00:47
f-xsonney2k: thanks - i'll start with that now00:47
@sonney2kf-x, k thanks00:47
blackburnok i don't want to change Labels00:48
blackburn:D00:48
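For reference, the label hashing debated above could look roughly like the following sketch, which remaps arbitrary integer labels such as 3 100000 2 5453 4234 to dense indices 0..k-1 and keeps the reverse table. Nothing here is Shogun API: LabelMapSketch and remap_labels are hypothetical names. A hash map gives expected linear time overall, though with a larger constant than a plain min/max scan, which fits sonney2k's remark that hashing is a bit more expensive.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical container for the two directions of the mapping.
struct LabelMapSketch {
    std::vector<int32_t> index_to_label;                  // dense index -> original label
    std::unordered_map<int32_t, int32_t> label_to_index;  // original label -> dense index
};

// Assigns dense indices in order of first appearance and returns the remapped labels.
inline std::vector<int32_t> remap_labels(const std::vector<int32_t>& labels,
                                         LabelMapSketch& map) {
    std::vector<int32_t> dense;
    dense.reserve(labels.size());
    for (int32_t label : labels) {
        auto it = map.label_to_index.find(label);
        if (it == map.label_to_index.end()) {
            // First time this label is seen: give it the next free index.
            int32_t next = static_cast<int32_t>(map.index_to_label.size());
            it = map.label_to_index.emplace(label, next).first;
            map.index_to_label.push_back(label);
        }
        dense.push_back(it->second);
    }
    return dense;
}
```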
blackburnhmm what to do after mid-term - what is the question of the universe00:49
blackburnthat*00:49
@sonney2kblackburn, I recall from that matlab toolbox  that there are plenty of dimred methods around00:55
blackburnsonney2k: I'm pretty sure we don't need all of them00:55
@sonney2kheh :)00:55
blackburnfor sure I'll implement Hessian LLE00:56
blackburnbut there are some not-so-useful things00:56
@sonney2kblackburn, did you optimize pca/kpca yet?00:58
blackburnsonney2k: not yet, it is in todo too :)00:58
@sonney2kanyway I have to go to bed again... only 1 more day and I have to be alive early in the morning at work :)00:58
@sonney2kl8r00:58
blackburnsee ya00:59
-!- blackburn [~blackburn@188.122.238.13] has quit [Quit: Leaving.]01:05
-!- f-x [~user@117.192.218.221] has quit [Remote host closed the connection]02:28
-!- blackburn [~blackburn@188.122.238.13] has joined #shogun11:06
-!- blackburn1 [~blackburn@188.122.252.251] has joined #shogun14:08
-!- blackburn [~blackburn@188.122.238.13] has quit [Ping timeout: 244 seconds]14:10
-!- blackburn [~blackburn@85.114.187.90] has joined #shogun14:50
-!- blackburn1 [~blackburn@188.122.252.251] has quit [Ping timeout: 240 seconds]14:51
-!- f-x [~user@117.192.209.232] has joined #shogun14:53
-!- blackburn [~blackburn@85.114.187.90] has quit [Ping timeout: 255 seconds]14:56
-!- blackburn [~blackburn@188.122.238.99] has joined #shogun15:01
-!- blackburn1 [~blackburn@188.122.238.99] has joined #shogun15:25
-!- blackburn [~blackburn@188.122.238.99] has quit [Ping timeout: 255 seconds]15:27
-!- in3xes_ [~in3xes@180.149.49.227] has joined #shogun15:40
-!- in3xes [~in3xes@180.149.49.227] has quit [Ping timeout: 240 seconds]15:44
-!- in3xes_ is now known as in3xes15:46
-!- srikanth [~mrsrikant@59.92.0.164] has joined #shogun16:00
-!- blackburn1 [~blackburn@188.122.238.99] has quit [Ping timeout: 255 seconds]16:40
-!- srikanth [~mrsrikant@59.92.0.164] has quit [Quit: Leaving]17:39
-!- blackburn [~blackburn@188.122.238.99] has joined #shogun19:12
-!- blackburn [~blackburn@188.122.238.99] has quit [Read error: No route to host]20:30
-!- blackburn [~blackburn@188.122.238.99] has joined #shogun20:30
@sonney2kf-x, so did it work out?20:49
@sonney2kblackburn, did you send your weekly report yet?20:50
blackburnsonney2k: oh sorry was fucking with arpack all day long20:50
blackburn:D20:50
blackburnwill do it now20:50
f-xsonney2k: yes, just did it now... and i think it works20:50
f-xbut again, one minor issue20:50
@sonney2kI am just wondering because no one sent an email so far20:50
@sonney2kf-x, yes?20:51
blackburnI just forgot about it20:51
f-xsonney2k: StreamingFileFromSimpleFeatures<T> needs to implement each of get_*_vector() and get_*_vector_and_label()20:51
f-xi mean even if T=int, it needs to implement get_bool_vector, get_real_vector, etc20:52
f-xand it has a CStreamingFile<T>* as member20:53
f-xsorry20:53
f-xCSimpleFeatures<T>*20:53
f-xfetching examples is done using simple_features->get_feature_vector(), and this returns an SGVector<T>20:54
f-xsonney2k: i'll point you to the code, just a sec20:55
@sonney2kI understand20:55
@sonney2kthough that problem is also there without templates if I understand correctly20:56
@sonney2k*argh*20:56
f-xsonney2k: yeah it is20:56
f-xso we need to typecast still20:56
f-xeven though it is much better than the earlier method20:56
@sonney2kthis means it will return broken things ...20:57
f-xsonney2k: not if used properly20:57
f-xi mean nobody should call get_int_vector when using a StreamingFileFromSimpleFeatures<float64_t>20:57
f-xthe typecast is done as without it things won't compile20:57
@sonney2kblackburn, btw if you don't know how to spend time - there is this nice GPL'ed version of C5.0 lingering around - it would be cool to have this in shogun20:58
@sonney2kf-x, can you throw SG_ERROR's if the T type doesn't match the get_*_vector one?20:58
@sonney2kI mean you still need the typecast but at least it cannot be misused20:58
blackburnsonney2k: well I'll either find some guy to do it or do it by myself but later20:59
blackburnbefore mid-term I'm going to really polish ready algorithms20:59
f-xsonney2k: i tried that, but i couldn't find an easy way for it20:59
@sonney2kblackburn, sure sure - I just saw you doing lots of trac cleaning issues21:00
blackburnsonney2k: most of them were kinda old21:00
blackburnah yes, need your opinion21:01
blackburnI have closed an enhancement ticket about putting lapack and blas into CMath21:02
@sonney2kblackburn, seen that. I still don't want to require lapack / blas. It is ok to lose features but not to be unable to compile....21:03
@sonney2kf-x, I mean since the class is templated now you know or?21:04
f-xsonney2k: we'll at least need some kind of helper function to do that (as far as i can see)21:04
f-xi'm worried about complicating code.. but wait, i'll try some more21:04
@sonney2kf-x, I mean couldn't you do implementations for get_bool_vector<bool>() and then get_bool_vector<other_type>() { SG_ERROR("type mismatch") } ?21:05
blackburnsonney2k: exactly, but they would better install it (or they will not see my beautiful dimreduction ;)21:05
@sonney2kblackburn, yeah - but getting this to work with osx and cygwin is a pain21:05
f-xsonney2k: that would mean templating those functions further.. but it looks like something which can work21:06
blackburnsonney2k: what is used in osx and windoze instead of lapack?21:06
@sonney2kf-x, I mean you can do the correct implementation for the correct type - and then use the macro magic to throw errors for unsupported types21:07
@sonney2kblackburn, lapack and blas :)21:07
blackburnsonney2k: okay but what is the problem?21:07
@sonney2kblackburn, try to compile it under osx, cygwin and you will understand the problem21:08
f-xsonney2k: you're right.. but this macro business has already reduced readability a bit21:09
f-xmore so now since the class StreamingFileFromSimpleFeatures is templated and the .cpp file has to be merged with the .h file21:09
blackburnsonney2k: is it a problem on our side?21:09
@sonney2kf-x, yes I know but for just showing error messages it is ok I think21:09
@sonney2kblackburn, only if we want windows/osx users :-)21:10
blackburnsonney2k: I think we want :)21:11
f-xsonney2k: ok.. i'll fiddle around with those macros to get that SG_ERROR in.. and finally this problem can be solved21:11
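One way the "correct implementation for the matching type, error for everything else" idea might be written is sketched below. It is an illustration under assumptions rather than the code f-x ended up with: FromSimpleFeaturesSketch, fetch and SKETCH_GETTER are hypothetical names, std::vector replaces SGVector, and a thrown std::runtime_error stands in for SG_ERROR. The mismatch is caught through overload resolution, so no typecast is needed in the getters.

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Matching element type: hand the stored vector over.
template <class T>
void fetch(const std::vector<T>& src, std::vector<T>& dst) { dst = src; }

// Any other element type: fail loudly (stand-in for SG_ERROR).
// The overload above is more specialized, so it wins whenever the types agree.
template <class T, class U>
void fetch(const std::vector<T>&, std::vector<U>&) {
    throw std::runtime_error("type mismatch");
}

// Macro generating one typed getter, in the spirit of the macros mentioned above.
#define SKETCH_GETTER(name, type) \
    void name(std::vector<type>& out) { fetch(m_current, out); }

// Hypothetical stand-in for StreamingFileFromSimpleFeatures<T>.
template <class T>
class FromSimpleFeaturesSketch {
public:
    explicit FromSimpleFeaturesSketch(std::vector<T> v) : m_current(std::move(v)) {}

    SKETCH_GETTER(get_bool_vector, bool)
    SKETCH_GETTER(get_int_vector, int32_t)
    SKETCH_GETTER(get_real_vector, double)

private:
    std::vector<T> m_current;  // the example that would come from the feature object
};
```

Calling get_real_vector on a FromSimpleFeaturesSketch<double> copies the data, while get_int_vector on the same object throws instead of returning a silently reinterpreted buffer.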
blackburnweekly report!21:12
blackburnokay done21:18
blackburnsonney2k: how to go to header in vim? :)21:22
@sonney2kblackburn, thanks21:23
@sonney2kblackburn, install A.vim21:23
@sonney2kthen you just type21:23
@sonney2k:A21:23
* sonney2k is a heavy user of that feature21:23
blackburnsonney2k: what about #include<cblas.h>? how to go to cblas.h?21:24
blackburnah I see21:24
blackburnih21:24
@sonney2kahh, just type gf21:24
blackburnhm21:25
blackburnsonney2k: how to go back? :D21:25
@sonney2kctrl+o21:25
blackburnAWESOME21:25
@sonney2kctrl+i to go the other direction21:25
@sonney2kf-x, does it work?21:59
f-xsonney2k: sorry i haven't checked it out properly yet.. pretty sure it will work out, but i was exploring some alternatives in which templating is not necessary... some ideas are possible to implement, but they will have the overhead of checking types at runtime with each function call, which templates won't... templates win, i guess22:02
f-xdamn.. i really don't want readability to suffer more.. but if there's no other way, templating seems to be the reasonable option22:04
@sonney2kf-x, hmmhh I thought that readability is not suffering with this approach - I guess you should write it down once and then we will see if it is that bad22:07
f-xsonney2k: okay.. i'll write it now and then see..22:08
CIA-32shogun: Sergey Lisitsyn master * rabf5a9d / (src/libshogun/lib/arpack.cpp src/libshogun/lib/arpack.h): Fixes and improvements for arpack wrapper - http://bit.ly/mAzY2k23:08
CIA-32shogun: Sergey Lisitsyn master * rbf05670 / src/libshogun/preprocessor/ClassicMDS.cpp : ARPACK-related fixes for ClassicMDS - http://bit.ly/jKyXIO23:09
-!- blackburn [~blackburn@188.122.238.99] has quit [Ping timeout: 255 seconds]23:34
--- Log closed Mon Jul 04 00:00:43 2011
