--- Log opened Sun Sep 25 00:00:25 2011 | ||
-!- blackburn [~blackburn@31.28.44.65] has quit [Quit: Leaving.] | 00:59 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has quit [Ping timeout: 260 seconds] | 06:00 | |
-!- sonne|work [~sonnenbu@194.78.35.195] has joined #shogun | 06:14 | |
-!- mrsrikanth [~mrsrikant@59.92.79.76] has joined #shogun | 07:11 | |
-!- mrsrikanth [~mrsrikant@59.92.79.76] has quit [Quit: Leaving] | 08:35 | |
-!- blackburn [~blackburn@31.28.44.65] has joined #shogun | 10:53 | |
blackburn | sonney2k: why do we have no -msse enabled? | 11:36 |
---|---|---|
blackburn | or msse2 | 11:37 |
blackburn | or msse3 | 11:37 |
blackburn | sonney2k: I have enabled -msse -msse2 -msse3 | 11:53 |
blackburn | before: | 11:53 |
blackburn | SHOGUN Took 5.209960s | 11:53 |
blackburn | after: | 11:53 |
blackburn | SHOGUN Took 3.441315s | 11:53 |
CIA-3 | shogun: Sergey Lisitsyn master * r3f81b07 / (src/configure src/shogun/preprocessor/KernelPCA.h): Added MSSE{1,2,3} flags to configure - http://git.io/LKTUKA | 11:57 |
blackburn | oh | 11:59 |
blackburn | forgot to commit KernelPCA separately :) | 11:59 |
blackburn | sonney2k: arpack dsymv issue resolved! | 12:59 |
CIA-3 | shogun: Sergey Lisitsyn master * r134f863 / (4 files): Improved memory usage efficiency in DR preprocessor - http://git.io/mDfTTQ | 13:34 |
-!- mrsrikanth [~mrsrikant@59.92.84.60] has joined #shogun | 14:30 | |
blackburn | sonney2k: ! | 16:24 |
blackburn | sonney2k: what the fuck is in cpudetection?! it is not working anyhow.. | 16:26 |
blackburn | while sse is detected ok I have no -msse | 16:27 |
blackburn | and 3dnow too | 16:27 |
blackburn | k6? k7? :D | 16:29 |
CIA-3 | shogun: Sergey Lisitsyn master * rd6ac38d / src/configure : Fixed mistype at configure, changed flags - http://git.io/dmJMOw | 16:31 |
CIA-3 | shogun: Sergey Lisitsyn master * r2642355 / src/shogun/lib/DataType.h : Added [] for SGMatrix - http://git.io/xD20kw | 16:31 |
blackburn | what is O9?? | 16:59 |
CIA-3 | shogun: Sergey Lisitsyn master * r16e761a / (6 files in 2 dirs): Updated [] for SGMatrix and made it used by DR preprocessors - http://git.io/xvUnrw | 17:09 |
CIA-3 | shogun: Sergey Lisitsyn master * r75685b6 / src/shogun/preprocessor/LocallyLinearEmbedding.cpp : Removed junk from LLE - http://git.io/NB-MsA | 17:40 |
@sonney2k | blackburn, re | 19:51 |
blackburn | out for dinner :) | 19:51 |
@sonney2k | cu | 19:51 |
@sonney2k | blackburn, I am reverting your msse patch | 19:53 |
@sonney2k | blackburn, what are you doing??! Look at 3f81b07f4716de1b5c40f801d81a62bf071b7732 - it says Added MSSE{1,2,3} flags to configure | 19:55 |
@sonney2k | but in fact it changes stuff in KernelPCA.h | 19:55 |
blackburn | sonney2k: back | 19:56 |
blackburn | sonney2k: why revert it? | 19:56 |
@sonney2k | we are using march=native | 19:56 |
@sonney2k | that should enable every possible speedup | 19:56 |
blackburn | it doesn't | 19:57 |
blackburn | sonney2k: with sse enabled things became 1.5x faster | 19:57 |
blackburn | it doesn't enable sse/sse2/sse3 | 19:57 |
blackburn | I can say it for sure | 19:58 |
@sonney2k | it does enable it | 19:58 |
@sonney2k | I am really sure | 19:58 |
@sonney2k | I am not sure if mfpmath sse is set though | 19:58 |
@sonney2k | anyhow please don't do such commits | 19:59 |
blackburn | such like what? | 19:59 |
@sonney2k | I mean don't write you did 'X' and then do 'X' and 'Y' | 19:59 |
@sonney2k | do git commit configure | 19:59 |
@sonney2k | and git commit KernelPCA.h | 19:59 |
@sonney2k | *separately* | 19:59 |
blackburn | sonney2k: if you read my message here you would know I did it by mistake | 20:00 |
blackburn | okay I will check if removing sse won't change speed | 20:00 |
@sonney2k | ok | 20:00 |
@sonney2k | blackburn, the gcc doc says mfpmath=sse is enabled on x86_64 by default | 20:01 |
blackburn | sonney2k: I have x86_32 | 20:01 |
blackburn | sonney2k: I'm not sure sse is enabled with native | 20:02 |
@sonney2k | blackburn, it is - just not for math stuff | 20:02 |
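One way to settle whether -march=native turned on SSE for a given build is to ask the compiler itself. The sketch below assumes GCC or Clang, which predefine `__SSE__`, `__SSE2__` and `__SSE3__` when the corresponding instruction sets are enabled. The function name `enabled_sse_sets` is hypothetical and not part of shogun.

```cpp
#include <string>

// Hypothetical helper: lists the SSE instruction sets this translation unit
// was compiled with. GCC and Clang predefine each macro below only when the
// matching instruction set is enabled (for example via -msse3 or -march=native).
std::string enabled_sse_sets() {
    std::string out;
#ifdef __SSE__
    out += "sse ";
#endif
#ifdef __SSE2__
    out += "sse2 ";
#endif
#ifdef __SSE3__
    out += "sse3 ";
#endif
    // Fall back to a marker so the result is never empty.
    return out.empty() ? "none" : out;
}
```

Building this once with -march=native and once without shows which sets the flag adds on a particular machine. It says nothing about whether scalar floating-point math runs on the SSE unit, which is controlled separately by -mfpmath.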
blackburn | sonney2k: eh? | 20:02 |
blackburn | sonney2k: what is O9? | 20:03 |
blackburn | and why not O3? | 20:03 |
@sonney2k | btw http://ubuntuforums.org/showthread.php?t=1477356 suggests that mfpmath=both is even faster | 20:08 |
@sonney2k | blackburn, what did you do so that arpack works now on cheng's machine? | 20:09 |
blackburn | ../../shogun/preprocessor/Isomap.h:97: warning: inline function ‘int32_t shogun::CIsomap::get_k() const’ used but never defined | 20:09 |
blackburn | any suggestion why I got that? | 20:09 |
blackburn | sonney2k: I guess removing unnecessary #include <cblas.h> helped | 20:09 |
@sonney2k | blackburn, ok then it must have been some bad interaction between arpack/atlas | 20:10 |
@sonney2k | cblas | 20:10 |
blackburn | sonney2k: arpack is not involved here | 20:10 |
blackburn | the error was related to my code | 20:10 |
@sonney2k | blackburn, why not? it was arpack where cheng's stuff failed | 20:11 |
blackburn | sonney2k: my arpack *wrapper* | 20:11 |
@sonney2k | and? | 20:11 |
@sonney2k | it includes arpack header or? | 20:11 |
blackburn | header? | 20:11 |
blackburn | it is fortran 77 lib :) | 20:11 |
@sonney2k | I don't understand then why it works now | 20:12 |
blackburn | sonney2k: lapack.h includes cblas.h | 20:12 |
blackburn | arpack.h did too | 20:12 |
blackburn | the only reason I can think of | 20:12 |
blackburn | sonney2k: can you remind me how to impl inline function in .cpp? | 20:13 |
@sonney2k | http://ondioline.org/mail/cmov-a-bad-idea-on-out-of-order-cpus | 20:14 |
@sonney2k | cmov should go | 20:14 |
blackburn | okay | 20:14 |
blackburn | sonney2k: there is a lot of useless cpu detection stuff, would you remove it? | 20:14 |
@sonney2k | what is useless? | 20:14 |
blackburn | sonney2k: I will check if sse is really enabled on my machine | 20:15 |
@sonney2k | btw -ffast-math might help a lot | 20:15 |
@sonney2k | -msse is not needed | 20:15 |
blackburn | e.g. detection of amd k6 | 20:15 |
blackburn | :D | 20:15 |
@sonney2k | well shogun exists for some time - so it will be tuned for old cpu's too | 20:15 |
@sonney2k | but fast-math might cause trouble - if we enable that we should check if the test suite still works | 20:15 |
blackburn | This option should never be turned on by any -O option since it can result in incorrect output for programs which depend on an exact implementation of IEEE or ISO rules/specifications for math functions | 20:16 |
blackburn | lets avoid it | 20:16 |
@sonney2k | read on please | 20:16 |
@sonney2k | On Darwin systems, the math library never sets "errno". There is | 20:16 |
@sonney2k | therefore no reason for the compiler to consider the possibility | 20:16 |
@sonney2k | that it might, and -fno-math-errno is the default. | 20:16 |
blackburn | sonney2k: we can turn on some of the things contained in -ffast-math | 20:17 |
blackburn | e.g. we can't enable finite math | 20:17 |
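The concern about finite math can be made concrete. -ffast-math implies -ffinite-math-only, which lets the compiler assume that NaN and infinity never occur, so checks such as `std::isnan` may be optimized away. The sketch below is only an illustration under that assumption. Both function names are hypothetical, and `__FINITE_MATH_ONLY__` is the macro GCC and Clang set to 1 when the option is active.

```cpp
#include <cmath>

// True when the build assumes finite math (-ffinite-math-only, which
// -ffast-math implies). GCC and Clang define the macro as 0 or 1.
constexpr bool finite_math_only() {
#if defined(__FINITE_MATH_ONLY__) && __FINITE_MATH_ONLY__
    return true;
#else
    return false;
#endif
}

// Produces a NaN at run time and reports whether std::isnan still sees it.
// In a finite-math build the compiler is allowed to fold the check to false.
bool nan_is_detected() {
    volatile double zero = 0.0;        // volatile blocks constant folding of the division
    double not_a_number = zero / zero; // 0/0 is NaN under IEEE 754
    return std::isnan(not_a_number);
}
```

Running `nan_is_detected()` as part of the test suite under each candidate flag set would be one cheap way to notice when an optimization option has changed NaN handling.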
@sonney2k | how about mfpmath=both? | 20:19 |
blackburn | sonney2k: I will check in a min | 20:20 |
blackburn | just need to finish .h->.cpp moving | 20:20 |
@sonney2k | my take is we enable everything when the test suite still passes | 20:20 |
@sonney2k | ok | 20:20 |
@sonney2k | regarding your err ./../shogun/preprocessor/Isomap.h:9 - you probably have to remove inline ? | 20:21 |
blackburn | sonney2k: already did | 20:21 |
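The "used but never defined" warning appears when a member function is declared `inline` in the header while its body lives only in a .cpp file, because an inline function must be defined in every translation unit that uses it. A minimal sketch of the two usual fixes follows. `IsomapSketch` is a hypothetical stand-in, not the real CIsomap class.

```cpp
#include <cstdint>

// --- what would go in the header ---
class IsomapSketch {
public:
    explicit IsomapSketch(int32_t k) : m_k(k) {}

    // Fix 1: declare without `inline` and put the body in the .cpp file.
    int32_t get_k() const;

    // Fix 2: keep the body in the class definition, where it is implicitly inline.
    void set_k(int32_t k) { m_k = k; }

private:
    int32_t m_k;
};

// --- what would go in the .cpp file ---
int32_t IsomapSketch::get_k() const {
    return m_k;
}
```

With the body in the .cpp file the compiler is still free to inline calls made within that same file, and link-time optimization can extend this across files where it is enabled.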
blackburn | sonney2k: so you didn't answer, why not O3? | 20:21 |
@sonney2k | I just used the highest possible level | 20:22 |
@sonney2k | feel free to change it to O3 | 20:22 |
blackburn | ehm? | 20:22 |
blackburn | sonney2k: some of the options enabled now are O3-level | 20:22 |
blackburn | so maybe just enabling all of them would make things faster | 20:22 |
blackburn | I feel like a speed maniac today hah | 20:23 |
@sonney2k | I don't understand what you are saying | 20:23 |
blackburn | nevermind | 20:23 |
blackburn | sonney2k: have you made progress on python interface? | 20:24 |
@sonney2k | ? | 20:24 |
@sonney2k | you mean the array_interface? | 20:25 |
blackburn | yes | 20:25 |
@sonney2k | I learned that one should implement the buffer protocol interface http://docs.python.org/dev/c-api/buffer.html#PyObject_GetBuffer | 20:26 |
@sonney2k | and now I don't know how to get a handle to the PyObject that swig creates | 20:26 |
blackburn | bad | 20:27 |
@sonney2k | to modify the PyTypeObject | 20:27 |
blackburn | array__ looked better :) | 20:27 |
blackburn | sonney2k: there are some mentions of sse not always being enabled with native | 20:29 |
@sonney2k | blackburn, well no | 20:29 |
@sonney2k | the buffer protocol is much better | 20:30 |
@sonney2k | it is default in python2.6 / 3.x | 20:30 |
@sonney2k | and supported by more than numpy | 20:30 |
blackburn | sonney2k: I removed sse, will check if it became slower | 20:31 |
-!- mrsrikanth [~mrsrikant@59.92.84.60] has quit [Quit: Leaving] | 20:32 | |
blackburn | sonney2k: okay, I believe you, sse is enabled | 20:34 |
blackburn | sonney2k: predictive commoning slows down compilation a lot.. | 20:40 |
blackburn | sonney2k: fpmath=both is slower | 21:31 |
@sonney2k | seems like a lot depends on the actual algorithm you are trying to run | 21:34 |
blackburn | sonney2k: I'm running LLE | 21:35 |
@sonney2k | what is the difference btw? | 21:35 |
blackburn | will tell you in 5 min | 21:35 |
blackburn | when it is compiled without fpmath | 21:36 |
blackburn | sonney2k: is there any way to inline functions in .cpp? | 21:36 |
@sonney2k | ? | 21:36 |
@sonney2k | it is done automatically | 21:36 |
blackburn | it doesn't work when I inline functions | 21:36 |
blackburn | ah | 21:36 |
blackburn | okay | 21:36 |
CIA-3 | shogun: Sergey Lisitsyn master * ra6426c4 / (6 files): Moved set/get code from .h to .cpp for DR preprocessors - http://git.io/PT7XcA | 21:37 |
blackburn | sonney2k: can we use different options for swig than the ones we use for libshogun? | 21:39 |
@sonney2k | currently not | 21:39 |
blackburn | sonney2k: I guess we should - because we don't need to optimize math, etc | 21:40 |
blackburn | interfaces would compile faster | 21:40 |
@sonney2k | that is true - if we don't have functions in .h files basically nothing needs to be that fast - -O2 should be sufficient | 21:41 |
blackburn | sonney2k: I have moved all the .h code to .cpp for my preprocs | 21:41 |
@sonney2k | k | 21:44 |
@sonney2k | I guess even O0 or O1 could be OK | 21:44 |
blackburn | agree | 21:44 |
blackburn | sonney2k: do you think my dimreduction 'subtoolbox' has any chance to be published on JMLR mloss? | 21:45 |
blackburn | I'm currently looking, these guys do more non-trivial things.. | 21:46 |
@sonney2k | if it is the best dimred oss toolbox available - sure | 21:46 |
blackburn | sonney2k: if not? | 21:47 |
@sonney2k | they will complain | 21:47 |
blackburn | sonney2k: what is the best toolbox? | 21:48 |
@sonney2k | I don't know | 21:48 |
blackburn | my algos are the fastest ones.. that is the only thing I know | 21:48 |
blackburn | sonney2k: http://jmlr.csail.mit.edu/papers/volume9/klanke08a/klanke08a.pdf it looks pretty 'easy' but accepted | 22:10 |
blackburn | but algo is non-trivial I guess | 22:10 |
@sonney2k | yes non-trivial | 22:10 |
@sonney2k | 1 non-trivial algorithm is enough btw | 22:11 |
@sonney2k | btw that was the first submission we got :) | 22:11 |
blackburn | I have only standard algos | 22:12 |
blackburn | but they are really fast :) the only advantage | 22:12 |
blackburn | sonney2k: sse fpmath is a little faster.. | 22:17 |
@sonney2k | ok so then add just this flag to COMP_OPTS | 22:18 |
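Whether a build really does its scalar math on the SSE unit can be checked from inside the program. The sketch below assumes GCC or Clang on x86, where `__SSE_MATH__` and `__SSE2_MATH__` are predefined when float and double arithmetic respectively use SSE, and it reads the standard `FLT_EVAL_METHOD` macro from `<cfloat>`. The function names are hypothetical.

```cpp
#include <cfloat>

// True when the compiler reports that scalar floating-point arithmetic is
// done in SSE registers (GCC/Clang macros; other compilers return false here).
constexpr bool scalar_math_uses_sse() {
#if defined(__SSE_MATH__) || defined(__SSE2_MATH__)
    return true;
#else
    return false;
#endif
}

// Standard indicator of intermediate precision:
//   0  = operations are evaluated in the type's own precision,
//   2  = everything is evaluated as long double (typical for x87 math),
//  -1  = indeterminate.
int float_eval_method() {
    return FLT_EVAL_METHOD;
}
```

On a 32-bit x86 build one would expect these values to differ with and without -mfpmath=sse, which gives a quick confirmation that the flag added to COMP_OPTS took effect.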
CIA-3 | shogun: Sergey Lisitsyn master * rf634c62 / src/configure : Removed unnecessary sse flags - http://git.io/79Q8cg | 22:19 |
@sonney2k | blackburn, btw we have already COMPFLAGS_SWIG | 22:21 |
blackburn | sonney2k: what should be here? O1 only? | 22:23 |
@sonney2k | good question... I think we might need -fPIC on some archs too | 22:23 |
@sonney2k | maybe -fPIC -g -O | 22:24 |
@sonney2k | but -O should not be there when disable-optimization is enabled | 22:24 |
blackburn | sonney2k: regarding mloss, can I treat a part of shogun as not-a-shogun? :) | 22:24 |
@sonney2k | I have no idea how to do that practically | 22:25 |
blackburn | sonney2k: I see, it needs some customization | 22:25 |
blackburn | I guess I have zero chances | 22:25 |
blackburn | sad | 22:25 |
@sonney2k | I think there was some weka extension recently | 22:25 |
@sonney2k | not that recently maybe | 22:25 |
@sonney2k | but check how they did it | 22:25 |
blackburn | sonney2k: well there is no mention of NLDR in your mloss paper | 22:26 |
@sonney2k | NLDR? | 22:26 |
@sonney2k | ahh | 22:26 |
blackburn | nonlinear dim reduction | 22:26 |
blackburn | sonney2k: that is the weka paper http://jmlr.csail.mit.edu/papers/volume11/bifet10a/bifet10a.pdf | 22:26 |
blackburn | sonney2k: one more reason to separate from preprocessors haha | 22:27 |
blackburn | sonney2k: where can I place tutorial on dimensionality reduction? | 22:37 |
blackburn | python_modular examples? | 22:37 |
blackburn | or applications? | 22:37 |
@sonney2k | hmmhh or doc/tutorial ? | 22:38 |
blackburn | sonney2k: not doc, tutorial with python | 22:38 |
@sonney2k | or new dir tutorial/python ? | 22:44 |
blackburn | no idea, okay will do first | 22:45 |
@sonney2k | http://matt.eifelle.com/2008/11/04/exposing-an-array-interface-with-swig-for-a-cc-structure/ | 22:54 |
@sonney2k | hah | 22:55 |
@sonney2k | that is how I can do ti | 22:55 |
@sonney2k | it | 22:55 |
blackburn | sonney2k: but not buffer interface? | 22:58 |
@sonney2k | it solves the problem on how to access the pyobject | 22:59 |
blackburn | I see | 23:00 |
-!- blackburn [~blackburn@31.28.44.65] has quit [Quit: Leaving.] | 23:52 | |
--- Log closed Mon Sep 26 00:00:29 2011 |