--- Log opened Fri May 26 00:00:46 2017 | ||
-!- mikeling [uid89706@gateway/web/irccloud.com/x-ddqbkongnepvwlue] has joined #shogun | 01:31 | |
@sukey | Pull Request #3810 "Port kernelMachine to openmp" synchronized by MikeLing - https://github.com/shogun-toolbox/shogun/pull/3810 | 04:11 |
---|---|---|
mikeling | wiking: Hi, how could I specify the tests can run with multithread? Or should I add some code to test the multithread situation? | 07:12 |
mikeling | * specify the tests run with multithread | 07:15 |
mikeling | sorry :/ | 07:15 |
-!- iglesiasg [~iglesiasg@217.119.234.214] has joined #shogun | 09:13 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 09:13 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 09:30 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 09:30 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Ping timeout: 240 seconds] | 09:51 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 09:53 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 09:53 | |
@sukey | Wiki page: Heiko-Strathmann edited on shogun-toolbox/shogun by karlnapf | 10:33 |
@sukey | Wiki page: Heiko-Strathmann edited on shogun-toolbox/shogun by karlnapf | 10:38 |
mikeling | ping HeikoS | 10:57 |
mikeling | Hi, how can I verify that OpenMP really works? omp_get_num_threads() always returns 1 even when I export OMP_NUM_THREADS as 4. Can I force things to run multithreaded? | 10:59 |
-!- geektoni [~geektoni@93-34-128-38.ip49.fastwebnet.it] has joined #shogun | 11:25 | |
@wiking | mikeling, hey | 11:43 |
@wiking | omp_get_num_threads == 1 ? | 11:43 |
@wiking | even if OMP_NUM_THREADS=4? | 11:43 |
@wiking | it's strange | 11:43 |
@wiking | what machine you are testing on? | 11:44 |
mikeling | wiking: Ubuntu 17.04 | 11:57 |
mikeling | i7 with 4 cores | 11:58 |
@wiking | ok and if you run anything with openmp | 11:58 |
@wiking | it returns 1? | 11:58 |
@wiking | what's your compiler? | 11:58 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Ping timeout: 268 seconds] | 11:58 | |
mikeling | Gcc | 12:00 |
lisitsyn | eh? | 12:00 |
lisitsyn | :) | 12:00 |
mikeling | So? | 12:00 |
mikeling | Doesn't gcc support openmp? | 12:01 |
lisitsyn | it does 146% sure :P | 12:02 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 12:18 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 12:18 | |
@wiking | mikeling, mmm it's weird though | 12:26 |
@wiking | it should be multithreaded | 12:26 |
@wiking | can you try to gdb or debug somehow | 12:26 |
mikeling | sure, please wait a second | 12:27 |
@wiking | what is the parallel->get_num_threads() value by default | 12:27 |
@wiking | i cannot believe that we are back to square one | 12:27 |
@wiking | with the parallel execution | 12:27 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Remote host closed the connection] | 12:31 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 12:32 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 12:32 | |
mikeling | wiking: here is the output https://pastebin.mozilla.org/9022761 | 12:39 |
@wiking | mikeling, amazing | 12:40 |
@wiking | lemme debug | 12:40 |
mikeling | wiking: and parallel->get_num_threads also https://pastebin.mozilla.org/9022760 | 12:41 |
mikeling | and here is my compiler information gcc (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406 | 12:42 |
@wiking | crazy | 12:42 |
@wiking | agaaaaaaaaaaaaaain | 12:42 |
@wiking | craaaaaaaaaazy | 12:42 |
@wiking | this is develop branch right? | 12:43 |
* wiking we need a fucking unit test for this! :) | 12:43 | |
mikeling | no, it's on the kernel_machine_clean_up(with that pr) branch | 12:44 |
@wiking | yeah but | 12:44 |
mikeling | let me try it on develop | 12:44 |
@wiking | if you try it in a different place | 12:44 |
@wiking | where you get parallel->... | 12:44 |
@wiking | it should be the same | 12:44 |
mikeling | give me a few minutes to try it on develop, just for sure ;) | 12:45 |
@wiking | mmm | 12:45 |
@wiking | wait wait | 12:45 |
@wiking | mmmmmmmmmmmmmmm | 12:46 |
@wiking | ok wait a second :))))) | 12:46 |
@wiking | num_threads = omp_get_num_threads(); | 12:46 |
@wiking | is in a | 12:46 |
@wiking | +#ifdef HAVE_OPENMP | 12:46 |
@wiking | +#pragma omp single | 12:46 |
@wiking | section | 12:46 |
@wiking | :))) | 12:46 |
@wiking | mmm | 12:47 |
@wiking | #pragma omp single | 12:47 |
@wiking | { | 12:47 |
@wiking | num_threads=omp_get_num_threads(); | 12:47 |
@wiking | num_vec=idx_a2-idx_a1+1; | 12:47 |
@wiking | step=num_vec/num_threads; | 12:47 |
@wiking | } | 12:47 |
@wiking | is this supposed to return the available threads? | 12:47 |
@wiking | it should no? | 12:47 |
mikeling | so, only one thread is actually being detected? | 12:48 |
mikeling | yep | 12:48 |
mikeling | I believe so | 12:48 |
@wiking | just compiling | 12:48 |
@wiking | but yeah this is actually problematic | 12:48 |
@wiking | :((((( | 12:48 |
mikeling | I guess the purpose of omp_get_num_threads() is to return all available threads | 12:48 |
@wiking | it should work | 12:48 |
@wiking | but then again | 12:48 |
@wiking | it's a shit | 12:48 |
@wiking | because this is not controlled by | 12:49 |
@wiking | SHOGUN_NUM_THREADS | 12:49 |
@wiking | :( | 12:49 |
@wiking | so that needs patching | 12:49 |
mikeling | wiking: it's on the develop branch https://pastebin.mozilla.org/9022765 | 12:50 |
mikeling | :) | 12:50 |
@wiking | huh | 12:50 |
@wiking | ok | 12:50 |
@wiking | gimme some time to debug | 12:50 |
@wiking | i'll get back to you | 12:50 |
@wiking | with an urgent ticket as well | 12:50 |
@wiking | to get this fixed all over the place | 12:50 |
@wiking | :) | 12:50 |
mikeling | ok, thank u | 12:50 |
@wiking | just have to come up with a solution | 12:51 |
mikeling | please tell me if there is anything I can help with | 12:51 |
@wiking | kk | 12:52 |
@wiking | i'll do as soon as | 12:52 |
@wiking | In a sequential section of the program omp_get_num_threads returns 1 | 12:58 |
@wiking | :( | 12:58 |
@wiking | ok | 12:58 |
@wiking | testing some shit | 12:58 |
@wiking | mikeling, (gdb) p parallel->get_num_threads() | 12:58 |
@wiking | $6 = 8 | 12:59 |
@wiking | (gdb) p omp_get_num_threads() | 12:59 |
@wiking | $8 = 1 | 12:59 |
@wiking | :) | 12:59 |
@wiking | ok | 12:59 |
@wiking | we r fucked :)))0 | 12:59 |
@wiking | ok i'll come up with a fix | 12:59 |
@wiking | and then i'll ping u in that commit | 12:59 |
CaBa | hola wiking | 13:04 |
@wiking | CaBa, servus, wie gehst? | 13:26 |
@wiking | *gehtst | 13:26 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Quit: Leaving.] | 13:41 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 13:41 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 13:41 | |
CaBa | wiking: jó | 13:42 |
CaBa | wiking: i live in the wrong half of germany for 'servus' though ;) | 13:44 |
CaBa | wiking: how are you? | 13:44 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Ping timeout: 245 seconds] | 13:46 | |
@wiking | good good | 13:46 |
@wiking | busy | 13:46 |
@wiking | as usual | 13:47 |
@wiking | how about you | 13:47 |
@wiking | ? | 13:47 |
CaBa | wiking: thesis crisis as usual, but otherwise good as well ;) | 13:48 |
@wiking | :) | 13:49 |
CaBa | summer has finally come to berlin \o/ | 13:49 |
-!- geektoni [~geektoni@93-34-128-38.ip49.fastwebnet.it] has left #shogun [] | 13:57 | |
lisitsyn | wiking: we don't setup the number of threads in parallel? | 14:04 |
@wiking | lisitsyn, we do | 14:04 |
lisitsyn | what's the bug then? | 14:05 |
@wiking | mmm | 14:05 |
lisitsyn | we don't setup OMP number of threads I mean | 14:05 |
lisitsyn | ? | 14:05 |
@wiking | actually nothing i've realised | 14:05 |
@wiking | but we should be using there | 14:05 |
lisitsyn | no? | 14:05 |
@wiking | parallel | 14:05 |
@wiking | mikeling, so i did some debugging ... and in distance machine the setup works | 14:05 |
@wiking | so i'm wondering why you get =1 for threads in KernelMachine. | 14:06 |
@wiking | but we should definitely use another construct | 14:06 |
@wiking | than | 14:06 |
@wiking | num_threads = omp_get_num_threads(); | 14:06 |
lisitsyn | ah fck | 14:07 |
lisitsyn | I see | 14:07 |
lisitsyn | yes we don't ever use omp_get_num_threads I guess | 14:07 |
lisitsyn | we just don't setup it right? | 14:07 |
@wiking | mmm no | 14:09 |
@wiking | it should actually work | 14:09 |
@wiking | Parallel::set_num_threads sets | 14:09 |
@wiking | omp_set_num_threads | 14:09 |
lisitsyn | ok then I am missing everything | 14:09 |
lisitsyn | is it happening in tests? | 14:09 |
@wiking | so it should be consistent | 14:09 |
lisitsyn | fork? | 14:09 |
@wiking | i'm just not understanding | 14:09 |
@wiking | how does it =1 in case of mikeling | 14:09 |
lisitsyn | can it be that gtest forks and reset it this way? | 14:10 |
@wiking | nono | 14:10 |
@wiking | i've just tested it | 14:10 |
lisitsyn | ok no idea then | 14:10 |
lisitsyn | :D | 14:10 |
@wiking | ok it works good | 14:12 |
@wiking | with DistanceMachine | 14:12 |
@wiking | it was my fault in the beginning | 14:12 |
@wiking | i'll try patching now the develop :) | 14:12 |
@wiking | and use KernelMachine | 14:12 |
@wiking | how do you fetch the patch of a PR? | 14:13 |
@wiking | mikeling, | 14:20 |
@wiking | Breakpoint 1, shogun::CKernelMachine::apply_get_outputs () at ../src/shogun/machine/KernelMachine.cpp:348 | 14:20 |
@wiking | 348     num_threads = omp_get_num_threads(); | 14:20 |
@wiking | (gdb) n | 14:20 |
@wiking | 349     step = num_vectors / num_threads; | 14:20 |
@wiking | (gdb) p num_threads | 14:20 |
@wiking | $1 = 3 | 14:20 |
mikeling | wiking: pong | 14:20 |
@wiking | (gdb) r --gtest_filter=KRRNystrom.* | 14:20 |
@wiking | and i started gdb with | 14:20 |
@wiking | SHOGUN_NUM_THREADS=3 gdb /home/wiking/shogun/build/bin/shogun-unit-test | 14:20 |
@wiking | if i run without SHOGUN_NUM_THREADS=3 | 14:21 |
mikeling | wiking: ok, let me have a try | 14:21 |
mikeling | could I use SHOGUN_NUM_THREADS when compiling it? e.g. cmake SHOGUN_NUM_THREADS=3 ../ | 14:22 |
@wiking | (gdb) p num_threads | 14:22 |
@wiking | $1 = 8 | 14:22 |
mikeling | alright | 14:22 |
@wiking | in case i dont use SHOGUN_NUM_THREADS | 14:22 |
@wiking | so | 14:22 |
@wiking | i mean SHOGUN_NUM_THREADS is a runtime var | 14:22 |
@wiking | you dont need to compile anything | 14:22 |
@wiking | just make sure that cmake detects OpenMP | 14:22 |
@wiking | i.e. check in your build | 14:22 |
@wiking | src/shogun/lib/config.h | 14:23 |
@wiking | #define HAVE_OPENMP 1 | 14:23 |
@wiking | so it should work | 14:25 |
@wiking | btw make sure that you fix the typo in #pragma opm single | 14:25 |
mikeling | hmmm but the HAVE_OPENMP do equal to 1 | 14:25 |
@wiking | that should be automatically generated in your | 14:26 |
@wiking | src/shogun/lib/config.h | 14:26 |
@wiking | i.e. build/src/shogun/lib/config.h | 14:26 |
mikeling | yeah, I mean I found it in config.h and it equal to one | 14:26 |
@wiking | cool | 14:27 |
@wiking | so you should have openmp | 14:27 |
@wiking | so in that case num_threads should equal to your num cores | 14:27 |
-!- iglesiasg [~iglesiasg@217.119.234.214] has quit [Ping timeout: 240 seconds] | 15:00 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 15:01 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 15:02 | |
-!- iglesiasg [~iglesiasg@217.119.234.214] has joined #shogun | 15:03 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 15:03 | |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has joined #shogun | 15:03 | |
-!- iglesiasg [~iglesiasg@217.119.234.214] has quit [Ping timeout: 240 seconds] | 15:08 | |
-!- iglesiasg [~iglesiasg@217.119.234.214] has joined #shogun | 15:10 | |
-!- mode/#shogun [+o iglesiasg] by ChanServ | 15:10 | |
mikeling | wiking: ok, it's multithreaded for KRRNystrom.*, but it = 1 in CrossValidation_multithread.* | 15:34 |
@wiking | mmm | 15:34 |
@wiking | i think in case of crossval | 15:34 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Ping timeout: 246 seconds] | 15:34 | |
@wiking | it is forced to be 1 | 15:34 |
@wiking | imo | 15:34 |
@wiking | indeed //#pragma omp parallel for | 15:35 |
@wiking | because it's full with bugs :) | 15:35 |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 15:35 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 15:35 | |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has quit [Remote host closed the connection] | 15:56 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Quit: Leaving.] | 15:57 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has joined #shogun | 16:06 | |
-!- mode/#shogun [+o HeikoS] by ChanServ | 16:06 | |
mikeling | wiking: I got this error https://pastebin.mozilla.org/9022772 | 16:51 |
mikeling | with valgrind --tool=helgrind ./bin/shogun-unit-test --gtest_filter=KRRNystrom.* | 16:51 |
mikeling | does that mean we have a data race for CSignal::cancel_computations()? | 16:51 |
@iglesiasg | it looks like it | 16:53 |
@iglesiasg | it might be a false positive | 16:54 |
-!- TingMiao [uid229534@gateway/web/irccloud.com/x-kxuqzcdhsbnztlow] has joined #shogun | 17:08 | |
@iglesiasg | TingMiao: hey! I just read your e-mail. Did you want to add anything else to point 3? It finishes with "After that, I should" | 17:15 |
TingMiao | Let me have a look! | 17:18 |
TingMiao | iglesiasg: | 17:19 |
TingMiao | iglesiasg: yes I see it is not complete.. sorry about that | 17:19 |
@iglesiasg | no problem :) | 17:20 |
@iglesiasg | if you were planning to add anything else, just send a follow-up | 17:20 |
TingMiao | iglesiasg: Do you think there is any point that I missed? | 17:22 |
@iglesiasg | TingMiao: nothing missed, all good! | 17:25 |
@iglesiasg | TingMiao: regarding point 3, before you start working on a notebook for this "further ideas" part, let's sync again | 17:27 |
TingMiao | iglesiasg: Will do that! | 17:27 |
@iglesiasg | so, once you have the "I should pick up the most related ideas and think about how those ideas can be applied to our project.", write it down (a couple of paragraphs should be enough), we discuss about it, and then we go to the proof of concept notebook | 17:28 |
TingMiao | okay ;-) | 17:28 |
@iglesiasg | TingMiao: I am going through the https://github.com/tingpan/shogun-project-demos/blob/master/kmeans.ipynb | 17:28 |
@iglesiasg | I will make a few comments in github | 17:29 |
TingMiao | Sure ;-) | 17:30 |
TingMiao | iglesiasg: Thank you! | 17:32 |
@iglesiasg | TingMiao: you are welcome | 17:33 |
@iglesiasg | TingMiao: do you get notifications and can see them? | 17:33 |
TingMiao | iglesiasg: yes, I received the email notification | 17:34 |
@iglesiasg | TingMiao: great! Let me know if anything is not clear | 17:37 |
TingMiao | iglesiasg: Thank you very much! I will definitely contact you if anything is unclear! | 17:42 |
-!- Positron_ [~textual@c-73-162-174-23.hsd1.ca.comcast.net] has joined #shogun | 17:46 | |
-!- Positron_ [~textual@c-73-162-174-23.hsd1.ca.comcast.net] has quit [Client Quit] | 17:48 | |
@iglesiasg | TingMiao: just finished going through the notebook. it is interesting stuff, looking forward to continuing seeing more about the project | 17:59 |
@iglesiasg | I am off now, ttyl | 18:00 |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has joined #shogun | 18:01 | |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has quit [Remote host closed the connection] | 18:06 | |
-!- iglesiasg [~iglesiasg@217.119.234.214] has quit [Quit: leaving] | 18:09 | |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has joined #shogun | 18:12 | |
-!- leagoetz [~leagoetz@x4db43ef6.dyn.telefonica.de] has quit [Client Quit] | 18:14 | |
-!- HeikoS [~heiko@x4db43ef6.dyn.telefonica.de] has quit [Ping timeout: 260 seconds] | 18:50 | |
--- Log closed Sat May 27 00:00:47 2017 |
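
The behaviour discussed at 12:58 (omp_get_num_threads() returns 1 in a sequential section) can be reproduced outside Shogun. The following is only a minimal sketch, not Shogun code: `team_size`, `Observed` and `observe` are hypothetical names made up for illustration, and the `#pragma omp single` block merely mirrors the shape of the snippet pasted at 12:47. It assumes a compiler with OpenMP support (e.g. gcc with -fopenmp); built without OpenMP, the pragmas are ignored and every value falls back to a single thread.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not part of Shogun): size of the thread team
// as seen from the point where it is called.
static int team_size()
{
#ifdef _OPENMP
    return omp_get_num_threads();
#else
    return 1; // built without OpenMP: there is only ever one thread
#endif
}

// Hypothetical result type used only for this illustration.
struct Observed
{
    int outside; // team size seen in the sequential part
    int inside;  // team size seen inside the parallel region
    int step;    // chunk size per thread, as in the 12:47 snippet
};

Observed observe(int num_vec)
{
    Observed r;
    r.outside = team_size(); // sequential section: this is 1
    r.inside = 0;
    r.step = 0;

#pragma omp parallel
    {
        // Exactly one thread of the team runs this block; the others
        // wait at the implicit barrier at its end.
#pragma omp single
        {
            r.inside = team_size();
            r.step = num_vec / r.inside;
        }
    }
    return r;
}
```

Calling `observe(100)` from ordinary sequential code should give `outside == 1` regardless of OMP_NUM_THREADS, while `inside` reflects the team that the runtime actually created for the parallel region. That matches the gdb sessions in the log: a breakpoint outside any parallel region sees 1, one inside sees the configured thread count.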