IRC logs of #shogun for Saturday, 2013-06-15

--- Log opened Sat Jun 15 00:00:46 2013
-!- lisitsyn [~lisitsyn@109-226-90-135.clients.tlt.100megabit.ru] has quit [Remote host closed the connection]01:18
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has quit [Ping timeout: 240 seconds]02:21
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has joined #shogun02:34
-!- foulwall [~foulwall@2001:da8:215:503:d890:fd85:d362:9ad4] has joined #shogun03:12
-!- nube [~rho@49.244.57.163] has joined #shogun04:27
-!- foulwall [~foulwall@2001:da8:215:503:d890:fd85:d362:9ad4] has quit [Remote host closed the connection]04:30
-!- gsomix_ [~gsomix@95.67.173.36] has quit [Read error: Connection reset by peer]08:29
-!- gsomix_ [~gsomix@95.67.168.164] has joined #shogun08:43
-!- gsomix_ is now known as gsomix08:56
gsomixgood morning08:56
-!- iglesiasg [d58f3251@gateway/web/freenode/ip.213.143.50.81] has joined #shogun09:50
-!- mode/#shogun [+o iglesiasg] by ChanServ09:51
-!- lisitsyn [~lisitsyn@109-226-90-135.clients.tlt.100megabit.ru] has joined #shogun11:06
@sonney2klisitsyn, then please finish cmaking!12:46
lisitsynsonney2k: I am a bit stuck12:46
@sonney2kiglesiasg, please take care of sending around the doodle results!12:46
@wikingyo13:04
@wikinglisitsyn: ?13:04
lisitsynwiking: I am a bit lost but will continue later :)13:05
@iglesiasgsonney2k: mail with the hour decided, right?13:05
@iglesiasgI will send it around soon then13:08
-!- iglesiasg [d58f3251@gateway/web/freenode/ip.213.143.50.81] has quit [Quit: Page closed]13:08
@wikingany ideas how to ensure (and support most feature types) that the feature vectors are the same?13:28
-!- HeikoS [~heiko@176.248.212.166] has joined #shogun13:43
-!- mode/#shogun [+o HeikoS] by ChanServ13:43
-!- lambday [67157f4c@gateway/web/freenode/ip.103.21.127.76] has joined #shogun14:12
lambdayHeikoS:  there??14:12
@HeikoSlambday: yes14:12
@HeikoSso thinking about these tasks that we create14:12
@HeikoSthey have to be small in memory14:12
@HeikoSotherwise we cannot store many of them14:12
lambdayyes14:13
@HeikoSso the question is: can we sample the trace vectors on the fly?14:13
@HeikoSwhich means that the job has a member which is the sampler14:13
@HeikoSwhich is called on demand14:13
@HeikoSso that these samples are produced one-by-one14:13
lambdaybut we obtain them using graph coloring14:14
lambdayI mean, for the probing case14:14
@HeikoSyes14:14
@HeikoSI mean we would have to change the interfaces slightly14:14
@HeikoSsampler only returns one sample14:14
lambdayalright14:14
lambdaythat's better14:14
@HeikoSbut does that make sense?14:14
@HeikoSnot sure14:14
@HeikoSso the probing vectors at least come in blocks14:15
@HeikoSalso, many jobs share the same sample14:15
@HeikoShow to ensure that?14:15
lambdayit's being really tricky :(14:16
@HeikoSwe might just store them all14:16
@HeikoSI dont really see problems there14:16
lambdayhow about we just don't store samples twice? samples gotta be stored anyway..14:16
@HeikoSalthough it would be nicer not to14:16
lambdaybut we just do it in the sgmatrix14:17
lambdayugly, but memory efficient14:17
@HeikoSif the double memory is only for a short time its not a problem14:17
@HeikoSbut the fact *that* we store them might be one14:17
lambdaydidn't get it :(14:18
lambdaystore them might be one?14:18
-!- iglesiasg [d58f3208@gateway/web/freenode/ip.213.143.50.8] has joined #shogun14:18
@HeikoSstoring all those vectors might be a problem14:18
@HeikoSsince memory intensive14:18
@HeikoSand for serial computation thats not even needed14:18
lambdaybut doing things differently for serial and parallel isn't good, right?14:19
lambdayfor parallel, we do need the vectors with them14:19
@HeikoSlambday: we should be able to handle this within the framework14:19
@HeikoSso there is actually another problem14:19
@HeikoSimagine, you want to compute 100 log-det estimates with each 100 trace samples14:20
@HeikoSin dimension 10^614:20
@HeikoSthen our approach needs 100 times more memory14:20
@HeikoSsince we are storing all of those samples14:20
@HeikoSthat should definitely not be happening14:20
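For context, the memory figure above works out roughly as follows, assuming every trace sample is stored as a dense double-precision vector:

    \underbrace{100}_{\text{estimates}} \times \underbrace{100}_{\text{samples each}} \times \underbrace{10^{6}}_{\text{dimension}} \times 8\,\text{bytes} \approx 80\,\text{GB},
    \qquad \text{vs.} \quad 100 \times 10^{6} \times 8\,\text{bytes} \approx 0.8\,\text{GB for a single estimate's samples.}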
lambdaybut one thing... for probing vectors case, vectors are sparse.. we don't need SGMatrix for storing that!14:20
@HeikoStrue ...14:21
@HeikoSargh14:21
@HeikoSand SGVector and SGSparseVector have no common base14:21
@HeikoS*sigh*14:21
lambdaynope :(14:21
@HeikoSso another issue14:22
@HeikoSwe cannot make the vectors sparse14:22
@HeikoSbut that is no problem14:22
@HeikoSproblem is storing the samples, we have to change this I am afraid14:22
lambday:(14:23
@HeikoSso about these jobs14:23
@HeikoSwe need to set them up in such way that they do not store any data14:23
@HeikoSonly references to data that is shared among them all14:24
@HeikoSand then upon calling compute()14:24
lambdayso, in the compute thing, each job gets a vector, gets the job done, and returns14:24
@HeikoSthey instantiate their data/problem14:24
@HeikoSno rather the jobs have the information they need to create the vector14:25
@HeikoSand then its only created upon calling their compute method14:25
lambdaycreate the vector *within* the job?14:25
@HeikoSyes14:25
@HeikoSgive to it a reference to a sampler14:25
lambdaybut that doesn't make them independent :(14:25
@HeikoShow do you mean that?14:25
lambdayeach job should get one sample.. and the samples are generated in a bunch (using coloring)14:26
lambdaynot one by one14:26
lambdayright?14:26
@HeikoSsamplers?14:27
@HeikoSplural?14:27
lambdayummm..14:28
@HeikoSif each job has a reference to an instance of some sampler14:28
@HeikoSmight be the same one actually14:28
@HeikoSthen it might just generate a sample in its compute method14:28
@HeikoSwouldnt that work?14:29
@HeikoSsay Gaussian samples14:29
lambdayfor gaussian that's okay14:29
@HeikoSok14:29
lambdaybut for probing?14:29
@HeikoSand for probeing?14:30
@HeikoSproibing14:30
lambdaylol14:30
@HeikoSlets think14:30
lambday:P14:30
@HeikoSprobing :)14:30
lambdaygraph coloring info should be there in the sampler14:30
lambdaywhatever it is14:30
@HeikoSyes14:30
lambdayand then that uses the info and generates a bunch of samples14:30
@HeikoSbut the probing vectors are generated in groups aren't they?14:30
lambdayyep14:30
@HeikoShowever14:30
lambdayonce you color the graph and generate a bunch of samples based on that info14:31
@HeikoSthey are independent (otherwise, the monte carlo estimate of the trace wouldnt work)14:31
@HeikoSso one can also just create one probing vector14:31
@HeikoSI might be wrong here14:31
@HeikoSgotta check my matlab code14:31
lambdayI'm not sure :(14:31
@HeikoS"J. Tang and Y. Saad, A probing method for computing the diagonal of the matrix14:33
@HeikoSinverse" (2010).14:33
@HeikoSthis is the paper14:33
lambdayit must be doable... cause the bunch of samples are generated in a loop must be...14:33
lambdayso we can have some progress info14:33
lambdayokay checking14:33
@HeikoSlambday:  https://gist.github.com/karlnapf/578797514:34
@HeikoSthe way it works is that one creates one sample per colour14:34
@HeikoS(at least)14:35
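For reference, the estimator under discussion is (roughly) the standard Monte Carlo trace estimator applied to the log-determinant, using that \log\det C = \operatorname{tr}(\log C) for symmetric positive definite C:

    \log\det C = \operatorname{tr}(\log C) \approx \frac{1}{N} \sum_{i=1}^{N} s_i^{\top} \log(C)\, s_i,
    \qquad \mathbb{E}\!\left[ s_i s_i^{\top} \right] = I .

In the probing variant of Tang & Saad referenced above, the s_i are not drawn i.i.d. but come as a group, one sparse vector per colour of a graph colouring of C's sparsity pattern, which is why a sampler that hands out one vector at a time needs to know which colour it is currently on.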
lambdayyes14:35
@HeikoSbut thats easy to store within the job14:35
lambdayso, doable, right?14:35
@HeikoSyes I think so14:35
@HeikoSwe just need to think about some details14:36
@HeikoSthe probing sampler needs to know which colour to sample from14:36
@HeikoScan just be set as a member variable14:36
lambdayyes14:37
@HeikoSso then a different sampler instance for every colour14:37
@HeikoSnono sorry14:37
@HeikoSjust set the colour before calling sample14:37
@HeikoSwith setter14:37
@HeikoSbut from where is this done?14:37
lambdayyes that's what I was thinking14:37
lambdayhow about..14:38
lambdayeach time we call this sample(), it gives a vector from the next color to sample from?14:38
lambdayand when all the colors are traversed, it gives zero vector or some warning14:39
@HeikoSor just starts from the beginning14:39
@HeikoSthis could work14:39
@HeikoShas one downside:14:39
lambdaydirect solvers14:39
lambdaysame sample for multiple jobs14:40
@HeikoSdoesnt work in parallel computation14:40
@HeikoSsince there we cannot guarantee any order of the jobs being computed14:40
@HeikoSso we need to store this information within the job somehow14:40
@HeikoSso the job needs information about the used sampler14:40
@HeikoSand if its the probing one14:41
@HeikoSit also stores the colour it uses14:41
@HeikoSah ugly14:41
lambdayorder is important?14:41
@HeikoSif the sampler iterates over the colours, yes14:42
lambdayif we compute the final s^T * (log(C)*s) within this compute, then may be not :-/14:43
lambdayoh but then aggregate comes!!14:43
@HeikoSno thats fine14:43
@HeikoSits just that we should use all colours14:43
@HeikoSor we just do it random14:43
@HeikoSah might not be independent anymore then14:44
@HeikoSI think I have to talk to Daniel about this14:44
lambdayall colors, but in random order14:44
@HeikoSyes14:44
lambdaythat should be fine14:44
@HeikoSi dont know if that breaks independence somehow14:44
@HeikoSso the other way would be to have a unique sampler instance per job14:45
@HeikoSrandom would be best :)14:45
lambdayhow does it get to know about the coloring?14:45
lambdayone sampler instance per color14:45
@HeikoSlambday: we could for example pass the index in the loop or something14:46
@HeikoSargh, random, I want random to work :D14:46
@HeikoSthat would be best14:46
@HeikoSthen we have one single sampler instance14:46
@HeikoSand the jobs just call it within compute14:46
@HeikoSand the thing just samples from a random colour14:46
lambdaybut that may make s1^T * (log(C)* s2)14:47
lambdays1=/=s214:47
@HeikoSah yes14:47
lambday:'(14:48
@HeikoSok then, lets move the vector-vector to the job14:48
@HeikoSmight be a good idea anyway14:48
@HeikoSthen the function returns a scalar14:48
lambdayno aggregate?14:48
lambdaywhat should we do then for direct solvers?14:49
@HeikoSwell yes14:50
@HeikoSaggregate computes the average14:50
lambdayhow about two types of jobs14:50
lambdayhmm14:50
@HeikoSI would like to avoid having different interfaces for the samplers14:50
@HeikoSsince then different interfaces for the jobs14:50
@HeikoSand everything gets more complicated14:50
@HeikoSdont you like s^T f(C) s ?14:51
@HeikoSits a standard form14:51
lambday:D14:51
@HeikoSthe other one is slightly more general but the above one is the one thats relevant for practice?14:51
@HeikoS!14:51
lambdayokay14:51
@HeikoSIll write daniel a mail, just a sec14:51
lambdayokay14:52
lambdayI'm then figuring out the classes that should be fine with the new design14:52
@HeikoSlambday: there might be another issue :(14:54
lambdaynoooooooooo :(14:55
@HeikoSs^T f(C) s14:55
lambdaywhat's that? :(14:55
@HeikoShas one sample14:55
@HeikoSs^T log(C) s14:55
@HeikoSbut14:55
@HeikoSthere might be multiple jobs14:55
@HeikoSif we do the family of shifted systems, thats fine - only one job14:56
@HeikoSbut if we solve all systems of the rational approx separately14:56
@HeikoSwe have multiple jobs14:56
lambdayyes that's what I was talking about when I mentioned direct solvers :(14:56
@HeikoSah sorry14:56
lambdayno problem :( but what to do for that case?14:57
@HeikoSone possibility, parametrise the sampler14:58
@HeikoSlike a seed14:59
@HeikoSthat is set before that sample is drawn14:59
@HeikoSthis can be fixed and stored in the job14:59
@HeikoSbut thats quite ugly14:59
lambdayhmmm...15:00
lambdayugly but feasible15:00
@HeikoSoh or this:15:00
@HeikoSdirect solvers are completely unfeasible on one computer15:00
@HeikoSso things need to be sent to a cluster anyway15:00
lambdayclassic!15:00
lambday:D15:00
@HeikoSso what we could do in this case15:00
lambdayoh but then CG solvers are feasible15:01
lambdayone COCG solver per system... not COCG_M15:01
@HeikoSyeah sure, but thats the case15:01
@HeikoSwhat?15:01
@HeikoSI mean COCG_M can be done on one computer15:01
lambdayyes15:01
@HeikoSbut solving all systems with preconditioned CG takes ages15:01
@HeikoSso things would need to be sent to a network anyway15:02
@HeikoSand this means15:02
@HeikoSwe can just store the sample in the job in this case15:02
@HeikoSwe create a dummy sampler, which always returns the same value15:02
@HeikoSthis is done in the rational approximation class, if direct solvers are used15:03
@HeikoSand in the other case, we do this random colour trick15:03
@HeikoSthat should do it no?15:03
lambdayI didn't get the dummy sampler part15:04
iglesiasghi HeikoS!15:04
@HeikoSsince every job now contains a sampler rather than a sample15:04
lambdayoh15:04
@HeikoSwe need to put a fixed sample under that interface15:04
@HeikoSiglesiasg:  hi!15:04
lambdayso, a useless sampler that does nothing15:04
@HeikoSlambday: yes, just to have the interface15:04
lambdayokay15:05
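A minimal sketch of the "dummy sampler" idea discussed above, assuming a hypothetical sampler with a single sample() method; class and method names are illustrative, not actual Shogun API:

    #include <shogun/lib/common.h>
    #include <shogun/lib/SGVector.h>
    using namespace shogun;

    // A trace "sampler" that always hands back the same fixed vector, so that
    // several jobs which must all see an identical sample can still be wired up
    // against the common sampler interface (hypothetical base class omitted).
    class CFixedTraceSampler
    {
    public:
        explicit CFixedTraceSampler(SGVector<float64_t> sample) : m_sample(sample) {}

        // returns the stored sample, no matter how often it is called
        SGVector<float64_t> sample() const { return m_sample; }

    private:
        SGVector<float64_t> m_sample;
    };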
@HeikoSah wait15:05
iglesiasgHeikoS: so it seems you cannot make it for the meeting on Thursday15:05
@HeikoSthere are different job classes for direct solvers and COCG_M right?15:05
@HeikoSiglesiasg: no not that time15:05
iglesiasgHeikoS: I think we should choose that day anyway since it is the most selected one15:05
@HeikoSiglesiasg: it might work, but not sure15:05
iglesiasgHeikoS: aham ok15:05
iglesiasgHeikoS: what time would suit better in any case?15:05
iglesiasg14 and 15 UTC are with the same number of votes15:06
@HeikoSiglesiasg: I am on a cycling holiday from that day, so I might have a chance to get online in the evening15:06
iglesiasgthat is 16 and 17 German time I think15:06
@HeikoSbut I cannot guarantee that unfortunately15:06
iglesiasgall right15:06
iglesiasgit will be probably fine anyway15:06
@HeikoSiglesiasg: yes, not a problem15:07
iglesiasg14 UTC then15:07
@HeikoSok I ll try to be there15:07
@HeikoSbut dont wait for me :)15:07
@HeikoSlambday: different classes for these jobs15:07
lambdayyep15:08
lambdayshould be fine I think then15:08
@HeikoScould you send me the current diagram?15:08
lambdayokay15:08
lambdayits the same15:08
lambdaydidn't change anything15:08
lambdaybut sending anyway15:09
lambdayshit man ! I hate this 1GB laptop :'( will buy one with the gsoc money :(15:10
@HeikoSlambday: haha :)15:11
@HeikoSdont worry then15:11
lambdayHeikoS: checking your mail... mailed you the diagram btw15:12
@HeikoSlambday:  thanks15:12
@HeikoSah  nice15:12
@HeikoSso different class15:12
@HeikoSthen just store the vector within there15:12
@HeikoSWe will do this one later anyways15:13
@HeikoSand the COCG_M one gets a sampler15:13
@HeikoSand the probing sampler then (later) will randomly select colour15:13
@HeikoSsee mail (I hope Daniel gives OK)15:13
@HeikoSphew!15:13
@HeikoSgood discussion that started only because of this m_vector.vector=NULL :D15:14
lambdayHeikoS: hahaha! :D good that I tried with this stupid thing :D15:14
@HeikoSCTraceSampler now returns a vector15:14
@HeikoSlambday: not stupid15:14
@HeikoSactually that was quite sensible15:14
lambdayworks though :P15:14
@HeikoSand otherwise we hadnt realised this problem15:14
@HeikoSlambday: ah one more thing15:15
@HeikoSso in order not to run into the same problem again upon constructing the jobs15:15
@HeikoS(not being able to store all)15:15
@HeikoSplease submit the jobs to the computation class directly after you created them15:15
@HeikoSso rather than creating a list of jobs, just send them to the computation engine directly and then unreference locally15:16
@HeikoSthis way, they are not stored within the class that creates the jobs15:16
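The "submit directly and drop the local reference" pattern would then look roughly like this; the job type, create_job() and submit_job() are placeholders for whatever the final interfaces are called (a possible engine shape is sketched further below):

    // inside the class that generates the jobs (sketch only)
    for (index_t i = 0; i < num_jobs; ++i)
    {
        CIndependentJob* job = create_job(i); // hypothetical factory for one job
        engine->submit_job(job);              // engine stores or computes it right away
        SG_UNREF(job);                        // no local array of jobs is kept
    }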
-!- iglesiasg [d58f3208@gateway/web/freenode/ip.213.143.50.8] has quit [Quit: Page closed]15:16
lambdaywait let me think15:16
lambdaycurrently, calling generate jobs returns an array of jobs... its the CLogDetEstimator's task to register all of those in the engine15:17
@HeikoSlambday: ah yes, sorry15:17
lambdayand this is done repeatedly for each estimate of the log-det15:17
@HeikoSlet me think15:18
@HeikoSthe operator function class15:18
@HeikoShow many jobs does it create?15:18
lambdaydepends..15:19
@HeikoSon?15:19
lambdayits implementation15:19
@HeikoSit is one per trace sample right?15:19
@HeikoSah no15:19
@HeikoSI mean15:20
@HeikoSoh yes15:20
@HeikoSone per trace sample15:20
lambdayumm...15:20
@HeikoSso the maximum number in our context is the number of shifts?15:20
lambdaybut for preconditioned CG, one per shift too15:20
lambdayyes15:21
lambdayor direct15:21
@HeikoSok, so the point is, not many15:21
@HeikoSah gotta get my charger, just a sec15:22
lambdayokie tyt15:22
lambday(my laptop doesn't run on battery :D as good as a desktop :D, it electricity goes, BAM!)15:22
@HeikoSso about the jobs15:23
lambdayif*15:23
@HeikoSit would be nice not to store many of them locally15:24
lambdayhmm15:24
@HeikoSbut rather directly submit them to the engine, which then can store it somewhere15:24
@HeikoSdepending on its implementation15:24
@HeikoSsee what I mean?15:25
lambdayno.. confused! :(15:25
@HeikoSrather than building an array of jobs and then submit all of those15:25
lambdayyou mean, register directly from the generate jobs?15:25
@HeikoSyes15:26
lambdaythat's cool15:26
lambdayno return for it then15:26
@HeikoSactually, thats really nice15:26
lambdayCLogDetEstimator shouldn't even bother about the jobs at all15:26
@HeikoSwell yes15:26
@HeikoSgetting results15:26
@HeikoSbut thats fine15:27
@HeikoSso the generate_jobs method has another parameter which is the computation engine15:27
lambdayyes15:27
@HeikoSwhenever a job is created in the operator function, it is directly submitted15:27
@HeikoSand then the cool thing is15:27
@HeikoSthe computation engine can have different implementations of what to do15:27
@HeikoSso serial:15:28
@HeikoSthe computation engine does not store the jobs, but rather directly computes them and only stores the result15:28
lambdaymove the engine to the operator instead, but in the log-det.. aggregate result should return the float (vec-vec) product15:28
@HeikoSand parallel can store the jobs (more memory) and then on compute_all() do the computation15:29
@HeikoSpoint is the serial one does not have to store any jobs15:29
@HeikoSso no memory problems15:29
@HeikoScomputation engine would have a different method then, not compute_all, but get_results() which in serial case just returns the results, and in parallel case starts computation15:30
@HeikoSthis way, we can actually store the samples in the jobs15:30
lambdayand you mean, register_jobs should already compute and store the result... for serial15:30
@HeikoSand the operator function generates the sample15:30
lambdaywe should change the name then15:31
@HeikoSjep15:31
@HeikoSits like thread stuff15:31
@HeikoSjoin etc15:31
@HeikoSwait to be completed15:31
@HeikoSin fact, all computation classes can start computing things once the first job is registered15:31
@HeikoSdepending15:32
@HeikoSthe good thing is that we can then store the sample in the jobs15:33
@HeikoSsince jobs are not stored all at once15:33
@HeikoSsample is then created in the operator function class15:34
@HeikoSoh wow, actually this even simplifies things more15:35
@HeikoSCOperatorFunction gets a vector as parameter15:36
lambdaydoes it? :(15:36
lambdayokay15:36
@HeikoSbut CLogDetEstimator samples the probing vectors15:36
@HeikoSand it *knows* the colours15:36
@HeikoStherefore, no random selection of colours15:36
lambdaywhy you wanna give the vector to the COperatorFunction?15:36
@HeikoSCLogDetEstimator samples the trace vectors one by one15:36
@HeikoSand for each of them has to compute s^T f(C) s15:37
@HeikoSso its calls some method on operator function15:37
@HeikoSand passes the sample15:37
@HeikoSCOperatorFunction then constructs some jobs and submits them to the engine15:37
@HeikoSthe engine however blocks until some jobs are completed (either one or many)15:38
@HeikoSits like a queue that gets filled15:38
@HeikoSonce full, it blocks until more space is available15:38
@HeikoSat some point, all jobs are submitted/computed15:39
@HeikoSCLogDetEstimator continues in its loop15:39
@HeikoSonce completed, it asks the computation engine for results15:40
@HeikoSthe computation engine blocks until all results are ready15:40
@HeikoSand then returns them15:40
@HeikoSCLogDet averages over them15:40
@HeikoSwe could even give each job a callback function that is called with results once a result is done15:41
@HeikoSthen we dont have to store the results15:41
@HeikoSbut later15:41
@HeikoSlambday: sorry for the confusion, but I think this way it might be way more efficient15:42
lambdaygenerating jobs and computing jobs will go in parallel then15:42
lambdayHeikoS: its okay.. it sounds better15:42
@HeikoSevery class that generates jobs will directly submit them15:42
@HeikoSone by one15:42
@HeikoSrather than storing /returning them15:42
lambdayyes15:42
@HeikoSthis way we can store stuff in the jobs (as trace vectors)15:43
@HeikoSand the computation engine has a method submit15:44
@HeikoSwhich takes a job, adds it to an internal queue, starts computing it, and blocks if the queue is full15:44
@HeikoS(serial just calls compute in same thread, so dont worry about other things for now15:44
@HeikoS)15:44
lambdayhow we're gonna implement that thing? blocking?15:45
@HeikoSjust call compute method of job15:45
@HeikoStakes some time15:45
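A rough sketch of the serial engine behaviour just described; the class and method names are placeholders for the design being discussed, not actual Shogun API:

    // Minimal job interface assumed for this sketch (the real class will differ).
    class CIndependentJob
    {
    public:
        virtual ~CIndependentJob() {}
        virtual void compute() = 0; // does the work and hands its result onwards
    };

    // Serial engine: submit_job() runs the job immediately in the calling thread,
    // so no queue of jobs ever builds up and only results survive. A parallel
    // engine would enqueue the job here instead (blocking once its queue is full)
    // and do the actual work later / on other machines.
    class CSerialComputationEngine
    {
    public:
        void submit_job(CIndependentJob* job) { job->compute(); }
        void wait_for_all() { /* nothing left to do in the serial case */ }
    };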
@HeikoSthen one more change:15:45
@HeikoSrather than producing an instance for a job result15:46
@HeikoSwe have a new class that is responsible for handling job results15:46
@HeikoSthis class has a method which takes a job result instance15:46
@HeikoSand does something with it (i.e. store if feasible, or update some internal result)15:46
@HeikoSlambday:  the idea here is again that we do not have to store things.15:47
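The result-handling class mentioned here could be as small as the following sketch (again with made-up names), keeping only a running aggregate instead of every job result:

    #include <shogun/lib/common.h>
    using namespace shogun;

    // Called by the engine right after each job finishes; stores only a sum.
    class CScalarResultAggregator
    {
    public:
        CScalarResultAggregator() : m_sum(0.0), m_num_results(0) {}

        void submit_result(float64_t result)
        {
            m_sum += result;   // a running average/variance could be kept instead
            m_num_results++;
        }

        float64_t get_sum() const { return m_sum; }

    private:
        float64_t m_sum;
        index_t m_num_results;
    };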
lambdaycould you please update the class diagram with what you have in mind? :( I'm again forgetting things :(15:48
@HeikoSlambday: sure will do15:49
lambdayHeikoS: when do we compute the vec-vec product in this new design?15:50
@HeikoSlambday: we can choose15:50
lambdaywhere*15:50
@HeikoSeither in the OperatorFunction class15:50
@HeikoSor inthe LogDet15:50
@HeikoSah no15:50
@HeikoSit has to be OperatorFunction15:51
lambdayyep15:51
lambdaysince that gets the vector, right?15:51
@HeikoSyes15:51
@HeikoSalso otherwise, we could not aggregate results that easily15:51
@HeikoSand had to store the vectors once again15:51
@HeikoSthis way, each job returns a float15:52
lambdayyup15:52
@HeikoSof which we can do a running average15:52
@HeikoSor store them (floats are small :)15:52
lambdaybetter :)15:52
@HeikoScallbacks for the results might be useful later though15:53
@HeikoSimagine the result is a matrix15:53
@HeikoSand you later want to do the lement wise average of many matrices15:53
@HeikoSimpossible to store all them15:53
@HeikoSanyway15:53
lambdayhmm.. we should keep that in the design15:54
@HeikoScan do that15:54
@HeikoSnow that we dont store jobs anymore15:55
@HeikoSthe computation class needs to call a function on a single job's result directly after having computed it15:55
@HeikoSthen it can forget about job and result since anotoher class takes care of it15:55
lambdaywhat will that function do?15:56
lambdayfor our COCG_M for instance15:57
lambdaysingle job's result is a vec15:57
@HeikoScan the jobs compute s^T SYSTEM s ?16:00
@HeikoSI thnk they can16:01
@HeikoSsince the solution of every system A^(-1) x is a vector which has to be multiplied with x afterwards right?16:01
@HeikoSlambday: do you agree?16:02
lambdayya but what about the case when we solve each system for each shift?16:02
lambdaythat's also a vector but need to be summed up16:02
lambdaybefore the dot product16:03
@HeikoSno difference16:03
@HeikoSa(b+c+d)a = aba + aca + ada16:03
lambdayargh!! yes you're right16:04
lambdaysorry16:04
@HeikoSlambday: no I didnt actually think of that before you mentioned it :D16:04
@HeikoSin both cases we can just sum up the scalars returned by the jobs16:05
@HeikoSfor COCG_M its only one scalar16:05
lambdaysuper cool!!!16:05
@HeikoSfor direct solvers, its a sum16:05
@HeikoSquite a few changes16:06
@HeikoSIll update the class diagram16:06
lambdayfor the whole thing its a sum over a number of scalars.. so no aggregation needed16:06
@HeikoSlambday: yes16:06
@HeikoSespecially no storage of vectors16:06
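Written out, the observation above is just linearity of the quadratic form over the rational approximation (generic weights \alpha_l and shifts \sigma_l; the exact values depend on the rational approximation of \log that is used):

    s^{\top} \log(C)\, s \;\approx\; s^{\top} \Big( \sum_{l} \alpha_l (C + \sigma_l I)^{-1} \Big) s
    \;=\; \sum_{l} \alpha_l\, s^{\top} (C + \sigma_l I)^{-1} s ,

so whether one job solves the whole shifted family at once (COCG_M) or there is one job per shift, every job can return a single float and the final estimate is a plain sum of scalars.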
lambdayaaaah classic!!16:06
lambdayyou rock!!16:06
@HeikoShaha, lambday well you designed it with me, so you rock too! :)16:07
lambdaylol I only tried stupid things and questions - you came up with all the ideas :D16:07
lambdayI'm feeling a little sad for my code though :( but this wouldn't have come up if I wouldn't have tried it.. so its better I guess16:08
@HeikoSlambday: yes thats always annoying to have code lost16:09
@HeikoSteaches us to even spend more time with planning16:09
lambdayHeikoS: please also write a few lines in the mail too.. why - that's clear but its the "what" that I'm fond of forgetting :(16:10
lambdayHeikoS: ya.. well official GSoC hasn't started yet, so not that bad.. its just 1 week I've been coding on this16:10
lambday:D16:10
@HeikoSlambday: some things will be re-used16:10
lambdayyes16:11
lambdayI'16:11
lambdayI'll think about it16:11
lambday(as much copy-paste as I can :D )16:11
@HeikoSyep16:11
@HeikoS :D16:11
lambdayI'm bookmarking that page btw.. will show it to my friends :D hehe16:12
@HeikoSlambday: wait until its on the UCL website :D16:15
lambdayHeikoS: hehe :D16:15
lambdayHeikoS: I'll be back after my brunchnner :-/17:10
@HeikoSlambday: currently writing you an email17:10
@HeikoSleaving afterwards17:10
lambdayHeikoS: okay then I'm here till then...17:10
@HeikoSlambday:  sent17:14
@HeikoSmaybe read and tell me if you have questions17:14
lambdayHeikoS: checking17:14
lambdayHeikoS: oh so CLogDetEstimator will use the aggregators... which will have the results stored in them once the engine returns from "wait_for_all()"17:31
lambdayand get that result from that aggregator (float in this case) and do the sum17:32
@HeikoSyes17:32
@HeikoSlambday: let me know if there are any problems, I have to go now, but will check mail later17:32
-!- HeikoS [~heiko@176.248.212.166] has quit [Quit: Leaving.]17:32
-!- lambday [67157f4c@gateway/web/freenode/ip.103.21.127.76] has quit []17:33
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has joined #shogun19:09
-!- nube [~rho@49.244.57.163] has quit [Ping timeout: 256 seconds]19:11
-!- nube [~rho@49.244.22.151] has joined #shogun19:25
@sonney2kwiking, what did you mean by that?19:43
@sonney2kand lisitsyn where are you stuck?19:46
lisitsynsonney2k: I am stuck with the structure of CMakeLists there19:46
@sonney2klisitsyn, then talk to wiking about it19:47
lisitsynyes good idea ;)19:47
lisitsynsonney2k: I actually need your help too19:47
lisitsynsonney2k: Makefile.template looks very entangled to me19:48
@sonney2klisitsyn, Makefile.template is used to generate the makefile for each interface19:49
lisitsynsonney2k: that's clear19:49
@sonney2kthen what is not?19:49
lisitsynsonney2k: I don't get what is, say, the exact CXX line for cmdline_interface19:50
@sonney2k?19:50
@sonney2kno idea what you mean19:51
lisitsynsonney2k: nevermind I think I got it for cmdline19:51
lisitsynnot for octave/matlab though19:51
@sonney2kwhat did you not get?19:52
lisitsynsonney2k: I don't get what are the steps to compile it19:52
@sonney2kcompile libshogun19:52
@sonney2kthen the matlab part19:53
@sonney2kand then link to libshogun with the matlab object files19:53
lisitsynsonney2k: where can I find these last two steps?19:55
gsomixsonney2k, good evening. have a minute? PR is updated. https://github.com/shogun-toolbox/shogun/pull/113019:56
@sonney2klisitsyn, $(TEMPLATE_TARGET): .depend $(OBJFILES) $(SRCFILES) $(HEADERFILES) $(OTHERDEPS)19:57
@sonney2k    @echo 'Linking object files'19:57
@sonney2k    @$(LINK) $(PRELINKFLAGS) $(shell find $(SRCDIR) -name "*.$(EXT_OBJ_CPP)" -o \19:57
@sonney2k        -name "*.$(EXT_OBJ_C)" 2>/dev/null) $(LINKFLAGS) -o $@ $(POSTLINKFLAGS)19:57
lisitsynsonney2k: :D yeah here I am stuck19:57
@sonney2kthe POSTLINKCMD was just necessary for running shogun on gunnars iphone19:57
gsomixsonney2k, btw, I think another class is needed for splitting strings, instead of CircularBuffer::find_char(char delimiter[256]).19:59
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has quit [Read error: Connection reset by peer]19:59
@sonney2klisitsyn, well compiling happens in the implicit rule %.$(EXT_OBJ_CPP):   %.$(EXT_SRC_CPP)19:59
@sonney2kso you have all .cpp -> cpp.o19:59
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has joined #shogun20:00
@sonney2kand then the find just finds all *.cpp.o and links them together into $(TEMPLATE_TARGET)20:00
@sonney2kwhich would be sg.oct for octave20:00
lisitsynsonney2k: alright that explains something20:04
lisitsynthanks20:04
lisitsynwiking: ping20:26
lisitsynwiking: we've got to give the user the ability to disable python modular20:33
lisitsyn;)20:33
-!- nube [~rho@49.244.22.151] has quit [Quit: Leaving.]20:33
-!- HeikoS [~heiko@176.248.212.166] has joined #shogun20:34
-!- mode/#shogun [+o HeikoS] by ChanServ20:34
votjakovrHeikoS: hi! I've sent a PR with some fixes. Please look at it20:46
@HeikoSvotjakovr: hi!20:46
@HeikoSjust started looking at it 5 mins ago :)20:46
@HeikoSvotjakovr: I see you saw the note on the error messages, sorry for pushing you in the other direction before20:47
@HeikoSbut this way its way easier20:47
@HeikoSvotjakovr: there is another *very* nice publication about GPs for classification20:49
votjakovrHeikoS: ah, all right. btw now posterior parameters are evaluated "lazily"20:49
@HeikoSits not yet published so please dont spread it, but Ill send it to you anyways, might be interesting for some post GSoC work20:50
@HeikoSvotjakovr: thats totally fine20:50
@HeikoSvotjakovr: for these combined features cases, why is only the first element checked?20:51
lisitsynHeikoS: secret ops20:52
-!- shogun-notifier- [~irker@7nn.de] has joined #shogun20:52
shogun-notifier-shogun: Roman Votyakov :develop * b8b2e6f / src/shogun/regression/gp/ (6 files): https://github.com/shogun-toolbox/shogun/commit/b8b2e6f9cf99a1055429945f5d0ef309dfcba9c120:52
shogun-notifier-shogun: Fixed recomputing of posterior parameters20:52
shogun-notifier-shogun: Heiko Strathmann :develop * a640e47 / src/shogun/regression/gp/ (6 files): https://github.com/shogun-toolbox/shogun/commit/a640e473fae1e5e608457e2d0a5cc242c8ca53a720:52
shogun-notifier-shogun: Merge pull request #1170 from votjakovr/develop20:52
shogun-notifier-shogun:20:52
shogun-notifier-shogun: Fixed recomputing of posterior parameters20:52
@HeikoSvotjakovr: nice work!20:52
votjakovrHeikoS: thanks :)20:53
@HeikoSvotjakovr: say, could you make the laplace approximation in such way that we can sample from it?20:53
@HeikoSi.e. make the cholesky available from outside20:53
@HeikoSor is that already possible?20:53
@HeikoSI really want to integrate this pseudo-marginal stuff at some point and for that we need to be able to sample from the approximate posterior distributions (Laplace, also EP)20:54
votjakovrHeikoS: do you mean posterior cholesky?20:55
@HeikoSvotjakovr: yes the cholesky of the Gaussian approximation to the posterior20:55
@HeikoSvotjakovr: ah its already there20:56
votjakovrHeikoS: so get_cholesky() is the public method of Laplacian inference20:56
@HeikoSvotjakovr: yep, I see, nice20:57
@HeikoSvotjakovr: whats next?20:57
votjakovrHeikoS: are you working on mcmc stuff?20:57
@HeikoSvotjakovr: yes a bit20:57
@HeikoSI actually wanted to write this paper that I sent you20:58
@HeikoSbut out of time ;)20:58
votjakovrHeikoS: great :)20:58
@HeikoSso mark and this other guy did it20:58
@HeikoS votjakovr, are you interested in that stuff?20:58
shogun-buildbotbuild #940 of cyg1 - libshogun is complete: Failure [failed configure]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/cyg1%20-%20libshogun/builds/940  blamelist: Heiko Strathmann <heiko.strathmann@gmail.com>20:58
votjakovrHeikoS: mcmc?20:59
@HeikoSyes20:59
@HeikoSvotjakovr: about the cholesky21:00
@HeikoSI want to be able to sample from the Laplace approximation21:01
@HeikoSso need a way to get its covariance factor and its mean21:01
@HeikoSget_cholesky does not do that21:01
@HeikoSvotjakovr: do you have the GP book?21:03
@HeikoScheck expression 3.2021:03
@HeikoSwe need the cholesky of that Gaussian to sample from it21:04
@HeikoSso need the cholesky of matrix 3.2721:05
@HeikoSvotjakovr: but maybe lets do that a bit later, just keep it in mind21:06
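For reference, the Gaussian being talked about is the Laplace approximation to the GP posterior over latent function values (in the notation of the GPML book cited above), and sampling from it needs exactly the factor discussed here:

    q(f \mid X, y) = \mathcal{N}\!\left( f \mid \hat{f},\; (K^{-1} + W)^{-1} \right),
    \qquad W = -\nabla\nabla \log p(y \mid \hat{f}),

    f = \hat{f} + L z, \quad z \sim \mathcal{N}(0, I), \quad L L^{\top} = (K^{-1} + W)^{-1}.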
@HeikoSvotjakovr: we should really soon start with the classification stuff as GSoC begins on Monday21:06
@HeikoSso high priority is to separate regression and GP so that the GP class can be used for classification21:07
@HeikoSI suggest we start with logit/Laplace21:07
@HeikoSvotjakovr: what do you think?21:07
shogun-buildbotbuild #941 of cyg1 - libshogun is complete: Failure [failed configure]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/cyg1%20-%20libshogun/builds/941  blamelist: Roman Votyakov <votjakovr@gmail.com>21:07
-!- lisitsyn [~lisitsyn@109-226-90-135.clients.tlt.100megabit.ru] has quit [Quit: Leaving.]21:08
-!- lisitsyn [~lisitsyn@109-226-90-135.clients.tlt.100megabit.ru] has joined #shogun21:08
votjakovrHeikoS: Ah, yep.. I've already started drawing diagrams21:08
@HeikoSvotjakovr:  nice!21:08
@HeikoSany particular thoughts?21:09
votjakovrHeikoS: not yet, i've just drawn the existing part21:12
@HeikoSvotjakovr: what kind of interfaces should the classifier have?21:14
@HeikoSI think it might be good to do all those in the shogun style21:15
@HeikoSreturning regression/binary/multiclass labels depending on the GP type21:15
votjakovrHeikoS: maybe it could be derived from CMachine?21:17
votjakovrHeikoS: Btw why do we need a general GPs class?21:18
@HeikoSvotjakovr: so there are two ways to go:21:19
@HeikoSfirst: GP as its own thing, all methods derived from the GPRegression class21:19
@HeikoSHowever, we can also do classification problems - I am a little afraid here that users might be confused21:20
@HeikoSwhen we mix regression/classification21:20
@HeikoSthe basics are the same (prior, likelihood etc)21:20
@HeikoSwhich brings us to the second way:21:21
@HeikoShave a GP base class, and inherit from it for regression and classification21:21
@HeikoSwhere each of those has shogun-like interfaces21:21
@HeikoSI would rename the apply_regression method to21:22
@HeikoSCLabels* apply(...)21:22
@HeikoSand add some distinction for classification problems21:23
@HeikoSso we could have GPRegression which only accepts regression based inference method (and data)21:24
@HeikoSand a GPClassification which only accepts classification based methods21:24
@HeikoSthis one could have a subclass for multiclass21:24
@HeikoSvotjakovr: or dont you agree?21:24
-!- pickle27 [~Kevin@S0106002191dec7e8.cg.shawcable.net] has joined #shogun21:25
@HeikoSvotjakovr: another problem might be that some of the code in the current GP class is not suitable for classification21:26
votjakovrHeikoS: yep, i agree.21:27
@HeikoSbtw GP class is already derived from CMachine21:28
votjakovrHeikoS: current GP regression is derived from CMachine21:28
@HeikoSyep :)21:28
votjakovrHeikoS: So may be it's good to use CMachine interface for classification too?21:29
@HeikoSvotjakovr: yep I think I would have a GP base class which inherits from CMachine, and then two subclasses for regression/classification21:29
votjakovrHeikoS: I totally agree! :)21:29
@HeikoSand multiclass is an extension of binary21:29
@HeikoSbut that later21:30
@HeikoSso the whole feature storage and also mean and covariance functions can be done in the base class21:30
@HeikoSvotjakovr: check Machine.h21:31
@HeikoSthere are all these apply functions21:31
@HeikoSwe just override those21:31
@HeikoSin the GP subclasses21:31
votjakovrHeikoS: yep, sure21:31
@HeikoSand use CMachine's apply21:31
@HeikoSas its done now only for regression21:32
@HeikoSso btw21:35
@HeikoSmean vectors and covariance vectors also only apply to gp regression21:35
@HeikoSbinary classification gets a value in [0,1]21:36
@HeikoSand multiclass gets a vector with [0,1] values for every class21:36
@HeikoSso that can also go in subclasses21:36
@HeikoSvotjakovr: do you  need help defining these new classes?21:37
@HeikoSor a class diagram to make clear what I mean?21:38
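A sketch of the split being agreed on here; the class names are illustrative (the real classes may end up named differently), but the shape follows the discussion: one CMachine-derived base holding the shared GP machinery, and thin subclasses that only fix the label type:

    #include <shogun/machine/Machine.h>
    #include <shogun/features/Features.h>
    #include <shogun/labels/RegressionLabels.h>
    #include <shogun/labels/BinaryLabels.h>

    namespace shogun
    {
    // Shared base: holds features, mean function, covariance function and the
    // inference method; knows nothing about the label type being produced.
    class CGaussianProcessMachine : public CMachine
    {
        // m_inference_method, m_mean, m_kernel, m_features, ...
    };

    // Regression GP: only this subclass exposes predictive means/variances.
    class CGaussianProcessRegression : public CGaussianProcessMachine
    {
    public:
        virtual CRegressionLabels* apply_regression(CFeatures* data=NULL);
    };

    // Binary classification GP: returns probabilities in [0,1]; a multiclass
    // subclass could later extend this with one probability per class.
    class CGaussianProcessClassification : public CGaussianProcessMachine
    {
    public:
        virtual CBinaryLabels* apply_binary(CFeatures* data=NULL);
    };
    }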
@HeikoSlisitsyn: the secret ops paper btw beats all other classification methods21:38
@HeikoSreally cool21:38
lisitsynHeikoS: the paper of who?21:38
shogun-buildbotbuild #468 of ubu1 - libshogun is complete: Success [build successful]  Build details are at http://www.shogun-toolbox.org/buildbot/builders/ubu1%20-%20libshogun/builds/46821:38
@HeikoSsome stats guys from glasgow/london21:39
@HeikoSon small datasets21:39
lisitsynHeikoS: and what is the data? ;)21:39
@HeikoSsome toy examples compared against other std approaches21:39
lisitsynI see21:39
@HeikoSlisitsyn: so they fully integrate out all model hyperparameters (i.e. kernel parameters)21:39
@HeikoSso they take all possibilities into account21:40
@HeikoSgiven the data21:40
@HeikoSwill be good to have this stuff in shogun :)21:40
lisitsynwiking: i see our install puts CMakeFiles/ too21:42
@HeikoSvotjakovr: I gotta go now, let me know if you need any help. We should get this done until at latest tuesday21:42
@HeikoSso that we can start on the classification stuff21:42
votjakovrHeikoS: i think it's better to discuss it after i finish drawing of diagrams. But i think, i21:42
@HeikoSvotjakovr: ok, please share them once they are almost complete (no need to be perfect on them, its just for discussions)21:43
votjakovrHeikoS: i think, i understand your idea21:43
@HeikoSvotjakovr: cool then,21:43
@HeikoSwhat are you using to draw them?21:43
votjakovrHeikoS: emacs21:44
@HeikoSemacs?21:44
@HeikoSfor class diagrams?21:44
votjakovrHeikoS: yep :)21:44
votjakovrHeikoS: it sounds a bit crazy)21:44
@HeikoSvotjakovr: haha indeed :D21:44
@HeikoScan you show how they look like?21:45
@HeikoSah man21:45
@HeikoSis it this ditaa?21:45
@HeikoSvotjakovr: maybe have a look at dia, things are way easier/faster to do in there -- although I appreciate the hacker style diagrams ;)21:46
@HeikoSvotjakovr: ok gotta go now21:47
@HeikoSsee you!21:47
votjakovrHeikoS: see you :)21:48
-!- pickle27 [~Kevin@S0106002191dec7e8.cg.shawcable.net] has quit [Quit: Leaving]21:48
-!- iglesiasg [d58f323e@gateway/web/freenode/ip.213.143.50.62] has joined #shogun22:25
-!- mode/#shogun [+o iglesiasg] by ChanServ22:25
-!- votjakovr [~votjakovr@host-46-241-3-209.bbcustomer.zsttk.net] has left #shogun ["ERC Version 5.3 (IRC client for Emacs)"]22:27
-!- pickle27 [~Kevin@S0106002191dec7e8.cg.shawcable.net] has joined #shogun22:40
lisitsynpickle27: oh I know your ISP!22:44
lisitsynpatched a few things for them :D22:44
pickle27lisitsyn, haha thats hilarious22:55
lisitsynpickle27: I used to work at netcracker and shaw uses their software22:56
pickle27lisitsyn, cool!22:58
pickle27lisitsyn, yeah shaw is actually a pretty good isp22:58
pickle27I wish I could have them back home in Kingston22:58
@iglesiasghaha funny thing in lmnn code22:59
@iglesiasgfprintf('The bizarre error happened!\n');22:59
lisitsyniglesiasg: ohh you look at lmnn code23:05
lisitsynI'll send you flowers!23:05
@iglesiasghaha why?23:06
lisitsyniglesiasg: I died twice reading the code!23:06
lisitsyngood I am a cat you know23:06
lisitsyn9 lives23:06
@iglesiasglisitsyn: ah! hehe23:06
@iglesiasgwere you interested in it?23:06
@iglesiasgthis is LMNN23, a relatively new version23:07
@iglesiasgI think the code is quite ok23:07
lisitsyniglesiasg: yes chris wanted me to port it23:10
@iglesiasgaham23:10
@iglesiasgwhere do you think it would make sense to have it in shogun?23:10
lisitsynmulticlass I guess?23:11
@iglesiasglisitsyn: then as a classifier23:11
lisitsynyes, what else?23:11
@iglesiasglisitsyn: I think it would actually be more flexible if just the linear transformation is returned23:11
lisitsynyou mean LMNN = project + NN?23:12
@iglesiasgone may want to do something other than applying NN23:12
lisitsynmakes sense for me if possible23:12
-!- FSCV [~FSCV@189.139.252.135] has joined #shogun23:38
-!- hushell [~hushell@c-24-21-141-32.hsd1.or.comcast.net] has quit [Ping timeout: 248 seconds]23:45
-!- shogun-notifier- [~irker@7nn.de] has quit [Quit: transmission timeout]23:52
-!- hushell [~hushell@8-92.ptpg.oregonstate.edu] has joined #shogun23:57
--- Log closed Sun Jun 16 00:00:47 2013

Generated by irclog2html.py 2.10.0 by Marius Gedminas - find it at mg.pov.lt!