IRC logs of #shogun for Friday, 2011-12-30

--- Log opened Fri Dec 30 00:00:19 2011
06:02 -!- blackburn [~blackburn@188.168.4.27] has quit [Quit: Leaving.]
15:46 -!- in3xes [~in3xes@49.249.162.22] has joined #shogun
17:21 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has joined #shogun
17:32 -!- in3xes [~in3xes@49.249.162.22] has quit [Ping timeout: 240 seconds]
17:39 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has quit [Ping timeout: 240 seconds]
18:01 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has joined #shogun
18:06 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has quit [Ping timeout: 276 seconds]
18:45 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has joined #shogun
19:36 -!- puneetgoyal [~puneetgoy@117.197.160.53] has joined #shogun
19:36 -!- heiko [~heiko@host109-158-55-154.range109-158.btcentralplus.com] has quit [Ping timeout: 240 seconds]
19:56 -!- blackburn1 [~blackburn@109.226.80.162] has joined #shogun
19:57 -!- heiko [~heiko@host86-150-48-229.range86-150.btcentralplus.com] has joined #shogun
19:58 <blackburn1> heiko: how do you do?
19:59 -!- blackburn1 [~blackburn@109.226.80.162] has quit [Quit: Leaving.]
19:59 <heiko> hej blackburn
19:59 <heiko> I am fine :)
19:59 <heiko> and you?
20:08 -!- Netsplit *.net <-> *.split quits: CIA-1
20:19 -!- Netsplit *.net <-> *.split quits: puneetgoyal
20:20 -!- heiko [~heiko@host86-150-48-229.range86-150.btcentralplus.com] has quit [Ping timeout: 240 seconds]
20:23 -!- Netsplit *.net <-> *.split quits: naywhayare
20:23 -!- Netsplit *.net <-> *.split quits: @sonney2k, shogun-buildbot
20:24 -!- Netsplit over, joins: puneetgoyal, CIA-1, naywhayare, @sonney2k, shogun-buildbot
20:36 -!- heiko [~heiko@host86-167-55-27.range86-167.btcentralplus.com] has joined #shogun
20:48 -!- blackburn [~blackburn@109.226.100.113] has joined #shogun
21:32 <puneetgoyal> blackburn: hi
21:43 -!- puneetgoyal [~puneetgoy@117.197.160.53] has quit [Quit: Leaving]
22:52 -!- puneetgoyal [~chatzilla@117.197.160.53] has joined #shogun
23:05 <blackburn> puneetgoyal: hi
23:06 -!- in3xes [~in3xes@180.149.49.230] has joined #shogun
23:07 <puneetgoyal> blackburn: I have calculated the tf-idf values for every word in every payload of every email.....but I am confused on the representation of them
23:07 <blackburn> puneetgoyal: nice
23:07 <blackburn> well just store it as matrix
23:07 <blackburn> choose e.g. 10 of them
23:07 <blackburn> with highest tf-idf
23:08 <puneetgoyal> so should I consider the tf-idf value of a word for different documents, or take some average?
23:09 <puneetgoyal> I mean a lot of documents will have same word...and each of them will have a different value...
23:09 <blackburn> hmm yes
23:09 -!- heiko [~heiko@host86-167-55-27.range86-167.btcentralplus.com] has quit [Ping timeout: 240 seconds]
23:09 <blackburn> I don't know what is the best way
23:10 <puneetgoyal> how will it work if we have different values for a same word..
23:10 <blackburn> ehm? can't see any problem
23:10 <puneetgoyal> please elaborate
23:11 <blackburn> puneetgoyal: ok consider you have calculated tf-idfs
23:12 <blackburn> then you can choose some of them
23:12 <blackburn> e.g. 3
23:12 <blackburn> X,Y,Z
23:12 <puneetgoyal> ok
23:12 <blackburn> then feature vector for document
23:12 <blackburn> is tf-idfs of X,Y,Z respectively
23:12 <blackburn> got it?
23:12 <puneetgoyal> yeah
23:13 <blackburn> I am not sure it is the best way
23:13 <blackburn> but would work
23:13 <puneetgoyal> hmm..ok
23:14 <blackburn> the only heuristics is how to choose words
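
The scheme blackburn describes can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the channel or from Shogun: the helper names (`tfidf_per_document`, `choose_words`, `feature_matrix`), the toy documents, and the particular tf-idf variant (relative term frequency times log(N/df)) are all made up for the example. The word-selection rule, ranking each word by its highest tf-idf in any document, is one possible heuristic; the chat leaves that choice open. Each document keeps its own tf-idf value for a word, so no averaging is needed to build the rows.

```python
import math
from collections import Counter


def tfidf_per_document(docs):
    """Return one {word: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            word: (count / total) * math.log(n_docs / df[word])
            for word, count in counts.items()
        })
    return scores


def choose_words(scores, k):
    """Pick k words ranked by their highest tf-idf in any document.

    Assumed heuristic: the chat does not say how to rank words.
    """
    best = {}
    for doc_scores in scores:
        for word, value in doc_scores.items():
            best[word] = max(best.get(word, 0.0), value)
    # Sort by score descending, then alphabetically so ties are deterministic.
    return sorted(best, key=lambda w: (-best[w], w))[:k]


def feature_matrix(scores, words):
    """One row per document; a word absent from a document gets 0.0."""
    return [[doc_scores.get(w, 0.0) for w in words] for doc_scores in scores]


# Hypothetical toy corpus standing in for the email payloads.
docs = [
    "buy cheap pills buy pills".split(),
    "meeting agenda attached".split(),
    "cheap flights buy tickets".split(),
]
scores = tfidf_per_document(docs)
words = choose_words(scores, 3)          # the "X, Y, Z" from the chat
matrix = feature_matrix(scores, words)   # row i = feature vector of document i
```

With so few selected words, a document that contains none of them ends up as an all-zero row, which is one reason the choice of words matters.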
23:16 -!- in3xes [~in3xes@180.149.49.230] has quit [Quit: Leaving]
23:17 -!- puneetgoyal [~chatzilla@117.197.160.53] has quit [Remote host closed the connection]
23:21 <blackburn> cu
23:21 -!- blackburn [~blackburn@109.226.100.113] has quit [Quit: Leaving.]
23:21 -!- puneetgoyal [~puneetgoy@117.197.160.53] has joined #shogun
--- Log closed Sat Dec 31 00:00:19 2011
