April 17, 2007

Comments

Aneel

"Constructing context from historical data involves streaming the data in. ... In short, such systems must incrementally learn from the past!"

Can you post more about why context *has* to be built up historically? Is there no other, perhaps better, way to deal with the time variable in establishing context without having to do it serially?

Thanks!
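
[A rough sketch, mine rather than from the post, of the serial construction the quoted passage describes: context is folded up one record at a time, so at any point the system only "knows" what the stream has already shown it. The entity/attribute records are made up for illustration.]

    from collections import defaultdict

    def build_context(stream):
        """Accumulate per-entity context incrementally from a stream of records."""
        context = defaultdict(set)
        for entity, attribute in stream:
            # Each new record can only extend what was already known when it
            # arrived; nothing is computed from the "future" of the stream.
            context[entity].add(attribute)
            yield entity, frozenset(context[entity])  # context as of this record

    records = [("acct-1", "addr:Main St"),    # hypothetical transactions
               ("acct-2", "phone:555-0100"),
               ("acct-1", "phone:555-0100")]
    for entity, ctx in build_context(records):
        print(entity, sorted(ctx))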

Esfandiar Bandari

Interesting. I believe this also depends on whether the transactions are causal or not. If you have a first-order Markov system/source generating the transactions, then your last transaction (or two) is adequate to tell you about the current transaction, so if you keep a large window you are not gaining anything.
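
[A quick simulation of that point, with an assumed two-state transition table: for a first-order Markov source, conditioning on the last two transactions predicts the next one no better than conditioning on the last one alone.]

    import random

    P = {"A": {"A": 0.9, "B": 0.1},   # assumed first-order transition probabilities
         "B": {"A": 0.5, "B": 0.5}}

    def step(state):
        return "A" if random.random() < P[state]["A"] else "B"

    prev, state = None, "A"
    win1, win2 = {}, {}               # counts keyed by 1-step and 2-step histories
    for _ in range(200_000):
        nxt = step(state)
        win1.setdefault(state, [0, 0])[nxt == "B"] += 1
        if prev is not None:
            win2.setdefault((prev, state), [0, 0])[nxt == "B"] += 1
        prev, state = state, nxt

    a = win1["A"]
    print("P(A | last=A)       ~", a[0] / sum(a))
    for h in [("A", "A"), ("B", "A")]:
        c = win2[h]
        print(f"P(A | last two={h}) ~", c[0] / sum(c))  # both ~0.9: the extra step adds nothing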

If you have a situation where random events/noise affect the transactions, or you have multiple overlapping inputs with different-order causalities, well... then it gets interesting, because you have to do signal/source separation as well as system identification. (Here a source refers to those things that generate the various types of causal transactions, not just the literal sources. For instance, you can be singing, banging on the drums, etc., all in an uncorrelated manner. You are the one person responsible, but the noises generated come from three different sources.)

Anyway, both problems, signal/source separation and system identification, are non-trivial (in the literal sense). The first involves guessing, or rather guesstimating, how many sources of transactions/signals there are, and their form or lag if amplitudes or priorities of any kind are involved (for discrete binary systems, this pretty much comes down to guessing the number of sources generating the causally related transactions). System identification, on the other hand, is then, for each source, the process of identifying how the past outputs are correlated to the new one plus noise.
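
[A hedged sketch of the system-identification half in its simplest linear form: fit an assumed AR(2) source by least squares, i.e. regress the new output on its own past outputs plus noise. The source-separation half, guessing how many sources there are in the first place, is the part this skips, and it is the harder one. The coefficients 0.6 and -0.3 are arbitrary assumptions.]

    import numpy as np

    rng = np.random.default_rng(0)
    y = np.zeros(2000)
    for t in range(2, len(y)):
        # Assumed system: new output = weighted past outputs + noise.
        y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + 0.1 * rng.standard_normal()

    # System identification: regress the current value on its own past.
    X = np.column_stack([y[1:-1], y[:-2]])        # lag-1 and lag-2 outputs
    coeffs, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    print(coeffs)                                 # recovers roughly [0.6, -0.3]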

Now things get really interesting if we have multi-layer systems. This is when some sources may be correlated with one or more other sources [but you don't know which ones, and you don't know the time lags either :-) ], and some may or may not be influenced by external effects. E.g., Google keeps people's searches around because if ... Jessica Alba gets in the news ... well, more people will search for her, and one index can serve very many :-) (this may sound odd, but one can think of these as distorted echoes).
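
[The "distorted echoes" picture can at least be made concrete for one pair of series. A sketch, with an assumed lag of 7 steps, that recovers the unknown lag by sliding one series against the other and keeping the best-correlated shift:]

    import numpy as np

    rng = np.random.default_rng(1)
    source = rng.standard_normal(500)
    echo = np.zeros(500)
    echo[7:] = 0.8 * source[:-7] + 0.2 * rng.standard_normal(493)  # distorted, lagged copy

    def corr_at(lag):
        # Correlation between the source and the echo shifted back by `lag` steps.
        return np.corrcoef(source[: len(source) - lag], echo[lag:])[0, 1]

    best = max(range(20), key=corr_at)
    print(best)  # -> 7, the assumed lag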

Even more interesting yet is when we have a feedback loop. Here system identification, even for one source, especially if non-linear transformations are involved, is... well, hella cool and then some, partly because it is darn near impossible to do :-) I said *near* impossible, so...

Rajat Chadha

Hi,

This is very interesting. Can you please tell me which companies are providing stream analytics solutions?
