
January 26, 2006

Comments


Fred M-D

Jeff,

I strongly agree with your "aggregate vs sequence results" perspective. The need to have systems neutralize the challenges posed by traditional runtime "race controls" and other nondeterministic factors inherent in distributed systems is key to solving many of the representative problems you cite.

With respect to the "temporal nature of request...", I again strongly support your position on the need to monitor and broadcast "substantial" changes in results based on parametric data. I would add that there's considerable value in having some persistent data available to manage the users and organizations that "used" information that has since changed, per the scenario discussed above. Decision makers may want to consider these insights in addition to leveraging the latest and greatest (temporal) perspective on the available data set.

I recognize this is probably covered in other discussion threads, but I have to bring up the importance of disambiguation. Applying a plethora of text analytics technology to minimize ambiguities is key to these challenges.

Regards, from a long-time SRD (and now IBM Entity Analytics) fan, Fred M-D

Solange Azevedo

Dear Jeff James,

I am a Brazilian journalist and I would like to interview you. I can explain the aim of the interview by e-mail. Could you give me your e-mail address?

Best Regards,

Solange

Bob Aman

I actually implemented this concept in an Arabic search engine I wrote for a school project in my data mining class. But it had a lot more to do with the fact that it was more efficient to implement it so that the data was run past the queries rather than the queries run on top of the data. But I used the same argument you present here while convincing my professor that I should get a decent grade for it.

Man that was a tough class.
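
The "data run past the queries" arrangement can be sketched roughly as follows. This is only an illustration under stated assumptions: the `StandingQueries` class, its `register` and `offer` methods, and the simple all-terms-must-appear matching rule are hypothetical and are not taken from the search engine described above.

```python
from collections import defaultdict


class StandingQueries:
    """Hypothetical store of persistent queries; arriving text is matched against them."""

    def __init__(self):
        self._by_term = defaultdict(set)  # term -> ids of queries that need it
        self._terms = {}                  # query id -> full set of required terms

    def register(self, query_id, terms):
        """Persist a query so that it is evaluated against every later document."""
        required = {t.lower() for t in terms}
        self._terms[query_id] = required
        for term in required:
            self._by_term[term].add(query_id)

    def offer(self, text):
        """Run one arriving document past the stored queries; return the ids that match."""
        words = set(text.lower().split())
        candidates = set()
        for word in words:
            candidates |= self._by_term.get(word, set())
        # A query matches only when all of its required terms are present.
        return sorted(q for q in candidates if self._terms[q] <= words)


queries = StandingQueries()
queries.register("q1", ["shark", "ironman"])
queries.register("q2", ["shark"])
matches = queries.offer("Shark sighted near Ironman course")
```

Only the queries that share at least one term with the document are examined, which is one plausible reason this direction can be cheaper than re-running every query over the whole data set.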

Chris Anderson

Thanks for giving me a word for it. I've been striving for "sequence neutrality" in my music aggregator, Grabb.it, and now that I can name it, it's a lot easier to whiteboard.

Sreenath Chary

Jeff, isn't sequence neutrality as described in your blog the same as a "declaring an interest in and wanting to be informed when it happens" type of setup?

Jakub Kotowski

It also looks similar to the backward chaining vs. forward chaining problem (or trade-off) in reasoning.

Mike French

There are 3 things being conflated and confused here:

- Publish-subscribe; registering or "declaring an interest in and wanting to be informed when it happens" (@Sreenath), instead of periodically polling for changes and pulling the increments; "data was run past the queries rather than the queries run on top of the data" (@Bob). So we have streaming data, standing query and asynch push to client.

- Order independence of operators; depending on what you want, this is the associative, commutative, or distributive property of the operator algebra. Idempotency is important too, because it affects whether you need to de-dup. Generalize this to Category Theory, and you have catamorphism, anamorphism, hylomorphism (aka Bananas, Lenses, Envelopes and Barbed Wire); for example, map-reduce relies on catamorphisms (as has been known since ca. 1969, through the Bird-Meertens Formalism).

- Non-monotonic logical inference; previous inferences can be undone by new data, something that was assumed true is no longer true; this is more like the idea of backtracking, than backward v. forward inference direction; inferences should be firing in all directions at all times, the only question is how are conflicts resolved; one can imagine asynch agents colliding on a fact, then negotiating with each other about their confidence, explaining (exchanging) their evidence, and coming to an agreed conclusion, or an agreement to differ and continue anyway - you just get a multivalued result, so leave it to someone else to arbitrate and decide later. Monotonic logics are often equivalent to a 'closed world assumption', i.e. all the data/facts are already known, bounded and accessible, whereas we all know that the real world is open, liable to accretion and nonmonotonicity.
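
As a rough illustration of the order-independence point, the sketch below folds observations into a per-entity set. Set union is associative, commutative and idempotent, so neither arrival order nor duplicates change the final state. The `merge` and `aggregate` helpers and the sample observations are invented for this example.

```python
from functools import reduce
from itertools import permutations


def merge(state, observation):
    """Fold one (entity, attribute) observation into the state without mutating it."""
    entity, attribute = observation
    updated = dict(state)
    # Set union makes the fold insensitive to order and to repeated observations.
    updated[entity] = state.get(entity, frozenset()) | {attribute}
    return updated


def aggregate(observations):
    """Reduce a sequence of observations to a dict of entity -> frozenset of attributes."""
    return reduce(merge, observations, {})


observations = [
    ("alice", "phone:555-0100"),
    ("bob", "city:Perth"),
    ("alice", "city:Perth"),
    ("alice", "phone:555-0100"),  # duplicate, absorbed by idempotency
]

# Every arrival order produces the same aggregate.
results = [aggregate(order) for order in permutations(observations)]
all_equal = all(result == results[0] for result in results)
```

Swapping the union for an operator without these properties (appending to a list, say) would make `all_equal` false, which is the failure mode the operator algebra is meant to rule out.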

So 'sequence neutrality' means being careful about incremental information flow, operator algebras, and logical commitment to partial inferences.
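
The "logical commitment to partial inferences" part can be sketched with the textbook default-reasoning example, assuming conclusions are re-derived from the full set of facts whenever they are asked for. The `Beliefs` class and the bird/penguin rule are hypothetical and chosen only for brevity.

```python
class Beliefs:
    """Hypothetical fact store whose conclusions are always re-derived from all facts."""

    def __init__(self):
        self.facts = set()

    def add(self, fact):
        """Record a (predicate, subject) fact; facts may arrive in any order."""
        self.facts.add(fact)

    def conclusions(self):
        """Apply the default rule 'birds fly unless known to be penguins'."""
        birds = {subject for predicate, subject in self.facts if predicate == "bird"}
        penguins = {subject for predicate, subject in self.facts if predicate == "penguin"}
        return {("flies", subject) for subject in birds if subject not in penguins}


beliefs = Beliefs()
beliefs.add(("bird", "tweety"))
before = beliefs.conclusions()   # the default inference holds for now
beliefs.add(("penguin", "tweety"))
after = beliefs.conclusions()    # the later fact undoes the earlier inference
```

Because nothing is committed permanently, the final conclusions depend only on which facts have arrived, not on the order in which they arrived.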

Mike
