The Fast Last Puzzle Piece
The notion that the more
data, the slower the system – ain’t always true.
My favorite way to
explain this very important phenomenon involves the familiar process of
assembling a jigsaw puzzle.
The first piece you take
out of the box and place on the work surface requires very little computational
effort. The second and third pieces require almost equally insignificant
mental effort. Then, as the number of pieces on the table grows, the effort
to determine where the next piece goes increases as well. But there is a
tipping point after which determining where to place the next piece gets
easier and easier … despite the fact that the number of puzzle pieces on the table
continues to grow.
Well isn’t it
interesting, although obvious, that those last few puzzle pieces take nearly as
little effort as the first few!
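For readers who like to see the curve, here is a minimal sketch of that rise and fall. It is a toy model, not a description of any real system: it assumes the effort to place the next piece is proportional to the number of open slots bordering the pieces already on the table, and it grows the puzzle outward from a single starting piece. The function name `frontier_sizes`, the grid shape, and the center starting cell are all hypothetical choices made for illustration.

```python
import random


def frontier_sizes(rows, cols, seed=0):
    """Fill a rows x cols grid one cell at a time.

    Before each placement, record how many open slots are candidates for
    the next piece. Assumption: that count stands in for "effort".
    """
    rng = random.Random(seed)
    placed = set()
    # Hypothetical starting point: the first piece goes in the center cell.
    frontier = {(rows // 2, cols // 2)}
    sizes = []
    while frontier:
        sizes.append(len(frontier))
        # Pick any open slot; sorting keeps the run reproducible for a seed.
        cell = rng.choice(sorted(frontier))
        frontier.remove(cell)
        placed.add(cell)
        r, c = cell
        # Every empty in-bounds neighbor of the new piece becomes an open slot.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in placed:
                frontier.add((nr, nc))
    return sizes


sizes = frontier_sizes(10, 10)
peak = max(sizes)
print("first:", sizes[0], "peak:", peak, "at step", sizes.index(peak), "last:", sizes[-1])
```

Under these assumptions the count of open slots starts at one, climbs while the assembled region is still spreading, and falls back to one for the final piece, even though the number of pieces on the table only ever grows.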
I have witnessed
this.
This has a slew of
ramifications.
This does not apply to
all domains. This behavior requires: (a) observations from the same
universe; (b) observations with enough features to enable contextualization;
(c) observations from which these features can be extracted, enhanced and
classified; (d) sufficient saturation of the observational space; and (e)
enough smarts to stitch these puzzle pieces together.
Context-accumulating
systems, fed appropriate observations, can be expected to exhibit this behavior.
RELATED POSTS:
More Data is Better, Proceed With Caution
Context: A Must-Have and Thoughts on Getting Some …
To Know Semantic Reconciliation is to Love Semantic Reconciliation
Big Breakthrough in Performance: Tuning Tips for Incremental Learning Systems
I like using the notion of perimeter. It starts from scratch and increases to a tipping point, after which each new piece decreases the perimeter rather than increasing it, and the perimeter then shrinks rapidly. Software tends to be similar IMHO, not just in searching information but also in the programmer effort of using APIs. Designing software so that it has maximum utility with minimum perimeter is an ideal I aspire to.
Posted by: Jason Watkins | September 29, 2008 at 08:45 PM
This happens because when the memory in the system is full (when it contains the most data in the middle) it will take the longest to process the data. I really like your comparison to a jigsaw puzzle, because that makes tons of sense.
Posted by: Jigsaw Free | January 09, 2009 at 05:25 PM