[Israel.pm] sliding window on streaming lines

Gaal Yahas gaal at forum2.org
Sat Jun 5 14:26:56 PDT 2004


On Sat, Jun 05, 2004 at 11:21:30PM +0300, Shlomo Yona wrote:
> > I'll take that as an invitation to ignore the product output part of the
> > problem :)
> 
> Actually, you've ignored the most interesting part -- the
> part which handles the possible combinations of every
> window's content...

But in your original post you said,

> I wonder if you mongers can suggest an elegant way of
> processing such input data, given that the input comes from
> a file (of unknown number of lines -- so slurping it into
> some array may not be practical).   

Emphasizing the input suggested that's where you were looking for help.
I didn't say so in my reply so far, but I don't think you can do much
better than holding $N clusters in memory at any given moment. (My
implementation actually holds at most $N+1, but at some expense of
clarity that is fixable.) (Well, you could play seek games on the file
if you were desperate, but that sucks.[1])

Anyway, here's a straightforward idea (not code, not nifty) for the
combinations. It requires only one level of nesting.

Hold an array of size @window, with each element initialized to 1 and
representing the 1-based index of the current datum inside each cluster.

Start at one end of the array. Increment the index modulo that cluster's
size + 1; if overflow, increment this index again and treat the next
element the same way. Emit the current state every time we're done fixing
up overflows. Stop when the last element triggered an overflow.


[1] It sucks less if you use Tie::File.

-- 
Gaal Yahas <gaal at forum2.org>
http://gaal.livejournal.com/



More information about the Perl mailing list