[Israel.pm] binary vectors representation as sparse vectors

Shlomo Yona shlomo at cs.haifa.ac.il
Tue Jun 15 08:33:09 PDT 2004


I ran some trials on several large data sets, and the
compression achieved by the sparse vector representation
is very impressive:

The ordinary binary vectors (recall that each vector has
about 30,000-50,000 items; the exact number is determined
up front and stays fixed) take up 6GB, while the sparse
vector representation (recall that at most about 50 items
get a nonzero value) takes up less than 10MB.
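
To make the two representations concrete, here is a minimal
sketch, written in Python for brevity rather than Perl. It is
not the code used in the trials: the function names to_sparse
and to_dense, the vector length of 40,000 and the three set
positions are assumptions chosen only for illustration. Since
the vectors are binary, the sparse form only has to record
which positions are nonzero.

```python
def to_sparse(dense):
    """Return the sorted indices of the nonzero entries of a binary vector."""
    return [i for i, bit in enumerate(dense) if bit]


def to_dense(indices, length):
    """Rebuild the full 0/1 list from the nonzero indices and the fixed length."""
    dense = [0] * length
    for i in indices:
        dense[i] = 1
    return dense


# Hypothetical example: a 40,000-item vector with three nonzero items.
length = 40_000
dense = [0] * length
for i in (3, 1_024, 39_999):
    dense[i] = 1

sparse = to_sparse(dense)
print(sparse)                             # [3, 1024, 39999]
print(to_dense(sparse, length) == dense)  # True
```

The dense list stores one entry per item, while the sparse
list stores one entry per nonzero item, which is where the
saving comes from when at most about 50 of the 30,000-50,000
items are set.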

This is very impressive and, of course, justifies using
the sparse representation. Now the challenge is to use it
without making the implementation of the algorithms ugly
with structural code (see the discussion I started
about how to tie LoLs).
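
One way to keep the algorithms clean is to hide the sparse
storage behind an object that is indexed like an ordinary
vector, which is roughly what tie offers in Perl. The sketch
below illustrates that idea in Python; it is not the solution
from the tie discussion, and the class SparseBinaryVector, its
nonzero method and the dot function are hypothetical names
introduced only for this example.

```python
class SparseBinaryVector:
    """Fixed-length binary vector that stores only the nonzero positions."""

    def __init__(self, length, ones=()):
        self._length = length
        self._ones = set()
        for i in ones:
            self[i] = 1

    def _check(self, index):
        # Only indices 0 .. length-1 are valid (no negative indexing).
        if not 0 <= index < self._length:
            raise IndexError(index)

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        self._check(index)
        return 1 if index in self._ones else 0

    def __setitem__(self, index, value):
        self._check(index)
        if value:
            self._ones.add(index)
        else:
            self._ones.discard(index)

    def nonzero(self):
        """Return the sorted positions that hold a 1."""
        return sorted(self._ones)


def dot(u, v):
    """Dot product written as if u and v were plain dense vectors."""
    return sum(u[i] * v[i] for i in range(len(u)))


u = SparseBinaryVector(40_000, [3, 1_024])
v = SparseBinaryVector(40_000, [1_024, 39_999])
print(dot(u, v))    # 1
print(u.nonzero())  # [3, 1024]
```

The dot function never mentions the sparse storage, so the
algorithm code stays the same as for dense vectors. The price
is that it still visits every index; an algorithm that cares
about speed would work on nonzero() directly.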

Shlomo Yona
shlomo at cs.haifa.ac.il
