[Israel.pm] my new module on CPAN

Pinkhas Nisanov pinkhas at nisanov.com
Tue Mar 6 01:36:57 PST 2012

Thanks for the comments, they are really helpful. I know that my
documentation is easy to misunderstand (I made 3 releases
fixing docs :-) ), and comments like these will help me improve it.
My answers are inline, under your questions.

On Tue, Mar 6, 2012 at 8:16 AM, Gabor Szabo <gabor at szabgab.com> wrote:

> Not that I understand what it should do, but I installed and tried it.
> I guess people who know what Markov Cluster Algorithm is should know
> what is this.

Here is an example from a friend of mine who works in a Coca-Cola
factory. The factory gets a lot of orders from all over the country,
and they change every day. The factory also has many trucks that have
to deliver bottles to the order points.
Problem: what is the best way to group the orders so that each truck's
route is as short as possible?
You solve it by building a graph in which every vertex is an order point
and the weight of an edge is the path distance between its two points.
Grouping the order points into clusters helps you solve the problem.
There are many companies that have to build such clusters for
huge amounts of data, and MCL can be useful there because of its
scalability and simplicity.
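
A minimal sketch of such a graph in plain Perl, to make the idea concrete.
The point names (A..E) and the distances are made up for illustration, and
the neighbours() helper is hypothetical; none of this is the module's API.

```perl
use strict;
use warnings;

# Made-up order points and path distances between them.
my @edges = (
    [ 'A', 'B', 2 ],
    [ 'B', 'C', 3 ],
    [ 'C', 'D', 40 ],
    [ 'D', 'E', 1 ],
);

# Undirected weighted graph: $graph{$u}{$v} is the distance from $u to $v.
my %graph;
for my $edge (@edges) {
    my ($u, $v, $distance) = @$edge;
    $graph{$u}{$v} = $distance;
    $graph{$v}{$u} = $distance;
}

# Hypothetical helper: sorted list of points directly connected to $point.
sub neighbours {
    my ($g, $point) = @_;
    return sort keys %{ $g->{$point} || {} };
}

print join(', ', neighbours(\%graph, 'C')), "\n";
```

In this toy data A, B, C are close to each other and so are D, E, with one
long edge between the two groups, which is the kind of structure a
clustering step is supposed to find.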

> Some minor comments:
> What is unclear from the SYNOPSIS is what is "MyClass" in there
> (and if you are using that already, I'd recommend MyClass->new
> and not the indirect notation of new MyClass.
> Why do you need to use scalar references there?
> My feeling is that the example should have the original data in an array
> of pairs that would be passed to the addEdge method.
I want the module to be as simple to use as possible. IMHO the simplest
way to load a graph into the module is to add all of the graph's edges.
Every vertex can be part of many edges, so it will be passed to the
module many times. The problem is to find out whether a vertex was
already loaded as part of another edge. I solved this by deciding
that only references are passed to the module. I just wanted to show
that the input for loading is references and the output is a list of
references. Maybe I should write longer examples.
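
A minimal sketch of why references make this easy, assuming vertices are
identified by their reference address. The add_edge() sub and the two
hashes are hypothetical and only illustrate the idea; they are not the
module's real internals.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my (%vertex, %adjacent);

# Hypothetical loader: the same reference always maps to the same key,
# so a vertex that appears in many edges is stored only once.
sub add_edge {
    my ($u, $v) = @_;
    for my $ref ($u, $v) {
        $vertex{ refaddr($ref) } = $ref;
    }
    $adjacent{ refaddr($u) }{ refaddr($v) } = 1;
    $adjacent{ refaddr($v) }{ refaddr($u) } = 1;
    return;
}

my ($name1, $name2, $name3) = ('order 1', 'order 2', 'order 3');
my ($v1, $v2, $v3) = (\$name1, \$name2, \$name3);

add_edge($v1, $v2);
add_edge($v2, $v3);    # $v2 is passed again but is not duplicated

printf "%d vertices\n", scalar keys %vertex;    # prints "3 vertices"
```

Two different scalars holding equal values would still be two different
vertices here, because only the address of the reference is compared.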

> In the docs I'd link to PDL  with   L<PDL>
> and it seems it need a bit more documentation.
> You can tell in the Makefile.PL where is your public version control
> system for this module.
> Having one helps getting patches.
> I looked at the tests too:
> ok(1/2 == $matrix1->at(1, 1), "stochastic 1");
> could be better written as
> is($matrix1->at(1, 1), 1/2, "stochastic 1");
> ok(includeVertex($cluster1, $val4) > 0, "vertex is not in cluster - 1");
> could be better written as
> cmp_ok(includeVertex($cluster1, $val4), '>', 0, "vertex is not in cluster - 1");
I agree with those comments.
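
A small self-contained sketch of the suggested style. The values are
stand-ins (in the real tests they come from the matrix and the clusters).
The advantage over a bare ok() is that on failure is() and cmp_ok()
report the value they got and the value they expected.

```perl
use strict;
use warnings;
use Test::More;

# Stand-in values, only for illustration.
my $value = 1 / 2;
my $count = 2;

# is() compares with "eq", so both sides are compared as strings.
is($value, 1/2, 'stochastic 1');

# cmp_ok() takes the comparison operator as its second argument.
cmp_ok($count, '>', 0, 'count is positive');

done_testing();
```

For floating point values that cannot be represented exactly, cmp_ok()
with '==' or a comparison against a small tolerance is probably safer
than is().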

> You could randomly generate a big data set and check if it does not crash,
> does not leak memory and if it works in a reasonable time. To some value of
> reasonable. Without actually checking correctness.

I thought about a randomly generated big graph. My main concern for the
first release is the correctness of the implementation, and a randomly
generated graph does not help with that. I hope I'll find real data
somewhere.
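
For a later release, the input for such a smoke test could be generated
roughly like this. It is only a sketch: random_edges() is a hypothetical
helper, the sizes are arbitrary, duplicate edges are allowed, and the
clustering call itself is left out.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: returns a reference to an array of [ $u, $v ]
# pairs, where $u and $v are distinct scalar references (no self-loops).
sub random_edges {
    my ($vertex_count, $edge_count) = @_;
    die "need at least 2 vertices\n" if $vertex_count < 2;

    # One fresh scalar reference per vertex.
    my @vertices = map { my $label = "v$_"; \$label } 1 .. $vertex_count;

    my @edges;
    while (@edges < $edge_count) {
        my $u = $vertices[ int rand @vertices ];
        my $v = $vertices[ int rand @vertices ];
        next if $u == $v;    # same reference, skip self-loop
        push @edges, [ $u, $v ];
    }
    return \@edges;
}

srand(42);    # fixed seed so a failing run can be repeated
my $start = time;
my $edges = random_edges(1_000, 5_000);
# The edges would be loaded into the module and clustered here.
my $elapsed = time - $start;

printf "%d edges generated in %.3f s\n", scalar @$edges, $elapsed;
```

This only checks that nothing crashes and that the run time stays
reasonable; it says nothing about whether the clusters are correct.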

> Finally, I think I'd ask on the PDL mailing list. They probably have a
> lot more insight in this.
> I hope some of these will help!

They definitely help, thanks!
