[Israel.pm] my first program

Gaal Yahas gaal at forum2.org
Sun Aug 5 00:32:08 PDT 2012

I wanted to mention one other thing -- how do you know this thing works?

When you're developing it, you probably ran it several times. This is fine
to do occasionally, but it has a few problems:
- it puts load on Gabor's servers
- it's slow! you have to fetch a bunch or resources from the network every
time and uncompress them.
- when you have a bug, you have to figure out what's wrong from looking at
the output of the entire program. This can be hard.
- it's possible you have a bug in the statistics code that doesn't crash
the program, but which causes the wrong output! You'd never find it by
looking at the report, unless you had some prior knowledge about the actual

Testing is a big subject, and I don't know how well you're already familiar
with it, so I'll just again give a few general remarks:

- The main point I'm making here is that your script is a single flow of
code, and if you break it down to functions (sub get_page, sub analyze_page
etc.) you can cause only specific parts of your code to run, with input
that's under your control, and verify that the output is what you expect.

- You might be tempted to comment out pieces of code / add flags like $TEST
or whatever -- it's okay here and there during development, but it doesn't
scale well down the line. You end up with messy code or things you have to
clean up before release, and have to repeat some work every time you want
to test something.

- Whatever works. Perl has spectacular support for testing, but it means
it's easy to get overwhelmed by the options. Start with Test::More or
something else that's well-known.

On Sun, Aug 5, 2012 at 9:51 AM, Gaal Yahas <gaal at forum2.org> wrote:

> Since you asked for comments, here are a few general ones.
> - You use autodie, but aren't telling it to capture errors in system. The
> most likely part of your program to fail is the network part (what you
> currently delegate to wget), so it's probably the one you want the most
> control over. Currently you'd fail silently. If you say "use autodie
> qw(:all)" you'd die on the first wget failure. What would you like to do?
> I'm guessing that ideally, you should retry (or have wget retry). But at
> the very least, you want good error messages.    To summarize this point:
> don't give up contorl of error reporting in the heart of what you're trying
> to accomplish.
> - You're using use strict, which is very good, but notice that the way
> your file handles are named, they are unchecked globals. Avoid
> all-uppercase, sigilless names (FILE and so one) if you want protection;
> use lexicals instead:
>   open my $fh, "<", "myfile" or die ...
> (BTW, you have or die here and there, despite autodie. I'd recommend for
> now just getting rid of autodie entirely and doing all the errors yourself.)
> When you use lexical filehandles, the files are closed automatically when
> the variable goes out of scoped. This is great for reads. For writes, you
> should always close the file yourself to check for errors. There are fancy
> ways of doing that automatically but never mind them for now.
> - It's not unreasonable to use external tools in your script. Whatever
> gets the job done. Of course there are tradeoffs: you comment that you
> don't know if these tools are available on other platforms. Well, modules
> for gzip and http may not be available on all platforms either -- take a
> look at corelist if you want to make educated guesses about what to expect.
> - Syntactically, this:
>   if (CONDITION 1) {
>     CODE 1
>   } if (CONDITION 2) {
>     CODE 2
>   } if (CONDITION 3) {
>     CODE 3
>   } ...
>   checks all the conditions and may run all the branches. I think you
> meant "elsif" there.
> - Semantically, what you're doing is extracting a bunch of values from a
> hash into a number of scalars. Why? This has a few costs:
> 1. You have to pick names for these. If you ever scratch your head about
> the name of a variable, and end up with something mundane like "first",
> something's wrong. (I mean, first is perfectly all right if its scope is
> very small, and within that block of code it's clear what it's intended to
> refer to. But here the scope of these variables is pretty much the whole
> program, and the reader is confounded with arbitrary distinctions like
> "first" and "firs".
> 2. It's inefficient in code. You have a 20 lines of code that aren't
> furthering your goal as a programmer here. Couldn't the user of this data
> may as well just say $names{$some_name}?
> 3. It's inefficient in performance. Your for loop runs as many times as
> there are elements in %names, and performs ten conditions. So you have 100
> branches in your code where you didn't need any. In this case, it doesn't
> matter; but you have to keep these things in mind for when you have larger
> data structures.
> - Again on naming, "percent" vs. "per" -- the reader can't tell what these
> mangled names are for.
> - I see you're using "say", which is a new language feature. If you're
> doing that, you may as well know about and use given/when instead of
> if/else, though like I said, it's not actually required here.
> Keep having fun,
> And I hope this is useful,
> Gaal
> On Sun, Aug 5, 2012 at 3:39 AM, moshe nahmias <moshegrey at ubuntu.com>wrote:
>> Hi all,
>> I just uploaded my first little program to github. I want your feedback
>> on the script.
>> the script is intended to make the statistics for mailing groups from
>> their archives, for now it only works on linux (or at least if wget and
>> gzip are installed).
>> So, what do you think?
>> I know its not perfect so feel free to teach me, after all, that's one of
>> the reasons to publish this :)
>> the link: https://github.com/moshe742/mail-stats.git
>> Moshe
>> _______________________________________________
>> Perl mailing list
>> Perl at perl.org.il
>> http://mail.perl.org.il/mailman/listinfo/perl
> --
> Gaal Yahas <gaal at forum2.org>
> http://gaal.livejournal.com/

Gaal Yahas <gaal at forum2.org>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.perl.org.il/pipermail/perl/attachments/20120805/88334566/attachment-0001.htm 

More information about the Perl mailing list