[Israel.pm] my first program

moshe nahmias moshegrey at ubuntu.com
Sun Aug 5 12:18:49 PDT 2012


First of all, thanks for your comments.

On Sun, Aug 5, 2012 at 10:32 AM, Gaal Yahas <gaal at forum2.org> wrote:

> I wanted to mention one other thing -- how do you know this thing works?
>
> When you're developing it, you probably ran it several times. This is fine
> to do occasionally, but it has a few problems:
> - it puts load on Gabor's servers
> - it's slow! you have to fetch a bunch or resources from the network every
> time and uncompress them.
> - when you have a bug, you have to figure out what's wrong from looking at
> the output of the entire program. This can be hard.
> - it's possible you have a bug in the statistics code that doesn't crash
> the program, but which causes the wrong output! You'd never find it by
> looking at the report, unless you had some prior knowledge about the actual
> stats.
>
> Testing is a big subject, and I don't know how well you're already
> familiar with it, so I'll just again give a few general remarks:
>
> - The main point I'm making here is that your script is a single flow of
> code, and if you break it down to functions (sub get_page, sub analyze_page
> etc.) you can cause only specific parts of your code to run, with input
> that's under your control, and verify that the output is what you expect.
>
> - You might be tempted to comment out pieces of code / add flags like
> $TEST or whatever -- it's okay here and there during development, but it
> doesn't scale well down the line. You end up with messy code or things you
> have to clean up before release, and have to repeat some work every time
> you want to test something.
>
> - Whatever works. Perl has spectacular support for testing, but it means
> it's easy to get overwhelmed by the options. Start with Test::More or
> something else that's well-known.
>
>
Right, I really need to start making tests for this and do it in parts, I
think this will be for my next version of the script (tests and building it
in smaller chunks).


> On Sun, Aug 5, 2012 at 9:51 AM, Gaal Yahas <gaal at forum2.org> wrote:
>
>> Since you asked for comments, here are a few general ones.
>>
>> - You use autodie, but aren't telling it to capture errors in system. The
>> most likely part of your program to fail is the network part (what you
>> currently delegate to wget), so it's probably the one you want the most
>> control over. Currently you'd fail silently. If you say "use autodie
>> qw(:all)" you'd die on the first wget failure. What would you like to do?
>> I'm guessing that ideally, you should retry (or have wget retry). But at
>> the very least, you want good error messages.    To summarize this point:
>> don't give up contorl of error reporting in the heart of what you're trying
>> to accomplish.
>>
>>
> OK, so I removed the autodie, but can i use it somehow and still have my
own warnings? (want to learn, so a link will be good enough).
About the wget, I meant the program on linux distro's, not something on
perl itself.
How do I get the errors of a system error? I use system() for the wget and
gzip so if there is an error i can see it in the terminal, but if I will be
able to get the errors it can help if and when there will be problems and
save them for a time I can look at them.

> - You're using use strict, which is very good, but notice that the way
>> your file handles are named, they are unchecked globals. Avoid
>> all-uppercase, sigilless names (FILE and so one) if you want protection;
>> use lexicals instead:
>>
>>   open my $fh, "<", "myfile" or die ...
>>
>> (BTW, you have or die here and there, despite autodie. I'd recommend for
>> now just getting rid of autodie entirely and doing all the errors yourself.)
>>
>> When you use lexical filehandles, the files are closed automatically when
>> the variable goes out of scoped. This is great for reads. For writes, you
>> should always close the file yourself to check for errors. There are fancy
>> ways of doing that automatically but never mind them for now.
>>
>> I understand from your comment that I can check for errors when closing a
file, how? (again, if you prefer giving a link that's OK too).


> - It's not unreasonable to use external tools in your script. Whatever
>> gets the job done. Of course there are tradeoffs: you comment that you
>> don't know if these tools are available on other platforms. Well, modules
>> for gzip and http may not be available on all platforms either -- take a
>> look at corelist if you want to make educated guesses about what to expect.
>>
>> Didn't understand what you mean, I used the linux programs wget and gzip
in this script, thus i don't know if those programs are available for
windows for example, thus I don't know if it will work on windows.


> - Syntactically, this:
>>
>>   if (CONDITION 1) {
>>     CODE 1
>>   } if (CONDITION 2) {
>>     CODE 2
>>   } if (CONDITION 3) {
>>     CODE 3
>>   } ...
>>
>>   checks all the conditions and may run all the branches. I think you
>> meant "elsif" there.
>>
>> I use it this way because it doesn't get the right numbers and make
mistakes if I use elsif (at least on some of the years, don't remember why).


> - Semantically, what you're doing is extracting a bunch of values from a
>> hash into a number of scalars. Why? This has a few costs:
>>
>> 1. You have to pick names for these. If you ever scratch your head about
>> the name of a variable, and end up with something mundane like "first",
>> something's wrong. (I mean, first is perfectly all right if its scope is
>> very small, and within that block of code it's clear what it's intended to
>> refer to. But here the scope of these variables is pretty much the whole
>> program, and the reader is confounded with arbitrary distinctions like
>> "first" and "firs".
>>
>> I have refactored the names, it's better, not great.
The reason for the names is that $first is the one with the most mails on
this year, so may be the name isn't great but I don't know if there is a
much better choice. I am open to suggestions.

2. It's inefficient in code. You have a 20 lines of code that aren't
>> furthering your goal as a programmer here. Couldn't the user of this data
>> may as well just say $names{$some_name}?
>>
>> The point of these 20 lines is to get the numbers of the most active
participants, I guess there are other ways (and may be better ways) but I
don't know them yet.


> 3. It's inefficient in performance. Your for loop runs as many times as
>> there are elements in %names, and performs ten conditions. So you have 100
>> branches in your code where you didn't need any. In this case, it doesn't
>> matter; but you have to keep these things in mind for when you have larger
>> data structures.
>>
>> So, what you say is that i should find a way to make it run only on the
"right" parts of my variable? the big question is how can I do that?


> - Again on naming, "percent" vs. "per" -- the reader can't tell what these
>> mangled names are for.
>>
>> - I see you're using "say", which is a new language feature. If you're
>> doing that, you may as well know about and use given/when instead of
>> if/else, though like I said, it's not actually required here.
>>
>>
>> Keep having fun,
>> And I hope this is useful,
>> Gaal
>>
>> Thanks again for your comments, it's very helpful and I think I learnt
some things already :)


> On Sun, Aug 5, 2012 at 3:39 AM, moshe nahmias <moshegrey at ubuntu.com>wrote:
>>
>>> Hi all,
>>> I just uploaded my first little program to github. I want your feedback
>>> on the script.
>>> the script is intended to make the statistics for mailing groups from
>>> their archives, for now it only works on linux (or at least if wget and
>>> gzip are installed).
>>>
>>> So, what do you think?
>>> I know its not perfect so feel free to teach me, after all, that's one
>>> of the reasons to publish this :)
>>> the link: https://github.com/moshe742/mail-stats.git
>>>
>>> Moshe
>>>
>>> _______________________________________________
>>> Perl mailing list
>>> Perl at perl.org.il
>>> http://mail.perl.org.il/mailman/listinfo/perl
>>>
>>
>>
>>
>> --
>> Gaal Yahas <gaal at forum2.org>
>> http://gaal.livejournal.com/
>>
>
>
>
> --
> Gaal Yahas <gaal at forum2.org>
> http://gaal.livejournal.com/
>
> _______________________________________________
> Perl mailing list
> Perl at perl.org.il
> http://mail.perl.org.il/mailman/listinfo/perl
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.perl.org.il/pipermail/perl/attachments/20120805/ec58802a/attachment.htm 


More information about the Perl mailing list