[Israel.pm] What advantage would Perl give to a University student?

Avishalom Shalit avishalom at gmail.com
Mon Mar 23 08:52:13 PDT 2009


a reprise

2009/3/23 Avishalom Shalit <avishalom at gmail.com>:
> well, Matlab has a limited embedded perl interpreter, so
> theoretically you could do this from within,
> but i find it easier to work outside of matlab with perl.
>
> let me give you a use case or two.
>
> imagine a standard server log file that contains a url, a referrer and
> an ip address,
> now imagine it is 1GB in size,
> and you want to ask some questions.
>
> a- map the traffic inside the website, counting both unique visits
> and total visits.
> example question: find the most frequent cycle of pages longer than 5 links.
> b- plot the histograms for referrers, page hits and ip activity
>
> so, once you have the data in the right format it is easy with matlab.
> BUT matlab is a bit heavy and slow on text processing
> (especially if the delimiting character isn't a space).
>
> so in this case i would use perl to create two or three dictionaries.
> e.g.
>
> perl -F, -anle '$urldict[$urlnum{$F[0]}=$pagecounter++]=$F[0] unless
> $seen{$F[0]}++; .... print $urldict[$F[0]] .....}'
> and sometimes even -i
>
> (or i would have used a file with "strict" and "my" of course :-) )
>
> now i have a lookup file (because i printed it in the END to a file)
> 0 www.google.co
> 1 www.bbc.co.uk
> etc .
> and another
>
> 0 123.123.123.123
> 1 321.321.321.321
>
> etc.
>
>
> and the main log file looks like this
> 0 1 0
> 0 1 0
> 0 1 1
> 2 1 0
> 2 1 2
> 3 0 2
>
> ......
> this, matlab slurps in a second and recognizes as numerical data
> (even though on disk it is strings, i.e. a text file).
> if your urls contain some non-english characters (?query=שדג)
> you have no choice, even if you were willing to let matlab sweat over some strings.
>
> the dictionary files are much smaller now and pose no problem to read
> and use as labels.
>
> {then, using either accumarray, or sparse, i get this into an
> adjacency matrix inside matlab.  etc }
>
>
> ------
> another use case would be a matlab script that does a certain
> computation (that may take a few hours) over different parameters.
> this could be done internally, but driving it from an outside script
> has some advantages, e.g. multiple cores, or unstable machines that drop
> your computation (because someone didn't plug the fan in and it got hot).
>
> 2009/3/23 Gabor Szabo <szabgab at gmail.com>:
>> On Thu, Mar 12, 2009 at 2:29 PM, Avishalom Shalit <avishalom at gmail.com> wrote:
>>> to cover a different benefit.
>>
>>>
>>> I have often found myself preformatting data files (for example to be
>>> used in matlab) with perl.
>>> i may have been able to do this with awk, but i am not fluent in awk.
>>
>> I never used Matlab but I often encounter people in my classes who are talking
>> about using Perl and  Matlab together. So far I have not managed to understand
>> what this means. I'd really appreciate if you wrote a couple of examples on
>> how you used the two together (and why :-).
>>
>>
>> regards
>>  Gabor
>>
>> --
>> Gabor Szabo                     http://szabgab.com/blog.html
>> Perl Training in Israel         http://www.pti.co.il/
>> Test Automation Tips        http://szabgab.com/test_automation_tips.html
>> _______________________________________________
>> Perl mailing list
>> Perl at perl.org.il
>> http://mail.perl.org.il/mailman/listinfo/perl
>>
>
>
>
> --
> -- vish
>
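
for anyone who prefers the "strict" and "my" version of the dictionary step quoted above, here is a minimal sketch, not the exact script i ran. it assumes comma separated lines of url,referrer,ip, and it assumes urls and referrers share one numbering (so both can index the same adjacency matrix) while ips get their own. the names id_for and encode_log and the sample lines are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the integer id for $key, handing out the next free id (from 0)
# the first time a key is seen. %$ids maps string -> id, @$names id -> string.
sub id_for {
    my ($ids, $names, $key) = @_;
    unless (exists $ids->{$key}) {
        $ids->{$key} = scalar @$names;
        push @$names, $key;
    }
    return $ids->{$key};
}

# Turn "url,referrer,ip" lines into "urlid refid ipid" rows.
# Assumption: urls and referrers share one table, ips get their own.
sub encode_log {
    my (@lines) = @_;
    my (%url_id, @urls, %ip_id, @ips, @rows);
    for my $line (@lines) {
        chomp $line;
        my ($url, $ref, $ip) = split /,/, $line;
        my $u = id_for(\%url_id, \@urls, $url);
        my $r = id_for(\%url_id, \@urls, $ref);
        my $i = id_for(\%ip_id,  \@ips,  $ip);
        push @rows, "$u $r $i";
    }
    return (\@rows, \@urls, \@ips);
}

# made-up sample lines; a real run would read the log from a file instead
my @sample = (
    "www.bbc.co.uk/news,www.google.co,123.123.123.123",
    "www.bbc.co.uk/sport,www.bbc.co.uk/news,123.123.123.123",
    "www.bbc.co.uk/news,www.google.co,321.321.321.321",
);
my ($rows, $urls, $ips) = encode_log(@sample);
print "$_\n" for @$rows;                     # numeric log for matlab
print "$_ $urls->[$_]\n" for 0 .. $#$urls;   # url lookup table
print "$_ $ips->[$_]\n"  for 0 .. $#$ips;    # ip lookup table
```

in practice the three outputs would go to three separate files, and a url that itself contains a comma would need a smarter split than this one.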

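the second use case quoted above (driving a long matlab computation from outside) could look roughly like the sketch below. everything matlab-side is an assumption: run_sim is a hypothetical matlab function that takes a parameter and a file name and saves its result there, and the result_*.mat naming is made up. the idea is only that a parameter whose result file already exists is skipped, so after a crash you just rerun the script. it is a dry run unless the (also made up) RUN_MATLAB environment variable is set.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical naming scheme for the per-parameter result files.
sub result_file {
    my ($p) = @_;
    return "result_$p.mat";
}

# Command line that starts matlab without a gui, runs one parameter, exits.
# run_sim is an assumed matlab function, not something matlab ships with.
sub matlab_command {
    my ($p) = @_;
    my $out = result_file($p);
    return ('matlab', '-nodisplay', '-nosplash', '-r',
            "try, run_sim($p, '$out'); catch, disp(lasterr); end; exit");
}

# Parameters whose result file does not exist yet, i.e. still to compute.
sub pending {
    my (@params) = @_;
    return grep { my $f = result_file($_); !-e $f } @params;
}

my @params = (0.1, 0.2, 0.5);   # made-up parameter values
for my $p (pending(@params)) {
    my @cmd = matlab_command($p);
    if ($ENV{RUN_MATLAB}) {
        # each run is a separate matlab process, so one crash loses one run
        system(@cmd) == 0 or warn "run for $p failed: $?\n";
    } else {
        print "would run: @cmd\n";   # dry run by default
    }
}
```

to use several cores you could start a few copies of this with different parameter lists; i left that out to keep it short.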


-- 
-- vish

