[Israel.pm] Perl Vs. Python on Various Points

Shlomi Fish shlomif at iglu.org.il
Mon Jul 13 01:10:27 PDT 2009


Hi all.

I had a discussion about Perl vs. Python with a certain Python programmer, who
had used Perl 5 in the past (he's BCCed to this message). While most of the
discussion is of little interest to this forum, we did raise a few interesting
points.

1. Syntax as an Indication of What the Language is Doing:
---------------------------------------------------------

He said he didn't like Perl syntax like "push @$array_ref, $val;" because
of the sigils. I said I happen to like it because the extra characters
convey meaning. By looking at this code I know that $array_ref is an array
reference, that it is being treated as an array and that one appends a
single ("scalar") value to it. On the other hand, if I see code like this:

<<<<<
s.add(h)
>>>>>

It's harder for me to know what's going on without running the program. For
all I know s and h could be almost anything. The sigils convey meaning.
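As a rough illustration of that point, here is a small sketch (the Money
class and the combine() function are made-up names for this example, not
something from the discussion). The same s.add(h) line does two unrelated
things depending on what s happens to be at run-time:

```python
class Money(object):
    """Hypothetical value type whose add() returns a new object."""

    def __init__(self, cents):
        self.cents = cents

    def add(self, other):
        return Money(self.cents + other.cents)


def combine(s, h):
    # Nothing on this line says whether s is a set, a Money, or anything else.
    return s.add(h)


tags = set(["perl"])
set_result = combine(tags, "python")            # set.add() mutates tags in place
money_result = combine(Money(150), Money(250))  # Money.add() builds a new object

print(sorted(tags))
print(set_result)
print(money_result.cents)
```

In the first call the interesting effect is the mutation of tags and the
return value is None; in the second call nothing is mutated and the return
value is what matters.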

In one of his talks, Larry Wall said that in Lisp all code comes in 
parentheses and so all code looks the same. Perl takes the directly opposite 
approach of adding many symbols to make sure to convey as much meaning in
the code as possible. 

( There are other advantages to prefixing variables with sigils like making
sure that new keywords do not break old programs. )

OTOH, one may argue that the extra symbols make the code messier, but I think
people can agree they have their advantages.

2. Comparison Operators:
------------------------

Later on, the discussion turned to comparison operators. Python only
has "==" and friends for comparison (at least as far as I know), while Perl 5
has both ==/!=/>/etc. and eq/ne/gt/etc. The first ones are intended for
numeric comparison and the latter ones for string comparison. 

I argued that by looking at code with such comparisons, I can tell what kind
of comparison the programmer intended. Part of the reason that Perl 5 has
both types of comparison is that it does not have separate data types for
strings and for numbers, but this is not the only reason.

So in Python, I have:

<<<<<<<<<<<<<<<
shlomi:~$ python
Python 2.6.2 (r262:71600, Jul 11 2009, 07:37:11) 
[GCC 4.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 1 == 1.0
True
>>> "1" == "1.0"
False
>>> 
>>>>>>>>>>>>>>>

Whereas in Perl, I have:

<<<<<<<<<<<<<<<
shlomi:~$ re.pl 
$ 1 == 1.0
1
$ "1" == "1.0"
1
$ "1" eq "1.0"

$ 1 eq 1.0
1
>>>>>>>>>>>>>>>

That's not all there is to it, however. In Python:

<<<<<<<<<<<<<<<<<<<<<
shlomi:~$ python
Python 2.6.2 (r262:71600, Jul 11 2009, 07:37:11) 
[GCC 4.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = [0,1,2]
>>> y = [2,1,0]
>>> y.reverse
<built-in method reverse of list object at 0xb7cc444c>
>>> y.reverse()
>>> x
[0, 1, 2]
>>> y
[0, 1, 2]
>>> x == y
True
>>> 
>>>>>>>>>>>>>>>>>>>>

So Python's == does a deep comparison of complex data structures and reported
that x and y were equivalent despite the fact that they aren't the same
physical reference.

In Perl, however:

<<<<<<<<<<
shlomi:~$ re.pl
$ [0,1,2] eq [0,1,2]

$ ([0,1,2] eq [0,1,2]) ? "True" : "False"
False
$ ([0,1,2] == [0,1,2]) ? "True" : "False"
False
$ 
>>>>>>>>>>

(You shouldn't use == for comparing references in Perl 5 - it's just for the
sake of the demonstration.)

Perl did a shallow comparison of the references and returned false because
they weren't the same reference.

I should note that in Perl comparison is not necessarily O(1), because if I
have two very long strings, then comparing them may be O(N), where N is the
length of the strings.

For deep comparison we have CPAN modules like
http://search.cpan.org/dist/Test-Differences/ , or can use the more limited
is_deeply() functionality of Test::More.

I personally feel that it's impossible to have "one-comparison-fits-all"
because for two pieces of data, there may be several ways that we would like
to compare them.
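To make the "several ways" concrete, here is a minimal Python sketch. The
helper names numeric_equal(), string_equal() and same_items_any_order() are
invented for this illustration; the first two only approximate what Perl's
== and eq ask, and are not a faithful port of Perl's semantics:

```python
def numeric_equal(a, b):
    # Roughly the question Perl's == asks: do both denote the same number?
    return float(a) == float(b)


def string_equal(a, b):
    # Roughly the question Perl's eq asks: are the string forms identical?
    return str(a) == str(b)


def same_items_any_order(a, b):
    # Yet another notion of "equal": same elements, order ignored.
    return sorted(a) == sorted(b)


x = [0, 1, 2]
y = [2, 1, 0]
x_copy = x[:]

results = {
    "numeric": numeric_equal("1", "1.0"),
    "string": string_equal("1", "1.0"),
    "identity": x is x_copy,
    "deep": x == x_copy,
    "unordered": same_items_any_order(x, y),
}

for name in sorted(results):
    print("%s: %s" % (name, results[name]))
```

Each of the five answers is reasonable for some program, which is why a
single operator cannot cover them all. Note that string_equal(1, 1.0) is
False in Python because str(1.0) is "1.0", whereas Perl stringifies both
literals to "1".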

3. Circular References:
-----------------------

After the discussion on comparison, the conversation turned to circular
references. My conversation partner was surprised to learn that Python has
them:

<<<<<<<<<<<<<<<<<<<<<<
shlomi:~$ python
Python 2.6.2 (r262:71600, Jul 11 2009, 07:37:11) 
[GCC 4.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [0,1,2]
>>> a[0] = a
>>> a
[[...], 1, 2]
>>> a[0][1]
1
>>> a[0][0][1]
1
>>> a[0][0][0][1]
1
>>> a[0]
[[...], 1, 2]
>>> 
>>>>>>>>>>>>>>>>>>>>>>

Now, what happens if we try to compare two equivalent circular data structures?

<<<<<<<<<<<<<<<<<<<<<<
shlomi:~$ python
Python 2.6.2 (r262:71600, Jul 11 2009, 07:37:11) 
[GCC 4.4.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> x = [0,1,2]
>>> x[0] = x
>>> y = [0,1,2]
>>> y[0] = y
>>> x
[[...], 1, 2]
>>> y
[[...], 1, 2]
>>> x == y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> 
>>>>>>>>>>>>>>>>>>>>>>

So CPython is not very smart about it, and raises an ugly exception.
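For comparison, a cycle-aware deep comparison can be sketched in a few
lines. This is only an illustration under stated assumptions: the function
name cyclic_equal() is made up, it handles nested lists only, and it treats
a pair of lists that is already being compared further up the call chain as
equal. It is not how CPython's == works.

```python
def cyclic_equal(a, b, _seen=None):
    """Deep-compare nested lists without recursing forever on cycles."""
    if _seen is None:
        _seen = set()
    if isinstance(a, list) and isinstance(b, list):
        key = (id(a), id(b))
        if key in _seen:
            # This pair is already under comparison; assume it matches and
            # let the remaining elements decide the result.
            return True
        _seen.add(key)
        if len(a) != len(b):
            return False
        return all(cyclic_equal(p, q, _seen) for p, q in zip(a, b))
    # Anything that is not a pair of lists falls back to ordinary ==.
    return a == b


x = [0, 1, 2]
x[0] = x
y = [0, 1, 2]
y[0] = y
z = [0, 1, 3]
z[0] = z

print(cyclic_equal(x, y))
print(cyclic_equal(x, z))
```

The set of (id, id) pairs is what breaks the infinite recursion: the second
time the same two lists meet, the function returns instead of descending
again.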

Obviously x[0] = x data structures are neither common nor useful, but
circular references are useful for such data structures and OO patterns
as trees with parent pointers, doubly-linked lists or graphs.

Now, I also remembered that circular references with a reference count-based
garbage collector (GC) were a common cause of memory leaks, so I decided to
see whether this problem still existed in Python. I wrote the following program:

<<<<<<<<<<<<<<<<<<<<<<
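import random

def gen_rand_string():
    # Build a long string out of 100 random integers.
    ret = ""
    for x in range(0,100):
        ret += str(random.randint(0,1000000))
    return ret

def leak():
    # Create a list whose first element refers back to the list itself.
    a = [0,gen_rand_string(), 24]
    a[0] = a
    return a

random.seed(24)

# Keep creating (and dropping) self-referential lists forever.
while True:
    print leak()[2]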
>>>>>>>>>>>>>>>>>>>>>>

While it ran, the process stayed at 0.1% of memory for a long time, so there
was no apparent leak. I asked the people on Freenode #python about it and they
told me that "python has had a cycle collector since 2.0". I was also told
that this cycle collector does not make object destruction unpredictable.
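The cycle collector can be observed directly with the standard gc and
weakref modules. The following is a minimal sketch assuming CPython (the
Node class and make_cycle() are made-up names): with automatic collection
switched off, reference counting alone does not free a self-referential
object, and an explicit gc.collect() does.

```python
import gc
import weakref


class Node(object):
    """Plain object used only to build a reference cycle."""
    pass


def make_cycle():
    n = Node()
    n.self_ref = n         # the object now refers to itself
    return weakref.ref(n)  # a weak reference does not keep it alive


gc.disable()               # switch off automatic cycle collection
ref = make_cycle()
alive_before = ref() is not None

gc.collect()               # run the cycle collector by hand
alive_after = ref() is not None
gc.enable()

print(alive_before)
print(alive_after)
```

After make_cycle() returns, the only remaining reference to the object is
the one it holds to itself, which is exactly the case plain reference
counting cannot handle.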

Perl 5 still doesn't have something like that, and when I consulted #perl
about it they said that http://xrl.us/be233e is a start towards a cycle
collector for objects.

4. Hiding Code By Using .pyc's
------------------------------

The Python interpreter compiles the text-based Python code to bytecode, and
caches the result in .pyc files. My conversation partner said that he often
uses these .pyc files to "hide" the source code from the people he
distributes it to, so that they will be unable to reverse engineer it.

I told him that with a symbolic language such as Python, such .pyc files
do not provide adequate "protection" as they contain the names of identifiers
and other detailed information on the code. At one point, he even thought
that they are compiled C code, but I told him CPython can work pretty well
on machines without any kind of C compiler.

On #python, people seemed to agree with me (I am rindolf):

<<<<<<<<<<<<<<<<<<<<<<
Jul 12 20:33:13 <rindolf>	Another question: can I depend on compiled python
bytecode (.pyc) for "hiding" code? Doesn't it still contain all the
identifiers verbatim?
Jul 12 20:33:28 <lvh>	rindolf: No. You cannot hide code, stop trying.
Jul 12 20:33:34 <kniht>	rindolf: why are you trying to hide code?
Jul 12 20:35:53 <DeadPanda>	rindolf, check the new IEEE Security and Privacy
(if you can), you can't hide Python code
Jul 12 20:38:21 <DeadPanda>	rindolf, unfortunately, there's nothing you can 
do. Otoh, you can probably do the bare minimum and make your boss happy.
Jul 12 20:38:55 <kniht>	rindolf: because these final users are liars and
cheats? if that's the business plan, not sure what I can say
>>>>>>>>>>>>>>>>>>>>>>

Python knows the identifiers of the variables at run-time. For example:

<<<<<<<<<<<<<<<<<<<<<<
shlomi:~$ cat exec-test.py 
#!/usr/bin/env python

import sys

a = "I am a"
b = "I'm b"

exec(sys.stdin.readline())
shlomi:~$ python exec-test.py 
print a
I am a
shlomi:~$ python exec-test.py 
print b
I'm b
shlomi:~$ 
>>>>>>>>>>>>>>>>>>>>>>

Even if you're not using exec(), eval() or friends, Python still has to
accommodate their potential use, and as a result keeps this information in
the bytecode. My partner said he doesn't use eval and friends because they
are "a bad programming practice" and as a result thought he was safe.
However, that's not the case.
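This is easy to check without touching any .pyc file, because the built-in
compile() produces the same kind of code object that CPython marshals into
one. In the sketch below, the function compute_license_key() and its
variable names are invented purely for illustration:

```python
import types

# Hypothetical "secret" source code that someone might ship as a .pyc.
source = '''
def compute_license_key(user_name, secret_salt):
    combined = user_name + secret_salt
    return combined[::-1]
'''

module_code = compile(source, "<example>", "exec")

# The function body is stored as a nested code object among the constants.
function_code = [
    const for const in module_code.co_consts
    if isinstance(const, types.CodeType)
][0]

print(module_code.co_names)       # names used at module level
print(function_code.co_name)      # the function's own name
print(function_code.co_varnames)  # its parameters and local variables
```

The function name, both parameter names and the local variable all survive
compilation verbatim, even though the source never calls exec() or eval().
The standard dis module can go one step further and list the bytecode
instructions of such a code object.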

He told his clients that Python bytecode was only marginally worse than
Java and .NET bytecode which "are used to protect the code of highly sensitive
IDF and US military applications - bytecode is sufficient protection."
However, according to:

http://developers.slashdot.org/article.pl?sid=05/06/28/2319213&tid=108

"java is [a] cake to reverse engineer".

So it's not adequate protection, and Python bytecode offers substantially
less than that.

Next, I suggested that he use an obfuscator on his code instead, and he
replied:

<<<<<<<<<<<<<<<<<<<<<<
They can't sue me, they'll have to sue whoever reverse engineered the code - I
believe it's illegal to reverse engineer commercial compiled bytecode - it
isn't however to reverse engineer obfuscated code far as I know -- it's not a
technical issue, it's a legal/business one.
>>>>>>>>>>>>>>>>>>>>>>

I don't understand the distinction between obfuscated code and bytecode
in this case. And like I told him: "Any sufficiently advanced obfuscation is
indistinguishable from bytecode."

A different Python programmer I talked with told me that .pyc files were
considered adequate "protection" for them, despite the fact that several
.pyc disassemblers and decompilers exist.

In short, depending on .pyc's to protect your source code provides only
fig-leaf protection.

------------------------------

In short, I enjoyed this discussion and learned some new things about Python.
Another thing I was happy to find out was that often my intuition and
understanding were more correct than the knowledge of someone who's been
programming Python intensively for 3 years.

Regards,

    Shlomi Fish


-- 
-----------------------------------------------------------------
Shlomi Fish       http://www.shlomifish.org/
Original Riddles - http://www.shlomifish.org/puzzles/

God gave us two eyes and ten fingers so we will type five times as much as we
read.

