[Israel.pm] RegEx in HTML Character

Yona Shlomo yona at cs.technion.ac.il
Mon Jan 28 21:00:40 PST 2008

On Mon, 28 Jan 2008, Georges EL OJAIMI wrote:

> Hello,
> Yona Shlomo wrote:
>> How does the following help prevent HTML characters and SQL
>> injection into the database?

Can you answer this question? How does this transformation
of yours help prevent SQL injections?

>>> [b]bold[/b]
>>> [i]italic[/i]
>>> [u]underline[/u]
>>> [url=http://www.url.com]url[/url]
>>> I want to replace each tag on the fly by its real HTML tag while
>>> displaying it to the end user.
>>> Is there a way to replace all these tags with their equivalents? I am
>>> having problems detecting the brackets []
> I will remove all escape characters except these ones. example:
> /<[//]{0,1}(B|b)[^><]*>/g by dynamically passing all the needed tags.
>> Can you guarantee that square brackets are only used as your
>> markup?
>> Is your [url=....] the equivalent of the HTML <a href=...> ?
> Yes, it is

You can try the following hack, but it is risky:

s,<url(="[^"]+")>([^<]+)</url>,<a href=$1>$2</a>,g

Note that the above regular expression does not try to balance
your markup's open and close tags, nor is it aware of
whitespace, quoting, or escaping issues.
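
For the square-bracket markup quoted earlier in this thread, a
slightly more careful version might look like the sketch below. It
is only an illustration, not a complete solution: the helper names
escape_html and bbcode_to_html are made up for this example, the
restriction to http/https URLs is my own assumption, and it still
does not balance nested tags. It escapes the HTML special characters
first and only then converts the known tags. It does nothing about
SQL injection, which is a separate problem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape HTML special characters so that user
# input cannot introduce tags or break out of an attribute value.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: convert [b], [i], [u] and [url=...] markup
# to HTML. Escaping happens first, so only the tags generated here
# end up as real HTML in the output.
sub bbcode_to_html {
    my ($text) = @_;
    my $html = escape_html($text);

    # Simple paired tags. The match is non-greedy and does not
    # handle a tag nested inside another tag of the same kind.
    for my $tag (qw(b i u)) {
        $html =~ s{\[${tag}\](.*?)\[/${tag}\]}{<$tag>$1</$tag>}gis;
    }

    # Assumption: only http and https URLs are accepted, which keeps
    # out schemes such as javascript:. The URL may not contain
    # whitespace or a closing square bracket.
    $html =~ s{\[url=(https?://[^\]\s]+)\]([^\[]*)\[/url\]}{<a href="$1">$2</a>}gi;

    return $html;
}

print bbcode_to_html('[b]bold[/b] [url=http://www.url.com]url[/url] <script>'), "\n";
# prints: <b>bold</b> <a href="http://www.url.com">url</a> &lt;script&gt;
```

Anything that does not match one of these patterns, for example an
unclosed [b] or a [url=...] with another scheme, is left in the
output as literal (escaped) text.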

Shlomo Yona
yona at cs.technion.ac.il
