webhelpers2.html.tools

HTML helpers that are more than just simple tags.

I contain helpers to generate complex tags, convert text to HTML, strip tags, and modify URLs.

There are no helpers to prettify HTML or canonicalize whitespace because BeautifulSoup and HTMLTidy handle this well.

Turn all urls and email addresses into clickable links.

link

Used to determine what to link. Options are “all”, “email_addresses”, or “urls”

href_attrs

Additional attributes for generated <a> tags.

Example:

>>> auto_link("Go to http://www.planetpython.com and say hello to guido@python.org")
literal(u'Go to <a href="http://www.planetpython.com">http://www.planetpython.com</a> and say hello to <a href="mailto:guido@python.org">guido@python.org</a>')
webhelpers2.html.tools.button_to(name, url='', **html_attrs)

Generate a form containing a sole button that submits to url.

Use this method instead of link_to for actions that do not have the safe HTTP GET semantics implied by using a hypertext link.

The parameters are the same as for link_to. Any html_attrs that you pass will be applied to the inner input element. In particular, pass

disabled = True/False

as part of html_attrs to control whether the button is disabled. The generated form element is given the class ‘button-to’, to which you can attach CSS styles for display purposes.

The submit button itself will be displayed as an image if you provide both type and src as followed:

type=’image’, src=’icon_delete.gif’

The src path should be the exact URL desired. A previous version of this helper added magical prefixes but this is no longer the case.

Example 1:

# inside of controller for "feeds"
>> button_to("Edit", url(action='edit', id=3))
<form method="post" action="/feeds/edit/3" class="button-to">
<div><input value="Edit" type="submit" /></div>
</form>

Example 2:

>> button_to("Destroy", url(action='destroy', id=3), 
.. method='DELETE')
<form method="POST" action="/feeds/destroy/3" 
 class="button-to">
<div>
    <input type="hidden" name="_method" value="DELETE" />
    <input value="Destroy" type="submit" />
</div>
</form>

Example 3:

# Button as an image.
>> button_to("Edit", url(action='edit', id=3), type='image', 
.. src='icon_delete.gif')
<form method="POST" action="/feeds/edit/3" class="button-to">
<div><input alt="Edit" src="/images/icon_delete.gif"
 type="image" value="Edit" /></div>
</form>

Note

This method generates HTML code that represents a form. Forms are “block” content, which means that you should not try to insert them into your HTML where only inline content is expected. For example, you can legally insert a form inside of a div or td element or in between p elements, but not in the middle of a run of text, nor can you place a form within another form. (Bottom line: Always validate your HTML before going public.)

Changed in WebHelpers 1.2: Preserve case of “method” arg for XHTML compatibility. E.g., “POST” or “PUT” causes method=”POST”; “post” or “put” causes method=”post”.

webhelpers2.html.tools.highlight(text, phrase, case_sensitive=False, class_='highlight', **attrs)

Highlight all occurrences of phrase in text.

This inserts “<strong class=”highlight”>…</strong>” around every occurrence.

Arguments:

text:

The full text.

phrase:

A phrase to find in the text. This may be a string, a list of strings, or a compiled regular expression. If a string, it’s regex-escaped and compiled. If a list, all of the strings will be highlighted. This is done by regex-escaping all elements and then joining them using the regex “|” token.

case_sensitive:

If false (default), the phrases are searched in a case-insensitive manner. No effect if phrase is a regex object.

class_:

CSS class for the <strong> tag.

**attrs:

Additional HTML attributes for the <strong> tag.

webhelpers2.html.tools.html_to_text(html, width=70)

Render HTML as formatted text, like lynx’s “print” function.

Paragraphs are collected and wrapped at the specified width, default 70. Blockquotes are indented. Paragraphs cannot be nested at this time. HTML entities are substituted.

The formatting is simplistic, and works best with HTML that was written for this renderer (e.g., ZPTKit.emailtemplate).

The output usually ends with two newlines. If the input ends with a close tag plus any whitespace, the output ends with four newlines. This is probably a bug.

webhelpers2.html.tools.js_obfuscate(content)

Obfuscate data in a Javascript tag.

Example:

>>> js_obfuscate("<input type='hidden' name='check' value='valid' />")
literal(u'<script type="text/javascript">\n//<![CDATA[\neval(unescape(\'%64%6f%63%75%6d%65%6e%74%2e%77%72%69%74%65%28%27%3c%69%6e%70%75%74%20%74%79%70%65%3d%27%68%69%64%64%65%6e%27%20%6e%61%6d%65%3d%27%63%68%65%63%6b%27%20%76%61%6c%75%65%3d%27%76%61%6c%69%64%27%20%2f%3e%27%29%3b\'))\n//]]>\n</script>')
webhelpers2.html.tools.mail_to(email_address, name=None, cc=None, bcc=None, subject=None, body=None, replace_at=None, replace_dot=None, encode=None, **html_attrs)

Create a link tag for starting an email to the specified email_address.

This email_address is also used as the name of the link unless name is specified. Additional HTML options, such as class or id, can be passed in the html_attrs hash.

You can also make it difficult for spiders to harvest email address by obfuscating them.

Examples:

>>> mail_to("me@domain.com", "My email", encode = "javascript")
literal(u'<script type="text/javascript">\n//<![CDATA[\neval(unescape(\'%64%6f%63%75%6d%65%6e%74%2e%77%72%69%74%65%28%27%3c%61%20%68%72%65%66%3d%22%6d%61%69%6c%74%6f%3a%6d%65%40%64%6f%6d%61%69%6e%2e%63%6f%6d%22%3e%4d%79%20%65%6d%61%69%6c%3c%2f%61%3e%27%29%3b\'))\n//]]>\n</script>')

>>> mail_to("me@domain.com", "My email", encode = "hex")
literal(u'<a href="&#109;&#97;&#105;&#108;&#116;&#111;&#58;%6d%65@%64%6f%6d%61%69%6e.%63%6f%6d">My email</a>')

You can also specify the cc address, bcc address, subject, and body parts of the message header to create a complex e-mail using the corresponding cc, bcc, subject, and body keyword arguments. Each of these options are URI escaped and then appended to the email_address before being output. Be aware that javascript keywords will not be escaped and may break this feature when encoding with javascript.

Examples:

>>> mail_to("me@domain.com", "My email", cc="ccaddress@domain.com", bcc="bccaddress@domain.com", subject="This is an example email", body= "This is the body of the message.")
literal(u'<a href="mailto:me@domain.com?cc=ccaddress%40domain.com&amp;bcc=bccaddress%40domain.com&amp;subject=This%20is%20an%20example%20email&amp;body=This%20is%20the%20body%20of%20the%20message.">My email</a>')
webhelpers2.html.tools.nl2br(text)

Insert a <br /> before each newline.

webhelpers2.html.tools.sanitize(html)

Strip all HTML tags but leave their content.

Use this to strip any potentially malicious tags from user input.

HTML entities are left as-is.

Usage:

>>> sanitize(u'I <i>really</i> like steak!')
u'I really like steak!'
>>> sanitize(u'I <i>really</i> like <script language="javascript">NEFARIOUS CODE</script> steak!')
u'I really like NEFARIOUS CODE steak!'

Strip link tags from text leaving just the link label.

Example:

>>> strip_links('<a href="something">else</a>')
'else'
webhelpers2.html.tools.strip_tags(text)

Delete any HTML tags in the text, leaving their contents intact. Convert newlines to spaces, and <br /> to newlines.

Example::
>>> strip_tags('Text <em>emphasis</em> <script>Javascript</script>.')
'Text emphasis Javascript.'
>>> strip_tags('Ordinary <!-- COMMENT! --> text.')
'Ordinary  COMMENT!  text.'
>>> strip_tags('Line\n1<br />Line 2')
'Line 1\nLine 2'

Implementation copied from WebOb.

sanitize() does almost the same thing, but has a different implementation.

webhelpers2.html.tools.text_to_html(text, preserve_lines=False)

Convert text to HTML paragraphs.

text:

the text to convert. Split into paragraphs at blank lines (i.e., wherever two or more consecutive newlines appear), and wrap each paragraph in a <p>.

preserve_lines:

If true, add <br /> before each single line break

webhelpers2.html.tools.update_params(_url, _debug=False, **params)

Update the query parameters in a URL.

_url is any URL, with or without a query string.

**params are query parameters to add or replace. Each value may be a string, a list of strings, or None. Passing a list generates multiple values for the same parameter. Passing None deletes the corresponding parameter if present.

Return the new URL.

Debug mode: if _debug=True, return a tuple: [0] is the URL without query string or fragment, [1] is the final query parameters as a dict, and [2] is the fragment part of the original URL or the empty string.

Usage:

>>> update_params("foo", new1="NEW1")
'foo?new1=NEW1'
>>> update_params("foo?p=1", p="2")
'foo?p=2'
>>> update_params("foo?p=1", p=None)
'foo'
>>> update_params("http://example.com/foo?new1=OLD1#myfrag", new1="NEW1")
'http://example.com/foo?new1=NEW1#myfrag'
>>> update_params("http://example.com/foo?new1=OLD1#myfrag", new1="NEW1", _debug=True)
('http://example.com/foo', {'new1': 'NEW1'}, 'myfrag')
>>> update_params("http://www.mau.de?foo=2", brrr=3)
'http://www.mau.de?foo=2&brrr=3'
>>> update_params("http://www.mau.de?foo=A&foo=B", foo=["C", "D"])
'http://www.mau.de?foo=C&foo=D'