webhelpers2.text

Functions that output text (not HTML).

Helpers for filtering, formatting, and transforming strings.

webhelpers2.text.chop_at(s, sub, inclusive=False)

Truncate string s at the first occurrence of sub.

If inclusive is true, truncate just after sub rather than at it.
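
The behaviour described above can be approximated in a few lines. This is a minimal sketch, not the library's source: the name chop_at_sketch is hypothetical, and returning s unchanged when sub is absent is an assumption.

```python
def chop_at_sketch(s, sub, inclusive=False):
    """Approximate chop_at: cut s at the first occurrence of sub."""
    pos = s.find(sub)
    if pos == -1:
        # Assumption: a string that does not contain sub is returned as is.
        return s
    if inclusive:
        pos += len(sub)  # keep sub itself in the result
    return s[:pos]


print(chop_at_sketch("plutocratic brats", "rat"))        # plutoc
print(chop_at_sketch("plutocratic brats", "rat", True))  # plutocrat
```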

webhelpers2.text.collapse(string, character=' ')

Removes the specified character from both ends of the string, then condenses each run of that character within the string to a single occurrence.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
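
A rough illustration of the two steps, strip and then condense. The name collapse_sketch is hypothetical, and the use of re.escape is a precaution of this sketch rather than something the documentation states.

```python
import re


def collapse_sketch(string, character=" "):
    """Approximate collapse: trim the character, then squeeze runs of it."""
    trimmed = string.strip(character)
    # re.escape keeps characters such as "." or "*" from acting as regex syntax.
    runs = re.compile("(?:%s){2,}" % re.escape(character))
    return runs.sub(character, trimmed)


print(collapse_sketch("  too   many   spaces  "))  # too many spaces
print(collapse_sketch("--a---b--", "-"))           # a-b
```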

webhelpers2.text.convert_accented_entities(string)

Converts HTML entities into the respective non-accented letters.

Examples:

>>> convert_accented_entities("&aacute;")
'a'
>>> convert_accented_entities("&ccedil;")
'c'
>>> convert_accented_entities("&egrave;")
'e'
>>> convert_accented_entities("&icirc;")
'i'
>>> convert_accented_entities("&oslash;")
'o'
>>> convert_accented_entities("&uuml;")
'u'

Note: This does not do any conversion of Unicode/ASCII accented-characters. For that functionality please use unidecode.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)
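
To make the distinction in the note concrete, the sketch below handles named entities such as &eacute; and leaves literal accented characters alone. The function name and the list of entity suffixes are assumptions made for illustration; the library's own pattern may differ.

```python
import re

# Assumed set of accent suffixes; an entity is "&" + base letter + suffix + ";".
_ACCENTED_ENTITY = re.compile(
    r"&([A-Za-z])(?:grave|acute|circ|tilde|uml|ring|cedil|slash);"
)


def convert_accented_entities_sketch(string):
    """Replace each accented-letter entity with its base letter."""
    return _ACCENTED_ENTITY.sub(r"\1", string)


print(convert_accented_entities_sketch("caf&eacute; fa&ccedil;ade"))  # cafe facade
# A literal accented character is not an entity, so it passes through unchanged.
print(convert_accented_entities_sketch("caf\u00e9") == "caf\u00e9")   # True
```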

webhelpers2.text.convert_misc_entities(string)

Converts HTML entities (taken from common Textile formatting) into plain-text equivalents.

Note: This isn’t an attempt at complete conversion of HTML entities, just those most likely to be generated by Textile.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers2.text.excerpt(text, phrase, radius=100, excerpt_string='...')

Extract an excerpt from the text, or return an empty string if the phrase isn’t found.

phrase

Phrase to excerpt from text

radius

How many surrounding characters to include

excerpt_string

Characters surrounding entire excerpt
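
How the three parameters interact can be sketched as follows. The name excerpt_sketch is hypothetical; the case-sensitive search and the rule of adding excerpt_string only on a side where text was actually cut are assumptions of this sketch.

```python
def excerpt_sketch(text, phrase, radius=100, excerpt_string="..."):
    """Approximate excerpt: phrase plus `radius` characters on each side."""
    pos = text.find(phrase)  # assumption: case-sensitive search
    if pos == -1:
        return ""
    start = max(0, pos - radius)
    end = min(len(text), pos + len(phrase) + radius)
    result = text[start:end]
    # Assumption: mark only the sides where something was left out.
    if start > 0:
        result = excerpt_string + result
    if end < len(text):
        result = result + excerpt_string
    return result


print(excerpt_sketch("hello my world", "my", 3))  # ...lo my wo...
```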

Example:

>>> excerpt("hello my world", "my", 3)
'...lo my wo...'

webhelpers2.text.lchop(s, sub)

Chop sub off the front of s if present.

>>> lchop("##This is a comment.##", "##")
'This is a comment.##'

The difference between lchop and s.lstrip is that lchop strips only the exact prefix, while s.lstrip treats the argument as a set of leading characters to delete regardless of order.
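
The difference is easiest to see side by side. lchop_sketch is a hypothetical stand-in written for this illustration, and the host name is invented; str.lstrip is the real built-in.

```python
def lchop_sketch(s, sub):
    """Approximate lchop: remove sub only if s starts with exactly sub."""
    if sub and s.startswith(sub):
        return s[len(sub):]
    return s


host = "www.wwdc.example"  # hypothetical input
print(lchop_sketch(host, "www."))  # wwdc.example
# lstrip treats "www." as the character set {"w", "."} and keeps stripping.
print(host.lstrip("www."))         # dc.example
```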

webhelpers2.text.plural(n, singular, plural, with_number=True)

Return the singular or plural form of a word, according to the number.

If with_number is true (default), the return value will be the number followed by the word. Otherwise the word alone will be returned.
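
A minimal sketch of this selection logic follows. The name plural_sketch is hypothetical, and treating only n == 1 as singular (so 0 takes the plural form) is an assumption.

```python
def plural_sketch(n, singular, plural, with_number=True):
    """Approximate plural: pick a word form, optionally prefixed by n."""
    # Assumption: only n == 1 selects the singular form.
    form = singular if n == 1 else plural
    if with_number:
        return "%s %s" % (n, form)
    return form


print(plural_sketch(2, "ox", "oxen"))                     # 2 oxen
print(plural_sketch(1, "ox", "oxen"))                     # 1 ox
print(plural_sketch(1, "ox", "oxen", with_number=False))  # ox
```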

webhelpers2.text.rchop(s, sub)

Chop sub off the end of s if present.

>>> rchop("##This is a comment.##", "##")
'##This is a comment.'

The difference between rchop and s.rstrip is that rchop strips only the exact suffix, while s.rstrip treats the argument as a set of trailing characters to delete regardless of order.
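
File extensions are a common place where this distinction matters. rchop_sketch is a hypothetical stand-in written for this illustration; str.rstrip is the real built-in.

```python
def rchop_sketch(s, sub):
    """Approximate rchop: remove sub only if s ends with exactly sub."""
    if sub and s.endswith(sub):
        return s[:-len(sub)]
    return s


name = "report.txt"  # hypothetical file name
print(rchop_sketch(name, ".txt"))  # report
# rstrip removes any trailing ".", "t" or "x", so the final "t" of "report" goes too.
print(name.rstrip(".txt"))         # repor
```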

webhelpers2.text.remove_formatting(string)

Simplify HTML text by removing tags and several kinds of formatting.

If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers2.text.replace_whitespace(string, replace=' ')

Replace runs of whitespace in string with a single replacement string.

Defaults to a single space but any replacement string may be specified as an argument. Examples:

>>> replace_whitespace("Foo       bar")
'Foo bar'
>>> replace_whitespace("Foo       bar", "-")
'Foo-bar'

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

webhelpers2.text.series(*items, **kw)

Join strings using commas and a conjunction such as “and” or “or”.

The conjunction defaults to “and”. Pass ‘conj’ as a keyword arg to change it. Pass ‘strict=False’ to omit the comma before the conjunction.
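
The joining rules can be sketched like this. The name series_sketch is hypothetical, it takes keyword-only arguments instead of **kw, and the results for zero or one item are assumptions.

```python
def series_sketch(*items, conj="and", strict=True):
    """Approximate series: comma-separated list with a final conjunction."""
    items = list(items)
    if not items:
        return ""        # assumption: nothing to join
    if len(items) == 1:
        return items[0]  # assumption: a single item is returned alone
    if len(items) == 2:
        # Two items never take a comma, whatever `strict` says.
        return "%s %s %s" % (items[0], conj, items[1])
    comma = "," if strict else ""
    return "%s%s %s %s" % (", ".join(items[:-1]), comma, conj, items[-1])


print(series_sketch("A", "B", "C"))                # A, B, and C
print(series_sketch("A", "B", "C", strict=False))  # A, B and C
```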

Examples:

>>> series("A", "B")
'A and B'
>>> series("A", "B", conj="or")
'A or B'
>>> series("A", "B", "C")
'A, B, and C'
>>> series("A", "B", "C", strict=False)
'A, B and C'

webhelpers2.text.strip_leading_whitespace(s)

Strip the leading whitespace in all lines in s.

This deletes all leading whitespace. textwrap.dedent deletes only the whitespace common to all lines.
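
The contrast with textwrap.dedent shows up on indented code. strip_leading_whitespace_sketch is a hypothetical approximation that assumes "whitespace" means spaces and tabs at the start of a line; textwrap.dedent is the standard-library function.

```python
import re
import textwrap


def strip_leading_whitespace_sketch(s):
    """Approximate strip_leading_whitespace: drop indentation on every line."""
    # Assumption: only spaces and tabs at the start of each line are removed.
    return re.sub(r"(?m)^[ \t]+", "", s)


text = "    def f():\n        return 1\n"
print(repr(strip_leading_whitespace_sketch(text)))  # 'def f():\nreturn 1\n'
print(repr(textwrap.dedent(text)))                  # 'def f():\n    return 1\n'
```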

webhelpers2.text.truncate(text, length=30, indicator='...', whole_word=False)

Truncate text with replacement characters.

length

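The maximum length of text before replacement. Longer text is cut so that the result, indicator included, is at most this long (14 characters in the example below).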

indicator

If text exceeds the length, this string will replace the end of the string

whole_word

If true, shorten the string further to avoid breaking a word in the middle. A word is defined as any string not containing whitespace. If the entire text before the break is a single word, it will have to be broken.
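
One way the parameters might fit together is sketched below. The name truncate_sketch is hypothetical, and the exact back-off rule used for whole_word is an assumption of this sketch, not the library's code.

```python
def truncate_sketch(text, length=30, indicator="...", whole_word=False):
    """Approximate truncate: shorten text so result fits within `length`."""
    if len(text) <= length:
        return text
    keep = length - len(indicator)  # room left once the indicator is added
    if whole_word:
        # Back up to the nearest whitespace at or before the cut point.
        i = keep
        while i > 0 and not text[i].isspace():
            i -= 1
        head = text[:i].rstrip()
        if head:
            return head + indicator
        # No earlier word boundary: fall through and break the word.
    return text[:keep] + indicator


story = "Once upon a time in a world far far away"
print(truncate_sketch(story, 14))                   # Once upon a...
print(truncate_sketch(story, 16))                   # Once upon a t...
print(truncate_sketch(story, 16, whole_word=True))  # Once upon a...
```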

Example:

>>> truncate('Once upon a time in a world far far away', 14)
'Once upon a...'

webhelpers2.text.urlify(string)

Create a URI-friendly representation of the string.

Can be called manually to generate a URI-friendly version of any string.

If the unidecode package is installed, it will also transliterate non-ASCII Unicode characters to their nearest pronunciation equivalent in ASCII.

Example:

>>> urlify("Mighty Mighty Bosstones")
'mighty-mighty-bosstones'
'mighty-mighty-bosstones'

Based on Ruby’s stringex package (http://github.com/rsl/stringex/tree/master)

Changed in WebHelpers 1.2: urlencode the result in case it contains special characters like “?”.
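
For readers without the package at hand, the general idea (lowercase, hyphenate, URL-encode) can be sketched with the standard library alone. urlify_sketch is a hypothetical name, every step is an assumption about the approach, and the sketch does not attempt the formatting removal or unidecode transliteration that the real function performs, so results for other inputs may differ.

```python
import re
from urllib.parse import quote


def urlify_sketch(string):
    """Rough approximation of a slug: lowercase, hyphenate, URL-encode."""
    lowered = string.lower()
    # Assumption: each run of non-word characters or underscores becomes "-".
    hyphenated = re.sub(r"[\W_]+", "-", lowered).strip("-")
    # URL-encode whatever non-ASCII characters remain.
    return quote(hyphenated)


print(urlify_sketch("Mighty Mighty Bosstones"))  # mighty-mighty-bosstones
print(urlify_sketch("Caf\u00e9 au lait?"))       # caf%C3%A9-au-lait
```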

webhelpers2.text.wrap_long_lines(text, width=72)

Wrap all long lines in a text string to the specified width.

width may be an int or a textwrap.TextWrapper instance. The latter allows you to set other options besides the width, and is more efficient when wrapping many texts.

Unlike wrap_paragraphs(), this splits individual lines and does not look at the paragraph context. Thus it never joins lines. This is safer if the text might contain preformatted lines (tables, poetry, headers) in the middle of paragraphs. However, it could lead to splitting a line just before the last word or two, putting the orphan words on a separate line, in the middle of a paragraph.
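
The line-by-line approach, including the orphan-word effect, can be sketched with textwrap. wrap_long_lines_sketch is a hypothetical approximation; dropping a trailing newline is a simplification of this sketch.

```python
import textwrap


def wrap_long_lines_sketch(text, width=72):
    """Approximate wrap_long_lines: wrap each over-long line independently."""
    if isinstance(width, textwrap.TextWrapper):
        wrapper = width
    else:
        wrapper = textwrap.TextWrapper(width=width)
    out = []
    for line in text.splitlines():
        if len(line) > wrapper.width:
            out.extend(wrapper.wrap(line))
        else:
            out.append(line)  # short lines pass through untouched
    return "\n".join(out)


text = "aaa bbb c\nxy"
# The orphan "c" stays on its own line; it is never joined with "xy".
print(repr(wrap_long_lines_sketch(text, 7)))  # 'aaa bbb\nc\nxy'
```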

webhelpers2.text.wrap_paragraphs(text, width=72)

Wrap all paragraphs in a text string to the specified width.

width may be an int or a textwrap.TextWrapper instance. The latter allows you to set other options besides the width, and is more efficient when wrapping many texts.

This is intended only to split lines that are too long. It keeps short lines intact, including at the beginning of paragraphs. If a paragraph starts with short lines and then a long line, it will keep the initial short lines as is, and wrap from the long line until the end of the paragraph (a blank line, a line containing only whitespace, or the end of the document). This is intended to preserve preformatted text (tables, poetry, headers), but occasionally it may preserve short lines you wanted to join.
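
The paragraph-aware variant can be sketched in the same way. wrap_paragraphs_sketch is a hypothetical approximation, and detecting the end of a paragraph with str.strip() is a choice made for this sketch.

```python
import textwrap


def wrap_paragraphs_sketch(text, width=72):
    """Approximate wrap_paragraphs: rewrap from the first long line onward."""
    if isinstance(width, textwrap.TextWrapper):
        wrapper = width
    else:
        wrapper = textwrap.TextWrapper(width=width)
    lines = text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        if len(lines[i]) <= wrapper.width:
            out.append(lines[i])  # short line: keep as is
            i += 1
            continue
        # Long line: gather everything up to the next blank line or the end.
        j = i + 1
        while j < len(lines) and lines[j].strip():
            j += 1
        out.append(wrapper.fill(" ".join(lines[i:j])))
        i = j
    return "\n".join(out)


text = "Title\naaa bbb c\nxy\n\nzz"
# "Title" is preserved; the orphan "c" is joined with the following "xy".
print(repr(wrap_paragraphs_sketch(text, 7)))  # 'Title\naaa bbb\nc xy\n\nzz'
```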