# use the power of regular expressions
# bite the bullet and review the regular expression syntax
import re

# let's say you have created the next search engine
# your search engine extracts the contents of
# the <title></title> tags
theString = """
<lots of garbage and # what not and this title is going to be cool>
<myTitle> will be awesome. And once you get
<title>the title is here</title>
and then there is the end
"""

# you compile a regular expression to search
# for the contents of the title tag
# (this is where the regular expression syntax
# http://docs.python.org/library/re.html#regular-expression-syntax
# comes in handy)
# the one thing to certainly notice is that there are
# parentheses surrounding the contents of the title tag.
# These form a capturing group (the re docs reserve "backreference"
# for a later reference to such a group, e.g. \1). Once we've run
# the search we'll be able to reference the group by number.
p = re.compile(r'<title>(.+)</title>')

# now search theString
m = p.search(theString)

# you can test whether or not your
# regular expression was successful
if m:
    print("regular expression search successful!")
    # referencing group #1 references the first capturing group
    print("the title contents are:", m.group(1))
    # group #0 is the entire regular expression match
    print("the entire regular expression returned:", m.group(0))
else:
    print("regular expression search returns no results")

# output:
# regular expression search successful!
# the title contents are: the title is here
# the entire regular expression returned: <title>the title is here</title>
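The example above reads the captured group after the search has run. A group can also be referenced from inside the pattern itself with `\1`, which is what the `re` documentation calls a backreference. The sketch below is only an illustration of that idea: the helper name `extract_tag_contents`, the constant `TAG_PATTERN`, and the sample string are assumptions made for this example. It matches any simple opening tag and requires the closing tag to repeat the same name.

```python
import re

# Hypothetical pattern for illustration:
#   group 1 captures the tag name,
#   group 2 captures the contents (non-greedy, so it stops at the first close),
#   \1 is a backreference that must repeat whatever group 1 matched.
TAG_PATTERN = re.compile(r'<(\w+)>(.+?)</\1>')


def extract_tag_contents(text):
    """Return (tag, contents) pairs for each simple <tag>...</tag> pair."""
    return [(m.group(1), m.group(2)) for m in TAG_PATTERN.finditer(text)]


sample = "<myTitle> will be awesome. <title>the title is here</title> and <b>bold</b>"

for tag, contents in extract_tag_contents(sample):
    print(tag, "->", contents)
```

Because `<myTitle>` never gets a matching `</myTitle>`, the backreference rejects it, while the `title` and `b` pairs are returned. This is a regular expression exercise only; for real HTML an HTML parser is usually the safer choice.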
A Python example-based blog that shows how to accomplish Python goals and how to correct Python errors.
Wednesday, September 30, 2009
Python - regular expression backreference example
Labels: backreference, compile, group, python, re, regular expressions, search