# use the power of regular expressions
# bite the bullet and review the regular expression syntax
import re

# let's say you have created the next search engine
# your search engine extracts the contents of
# the <title></title> tags
theString = """
<lots of garbage and # what not and this title is going to be cool>
<myTitle> will be awesome. And once you get
<title>the title is here</title>
and then there is the end
"""

# you compile a regular expression to search
# for the contents of the title tag
# (this is where the regular expression syntax
# http://docs.python.org/library/re.html#regular-expression-syntax
# comes in handy)
# the one thing to certainly notice is that there are
# parentheses surrounding the contents of the title tag.
# These create a capturing group. Once we've run the search
# we'll be able to refer back to what the group captured.
p = re.compile(r'<title>(.+)</title>')

# now search theString
m = p.search(theString)

# you can test whether or not your
# regular expression was successful
if m:
    print("regular expression search successful!")
    # group 1 is the text captured by the first pair of parentheses
    print("the title contents are:", m.group(1))
    # group 0 is the entire match
    print("the entire regular expression returned:", m.group(0))
else:
    print("regular expression search returns no results")

# output:
# regular expression search successful!
# the title contents are: the title is here
# the entire regular expression returned: <title>the title is here</title>
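
Two details of the example above are worth a quick sketch. First, `.+` is greedy, so if a line held two title tags the group would swallow everything up to the last closing tag; `.+?` stops at the first one. Second, a backreference in the strict sense is `\1` used inside the pattern itself, which must repeat whatever group 1 captured. The sample strings and the names `html`, `greedy`, `lazy` and `pair` below are made up for illustration and are not from the original post.

```python
import re

# Hypothetical input with two title tags on one line.
html = "<title>first</title> filler <title>second</title>"

# Greedy: .+ runs on to the last </title> on the line.
greedy = re.search(r"<title>(.+)</title>", html).group(1)

# Non-greedy: .+? stops at the first </title>, so findall sees both titles.
lazy = re.findall(r"<title>(.+?)</title>", html)

# Backreference inside the pattern: \1 must match the same text
# that group 1 captured, so the closing tag has to match the opening tag.
pair = re.search(r"<(\w+)>(.*?)</\1>", "<b>bold</i> <i>italic</i>")

print(greedy)
print(lazy)
print(pair.group(1), pair.group(2))
```

For anything beyond small, well-behaved snippets like these, a real HTML parser is likely a safer choice than a regular expression.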
A Python example-based blog that shows how to accomplish Python goals and how to correct Python errors.
Wednesday, September 30, 2009
Python - regular expression backreference example
Labels: backreference, compile, group, python, re, regular expressions, search