BeautifulSoup
Beautiful Soup is a Python HTML/XML parser designed for quick-turnaround projects such as screen scraping. The following example prints all the links in a page:
import urllib
import BeautifulSoup

html = urllib.urlopen('http://vmspython.dyndns.org').read()

s = BeautifulSoup.BeautifulSoup()
s.feed(html)

# calling the soup with a tag name returns every matching tag
for i in s('a'):
    try:
        print i['href']
    except KeyError:
        # skip anchors that have no href attribute
        pass
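For comparison, the same link extraction can be sketched without Beautiful Soup, using only the standard library's HTMLParser class. This is a swapped-in technique rather than part of Beautiful Soup. The LinkCollector class and the sample page are hypothetical names and data chosen for illustration, and the page is parsed from a string so that no network access is needed.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag that has one."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


# Hypothetical sample page; the second anchor has no href and is skipped.
sample_html = """
<html><body>
<a href="http://example.org/one">one</a>
<a name="anchor-only">no link here</a>
<a href="/two">two</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(sample_html)
collector.close()

for link in collector.links:
    print(link)
```

Looking the attribute up with get() plays the same role as the try/except in the Beautiful Soup version: anchors without an href are ignored instead of raising an error.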
PyAMF
The module PyAMF provides Action Message Format (AMF) support for Python that is compatible with the Flash Player.
A demo can be seen at http://vmspython.dyndns.org/anonymous/demo/demo2.htm
PycURL
PycURL is a Python interface to the cURL library (libcurl). It can be used to fetch objects identified by a URL from within a Python program.
import sys
import pycurl

class Test:
    def __init__(self):
        self.contents = ''

    def body_callback(self, buf):
        self.contents = self.contents + buf

print >>sys.stderr, 'Testing', pycurl.version

t = Test()
c = pycurl.Curl()
c.setopt(c.URL, 'http://vmspython.dyndns.org/')
c.setopt(c.WRITEFUNCTION, t.body_callback)
c.perform()
c.close()

print t.contents
dateutil
The dateutil module provides powerful extensions to the standard datetime module, available in Python 2.3+.
from dateutil.relativedelta import *
from dateutil.easter import *
from dateutil.rrule import *
from dateutil.parser import *
from datetime import *
import commands
import os
now = parse(commands.getoutput("date"))
today = now.date()
year = rrule(YEARLY,bymonth=8,bymonthday=13,byweekday=FR)[0].year
rdelta = relativedelta(easter(year), today)
print "Today is:", today
print "Year with next Aug 13th on a Friday is:", year
print "How far is the Easter of that year:", rdelta
print "And the Easter of that year is:", today+rdelta
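The example above depends on the current date, so its output changes from day to day. The following is a minimal sketch of the same three building blocks (rrule, easter and relativedelta) with fixed reference dates, which makes the result reproducible. The dates chosen for start and today are illustrative assumptions, and the sketch assumes the dateutil package is installed.

```python
from datetime import date, datetime

from dateutil.easter import easter
from dateutil.relativedelta import relativedelta
from dateutil.rrule import rrule, YEARLY, FR

# Fixed reference points (illustrative values, not from the original example).
start = datetime(2005, 1, 1)
today = date(2010, 1, 1)

# First 13 August on or after `start` that falls on a Friday.
friday_13th = rrule(YEARLY, bymonth=8, bymonthday=13, byweekday=FR,
                    dtstart=start)[0]
year = friday_13th.year

# Western Easter of that year, and the calendar distance to it from `today`.
easter_day = easter(year)
rdelta = relativedelta(easter_day, today)

print(year)
print(easter_day)
print(rdelta)
```

Adding rdelta back to today gives the Easter date again, which is the check the last line of the original example performs.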
ElementTree
The ElementTree wrapper type adds code to load XML files as trees of Element objects, and save them back again.
from elementtree.ElementTree import Element

elem = Element("tag", first="1", second="2")

# print 'first' attribute
print elem.attrib.get("first")

# same, using shortcut
print elem.get("first")

# print list of keys (using shortcuts)
print elem.keys()
print elem.items()

# the 'third' attribute doesn't exist
print elem.get("third")
print elem.get("third", "default")

# add the attribute and try again
elem.set("third", "3")
print elem.get("third", "default")
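The description mentions loading XML as a tree of Element objects and saving it back, which the attribute example does not show. A minimal sketch of that round trip follows. It parses from a string instead of a file so that it is self-contained, and the sample document is a hypothetical one made up for illustration. The import falls back to the standalone elementtree package when the copy bundled with Python 2.5 and later is not available.

```python
try:
    import xml.etree.ElementTree as ET    # bundled with Python 2.5 and later
except ImportError:
    import elementtree.ElementTree as ET  # standalone elementtree package

# Hypothetical sample document.
xml_text = ('<library>'
            '<book id="1">Hamlet</book>'
            '<book id="2">Macbeth</book>'
            '</library>')

# Load: the XML text becomes a tree of Element objects.
root = ET.fromstring(xml_text)
titles = [book.text for book in root.findall("book")]
print(titles)

# Modify: append a new child element to the tree.
new_book = ET.SubElement(root, "book", id="3")
new_book.text = "Othello"

# Save: serialize the tree back to XML.
serialized = ET.tostring(root)
print(serialized)
```

For files, ET.parse(filename) and the write() method of the resulting ElementTree object play the roles of fromstring() and tostring().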
PIL
The Python Imaging Library (PIL) opens, processes, and saves image files.
The following code creates JPEG thumbnails:
import os, sys
import Image

size = 128, 128

for infile in sys.argv[1:]:
    outfile = os.path.splitext(infile)[0] + ".thumbnail"
    if infile != outfile:
        try:
            im = Image.open(infile)
            im.thumbnail(size)
            im.save(outfile, "JPEG")
        except IOError:
            print "cannot create thumbnail for", infile
SwishE
The SwishE module is a Python interface to Swish-e (Simple Web Indexing System for Humans - Enhanced).
Example from http://jibe.freeshell.org/bits/SwishE/:
>>> # load the module
>>> import SwishE
>>> # get a SWISH-E handle on 't/swish.idx'
>>> handle = SwishE.new('t/swish.idx')
>>> # get a search object
>>> search = handle.search('')
>>> # search for 'madrid'
>>> results = search.execute('madrid')
>>> # tell the world how many results we have
>>> print results.hits()
>>> # iterate on the results
>>> for r in results:
...     print r.getproperty('swishtitle')
...
Argentina Centro de Medios Independientes
Indymedia Barcelona: home
San Francisco Bay Area Independent Media Center
Independent Media Center -
>>> # now looking for 'lluita', we want to sort by title
>>> search.setSort('swishtitle')
>>> again = search.execute('lluita')
>>> for r in again:
...     print r.getproperty('swishdocpath')
...
1.html
MySQLdb
The MySQLdb module offers a Python interface to MySQL.
import MySQLdb, pprint

connectionObject = MySQLdb.connect(host='172.17.2.1', user='toto', passwd='123', db='mycollection')
c = connectionObject.cursor()

# pass the value as a parameter so that the driver quotes it
nom_auteur = "Shakespeare"
c.execute("""SELECT title, description FROM books WHERE author = %s""", (nom_auteur,))
pprint.pprint(c.fetchall())

# cursors have no query() method; use execute() for updates as well
c.execute("update books set title='toto' where author='titi'")
connectionObject.commit()
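DB-API drivers such as MySQLdb accept query values as a second argument to execute(), which leaves the quoting to the driver. Because a MySQL server is needed to run MySQLdb code, the sketch below uses the standard library's sqlite3 module with an in-memory database as a stand-in. This is an assumption made for illustration: sqlite3 writes placeholders as ? where MySQLdb uses %s, and the table contents are made up.

```python
import sqlite3

# In-memory database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE books (title TEXT, description TEXT, author TEXT)")
c.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [("Hamlet", "A tragedy", "Shakespeare"),
     ("Candide", "A satire", "Voltaire")],
)

# The value travels separately from the SQL text; no manual quoting.
nom_auteur = "Shakespeare"
c.execute("SELECT title, description FROM books WHERE author = ?",
          (nom_auteur,))
rows = c.fetchall()
print(rows)

# A value containing a quote character is handled as plain data.
c.execute("UPDATE books SET title = ? WHERE author = ?",
          ("L'Ingenu", "Voltaire"))
conn.commit()

c.execute("SELECT title FROM books WHERE author = ?", ("Voltaire",))
updated = c.fetchone()[0]
print(updated)
conn.close()
```

With MySQLdb the calls have the same shape: only the connect() arguments and the placeholder style differ.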