This is a Python library of web-related functions, such as:
- remove comments or tags from HTML snippets
- extract base url from HTML snippets
- translate entities in HTML strings
- encode multipart/form-data
- convert raw HTTP headers to dicts and vice-versa
- construct HTTP auth headers
- convert HTML pages to unicode
- join urls in an RFC-compliant way
- sanitize urls (like browsers do)
- extract arguments from urls
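
To give a feel for what a few of these operations involve, here is a minimal
sketch using only the Python standard library. It is not w3lib's own API: the
helper names (translate_entities, join_url, url_argument, basic_auth) are
hypothetical and chosen for illustration only, and the library's real
functions may differ in name, signature, and edge-case behaviour.

```python
import base64
import html
from urllib.parse import parse_qs, urljoin, urlsplit


def translate_entities(text):
    """Replace HTML entities such as &amp; with the characters they stand for."""
    return html.unescape(text)


def join_url(base, reference):
    """Resolve a relative reference against a base url."""
    return urljoin(base, reference)


def url_argument(url, name, default=None):
    """Return the first value of a query-string argument, or a default."""
    values = parse_qs(urlsplit(url).query).get(name)
    return values[0] if values else default


def basic_auth(username, password):
    """Build the value of an HTTP Basic 'Authorization' header."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    print(translate_entities("&pound;100 &amp; up"))
    print(join_url("http://example.com/a/b.html", "../c.html"))
    print(url_argument("http://example.com/?id=200&page=2", "id"))
    print(basic_auth("user", "pass"))
```

The example urls and credentials are placeholders. The sketch covers only four
of the items listed above.
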
WWW: http://github.com/scrapy/w3lib/