PATH:
usr
/
share
/
doc
/
python-docs-2.7.5
/
html
/
_sources
/
library
:mod:`robotparser` --- Parser for robots.txt ============================================= .. module:: robotparser :synopsis: Loads a robots.txt file and answers questions about fetchability of other URLs. .. sectionauthor:: Skip Montanaro <skip@pobox.com> .. index:: single: WWW single: World Wide Web single: URL single: robots.txt .. note:: The :mod:`robotparser` module has been renamed :mod:`urllib.robotparser` in Python 3. The :term:`2to3` tool will automatically adapt imports when converting your sources to Python 3. This module provides a single class, :class:`RobotFileParser`, which answers questions about whether or not a particular user agent can fetch a URL on the Web site that published the :file:`robots.txt` file. For more details on the structure of :file:`robots.txt` files, see http://www.robotstxt.org/orig.html. .. class:: RobotFileParser(url='') This class provides methods to read, parse and answer questions about the :file:`robots.txt` file at *url*. .. method:: set_url(url) Sets the URL referring to a :file:`robots.txt` file. .. method:: read() Reads the :file:`robots.txt` URL and feeds it to the parser. .. method:: parse(lines) Parses the lines argument. .. method:: can_fetch(useragent, url) Returns ``True`` if the *useragent* is allowed to fetch the *url* according to the rules contained in the parsed :file:`robots.txt` file. .. method:: mtime() Returns the time the ``robots.txt`` file was last fetched. This is useful for long-running web spiders that need to check for new ``robots.txt`` files periodically. .. method:: modified() Sets the time the ``robots.txt`` file was last fetched to the current time. The following example demonstrates basic use of the RobotFileParser class. :: >>> import robotparser >>> rp = robotparser.RobotFileParser() >>> rp.set_url("http://www.musi-cal.com/robots.txt") >>> rp.read() >>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco") False >>> rp.can_fetch("*", "http://www.musi-cal.com/") True
[+]
..
[-] fpformat.txt
[edit]
[-] string.txt
[edit]
[-] aetypes.txt
[edit]
[-] textwrap.txt
[edit]
[-] codecs.txt
[edit]
[-] array.txt
[edit]
[-] urllib2.txt
[edit]
[-] pickle.txt
[edit]
[-] 2to3.txt
[edit]
[-] ossaudiodev.txt
[edit]
[-] markup.txt
[edit]
[-] sha.txt
[edit]
[-] stat.txt
[edit]
[-] itertools.txt
[edit]
[-] numeric.txt
[edit]
[-] gzip.txt
[edit]
[-] pkgutil.txt
[edit]
[-] subprocess.txt
[edit]
[-] token.txt
[edit]
[-] debug.txt
[edit]
[-] dummy_thread.txt
[edit]
[-] mimify.txt
[edit]
[-] bastion.txt
[edit]
[-] popen2.txt
[edit]
[-] autogil.txt
[edit]
[-] importlib.txt
[edit]
[-] io.txt
[edit]
[-] msilib.txt
[edit]
[-] development.txt
[edit]
[-] hotshot.txt
[edit]
[-] constants.txt
[edit]
[-] curses.ascii.txt
[edit]
[-] os.txt
[edit]
[-] easydialogs.txt
[edit]
[-] mutex.txt
[edit]
[-] md5.txt
[edit]
[-] stdtypes.txt
[edit]
[-] bz2.txt
[edit]
[-] imgfile.txt
[edit]
[-] aetools.txt
[edit]
[-] internet.txt
[edit]
[-] xml.dom.pulldom.txt
[edit]
[-] cookie.txt
[edit]
[-] sunau.txt
[edit]
[-] archiving.txt
[edit]
[-] modules.txt
[edit]
[-] base64.txt
[edit]
[-] ttk.txt
[edit]
[-] tty.txt
[edit]
[-] cookielib.txt
[edit]
[-] pydoc.txt
[edit]
[-] misc.txt
[edit]
[-] fm.txt
[edit]
[-] whichdb.txt
[edit]
[-] imghdr.txt
[edit]
[-] datatypes.txt
[edit]
[-] dis.txt
[edit]
[-] datetime.txt
[edit]
[-] userdict.txt
[edit]
[-] py_compile.txt
[edit]
[-] imputil.txt
[edit]
[-] xml.dom.minidom.txt
[edit]
[-] multiprocessing.txt
[edit]
[-] functools.txt
[edit]
[-] audioop.txt
[edit]
[-] language.txt
[edit]
[-] miniaeframe.txt
[edit]
[-] ssl.txt
[edit]
[-] copy_reg.txt
[edit]
[-] heapq.txt
[edit]
[-] syslog.txt
[edit]
[-] fileformats.txt
[edit]
[-] simplexmlrpcserver.txt
[edit]
[-] numbers.txt
[edit]
[-] pipes.txt
[edit]
[-] shutil.txt
[edit]
[-] zlib.txt
[edit]
[-] optparse.txt
[edit]
[-] allos.txt
[edit]
[-] custominterp.txt
[edit]
[-] windows.txt
[edit]
[-] dbm.txt
[edit]
[-] select.txt
[edit]
[-] time.txt
[edit]
[-] index.txt
[edit]
[-] undoc.txt
[edit]
[-] plistlib.txt
[edit]
[-] htmllib.txt
[edit]
[-] warnings.txt
[edit]
[-] functions.txt
[edit]
[-] argparse.txt
[edit]
[-] distutils.txt
[edit]
[-] macpath.txt
[edit]
[-] fnmatch.txt
[edit]
[-] tabnanny.txt
[edit]
[-] calendar.txt
[edit]
[-] jpeg.txt
[edit]
[-] zipimport.txt
[edit]
[-] unix.txt
[edit]
[-] chunk.txt
[edit]
[-] smtplib.txt
[edit]
[-] tokenize.txt
[edit]
[-] asynchat.txt
[edit]
[-] decimal.txt
[edit]
[-] cd.txt
[edit]
[-] pickletools.txt
[edit]
[-] statvfs.txt
[edit]
[-] gettext.txt
[edit]
[-] sqlite3.txt
[edit]
[-] imageop.txt
[edit]
[-] dl.txt
[edit]
[-] logging.handlers.txt
[edit]
[-] zipfile.txt
[edit]
[-] copy.txt
[edit]
[-] sched.txt
[edit]
[-] nis.txt
[edit]
[-] timeit.txt
[edit]
[-] weakref.txt
[edit]
[-] mailbox.txt
[edit]
[-] othergui.txt
[edit]
[-] sgmllib.txt
[edit]
[-] mimewriter.txt
[edit]
[-] email.charset.txt
[edit]
[-] mmap.txt
[edit]
[-] xmlrpclib.txt
[edit]
[-] robotparser.txt
[edit]
[-] xml.sax.reader.txt
[edit]
[-] hmac.txt
[edit]
[-] collections.txt
[edit]
[-] mimetools.txt
[edit]
[-] aifc.txt
[edit]
[-] modulefinder.txt
[edit]
[-] socketserver.txt
[edit]
[-] grp.txt
[edit]
[-] getopt.txt
[edit]
[-] __future__.txt
[edit]
[-] binascii.txt
[edit]
[-] symtable.txt
[edit]
[-] crypt.txt
[edit]
[-] email.encoders.txt
[edit]
[-] imaplib.txt
[edit]
[-] spwd.txt
[edit]
[-] wsgiref.txt
[edit]
[-] errno.txt
[edit]
[-] email-examples.txt
[edit]
[-] sun.txt
[edit]
[-] doctest.txt
[edit]
[-] resource.txt
[edit]
[-] asyncore.txt
[edit]
[-] curses.panel.txt
[edit]
[-] imp.txt
[edit]
[-] htmlparser.txt
[edit]
[-] thread.txt
[edit]
[-] posix.txt
[edit]
[-] repr.txt
[edit]
[-] macostools.txt
[edit]
[-] colorsys.txt
[edit]
[-] bdb.txt
[edit]
[-] stringio.txt
[edit]
[-] mm.txt
[edit]
[-] cgi.txt
[edit]
[-] math.txt
[edit]
[-] httplib.txt
[edit]
[-] urlparse.txt
[edit]
[-] aepack.txt
[edit]
[-] colorpicker.txt
[edit]
[-] binhex.txt
[edit]
[-] tk.txt
[edit]
[-] xml.txt
[edit]
[-] webbrowser.txt
[edit]
[-] contextlib.txt
[edit]
[-] types.txt
[edit]
[-] shelve.txt
[edit]
[-] platform.txt
[edit]
[-] readline.txt
[edit]
[-] locale.txt
[edit]
[-] sets.txt
[edit]
[-] macosa.txt
[edit]
[-] bisect.txt
[edit]
[-] email.parser.txt
[edit]
[-] xml.sax.txt
[edit]
[-] xml.etree.elementtree.txt
[edit]
[-] netrc.txt
[edit]
[-] new.txt
[edit]
[-] bsddb.txt
[edit]
[-] fl.txt
[edit]
[-] curses.txt
[edit]
[-] hashlib.txt
[edit]
[-] tarfile.txt
[edit]
[-] ftplib.txt
[edit]
[-] telnetlib.txt
[edit]
[-] macos.txt
[edit]
[-] multifile.txt
[edit]
[-] dbhash.txt
[edit]
[-] netdata.txt
[edit]
[-] fileinput.txt
[edit]
[-] python.txt
[edit]
[-] mac.txt
[edit]
[-] filesys.txt
[edit]
[-] scrolledtext.txt
[edit]
[-] compiler.txt
[edit]
[-] tkinter.txt
[edit]
[-] random.txt
[edit]
[-] mailcap.txt
[edit]
[-] fpectl.txt
[edit]
[-] fcntl.txt
[edit]
[-] ipc.txt
[edit]
[-] formatter.txt
[edit]
[-] profile.txt
[edit]
[-] xml.dom.txt
[edit]
[-] email.iterators.txt
[edit]
[-] logging.config.txt
[edit]
[-] rlcompleter.txt
[edit]
[-] idle.txt
[edit]
[-] gdbm.txt
[edit]
[-] ast.txt
[edit]
[-] someos.txt
[edit]
[-] cmd.txt
[edit]
[-] re.txt
[edit]
[-] docxmlrpcserver.txt
[edit]
[-] pdb.txt
[edit]
[-] rexec.txt
[edit]
[-] commands.txt
[edit]
[-] restricted.txt
[edit]
[-] crypto.txt
[edit]
[-] dumbdbm.txt
[edit]
[-] email.txt
[edit]
[-] linecache.txt
[edit]
[-] os.path.txt
[edit]
[-] cgitb.txt
[edit]
[-] _winreg.txt
[edit]
[-] carbon.txt
[edit]
[-] turtle.txt
[edit]
[-] xml.sax.handler.txt
[edit]
[-] json.txt
[edit]
[-] email.generator.txt
[edit]
[-] struct.txt
[edit]
[-] mhlib.txt
[edit]
[-] uuid.txt
[edit]
[-] threading.txt
[edit]
[-] mimetypes.txt
[edit]
[-] traceback.txt
[edit]
[-] filecmp.txt
[edit]
[-] sndhdr.txt
[edit]
[-] logging.txt
[edit]
[-] getpass.txt
[edit]
[-] dummy_threading.txt
[edit]
[-] abc.txt
[edit]
[-] shlex.txt
[edit]
[-] email.mime.txt
[edit]
[-] fractions.txt
[edit]
[-] email.header.txt
[edit]
[-] marshal.txt
[edit]
[-] symbol.txt
[edit]
[-] wave.txt
[edit]
[-] urllib.txt
[edit]
[-] queue.txt
[edit]
[-] operator.txt
[edit]
[-] compileall.txt
[edit]
[-] sysconfig.txt
[edit]
[-] quopri.txt
[edit]
[-] inspect.txt
[edit]
[-] xdrlib.txt
[edit]
[-] __builtin__.txt
[edit]
[-] ic.txt
[edit]
[-] signal.txt
[edit]
[-] dircache.txt
[edit]
[-] rfc822.txt
[edit]
[-] glob.txt
[edit]
[-] stringprep.txt
[edit]
[-] tix.txt
[edit]
[-] i18n.txt
[edit]
[-] email.util.txt
[edit]
[-] anydbm.txt
[edit]
[-] code.txt
[edit]
[-] __main__.txt
[edit]
[-] difflib.txt
[edit]
[-] unicodedata.txt
[edit]
[-] sunaudio.txt
[edit]
[-] msvcrt.txt
[edit]
[-] site.txt
[edit]
[-] codeop.txt
[edit]
[-] uu.txt
[edit]
[-] winsound.txt
[edit]
[-] smtpd.txt
[edit]
[-] al.txt
[edit]
[-] user.txt
[edit]
[-] poplib.txt
[edit]
[-] nntplib.txt
[edit]
[-] keyword.txt
[edit]
[-] ctypes.txt
[edit]
[-] socket.txt
[edit]
[-] simplehttpserver.txt
[edit]
[-] sys.txt
[edit]
[-] pyexpat.txt
[edit]
[-] configparser.txt
[edit]
[-] persistence.txt
[edit]
[-] trace.txt
[edit]
[-] runpy.txt
[edit]
[-] csv.txt
[edit]
[-] test.txt
[edit]
[-] framework.txt
[edit]
[-] gc.txt
[edit]
[-] cmath.txt
[edit]
[-] exceptions.txt
[edit]
[-] pyclbr.txt
[edit]
[-] sgi.txt
[edit]
[-] parser.txt
[edit]
[-] future_builtins.txt
[edit]
[-] pwd.txt
[edit]
[-] strings.txt
[edit]
[-] frameworks.txt
[edit]
[-] email.message.txt
[edit]
[-] cgihttpserver.txt
[edit]
[-] intro.txt
[edit]
[-] email.errors.txt
[edit]
[-] pprint.txt
[edit]
[-] pty.txt
[edit]
[-] xml.sax.utils.txt
[edit]
[-] posixfile.txt
[edit]
[-] tempfile.txt
[edit]
[-] termios.txt
[edit]
[-] gl.txt
[edit]
[-] gensuitemodule.txt
[edit]
[-] atexit.txt
[edit]
[-] basehttpserver.txt
[edit]
[-] unittest.txt
[edit]