wsgiref --- WSGI Utilities and Reference Implementation


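The Web Server Gateway Interface (WSGI) is a standard interface between web server software and web applications written in Python. Having a standard interface makes it easy to use an application that supports WSGI with a number of different web servers.

Only authors of web servers and programming frameworks need to know every detail and corner case of the WSGI design. You don't need to understand every detail of WSGI just to install a WSGI application or to write a web application using an existing framework.

wsgiref is a reference implementation of the WSGI specification that can be used to add WSGI support to a web server or framework. It provides utilities for manipulating WSGI environment variables and response headers, base classes for implementing WSGI servers, a demo HTTP server that serves WSGI applications, and a validation tool that checks WSGI servers and applications for conformance to the WSGI specification (PEP 3333).

See wsgi.readthedocs.io for more information about WSGI, and links to tutorials and other resources.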

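wsgiref.util -- WSGI environment utilities

This module provides a variety of utility functions for working with WSGI environments. A WSGI environment is a dictionary containing HTTP request variables as described in PEP 3333. All of the functions taking an environ parameter expect a WSGI-compliant dictionary to be supplied; please see PEP 3333 for a detailed specification.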

wsgiref.util.guess_scheme(environ)

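Return a guess for whether wsgi.url_scheme should be "http" or "https", by checking for an HTTPS environment variable in the environ dictionary. The return value is a string.

This function is useful when creating a gateway that wraps CGI or a CGI-like protocol such as FastCGI. Typically, servers providing such protocols will include an HTTPS variable with a value of "1", "yes", or "on" when a request is received via SSL. So, this function returns "https" if such a value is found, and "http" otherwise.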
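A minimal sketch of the check described above. The two environ dictionaries are made up for illustration; a real gateway would receive them from the server.

```python
from wsgiref.util import guess_scheme

# Hypothetical CGI-style environments, as a gateway might receive them.
plain_environ = {"REQUEST_METHOD": "GET"}
secure_environ = {"REQUEST_METHOD": "GET", "HTTPS": "on"}

plain_scheme = guess_scheme(plain_environ)    # no HTTPS variable present
secure_scheme = guess_scheme(secure_environ)  # HTTPS is "on"

print(plain_scheme)   # http
print(secure_scheme)  # https
```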

wsgiref.util.request_uri(environ, include_query=True)

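Return the full request URI, optionally including the query string, using the algorithm found in the "URL Reconstruction" section of PEP 3333. If include_query is false, the query string is not included in the resulting URI.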

wsgiref.util.application_uri(environ)

Similar to request_uri(), except that the PATH_INFO and QUERY_STRING variables are ignored. The result is the base URI of the application object addressed by the request.
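The following sketch contrasts request_uri() and application_uri(). The environ values (host name, paths, query string) are invented for the example, and only the keys these two functions read are filled in.

```python
from wsgiref.util import application_uri, request_uri

# Hypothetical request: an application mounted at /app receives
# GET /app/items/42?verbose=1 on example.com.
environ = {
    "wsgi.url_scheme": "http",
    "HTTP_HOST": "example.com",
    "SERVER_NAME": "example.com",
    "SERVER_PORT": "80",
    "SCRIPT_NAME": "/app",
    "PATH_INFO": "/items/42",
    "QUERY_STRING": "verbose=1",
}

full_uri = request_uri(environ)
uri_without_query = request_uri(environ, include_query=False)
base_uri = application_uri(environ)

print(full_uri)           # http://example.com/app/items/42?verbose=1
print(uri_without_query)  # http://example.com/app/items/42
print(base_uri)           # http://example.com/app
```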

wsgiref.util.shift_path_info(environ)

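Shift a single name from PATH_INFO to SCRIPT_NAME and return the name. The environ dictionary is modified in-place; use a copy if you need to keep the original PATH_INFO or SCRIPT_NAME intact.

If there are no remaining path segments in PATH_INFO, None is returned.

Typically, this routine is used to process each portion of a request URI path, for example to treat the path as a series of dictionary keys. This routine modifies the passed-in environment to make it suitable for invoking another WSGI application that is located at the target URI. For example, if there is a WSGI application at /foo, and the request URI path is /foo/bar/baz, and the WSGI application at /foo calls shift_path_info(), it will receive the string "bar", and the environment will be updated to be suitable for passing to a WSGI application at /foo/bar. That is, SCRIPT_NAME will change from /foo to /foo/bar, and PATH_INFO will change from /bar/baz to /baz.

When PATH_INFO is just a "/", this routine returns an empty string and appends a trailing slash to SCRIPT_NAME, even though empty path segments are normally ignored, and SCRIPT_NAME doesn't normally end in a slash. This is intentional behavior, to ensure that an application can tell the difference between URIs ending in /x and ones ending in /x/ when using this routine to do object traversal.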
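The /foo/bar/baz walk-through and the trailing-slash case can be reproduced with a short sketch. The environ dictionaries contain only the two keys the function touches, which is enough for a demonstration but is not a complete WSGI environment.

```python
from wsgiref.util import shift_path_info

# The application at /foo sees the remaining path /bar/baz.
environ = {"SCRIPT_NAME": "/foo", "PATH_INFO": "/bar/baz"}

first = shift_path_info(environ)   # "bar"
after_first = (environ["SCRIPT_NAME"], environ["PATH_INFO"])

second = shift_path_info(environ)  # "baz"
after_second = (environ["SCRIPT_NAME"], environ["PATH_INFO"])

third = shift_path_info(environ)   # None: nothing left in PATH_INFO

# Trailing-slash case: PATH_INFO is just "/".
slash_environ = {"SCRIPT_NAME": "/x", "PATH_INFO": "/"}
empty = shift_path_info(slash_environ)

print(first, after_first)              # bar ('/foo/bar', '/baz')
print(second, after_second)            # baz ('/foo/bar/baz', '')
print(third)                           # None
print(repr(empty), slash_environ["SCRIPT_NAME"])  # '' /x/
```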

wsgiref.util.setup_testing_defaults(environ)

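Update environ with trivial defaults for testing purposes.

This routine adds various parameters required for WSGI, including HTTP_HOST, SERVER_NAME, SERVER_PORT, REQUEST_METHOD, SCRIPT_NAME, PATH_INFO, and all of the PEP 3333-defined wsgi.* variables. It only supplies default values, and does not replace any existing settings for these variables.

This routine is intended to make it easier for unit tests of WSGI servers and applications to set up dummy environments. It should NOT be used by actual WSGI servers or applications, since the data is fake!

Example usage: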

from wsgiref.util import setup_testing_defaults
from wsgiref.simple_server import make_server

# A relatively simple WSGI application. It's going to print out the
# environment dictionary after being updated by setup_testing_defaults
def simple_app(environ, start_response):
    setup_testing_defaults(environ)

    status = '200 OK'
    headers = [('Content-type', 'text/plain; charset=utf-8')]

    start_response(status, headers)

    ret = [("%s: %s\n" % (key, value)).encode("utf-8")
           for key, value in environ.items()]
    return ret

with make_server('', 8000, simple_app) as httpd:
    print("Serving on port 8000...")
    httpd.serve_forever()

In addition to the environment functions above, the wsgiref.util module also provides these miscellaneous utilities:

wsgiref.util.is_hop_by_hop(header_name)

Return True if 'header_name' is an HTTP/1.1 "Hop-by-Hop" header, as defined by RFC 2616.
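A small sketch, assuming a hypothetical gateway that relays headers from an upstream response and wants to drop the hop-by-hop ones. The header list is invented for the example.

```python
from wsgiref.util import is_hop_by_hop

# Hypothetical headers received from an upstream server.
upstream_headers = [
    ("Content-Type", "text/plain"),
    ("Connection", "keep-alive"),
    ("Transfer-Encoding", "chunked"),
]

# Keep only end-to-end headers; the lookup ignores the case of the name.
end_to_end = [
    (name, value)
    for name, value in upstream_headers
    if not is_hop_by_hop(name)
]

print(end_to_end)  # [('Content-Type', 'text/plain')]
```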

class wsgiref.util.FileWrapper(filelike, blksize=8192)

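A wrapper to convert a file-like object to an iterator. The resulting objects support both __getitem__() and __iter__() iteration styles, for compatibility with Python 2.1 and Jython. As the object is iterated over, the optional blksize parameter will be repeatedly passed to the filelike object's read() method to obtain bytestrings to yield. When read() returns an empty bytestring, iteration is ended and is not resumable.

If filelike has a close() method, the returned object will also have a close() method, and it will invoke the filelike object's close() method when called.

Example usage: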

from io import StringIO
from wsgiref.util import FileWrapper

# We're using a StringIO buffer as the file-like object
filelike = StringIO("This is an example file-like object"*10)
wrapper = FileWrapper(filelike, blksize=5)

for chunk in wrapper:
    print(chunk)

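Deprecated since version 3.8: Support for the sequence protocol is deprecated.

wsgiref.headers -- WSGI response header tools

This module provides a single class, Headers, for convenient manipulation of WSGI response headers using a mapping-like interface.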

class wsgiref.headers.Headers([headers])

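Create a mapping-like object wrapping headers, which must be a list of header name/value tuples as described in PEP 3333. The default value of headers is an empty list.

Headers objects support typical mapping operations including __getitem__(), get(), __setitem__(), setdefault(), __delitem__() and __contains__(). For each of these methods, the key is the header name (treated case-insensitively), and the value is the first value associated with that header name. Setting a header deletes any existing values for that header, then adds a new value at the end of the wrapped header list. Headers' existing order is generally maintained, with new headers added to the end of the wrapped list.

Unlike a dictionary, Headers objects do not raise an error when you try to get or delete a key that isn't in the wrapped header list. Getting a nonexistent header just returns None, and deleting a nonexistent header does nothing.

Headers objects also support keys(), values(), and items() methods. The lists returned by keys() and items() can include the same key more than once if there is a multi-valued header. The len() of a Headers object is the same as the length of its items(), which is the same as the length of the wrapped header list. In fact, the items() method just returns a copy of the wrapped header list.

Calling bytes() on a Headers object returns a formatted bytestring suitable for transmission as HTTP response headers. Each header is placed on a line with its value, separated by a colon and a space. Each line is terminated by a carriage return and line feed, and the bytestring is terminated with a blank line.

In addition to their mapping interface and formatting features, Headers objects also have the following methods for querying and adding multi-valued headers, and for adding headers with MIME parameters: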

get_all(name)

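Return a list of all the values for the named header.

The returned list will be sorted in the order they appeared in the original header list or were added to this instance, and may contain duplicates. Any fields deleted and re-inserted are always appended to the header list. If no fields exist with the given name, returns an empty list.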
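The mapping behaviour and get_all() can be tried out with a short sketch. The header names and values are arbitrary examples.

```python
from wsgiref.headers import Headers

h = Headers([("Content-Type", "text/plain")])
h["X-Demo"] = "one"                  # replaces any existing X-Demo values
h.add_header("Set-Cookie", "a=1")    # add_header() appends, so a header
h.add_header("Set-Cookie", "b=2")    # can carry several values

first_cookie = h["set-cookie"]       # case-insensitive; first value only
all_cookies = h.get_all("Set-Cookie")
missing = h["X-Missing"]             # None rather than KeyError
del h["X-Missing"]                   # deleting a missing header is a no-op
count = len(h)                       # one entry per (name, value) pair
raw = bytes(h)                       # CRLF-terminated lines plus a blank line

print(first_cookie)  # a=1
print(all_cookies)   # ['a=1', 'b=2']
print(missing)       # None
print(count)         # 4
```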

add_header(name, value, **_params)

Add a (possibly multi-valued) header, with optional MIME parameters specified via keyword arguments.

name is the header field to add. Keyword arguments can be used to set MIME parameters for the header field. Each parameter must be a string or None. Underscores in parameter names are converted to dashes, since dashes are illegal in Python identifiers, but many MIME parameter names include dashes. If the parameter value is a string, it is added to the header value parameters in the form name="value". If it is None, only the parameter name is added. (This is used for MIME parameters without a value.) Example usage:

h.add_header('content-disposition', 'attachment', filename='bud.gif')

The above will add a header that looks like this:

Content-Disposition: attachment; filename="bud.gif"

Changed in version 3.5: headers parameter is optional.

wsgiref.simple_server -- a simple WSGI HTTP server

This module implements a simple HTTP server (based on http.server) that serves WSGI applications. Each server instance serves a single WSGI application on a given host and port. If you want to serve multiple applications on a single host and port, you should create a WSGI application that parses PATH_INFO to select which application to invoke for each request. (E.g., using the shift_path_info() function from wsgiref.util.)
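One way to serve several applications from a single host and port is a small dispatching application built on shift_path_info(). The sketch below is illustrative only: ROUTES, make_text_app, dispatcher, and call are hypothetical names, and the helper call() invokes the application directly instead of going through a server.

```python
from wsgiref.util import setup_testing_defaults, shift_path_info

def make_text_app(text):
    """Build a trivial WSGI application that always answers with text."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [text.encode("utf-8")]
    return app

# Hypothetical routing table: first path segment -> WSGI application.
ROUTES = {"blog": make_text_app("blog"), "shop": make_text_app("shop")}

def dispatcher(environ, start_response):
    # Move the first segment of PATH_INFO onto SCRIPT_NAME and use it as key.
    name = shift_path_info(environ)
    app = ROUTES.get(name)
    if app is None:
        start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found"]
    return app(environ, start_response)

def call(app, path):
    """Invoke app directly with a fake environment (for demonstration only)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)
    captured = {}
    def start_response(status, headers, exc_info=None):
        captured["status"] = status
    body = b"".join(app(environ, start_response))
    return captured["status"], body

print(call(dispatcher, "/blog/post/1"))  # ('200 OK', b'blog')
print(call(dispatcher, "/missing"))      # ('404 Not Found', b'not found')
```

The dispatcher could be passed to make_server() like any other WSGI application.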

wsgiref.simple_server.make_server(host, port, app, server_class=WSGIServer, handler_class=WSGIRequestHandler)

Create a new WSGI server listening on host and port, accepting connections for app. The return value is an instance of the supplied server_class, and will process requests using the specified handler_class. app must be a WSGI application object, as defined by PEP 3333.

Example usage:

from wsgiref.simple_server import make_server, demo_app

with make_server('', 8000, demo_app) as httpd:
    print("Serving HTTP on port 8000...")

    # Respond to requests until process is killed
    httpd.serve_forever()

    # Alternative: serve one request, then exit
    # (use this instead of serve_forever(), which never returns)
    # httpd.handle_request()

wsgiref.simple_server.demo_app(environ, start_response)

This function is a small but complete WSGI application that returns a text page containing the message "Hello world!" and a list of the key/value pairs provided in the environ parameter. It's useful for verifying that a WSGI server (such as wsgiref.simple_server) is able to run a simple WSGI application correctly.

class wsgiref.simple_server.WSGIServer(server_address, RequestHandlerClass)

Create a WSGIServer instance. server_address should be a (host,port) tuple, and RequestHandlerClass should be the subclass of http.server.BaseHTTPRequestHandler that will be used to process requests.

You do not normally need to call this constructor, as the make_server() function can handle all the details for you.

WSGIServer is a subclass of http.server.HTTPServer, so all of its methods (such as serve_forever() and handle_request()) are available. WSGIServer also provides these WSGI-specific methods:

set_app(application)

Sets the callable application as the WSGI application that will receive requests.

get_app()

Returns the currently-set application callable.

Normally, however, you do not need to use these additional methods, as set_app() is normally called by make_server(), and get_app() exists mainly for the benefit of request handler instances.

class wsgiref.simple_server.WSGIRequestHandler(request, client_address, server)

Create an HTTP handler for the given request (i.e. a socket), client_address (a (host,port) tuple), and server (WSGIServer instance).

You do not need to create instances of this class directly; they are automatically created as needed by WSGIServer objects. You can, however, subclass this class and supply it as a handler_class to the make_server() function. Some possibly relevant methods for overriding in subclasses:

get_environ()

Returns a dictionary containing the WSGI environment for a request. The default implementation copies the contents of the WSGIServer object's base_environ dictionary attribute and then adds various headers derived from the HTTP request. Each call to this method should return a new dictionary containing all of the relevant CGI environment variables as specified in PEP 3333.

get_stderr()

Return the object that should be used as the wsgi.errors stream. The default implementation just returns sys.stderr.

handle()

Process the HTTP request. The default implementation creates a handler instance using a wsgiref.handlers class to implement the actual WSGI application interface.

wsgiref.validate --- WSGI conformance checker

When creating new WSGI application objects, frameworks, servers, or middleware, it can be useful to validate the new code's conformance using wsgiref.validate. This module provides a function that creates WSGI application objects that validate communications between a WSGI server or gateway and a WSGI application object, to check both sides for protocol conformance.

Note that this utility does not guarantee complete PEP 3333 compliance; an absence of errors from this module does not necessarily mean that errors do not exist. However, if this module does produce an error, then it is virtually certain that either the server or application is not 100% compliant.

This module is based on the paste.lint module from Ian Bicking's "Python Paste" library.

wsgiref.validate.validator(application)

Wrap application and return a new WSGI application object. The returned application will forward all requests to the original application, and will check that both the application and the server invoking it are conforming to the WSGI specification and to RFC 2616.

Any detected nonconformance results in an AssertionError being raised; note, however, that how these errors are handled is server-dependent. For example, wsgiref.simple_server and other servers based on wsgiref.handlers (that don't override the error handling methods to do something else) will simply output a message that an error has occurred, and dump the traceback to sys.stderr or some other error stream.

This wrapper may also generate output using the warnings module to indicate behaviors that are questionable but which may not actually be prohibited by PEP 3333. Unless they are suppressed using Python command-line options or the warnings API, any such warnings will be written to sys.stderr (not wsgi.errors, unless they happen to be the same object).

Example usage:

from wsgiref.validate import validator
from wsgiref.simple_server import make_server

# Our callable object which is intentionally not compliant to the
# standard, so the validator is going to break
def simple_app(environ, start_response):
    status = '200 OK'  # HTTP Status
    headers = [('Content-type', 'text/plain')]  # HTTP Headers
    start_response(status, headers)

    # This is going to break because we need to return a list, and
    # the validator is going to inform us
    return b"Hello World"

# This is the application wrapped in a validator
validator_app = validator(simple_app)

with make_server('', 8000, validator_app) as httpd:
    print("Listening on port 8000....")
    httpd.serve_forever()
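The wrapper can also be exercised without starting a server, which may be convenient in unit tests. The sketch below calls the validated application directly with an environment built by setup_testing_defaults(); good_app, bad_app, and run_once are hypothetical names, and QUERY_STRING is set by hand because setup_testing_defaults() does not supply it.

```python
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import validator

def good_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello World"]

def bad_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return b"Hello World"  # a bare bytes object instead of a list of bytes

def run_once(app):
    """Call app through the validator with a fake test environment."""
    environ = {"QUERY_STRING": ""}
    setup_testing_defaults(environ)

    def start_response(status, headers, exc_info=None):
        return lambda data: None  # stand-in for the legacy write() callable

    result = validator(app)(environ, start_response)
    try:
        return b"".join(result)
    finally:
        result.close()  # the validator expects the iterable to be closed

print(run_once(good_app))  # b'Hello World'

try:
    run_once(bad_app)
except AssertionError as exc:
    print("validator complained:", exc)
```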

wsgiref.handlers -- server/gateway base classes

This module provides base handler classes for implementing WSGI servers and gateways. These base classes handle most of the work of communicating with a WSGI application, as long as they are given a CGI-like environment, along with input, output, and error streams.

class wsgiref.handlers.CGIHandler

CGI-based invocation via sys.stdin, sys.stdout, sys.stderr and os.environ. This is useful when you have a WSGI application and want to run it as a CGI script. Simply invoke CGIHandler().run(app), where app is the WSGI application object you wish to invoke.

This class is a subclass of BaseCGIHandler that sets wsgi.run_once to true, wsgi.multithread to false, and wsgi.multiprocess to true, and always uses sys and os to obtain the necessary CGI streams and environment.

class wsgiref.handlers.IISCGIHandler

A specialized alternative to CGIHandler, for use when deploying on Microsoft's IIS web server, without having set the config allowPathInfo option (IIS>=7) or metabase allowPathInfoForScriptMappings (IIS<7).

By default, IIS gives a PATH_INFO that duplicates the SCRIPT_NAME at the front, causing problems for WSGI applications that wish to implement routing. This handler strips any such duplicated path.

IIS can be configured to pass the correct PATH_INFO, but this causes another bug where PATH_TRANSLATED is wrong. Luckily this variable is rarely used and is not guaranteed by WSGI. On IIS<7, though, the setting can only be made on a vhost level, affecting all other script mappings, many of which break when exposed to the PATH_TRANSLATED bug. For this reason IIS<7 is almost never deployed with the fix. (Even IIS7 rarely uses it because there is still no UI for it.)

There is no way for CGI code to tell whether the option was set, so a separate handler class is provided. It is used in the same way as CGIHandler, i.e., by calling IISCGIHandler().run(app), where app is the WSGI application object you wish to invoke.

New in version 3.2.

class wsgiref.handlers.BaseCGIHandler(stdin, stdout, stderr, environ, multithread=True, multiprocess=False)

Similar to CGIHandler, but instead of using the sys and os modules, the CGI environment and I/O streams are specified explicitly. The multithread and multiprocess values are used to set the wsgi.multithread and wsgi.multiprocess flags for any applications run by the handler instance.

This class is a subclass of SimpleHandler intended for use with software other than HTTP "origin servers". If you are writing a gateway protocol implementation (such as CGI, FastCGI, SCGI, etc.) that uses a Status: header to send an HTTP status, you probably want to subclass this instead of SimpleHandler.

class wsgiref.handlers.SimpleHandler(stdin, stdout, stderr, environ, multithread=True, multiprocess=False)

Similar to BaseCGIHandler, but designed for use with HTTP origin servers. If you are writing an HTTP server implementation, you will probably want to subclass this instead of BaseCGIHandler.

This class is a subclass of BaseHandler. It overrides the __init__(), get_stdin(), get_stderr(), add_cgi_vars(), _write(), and _flush() methods to support explicitly setting the environment and streams via the constructor. The supplied environment and streams are stored in the stdin, stdout, stderr, and environ attributes.

The write() method of stdout should write each chunk in full, like io.BufferedIOBase.
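The difference between the two classes shows up in the first line of the response. In this sketch both handlers run the same application against in-memory streams; the BASE_ENVIRON values and the render helper are assumptions made for the example, not part of wsgiref.

```python
import io
from wsgiref.handlers import BaseCGIHandler, SimpleHandler

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Hello\n"]

# Hypothetical request variables; a real server would derive these from the
# incoming request.
BASE_ENVIRON = {
    "REQUEST_METHOD": "GET",
    "SERVER_NAME": "localhost",
    "SERVER_PORT": "80",
    "SERVER_PROTOCOL": "HTTP/1.0",
    "SCRIPT_NAME": "",
    "PATH_INFO": "/",
    "QUERY_STRING": "",
}

def render(handler_class):
    """Run hello_app through handler_class and return the raw bytes written."""
    stdout = io.BytesIO()
    handler = handler_class(io.BytesIO(), stdout, io.StringIO(), dict(BASE_ENVIRON))
    handler.run(hello_app)
    return stdout.getvalue()

origin_response = render(SimpleHandler)    # talks HTTP directly
gateway_response = render(BaseCGIHandler)  # reports status via a Status: header

print(origin_response.split(b"\r\n")[0])   # b'HTTP/1.0 200 OK'
print(gateway_response.split(b"\r\n")[0])  # b'Status: 200 OK'
```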

class wsgiref.handlers.BaseHandler

This is an abstract base class for running WSGI applications. Each instance will handle a single HTTP request, although in principle you could create a subclass that was reusable for multiple requests.

BaseHandler instances have only one method intended for external use:

run(app)

Run the specified WSGI application, app.

All of the other BaseHandler methods are invoked by this method in the process of running the application, and thus exist primarily to allow customizing the process.

The following methods MUST be overridden in a subclass:

_write(data)

Buffer the bytes data for transmission to the client. It's okay if this method actually transmits the data; BaseHandler just separates write and flush operations for greater efficiency when the underlying system actually has such a distinction.

_flush()

Force buffered data to be transmitted to the client. It's okay if this method is a no-op (i.e., if _write() actually sends the data).

get_stdin()

Return an input stream object suitable for use as the wsgi.input of the request currently being processed.

get_stderr()

Return an output stream object suitable for use as the wsgi.errors of the request currently being processed.

add_cgi_vars()

Insert CGI variables for the current request into the environ attribute.
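As an illustration of the five required overrides, here is a minimal sketch of a BaseHandler subclass that collects the response in memory. BufferHandler and its cgi_vars argument are hypothetical; a real subclass would read from and write to an actual connection.

```python
import io
from wsgiref.handlers import BaseHandler

class BufferHandler(BaseHandler):
    """Hypothetical handler that keeps the whole response in memory."""

    def __init__(self, cgi_vars):
        self.cgi_vars = cgi_vars      # CGI-style variables for this request
        self.sent = io.BytesIO()      # everything "transmitted" so far
        self.errors = io.StringIO()   # target for wsgi.errors

    def _write(self, data):
        self.sent.write(data)

    def _flush(self):
        pass  # nothing is buffered separately, so flushing is a no-op

    def get_stdin(self):
        return io.BytesIO()  # empty request body

    def get_stderr(self):
        return self.errors

    def add_cgi_vars(self):
        self.environ.update(self.cgi_vars)

def hello_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from BufferHandler\n"]

handler = BufferHandler({
    "REQUEST_METHOD": "GET",
    "SERVER_NAME": "localhost",
    "SERVER_PORT": "80",
    "SERVER_PROTOCOL": "HTTP/1.0",  # needed because origin_server is true
    "SCRIPT_NAME": "",
    "PATH_INFO": "/",
})
handler.run(hello_app)
output = handler.sent.getvalue()

print(output.split(b"\r\n")[0])  # b'HTTP/1.0 200 OK'
```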

Here are some other methods and attributes you may wish to override. This list is only a summary, however, and does not include every method that can be overridden. You should consult the docstrings and source code for additional information before attempting to create a customized BaseHandler subclass.

Attributes and methods for customizing the WSGI environment:

wsgi_multithread

The value to be used for the wsgi.multithread environment variable. It defaults to true in BaseHandler, but may have a different default (or be set by the constructor) in the other subclasses.

wsgi_multiprocess

The value to be used for the wsgi.multiprocess environment variable. It defaults to true in BaseHandler, but may have a different default (or be set by the constructor) in the other subclasses.

wsgi_run_once

The value to be used for the wsgi.run_once environment variable. It defaults to false in BaseHandler, but CGIHandler sets it to true by default.

os_environ

The default environment variables to be included in every request's WSGI environment. By default, this is a copy of os.environ at the time that wsgiref.handlers was imported, but subclasses can either create their own at the class or instance level. Note that the dictionary should be considered read-only, since the default value is shared between multiple classes and instances.

server_software

If the origin_server attribute is set, this attribute's value is used to set the default SERVER_SOFTWARE WSGI environment variable, and also to set a default Server: header in HTTP responses. It is ignored for handlers (such as BaseCGIHandler and CGIHandler) that are not HTTP origin servers.

Changed in version 3.3: The term "Python" is replaced with an implementation-specific term like "CPython", "Jython", etc.

get_scheme()

Return the URL scheme being used for the current request. The default implementation uses the guess_scheme() function from wsgiref.util to guess whether the scheme should be "http" or "https", based on the current request's environ variables.

setup_environ()

Set the environ attribute to a fully-populated WSGI environment. The default implementation uses all of the above methods and attributes, plus the get_stdin(), get_stderr(), and add_cgi_vars() methods and the wsgi_file_wrapper attribute. It also inserts a SERVER_SOFTWARE key if not present, as long as the origin_server attribute is a true value and the server_software attribute is set.

Methods and attributes for customizing exception handling:

log_exception(exc_info)

Log the exc_info tuple in the server log. exc_info is a (type, value, traceback) tuple. The default implementation simply writes the traceback to the request's wsgi.errors stream and flushes it. Subclasses can override this method to change the format or retarget the output, mail the traceback to an administrator, or whatever other action may be deemed suitable.

traceback_limit

The maximum number of frames to include in tracebacks output by the default log_exception() method. If None, all frames are included.

error_output(environ, start_response)

This method is a WSGI application to generate an error page for the user. It is only invoked if an error occurs before headers are sent to the client.

This method can access the current error information using sys.exc_info(), and should pass that information to start_response when calling it (as described in the "Error Handling" section of PEP 3333).

The default implementation just uses the error_status, error_headers, and error_body attributes to generate an output page. Subclasses can override this to produce more dynamic error output.

Note, however, that it's not recommended from a security perspective to spit out diagnostics to any old user; ideally, you should have to do something special to enable diagnostic output, which is why the default implementation doesn't include any.

error_status

The HTTP status used for error responses. This should be a status string as defined in PEP 3333; it defaults to a 500 code and message.

error_headers

The HTTP headers used for error responses. This should be a list of WSGI response headers ((name, value) tuples), as described in PEP 3333. The default list just sets the content type to text/plain.

error_body

The error response body. This should be an HTTP response body bytestring. It defaults to the plain text, "A server error occurred. Please contact the administrator."
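The error attributes can be overridden at class level. The following sketch, which assumes a hypothetical FriendlyErrorHandler subclass of SimpleHandler and a deliberately failing application, shows the default error_output() using the replaced error_body while the traceback goes to the wsgi.errors stream.

```python
import io
from wsgiref.handlers import SimpleHandler

class FriendlyErrorHandler(SimpleHandler):
    # Hypothetical replacement for the default error text.
    error_body = b"Something went wrong. Please try again later.\n"
    traceback_limit = 5  # cap the number of logged frames

def broken_app(environ, start_response):
    raise RuntimeError("simulated failure")

# Hypothetical request variables for the demonstration.
environ = {
    "REQUEST_METHOD": "GET",
    "SERVER_NAME": "localhost",
    "SERVER_PORT": "80",
    "SERVER_PROTOCOL": "HTTP/1.0",
    "SCRIPT_NAME": "",
    "PATH_INFO": "/",
}

stdout = io.BytesIO()
stderr = io.StringIO()
FriendlyErrorHandler(io.BytesIO(), stdout, stderr, environ).run(broken_app)

response = stdout.getvalue()
log = stderr.getvalue()

print(response.split(b"\r\n")[0])                  # b'HTTP/1.0 500 Internal Server Error'
print("RuntimeError: simulated failure" in log)    # True
```

The client sees only the generic message; the diagnostic detail stays in the error stream, in line with the security note above.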

Methods and attributes for PEP 3333's "Optional Platform-Specific File Handling" feature:

wsgi_file_wrapper

A wsgi.file_wrapper factory, or None. The default value of this attribute is the wsgiref.util.FileWrapper class.

sendfile()

Override to implement platform-specific file transmission. This method is called only if the application's return value is an instance of the class specified by the wsgi_file_wrapper attribute. It should return a true value if it was able to successfully transmit the file, so that the default transmission code will not be executed. The default implementation of this method just returns a false value.

Miscellaneous methods and attributes:

origin_server

This attribute should be set to a true value if the handler's _write() and _flush() are being used to communicate directly to the client, rather than via a CGI-like gateway protocol that wants the HTTP status in a special Status: header.

This attribute's default value is true in BaseHandler, but false in BaseCGIHandler and CGIHandler.

http_version

If origin_server is true, this string attribute is used to set the HTTP version of the response sent to the client. It defaults to "1.0".

wsgiref.handlers.read_environ()

Transcode CGI variables from os.environ to PEP 3333 "bytes in unicode" strings, returning a new dictionary. This function is used by CGIHandler and IISCGIHandler in place of directly using os.environ, which is not necessarily WSGI-compliant on all platforms and web servers using Python 3 -- specifically, ones where the OS's actual environment is Unicode (i.e. Windows), or ones where the environment is bytes, but the system encoding used by Python to decode it is anything other than ISO-8859-1 (e.g. Unix systems using UTF-8).

If you are implementing a CGI-based handler of your own, you probably want to use this routine instead of just copying values out of os.environ directly.
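A sketch of what such a handler might look like. MyCGIHandler is a hypothetical class; the flag values mirror those the text gives for CGIHandler. The class is only defined here, not instantiated, since running it would read from and write to the process's standard streams.

```python
import sys
from wsgiref.handlers import BaseCGIHandler, read_environ

class MyCGIHandler(BaseCGIHandler):
    """Hypothetical CGI-style handler that uses read_environ()."""

    wsgi_run_once = True  # a CGI script handles one request per process

    def __init__(self):
        super().__init__(
            sys.stdin.buffer,
            sys.stdout.buffer,
            sys.stderr,
            read_environ(),  # transcoded copy instead of os.environ itself
            multithread=False,
            multiprocess=True,
        )

# read_environ() can also be called on its own; it returns a new dict.
environ = read_environ()
print(type(environ).__name__)  # dict
```

A CGI script would then call MyCGIHandler().run(app), where app is the WSGI application object.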

New in version 3.2.

Examples

This is a working "Hello World" WSGI application:

from wsgiref.simple_server import make_server

# Every WSGI application must have an application object - a callable
# object that accepts two arguments. For that purpose, we're going to
# use a function (note that you're not limited to a function, you can
# use a class for example). The first argument passed to the function
# is a dictionary containing CGI-style environment variables and the
# second variable is the callable object.
def hello_world_app(environ, start_response):
    status = '200 OK'  # HTTP Status
    headers = [('Content-type', 'text/plain; charset=utf-8')]  # HTTP Headers
    start_response(status, headers)

    # The returned object is going to be printed
    return [b"Hello World"]

with make_server('', 8000, hello_world_app) as httpd:
    print("Serving on port 8000...")

    # Serve until process is killed
    httpd.serve_forever()

Example of a WSGI application serving the current directory, accepting an optional directory and port number (default: 8000) on the command line:

#!/usr/bin/env python3
'''
Small wsgiref based web server. Takes a path to serve from and an
optional port number (defaults to 8000), then tries to serve files.
Mime types are guessed from the file names, 404 errors are raised
if the file is not found. Used for the make serve target in Doc.
'''
import sys
import os
import mimetypes
from wsgiref import simple_server, util

def app(environ, respond):

    fn = os.path.join(path, environ['PATH_INFO'][1:])
    if '.' not in fn.split(os.path.sep)[-1]:
        fn = os.path.join(fn, 'index.html')
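    # Fall back to a generic type when the extension is not recognized,
    # since a header value of None is not valid
    mime_type = mimetypes.guess_type(fn)[0] or 'application/octet-stream'

    if os.path.exists(fn):
        respond('200 OK', [('Content-Type', mime_type)])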
        return util.FileWrapper(open(fn, "rb"))
    else:
        respond('404 Not Found', [('Content-Type', 'text/plain')])
        return [b'not found']

if __name__ == '__main__':
    path = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 8000
    httpd = simple_server.make_server('', port, app)
    print("Serving {} on port {}, control-C to stop".format(path, port))
    try:
        httpd.serve_forever()
    except KeyboardInterrupt:
        print("Shutting down.")
        httpd.server_close()