"zipapp" --- Manage executable Python zip archives
**************************************************

New in version 3.5.

**Source code:** Lib/zipapp.py

======================================================================

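This module provides tools to manage the creation of zip files
containing Python code, which can be executed directly by the Python
interpreter.  The module provides both a Command-Line Interface and a
Python API.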


Basic Example
=============

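The following example shows how the Command-Line Interface can be used
to create an executable archive from a directory containing Python
code.  When run, the archive will execute the "main" function from the
module "myapp" in the archive.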

   $ python -m zipapp myapp -m "myapp:main"
   $ python myapp.pyz
   <output from myapp>


Command-Line Interface
======================

When called as a program from the command line, the following form is
used:

   $ python -m zipapp source [options]

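If *source* is a directory, this will create an archive from the
contents of *source*.  If *source* is a file, it should be an archive,
and it will be copied to the target archive (or the contents of its
shebang line will be displayed if the --info option is specified).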

The following options are understood:

-o <output>, --output=<output>

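   Write the output to a file named *output*.  If this option is not
   specified, the output filename will be the same as the input
   *source*, with the extension ".pyz" added.  If an explicit filename
   is given, it is used as is (so a ".pyz" extension should be
   included if required).

   An output filename must be specified if the *source* is an archive
   (and in that case, *output* must not be the same as *source*).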

-p <interpreter>, --python=<interpreter>

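   Add a "#!" line to the archive specifying *interpreter* as the
   command to run.  Also, on POSIX, make the archive executable.  The
   default is to write no "#!" line, and not make the file executable.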

-m <mainfn>, --main=<mainfn>

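   Write a "__main__.py" file to the archive that executes *mainfn*.
   The *mainfn* argument should have the form "pkg.mod:fn", where
   "pkg.mod" is a package/module in the archive, and "fn" is a
   callable in the given module.  The "__main__.py" file will execute
   that callable.

   "--main" cannot be specified when copying an archive.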

-c, --compress

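   Compress files with the deflate method, reducing the size of the
   output file.  By default, files are stored uncompressed in the
   archive.

   "--compress" has no effect when copying an archive.

   New in version 3.7.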

--info

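   Display the interpreter embedded in the archive, for diagnostic
   purposes.  In this case, any other options are ignored and SOURCE
   must be an archive, not a directory.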

-h, --help

   Print a short usage message and exit.
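
As a rough illustration of how these options combine, the sketch below
drives the command-line interface from Python with "subprocess".  The
directory layout, the module name "myapp" and the helper name
"build_and_inspect" are assumptions made for this example only.

```python
import subprocess
import sys
from pathlib import Path


def build_and_inspect(workdir):
    """Build workdir/app into workdir/app.pyz and return the --info output."""
    workdir = Path(workdir)
    source = workdir / "app"
    source.mkdir()
    # Hypothetical application module exposing a main() callable.
    (source / "myapp.py").write_text(
        "def main():\n    print('hello from myapp')\n"
    )

    target = workdir / "app.pyz"
    # -o names the output file, -p writes the "#!" line and
    # -m generates a __main__.py that calls myapp.main().
    subprocess.run(
        [sys.executable, "-m", "zipapp", str(source),
         "-o", str(target),
         "-p", "/usr/bin/env python3",
         "-m", "myapp:main"],
        check=True,
    )
    # --info reports the interpreter recorded in the archive.
    info = subprocess.run(
        [sys.executable, "-m", "zipapp", str(target), "--info"],
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    )
    return target, info.stdout
```

Calling "build_and_inspect()" on an empty scratch directory returns
the path of the new archive together with the text printed by
"--info".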


Python API
==========

The module defines two convenience functions:

zipapp.create_archive(source, target=None, interpreter=None, main=None, filter=None, compressed=False)

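   Create an application archive from *source*.  The source can be any
   of the following:

   * The name of a directory, or a *path-like object* referring to a
     directory, in which case a new application archive will be
     created from the content of that directory.

   * The name of an existing application archive file, or a *path-like
     object* referring to such a file, in which case the file is
     copied to the target (modifying it to reflect the value given for
     the *interpreter* argument).  The file name should include the
     ".pyz" extension, if required.

   * A file object open for reading in bytes mode.  The content of the
     file should be an application archive, and the file object is
     assumed to be positioned at the start of the archive.

   The *target* argument determines where the resulting archive will
   be written:

   * If it is a file name, or a *path-like object*, the archive will
     be written to that file.

   * If it is an open file object, the archive will be written to that
     file object, which must be open for writing in bytes mode.

   * If the target is omitted (or "None"), the source must be a
     directory and the target will be a file with the same name as the
     source, with a ".pyz" extension added.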

   The *interpreter* argument specifies the name of the Python
   interpreter with which the archive will be executed.  It is written
   as a "shebang" line at the start of the archive.  On POSIX, this
   will be interpreted by the OS, and on Windows it will be handled by
   the Python launcher.  Omitting the *interpreter* results in no
   shebang line being written.  If an interpreter is specified, and
   the target is a filename, the executable bit of the target file
   will be set.

   The *main* argument specifies the name of a callable which will be
   used as the main program for the archive.  It can only be specified
   if the source is a directory, and the source does not already
   contain a "__main__.py" file.  The *main* argument should take the
   form "pkg.module:callable" and the archive will be run by importing
   "pkg.module" and executing the given callable with no arguments.
   It is an error to omit *main* if the source is a directory and does
   not contain a "__main__.py" file, as otherwise the resulting
   archive would not be executable.

   The optional *filter* argument specifies a callback function that
   is passed a Path object representing the path to the file being
   added (relative to the source directory).  It should return "True"
   if the file is to be added.

   The optional *compressed* argument determines whether files are
   compressed.  If set to "True", files in the archive are compressed
   with the deflate method; otherwise, files are stored uncompressed.
   This argument has no effect when copying an existing archive.

   If a file object is specified for *source* or *target*, it is the
   caller's responsibility to close it after calling create_archive.

   When copying an existing archive, file objects supplied only need
   "read" and "readline", or "write" methods.  When creating an
   archive from a directory, if the target is a file object it will be
   passed to the "zipfile.ZipFile" class, and must supply the methods
   needed by that class.

   New in version 3.7: Added the *filter* and *compressed* arguments.

zipapp.get_interpreter(archive)

   Return the interpreter specified in the "#!" line at the start of
   the archive.  If there is no "#!" line, return "None". The
   *archive* argument can be a filename or a file-like object open for
   reading in bytes mode.  It is assumed to be at the start of the
   archive.
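
The two functions can be exercised together.  The following is a
minimal sketch, not a recommended layout: the helper names
"skip_bytecode" and "build_filtered_archive" and the file names are
invented for the example.  It builds a compressed archive, uses a
*filter* callback to leave out cached bytecode, and reads the shebang
line back.

```python
import tempfile
import zipapp
import zipfile
from pathlib import Path


def skip_bytecode(path):
    """Filter callback; path is relative to the source directory."""
    return "__pycache__" not in path.parts and path.suffix != ".pyc"


def build_filtered_archive(workdir):
    """Create a small compressed archive without cached bytecode."""
    workdir = Path(workdir)
    source = workdir / "app"
    (source / "__pycache__").mkdir(parents=True)
    (source / "myapp.py").write_text("def main():\n    print('hello')\n")
    (source / "__pycache__" / "myapp.cpython-37.pyc").write_bytes(b"stale")

    target = workdir / "app.pyz"
    zipapp.create_archive(
        source,
        target,
        interpreter="/usr/bin/env python3",
        main="myapp:main",     # a __main__.py is generated in the archive
        filter=skip_bytecode,
        compressed=True,       # deflate instead of storing files as they are
    )
    return target


with tempfile.TemporaryDirectory() as tmp:
    archive = build_filtered_archive(tmp)
    # The interpreter written above is read back from the "#!" line.
    print(zipapp.get_interpreter(archive))
    with zipfile.ZipFile(archive) as zf:
        print(sorted(zf.namelist()))
```

The callback checks "path.parts" as well as the suffix because every
file is passed to it individually, including those inside a directory
that was itself rejected.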


Examples
========

Pack up a directory into an archive, and run it.

   $ python -m zipapp myapp
   $ python myapp.pyz
   <output from myapp>

The same can be done using the "create_archive()" function:

   >>> import zipapp
   >>> zipapp.create_archive('myapp', 'myapp.pyz')

To make the application directly executable on POSIX, specify an
interpreter to use.

   $ python -m zipapp myapp -p "/usr/bin/env python"
   $ ./myapp.pyz
   <output from myapp>

To replace the shebang line on an existing archive, create a modified
archive using the "create_archive()" function:

   >>> import zipapp
   >>> zipapp.create_archive('old_archive.pyz', 'new_archive.pyz', '/usr/bin/python3')

To update the file in place, do the replacement in memory using a
"BytesIO" object, and then overwrite the source afterwards.  Note that
there is a risk when overwriting a file in place that an error will
result in the loss of the original file.  This code does not protect
against such errors, but production code should do so.  Also, this
method will only work if the archive fits in memory:

   >>> import zipapp
   >>> import io
   >>> temp = io.BytesIO()
   >>> zipapp.create_archive('myapp.pyz', temp, '/usr/bin/python2')
   >>> with open('myapp.pyz', 'wb') as f:
   ...     f.write(temp.getvalue())
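
One way to guard against the risk described above is to write the
modified archive to a temporary file in the same directory and swap it
into place only once the copy has succeeded.  This is a sketch under
that assumption; "replace_interpreter" is a hypothetical helper, not
part of "zipapp".

```python
import os
import shutil
import tempfile
import zipapp


def replace_interpreter(archive, interpreter):
    """Rewrite the shebang line of archive without risking the original."""
    archive = os.path.abspath(archive)
    # Create the temporary file next to the archive so that the final
    # os.replace() does not have to cross filesystems.
    fd, tmp_name = tempfile.mkstemp(
        suffix=".pyz", dir=os.path.dirname(archive)
    )
    os.close(fd)
    try:
        # Start from the original permission bits; create_archive() adds
        # the executable bit itself when an interpreter is given.
        shutil.copymode(archive, tmp_name)
        zipapp.create_archive(archive, tmp_name, interpreter)
        os.replace(tmp_name, archive)
    except BaseException:
        # Leave the original untouched and discard the partial copy.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Unlike the "BytesIO" version, this does not need the whole archive to
fit in memory, at the cost of temporarily using extra disk space.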


Specifying the Interpreter
==========================

Note that if you specify an interpreter and then distribute your
application archive, you need to ensure that the interpreter used is
portable.  The Python launcher for Windows supports most common forms
of POSIX "#!" line, but there are other issues to consider:

* If you use "/usr/bin/env python" (or other forms of the "python"
  command, such as "/usr/bin/python"), you need to consider that your
  users may have either Python 2 or Python 3 as their default, and
  write your code to work under both versions.

* If you use an explicit version, for example "/usr/bin/env python3",
  your application will not work for users who do not have that
  version.  (This may be what you want if you have not made your code
  Python 2 compatible).

* There is no way to say "python X.Y or later", so be careful of using
  an exact version like "/usr/bin/env python3.4" as you will need to
  change your shebang line for users of Python 3.5, for example.

Typically, you should use either "/usr/bin/env python2" or
"/usr/bin/env python3", depending on whether your code is written for
Python 2 or 3.
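
Because the shebang line cannot express "X.Y or later", one possible
complement is a run-time check at the top of "__main__.py".  The
sketch below is only an illustration: "require_python" is a
hypothetical helper, and the file containing it must itself use syntax
that the oldest interpreter you expect can parse.

```python
import sys


def require_python(minimum):
    """Exit with a readable message if the interpreter is too old."""
    if sys.version_info < minimum:
        wanted = ".".join(str(part) for part in minimum)
        found = ".".join(str(part) for part in sys.version_info[:3])
        sys.exit(
            "This application needs Python {} or later (found {}).".format(
                wanted, found
            )
        )
```

Calling, for example, "require_python((3, 6))" before any other import
turns a confusing failure into a clear message.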


Creating Standalone Applications with zipapp
============================================

Using the "zipapp" module, it is possible to create self-contained
Python programs, which can be distributed to end users who only need
to have a suitable version of Python installed on their system.  The
key to doing this is to bundle all of the application's dependencies
into the archive, along with the application code.

The steps to create a standalone archive are as follows:

1. Create your application in a directory as normal, so you have a
   "myapp" directory containing a "__main__.py" file, and any
   supporting application code.

2. Install all of your application's dependencies into the "myapp"
   directory, using pip:

      $ python -m pip install -r requirements.txt --target myapp

   (this assumes you have your project requirements in a
   "requirements.txt" file - if not, you can just list the
   dependencies manually on the pip command line).

3. Optionally, delete the ".dist-info" directories created by pip in
   the "myapp" directory. These hold metadata for pip to manage the
   packages, and as you won't be making any further use of pip they
   aren't required - although it won't do any harm if you leave them.

4. Package the application using:

      $ python -m zipapp -p "interpreter" myapp

This will produce a standalone executable, which can be run on any
machine with the appropriate interpreter available. See Specifying the
Interpreter for details. It can be shipped to users as a single file.

On Unix, the "myapp.pyz" file is executable as it stands.  You can
rename the file to remove the ".pyz" extension if you prefer a "plain"
command name.  On Windows, the "myapp.pyz[w]" file is executable by
virtue of the fact that the Python interpreter registers the ".pyz"
and ".pyzw" file extensions when installed.
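
The steps above can also be scripted.  The following sketch assumes
the layout described in step 1 and introduces two hypothetical
helpers, "keep_path" and "build_standalone".  Instead of deleting the
".dist-info" directories as in step 3, it leaves them out of the
archive with a *filter* callback, which is a variation on the steps
rather than part of them.

```python
import subprocess
import sys
import zipapp
from pathlib import Path


def keep_path(path):
    """Filter callback: leave pip's *.dist-info metadata out."""
    return not any(part.endswith(".dist-info") for part in path.parts)


def build_standalone(app_dir, target, interpreter, requirements=None):
    """Bundle app_dir and, optionally, its dependencies into target."""
    app_dir = Path(app_dir)
    if requirements is not None:
        # Step 2: install the dependencies next to the application code.
        subprocess.run(
            [sys.executable, "-m", "pip", "install",
             "-r", str(requirements), "--target", str(app_dir)],
            check=True,
        )
    # Steps 3 and 4: package the directory, skipping the pip metadata.
    zipapp.create_archive(
        app_dir, target, interpreter=interpreter, filter=keep_path
    )
    return Path(target)
```

A call such as "build_standalone('myapp', 'myapp.pyz', '/usr/bin/env
python3', 'requirements.txt')" would then correspond to steps 2 to 4.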


Making a Windows executable
---------------------------

On Windows, registration of the ".pyz" extension is optional, and
furthermore, there are certain places that don't recognise registered
extensions "transparently" (the simplest example is that
"subprocess.run(['myapp'])" won't find your application - you need to
explicitly specify the extension).

On Windows, therefore, it is often preferable to create an executable
from the zipapp.  This is relatively easy, although it does require a
C compiler.  The basic approach relies on the fact that zipfiles can
have arbitrary data prepended, and Windows exe files can have
arbitrary data appended.  So by creating a suitable launcher and
tacking the ".pyz" file onto the end of it, you end up with a single-
file executable that runs your application.

A suitable launcher can be as simple as the following:

   #define Py_LIMITED_API 1
   #include "Python.h"

   #define WIN32_LEAN_AND_MEAN
   #include <windows.h>

   #ifdef WINDOWS
   int WINAPI wWinMain(
       HINSTANCE hInstance,      /* handle to current instance */
       HINSTANCE hPrevInstance,  /* handle to previous instance */
       LPWSTR lpCmdLine,         /* pointer to command line */
       int nCmdShow              /* show state of window */
   )
   #else
   int wmain()
   #endif
   {
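       /* Build a new argv with the executable's own path inserted as
          the script argument, so that Python runs the zip data
          appended to this file. */
       wchar_t **myargv = _alloca((__argc + 1) * sizeof(wchar_t*));
       myargv[0] = __wargv[0];
       memcpy(myargv + 1, __wargv, __argc * sizeof(wchar_t *));
       return Py_Main(__argc+1, myargv);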
   }

If you define the "WINDOWS" preprocessor symbol, this will generate a
GUI executable, and without it, a console executable.

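To compile the executable, you can either just use the standard MSVC
command line tools, or you can take advantage of the fact that
distutils knows how to compile C source against the Python headers and
libraries (note that distutils was removed from the standard library
in Python 3.12, so this requires an earlier version):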

   from distutils.ccompiler import new_compiler
   import distutils.sysconfig
   import sys
   import os
   from pathlib import Path

   def compile(src):
       src = Path(src)
       cc = new_compiler()
       exe = src.stem
       cc.add_include_dir(distutils.sysconfig.get_python_inc())
       cc.add_library_dir(os.path.join(sys.base_exec_prefix, 'libs'))
       # First the CLI executable
       objs = cc.compile([str(src)])
       cc.link_executable(objs, exe)
       # Now the GUI executable
       cc.define_macro('WINDOWS')
       objs = cc.compile([str(src)])
       cc.link_executable(objs, exe + 'w')

   if __name__ == "__main__":
       compile("zastub.c")

The resulting launcher uses the "Limited ABI", so it will run
unchanged with any version of Python 3.x.  All it needs is for Python
("python3.dll") to be on the user's "PATH".
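
Once the launcher has been built, tacking the ".pyz" file onto the end
of it is a plain byte-level concatenation.  A minimal sketch follows;
"make_single_file_exe" and the file names passed to it are
hypothetical.

```python
import shutil


def make_single_file_exe(launcher, archive, output):
    """Write the launcher followed by the archive into output."""
    with open(output, "wb") as dst:
        for part in (launcher, archive):
            with open(part, "rb") as src:
                shutil.copyfileobj(src, dst)
```

For example, "make_single_file_exe('zastub.exe', 'myapp.pyz',
'myapp.exe')" would produce the console version of the application.
The result is still readable as a zip file, because the launcher
simply becomes data prepended to the archive.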

For a fully standalone distribution, you can distribute the launcher
with your application appended, bundled with the Python "embedded"
distribution.  This will run on any PC with the appropriate
architecture (32 bit or 64 bit).


Caveats
-------

There are some limitations to the process of bundling your application
into a single file.  In most, if not all, cases they can be addressed
without needing major changes to your application.

1. If your application depends on a package that includes a C
   extension, that package cannot be run from a zip file (this is an
   OS limitation, as executable code must be present in the filesystem
   for the OS loader to load it). In this case, you can exclude that
   dependency from the zipfile, and either require your users to have
   it installed, or ship it alongside your zipfile and add code to
   your "__main__.py" to include the directory containing the unzipped
   module in "sys.path". In this case, you will need to make sure to
   ship appropriate binaries for your target architecture(s) (and
   potentially pick the correct version to add to "sys.path" at
   runtime, based on the user's machine).

2. If you are shipping a Windows executable as described above, you
   either need to ensure that your users have "python3.dll" on their
   PATH (which is not the default behaviour of the installer) or you
   should bundle your application with the embedded distribution.

3. The suggested launcher above uses the Python embedding API.  This
   means that in your application, "sys.executable" will be your
   application, and *not* a conventional Python interpreter.  Your
   code and its dependencies need to be prepared for this possibility.
   For example, if your application uses the "multiprocessing" module,
   it will need to call "multiprocessing.set_executable()" to let the
   module know where to find the standard Python interpreter.
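
For the first caveat, the code added to "__main__.py" might look
something like the sketch below.  It assumes that the unzipped
extension packages are shipped in a directory named "lib" next to the
archive; both that name and the helper "add_sibling_dir" are
hypothetical.

```python
import sys
from pathlib import Path


def add_sibling_dir(archive, dirname="lib"):
    """Put <directory of archive>/<dirname> on sys.path and return it."""
    extra = Path(archive).resolve().parent / dirname
    if extra.is_dir() and str(extra) not in sys.path:
        # Appended, so modules inside the archive still take precedence.
        sys.path.append(str(extra))
    return extra
```

When the application is started as "python myapp.pyz", "sys.argv[0]"
is the path of the archive, so "__main__.py" could call
"add_sibling_dir(sys.argv[0])" before importing the extension package.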


The Python Zip Application Archive Format
=========================================

Python has been able to execute zip files which contain a
"__main__.py" file since version 2.6.  In order to be executed by
Python, an application archive simply has to be a standard zip file
containing a "__main__.py" file which will be run as the entry point
for the application.  As usual for any Python script, the parent of
the script (in this case the zip file) will be placed on "sys.path"
and thus further modules can be imported from the zip file.

The zip file format allows arbitrary data to be prepended to a zip
file.  The zip application format uses this ability to prepend a
standard POSIX "shebang" line to the file ("#!/path/to/interpreter").

Formally, the Python zip application format is therefore:

1. An optional shebang line, containing the characters "b'#!'"
   followed by an interpreter name, and then a newline ("b'\n'")
   character.  The interpreter name can be anything acceptable to the
   OS "shebang" processing, or the Python launcher on Windows.  The
   interpreter should be encoded in UTF-8 on Windows, and in
   "sys.getfilesystemencoding()" on POSIX.

2. Standard zipfile data, as generated by the "zipfile" module.  The
   zipfile content *must* include a file called "__main__.py" (which
   must be in the "root" of the zipfile - i.e., it cannot be in a
   subdirectory).  The zipfile data can be compressed or uncompressed.

If an application archive has a shebang line, it may have the
executable bit set on POSIX systems, to allow it to be executed
directly.

There is no requirement that the tools in this module are used to
create application archives - the module is a convenience, but
archives in the above format created by any means are acceptable to
Python.
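
To make the format concrete, the following sketch writes a conforming
archive using only the "zipfile" module.  The helper
"write_app_archive" is hypothetical, and the code is an illustration
of the two parts listed above rather than a description of how
"zipapp" itself is implemented.

```python
import os
import stat
import sys
import zipfile


def write_app_archive(target, main_source, interpreter=None):
    """Write a zip application to target without using zipapp."""
    with open(target, "wb") as fd:
        if interpreter:
            # Part 1: the optional shebang line, encoded as described above.
            if sys.platform.startswith("win"):
                encoding = "utf-8"
            else:
                encoding = sys.getfilesystemencoding()
            fd.write(b"#!" + interpreter.encode(encoding) + b"\n")
        # Part 2: standard zip data with __main__.py at the root.
        with zipfile.ZipFile(fd, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.writestr("__main__.py", main_source)
    if interpreter:
        # Set the executable bit so the file can be run directly on POSIX.
        os.chmod(target, os.stat(target).st_mode | stat.S_IEXEC)
```

An archive written this way can be passed to the interpreter like any
other, and "zipapp.get_interpreter()" reads back the shebang line it
contains.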
