3. An Informal Introduction to Python
*************************************

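In the following examples, input and output are distinguished by the
presence or absence of prompts (*>>>* and *...*): to repeat an
example, you must type everything after the prompt when the prompt
appears; lines that do not begin with a prompt are output from the
interpreter.  Note that a secondary prompt on a line by itself in an
example means you must type a blank line; this is used to end a
multi-line command.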

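Many of the examples in this manual, even those entered at the
interactive prompt, include comments.  Comments in Python start with
the hash character, "#", and extend to the end of the physical line.
A comment may appear at the start of a line or following whitespace or
code, but not within a string literal.  A hash character within a
string literal is just a hash character.  Since comments only clarify
code and are not interpreted by Python, they may be omitted when
typing in examples.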

Some examples:

   # this is the first comment
   spam = 1  # and this is the second comment
             # ... and now a third!
   text = "# This is not a comment because it's inside quotes."


3.1. Using Python as a Calculator
=================================

Let's try some simple Python commands.  Start the interpreter and wait
for the primary prompt, ">>>".  (It shouldn't take long.)


3.1.1. Numbers
--------------

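The interpreter acts as a simple calculator: you can type an
expression at it and it will write the value.  Expression syntax is
straightforward: the operators "+", "-", "*" and "/" work just like in
most other languages (for example, Pascal or C); parentheses ("()")
can be used for grouping.  For example: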

   >>> 2 + 2
   4
   >>> 50 - 5*6
   20
   >>> (50 - 5.0*6) / 4
   5.0
   >>> 8 / 5.0
   1.6

The integer numbers (e.g. "2", "4", "20") have type "int"; the ones
with a fractional part (e.g. "5.0", "1.6") have type "float".  We will
see more about numeric types later in the tutorial.

The return type of a division ("/") operation depends on its operands.
If both operands are of type "int", *floor division* is performed and
an "int" is returned.  If either operand is a "float", classic
division is performed and a "float" is returned.  The "//" operator is
also provided for doing floor division no matter what the operands
are.  The remainder can be calculated with the "%" operator:

   >>> 17 / 3  # int / int -> int
   5
   >>> 17 / 3.0  # int / float -> float
   5.666666666666667
   >>> 17 // 3.0  # explicit floor division discards the fractional part
   5.0
   >>> 17 % 3  # the % operator returns the remainder of the division
   2
   >>> 5 * 3 + 2  # result * divisor + remainder
   17
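
The relationship between floor division and the remainder can be
checked directly.  The following is a small sketch, not part of the
original tutorial; the variable names are illustrative, and it uses
only "//", "%" and the built-in "divmod()" so that it behaves the same
under Python 2 and Python 3:

```python
total, divisor = 17, 3

quotient = total // divisor     # floor division: 5
remainder = total % divisor     # remainder: 2

# The two results always recombine into the original value.
assert quotient * divisor + remainder == total

# divmod() returns the quotient and the remainder in one call.
assert divmod(total, divisor) == (quotient, remainder)

# Floor division rounds towards negative infinity, not towards zero.
negative_quotient = -17 // 3    # -6
negative_remainder = -17 % 3    # 1
assert negative_quotient * 3 + negative_remainder == -17

# With a float operand, // still floors but returns a float.
float_quotient = 17 // 3.0      # 5.0
```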

With Python, it is possible to use the "**" operator to calculate
powers [1]:

   >>> 5 ** 2  # 5 squared
   25
   >>> 2 ** 7  # 2 to the power of 7
   128

The equal sign ("=") is used to assign a value to a variable.
Afterwards, no result is displayed before the next interactive prompt:

   >>> width = 20
   >>> height = 5 * 9
   >>> width * height
   900

If a variable is not "defined" (assigned a value), trying to use it
will give you an error:

   >>> n  # try to access an undefined variable
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   NameError: name 'n' is not defined

There is full support for floating point; operators with mixed type
operands convert the integer operand to floating point:

   >>> 3 * 3.75 / 1.5
   7.5
   >>> 7.0 / 2
   3.5

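In interactive mode, the last printed expression is assigned to the
variable "_".  This means that when you are using Python as a desk
calculator, it is somewhat easier to continue calculations, for
example: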

   >>> tax = 12.5 / 100
   >>> price = 100.50
   >>> price * tax
   12.5625
   >>> price + _
   113.0625
   >>> round(_, 2)
   113.06

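This variable should be treated as read-only by the user.  Don't
explicitly assign a value to it: you would create an independent local
variable with the same name, masking the built-in variable and its
magic behavior.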

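In addition to "int" and "float", Python supports other types of
numbers, such as "Decimal" and "Fraction".  Python also has built-in
support for complex numbers, and uses the "j" or "J" suffix to
indicate the imaginary part (e.g. "3+5j").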


3.1.2. Strings
--------------

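Besides numbers, Python can also manipulate strings, which can be
expressed in several ways.  They can be enclosed in single quotes
("'...'") or double quotes (""..."") with the same result [2].  "\"
can be used to escape quotes: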

   >>> 'spam eggs'  # single quotes
   'spam eggs'
   >>> 'doesn\'t'  # use \' to escape the single quote...
   "doesn't"
   >>> "doesn't"  # ...or use double quotes instead
   "doesn't"
   >>> '"Yes," they said.'
   '"Yes," they said.'
   >>> "\"Yes,\" they said."
   '"Yes," they said.'
   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'

In the interactive interpreter, the output string is enclosed in
quotes and special characters are escaped with backslashes.  While
this might sometimes look different from the input (the enclosing
quotes could change), the two strings are equivalent.  The string is
enclosed in double quotes if the string contains a single quote and no
double quotes, otherwise it is enclosed in single quotes.  The "print"
statement produces a more readable output, by omitting the enclosing
quotes and by printing escaped and special characters:

   >>> '"Isn\'t," they said.'
   '"Isn\'t," they said.'
   >>> print '"Isn\'t," they said.'
   "Isn't," they said.
   >>> s = 'First line.\nSecond line.'  # \n means newline
   >>> s  # without print, \n is included in the output
   'First line.\nSecond line.'
   >>> print s  # with print, \n produces a new line
   First line.
   Second line.

If you don't want characters prefaced by "\" to be interpreted as
special characters, you can use *raw strings* by adding an "r" before
the first quote:

   >>> print 'C:\some\name'  # here \n means newline!
   C:\some
   ame
   >>> print r'C:\some\name'  # note the r before the quote
   C:\some\name
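
As a quick check of the difference, the sketch below (illustrative
names, not from the original text) compares a raw string with its
escaped equivalent:

```python
escaped = 'C:\\some\\name'   # every backslash doubled by hand
raw = r'C:\some\name'        # raw string: backslashes kept as typed

# Both spellings describe exactly the same characters.
assert escaped == raw

# In a normal string '\n' is one character (a newline);
# in a raw string it is two characters: a backslash and an 'n'.
newline_length = len('\n')       # 1
raw_newline_length = len(r'\n')  # 2
```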

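String literals can span multiple lines.  One way is using triple-
quotes: """"..."""" or "'''...'''".  End of lines are automatically
included in the string, but it's possible to prevent this by adding a
"\" at the end of the line.  The following example: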

   print """\
   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to
   """

produces the following output (note that the initial newline is not
included):

   Usage: thingy [OPTIONS]
        -h                        Display this usage message
        -H hostname               Hostname to connect to

Strings can be concatenated (glued together) with the "+" operator,
and repeated with "*":

   >>> # 3 times 'un', followed by 'ium'
   >>> 3 * 'un' + 'ium'
   'unununium'

Two or more *string literals* (i.e. the ones enclosed between quotes)
next to each other are automatically concatenated:

   >>> 'Py' 'thon'
   'Python'

This feature is particularly useful when you want to break long
strings:

   >>> text = ('Put several strings within parentheses '
   ...         'to have them joined together.')
   >>> text
   'Put several strings within parentheses to have them joined together.'

This only works with two literals though, not with variables or
expressions:

   >>> prefix = 'Py'
   >>> prefix 'thon'  # can't concatenate a variable and a string literal
     ...
   SyntaxError: invalid syntax
   >>> ('un' * 3) 'ium'
     ...
   SyntaxError: invalid syntax

If you want to concatenate variables or a variable and a literal, use
"+":

   >>> prefix + 'thon'
   'Python'

Strings can be *indexed* (subscripted), with the first character
having index 0.  There is no separate character type; a character is
simply a string of size one:

   >>> word = 'Python'
   >>> word[0]  # character in position 0
   'P'
   >>> word[5]  # character in position 5
   'n'

Indices may also be negative numbers, to start counting from the
right:

   >>> word[-1]  # last character
   'n'
   >>> word[-2]  # second-last character
   'o'
   >>> word[-6]
   'P'

Note that since -0 is the same as 0, negative indices start from -1.
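
One way to read a negative index is as an offset from the length of
the string.  A minimal sketch of that reading (the name "n" is just a
local helper):

```python
word = 'Python'
n = len(word)

# A negative index -k refers to the same character as index n - k.
assert word[-1] == word[n - 1]   # last character, 'n'
assert word[-2] == word[n - 2]   # second-last character, 'o'
assert word[-n] == word[0]       # first character, 'P'

# -0 is just 0, so it selects the first character, not the last.
assert word[-0] == word[0]
```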

In addition to indexing, *slicing* is also supported.  While indexing
is used to obtain individual characters, *slicing* allows you to
obtain a substring:

   >>> word[0:2]  # characters from position 0 (included) to 2 (excluded)
   'Py'
   >>> word[2:5]  # characters from position 2 (included) to 5 (excluded)
   'tho'

Note how the start is always included, and the end always excluded.
This makes sure that "s[:i] + s[i:]" is always equal to "s":

   >>> word[:2] + word[2:]
   'Python'
   >>> word[:4] + word[4:]
   'Python'

Slice indices have useful defaults; an omitted first index defaults to
zero, an omitted second index defaults to the size of the string being
sliced:

   >>> word[:2]   # characters from the beginning to position 2 (excluded)
   'Py'
   >>> word[4:]   # characters from position 4 (included) to the end
   'on'
   >>> word[-2:]  # characters from the second-last (included) to the end
   'on'

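One way to remember how slices work is to think of the indices as
pointing *between* characters, with the left edge of the first
character numbered 0.  Then the right edge of the last character of a
string of *n* characters has index *n*, for example: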

    +---+---+---+---+---+---+
    | P | y | t | h | o | n |
    +---+---+---+---+---+---+
    0   1   2   3   4   5   6
   -6  -5  -4  -3  -2  -1

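The first row of numbers gives the position of the indices 0...6 in
the string; the second row gives the corresponding negative indices.
The slice from *i* to *j* consists of all characters between the edges
labeled *i* and *j*, respectively.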

For non-negative indices, the length of a slice is the difference of
the indices, if both are within bounds.  For example, the length of
"word[1:3]" is 2.
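
The length rule, and what happens when an index falls outside the
string, can be checked with a few assertions.  This is an illustrative
sketch only:

```python
word = 'Python'
n = len(word)

# Both indices within bounds: the length is the difference.
assert len(word[1:3]) == 3 - 1

# An end index past the string is treated as len(word).
assert len(word[4:42]) == n - 4

# A start index past the string gives an empty slice.
assert word[42:] == ''

# So does a start index that lies after the end index.
assert word[3:1] == ''
```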

Attempting to use an index that is too large will result in an error:

   >>> word[42]  # the word only has 6 characters
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
   IndexError: string index out of range

However, out-of-range indices are handled gracefully when used for
slicing:

   >>> word[4:42]
   'on'
   >>> word[42:]
   ''

Python strings cannot be changed: they are *immutable*.  Therefore,
assigning to an indexed position in the string results in an error:

   >>> word[0] = 'J'
     ...
   TypeError: 'str' object does not support item assignment
   >>> word[2:] = 'py'
     ...
   TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one:

   >>> 'J' + word[1:]
   'Jython'
   >>> word[:2] + 'py'
   'Pypy'

The built-in function "len()" returns the length of a string:

   >>> s = 'supercalifragilisticexpialidocious'
   >>> len(s)
   34

See also:

  Sequence Types — str, unicode, list, tuple, bytearray, buffer,
  xrange
     Strings, and the Unicode strings described in the next section,
     are examples of *sequence types*, and support the common
     operations supported by such types.

  String Methods
     Both strings and Unicode strings support a large number of
     methods for basic transformations and searching.

  Format String Syntax
     Information about string formatting with "str.format()".

  String Formatting Operations
     The old formatting operations invoked when strings and Unicode
     strings are the left operand of the "%" operator are described in
     more detail here.


3.1.3. Unicode Strings
----------------------

Starting with Python 2.0 a new data type for storing text data is
available to the programmer: the Unicode object. It can be used to
store and manipulate Unicode data (see http://www.unicode.org/) and
integrates well with the existing string objects, providing auto-
conversions where necessary.

Unicode has the advantage of providing one ordinal for every character
in every script used in modern and ancient texts. Previously, there
were only 256 possible ordinals for script characters. Texts were
typically bound to a code page which mapped the ordinals to script
characters. This led to a great deal of confusion, especially with
respect to internationalization (usually written as "i18n": "'i'" + 18
characters + "'n'") of software.  Unicode solves these problems by
defining one code page for all scripts.

Creating Unicode strings in Python is just as simple as creating
normal strings:

   >>> u'Hello World !'
   u'Hello World !'

The small "'u'" in front of the quote indicates that a Unicode string
is supposed to be created. If you want to include special characters
in the string, you can do so by using the Python *Unicode-Escape*
encoding. The following example shows how:

   >>> u'Hello\u0020World !'
   u'Hello World !'

The escape sequence "\u0020" indicates to insert the Unicode character
with the ordinal value 0x0020 (the space character) at the given
position.

Other characters are interpreted by using their respective ordinal
values directly as Unicode ordinals.  If you have literal strings in
the standard Latin-1 encoding that is used in many Western countries,
you will find it convenient that the lower 256 characters of Unicode
are the same as the 256 characters of Latin-1.

For experts, there is also a raw mode just like the one for normal
strings. You have to prefix the opening quote with 'ur' to have
Python use the *Raw-Unicode-Escape* encoding. It will only apply the
above "\uXXXX" conversion if there is an odd number of backslashes
in front of the small 'u'.

   >>> ur'Hello\u0020World !'
   u'Hello World !'
   >>> ur'Hello\\u0020World !'
   u'Hello\\\\u0020World !'

The raw mode is most useful when you have to enter lots of
backslashes, as can be necessary in regular expressions.

Apart from these standard encodings, Python provides a whole set of
other ways of creating Unicode strings on the basis of a known
encoding.

The built-in function "unicode()" provides access to all registered
Unicode codecs (COders and DECoders). Some of the more well known
encodings which these codecs can convert are *Latin-1*, *ASCII*,
*UTF-8*, and *UTF-16*. The latter two are variable-length encodings
that store each Unicode character in one or more bytes. The default
encoding is normally set to ASCII, which passes through characters in
the range 0 to 127 and rejects any other characters with an error.
When a Unicode string is printed, written to a file, or converted with
"str()", conversion takes place using this default encoding.

   >>> u"abc"
   u'abc'
   >>> str(u"abc")
   'abc'
   >>> u"äöü"
   u'\xe4\xf6\xfc'
   >>> str(u"äöü")
   Traceback (most recent call last):
     File "<stdin>", line 1, in ?
   UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)

To convert a Unicode string into an 8-bit string using a specific
encoding, Unicode objects provide an "encode()" method that takes one
argument, the name of the encoding.  Lowercase names for encodings are
preferred.

   >>> u"äöü".encode('utf-8')
   '\xc3\xa4\xc3\xb6\xc3\xbc'

If you have data in a specific encoding and want to produce a
corresponding Unicode string from it, you can use the "unicode()"
function with the encoding name as the second argument.

   >>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
   u'\xe4\xf6\xfc'
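
Encoding and decoding are inverse operations, which a round trip
makes visible.  The sketch below is an illustration rather than part
of the original text.  It spells the characters with "\x" escapes so
the source file stays pure ASCII, and it calls the "decode()" method
of the encoded string instead of "unicode()" so that the same lines
also run on Python 3:

```python
text = u'\xe4\xf6\xfc'            # the three characters a, o, u with umlauts

utf8_bytes = text.encode('utf-8')
latin1_bytes = text.encode('latin-1')

# UTF-8 needs two bytes for each of these characters, Latin-1 only one.
assert len(text) == 3
assert len(utf8_bytes) == 6
assert len(latin1_bytes) == 3

# Decoding with the matching codec restores the original text.
assert utf8_bytes.decode('utf-8') == text
assert latin1_bytes.decode('latin-1') == text

# ASCII cannot represent these characters, so encoding fails.
try:
    text.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```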


3.1.4. Lists
------------

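Python knows a number of *compound* data types, used to group together
other values.  The most versatile is the *list*, which can be written
as a list of comma-separated values (items) between square brackets.
Lists might contain items of different types, but usually the items
all have the same type.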

   >>> squares = [1, 4, 9, 16, 25]
   >>> squares
   [1, 4, 9, 16, 25]

Like strings (and all other built-in *sequence* types), lists can be
indexed and sliced:

   >>> squares[0]  # indexing returns the item
   1
   >>> squares[-1]
   25
   >>> squares[-3:]  # slicing returns a new list
   [9, 16, 25]

All slice operations return a new list containing the requested
elements.  This means that the following slice returns a new (shallow)
copy of the list:

   >>> squares[:]
   [1, 4, 9, 16, 25]
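
What "shallow" means in practice: the copy is a new list, but its
items are the same objects as in the original.  A minimal sketch with
illustrative names:

```python
squares = [1, 4, 9, 16, 25]
squares_copy = squares[:]

# Replacing an item in the copy leaves the original untouched.
squares_copy[0] = 100
assert squares[0] == 1

# With nested lists, the inner lists are shared, not duplicated.
nested = [[1, 2], [3, 4]]
shallow = nested[:]
shallow[0].append(99)

# The change made through the copy is visible in the original.
assert nested[0] == [1, 2, 99]
```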

Lists also support operations like concatenation:

   >>> squares + [36, 49, 64, 81, 100]
   [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

Unlike strings, which are *immutable*, lists are a *mutable* type,
i.e. it is possible to change their content:

   >>> cubes = [1, 8, 27, 65, 125]  # something's wrong here
   >>> 4 ** 3  # the cube of 4 is 64, not 65!
   64
   >>> cubes[3] = 64  # replace the wrong value
   >>> cubes
   [1, 8, 27, 64, 125]

You can also add new items at the end of the list, by using the
"append()" *method* (we will see more about methods later):

   >>> cubes.append(216)  # add the cube of 6
   >>> cubes.append(7 ** 3)  # and the cube of 7
   >>> cubes
   [1, 8, 27, 64, 125, 216, 343]

Assignment to slices is also possible, and this can even change the
size of the list or clear it entirely:

   >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> letters
   ['a', 'b', 'c', 'd', 'e', 'f', 'g']
   >>> # replace some values
   >>> letters[2:5] = ['C', 'D', 'E']
   >>> letters
   ['a', 'b', 'C', 'D', 'E', 'f', 'g']
   >>> # now remove them
   >>> letters[2:5] = []
   >>> letters
   ['a', 'b', 'f', 'g']
   >>> # clear the list by replacing all the elements with an empty list
   >>> letters[:] = []
   >>> letters
   []
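
Slice assignment can also insert items without replacing any, by
assigning to an empty slice.  A short sketch (the list contents are
made up for illustration):

```python
letters = ['a', 'b', 'e']

# Assigning to the empty slice [2:2] inserts before index 2.
letters[2:2] = ['c', 'd']
after_insert = letters[:]        # ['a', 'b', 'c', 'd', 'e']

# The replacement may be shorter than the slice it replaces.
letters[:2] = ['A']
after_shrink = letters[:]        # ['A', 'c', 'd', 'e']
```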

The built-in function "len()" also applies to lists:

   >>> letters = ['a', 'b', 'c', 'd']
   >>> len(letters)
   4

It is possible to nest lists (create lists containing other lists),
for example:

   >>> a = ['a', 'b', 'c']
   >>> n = [1, 2, 3]
   >>> x = [a, n]
   >>> x
   [['a', 'b', 'c'], [1, 2, 3]]
   >>> x[0]
   ['a', 'b', 'c']
   >>> x[0][1]
   'b'
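
Note that nesting stores references to the inner lists rather than
copies of them.  The sketch below, reusing the names from the example
above, shows a change made through one name appearing under the other:

```python
a = ['a', 'b', 'c']
n = [1, 2, 3]
x = [a, n]

# x[0] and a are the same list object.
assert x[0] is a

a.append('d')       # change the inner list through its own name
x[1][0] = 100       # change the other inner list through x

# Both changes are visible from either side.
assert x[0] == ['a', 'b', 'c', 'd']
assert n == [100, 2, 3]
```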


3.2. First Steps Towards Programming
====================================

Of course, we can use Python for more complicated tasks than adding
two and two together.  For instance, we can write an initial sub-
sequence of the *Fibonacci* series as follows:

   >>> # Fibonacci series:
   ... # the sum of two elements defines the next
   ... a, b = 0, 1
   >>> while b < 10:
   ...     print b
   ...     a, b = b, a+b
   ...
   1
   1
   2
   3
   5
   8

This example introduces several new features.

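* The first line contains a *multiple assignment*: the variables "a"
  and "b" simultaneously get the new values 0 and 1.  On the last line
  this is used again, demonstrating that the expressions on the
  right-hand side are all evaluated first before any of the
  assignments take place.  The right-hand side expressions are
  evaluated from left to right.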

* The "while" loop executes as long as the condition (here: "b <
  10") remains true.  In Python, like in C, any non-zero integer value
  is true; zero is false.  The condition may also be a string or list
  value, in fact any sequence; anything with a non-zero length is
  true, empty sequences are false.  The test used in the example is a
  simple comparison.  The standard comparison operators are written
  the same as in C: "<" (less than), ">" (greater than), "==" (equal
  to), "<=" (less than or equal to), ">=" (greater than or equal to)
  and "!=" (not equal to).

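* The *body* of the loop is *indented*: indentation is Python's way of
  grouping statements.  At the interactive prompt, you have to type a
  tab or space(s) for each indented line.  In practice you will
  prepare more complicated input for Python with a text editor; most
  text editors have an auto-indent facility.  When a compound
  statement is entered interactively, it must be followed by a blank
  line to indicate completion (since the parser cannot guess when you
  have typed the last line).  Note that each line within a basic block
  must be indented by the same amount.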

* The "print" statement writes the value of the expression(s) it is
  given.  It differs from just writing the expression you want to
  write (as we did earlier in the calculator examples) in the way it
  handles multiple expressions and strings.  Strings are printed
  without quotes, and a space is inserted between items, so you can
  format things nicely, like this:

     >>> i = 256*256
     >>> print 'The value of i is', i
     The value of i is 65536

  A trailing comma avoids the newline after the output:

     >>> a, b = 0, 1
     >>> while b < 1000:
     ...     print b,
     ...     a, b = b, a+b
     ...
     1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987

  Note that the interpreter inserts a newline before it prints the
  next prompt if the last line was not completed.
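
The multiple assignment and the "while" loop described above can be
combined with a list to keep the values instead of printing them.
This is a sketch for illustration; the name "series" is not from the
original example:

```python
a, b = 0, 1
series = []
while b < 10:
    series.append(b)
    # The right-hand side is evaluated first, using the old a and b,
    # and only then are both names rebound.
    a, b = b, a + b

# series is now [1, 1, 2, 3, 5, 8]; the loop stopped with a == 8, b == 13.

# The same evaluation order makes swapping two variables a one-liner.
a, b = b, a
```

Because both right-hand expressions are computed before either name is
rebound, no temporary variable is needed for the swap.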

-[ Footnotes ]-

[1] Since "**" has higher precedence than "-", "-3**2" will be
    interpreted as "-(3**2)" and thus result in "-9".  To avoid this
    and get "9", you can use "(-3)**2".

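[2] Unlike other languages, special characters such as "\n" have the
    same meaning with both single ("'...'") and double (""..."")
    quotes.  The only difference between the two is that within single
    quotes you don't need to escape """ (but you have to escape "\'")
    and vice versa.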
