# A02 python (basic data type, container, function, class), numpy (array, array index, data type, mathematics in array, broadcast)

## Python version

### Basic data type

Like most languages, Python has a number of basic types, including integers, floats, Booleans, and strings. These data types behave much as they do in other programming languages.

Numbers: integers and floats work as you would expect from other languages:

```
# -*- coding: UTF-8 -*-

x = 3
print(type(x)) # Prints "<class 'int'>"
print(x)       # Prints "3"
print(x + 1)   # Addition; prints "4"
print(x - 1)   # Subtraction; prints "2"
print(x * 2)   # Multiplication; prints "6"
print(x ** 2)  # Exponentiation; prints "9"
x += 1
print(x)  # Prints "4"
x *= 2
print(x)  # Prints "8"
y = 2.5
print(type(y)) # Prints "<class 'float'>"
print(y, y + 1, y * 2, y ** 2) # Prints "2.5 3.5 5.0 6.25"
```

Note that unlike many languages, Python does not have unary increment (x++) or decrement (x--) operators.

Python also has built-in types for complex numbers; the details are in the official Python documentation.
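As a small illustration of the two notes above, the sketch below uses augmented assignment in place of an increment operator and shows the built-in complex type. The variable names are only for illustration.

```python
# Python has no x++ or x--; augmented assignment is used instead.
x = 3
x += 1    # increment by one
x -= 2    # decrement by two
print(x)  # Prints "2"

# Complex numbers are built in; the imaginary unit is written with a j suffix.
z = complex(1, 2)      # same value as the literal 1 + 2j
print(type(z))         # Prints "<class 'complex'>"
print(z.real, z.imag)  # Prints "1.0 2.0"
print(z * z)           # Prints "(-3+4j)"
print(abs(3 + 4j))     # Magnitude of a complex number; prints "5.0"
```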
Booleans: Python implements all of the usual operators for Boolean logic, but it uses English words rather than symbols (&&, ||, etc.):

```
# -*- coding: UTF-8 -*-

t = True
f = False
print(type(t)) # Prints "<class 'bool'>"
print(t and f) # Logical AND; prints "False"
print(t or f)  # Logical OR; prints "True"
print(not t)   # Logical NOT; prints "False"
print(t != f)  # Logical XOR; prints "True"
```

Strings: Python has great support for strings:

```
hello = 'hello'    # String literals can use single quotes
world = "world"    # or double quotes; it does not matter.
print(hello)       # Prints "hello"
print(len(hello))  # String length; prints "5"
hw = hello + ' ' + world  # String concatenation
print(hw)  # prints "hello world"
hw12 = '%s %s %d' % (hello, world, 12)  # sprintf style string formatting
print(hw12)  # prints "hello world 12"
```

String objects have many useful methods; for example:

```
s = "hello"
print(s.capitalize())  # Capitalize a string; prints "Hello"
print(s.upper())       # Convert a string to uppercase; prints "HELLO"
print(s.rjust(7))      # Right-justify a string, padding with spaces; prints "  hello"
print(s.center(7))     # Center a string, padding with spaces; prints " hello "
print(s.replace('l', '(ell)'))  # Replace all instances of one substring with another;
# prints "he(ell)(ell)o"
print('  world '.strip())  # Strip leading and trailing whitespace; prints "world"
```

### Containers

Python includes several built-in container types: lists, dictionaries, sets, and tuples.

#### Lists

A list is the Python equivalent of an array, but it is resizable and can contain elements of different types:

```
xs = [3, 1, 2]    # Create a list
print(xs, xs[2])  # Prints "[3, 1, 2] 2"
print(xs[-1])     # Negative indices count from the end of the list; prints "2"
xs[2] = 'foo'     # Lists can contain elements of different types
print(xs)         # Prints "[3, 1, 'foo']"
xs.append('bar')  # Add a new element to the end of the list
print(xs)         # Prints "[3, 1, 'foo', 'bar']"
x = xs.pop()      # Remove and return the last element of the list
print(x, xs)      # Prints "bar [3, 1, 'foo']"
```

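In addition to accessing list elements one at a time, Python lets you access sublists with slicing syntax such as nums[2:4]. We will see slicing again in the context of numpy arrays.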
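List slicing can be sketched as follows. This is a minimal illustration on a hypothetical list named nums, not part of the original example set.

```python
nums = list(range(5))  # range is a built-in; this creates [0, 1, 2, 3, 4]
print(nums)            # Prints "[0, 1, 2, 3, 4]"
print(nums[2:4])       # Slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:])        # Slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2])        # Slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:])         # Slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1])       # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9]     # Assign a new sublist to a slice
print(nums)            # Prints "[0, 1, 8, 9, 4]"
```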
Loops: you can loop over the elements of a list like this:

```
animals = ['cat', 'dog', 'monkey']
for animal in animals:
    print(animal)
# Prints "cat", "dog", "monkey", each on its own line.
```

If you want to access the index of each element in the loop, use the built-in enumerate function:

```
animals = ['cat', 'dog', 'monkey']
for idx, animal in enumerate(animals):
    print('#%d: %s' % (idx + 1, animal))
# Prints "#1: cat", "#2: dog", "#3: monkey", each on its own line.
```

List comprehensions: when programming, we frequently want to transform one type of data into another. As a simple example, consider the following code that computes square numbers:

```
nums = [0, 1, 2, 3, 4]
squares = []
for x in nums:
    squares.append(x ** 2)
print(squares)   # Prints [0, 1, 4, 9, 16]
```

You can make this code simpler using a list comprehension:

```
nums = [0, 1, 2, 3, 4]
squares = [x ** 2 for x in nums]
print(squares)   # Prints [0, 1, 4, 9, 16]
```

List comprehensions can also contain conditions:

```
nums = [0, 1, 2, 3, 4]
even_squares = [x ** 2 for x in nums if x % 2 == 0]
print(even_squares)  # Prints "[0, 4, 16]"
```

#### Dictionaries

A dictionary stores (key, value) pairs, similar to a Map in Java or an object in JavaScript. You can use it like this:

```
d = {'cat': 'cute', 'dog': 'furry'}  # Create a new dictionary with some data
print(d['cat'])       # Get an entry from a dictionary; prints "cute"
print('cat' in d)     # Check if a dictionary has a given key; prints "True"
d['fish'] = 'wet'     # Set an entry in a dictionary
print(d['fish'])      # Prints "wet"
# print(d['monkey'])  # KeyError: 'monkey' not a key of d
print(d.get('monkey', 'N/A'))  # Get an element with a default; prints "N/A"
print(d.get('fish', 'N/A'))    # Get an element with a default; prints "wet"
del d['fish']         # Remove an element from a dictionary
print(d.get('fish', 'N/A')) # "fish" is no longer a key; prints "N/A"
```

Loops: it is easy to iterate over the keys in a dictionary:

```
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal in d:
    legs = d[animal]
    print('A %s has %d legs' % (animal, legs))
```

Output:

```
A person has 2 legs
A cat has 4 legs
A spider has 8 legs
```

If you want to access keys and their corresponding values together, use the items method, which yields (key, value) pairs.
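A minimal sketch of iterating with the items method, reusing the leg-count dictionary from the loop example above (the pairs variable is only added for illustration):

```python
d = {'person': 2, 'cat': 4, 'spider': 8}
for animal, legs in d.items():
    print('A %s has %d legs' % (animal, legs))
# Prints "A person has 2 legs", "A cat has 4 legs", "A spider has 8 legs",
# each on its own line (dictionaries keep insertion order in Python 3.7+).

pairs = list(d.items())  # items() can also be materialized as a list of tuples
print(pairs)             # Prints "[('person', 2), ('cat', 4), ('spider', 8)]"
```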

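Dictionary comprehensions: these are similar to list comprehensions, but allow you to easily construct dictionaries. For example:

```
nums = [0, 1, 2, 3, 4]
even_num_to_square = {x: x ** 2 for x in nums if x % 2 == 0}
print(even_num_to_square)  # Prints "{0: 0, 2: 4, 4: 16}"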
```

#### Sets

A set is an unordered collection of distinct elements. As a simple example, consider the following code:

```
animals = {'cat', 'dog'}
print('cat' in animals)   # Check if an element is in a set; prints "True"
print('fish' in animals)  # prints "False"
animals.add('fish')       # Add an element to a set
print('fish' in animals)  # Prints "True"
print(len(animals))       # Number of elements in a set; prints "3"
animals.add('cat')        # Adding an element that is already in the set does nothing
print(len(animals))       # Prints "3"
animals.remove('cat')     # Remove an element from a set
print(len(animals))       # Prints "2"
```

Loops: iterating over a set has the same syntax as iterating over a list; however, since sets are unordered, you cannot make assumptions about the order in which you visit the elements of the set:

```
animals = {'cat', 'dog', 'fish'}
for idx, animal in enumerate(animals):
    print('#%d:%s' % (idx + 1, animal))
```

Possible output (the order may vary between runs):

```
#1:dog
#2:fish
#3:cat
```

Set comprehensions: as with lists and dictionaries, we can easily construct sets using set comprehensions:

```
from math import sqrt
nums = {int(sqrt(x)) for x in range(30)}
print(nums)  # Prints "{0, 1, 2, 3, 4, 5}"
```

#### Tuples

A tuple is an (immutable) ordered list of values. A tuple is in many ways similar to a list; one of the most important differences is that tuples can be used as keys in dictionaries and as elements of sets, while lists cannot. Here is a simple example:

```
d = {(x, x + 1): x for x in range(10)}  # Create a dictionary with tuple keys
t = (5, 6)        # Create a tuple
print(type(t))    # Prints "<class 'tuple'>"
print(d[t])       # Prints "5"
print(d[(1, 2)])  # Prints "1"
```
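To make the difference from lists concrete, the sketch below shows that a tuple rejects item assignment and that a list cannot serve as a dictionary key. The dictionary name point_names is a hypothetical example.

```python
t = (5, 6)
try:
    t[0] = 1                     # Tuples are immutable, so this raises TypeError
except TypeError as err:
    print('TypeError:', err)

point_names = {(0, 0): 'origin'}  # A tuple works as a dictionary key
try:
    point_names[[1, 2]] = 'bad'   # A list is unhashable, so this raises TypeError
except TypeError as err:
    print('TypeError:', err)

print(t)            # Prints "(5, 6)"
print(point_names)  # Prints "{(0, 0): 'origin'}"
```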

### Functions

Python functions are defined using the def keyword. For example:

```
def sign(x):
    if x > 0:
        return 'positive'
    elif x < 0:
        return 'negative'
    else:
        return 'zero'

for x in [-1, 0, 1]:
    print(sign(x))
```

Output:

```
negative
zero
positive
```

We will often define functions that take optional keyword arguments, like this:

```
def hello(name, loud=False):
    if loud:
        print('HELLO,%s!' % name.upper())
    else:
        print('Hello,%s' % name)

hello('Bob')              # Prints "Hello,Bob"
hello('Fred', loud=True)  # Prints "HELLO,FRED!"
```

Output:

```
Hello,Bob
HELLO,FRED!
```

### Classes

The syntax for defining classes in Python is simple:

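```
class Greeter(object):

    # Constructor
    def __init__(self, name):
        self.name = name  # Create an instance variable

    # Instance method
    def greet(self, loud=False):
        if loud:
            print('HELLO,%s!' % self.name.upper())
        else:
            print('Hello,%s' % self.name)

g = Greeter('Fred')  # Construct an instance of the Greeter class
g.greet()            # Call an instance method; prints "Hello,Fred"
g.greet(loud=True)   # Call an instance method; prints "HELLO,FRED!"
```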

Output:

```
Hello,Fred
HELLO,FRED!
```

### NumPy

Numpy is the core library for scientific computing in Python. It provides a high-performance multidimensional array object and tools for working with these arrays. If you are already familiar with MATLAB, the "NumPy for MATLAB users" guide in the NumPy documentation may help you get started with numpy.

#### Arrays

A numpy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

We can initialize numpy arrays from nested Python lists and access elements using square brackets:

```
import numpy as np

a = np.array([1, 2, 3])   # Create a rank 1 array
print(type(a))            # Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Prints "(3,)"
print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"

b = np.array([[1,2,3],[4,5,6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"
```
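The relationship between rank, shape, and element count described above can be checked directly with the ndim, shape, and size attributes. A minimal sketch:

```python
import numpy as np

a = np.array([1, 2, 3])               # rank 1
b = np.array([[1, 2, 3], [4, 5, 6]])  # rank 2

print(a.ndim, a.shape, a.size)  # Prints "1 (3,) 3"
print(b.ndim, b.shape, b.size)  # Prints "2 (2, 3) 6"

# The rank is simply the length of the shape tuple.
print(len(b.shape) == b.ndim)   # Prints "True"
```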

Numpy also provides many functions for creating arrays:

```
import numpy as np

a = np.zeros((2,2))   # Create an array of all zeros
print(a)              # Prints "[[0. 0.]
#          [0. 0.]]"

b = np.ones((1,2))    # Create an array of all ones
print(b)              # Prints "[[1. 1.]]"

c = np.full((2,2), 7)  # Create a constant array (integer dtype, since 7 is an int)
print(c)               # Prints "[[7 7]
#          [7 7]]"

d = np.eye(2)         # Create a 2x2 identity matrix
print(d)              # Prints "[[1. 0.]
#          [0. 1.]]"

e = np.random.random((2,2))  # Create an array filled with random values
print(e)                     # Might print "[[0.91940167 0.08143941]
#               [0.68744134 0.87236687]]"
```

#### Array index

Numpy offers several ways to index into arrays.

Slicing: similar to Python lists, numpy arrays can be sliced. Since arrays may be multidimensional, you must specify a slice for each dimension of the array:

```
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]

# A slice of an array is a view into the same data, so modifying it
# will modify the original array.
print(a[0, 1])   # Prints "2"
b[0, 0] = 77     # b[0, 0] is the same piece of data as a[0, 1]
print(a[0, 1])   # Prints "77"
```
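If you need an independent subarray rather than a view, one option is to copy the slice explicitly. The sketch below contrasts the two; np.shares_memory is used only to confirm which arrays share data, and the names view and copied are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

view = a[:2, 1:3]            # A slice is a view into a's data
copied = a[:2, 1:3].copy()   # An explicit copy owns its own data

copied[0, 0] = 77            # Modifying the copy leaves a untouched
print(a[0, 1])               # Prints "2"

view[0, 0] = 99              # Modifying the view writes through to a
print(a[0, 1])               # Prints "99"

print(np.shares_memory(a, view), np.shares_memory(a, copied))  # Prints "True False"
```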

You can also mix integer indexing with slice indexing. However, doing so yields an array of lower rank than the original array. Note that this is quite different from the way MATLAB handles array slicing:

```
import numpy as np

# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])

# Two ways of accessing the data in the middle row of the array.
# Mixing integer indexing with slices yields an array of lower rank,
# while using only slices yields an array of the same rank as the
# original array:
row_r1 = a[1, :]    # Rank 1 view of the second row of a
row_r2 = a[1:2, :]  # Rank 2 view of the second row of a
print(row_r1, row_r1.shape)  # Prints "[5 6 7 8] (4,)"
print(row_r2, row_r2.shape)  # Prints "[[5 6 7 8]] (1, 4)"

# We can make the same distinction when accessing columns of an array:
col_r1 = a[:, 1]
col_r2 = a[:, 1:2]
print(col_r1, col_r1.shape)  # Prints "[ 2  6 10] (3,)"
print(col_r2, col_r2.shape)  # Prints "[[ 2]
#          [ 6]
#          [10]] (3, 1)"
```

Integer array indexing: when you index into numpy arrays using slicing, the resulting array view is always a subarray of the original array. In contrast, integer array indexing allows you to construct arbitrary arrays using the data from another array. Here is an example:

```
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

# An example of integer array indexing.
# The returned array will have shape (3,).
print(a[[0, 1, 2], [0, 1, 0]])  # Prints "[1 4 5]"

# The above example of integer array indexing is equivalent to this:
print(np.array([a[0, 0], a[1, 1], a[2, 0]]))  # Prints "[1 4 5]"

# When using integer array indexing, you can reuse the same
# element from the source array:
print(a[[0, 0], [1, 1]])  # Prints "[2 2]"

# Equivalent to the previous integer array indexing example
print(np.array([a[0, 1], a[0, 1]]))  # Prints "[2 2]"
```

One useful trick with integer array indexing is selecting or mutating one element from each row of a matrix:

```
import numpy as np

# Create a new array from which we will select elements
a = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])

print(a)  # Prints "[[ 1  2  3]
#          [ 4  5  6]
#          [ 7  8  9]
#          [10 11 12]]"

# Create an array of indices
b = np.array([0, 2, 0, 1])

# Select one element from each row of a using the indices in b
print(a[np.arange(4), b])  # Prints "[ 1  6  7 11]"

# Mutate one element from each row of a using the indices in b
a[np.arange(4), b] += 10

print(a)  # Prints "[[11  2  3]
#          [ 4  5 16]
#          [17  8  9]
#          [10 21 12]]"
```

Boolean array indexing: Boolean array indexing lets you pick out arbitrary elements of an array. Frequently this type of indexing is used to select the elements of an array that satisfy some condition. Here is an example:

```
import numpy as np

a = np.array([[1,2], [3, 4], [5, 6]])

bool_idx = (a > 2)   # Find the elements of a that are bigger than 2;
# this returns a numpy array of Booleans of the same
# shape as a, where each slot of bool_idx tells
# whether that element of a is > 2.

print(bool_idx)      # Prints "[[False False]
#          [ True  True]
#          [ True  True]]"

# We use boolean array indexing to construct a rank 1 array
# consisting of the elements of a corresponding to the True values
# of bool_idx
print(a[bool_idx])  # Prints "[3 4 5 6]"

# We can do all of the above in a single concise statement:
print(a[a > 2])     # Prints "[3 4 5 6]"
```
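Boolean masks can also appear on the left-hand side of an assignment, and several conditions can be combined. The following is a small sketch under those assumptions; the array b and the name mask are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])
a[a > 2] = 0   # Assign to every element where the mask is True
print(a)       # Prints "[[1 2]
               #          [0 0]
               #          [0 0]]"

b = np.array([[1, 2], [3, 4], [5, 6]])
# Combine masks elementwise with & and |; the Python keywords
# "and" / "or" do not work on arrays.
mask = (b % 2 == 0) & (b > 2)
print(b[mask])  # Prints "[4 6]"
```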

#### Data type

Every numpy array is a grid of elements of the same type. Numpy provides a large set of numeric data types that you can use to construct arrays. Numpy tries to guess a data type when you create an array, but functions that construct arrays usually also include an optional argument to specify the data type explicitly. Here is an example:

```
import numpy as np

x = np.array([1, 2])   # Let numpy choose the datatype
print(x.dtype)         # Prints "int64" on most platforms (the default integer width is platform-dependent)

x = np.array([1.0, 2.0])   # Let numpy choose the datatype
print(x.dtype)             # Prints "float64"

x = np.array([1, 2], dtype=np.int64)   # Force a particular datatype
print(x.dtype)                         # Prints "int64"
```
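An existing array can also be converted to another data type with the astype method, which returns a new array. A minimal sketch, with illustrative variable names:

```python
import numpy as np

x = np.array([1.7, -2.3, 3.9])
xi = x.astype(np.int64)   # Convert to integers; fractional parts are truncated toward zero
print(xi, xi.dtype)       # Prints "[ 1 -2  3] int64"
print(x.dtype)            # Prints "float64"; x itself is unchanged

y = np.array([1, 2]) / 2  # True division of an integer array produces floats
print(y.dtype)            # Prints "float64"
```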

#### Mathematics in arrays

Basic mathematical functions operate elementwise on arrays and are available both as operator overloads and as functions in the numpy module:

```
import numpy as np

x = np.array([[1,2],[3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))
```

Note that unlike MATLAB, * is elementwise multiplication, not matrix multiplication. We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. dot is available both as a function in the numpy module and as an instance method of array objects:

```
import numpy as np

x = np.array([[1,2],[3,4]])
y = np.array([[5,6],[7,8]])

v = np.array([9,10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# Matrix / matrix product; both produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))
```
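In Python 3.5 and later, the @ operator offers another spelling of the same products for rank 1 and rank 2 arrays. The sketch below repeats the examples above with @ and checks one of them against np.dot.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])

print(v @ w)  # Inner product of vectors; prints "219"
print(x @ v)  # Matrix / vector product; prints "[29 67]"
print(x @ y)  # Matrix / matrix product; prints "[[19 22]
              #                                  [43 50]]"

# For these inputs the @ operator and np.dot agree.
print(np.array_equal(x @ y, np.dot(x, y)))  # Prints "True"
```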

Numpy provides many useful functions for performing computations on arrays; one of the most useful is sum:

```
import numpy as np

x = np.array([[1,2],[3,4]])

print(np.sum(x))  # Compute sum of all elements; prints "10"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[4 6]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[3 7]"
```

Apart from computing mathematical functions using arrays, we frequently need to reshape or otherwise manipulate data in arrays. The simplest example of this type of operation is transposing a matrix; to transpose a matrix, simply use the T attribute of an array object:

```
import numpy as np

x = np.array([[1,2], [3,4]])
print(x)    # Prints "[[1 2]
#          [3 4]]"
print(x.T)  # Prints "[[1 3]
#          [2 4]]"

# Note that taking the transpose of a rank 1 array does nothing:
v = np.array([1,2,3])
print(v)    # Prints "[1 2 3]"
print(v.T)  # Prints "[1 2 3]"
```
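Because transposing a rank 1 array does nothing, a column vector has to be created by adding a dimension. Two common ways are reshape and indexing with np.newaxis. A small sketch, with illustrative variable names:

```python
import numpy as np

v = np.array([1, 2, 3])    # Rank 1 array of shape (3,)

col = v.reshape(3, 1)      # Column vector of shape (3, 1)
col2 = v[:, np.newaxis]    # Same result using np.newaxis
row = v[np.newaxis, :]     # Row vector of shape (1, 3)

print(col.shape, col2.shape, row.shape)  # Prints "(3, 1) (3, 1) (1, 3)"
print(row.T.shape)                       # Transposing a rank 2 row vector; prints "(3, 1)"
```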

#### Broadcasting

Broadcasting is a powerful mechanism that allows numpy to work with arrays of different shapes when performing arithmetic operations. Frequently we have a smaller array and a larger array, and we want to use the smaller array multiple times to perform some operation on the larger array.

For example, suppose we want to add a constant vector to each row of the matrix. We can do this:

```
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = np.empty_like(x)   # Create an empty matrix with the same shape as x

# Add the vector v to each row of the matrix x with an explicit loop
for i in range(4):
    y[i, :] = x[i, :] + v

# Now y is the following
# [[ 2  2  4]
#  [ 5  5  7]
#  [ 8  8 10]
#  [11 11 13]]
print(y)
```

This works; however, when the matrix x is very large, computing an explicit loop in Python can be slow. Note that adding the vector v to each row of the matrix x is equivalent to forming a matrix vv by stacking multiple copies of v vertically and then performing an elementwise sum of x and vv. We could implement this approach like this:

```
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
vv = np.tile(v, (4, 1))   # Stack 4 copies of v on top of each other
print(vv)                 # Prints "[[1 0 1]
#          [1 0 1]
#          [1 0 1]
#          [1 0 1]]"
y = x + vv  # Add x and vv elementwise
print(y)  # Prints "[[ 2  2  4]
#          [ 5  5  7]
#          [ 8  8 10]
#          [11 11 13]]"
```

Numpy broadcasting allows us to perform this computation without actually creating multiple copies of v. Consider this version, using broadcasting:

```
import numpy as np

# We will add the vector v to each row of the matrix x,
# storing the result in the matrix y
x = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
v = np.array([1, 0, 1])
y = x + v  # Add v to each row of x using broadcasting
print(y)  # Prints "[[ 2  2  4]
#          [ 5  5  7]
#          [ 8  8 10]
#          [11 11 13]]"
```

The line y = x + v works even though x has shape (4, 3) and v has shape (3,), thanks to broadcasting; the line works as if v actually had shape (4, 3), where each row is a copy of v, and the sum is performed elementwise.

Broadcasting two arrays together follows these rules:

1. If the arrays do not have the same rank, prepend 1s to the shape of the lower-rank array until both shapes have the same length.
2. The two arrays are said to be compatible in a dimension if they have the same size in that dimension, or if one of the arrays has size 1 in that dimension.
3. The arrays can be broadcast together if they are compatible in all dimensions.
4. After broadcasting, each array behaves as if it had a shape equal to the elementwise maximum of the shapes of the two input arrays.
5. In any dimension where one array has size 1 and the other array has size greater than 1, the first array behaves as if it were copied along that dimension.

If this explanation does not make sense, try reading the broadcasting section of the NumPy documentation.
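The rules above can be sketched as a small function that computes the broadcast shape of two input shapes. The helper broadcast_shape is hypothetical and written only to illustrate the rules; numpy's own result, obtained here through np.broadcast, is used as a cross-check.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shape tuples."""
    # Rule 1: prepend 1s to the shorter shape until both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for da, db in zip(a, b):
        # Rules 2 and 3: sizes must match, or one of them must be 1.
        if da != db and da != 1 and db != 1:
            raise ValueError('shapes %s and %s are not compatible' % (shape_a, shape_b))
        # Rules 4 and 5: a size-1 dimension stretches to the other size
        # (for nonzero sizes this is the elementwise maximum).
        result.append(db if da == 1 else da)
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))  # Prints "(4, 3)"
print(broadcast_shape((3, 1), (2,)))  # Prints "(3, 2)"
print(broadcast_shape((2, 3), ()))    # Prints "(2, 3)"

# Cross-check against numpy for one of the cases.
print(np.broadcast(np.empty((3, 1)), np.empty((2,))).shape)  # Prints "(3, 2)"
```

Shapes such as (2, 3) and (2,) fail the compatibility check, which is why adding a vector to each column requires a transpose or a reshape to (2, 1).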

Functions that support broadcasting are known as universal functions. You can find the list of all universal functions in the NumPy documentation.

Here are some applications of broadcasting:

```
import numpy as np

# Compute outer product of vectors
v = np.array([1,2,3])  # v has shape (3,)
w = np.array([4,5])    # w has shape (2,)
# To compute an outer product, we first reshape v to be a column
# vector of shape (3, 1); we can then broadcast it against w to yield
# an output of shape (3, 2), which is the outer product of v and w:
# [[ 4  5]
#  [ 8 10]
#  [12 15]]
print(np.reshape(v, (3, 1)) * w)

# Add a vector to each row of a matrix
x = np.array([[1,2,3], [4,5,6]])
# x has shape (2, 3) and v has shape (3,) so they broadcast to (2, 3),
# giving the following matrix:
# [[2 4 6]
#  [5 7 9]]
print(x + v)

# Add a vector to each column of a matrix
# x has shape (2, 3) and w has shape (2,).
# If we transpose x then it has shape (3, 2) and can be broadcast
# against w to yield a result of shape (3, 2); transposing this result
# yields the final result of shape (2, 3) which is the matrix x with
# the vector w added to each column. Gives the following matrix:
# [[ 5  6  7]
#  [ 9 10 11]]
print((x.T + w).T)
# Another solution is to reshape w to be a column vector of shape (2, 1);
# we can then broadcast it directly against x to produce the same
# output.
print(x + np.reshape(w, (2, 1)))

# Multiply a matrix by a constant:
# x has shape (2, 3). Numpy treats scalars as arrays of shape ();
# these can be broadcast together to shape (2, 3), producing the
# following array:
# [[ 2  4  6]
#  [ 8 10 12]]
print(x * 2)
```

Posted on Fri, 06 Mar 2020 03:25:41 -0800 by sanlove