Data Manipulation
Some Definitions
A tensor represents a (possibly multidimensional) array of numerical values.
- by invoking arange(n) → create a row vector of n values from 0 to n-1
x = torch.arange(12, dtype=torch.float32)
- access a tensor's shape (the length along each axis)
x.shape - output: torch.Size([12])
- inspect the total number of elements in a tensor
x.numel() - output: 12
- change the shape without changing the number of elements or their values
X = x.reshape(3, 4), which turns x into a matrix
- placing -1 in the shape → automatically infer that component of the shape.
- In our case, instead of calling x.reshape(3, 4), we could have equivalently called x.reshape(-1, 4) or x.reshape(3, -1).
- create a tensor of all 1s / all 0s
torch.zeros((2, 3, 4)) or torch.ones((2, 3, 4))
- create a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1
torch.randn(3, 4)
- create by listing all values
torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
- sum all the elements
X.sum()
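The creation and inspection ops above can be exercised together; a minimal sketch (PyTorch assumed available):

```python
import torch

# Create a row vector 0..11 as floats, then reshape it into a 3x4 matrix.
x = torch.arange(12, dtype=torch.float32)
X = x.reshape(3, 4)              # same result as x.reshape(-1, 4) or x.reshape(3, -1)

zeros = torch.zeros((2, 3, 4))   # all 0s
ones = torch.ones((2, 3, 4))     # all 1s
noise = torch.randn(3, 4)        # samples from a standard normal distribution

print(X.shape)    # torch.Size([3, 4])
print(X.numel())  # 12
print(X.sum())    # tensor(66.)  (sum of 0..11)
```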
Indexing and Slicing
nearly the same as indexing in standard Python
[-1] selects the last row and [1:3] selects the second and third rows.
X[a:b] outputs the a-th row through the (b-1)-th row (0-based).
X[a, b] = 17 modifies a single element by specifying its indices.
X[:2, :] = 12 - before the comma is the range of row indices to modify; after the comma is the range of column indices; leaving a range empty selects everything along that axis.
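A short sketch of those read and write patterns on a 3x4 matrix:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)

last_row = X[-1]    # tensor([ 8.,  9., 10., 11.])
rows = X[1:3]       # second and third rows (note: a view, so later writes to X show through)
X[1, 2] = 17        # write one element by row and column index
X[:2, :] = 12       # rows 0 and 1, every column, set to 12
print(X)
```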
Operations
- element-wise operations: These apply a standard scalar operation to each element of a tensor.
The common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**) have all been lifted to elementwise operations for identically shaped tensors of arbitrary shape.
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y
Output:
(tensor([ 3., 4., 6., 10.]),
tensor([-1., 0., 2., 6.]),
tensor([ 2., 4., 8., 16.]),
tensor([0.5000, 1.0000, 2.0000, 4.0000]),
tensor([ 1., 4., 16., 64.]))
- linear algebraic operations
- Hadamard product: A * B - elementwise multiplication of two matrices (or vectors) of the same shape
- concatenate operations: stacking tensors end-to-end to form a larger tensor.
We need to provide a list of tensors and tell the system along which axis to concatenate.
e.g.
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)
dim=0 means concatenation along axis 0 (rows); dim=1 means along axis 1 (columns).
- logical operations
e.g.
X == Y will output a binary tensor built from elementwise logical comparisons.
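The elementwise, concatenation, and logical operations above in one runnable sketch (X and Y chosen to match the shapes used earlier):

```python
import torch

x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2.0, 2, 2, 2])

# Elementwise arithmetic on identically shaped tensors.
print(x + y, x - y, x * y, x / y, x ** y)

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

print(X * Y)                            # Hadamard product (elementwise multiply)
print(torch.cat((X, Y), dim=0).shape)   # torch.Size([6, 4]) - stacked along rows
print(torch.cat((X, Y), dim=1).shape)   # torch.Size([3, 8]) - stacked along columns
print(X == Y)                           # boolean tensor of elementwise equality
```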
Broadcasting
How it works:
- expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape;
- perform an elementwise operation on the resulting arrays.
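The two-step mechanism can be seen with a column vector and a row vector; each is expanded along its length-1 axis before the elementwise add:

```python
import torch

a = torch.arange(3).reshape(3, 1)  # shape (3, 1): a column vector
b = torch.arange(2).reshape(1, 2)  # shape (1, 2): a row vector

# a is copied along axis 1 and b along axis 0, so both behave as (3, 2).
print(a + b)
# tensor([[0, 1],
#         [1, 2],
#         [2, 3]])
```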
Data Preprocessing
Read the Dataset
- CSV (comma-separated values) files, which store tabular (spreadsheet-like) data and can be loaded with pandas.
Data Preparation
- separate out columns corresponding to input versus target values.
- missing values: these appear as a special NaN (not a number) value and are handled by imputation or deletion: imputation replaces missing values with estimates (e.g., picking a substitute value, or assigning the same value to all entries of the same type), while deletion simply discards the rows or columns that contain missing values.
Conversion to the Tensor Format
X = torch.tensor(inputs.to_numpy(dtype=float))
y = torch.tensor(targets.to_numpy(dtype=float))
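End to end, the steps above might look like the following sketch; the inline CSV and its column names (NumRooms, Price) are hypothetical stand-ins for a real file:

```python
from io import StringIO

import pandas as pd
import torch

# A hypothetical CSV with one missing numeric value (read as NaN).
csv = StringIO("NumRooms,Price\n2,106000\n4,178100\nNA,140000\n")
data = pd.read_csv(csv)

# Separate input columns from the target column.
inputs, targets = data[["NumRooms"]], data["Price"]

# Imputation: replace NaN with the column mean, here (2 + 4) / 2 = 3.
inputs = inputs.fillna(inputs.mean())

X = torch.tensor(inputs.to_numpy(dtype=float))
y = torch.tensor(targets.to_numpy(dtype=float))
print(X, y)
```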
X, y
Automatic Differentiation
PyTorch stores the values computed in the forward pass; a deep learning system builds a computational graph, and automatic differentiation then lets the system backpropagate gradients. Backpropagation means tracing through the computational graph and filling in the partial derivative with respect to each parameter.
e.g., for a scalar function y of x:
the backward function: y.backward() automatically computes the gradient of y with respect to each component of x
x.grad holds (and can print) the gradient for each component of x
x.grad.zero_() clears any accumulated gradients
x.requires_grad_(True) # equivalent to x = torch.arange(4.0, requires_grad=True)
# requires_grad=True marks that this tensor needs its gradient computed
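A minimal sketch of those calls; the scalar function y = 2·(x·x) is an illustrative choice, not the original example:

```python
import torch

x = torch.arange(4.0, requires_grad=True)  # same effect as x.requires_grad_(True)
y = 2 * torch.dot(x, x)   # an illustrative scalar function of x

y.backward()              # fills x.grad with dy/dx for each component
print(x.grad)             # tensor([ 0.,  4.,  8., 12.])  since dy/dx = 4x

x.grad.zero_()            # clear the gradient before the next backward pass
```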
x.grad # defaults to None
e.g., for the function y = x * x (elementwise):
What y.sum().backward() really does:
Calling y.sum() actually creates a new scalar s = y₁ + y₂ + ... + yₙ.
Differentiating now, by the chain rule, for each component xᵢ: ∂s/∂xᵢ = Σⱼ (∂s/∂yⱼ)(∂yⱼ/∂xᵢ).
Because ∂s/∂yⱼ always equals 1, differentiating through sum() is equivalent to assigning an external gradient of 1.0 to every yⱼ.
# Calling backward on a non-scalar requires passing a gradient argument, which specifies the gradient of the differentiated function with respect to self.
# Here we only want the sum of the partial derivatives, so passing a gradient of ones is appropriate.
x.grad.zero_()
y = x * x
# equivalent to y.backward(torch.ones(len(x)))
y.sum().backward()
x.grad
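The equivalent explicit form, passing a gradient of ones to backward on the non-scalar output:

```python
import torch

x = torch.arange(4.0, requires_grad=True)
y = x * x                 # non-scalar (vector) output

# Passing ones as the external gradient sums the partial derivatives,
# which is exactly what y.sum().backward() computes.
y.backward(gradient=torch.ones(len(x)))
print(x.grad)             # tensor([0., 2., 4., 6.])  since d(x_i^2)/dx_i = 2 * x_i
```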