Data Manipulation
Some Definitions
A tensor represents a (possibly multidimensional) array of numerical values.
- by invoking arange(n) → create a row vector of n values from 0 to n-1
x = torch.arange(12, dtype=torch.float32)
- access a tensor's shape (the length along each axis)
x.shape - output: torch.Size([12])
- inspect the total number of elements in a tensor
x.numel() - output: 12
- change the shape without changing the number of elements or their values
X = x.reshape(3, 4), which turns x into a matrix
- placing -1 in the shape → automatically infer that component of the shape.
- In our case, instead of calling x.reshape(3, 4), we could have equivalently called x.reshape(-1, 4) or x.reshape(3, -1).
- create a tensor of all 1s / all 0s
torch.zeros((2, 3, 4)) or torch.ones((2, 3, 4))
- create a tensor with elements drawn from a standard Gaussian (normal) distribution with mean 0 and standard deviation 1
torch.randn(3, 4)
- create by listing all values
torch.tensor([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
- sum all the elements
X.sum()
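The creation and inspection ops above can be exercised together; a minimal sketch (PyTorch assumed available):

```python
import torch

# Create a row vector 0..11 as floats, then reshape it into a 3x4 matrix.
x = torch.arange(12, dtype=torch.float32)
X = x.reshape(3, 4)              # same result as x.reshape(-1, 4) or x.reshape(3, -1)

zeros = torch.zeros((2, 3, 4))   # all 0s
ones = torch.ones((2, 3, 4))     # all 1s
noise = torch.randn(3, 4)        # samples from a standard normal distribution

print(X.shape)    # torch.Size([3, 4])
print(X.numel())  # 12
print(X.sum())    # tensor(66.)  (sum of 0..11)
```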
Indexing and Slicing
nearly the same as indexing in standard Python
[-1] selects the last row and [1:3] selects the second and third rows.
X[a:b] outputs the a-th row through the (b-1)-th row (0-based).
X[a, b] = 17 modifies a single element by specifying its indices.
X[:2, :] = 12 - before the comma is the range of row indices to modify; after the comma is the range of column indices; leaving a range empty selects everything along that axis.
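A short sketch of those read and write patterns on a 3x4 matrix:

```python
import torch

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)

last_row = X[-1]    # tensor([ 8.,  9., 10., 11.])
rows = X[1:3]       # second and third rows (note: a view, so later writes to X show through)
X[1, 2] = 17        # write one element by row and column index
X[:2, :] = 12       # rows 0 and 1, every column, set to 12
print(X)
```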
Operations
- element-wise operations: These apply a standard scalar operation to each element of a tensor.
The common standard arithmetic operators for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (**) have all been lifted to elementwise operations for identically shaped tensors of arbitrary shape.
x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y
Output:
(tensor([ 3., 4., 6., 10.]),
tensor([-1., 0., 2., 6.]),
tensor([ 2., 4., 8., 16.]),
tensor([0.5000, 1.0000, 2.0000, 4.0000]),
tensor([ 1., 4., 16., 64.]))
- linear algebraic operations
- Hadamard product: A * B - elementwise multiplication of two matrices (or vectors) of the same shape
- concatenate operations: stacking tensors end-to-end to form a larger tensor.
We need to provide a list of tensors and tell the system along which axis to concatenate.
e.g.
torch.cat((X, Y), dim=0), torch.cat((X, Y), dim=1)
dim=0 means concatenation along axis 0 (rows); dim=1 means along axis 1 (columns).
- logical operations
e.g.
X == Y will output a binary tensor built from elementwise logical comparisons.
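The elementwise, concatenation, and logical operations above in one runnable sketch (X and Y chosen to match the shapes used earlier):

```python
import torch

x = torch.tensor([1.0, 2, 4, 8])
y = torch.tensor([2.0, 2, 2, 2])

# Elementwise arithmetic on identically shaped tensors.
print(x + y, x - y, x * y, x / y, x ** y)

X = torch.arange(12, dtype=torch.float32).reshape(3, 4)
Y = torch.tensor([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

print(X * Y)                            # Hadamard product (elementwise multiply)
print(torch.cat((X, Y), dim=0).shape)   # torch.Size([6, 4]) - stacked along rows
print(torch.cat((X, Y), dim=1).shape)   # torch.Size([3, 8]) - stacked along columns
print(X == Y)                           # boolean tensor of elementwise equality
```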
Broadcasting
How it works:
- expand one or both arrays by copying elements along axes with length 1 so that after this transformation, the two tensors have the same shape;
- perform an elementwise operation on the resulting arrays.
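The two-step mechanism can be seen with a column vector and a row vector; each is expanded along its length-1 axis before the elementwise add:

```python
import torch

a = torch.arange(3).reshape(3, 1)  # shape (3, 1): a column vector
b = torch.arange(2).reshape(1, 2)  # shape (1, 2): a row vector

# a is copied along axis 1 and b along axis 0, so both behave as (3, 2).
print(a + b)
# tensor([[0, 1],
#         [1, 2],
#         [2, 3]])
```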
Data Preprocessing
Read the Dataset
- CSV (comma-separated values) files, which store tabular (spreadsheet-like) data and can be loaded with pandas.
Data Preparation
- separate out columns corresponding to input versus target values.
- missing values: these appear as a special NaN (not a number) value and are handled by imputation or deletion: imputation replaces missing values with estimates (e.g., picking a substitute value, or assigning the same value to all entries of the same type), while deletion simply discards the rows or columns that contain missing values.
Conversion to the Tensor Format
X = torch.tensor(inputs.to_numpy(dtype=float))
y = torch.tensor(targets.to_numpy(dtype=float))
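End to end, the steps above might look like the following sketch; the inline CSV and its column names (NumRooms, Price) are hypothetical stand-ins for a real file:

```python
from io import StringIO

import pandas as pd
import torch

# A hypothetical CSV with one missing numeric value (read as NaN).
csv = StringIO("NumRooms,Price\n2,106000\n4,178100\nNA,140000\n")
data = pd.read_csv(csv)

# Separate input columns from the target column.
inputs, targets = data[["NumRooms"]], data["Price"]

# Imputation: replace NaN with the column mean, here (2 + 4) / 2 = 3.
inputs = inputs.fillna(inputs.mean())

X = torch.tensor(inputs.to_numpy(dtype=float))
y = torch.tensor(targets.to_numpy(dtype=float))
print(X, y)
```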
X, y
Automatic Differentiation
PyTorch stores the values computed in the forward pass; a deep learning system builds a computational graph, and automatic differentiation then lets the system backpropagate gradients. Backpropagation means tracing through the computational graph and filling in the partial derivative with respect to each parameter.
e.g., for a scalar function y of x:
the backward function: y.backward() automatically computes the gradient of y with respect to each component of x
x.grad holds (and can print) the gradient for each component of x
x.grad.zero_() clears any accumulated gradients
x.requires_grad_(True) # equivalent to x = torch.arange(4.0, requires_grad=True)
# requires_grad=True marks that this tensor needs its gradient computed
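A minimal sketch of those calls; the scalar function y = 2·(x·x) is an illustrative choice, not the original example:

```python
import torch

x = torch.arange(4.0, requires_grad=True)  # same effect as x.requires_grad_(True)
y = 2 * torch.dot(x, x)   # an illustrative scalar function of x

y.backward()              # fills x.grad with dy/dx for each component
print(x.grad)             # tensor([ 0.,  4.,  8., 12.])  since dy/dx = 4x

x.grad.zero_()            # clear the gradient before the next backward pass
```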
x.grad # defaults to None
e.g., for the function y = x * x (elementwise):
What y.sum().backward() really does:
Calling y.sum() actually creates a new scalar s = y₁ + y₂ + ... + yₙ.
Differentiating now, by the chain rule, for each component xᵢ: ∂s/∂xᵢ = Σⱼ (∂s/∂yⱼ)(∂yⱼ/∂xᵢ).
Because ∂s/∂yⱼ always equals 1, differentiating through sum() is equivalent to assigning an external gradient of 1.0 to every yⱼ.
# Calling backward on a non-scalar requires passing a gradient argument, which specifies the gradient of the differentiated function with respect to self.
# Here we only want the sum of the partial derivatives, so passing a gradient of ones is appropriate.
x.grad.zero_()
y = x * x
# equivalent to y.backward(torch.ones(len(x)))
y.sum().backward()
x.grad
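The equivalent explicit form, passing a gradient of ones to backward on the non-scalar output:

```python
import torch

x = torch.arange(4.0, requires_grad=True)
y = x * x                 # non-scalar (vector) output

# Passing ones as the external gradient sums the partial derivatives,
# which is exactly what y.sum().backward() computes.
y.backward(gradient=torch.ones(len(x)))
print(x.grad)             # tensor([0., 2., 4., 6.])  since d(x_i^2)/dx_i = 2 * x_i
```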