「PyTorch」：2-Tensors Explained And Operations

Posted 2020-10-21 | Updated 2021-02-28 | PyTorch

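Notes from learning the PyTorch framework.

This post introduces tensors in PyTorch and their basic operations, which fall into four groups: reshape, element-wise, reduction, and access.

For hands-on walkthroughs of the tensor operations, it is best to follow along with the Colab notebooks: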

PyTorch Tensors Explained

Tensor Operations: Reshape

Tensor Operations: Element-wise

Tensor Operation: Reduction and Access

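The English explanations are clear and precise, so these notes are kept mainly in English. Short summary notes in 【】 brackets are interspersed for quick reading.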

Introducing Tensors

Tensor Explained - Data Structures of Deep Learning

What Is A Tensor?

A tensor is the primary data structure used by neural networks.

【Tensors are the primary data structure in neural networks.】

Indexes Required To Access An Element

The relationship within each of these pairs is that both elements require the same number of indexes to refer to a specific element within the data structure.

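【Each pair below requires the same number of indexes to identify a specific element.】

【A tensor is the generalization: a single, uniform definition that covers all of these structures.】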

Indexes required	Computer science	Mathematics
0	number	scalar
1	array	vector
2	2d-array	matrix

Tensors Are Generalizations

When more than two indexes are required to access a specific element, we stop giving specific names to the structures, and we begin using more general language.

Mathematics

In mathematics, we stop using words like scalar, vector, and matrix, and we start using the word tensor or nd-tensor. The n tells us the number of indexes required to access a specific element within the structure.

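【In mathematics, when more than two indexes are needed to identify a specific element, we call the structure a tensor or nd-tensor, meaning that n indexes are needed to identify a specific element within it.】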

Computer Science

In computer science, we stop using words like number, array, and 2d-array, and start using the term multidimensional array or nd-array. The n tells us the number of indexes required to access a specific element within the structure.

【In computer science we say nd-array, so an nd-array and a tensor are in fact the same thing.】

Indexes required	Computer science	Mathematics
n	nd-array	nd-tensor

Tensors and nd-arrays are the same thing!

One thing to note about the dimension of a tensor is that it differs from what we mean when we refer to the dimension of a vector in a vector space. The dimension of a tensor does not tell us how many components exist within the tensor.

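【Note that the dimension of a tensor is not the same thing as the dimension of a vector in a vector space. The dimension of a vector tells us how many components the vector has, whereas the dimension of a tensor is what is called rank below.】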

Rank, Axes, And Shape Explained

【The sections below explain three important tensor attributes in deep learning: rank, axes, and shape.】

The concepts of rank, axes, and shape are the tensor attributes that will concern us most in deep learning.

Rank
Axes
Shape

Rank And Indexes

We are introducing the word rank here because it is commonly used in deep learning when referring to the number of dimensions present within a given tensor.

The rank of a tensor tells us how many indexes are required to access (refer to) a specific data element contained within the tensor data structure.

A tensor’s rank tells us how many indexes are needed to refer to a specific element within the tensor.

【Rank here is simply the number of dimensions of the tensor.】

【The rank of a tensor tells us how many indexes are needed to identify a specific element within it.】

Axes Of A Tensor

If we have a tensor, and we want to refer to a specific dimension, we use the word axis in deep learning.

An axis of a tensor is a specific dimension of a tensor.

Elements are said to exist or run along an axis. This running is constrained by the length of each axis. Let’s look at the length of an axis now.

Length Of An Axis

The length of each axis tells us how many indexes are available along each axis.

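【When we refer to one specific dimension of a tensor, deep learning uses the word axis.】

【Elements are said to exist or run along an axis, and how far they run is limited by the length of that axis.】

【The length of an axis is the number of indexes available along that dimension.】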

Shape Of A Tensor

The shape of a tensor is determined by the length of each axis, so if we know the shape of a given tensor, then we know the length of each axis, and this tells us how many indexes are available along each axis.

The shape of a tensor gives us the length of each axis of the tensor.

【The shape of a tensor is determined by the length of each axis, that is, the number of indexes along each axis.】

Additionally, one of the types of operations we must perform frequently when we are programming our neural networks is called reshaping.

Reshaping changes the shape but not the underlying data elements.

【The common reshape operation changes only the tensor's shape, not the underlying data.】
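The three attributes can be read directly off a small tensor. The following is a minimal sketch; the variable names are illustrative and not taken from the notebooks.

```python
import torch

# A 3 x 4 tensor: two axes, so two indexes identify one element.
t = torch.tensor([
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
])

rank = len(t.shape)           # number of axes (same value as t.dim())
axis_lengths = list(t.shape)  # length of each axis
element = t[1][2].item()      # one index per axis selects a single element

# Reshaping changes the shape but keeps the same 12 elements in the same order.
reshaped = t.reshape(2, 6)
same_data = torch.equal(t.flatten(), reshaped.flatten())

print(rank, axis_lengths, element, reshaped.shape, same_data)
```

Here the rank equals the length of the shape, which is why the shape alone is enough to recover the rank and the axis lengths.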

CNN Tensors Shape Explained

For an introduction to CNNs, see this article.

What I want to do now is put the concepts of rank, axes, and shape to use with a practical example. To do this, we’ll consider an image input as a tensor to a CNN.

Remember that the shape of a tensor encodes all the relevant information about a tensor’s axes, rank, and indexes, so we’ll consider the shape in our example, and this will enable us to work out the other values.

【The shape of a tensor encodes all the information about its axes, rank, and indexes.】

【We use a CNN as the example to illustrate rank, axes, and shape.】

Shape Of A CNN Input

The shape of a CNN input typically has a length of four. This means that we have a rank-4 tensor with four axes. Each index in the tensor’s shape represents a specific axis, and the value at each index gives us the length of the corresponding axis.

【The input to a CNN is a rank-4 tensor.】

Each axis of a tensor usually represents some type of real world or logical feature of the input data. If we understand each of these features and their axis location within the tensor, then we can have a pretty good understanding of the tensor data structure overall.

【Each axis of a tensor usually represents some logical feature, so knowing which feature sits at which axis position helps us understand the tensor as a whole.】

Image Height And Width

To represent two dimensions, we need two axes.

The image height and width are represented on the last two axes.

【Representing an image's height and width takes two axes, and the last two axes are used for them.】

Image Color Channels

The next axis represents the color channels. Typical values here are 3 for RGB images or 1 if we are working with grayscale images. This color channel interpretation only applies to the input tensor.

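【The next axis (moving from right to left) represents the image's color channels: a grayscale image has 1 color channel and an RGB image has 3.】

【Note: the color channel interpretation applies only to the input tensor.】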

Image Batches

This brings us to the first axis of the four, which represents the batch size. In neural networks, we usually work with batches of samples as opposed to single samples, so the length of this axis tells us how many samples are in our batch.

Suppose we have the following shape [3, 1, 28, 28] for a given tensor. Using the shape, we can determine that we have a batch of three images.

【The first axis represents the batch and gives the batch size. In deep learning we usually work with a batch of samples rather than a single sample, so this dimension tells us how many samples are in the batch.】

tensor: [Batch, Channels, Height, Width]

Each image has a single color channel, and the image height and width are 28 x 28 respectively.

Batch size
Color channels
Height
Width
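As a small sketch of the [3, 1, 28, 28] example above, the four axis lengths can be unpacked from the shape, and indexing along the first axis peels off one image at a time. The tensor is filled with zeros purely as placeholder data.

```python
import torch

# Placeholder batch: 3 grayscale images of 28 x 28 pixels, in [B, C, H, W] order.
batch = torch.zeros(3, 1, 28, 28)

batch_size, channels, height, width = batch.shape

first_image = batch[0]       # one image: [C, H, W]
first_channel = batch[0][0]  # one channel of that image: [H, W]
first_row = batch[0][0][0]   # one row of pixels: [W]

print(batch_size, channels, height, width)
print(first_image.shape, first_channel.shape, first_row.shape)
```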

NCHW vs NHWC vs CHWN

It’s common when reading API documentation and academic papers to see the B replaced by an N, where the N stands for the number of samples in a batch.

【In API documentation and academic papers, N often replaces B and stands for the number of samples in a batch.】

Furthermore, another difference we often encounter in the wild is a reordering of the dimensions. Common orderings are as follows:

NCHW
NHWC
CHWN

【Beyond that, we also often encounter these axes in other orders.】

As we have seen, PyTorch uses NCHW, and it is the case that TensorFlow and Keras use NHWC by default (it can be configured). Ultimately, the choice of which one to use depends mainly on performance. Some libraries and algorithms are more suited to one or the other of these orderings.

【PyTorch uses NCHW by default, while TensorFlow and Keras use NHWC by default.】
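Converting between orderings is a matter of permuting axes. The sketch below uses Tensor.permute() on placeholder data; it only illustrates the reordering and says nothing about which layout performs better.

```python
import torch

# Placeholder batch in PyTorch's NCHW order.
nchw = torch.zeros(3, 1, 28, 28)

# Move the channel axis to the end: N, H, W, C.
nhwc = nchw.permute(0, 2, 3, 1)

# And back again: N, C, H, W.
back = nhwc.permute(0, 3, 1, 2)

print(nhwc.shape, back.shape)
```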

Output Channels And Feature Maps

Let’s look at how the interpretation of the color channel axis changes after the tensor is transformed by a convolutional layer.

Suppose we have three convolutional filters, and let’s just see what happens to the channel axis.

Since we have three convolutional filters, we will have three channel outputs from the convolutional layer. These channels are outputs from the convolutional layer, hence the name output channels as opposed to color channels.

【After the tensor passes through a convolutional layer, the color channel axis changes: it no longer holds color channels, and its length is set by the layer.】

【As explained in the CNN introduction article, the tensor output by a convolutional layer has as many channels as there are convolutional filters (we say "channel" here instead of "color channel").】

Feature Maps

With the output channels, we no longer have color channels, but modified channels that we call feature maps. These so-called feature maps are the outputs of the convolutions that take place using the input color channels and the convolutional filters.

Feature maps are the output channels created from the convolutions.

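【For the tensor output by a convolutional layer, we speak of the channel dimension rather than color channels.】

【The outputs of a convolutional layer are also called feature maps.】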
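To see the channel axis change, the sketch below passes a placeholder batch through a single nn.Conv2d layer with three filters. The kernel size of 5 is an arbitrary choice made for this illustration, and the layer's weights are randomly initialized.

```python
import torch
from torch import nn

# Placeholder batch: 3 grayscale images, [B, C, H, W] = [3, 1, 28, 28].
images = torch.zeros(3, 1, 28, 28)

# Three convolutional filters -> three output channels (feature maps).
conv = nn.Conv2d(in_channels=1, out_channels=3, kernel_size=5)

feature_maps = conv(images)

# Channel axis goes from 1 color channel to 3 feature maps;
# height and width shrink from 28 to 24 because no padding is used.
print(feature_maps.shape)
```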

PyTorch Tensors

When programming neural networks, data preprocessing is often one of the first steps in the overall process, and one goal of data preprocessing is to transform the raw input data into tensor form.

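【Data preprocessing is often the first step in programming a neural network: it transforms the raw data into tensor form.】

For basic tensor operations, see the Colab notebook: PyTorch Tensors Explained

(If you cannot use Colab, you can also read it directly on GitHub.)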

PyTorch Tensors Attributes

torch.dtype: the type of the data contained in the tensor.

Common data types:

Data type	dtype	CPU tensor	GPU tensor
32-bit floating point	torch.float32	torch.FloatTensor	torch.cuda.FloatTensor
64-bit floating point	torch.float64	torch.DoubleTensor	torch.cuda.DoubleTensor
16-bit floating point	torch.float16	torch.HalfTensor	torch.cuda.HalfTensor
8-bit integer (unsigned)	torch.uint8	torch.ByteTensor	torch.cuda.ByteTensor
8-bit integer (signed)	torch.int8	torch.CharTensor	torch.cuda.CharTensor
16-bit integer (signed)	torch.int16	torch.ShortTensor	torch.cuda.ShortTensor
32-bit integer (signed)	torch.int32	torch.IntTensor	torch.cuda.IntTensor
64-bit integer (signed)	torch.int64	torch.LongTensor	torch.cuda.LongTensor

torch.device: the device on which the tensor's data is allocated, such as the CPU or cuda:0.
torch.layout: how the tensor's data is laid out in memory.

As neural network programmers, we need to be aware of the following:

Tensors contain data of a uniform type (dtype).
Tensor computations between tensors depend on the dtype and the device.

【A tensor contains data of a single, uniform type.】

【Computations between tensors depend on their dtype and on the device they are allocated on.】
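The three attributes can be inspected on any tensor. This is a minimal sketch that assumes PyTorch's defaults have not been changed (float32 as the default dtype, CPU as the default device); the GPU branch only runs when CUDA is available.

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])

print(t.dtype)   # data type of the elements
print(t.device)  # device holding the data
print(t.layout)  # memory layout

# Mixed Python ints and floats still give one uniform dtype.
mixed = torch.tensor([1, 2.5])
print(mixed.dtype)

# Moving a tensor to the GPU changes its device attribute.
if torch.cuda.is_available():
    t_gpu = t.to("cuda")
    print(t_gpu.device)
```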

Creating Tensors

These are the primary ways of creating tensor objects (instances of the torch.Tensor class), with data (array-like) in PyTorch:

Creating Tensors with data.

【Four ways to create a tensor from data:】

torch.Tensor(data)
torch.tensor(data)
torch.as_tensor(data)
torch.from_numpy(data)

`torch.Tensor()` Vs `torch.tensor()`

The first option with the uppercase T is the constructor of the torch.Tensor class, and the second option is what we call a factory function that constructs torch.Tensor objects and returns them to the caller.

However, the factory function torch.tensor() has better documentation and more configuration options, so it gets the winning spot at the moment.

【torch.Tensor(data) is the constructor of the torch.Tensor class, while torch.tensor(data) is a factory function that builds and returns torch.Tensor objects.】

【Because torch.tensor() has more configuration options, such as setting the dtype, it is the one generally used.】

Default `dtype` Vs Inferred `dtype`

The difference here arises in the fact that the torch.Tensor() constructor uses the default dtype when building the tensor. The other calls choose a dtype based on the incoming data. This is called type inference. The dtype is inferred based on the incoming data.

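【torch.Tensor() builds the tensor with the default dtype (torch.float32 unless it has been changed), while the other three infer the dtype, so the resulting tensor has the same data type as the incoming data.】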
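The difference shows up as soon as the input is not floating point. In this sketch the NumPy array is given an explicit int32 dtype so the result does not depend on the platform, and the default dtype is assumed to be the unmodified torch.float32.

```python
import numpy as np
import torch

data = np.array([1, 2, 3], dtype=np.int32)

t1 = torch.Tensor(data)      # constructor: uses the default dtype
t2 = torch.tensor(data)      # factory: infers dtype from data
t3 = torch.as_tensor(data)   # factory: infers dtype from data
t4 = torch.from_numpy(data)  # factory: infers dtype from data

print(t1.dtype, t2.dtype, t3.dtype, t4.dtype)

# The factory function also lets us override the inferred dtype.
t5 = torch.tensor(data, dtype=torch.float64)
print(t5.dtype)
```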

Sharing Memory For Performance: Copy Vs Share

torch.Tensor() and torch.tensor() copy their input data while torch.as_tensor() and torch.from_numpy() share their input data in memory with the original input object.

This sharing just means that the actual data in memory exists in a single place. As a result, any changes that occur in the underlying data will be reflected in both objects, the torch.Tensor and the numpy.ndarray.

Sharing data is more efficient and uses less memory than copying data because the data is not written to two locations in memory.

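【torch.Tensor() and torch.tensor() make an extra copy of the data in memory when creating a tensor from it.】

【torch.as_tensor() and torch.from_numpy() share memory with the original input when creating a tensor from it, so when the data of the original numpy.ndarray changes, the corresponding tensor changes too.】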

Share Data	Copy Data
torch.as_tensor()	torch.tensor()
torch.from_numpy()	torch.Tensor()

Some things to keep in mind about memory sharing (it works where it can):

Since numpy.ndarray objects are allocated on the CPU, the as_tensor() function must copy the data from the CPU to the GPU when a GPU is being used.

【When a GPU is being used, as_tensor() still has to copy the ndarray data from the CPU to the GPU.】
The memory sharing of as_tensor() doesn’t work with built-in Python data structures like lists.

【as_tensor() does not share memory with built-in Python data structures.】
The as_tensor() performance improvement will be greater if there are a lot of back and forth operations between numpy.ndarray objects and tensor objects.

【as_tensor() gives a real performance gain when there are many back-and-forth operations between ndarrays and tensors.】
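The copy-versus-share behaviour can be checked by modifying the source array after the tensors are created. This is a small sketch on the CPU; the variable names are illustrative.

```python
import numpy as np
import torch

data = np.array([1, 2, 3], dtype=np.int64)

t1 = torch.Tensor(data)      # copies
t2 = torch.tensor(data)      # copies
t3 = torch.as_tensor(data)   # shares memory with the ndarray
t4 = torch.from_numpy(data)  # shares memory with the ndarray

# Change the original array in place.
data[0] = 0

print(t1, t2)  # unaffected by the change
print(t3, t4)  # reflect the change

# A Python list is copied by as_tensor(), so later edits are not visible.
values = [1, 2, 3]
t5 = torch.as_tensor(values)
values[0] = 0
print(t5)
```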

`torch.as_tensor()` Vs `torch.from_numpy()`

This establishes that torch.as_tensor() and torch.from_numpy() both share memory with their input data. However, which one should we use, and how are they different?

The torch.from_numpy() function only accepts numpy.ndarrays, while the torch.as_tensor() function accepts a wide variety of array-like objects, including other PyTorch tensors.

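【Both share memory with the input data where they can, but torch.from_numpy() accepts only numpy.ndarrays, whereas torch.as_tensor() also accepts other array-like objects such as lists and tuples (which are copied rather than shared), so torch.as_tensor() is generally the more commonly used.】

If we have a torch.Tensor and we want to convert it to a numpy.ndarray, we call the tensor's numpy() method.

【Use the tensor's numpy() method, for example t.numpy(), to convert a tensor to an ndarray.】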
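A short sketch of the reverse conversion follows. For a CPU tensor, the array returned by numpy() shares memory with the tensor, so an in-place change to the tensor is visible in the array.

```python
import numpy as np
import torch

t = torch.tensor([1, 2, 3])

arr = t.numpy()  # numpy() is a method of the tensor, not a torch function
print(type(arr), arr)

# In-place change on the tensor shows up in the array (CPU tensors only).
t[0] = 0
print(arr)
```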

Creating Tensors without data.

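【There are also several ways to create common tensors without supplying data:】

torch.eye(n) : creates a 2-D tensor, the n*n identity matrix.
torch.zeros(shape) : creates a tensor of the given shape filled with zeros.
torch.ones(shape) : creates a tensor of the given shape filled with ones.
torch.rand(shape) : creates a tensor of the given shape filled with random values drawn uniformly from [0, 1).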
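The four functions above can be tried as follows; the shapes are arbitrary choices for illustration.

```python
import torch

identity = torch.eye(2)    # 2 x 2 identity matrix
zeros = torch.zeros(2, 3)  # all zeros
ones = torch.ones(2, 3)    # all ones
random = torch.rand(2, 3)  # uniform random values in [0, 1)

print(identity)
print(zeros.shape, ones.shape, random.shape)
```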

Tensor Operation

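Colab notebooks for the tensor operations are listed below; they are best read alongside this section. If they do not open, you can also view them on GitHub.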

Tensor Operations: Reshape

Tensor Operations: Element-wise

Tensor Operation: Reduction and Access

We have the following high-level categories of operations:

Reshaping operations
Element-wise operations
Reduction operations
Access operations

【Tensor operations fall into four main groups: reshape, element-wise, reduction, and access.】

Reshape

As neural network programmers, we have to do the same with our tensors, and usually shaping and reshaping our tensors is a frequent task.

【Reshaping is a very common operation in neural network programming.】

(For worked examples, see the Colab notebook: Tensor Operations: Reshape)

import torch
t = torch.tensor([
    [1,1,1,1],
    [2,2,2,2],
    [3,3,3,3]
], dtype=torch.float32)
t.reshape([2,6])
t.reshape(2,2,3)

Reshaping changes the tensor’s shape but not the underlying data. Our tensor has 12 elements, so any reshaping must account for exactly 12 elements.

【Reshaping does not change the underlying data, only the tensor's shape.】

In PyTorch, the -1 tells the reshape() function to figure out what the value should be based on the number of elements contained within the tensor.

【When -1 is passed to reshape, PyTorch works out that value automatically, because the number of elements in the tensor must stay the same.】
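Using the 12-element tensor from the example above, this sketch shows what -1 resolves to in a few cases.

```python
import torch

t = torch.tensor([
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
], dtype=torch.float32)

a = t.reshape(2, -1)     # -1 resolves to 12 / 2 = 6
b = t.reshape(-1)        # -1 resolves to 12
c = t.reshape(2, 2, -1)  # -1 resolves to 12 / (2 * 2) = 3

print(a.shape, b.shape, c.shape)
print(t.numel(), a.numel(), b.numel(), c.numel())
```

Only one axis may be given as -1, since PyTorch could not otherwise determine a unique shape.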

Squeezing And Unsqueezing

Squeezing a tensor removes the dimensions or axes that have a length of one.

【Squeezing removes the axes of a tensor that have a length of one.】
Unsqueezing a tensor adds a dimension with a length of one.

【Unsqueezing adds an axis with a length of one.】

(For worked examples, see the Colab notebook: Tensor Operations: Reshape)

t.squeeze()
t.squeeze().unsqueeze(dim=0)
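The effect is easiest to see on the shapes. A minimal sketch with a placeholder tensor of shape [1, 12]:

```python
import torch

t = torch.ones(1, 12)

squeezed = t.squeeze()                # drops the length-1 axis
restored = squeezed.unsqueeze(dim=0)  # adds a length-1 axis at position 0
column = squeezed.unsqueeze(dim=1)    # adds a length-1 axis at position 1

print(t.shape, squeezed.shape, restored.shape, column.shape)
```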

Concatenating Tensors

We combine tensors using the cat() function, and the resulting tensor will have a shape that depends on the shapes of the input tensors.

(For worked examples, see the Colab notebook: Tensor Operations: Reshape)

torch.cat((t1,t2,t3), dim=0)
torch.cat((t1,t2,t3), dim=1)
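The sketch below defines three placeholder 2 x 2 tensors named t1, t2, and t3 (the values are made up for this example) and concatenates them along each axis.

```python
import torch

t1 = torch.full((2, 2), 1.0)
t2 = torch.full((2, 2), 2.0)
t3 = torch.full((2, 2), 3.0)

rows = torch.cat((t1, t2, t3), dim=0)  # stacked along axis 0
cols = torch.cat((t1, t2, t3), dim=1)  # placed side by side along axis 1

print(rows.shape)
print(cols.shape)
```

The length of the chosen axis is the sum of the inputs' lengths along it, while the other axis keeps its length.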

Flatten

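Here we look at flatten through a CNN example; for CNN details, see this article.

A tensor flatten operation is a common operation inside convolutional neural networks. This is because convolutional layer outputs that are passed to fully connected layers must be flattened out before the fully connected layer will accept the input.

【Flatten is very common in convolutional networks, because a convolutional layer's output must be flattened before it can be passed to a fully connected layer.】

Take the 28 x 28 handwritten digits of the MNIST dataset. As noted earlier, the input to a CNN has shape [Batch Size, Channels, Height, Width], so how can we flatten only some of the tensor's axes rather than all of them?

The axes of the CNN input that need to be flattened: (C, H, W)

Flatten starting from dim 1 (for worked examples, see the Colab notebook: Tensor Operations: Reshape)

t.flatten(start_dim=1, end_dim=-1)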
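As a sketch on a placeholder batch of three 28 x 28 single-channel images, flattening from dim 1 keeps the batch axis and merges the rest:

```python
import torch

batch = torch.ones(3, 1, 28, 28)

# Flatten C, H, W into one axis but keep the batch axis.
per_image = batch.flatten(start_dim=1)

# Equivalent result with reshape and -1.
per_image_reshape = batch.reshape(batch.shape[0], -1)

# Flattening every axis loses the separation between images.
everything = batch.flatten()

print(per_image.shape, per_image_reshape.shape, everything.shape)
```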

Broadcasting and Element-Wise

An element-wise operation operates on corresponding elements between tensors.

【An element-wise operation acts on corresponding elements of two tensors.】

Broadcasting

Broadcasting describes how tensors with different shapes are treated during element-wise operations.

Broadcasting is the concept whose implementation allows us to add scalars to higher dimensional tensors.

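【Broadcasting describes how tensors with different shapes are handled in element-wise operations.】

【Broadcasting is what allows us to add scalars to higher-dimensional tensors.】

Let’s think about the t1 + 2 operation. Here, the scalar-valued tensor is being broadcast to the shape of t1, and then the element-wise operation is carried out.

【In t1 + 2, the scalar 2 is first broadcast to the same shape as t1, and then the element-wise operation is performed.】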

We have two tensors with different shapes. The goal of broadcasting is to make the tensors have the same shape so we can perform element-wise operations on them.
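To make the t1 + 2 case concrete, the sketch below uses torch.broadcast_tensors() to show what the scalar looks like once it has been broadcast. The values of t1 are made up for this example.

```python
import torch

t1 = torch.tensor([
    [1.0, 2.0],
    [3.0, 4.0],
])
scalar = torch.tensor(2.0)  # rank-0 tensor

# Broadcast both tensors to a common shape without doing any arithmetic.
_, expanded = torch.broadcast_tensors(t1, scalar)
print(expanded)

# Adding the scalar gives the same result as adding the expanded tensor.
result = t1 + 2
print(result)
print(torch.equal(result, t1 + expanded))
```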

(For worked examples, see the Colab notebook: Tensor Operations: Element-wise)

Broadcasting Details

(For worked examples, see the Colab notebook: Tensor Operations: Element-wise)

Same Shapes: operate directly, element by element.
Same Rank, Different Shape:
1. Determine if tensors are compatible.
  
  【Two tensors must be compatible before they can be broadcast and the element-wise operation carried out.】
  
  We compare the shapes of the two tensors, starting at their last dimensions and working backwards. Our goal is to determine whether each dimension between the two tensors’ shapes is compatible.
  
  【Check each dimension for compatibility, starting from the last dimension and working backwards.】
  
  【A dimension is compatible if either of the following holds: the lengths are equal, or one of them is 1.】
  
  The dimensions are compatible when either:
  - They’re equal to each other.
  - One of them is 1.
2. Determine the shape of the resulting tensor.
  
  【The result is a new tensor, and the length of each of its dimensions is the maximum of the two input tensors' lengths in that dimension.】
Different Ranks:
1. Determine if tensors are compatible (as above).
  
  When we’re in a situation where the ranks of the two tensors aren’t the same, like what we have here, then we simply substitute a one in for the missing dimensions of the lower-ranked tensor.
  
  【Missing dimensions of the lower-rank tensor are filled with 1. For example, with shapes (1,3) and (), the lower-rank shape is treated as (1,1).】
2. Determine the shape of the resulting tensor.
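The two steps above can be sketched as a small pure-Python function. The helper name broadcast_shape is hypothetical and written only for this illustration; its result is compared against the shape PyTorch actually produces.

```python
import torch


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shapes."""
    rank = max(len(shape_a), len(shape_b))
    # Different ranks: pad the lower-rank shape with leading ones.
    a = (1,) * (rank - len(shape_a)) + tuple(shape_a)
    b = (1,) * (rank - len(shape_b)) + tuple(shape_b)

    result = []
    for x, y in zip(a, b):
        # Step 1: lengths must be equal, or one of them must be 1.
        if x != y and x != 1 and y != 1:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
        # Step 2: the length that is not 1 wins (the maximum for non-empty axes).
        result.append(y if x == 1 else x)
    return tuple(result)


print(broadcast_shape((1, 3), (3, 1)))
print(broadcast_shape((2, 4), (4,)))
print(broadcast_shape((1, 3), ()))

# Compare with what PyTorch produces for an actual element-wise operation.
actual = (torch.ones(1, 3) + torch.ones(3, 1)).shape
print(tuple(actual))
```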

ArgMax and Reduction

A reduction operation on a tensor is an operation that reduces the number of elements contained within the tensor.

【A reduction operation is one that reduces the number of elements in a tensor.】

Reshaping operations gave us the ability to position our elements along particular axes. Element-wise operations allow us to perform operations on elements between two tensors, and reduction operations allow us to perform operations on elements within a single tensor.

【Reshaping lets us position the elements of a tensor along particular axes; element-wise operations act on corresponding elements between tensors; reduction operations act on the elements within a single tensor.】

(For worked examples, see the Colab notebook: Tensor Operation: Reduction and Access)

t.sum()
t.prod()
t.mean()
t.std()
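A small sketch of these four reductions follows, on a made-up 3 x 3 tensor. Each call reduces the nine elements to a single-element tensor.

```python
import torch

t = torch.tensor([
    [0, 1, 0],
    [2, 0, 2],
    [0, 3, 0],
], dtype=torch.float32)

total = t.sum()
product = t.prod()
mean = t.mean()
std = t.std()

print(total, product, mean, std)
print(t.numel(), total.numel())  # element count before and after reduction
```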

Reducing Tensors By Axes

To reduce along a specific axis, just pass a dimension argument to these methods.

(For worked examples, see the Colab notebook: Tensor Operation: Reduction and Access)

t.sum(dim=0)
t.sum(dim=1)
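Reducing along an axis removes that axis from the result. A minimal sketch on a 3 x 4 tensor:

```python
import torch

t = torch.tensor([
    [1, 1, 1, 1],
    [2, 2, 2, 2],
    [3, 3, 3, 3],
], dtype=torch.float32)

# dim=0: add the three rows together element by element -> 4 values.
down_columns = t.sum(dim=0)

# dim=1: add the four values inside each row -> 3 values.
across_rows = t.sum(dim=1)

print(down_columns)
print(across_rows)
```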

Argmax

Argmax returns the index location of the maximum value inside a tensor.

【Argmax returns the index of the maximum value.】

(For worked examples, see the Colab notebook: Tensor Operation: Reduction and Access)

t.argmax(dim=0)
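The sketch below uses a made-up tensor without ties inside any row or column. Without a dim argument, argmax() returns the index into the flattened tensor; with dim, it returns one index per slice along that axis.

```python
import torch

t = torch.tensor([
    [1, 0, 0, 2],
    [0, 3, 1, 0],
    [4, 0, 0, 5],
], dtype=torch.float32)

flat_index = t.argmax()       # index into the flattened tensor
per_column = t.argmax(dim=0)  # row index of the max in each column
per_row = t.argmax(dim=1)     # column index of the max in each row

print(t.max(), flat_index)
print(per_column, per_row)
```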

Accessing Elements Inside Tensors

The last type of common operation that we need for tensors is the ability to access data from within the tensor.

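【Access operations retrieve the data from a tensor, that is, they take the data out of the tensor and put it into a built-in Python data structure or a NumPy array.】

(For worked examples, see the Colab notebook: Tensor Operation: Reduction and Access)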

t.mean().item()
t.mean(dim=0).tolist()
t.mean(dim=0).numpy()
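Each of the three calls returns a different kind of object. A minimal sketch on a made-up 3 x 3 tensor:

```python
import numpy as np
import torch

t = torch.tensor([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
], dtype=torch.float32)

as_number = t.mean().item()         # single-element tensor -> Python float
as_list = t.mean(dim=0).tolist()    # tensor -> Python list
as_array = t.mean(dim=0).numpy()    # tensor -> numpy.ndarray

print(type(as_number), as_number)
print(type(as_list), as_list)
print(type(as_array), as_array)
```

item() works only on tensors that hold exactly one element.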

Advanced Indexing And Slicing

PyTorch tensors support most of NumPy's indexing and slicing operations.

To be covered later: https://numpy.org/doc/stable/reference/arrays.indexing.html
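Until that is written up, here is a brief sketch of a few NumPy-style indexing forms that PyTorch tensors accept. The tensor values are made up for this example.

```python
import torch

t = torch.tensor([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])

first_row = t[0]       # basic indexing along axis 0
middle_col = t[:, 1]   # slice every row, take column 1
block = t[0:2, 1:]     # rows 0-1, columns 1-2
large = t[t > 5]       # boolean mask selects matching elements
picked = t[[0, 2]]     # list of indexes selects rows 0 and 2

print(first_row, middle_col)
print(block)
print(large)
print(picked)
```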

Reference

To be covered later: advanced indexing and slicing: https://numpy.org/doc/stable/reference/arrays.indexing.html

https://f7ed.com/2020/10/21/pytorch-tensors/

Author

f7ed

Posted on

2020-10-21

Updated on

2021-02-28

Licensed under

CC BY-NC-SA 4.0
