PyTorch offers several ways to multiply tensors, including *, torch.mul, torch.multiply, torch.dot, torch.mv, torch.mm, torch.matmul, and torch.einsum. This post explains how they differ and what each one computes.

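Element-wise multiplication

Element-wise multiplication takes two tensors (a matrix and a matrix, a matrix and a scalar, and so on; a vector can be viewed as a special case of a matrix) and multiplies their corresponding entries as ordinary real numbers.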

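*, torch.mul, torch.multiply

These three perform the same operation. torch.multiply is an alias for torch.mul, and the * operator on tensors is shorthand for torch.mul.

When the two operands have different shapes, for example when one has fewer dimensions than the other, the multiplication uses broadcasting. A scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and so on.

In short, broadcasting means that when the shapes differ but are compatible, the tensor arguments are automatically expanded to a common size without copying data.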
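As a rough illustration of the broadcasting rule described above, the sketch below checks the common shape of two operands and shows that the expanded operand is a view rather than a copy. The variable names are made up for this example.

```python
import torch

a = torch.ones(2, 3)              # shape (2, 3)
b = torch.tensor([1., 2., 3.])    # shape (3,)

# Report the shape both operands are broadcast to, without allocating data.
print(torch.broadcast_shapes(a.shape, b.shape))  # torch.Size([2, 3])

# expand() returns a broadcast view: the new leading dimension has stride 0,
# so every "row" points at the same three numbers in memory.
b_view = b.expand(2, 3)
print(b_view.stride())                     # (0, 1)
print(b_view.data_ptr() == b.data_ptr())   # True, same underlying storage

# The element-wise product applies the same expansion implicitly.
result = a * b
print(result)
```

Shapes are compared from the last dimension backwards; each pair of sizes has to be equal, or one of them has to be 1 or missing.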

import torch

Matrix times scalar

x = torch.ones(2, 3)
x

tensor([[1., 1., 1.],
        [1., 1., 1.]])

x * 2, torch.mul(x, 2), torch.multiply(x, 2)

(tensor([[2., 2., 2.],
         [2., 2., 2.]]),
 tensor([[2., 2., 2.],
         [2., 2., 2.]]),
 tensor([[2., 2., 2.],
         [2., 2., 2.]]))

Matrix times vector (1-D tensor)

x = torch.ones(2, 3)
x

tensor([[1., 1., 1.],
        [1., 1., 1.]])

y = torch.tensor([1, 2, 3], dtype=torch.float32)
y

tensor([1., 2., 3.])

x * y, torch.mul(x, y), torch.multiply(x, y)

(tensor([[1., 2., 3.],
         [1., 2., 3.]]),
 tensor([[1., 2., 3.],
         [1., 2., 3.]]),
 tensor([[1., 2., 3.],
         [1., 2., 3.]]))

Matrix times matrix

x = torch.ones(2, 3)
x

tensor([[1., 1., 1.],
        [1., 1., 1.]])

y = torch.randn(2, 3)
y

tensor([[-0.7308, -1.9904,  0.0606],
        [ 0.8398, -0.9562,  0.3506]])

x * y, torch.mul(x, y), torch.multiply(x, y)

(tensor([[-0.7308, -1.9904,  0.0606],
         [ 0.8398, -0.9562,  0.3506]]),
 tensor([[-0.7308, -1.9904,  0.0606],
         [ 0.8398, -0.9562,  0.3506]]),
 tensor([[-0.7308, -1.9904,  0.0606],
         [ 0.8398, -0.9562,  0.3506]]))

Vector times vector (a 4×1 column and a 1×4 row)

x = torch.randn(4, 1)
x

tensor([[-0.3991],
        [ 2.1165],
        [-0.8744],
        [-1.3571]])

y = torch.randn(1, 4)
y

tensor([[ 2.0050, -0.5762,  0.3147, -0.3944]])

x * y, torch.mul(x, y), torch.multiply(x, y)

(tensor([[-0.8002,  0.2299, -0.1256,  0.1574],
         [ 4.2436, -1.2194,  0.6661, -0.8347],
         [-1.7531,  0.5038, -0.2752,  0.3448],
         [-2.7211,  0.7819, -0.4271,  0.5352]]),
 tensor([[-0.8002,  0.2299, -0.1256,  0.1574],
         [ 4.2436, -1.2194,  0.6661, -0.8347],
         [-1.7531,  0.5038, -0.2752,  0.3448],
         [-2.7211,  0.7819, -0.4271,  0.5352]]),
 tensor([[-0.8002,  0.2299, -0.1256,  0.1574],
         [ 4.2436, -1.2194,  0.6661, -0.8347],
         [-1.7531,  0.5038, -0.2752,  0.3448],
         [-2.7211,  0.7819, -0.4271,  0.5352]]))
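Because a (4, 1) tensor and a (1, 4) tensor are both broadcast to (4, 4), the element-wise product above has the same values as an outer product. A small check with made-up values, assuming a PyTorch version that provides torch.outer:

```python
import torch

col = torch.tensor([[1.], [2.], [3.], [4.]])   # shape (4, 1)
row = torch.tensor([[10., 20., 30., 40.]])     # shape (1, 4)

# Broadcasting expands both operands to (4, 4) before multiplying.
prod = col * row

# torch.outer expects 1-D inputs, so drop the singleton dimensions first.
outer = torch.outer(col.squeeze(1), row.squeeze(0))

print(prod.shape)                 # torch.Size([4, 4])
print(torch.equal(prod, outer))   # True
```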

2-D tensor times 3-D tensor

x = torch.tensor([[1, 2], [2, 3]])
x

tensor([[1, 2],
        [2, 3]])

y = torch.tensor([[[1, 2], [2, 3]], [[-1, -2], [-2, -3]]])
y

tensor([[[ 1,  2],
         [ 2,  3]],

        [[-1, -2],
         [-2, -3]]])

x * y

tensor([[[ 1,  4],
         [ 4,  9]],

        [[-1, -4],
         [-4, -9]]])

torch.mul(x, y)

tensor([[[ 1,  4],
         [ 4,  9]],

        [[-1, -4],
         [-4, -9]]])

torch.multiply(x, y)

tensor([[[ 1,  4],
         [ 4,  9]],

        [[-1, -4],
         [-4, -9]]])

torch.dot

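Computes the dot product of two 1-D tensors, that is, the inner product of two vectors in the mathematical sense: corresponding elements are multiplied and the products are summed, giving a single scalar. Both tensors must have the same number of elements.

x = torch.tensor([2, 3])
x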

tensor([2, 3])

y = torch.tensor([2, 1])
y

tensor([2, 1])

torch.dot(x, y)

tensor(7)
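To make the definition concrete, the sketch below compares torch.dot with an explicit multiply-then-sum, and shows that inputs with more than one dimension are rejected. The flag name rejected_2d is only for this illustration.

```python
import torch

x = torch.tensor([2., 3.])
y = torch.tensor([2., 1.])

# Dot product: multiply matching elements, then add them up (2*2 + 3*1).
manual = (x * y).sum()
print(torch.dot(x, y), manual)   # tensor(7.) tensor(7.)

# torch.dot only accepts 1-D tensors; 2-D inputs raise a RuntimeError.
rejected_2d = False
try:
    torch.dot(torch.ones(2, 2), torch.ones(2, 2))
except RuntimeError:
    rejected_2d = True
print(rejected_2d)               # True
```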

torch.mv

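Matrix-vector product. This corresponds to multiplying a matrix by a vector in linear algebra. One difference from textbook notation: the vector is a 1-D tensor with no separate row or column dimension, so the requirement is that the number of columns of the matrix equals the length of the vector. The vector is implicitly treated as a column vector. The name mv stands for matrix and vector.

mat = torch.ones(2, 3)
mat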

tensor([[1., 1., 1.],
        [1., 1., 1.]])

vec = torch.ones(3)
vec

tensor([1., 1., 1.])

torch.mv(mat, vec)

tensor([3., 3.])
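One way to see how torch.mv treats its 1-D argument is to reshape the vector into an explicit column and use torch.mm. This is a sketch with made-up values:

```python
import torch

mat = torch.arange(6, dtype=torch.float32).reshape(2, 3)  # [[0,1,2],[3,4,5]]
vec = torch.tensor([1., 2., 3.])                           # shape (3,)

# mv: each output entry is the dot product of one matrix row with vec.
out = torch.mv(mat, vec)
print(out)          # tensor([ 8., 26.])

# Same computation with vec as an explicit (3, 1) column matrix.
col = torch.mm(mat, vec.unsqueeze(1))   # shape (2, 1)
print(torch.equal(out, col.squeeze(1))) # True
```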

torch.mm

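Matrix-matrix product. This corresponds to ordinary matrix multiplication in linear algebra. The number of columns of the first matrix must equal the number of rows of the second. The name mm stands for matrix and matrix.

mat1 = torch.randn(2, 3)
mat1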

tensor([[ 0.1216, -1.0658,  0.4596],
        [ 0.0281, -0.6817,  0.9793]])

mat2 = torch.randn(3, 3)
mat2

tensor([[-0.9294, -1.4270,  2.8876],
        [-0.5021,  0.4515,  0.3465],
        [ 0.5015, -0.9415, -0.3981]])

torch.mm(mat1, mat2)

tensor([[ 0.6526, -1.0874, -0.2011],
        [ 0.8073, -1.2699, -0.5448]])
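The shape rule can be checked on a small deterministic example: an (n, m) matrix times an (m, p) matrix gives an (n, p) matrix, and each entry is a row-by-column dot product. The values below are chosen only for illustration.

```python
import torch

a = torch.arange(6, dtype=torch.float32).reshape(2, 3)    # shape (2, 3)
b = torch.arange(12, dtype=torch.float32).reshape(3, 4)   # shape (3, 4)

c = torch.mm(a, b)
print(c.shape)      # torch.Size([2, 4])

# Entry (0, 0) is row 0 of a dotted with column 0 of b:
# [0, 1, 2] . [0, 4, 8] = 0*0 + 1*4 + 2*8 = 20
entry = torch.dot(a[0], b[:, 0])
print(c[0, 0], entry)   # tensor(20.) tensor(20.)
```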

torch.matmul

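Matrix product. This corresponds to the product of matrices (including vectors) in mathematics. It covers the vector dot product of torch.dot, the matrix-vector product of torch.mv, and the matrix-matrix product of torch.mm, so a single function can express all of the operations above.

Vector dot product

x = torch.ones(3)
x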

tensor([1., 1., 1.])

y = torch.ones(3)
y

tensor([1., 1., 1.])

torch.dot(x, y)

tensor(3.)

torch.matmul(x, y)

tensor(3.)

Matrix-vector product

x = torch.ones(3, 4)
x

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

y = torch.ones(4)
y

tensor([1., 1., 1., 1.])

torch.mv(x, y)

tensor([4., 4., 4.])

torch.matmul(x, y)

tensor([4., 4., 4.])

Matrix-matrix product

x = torch.ones(3, 4)
x

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

y = torch.ones(4, 2)
y

tensor([[1., 1.],
        [1., 1.],
        [1., 1.],
        [1., 1.]])

torch.mm(x, y)

tensor([[4., 4.],
        [4., 4.],
        [4., 4.]])

torch.matmul(x, y)

tensor([[4., 4.],
        [4., 4.],
        [4., 4.]])
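Beyond the three cases shown above, torch.matmul also accepts inputs with more than two dimensions and broadcasts the leading (batch) dimensions. A minimal sketch, with tensor names invented for the example:

```python
import torch

batch = torch.ones(2, 3, 4)   # two 3x4 matrices
w = torch.ones(4, 5)          # a single 4x5 matrix

# w is broadcast across the batch dimension: each 3x4 matrix is multiplied by w.
out = torch.matmul(batch, w)
print(out.shape)              # torch.Size([2, 3, 5])

# The @ operator is equivalent to torch.matmul.
print(torch.equal(out, batch @ w))   # True

# A 1-D tensor on the left is treated as a row vector.
v = torch.ones(3)
m = torch.ones(3, 4)
print(torch.matmul(v, m).shape)      # torch.Size([4])
```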

torch.bmm

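bmm is batched matrix multiplication: it multiplies many pairs of matrices in a single call. Both inputs must be 3-D tensors with the same batch size. The name stands for batch, matrix, matrix.

input = torch.ones(2, 3, 2)
input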

tensor([[[1., 1.],
         [1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.],
         [1., 1.]]])

mat2 = torch.randn(2, 2, 3)
mat2

tensor([[[ 9.2534e-01, -1.7897e+00, -3.0873e-01],
         [-1.5857e+00,  8.6526e-01,  5.2600e-02]],

        [[ 3.9796e-01,  6.3333e-01, -1.0749e+00],
         [ 2.4290e-05,  8.3245e-03, -1.4999e-01]]])

output = torch.bmm(input, mat2)
output

tensor([[[-0.6604, -0.9244, -0.2561],
         [-0.6604, -0.9244, -0.2561],
         [-0.6604, -0.9244, -0.2561]],

        [[ 0.3980,  0.6417, -1.2249],
         [ 0.3980,  0.6417, -1.2249],
         [ 0.3980,  0.6417, -1.2249]]])
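Conceptually, torch.bmm applies torch.mm to each pair of matrices along the batch dimension. The loop below is only a sketch of that idea for comparison, not how the library implements it; the names lhs and rhs are made up.

```python
import torch

torch.manual_seed(0)
lhs = torch.randn(2, 3, 2)   # batch of two 3x2 matrices
rhs = torch.randn(2, 2, 3)   # batch of two 2x3 matrices

batched = torch.bmm(lhs, rhs)            # shape (2, 3, 3)

# Multiply the i-th matrices pairwise, then stack the results back into a batch.
looped = torch.stack([torch.mm(lhs[i], rhs[i]) for i in range(lhs.shape[0])])

print(batched.shape)                      # torch.Size([2, 3, 3])
print(torch.allclose(batched, looped))
```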

torch.einsum()

The Einstein summation convention function. It can express all of the multiplication operations above.
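The subscript string names each dimension of each operand; an index that appears in the inputs but not after the arrow is summed over. As a sketch of what "ij,jk->ik" means, the explicit loops below compute the same result (written for clarity, not speed; the tensors are made up for the example).

```python
import torch

a = torch.arange(6, dtype=torch.float32).reshape(2, 3)    # indices i, j
b = torch.arange(12, dtype=torch.float32).reshape(3, 4)   # indices j, k

# "ij,jk->ik": i and k survive in the output, j is summed away.
manual = torch.zeros(2, 4)
for i in range(2):
    for k in range(4):
        for j in range(3):
            manual[i, k] += a[i, j] * b[j, k]

result = torch.einsum("ij,jk->ik", a, b)
print(torch.allclose(manual, result))   # True
```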

x = torch.arange(6, dtype=torch.float32).reshape(2, 3)
x

tensor([[0., 1., 2.],
        [3., 4., 5.]])

Matrix transpose

torch.einsum("ij->ji", x)

tensor([[0., 3.],
        [1., 4.],
        [2., 5.]])

Sum of all elements

torch.einsum("ij->", x)

tensor(15.)

Column sums

torch.einsum("ij->j", x)

tensor([3., 5., 7.])

Row sums

torch.einsum("ij->i", x)

tensor([ 3., 12.])

Matrix-vector product

tensor([[0., 1., 2.],
        [3., 4., 5.]])

y = torch.ones(3)
y

tensor([1., 1., 1.])

torch.einsum("ij,j->i", x, y)

tensor([ 3., 12.])

torch.mv(x, y)

tensor([ 3., 12.])

Matrix multiplication

tensor([[0., 1., 2.],
        [3., 4., 5.]])

y = torch.ones(3, 2)
y

tensor([[1., 1.],
        [1., 1.],
        [1., 1.]])

torch.einsum("ij,jk->ik", x, y)

tensor([[ 3.,  3.],
        [12., 12.]])

torch.mm(x, y)

tensor([[ 3.,  3.],
        [12., 12.]])

Element-wise product

tensor([[0., 1., 2.],
        [3., 4., 5.]])

y = torch.ones(x.shape) * 3
y

tensor([[3., 3., 3.],
        [3., 3., 3.]])

torch.einsum("ij,ij->ij", x, y)

tensor([[ 0.,  3.,  6.],
        [ 9., 12., 15.]])

torch.mul(x, y)

tensor([[ 0.,  3.,  6.],
        [ 9., 12., 15.]])

x * y

tensor([[ 0.,  3.,  6.],
        [ 9., 12., 15.]])

torch.multiply(x, y)

tensor([[ 0.,  3.,  6.],
        [ 9., 12., 15.]])

# To sum the element-wise products into a single scalar, drop both indices from the output:

torch.einsum("ij,ij->", x, y)

tensor(45.)

Vector dot product

x = torch.ones(3)
x

tensor([1., 1., 1.])

y = torch.ones(3) * 2
y

tensor([2., 2., 2.])

torch.einsum("i,i->", x, y)

tensor(6.)
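The choice of subscripts decides what gets summed. A shared index pairs up matching elements, which is the dot product; two different indices pair every element of one vector with every element of the other. A short sketch contrasting the two:

```python
import torch

x = torch.ones(3)
y = torch.ones(3) * 2

# Shared index i: sum of x[i] * y[i], the dot product.
dot = torch.einsum("i,i->", x, y)
print(dot, torch.dot(x, y))      # tensor(6.) tensor(6.)

# Independent indices i and j: sum of x[i] * y[j] over all 9 pairs,
# which equals x.sum() * y.sum(), not the dot product.
all_pairs = torch.einsum("i,j->", x, y)
print(all_pairs, x.sum() * y.sum())   # tensor(18.) tensor(18.)
```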

Vector outer product

tensor([1., 1., 1.])

tensor([2., 2., 2.])

torch.einsum("i,j->ij", x, y)

tensor([[2., 2., 2.],
        [2., 2., 2.],
        [2., 2., 2.]])

Batched matrix multiplication

input = torch.ones(2, 3, 2)
input

tensor([[[1., 1.],
         [1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.],
         [1., 1.]]])

mat2 = torch.randn(2, 2, 3)
mat2

tensor([[[-1.4658, -1.3135, -0.0189],
         [ 0.0271, -0.7247, -0.3334]],

        [[ 0.4132,  0.8727,  2.3141],
         [ 0.8487, -0.6309, -1.4993]]])

torch.einsum("ijk, ikl->ijl", input, mat2)

tensor([[[-1.4387, -2.0382, -0.3522],
         [-1.4387, -2.0382, -0.3522],
         [-1.4387, -2.0382, -0.3522]],

        [[ 1.2619,  0.2417,  0.8148],
         [ 1.2619,  0.2417,  0.8148],
         [ 1.2619,  0.2417,  0.8148]]])

output = torch.bmm(input, mat2)
output

tensor([[[-1.4387, -2.0382, -0.3522],
         [-1.4387, -2.0382, -0.3522],
         [-1.4387, -2.0382, -0.3522]],

        [[ 1.2619,  0.2417,  0.8148],
         [ 1.2619,  0.2417,  0.8148],
         [ 1.2619,  0.2417,  0.8148]]])

Trace of a matrix

# trace
torch.einsum("ii", torch.ones(4, 4))

tensor(4.)

Diagonal of a matrix

# diagonal
torch.einsum("ii->i", torch.ones(4, 4))

tensor([1., 1., 1., 1.])
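An all-ones matrix hides which entries are being picked, so here is a sketch on a matrix with distinct values, compared against torch.trace and torch.diagonal. The explicit "ii->" form is used for the trace to make the summed index visible.

```python
import torch

m = torch.arange(9, dtype=torch.float32).reshape(3, 3)
# [[0., 1., 2.],
#  [3., 4., 5.],
#  [6., 7., 8.]]

# Repeating an index on one operand selects entries where both positions match.
diag = torch.einsum("ii->i", m)
print(diag)     # tensor([0., 4., 8.])

# Dropping the index from the output sums those entries: the trace.
tr = torch.einsum("ii->", m)
print(tr)       # tensor(12.)

print(torch.equal(diag, torch.diagonal(m)), tr == torch.trace(m))
```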

Outer product of two vectors

# outer product
x = torch.ones(5)
y = torch.ones(4)
torch.einsum("i,j->ij", x, y)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
