PyTorch Documentation Notes

Tensor

torch.Tensor is a multi-dimensional matrix containing elements of a single data type.

Torch defines the following CPU and GPU tensor types:

Data type dtype CPU tensor GPU tensor
32-bit floating point torch.float32 or torch.float torch.FloatTensor torch.cuda.FloatTensor
64-bit floating point torch.float64 or torch.double torch.DoubleTensor torch.cuda.DoubleTensor
16-bit floating point [1] torch.float16 or torch.half torch.HalfTensor torch.cuda.HalfTensor
16-bit floating point [2] torch.bfloat16 torch.BFloat16Tensor torch.cuda.BFloat16Tensor
32-bit complex torch.complex32 or torch.chalf
64-bit complex torch.complex64 or torch.cfloat
128-bit complex torch.complex128 or torch.cdouble
8-bit integer (unsigned) torch.uint8 torch.ByteTensor torch.cuda.ByteTensor
8-bit integer (signed) torch.int8 torch.CharTensor torch.cuda.CharTensor
16-bit integer (signed) torch.int16 or torch.short torch.ShortTensor torch.cuda.ShortTensor
32-bit integer (signed) torch.int32 or torch.int torch.IntTensor torch.cuda.IntTensor
64-bit integer (signed) torch.int64 or torch.long torch.LongTensor torch.cuda.LongTensor
Boolean torch.bool torch.BoolTensor torch.cuda.BoolTensor
quantized 8-bit integer (unsigned) torch.quint8 torch.ByteTensor /
quantized 8-bit integer (signed) torch.qint8 torch.CharTensor /
quantized 32-bit integer (signed) torch.qint32 torch.IntTensor /
quantized 4-bit integer (unsigned) [3] torch.quint4x2 torch.ByteTensor /
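
Any dtype from this table can be passed to the tensor factory functions, and an existing tensor can be converted with .to(); a brief sketch:

import torch

a = torch.zeros(2, 3, dtype=torch.float64)   # choose the dtype at creation time
b = a.to(torch.int32)                        # convert to another dtype
print(a.dtype, b.dtype)                      # torch.float64 torch.int32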

Creation

A tensor can be constructed from a Python list or sequence:

torch.FloatTensor([[1, 2, 3], [4, 5, 6]])

Out[0]:
tensor([[1., 2., 3.],
        [4., 5., 6.]])

A tensor can be constructed from an optional size or from data. If no arguments are given, an empty zero-dimensional tensor is returned. If a numpy.ndarray, torch.Tensor, or torch.Storage is given, a tensor sharing the same data is returned. If a Python sequence is given, a new tensor is created from a copy of the sequence.

# Interfaces: an empty tensor can be constructed by specifying its size
class torch.Tensor
class torch.Tensor(*sizes)
class torch.Tensor(size)
class torch.Tensor(sequence)
class torch.Tensor(ndarray)
class torch.Tensor(tensor)
class torch.Tensor(storage)

# Instantiation: a 2x4 IntTensor filled with zeros
torch.IntTensor(2, 4).zero_()

Python indexing and slicing can be used to access and modify the contents of a tensor:

x = torch.FloatTensor([[1, 2, 3], [4, 5, 6]])
x[1][2]
Out[0]: tensor(6.)

x[0][1] = 8
x
Out[1]:
tensor([[1., 8., 3.],
        [4., 5., 6.]])

Every tensor has a corresponding torch.Storage that holds its data. The tensor class provides a multi-dimensional, strided view of a storage and defines numeric operations on it.

Operations that mutate a tensor in place are marked with an underscore suffix. For example, torch.FloatTensor.abs_() computes the absolute value in place and returns the modified tensor, while torch.FloatTensor.abs() computes the result in a new tensor.
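
A minimal sketch of the difference (the values are illustrative):

import torch

x = torch.tensor([-1.0, 2.0, -3.0])
y = x.abs()   # out-of-place: returns a new tensor, x is unchanged
x.abs_()      # in-place: x itself now holds [1., 2., 3.]
print(x, y)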

Key attributes and methods

Tensor.new_tensor Returns a new Tensor with data as the tensor data.
Tensor.new_full Returns a Tensor of size size filled with fill_value.
Tensor.new_empty Returns a Tensor of size size filled with uninitialized data.
Tensor.new_ones Returns a Tensor of size size filled with 1.
Tensor.new_zeros Returns a Tensor of size size filled with 0.
Tensor.is_cuda Is True if the Tensor is stored on the GPU, False otherwise.
Tensor.is_quantized Is True if the Tensor is quantized, False otherwise.
Tensor.is_meta Is True if the Tensor is a meta tensor, False otherwise.
Tensor.device Is the torch.device where this Tensor is.
Tensor.grad This attribute is None by default and becomes a Tensor the first time a call to backward() computes gradients for self.
Tensor.ndim Alias for dim()
Tensor.real Returns a new tensor containing real values of the self tensor for a complex-valued input tensor.
Tensor.imag Returns a new tensor containing imaginary values of the self tensor.
Tensor.abs See torch.abs()
Tensor.abs_ In-place version of abs()
Tensor.absolute Alias for abs()
Tensor.absolute_ In-place version of absolute(). Alias for abs_()
Tensor.acos See torch.acos()
Tensor.acos_ In-place version of acos()
Tensor.arccos See torch.arccos()
Tensor.arccos_ In-place version of arccos()
Tensor.add Add a scalar or tensor to self tensor.
Tensor.add_ In-place version of add()
Tensor.addbmm See torch.addbmm()
Tensor.addbmm_ In-place version of addbmm()
Tensor.addcdiv See torch.addcdiv()
Tensor.addcdiv_ In-place version of addcdiv()
Tensor.addcmul See torch.addcmul()
Tensor.addcmul_ In-place version of addcmul()
Tensor.addmm See torch.addmm()
Tensor.addmm_ In-place version of addmm()
Tensor.sspaddmm See torch.sspaddmm()
Tensor.addmv See torch.addmv()
Tensor.addmv_ In-place version of addmv()
Tensor.addr See torch.addr()
Tensor.addr_ In-place version of addr()
Tensor.adjoint Alias for mH
Tensor.allclose See torch.allclose()
Tensor.amax See torch.amax()
Tensor.amin See torch.amin()
Tensor.aminmax See torch.aminmax()
Tensor.angle See torch.angle()
Tensor.apply_ Applies the function callable to each element in the tensor, replacing each element with the value returned by callable.
Tensor.argmax See torch.argmax()
Tensor.argmin See torch.argmin()
Tensor.argsort See torch.argsort()
Tensor.argwhere See torch.argwhere()
Tensor.asin See torch.asin()
Tensor.asin_ In-place version of asin()
Tensor.arcsin See torch.arcsin()
Tensor.arcsin_ In-place version of arcsin()
Tensor.as_strided See torch.as_strided()
Tensor.atan See torch.atan()
Tensor.atan_ In-place version of atan()
Tensor.arctan See torch.arctan()
Tensor.arctan_ In-place version of arctan()
Tensor.atan2 See torch.atan2()
Tensor.atan2_ In-place version of atan2()
Tensor.arctan2 See torch.arctan2()
Tensor.arctan2_ Alias for atan2_()
Tensor.all See torch.all()
Tensor.any See torch.any()
Tensor.backward Computes the gradient of current tensor w.r.t. graph leaves.
Tensor.baddbmm See torch.baddbmm()
Tensor.baddbmm_ In-place version of baddbmm()
Tensor.bernoulli Returns a result tensor where each result[i] is independently sampled from Bernoulli(self[i]).
Tensor.bernoulli_ Fills each location of self with an independent sample from Bernoulli(p).
Tensor.bfloat16 self.bfloat16() is equivalent to self.to(torch.bfloat16).
Tensor.bincount See torch.bincount()
Tensor.bitwise_not See torch.bitwise_not()
Tensor.bitwise_not_ In-place version of bitwise_not()
Tensor.bitwise_and See torch.bitwise_and()
Tensor.bitwise_and_ In-place version of bitwise_and()
Tensor.bitwise_or See torch.bitwise_or()
Tensor.bitwise_or_ In-place version of bitwise_or()
Tensor.bitwise_xor See torch.bitwise_xor()
Tensor.bitwise_xor_ In-place version of bitwise_xor()
Tensor.bitwise_left_shift See torch.bitwise_left_shift()
Tensor.bitwise_left_shift_ In-place version of bitwise_left_shift()
Tensor.bitwise_right_shift See torch.bitwise_right_shift()
Tensor.bitwise_right_shift_ In-place version of bitwise_right_shift()
Tensor.bmm See torch.bmm()
Tensor.bool self.bool() is equivalent to self.to(torch.bool).
Tensor.byte self.byte() is equivalent to self.to(torch.uint8).
Tensor.broadcast_to See torch.broadcast_to().
Tensor.cauchy_ Fills the tensor with numbers drawn from the Cauchy distribution:
Tensor.ceil See torch.ceil()
Tensor.ceil_ In-place version of ceil()
Tensor.char self.char() is equivalent to self.to(torch.int8).
Tensor.cholesky See torch.cholesky()
Tensor.cholesky_inverse See torch.cholesky_inverse()
Tensor.cholesky_solve See torch.cholesky_solve()
Tensor.chunk See torch.chunk()
Tensor.clamp See torch.clamp()
Tensor.clamp_ In-place version of clamp()
Tensor.clip Alias for clamp().
Tensor.clip_ Alias for clamp_().
Tensor.clone See torch.clone()
Tensor.contiguous Returns a contiguous in memory tensor containing the same data as self tensor.
Tensor.copy_ Copies the elements from src into self tensor and returns self.
Tensor.conj See torch.conj()
Tensor.conj_physical See torch.conj_physical()
Tensor.conj_physical_ In-place version of conj_physical()
Tensor.resolve_conj See torch.resolve_conj()
Tensor.resolve_neg See torch.resolve_neg()
Tensor.copysign See torch.copysign()
Tensor.copysign_ In-place version of copysign()
Tensor.cos See torch.cos()
Tensor.cos_ In-place version of cos()
Tensor.cosh See torch.cosh()
Tensor.cosh_ In-place version of cosh()
Tensor.corrcoef See torch.corrcoef()
Tensor.count_nonzero See torch.count_nonzero()
Tensor.cov See torch.cov()
Tensor.acosh See torch.acosh()
Tensor.acosh_ In-place version of acosh()
Tensor.arccosh Alias for acosh()
Tensor.arccosh_ Alias for acosh_()
Tensor.cpu Returns a copy of this object in CPU memory.
Tensor.cross See torch.cross()
Tensor.cuda Returns a copy of this object in CUDA memory.
Tensor.logcumsumexp See torch.logcumsumexp()
Tensor.cummax See torch.cummax()
Tensor.cummin See torch.cummin()
Tensor.cumprod See torch.cumprod()
Tensor.cumprod_ In-place version of cumprod()
Tensor.cumsum See torch.cumsum()
Tensor.cumsum_ In-place version of cumsum()
Tensor.chalf self.chalf() is equivalent to self.to(torch.complex32).
Tensor.cfloat self.cfloat() is equivalent to self.to(torch.complex64).
Tensor.cdouble self.cdouble() is equivalent to self.to(torch.complex128).
Tensor.data_ptr Returns the address of the first element of self tensor.
Tensor.deg2rad See torch.deg2rad()
Tensor.dequantize Given a quantized Tensor, dequantize it and return the dequantized float Tensor.
Tensor.det See torch.det()
Tensor.dense_dim Return the number of dense dimensions in a sparse tensor self.
Tensor.detach Returns a new Tensor, detached from the current graph.
Tensor.detach_ Detaches the Tensor from the graph that created it, making it a leaf.
Tensor.diag See torch.diag()
Tensor.diag_embed See torch.diag_embed()
Tensor.diagflat See torch.diagflat()
Tensor.diagonal See torch.diagonal()
Tensor.diagonal_scatter See torch.diagonal_scatter()
Tensor.fill_diagonal_ Fill the main diagonal of a tensor that has at least 2-dimensions.
Tensor.fmax See torch.fmax()
Tensor.fmin See torch.fmin()
Tensor.diff See torch.diff()
Tensor.digamma See torch.digamma()
Tensor.digamma_ In-place version of digamma()
Tensor.dim Returns the number of dimensions of self tensor.
Tensor.dist See torch.dist()
Tensor.div See torch.div()
Tensor.div_ In-place version of div()
Tensor.divide See torch.divide()
Tensor.divide_ In-place version of divide()
Tensor.dot See torch.dot()
Tensor.double self.double() is equivalent to self.to(torch.float64).
Tensor.dsplit See torch.dsplit()
Tensor.element_size Returns the size in bytes of an individual element.
Tensor.eq See torch.eq()
Tensor.eq_ In-place version of eq()
Tensor.equal See torch.equal()
Tensor.erf See torch.erf()
Tensor.erf_ In-place version of erf()
Tensor.erfc See torch.erfc()
Tensor.erfc_ In-place version of erfc()
Tensor.erfinv See torch.erfinv()
Tensor.erfinv_ In-place version of erfinv()
Tensor.exp See torch.exp()
Tensor.exp_ In-place version of exp()
Tensor.expm1 See torch.expm1()
Tensor.expm1_ In-place version of expm1()
Tensor.expand Returns a new view of the self tensor with singleton dimensions expanded to a larger size.
Tensor.expand_as Expand this tensor to the same size as other.
Tensor.exponential_ Fills self tensor with elements drawn from the exponential distribution:
Tensor.fix See torch.fix().
Tensor.fix_ In-place version of fix()
Tensor.fill_ Fills self tensor with the specified value.
Tensor.flatten See torch.flatten()
Tensor.flip See torch.flip()
Tensor.fliplr See torch.fliplr()
Tensor.flipud See torch.flipud()
Tensor.float self.float() is equivalent to self.to(torch.float32).
Tensor.float_power See torch.float_power()
Tensor.float_power_ In-place version of float_power()
Tensor.floor See torch.floor()
Tensor.floor_ In-place version of floor()
Tensor.floor_divide See torch.floor_divide()
Tensor.floor_divide_ In-place version of floor_divide()
Tensor.fmod See torch.fmod()
Tensor.fmod_ In-place version of fmod()
Tensor.frac See torch.frac()
Tensor.frac_ In-place version of frac()
Tensor.frexp See torch.frexp()
Tensor.gather See torch.gather()
Tensor.gcd See torch.gcd()
Tensor.gcd_ In-place version of gcd()
Tensor.ge See torch.ge().
Tensor.ge_ In-place version of ge().
Tensor.greater_equal See torch.greater_equal().
Tensor.greater_equal_ In-place version of greater_equal().
Tensor.geometric_ Fills self tensor with elements drawn from the geometric distribution:
Tensor.geqrf See torch.geqrf()
Tensor.ger See torch.ger()
Tensor.get_device For CUDA tensors, this function returns the device ordinal of the GPU on which the tensor resides.
Tensor.gt See torch.gt().
Tensor.gt_ In-place version of gt().
Tensor.greater See torch.greater().
Tensor.greater_ In-place version of greater().
Tensor.half self.half() is equivalent to self.to(torch.float16).
Tensor.hardshrink See torch.nn.functional.hardshrink()
Tensor.heaviside See torch.heaviside()
Tensor.histc See torch.histc()
Tensor.histogram See torch.histogram()
Tensor.hsplit See torch.hsplit()
Tensor.hypot See torch.hypot()
Tensor.hypot_ In-place version of hypot()
Tensor.i0 See torch.i0()
Tensor.i0_ In-place version of i0()
Tensor.igamma See torch.igamma()
Tensor.igamma_ In-place version of igamma()
Tensor.igammac See torch.igammac()
Tensor.igammac_ In-place version of igammac()
Tensor.index_add_ Accumulate the elements of alpha times source into the self tensor by adding to the indices in the order given in index.
Tensor.index_add Out-of-place version of torch.Tensor.index_add_().
Tensor.index_copy_ Copies the elements of tensor into the self tensor by selecting the indices in the order given in index.
Tensor.index_copy Out-of-place version of torch.Tensor.index_copy_().
Tensor.index_fill_ Fills the elements of the self tensor with value value by selecting the indices in the order given in index.
Tensor.index_fill Out-of-place version of torch.Tensor.index_fill_().
Tensor.index_put_ Puts values from the tensor values into the tensor self using the indices specified in indices (which is a tuple of Tensors).
Tensor.index_put Out-place version of index_put_().
Tensor.index_reduce_ Accumulate the elements of source into the self tensor by accumulating to the indices in the order given in index using the reduction given by the reduce argument.
Tensor.index_reduce
Tensor.index_select See torch.index_select()
Tensor.indices Return the indices tensor of a sparse COO tensor.
Tensor.inner See torch.inner().
Tensor.int self.int() is equivalent to self.to(torch.int32).
Tensor.int_repr Given a quantized Tensor, self.int_repr() returns a CPU Tensor with uint8_t as data type that stores the underlying uint8_t values of the given Tensor.
Tensor.inverse See torch.inverse()
Tensor.isclose See torch.isclose()
Tensor.isfinite See torch.isfinite()
Tensor.isinf See torch.isinf()
Tensor.isposinf See torch.isposinf()
Tensor.isneginf See torch.isneginf()
Tensor.isnan See torch.isnan()
Tensor.is_contiguous Returns True if self tensor is contiguous in memory in the order specified by memory format.
Tensor.is_complex Returns True if the data type of self is a complex data type.
Tensor.is_conj Returns True if the conjugate bit of self is set to true.
Tensor.is_floating_point Returns True if the data type of self is a floating point data type.
Tensor.is_inference See torch.is_inference()
Tensor.is_leaf All Tensors that have requires_grad which is False will be leaf Tensors by convention.
Tensor.is_pinned Returns true if this tensor resides in pinned memory.
Tensor.is_set_to Returns True if both tensors are pointing to the exact same memory (same storage, offset, size and stride).
Tensor.is_shared Checks if tensor is in shared memory.
Tensor.is_signed Returns True if the data type of self is a signed data type.
Tensor.is_sparse Is True if the Tensor uses sparse storage layout, False otherwise.
Tensor.istft See torch.istft()
Tensor.isreal See torch.isreal()
Tensor.item Returns the value of this tensor as a standard Python number.
Tensor.kthvalue See torch.kthvalue()
Tensor.lcm See torch.lcm()
Tensor.lcm_ In-place version of lcm()
Tensor.ldexp See torch.ldexp()
Tensor.ldexp_ In-place version of ldexp()
Tensor.le See torch.le().
Tensor.le_ In-place version of le().
Tensor.less_equal See torch.less_equal().
Tensor.less_equal_ In-place version of less_equal().
Tensor.lerp See torch.lerp()
Tensor.lerp_ In-place version of lerp()
Tensor.lgamma See torch.lgamma()
Tensor.lgamma_ In-place version of lgamma()
Tensor.log See torch.log()
Tensor.log_ In-place version of log()
Tensor.logdet See torch.logdet()
Tensor.log10 See torch.log10()
Tensor.log10_ In-place version of log10()
Tensor.log1p See torch.log1p()
Tensor.log1p_ In-place version of log1p()
Tensor.log2 See torch.log2()
Tensor.log2_ In-place version of log2()
Tensor.log_normal_ Fills self tensor with numbers sampled from the log-normal distribution parameterized by the given mean μ and standard deviation σ.
Tensor.logaddexp See torch.logaddexp()
Tensor.logaddexp2 See torch.logaddexp2()
Tensor.logsumexp See torch.logsumexp()
Tensor.logical_and See torch.logical_and()
Tensor.logical_and_ In-place version of logical_and()
Tensor.logical_not See torch.logical_not()
Tensor.logical_not_ In-place version of logical_not()
Tensor.logical_or See torch.logical_or()
Tensor.logical_or_ In-place version of logical_or()
Tensor.logical_xor See torch.logical_xor()
Tensor.logical_xor_ In-place version of logical_xor()
Tensor.logit See torch.logit()
Tensor.logit_ In-place version of logit()
Tensor.long self.long() is equivalent to self.to(torch.int64).
Tensor.lt See torch.lt().
Tensor.lt_ In-place version of lt().
Tensor.less Alias for lt()
Tensor.less_ In-place version of less().
Tensor.lu See torch.lu()
Tensor.lu_solve See torch.lu_solve()
Tensor.as_subclass Makes a cls instance with the same data pointer as self.
Tensor.map_ Applies callable for each element in self tensor and the given tensor and stores the results in self tensor.
Tensor.masked_scatter_ Copies elements from source into self tensor at positions where the mask is True.
Tensor.masked_scatter Out-of-place version of torch.Tensor.masked_scatter_()
Tensor.masked_fill_ Fills elements of self tensor with value where mask is True.
Tensor.masked_fill Out-of-place version of torch.Tensor.masked_fill_()
Tensor.masked_select See torch.masked_select()
Tensor.matmul See torch.matmul()
Tensor.matrix_power Note: matrix_power() is deprecated, use torch.linalg.matrix_power() instead.
Tensor.matrix_exp See torch.matrix_exp()
Tensor.max See torch.max()
Tensor.maximum See torch.maximum()
Tensor.mean See torch.mean()
Tensor.nanmean See torch.nanmean()
Tensor.median See torch.median()
Tensor.nanmedian See torch.nanmedian()
Tensor.min See torch.min()
Tensor.minimum See torch.minimum()
Tensor.mm See torch.mm()
Tensor.smm See torch.smm()
Tensor.mode See torch.mode()
Tensor.movedim See torch.movedim()
Tensor.moveaxis See torch.moveaxis()
Tensor.msort See torch.msort()
Tensor.mul See torch.mul().
Tensor.mul_ In-place version of mul().
Tensor.multiply See torch.multiply().
Tensor.multiply_ In-place version of multiply().
Tensor.multinomial See torch.multinomial()
Tensor.mv See torch.mv()
Tensor.mvlgamma See torch.mvlgamma()
Tensor.mvlgamma_ In-place version of mvlgamma()
Tensor.nansum See torch.nansum()
Tensor.narrow See torch.narrow()
Tensor.narrow_copy See torch.narrow_copy().
Tensor.ndimension Alias for dim()
Tensor.nan_to_num See torch.nan_to_num().
Tensor.nan_to_num_ In-place version of nan_to_num().
Tensor.ne See torch.ne().
Tensor.ne_ In-place version of ne().
Tensor.not_equal See torch.not_equal().
Tensor.not_equal_ In-place version of not_equal().
Tensor.neg See torch.neg()
Tensor.neg_ In-place version of neg()
Tensor.negative See torch.negative()
Tensor.negative_ In-place version of negative()
Tensor.nelement Alias for numel()
Tensor.nextafter See torch.nextafter()
Tensor.nextafter_ In-place version of nextafter()
Tensor.nonzero See torch.nonzero()
Tensor.norm See torch.norm()
Tensor.normal_ Fills self tensor with elements sampled from the normal distribution parameterized by mean and std.
Tensor.numel See torch.numel()
Tensor.numpy Returns the tensor as a NumPy ndarray.
Tensor.orgqr See torch.orgqr()
Tensor.ormqr See torch.ormqr()
Tensor.outer See torch.outer().
Tensor.permute See torch.permute()
Tensor.pin_memory Copies the tensor to pinned memory, if it’s not already pinned.
Tensor.pinverse See torch.pinverse()
Tensor.polygamma See torch.polygamma()
Tensor.polygamma_ In-place version of polygamma()
Tensor.positive See torch.positive()
Tensor.pow See torch.pow()
Tensor.pow_ In-place version of pow()
Tensor.prod See torch.prod()
Tensor.put_ Copies the elements from source into the positions specified by index.
Tensor.qr See torch.qr()
Tensor.qscheme Returns the quantization scheme of a given QTensor.
Tensor.quantile See torch.quantile()
Tensor.nanquantile See torch.nanquantile()
Tensor.q_scale Given a Tensor quantized by linear(affine) quantization, returns the scale of the underlying quantizer().
Tensor.q_zero_point Given a Tensor quantized by linear(affine) quantization, returns the zero_point of the underlying quantizer().
Tensor.q_per_channel_scales Given a Tensor quantized by linear (affine) per-channel quantization, returns a Tensor of scales of the underlying quantizer.
Tensor.q_per_channel_zero_points Given a Tensor quantized by linear (affine) per-channel quantization, returns a tensor of zero_points of the underlying quantizer.
Tensor.q_per_channel_axis Given a Tensor quantized by linear (affine) per-channel quantization, returns the index of dimension on which per-channel quantization is applied.
Tensor.rad2deg See torch.rad2deg()
Tensor.random_ Fills self tensor with numbers sampled from the discrete uniform distribution over [from, to - 1].
Tensor.ravel see torch.ravel()
Tensor.reciprocal See torch.reciprocal()
Tensor.reciprocal_ In-place version of reciprocal()
Tensor.record_stream Ensures that the tensor memory is not reused for another tensor until all current work queued on stream are complete.
Tensor.register_hook Registers a backward hook.
Tensor.remainder See torch.remainder()
Tensor.remainder_ In-place version of remainder()
Tensor.renorm See torch.renorm()
Tensor.renorm_ In-place version of renorm()
Tensor.repeat Repeats this tensor along the specified dimensions.
Tensor.repeat_interleave See torch.repeat_interleave().
Tensor.requires_grad Is True if gradients need to be computed for this Tensor, False otherwise.
Tensor.requires_grad_ Change if autograd should record operations on this tensor: sets this tensor’s requires_grad attribute in-place.
Tensor.reshape Returns a tensor with the same data and number of elements as self but with the specified shape.
Tensor.reshape_as Returns this tensor as the same shape as other.
Tensor.resize_ Resizes self tensor to the specified size.
Tensor.resize_as_ Resizes the self tensor to be the same size as the specified tensor.
Tensor.retain_grad Enables this Tensor to have their grad populated during backward().
Tensor.retains_grad Is True if this Tensor is non-leaf and its grad is enabled to be populated during backward(), False otherwise.
Tensor.roll See torch.roll()
Tensor.rot90 See torch.rot90()
Tensor.round See torch.round()
Tensor.round_ In-place version of round()
Tensor.rsqrt See torch.rsqrt()
Tensor.rsqrt_ In-place version of rsqrt()
Tensor.scatter Out-of-place version of torch.Tensor.scatter_()
Tensor.scatter_ Writes all values from the tensor src into self at the indices specified in the index tensor.
Tensor.scatter_add_ Adds all values from the tensor src into self at the indices specified in the index tensor in a similar fashion as scatter_().
Tensor.scatter_add Out-of-place version of torch.Tensor.scatter_add_()
Tensor.scatter_reduce_ Reduces all values from the src tensor to the indices specified in the index tensor in the self tensor using the applied reduction defined via the reduce argument ("sum", "prod", "mean", "amax", "amin").
Tensor.scatter_reduce Out-of-place version of torch.Tensor.scatter_reduce_()
Tensor.select See torch.select()
Tensor.select_scatter See torch.select_scatter()
Tensor.set_ Sets the underlying storage, size, and strides.
Tensor.share_memory_ Moves the underlying storage to shared memory.
Tensor.short self.short() is equivalent to self.to(torch.int16).
Tensor.sigmoid See torch.sigmoid()
Tensor.sigmoid_ In-place version of sigmoid()
Tensor.sign See torch.sign()
Tensor.sign_ In-place version of sign()
Tensor.signbit See torch.signbit()
Tensor.sgn See torch.sgn()
Tensor.sgn_ In-place version of sgn()
Tensor.sin See torch.sin()
Tensor.sin_ In-place version of sin()
Tensor.sinc See torch.sinc()
Tensor.sinc_ In-place version of sinc()
Tensor.sinh See torch.sinh()
Tensor.sinh_ In-place version of sinh()
Tensor.asinh See torch.asinh()
Tensor.asinh_ In-place version of asinh()
Tensor.arcsinh See torch.arcsinh()
Tensor.arcsinh_ In-place version of arcsinh()
Tensor.size Returns the size of the self tensor.
Tensor.slogdet See torch.slogdet()
Tensor.slice_scatter See torch.slice_scatter()
Tensor.sort See torch.sort()
Tensor.split See torch.split()
Tensor.sparse_mask Returns a new sparse tensor with values from a strided tensor self filtered by the indices of the sparse tensor mask.
Tensor.sparse_dim Return the number of sparse dimensions in a sparse tensor self.
Tensor.sqrt See torch.sqrt()
Tensor.sqrt_ In-place version of sqrt()
Tensor.square See torch.square()
Tensor.square_ In-place version of square()
Tensor.squeeze See torch.squeeze()
Tensor.squeeze_ In-place version of squeeze()
Tensor.std See torch.std()
Tensor.stft See torch.stft()
Tensor.storage Returns the underlying storage.
Tensor.storage_offset Returns self tensor’s offset in the underlying storage in terms of number of storage elements (not bytes).
Tensor.storage_type Returns the type of the underlying storage.
Tensor.stride Returns the stride of self tensor.
Tensor.sub See torch.sub().
Tensor.sub_ In-place version of sub()
Tensor.subtract See torch.subtract().
Tensor.subtract_ In-place version of subtract().
Tensor.sum See torch.sum()
Tensor.sum_to_size Sum this tensor to size.
Tensor.svd See torch.svd()
Tensor.swapaxes See torch.swapaxes()
Tensor.swapdims See torch.swapdims()
Tensor.symeig See torch.symeig()
Tensor.t See torch.t()
Tensor.t_ In-place version of t()
Tensor.tensor_split See torch.tensor_split()
Tensor.tile See torch.tile()
Tensor.to Performs Tensor dtype and/or device conversion.
Tensor.to_mkldnn Returns a copy of the tensor in torch.mkldnn layout.
Tensor.take See torch.take()
Tensor.take_along_dim See torch.take_along_dim()
Tensor.tan See torch.tan()
Tensor.tan_ In-place version of tan()
Tensor.tanh See torch.tanh()
Tensor.tanh_ In-place version of tanh()
Tensor.atanh See torch.atanh()
Tensor.atanh_ In-place version of atanh()
Tensor.arctanh See torch.arctanh()
Tensor.arctanh_ In-place version of arctanh()
Tensor.tolist Returns the tensor as a (nested) list.
Tensor.topk See torch.topk()
Tensor.to_dense Creates a strided copy of self if self is not a strided tensor, otherwise returns self.
Tensor.to_sparse Returns a sparse copy of the tensor.
Tensor.to_sparse_csr Convert a tensor to compressed row storage format (CSR).
Tensor.to_sparse_csc Convert a tensor to compressed column storage (CSC) format.
Tensor.to_sparse_bsr Convert a CSR tensor to a block sparse row (BSR) storage format of given blocksize.
Tensor.to_sparse_bsc Convert a CSR tensor to a block sparse column (BSC) storage format of given blocksize.
Tensor.trace See torch.trace()
Tensor.transpose See torch.transpose()
Tensor.transpose_ In-place version of transpose()
Tensor.triangular_solve See torch.triangular_solve()
Tensor.tril See torch.tril()
Tensor.tril_ In-place version of tril()
Tensor.triu See torch.triu()
Tensor.triu_ In-place version of triu()
Tensor.true_divide See torch.true_divide()
Tensor.true_divide_ In-place version of true_divide()
Tensor.trunc See torch.trunc()
Tensor.trunc_ In-place version of trunc()
Tensor.type Returns the type if dtype is not provided, else casts this object to the specified type.
Tensor.type_as Returns this tensor cast to the type of the given tensor.
Tensor.unbind See torch.unbind()
Tensor.unflatten See torch.unflatten().
Tensor.unfold Returns a view of the original tensor which contains all slices of size size from self tensor in the dimension dimension.
Tensor.uniform_ Fills self tensor with numbers sampled from the continuous uniform distribution:
Tensor.unique Returns the unique elements of the input tensor.
Tensor.unique_consecutive Eliminates all but the first element from every consecutive group of equivalent elements.
Tensor.unsqueeze See torch.unsqueeze()
Tensor.unsqueeze_ In-place version of unsqueeze()
Tensor.values Return the values tensor of a sparse COO tensor.
Tensor.var See torch.var()
Tensor.vdot See torch.vdot()
Tensor.view Returns a new tensor with the same data as the self tensor but of a different shape.
Tensor.view_as View this tensor as the same size as other.
Tensor.vsplit See torch.vsplit()
Tensor.where self.where(condition, y) is equivalent to torch.where(condition, self, y).
Tensor.xlogy See torch.xlogy()
Tensor.xlogy_ In-place version of xlogy()
Tensor.zero_ Fills self tensor with zeros.

storage

The tensor data structure: storage(), stride(), storage_offset()

In PyTorch, a tensor object consists of two parts: a header (the Tensor metadata) and a storage area (Storage).

(Figure: diagram of the tensor structure)
The header mainly holds the tensor's shape (size), stride, data type, and similar metadata, while the actual data is laid out as a contiguous one-dimensional array in the storage area, managed by a torch.Storage instance.

Note: the storage is always a one-dimensional array; the actual data of a tensor of any dimensionality is stored in this 1-D storage.

Getting a tensor's storage

a = torch.tensor([[1.0, 4.0],[2.0, 1.0],[3.0, 5.0]])
a.storage()
Out[0]:
1.0
4.0
2.0
1.0
3.0
5.0
[torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6]

a.storage()[2] = 9

id(a.storage())
Out[1]: 1343354913168
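
The heading above also mentions stride() and storage_offset(); a short sketch, continuing from the tensor a defined above, of how a view shares the same storage:

b = a[1:]            # a view of a; no data is copied
b.storage_offset()   # 2: b starts at the third element of the shared storage
b.stride()           # (2, 1): moving one row forward skips 2 storage elements
b[0][0] = 100        # modifying the view also modifies a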

Examples

Image classification

Notes from the Xiaotudui (小土堆) and Li Mu (李沐) courses

PyTorch深度学习快速入门教程(绝对通俗易懂!)【小土堆】

Loading data in PyTorch

Loading data in PyTorch involves two classes, Dataset and DataLoader (a minimal Dataset sketch follows the list below).

  • Dataset provides a way to access each sample and its corresponding label, and tells us how many samples there are in total.
  • DataLoader packages the samples batch by batch and supplies them to the network in the form it needs.
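
A minimal sketch of a custom Dataset, assuming an image-folder layout like the bees directory used in the Transforms example below; the class name and paths are illustrative:

import os
from PIL import Image
from torch.utils.data import Dataset

class MyData(Dataset):
    def __init__(self, root_dir, label_dir):
        self.root_dir = root_dir          # e.g. "Data/FirstTypeData/val"
        self.label_dir = label_dir        # e.g. "bees"; also used as the label
        self.path = os.path.join(root_dir, label_dir)
        self.img_names = os.listdir(self.path)

    def __getitem__(self, idx):
        img = Image.open(os.path.join(self.path, self.img_names[idx]))
        return img, self.label_dir        # one sample and its label

    def __len__(self):
        return len(self.img_names)        # total number of samples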

TensorBoard

import torchvision
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# prepare the test dataset
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor())
# with batch_size=4, the DataLoader would take img0, target0 = dataset[0], ..., img3, target3 = dataset[3]
# and return those four samples packed together as one batch (here batch_size=64)
test_loader = DataLoader(dataset=test_data, batch_size=64, shuffle=True, num_workers=0, drop_last=False)
# use a for loop to take the batches packed by the DataLoader
writer = SummaryWriter("logs")
step = 0
for data in test_loader:
    imgs, targets = data  # each data is one batch: imgs has shape [batch_size, 3, 32, 32] (32×32 RGB images), targets holds the corresponding labels
    writer.add_images("test_data", imgs, step)
    step = step + 1

writer.close()
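
The logged images can then be inspected by launching TensorBoard with tensorboard --logdir=logs and opening the local URL it prints in a browser.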

Transforms

① If you think of transforms as a toolbox, the classes inside it are the different tools, such as ToTensor and Resize.

② Transforms takes images in certain formats and, after passing them through these tools, produces the result we want.

from torchvision import transforms
from PIL import Image

img_path = "Data/FirstTypeData/val/bees/10870992_eebeeb3a12.jpg"
img = Image.open(img_path)

tensor_trans = transforms.ToTensor() # create an instance of the transforms.ToTensor class
tensor_img = tensor_trans(img) # calling the instance invokes the __call__ magic method of transforms.ToTensor
print(tensor_img)
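
Several tools are usually chained with transforms.Compose; a small sketch (the 224×224 target size is just an example):

from torchvision import transforms
from PIL import Image

img = Image.open("Data/FirstTypeData/val/bees/10870992_eebeeb3a12.jpg")

trans_compose = transforms.Compose([
    transforms.Resize((224, 224)),  # resize the PIL image to 224x224
    transforms.ToTensor(),          # then convert it to a tensor with values in [0, 1]
])
img_tensor = trans_compose(img)
print(img_tensor.shape)             # torch.Size([3, 224, 224])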

torchvision datasets

① torchvision provides many datasets; by specifying the dataset and a few parameters in code, it can download the data automatically.

② The CIFAR-10 dataset contains 60,000 32×32 color images in 10 classes: 50,000 training images and 10,000 test images.

import torchvision
train_set = torchvision.datasets.CIFAR10(root="./dataset",train=True,download=True) # root is the relative path where the dataset is stored
test_set = torchvision.datasets.CIFAR10(root="./dataset",train=False,download=True) # train=True gives the training set, train=False the test set

print(test_set[0]) # the 3 in the output is the target
print(test_set.classes) # the classes contained in the test set

img, target = test_set[0] # get the image and its target separately
print(img)
print(target)

print(test_set.classes[target]) # the class corresponding to target 3
img.show()

Loss functions

① A loss function measures the gap between the actual output and the target.

② A loss function also provides the basis for updating the model parameters (via backpropagation).

L1Loss

import torch
from torch.nn import L1Loss
inputs = torch.tensor([1,2,3],dtype=torch.float32)
targets = torch.tensor([1,2,5],dtype=torch.float32)
inputs = torch.reshape(inputs,(1,1,1,3))
targets = torch.reshape(targets,(1,1,1,3))
loss = L1Loss() # reduction defaults to 'mean'
result = loss(inputs,targets)
print(result)
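
With the default 'mean' reduction, the element-wise absolute differences are (0, 0, 2), so the printed result should be tensor(0.6667), i.e. 2/3.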

MSE loss (MSELoss)

import torch
from torch.nn import L1Loss
from torch import nn
inputs = torch.tensor([1,2,3],dtype=torch.float32)
targets = torch.tensor([1,2,5],dtype=torch.float32)
inputs = torch.reshape(inputs,(1,1,1,3))
targets = torch.reshape(targets,(1,1,1,3))
loss_mse = nn.MSELoss()
result_mse = loss_mse(inputs,targets)
print(result_mse)
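
Here the squared differences are (0, 0, 4), so with the default 'mean' reduction the printed result should be tensor(1.3333), i.e. 4/3.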

Cross-entropy loss (CrossEntropyLoss)

import torch
from torch.nn import L1Loss
from torch import nn

x = torch.tensor([0.1,0.2,0.3])
y = torch.tensor([1])
x = torch.reshape(x,(1,3)) # batch_size of 1, three classes
loss_cross = nn.CrossEntropyLoss()
result_cross = loss_cross(x,y)
print(result_cross)
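
For raw (unnormalized) scores x, CrossEntropyLoss computes loss = -x[class] + log(Σ_j exp(x[j])); with x = (0.1, 0.2, 0.3) and target class 1 this gives -0.2 + log(e^0.1 + e^0.2 + e^0.3) ≈ 1.1019, which is what result_cross should print.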

Optimizers

① Calling backward() on the loss runs backpropagation and computes the gradients of the parameters we need to adjust; the optimizer then uses these gradients to update the parameters so that the overall error decreases.

② The gradients must be zeroed before each step; otherwise they accumulate across iterations.

loss = nn.CrossEntropyLoss() # cross-entropy loss
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(), lr=0.01) # stochastic gradient descent optimizer
for data in dataloader:
    imgs, targets = data
    outputs = tudui(imgs)
    result_loss = loss(outputs, targets) # gap between the actual output and the target
    optim.zero_grad() # zero the gradients
    result_loss.backward() # backpropagation: compute the gradients of the loss
    optim.step() # update the network parameters using the gradients
    print(result_loss) # the data has only been seen once (one epoch), so the loss does not drop much

Learning-rate scheduling

import torch
import torchvision
from torch import nn
from torch.nn import Conv2d, MaxPool2d, Flatten, Linear, Sequential
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

dataset = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)
dataloader = DataLoader(dataset, batch_size=64, drop_last=True)

class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = Sequential(
            Conv2d(3, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 32, 5, padding=2),
            MaxPool2d(2),
            Conv2d(32, 64, 5, padding=2),
            MaxPool2d(2),
            Flatten(),
            Linear(1024, 64),
            Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

loss = nn.CrossEntropyLoss() # cross-entropy loss
tudui = Tudui()
optim = torch.optim.SGD(tudui.parameters(), lr=0.01) # stochastic gradient descent optimizer
scheduler = torch.optim.lr_scheduler.StepLR(optim, step_size=5, gamma=0.1) # every step_size calls to scheduler.step(), the learning rate is multiplied by 0.1
for epoch in range(20):
    running_loss = 0.0
    for data in dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        result_loss = loss(outputs, targets) # gap between the actual output and the target
        optim.zero_grad() # zero the gradients
        result_loss.backward() # backpropagation: compute the gradients of the loss
        optim.step() # update the network parameters using the gradients
        scheduler.step() # the learning rate quickly becomes very small, so after 20 epochs the model has barely moved
        running_loss = running_loss + result_loss
    print(running_loss) # sum of all losses in this epoch

Using and modifying existing models

Adding layers to a model

import torchvision
from torch import nn

dataset = torchvision.datasets.CIFAR10("./dataset",train=True,transform=torchvision.transforms.ToTensor(),download=True)
vgg16_true = torchvision.models.vgg16(pretrained=True) # download the weights of the convolution and pooling layers; these parameters were trained on ImageNet
vgg16_true.add_module('add_linear',nn.Linear(1000,10)) # append a linear layer after VGG16 so that the output fits CIFAR10, which needs 10 classes

print(vgg16_true)

Modifying a model

import torchvision
from torch import nn

vgg16_false = torchvision.models.vgg16(pretrained=False) # without pretrained parameters
print(vgg16_false)
vgg16_false.classifier[6] = nn.Linear(4096,10)
print(vgg16_false)

Saving and loading models

Method 1: model structure + parameters

import torchvision
import torch
vgg16 = torchvision.models.vgg16(pretrained=False)
torch.save(vgg16,"./model/vgg16_method1.pth") # saving method 1: model structure + parameters
print(vgg16)

model = torch.load("./model/vgg16_method1.pth") # loading for saving method 1
print(model)

Method 2: parameters only (officially recommended); the network structure is not saved

import torchvision
import torch
vgg16 = torchvision.models.vgg16(pretrained=False)
torch.save(vgg16.state_dict(),"./model/vgg16_method2.pth") # saving method 2: parameters only (officially recommended); the model structure is not saved
print(vgg16)

model = torch.load("./model/vgg16_method2.pth") # load the saved parameters (a state dict)
print(model)
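
Note that method 2 only loads a dictionary of parameters; to get a usable model back, build the network first and then load the parameters into it:

vgg16 = torchvision.models.vgg16(pretrained=False)
vgg16.load_state_dict(torch.load("./model/vgg16_method2.pth"))
print(vgg16)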

Freezing model parameters

During training it is sometimes necessary to freeze part of the model's parameters and update only the rest. There are two ways to achieve this:

  1. set requires_grad to False for the layers whose parameters should not be updated;
  2. pass only the parameters that should be updated when constructing the optimizer.

The best approach is to combine them: pass only the parameters with requires_grad=True to the optimizer, which uses a bit less memory and is slightly more efficient.

import torch
import torch.nn as nn
import torch.optim as optim


# define a simple network
class net(nn.Module):
    def __init__(self, num_class=3):
        super(net, self).__init__()
        self.fc1 = nn.Linear(8, 4)
        self.fc2 = nn.Linear(4, num_class)

    def forward(self, x):
        return self.fc2(self.fc1(x))


model = net()

# freeze the parameters of the fc1 layer
for name, param in model.named_parameters():
    if "fc1" in name:
        param.requires_grad = False

loss_fn = nn.CrossEntropyLoss()

# pass only the parameters with requires_grad=True to the optimizer
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-2)
print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

model.train()
for epoch in range(10):
    x = torch.randn((3, 8))
    label = torch.randint(0, 3, [3]).long()
    output = model(x)

    loss = loss_fn(output, label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("model.fc1.weight", model.fc1.weight)
print("model.fc2.weight", model.fc2.weight)

Training workflow

Loading the dataset with DataLoader

import torchvision
from torch import nn
from torch.utils.data import DataLoader

# prepare the datasets
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)

# dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)
# if train_data_size = 10, this prints: length of the training set: 10
print("length of the training set: {}".format(train_data_size))
print("length of the test set: {}".format(test_data_size))

# use DataLoader to load the datasets
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

Checking that the network is correct

import torch
from torch import nn

# build the neural network
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2), # in channels 3, out channels 32, kernel 5x5, stride 1, padding 2
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(), # after flattening, the feature size is 64*4*4
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

if __name__ == '__main__':
    tudui = Tudui()
    input = torch.ones((64, 3, 32, 32))
    output = tudui(input)
    print(output.shape) # check that the output size is what we expect

Training the network

① The difference between model.train() and model.eval() mainly concerns the BatchNorm and Dropout layers.

② If the model contains BatchNorm or Dropout layers, call model.train() during training. For BatchNorm, model.train() makes the layer use the mean and variance of each batch; for Dropout, it randomly drops part of the connections while the parameters are updated.

③ model.eval() disables BatchNorm and Dropout training behavior. If the model contains BatchNorm or Dropout layers, call model.eval() during testing: BatchNorm then uses the statistics accumulated over the whole training data, i.e. its mean and variance stay fixed during testing, and Dropout uses all connections instead of randomly dropping neurons.

④ After training, the model is used on test samples. Call model.eval() before model(test); otherwise, even without training, feeding in data would still change the model (the BatchNorm running statistics keep updating). This behavior comes from the BatchNorm and Dropout layers.

⑤ Pay particular attention to this in one-class classification, where the training and test sets have different sample distributions.

import torchvision
import torch
from torch import nn
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# "from model import *" would pull in everything from model.py; here the model is written inline instead
class Tudui(nn.Module):
    def __init__(self):
        super(Tudui, self).__init__()
        self.model1 = nn.Sequential(
            nn.Conv2d(3, 32, 5, 1, 2), # in channels 3, out channels 32, kernel 5x5, stride 1, padding 2
            nn.MaxPool2d(2),
            nn.Conv2d(32, 32, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 5, 1, 2),
            nn.MaxPool2d(2),
            nn.Flatten(), # after flattening, the feature size is 64*4*4
            nn.Linear(64*4*4, 64),
            nn.Linear(64, 10)
        )

    def forward(self, x):
        x = self.model1(x)
        return x

# prepare the datasets
train_data = torchvision.datasets.CIFAR10("./dataset", train=True, transform=torchvision.transforms.ToTensor(), download=True)
test_data = torchvision.datasets.CIFAR10("./dataset", train=False, transform=torchvision.transforms.ToTensor(), download=True)

# dataset lengths
train_data_size = len(train_data)
test_data_size = len(test_data)
# if train_data_size = 10, this prints: length of the training set: 10
print("length of the training set: {}".format(train_data_size))
print("length of the test set: {}".format(test_data_size))

# use DataLoader to load the datasets
train_dataloader = DataLoader(train_data, batch_size=64)
test_dataloader = DataLoader(test_data, batch_size=64)

# create the network model
tudui = Tudui()

# loss function
loss_fn = nn.CrossEntropyLoss() # cross-entropy; "fn" is short for "function"

# optimizer
learning = 0.01 # 1e-2 means 0.01
optimizer = torch.optim.SGD(tudui.parameters(), learning) # stochastic gradient descent optimizer

# some bookkeeping
# number of training steps so far
total_train_step = 0
# number of test rounds so far
total_test_step = 0

# number of training epochs
epoch = 10

# add tensorboard
writer = SummaryWriter("logs")

for i in range(epoch):
    print("----- epoch {} starts -----".format(i+1))

    # training phase
    tudui.train() # takes effect when the network contains dropout or batchnorm layers
    for data in train_dataloader:
        imgs, targets = data
        outputs = tudui(imgs)
        loss = loss_fn(outputs, targets) # gap between the actual output and the target

        # let the optimizer update the model
        optimizer.zero_grad() # zero the gradients
        loss.backward() # backpropagation: compute the gradients of the loss
        optimizer.step() # update the network parameters using the gradients

        total_train_step = total_train_step + 1
        if total_train_step % 100 == 0:
            print("training step: {}, loss: {}".format(total_train_step, loss.item())) # use .item() to get the plain loss value
            writer.add_scalar("train_loss", loss.item(), total_train_step)

    # test phase (after each epoch, check the loss on the test set)
    tudui.eval() # disables the training behavior of dropout and batchnorm layers
    total_test_loss = 0
    total_accuracy = 0
    with torch.no_grad(): # no gradients are needed here
        for data in test_dataloader: # take batches from the test set
            imgs, targets = data
            outputs = tudui(imgs)
            loss = loss_fn(outputs, targets) # loss of this batch only
            total_test_loss = total_test_loss + loss.item() # accumulate all losses
            accuracy = (outputs.argmax(1) == targets).sum()
            total_accuracy = total_accuracy + accuracy

    print("total loss on the test set: {}".format(total_test_loss))
    print("accuracy on the test set: {}".format(total_accuracy/test_data_size))
    writer.add_scalar("test_loss", total_test_loss, total_test_step)
    writer.add_scalar("test_accuracy", total_accuracy/test_data_size, total_test_step)
    total_test_step = total_test_step + 1

    torch.save(tudui, "./model/tudui_{}.pth".format(i)) # save the model after each epoch
    # torch.save(tudui.state_dict(), "tudui_{}.pth".format(i)) # saving method 2
    print("model saved")

writer.close()

Transfer learning

Transfer learning | inspecting the model & parameters | loading pretrained models | modifying the model | freezing parameters

Inspecting the model and its parameters

import torch


class MyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = torch.nn.Sequential(
            torch.nn.Linear(3, 4),
            torch.nn.Linear(4, 3),
        )
        self.layer2 = torch.nn.Linear(3, 6)
        self.layer3 = torch.nn.Sequential(
            torch.nn.Linear(6, 7),
            torch.nn.Linear(7, 5),
        )

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        return x


net = MyModel()
print(net)
MyModel(
  (layer1): Sequential(
    (0): Linear(in_features=3, out_features=4, bias=True)
    (1): Linear(in_features=4, out_features=3, bias=True)
  )
  (layer2): Linear(in_features=3, out_features=6, bias=True)
  (layer3): Sequential(
    (0): Linear(in_features=6, out_features=7, bias=True)
    (1): Linear(in_features=7, out_features=5, bias=True)
  )
)

Inspecting parameters

for layer in net.modules():
    print(type(layer)) # print the type of each module
    # print(layer)
<class '__main__.MyModel'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.linear.Linear'>


for param in net.parameters():
    print(param.shape) # print the shape of each parameter
torch.Size([4, 3])
torch.Size([4])
torch.Size([3, 4])
torch.Size([3])
torch.Size([6, 3])
torch.Size([6])
torch.Size([7, 6])
torch.Size([7])
torch.Size([5, 7])
torch.Size([5])

for name, param in net.named_parameters():
    print(name, param.shape) # more detail: the parameter names as well
layer1.0.weight torch.Size([4, 3])
layer1.0.bias torch.Size([4])
layer1.1.weight torch.Size([3, 4])
layer1.1.bias torch.Size([3])
layer2.weight torch.Size([6, 3])
layer2.bias torch.Size([6])
layer3.0.weight torch.Size([7, 6])
layer3.0.bias torch.Size([7])
layer3.1.weight torch.Size([5, 7])
layer3.1.bias torch.Size([5])

for key, value in net.state_dict().items(): # parameter names and values
    print(key, value.shape)
layer1.0.weight torch.Size([4, 3])
layer1.0.bias torch.Size([4])
layer1.1.weight torch.Size([3, 4])
layer1.1.bias torch.Size([3])
layer2.weight torch.Size([6, 3])
layer2.bias torch.Size([6])
layer3.0.weight torch.Size([7, 6])
layer3.0.bias torch.Size([7])
layer3.1.weight torch.Size([5, 7])
layer3.1.bias torch.Size([5])
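
One detail worth noting: state_dict() also contains non-trainable buffers (for example the running mean and variance of BatchNorm layers), while parameters() only yields learnable parameters; for this all-Linear model the two listings happen to coincide.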

Saving and loading a model

# 1. load model + parameters
net = torch.load("resnet50.pth")

# 2. given an existing model definition, load pretrained parameters
resnet50 = models.resnet50(weights=None)
resnet50.load_state_dict(torch.load("resnet58_weight.pth"))

Modifying the network

from torch import nn
from torchvision import models

alexnet = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
print(alexnet)

AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)

Modifying the network structure

# 1. ----- delete the last layer of the network -----
# del alexnet.classifier
del alexnet.classifier[6]
print(alexnet)
AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
(4): Linear(in_features=4096, out_features=4096, bias=True)
(5): ReLU(inplace=True)
)
)

# 2. ----- delete the last several layers of the network -----
alexnet.classifier = alexnet.classifier[:-2]
print(alexnet)

AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Dropout(p=0.5, inplace=False)
)
)

# 3. ----- replace a particular layer of the network -----
alexnet.classifier[6] = nn.Linear(in_features=4096, out_features=1024)
print(alexnet)

AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=4096, out_features=1024, bias=True)
)
)

# 4. ----- add layers to the network, one layer at a time -----
alexnet.classifier.add_module('7', nn.ReLU(inplace=True))
alexnet.classifier.add_module('8', nn.Linear(in_features=1024, out_features=20))
print(alexnet)

AlexNet(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
(1): ReLU(inplace=True)
(2): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(3): Conv2d(64, 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(4): ReLU(inplace=True)
(5): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(6): Conv2d(192, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(7): ReLU(inplace=True)
(8): Conv2d(384, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(9): ReLU(inplace=True)
(10): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): ReLU(inplace=True)
(12): MaxPool2d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(6, 6))
(classifier): Sequential(
(0): Dropout(p=0.5, inplace=False)
(1): Linear(in_features=9216, out_features=4096, bias=True)
(2): ReLU(inplace=True)
(3): Linear(in_features=4096, out_features=1024, bias=True)
(4): ReLU(inplace=True)
(5): Linear(in_features=1024, out_features=20, bias=True)
)
)

Freezing parameters

# Task 1:
# 1. use model A as the backbone and modify it into model B
# 2. load model A's pretrained parameters into model B

resnet_modified = resnet50()  # here resnet50() is the modified model B, defined elsewhere with model A as its backbone
new_weights_dict = resnet_modified.state_dict()

resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
weights_dict = resnet.state_dict()

for k in weights_dict.keys():
    if k in new_weights_dict.keys() and not k.startswith('fc'):
        new_weights_dict[k] = weights_dict[k]
resnet_modified.load_state_dict(new_weights_dict)
# resnet_modified.load_state_dict(new_weights_dict, strict=False)

# Task 2:
# freeze the pretrained parameters
params = []
train_layer = ['layer5', 'conv_end', 'bn_end']
for name, param in resnet_modified.named_parameters():
    if any(name.startswith(prefix) for prefix in train_layer):
        print(name)
        params.append(param)
    else:
        param.requires_grad = False
optimizer = torch.optim.SGD(params, lr=0.001, momentum=0.9, weight_decay=5e-4)