Hands-on: Best Practices for Data Optimization, Multi-GPU Acceleration, and Numerical Stability
Overview: pending…
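Hands-on; Dataset optimization; GPU utilization optimization; Multi-GPU training with DeepSpeed; Other issues; NaN detection and handling; Tuning tips; Reuse existing tensors where possible instead of repeatedly allocating new ones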
2024 Talk Materials
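PyTorch Study: Advanced Topics
PyTorch Chinese documentation
Tensor
torch.Tensor is a multi-dimensional matrix containing elements of a single data type.
Torch defines 10 tensor types, each with a CPU and a GPU variant: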
| Data type | dtype | CPU tensor | GPU tensor |
| --- | --- | --- | --- |
| 32-bit floating point | torch.float32 or torch.float | torch.FloatTensor | torch.cuda.FloatTensor |
| 64-bit floating point | torch.float64 or torch.double | torch.DoubleTensor | torch.cuda.DoubleTensor |
| 16-bit floating point [1] | torch.float16 or torch.half | torch.HalfTensor | torch.cuda.HalfTensor |
| 16-bit floating point [2] | torch.bfloat16 | torch.BFloat16Tensor | torch.cuda.BFloat ... |
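The two 16-bit rows in the table differ in how they split their bits: float16 (binary16) uses 1 sign, 5 exponent, and 10 significand bits, while bfloat16 uses 1 sign, 8 exponent, and 7 significand bits. The sketch below illustrates that trade-off in plain Python, without PyTorch. The helper names are hypothetical, and the bfloat16 helper truncates rather than rounds, which is a simplification of what real hardware usually does.

```python
import struct


def round_to_float16(x: float) -> float:
    """Round x to IEEE binary16 by packing and unpacking with the 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def truncate_to_bfloat16(x: float) -> float:
    """Emulate bfloat16 by keeping only the top 16 bits of the float32 encoding.

    Assumption: plain truncation, not round-to-nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def float16_overflows(x: float) -> bool:
    """Return True if x is too large to be stored as a finite float16."""
    try:
        struct.pack("<e", x)
        return False
    except OverflowError:
        return True


# float16 keeps more significand bits, so it is closer to 0.1 ...
f16_error = abs(round_to_float16(0.1) - 0.1)
bf16_error = abs(truncate_to_bfloat16(0.1) - 0.1)

# ... but bfloat16 keeps the float32 exponent range, so 70000.0 still fits.
big_overflows_f16 = float16_overflows(70000.0)
big_as_bf16 = truncate_to_bfloat16(70000.0)
```

In short, float16 is more precise near 1 but overflows above 65504, whereas bfloat16 trades precision for the same dynamic range as float32.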
LLM Tokenizer Series
tokenizer
Hugging Face Tokenizer documentation
Summary of the Hugging Face tokenizers
[LLM Series: Tokenizer] How to Train an LLM Tokenizer Scientifically
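Overview
Tokenizing a text means splitting it into words or subwords. Those words or subwords are then mapped to specific IDs through a lookup table, which is a simple one-to-one correspondence.
Our main focus is therefore on how text gets parsed into a sequence of words or subwords.
More specifically, we look at the three main tokenizer types used in the 🤗 Transformers library: Byte-Pair Encoding (BPE), WordPiece, and SentencePiece, with examples of which model uses which tokenizer.
To find out which tokenizer a given pretrained model uses, check the documentation on that model's page. For example, the BertTokenizer documentation shows that the model uses a WordPiece tokenizer.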
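The lookup-table step described above can be sketched in a few lines. The vocabulary, the `[UNK]` fallback token, and the function name below are hypothetical and chosen only for illustration; real tokenizers ship vocabularies with tens of thousands of entries.

```python
# Hypothetical toy vocabulary mapping each token to an integer ID.
vocab = {"[UNK]": 0, "do": 1, "love": 2, "sure": 3, "we": 4, "you": 5}


def tokens_to_ids(tokens, vocab, unk_token="[UNK]"):
    """Map each token to its ID, falling back to the unknown-token ID."""
    unk_id = vocab[unk_token]
    return [vocab.get(token, unk_id) for token in tokens]


# "transformers" is not in the toy vocabulary, so it maps to the [UNK] ID.
ids = tokens_to_ids(["we", "sure", "do", "transformers"], vocab)
```

Because the mapping itself is this simple, the interesting design decisions all sit in the splitting step that produces the tokens.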
Tokenization example
Splitting a text into smaller chunks is harder than it looks, and there are many ways to do it. As an example, consider this sentence:
"Don't you love 🤗 Transformers? We sure do."
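To make the difficulty concrete, here is a minimal sketch of two naive splitting strategies applied to that sentence, using only the Python standard library. Neither is what BPE, WordPiece, or SentencePiece actually does; the regular expression is an assumption chosen for illustration.

```python
import re

sentence = "Don't you love 🤗 Transformers? We sure do."

# Strategy 1: split on whitespace only. Punctuation stays glued to words,
# so "Transformers?" and "do." become tokens of their own.
by_space = sentence.split()

# Strategy 2: also separate every punctuation mark or symbol.
# This fixes "Transformers?" but breaks "Don't" into "Don", "'", "t".
by_punct = re.findall(r"\w+|[^\w\s]", sentence)
```

Whitespace splitting yields 8 tokens and punctuation splitting yields 12, and both handle the contraction "Don't" poorly, which is part of the motivation for subword tokenizers.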
...
Gugua Workout Notes
Machine Learning: Single-Variable Calculus (2)
Machine Learning: Optimization Methods (2)
Machine Learning: Probability Theory (2)
Machine Learning: Linear Algebra and Matrix Theory (3)
Machine Learning: Linear Algebra and Matrix Theory (2)