Learning: Autoregressive transformer architecture
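The log below is from training a character-level autoregressive transformer. The defining piece of the architecture is the causal attention mask: each position may attend only to itself and earlier positions, never ahead. A minimal sketch in PyTorch (toy sequence length `T = 4`; the real model dimensions do not appear in this log):

```python
import torch
import torch.nn.functional as F

# Causal ("autoregressive") attention mask, sketched at toy size.
T = 4
scores = torch.randn(T, T)               # raw attention scores
mask = torch.tril(torch.ones(T, T))      # lower-triangular: no peeking ahead
scores = scores.masked_fill(mask == 0, float("-inf"))
weights = F.softmax(scores, dim=-1)      # each position attends only to the past
print(weights[0])  # → tensor([1., 0., 0., 0.]): position 0 sees only itself
```

Because masked entries become `-inf` before the softmax, they get exactly zero weight, and each row still sums to 1.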
torch.Size([1003854]) torch.Size([111540])
step: 0, train_loss: 4.748778820037842, val_loss: 4.714517593383789
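The two tensor sizes printed before step 0 look like the train/val split of the character-level corpus. The arithmetic (the 90/10 ratio is an inference, and the total matches the Tiny Shakespeare dataset's character count):

```python
# Sizes copied from the log above.
train_len, val_len = 1003854, 111540
total = train_len + val_len
print(total)                        # 1115394 characters in the full corpus
print(round(train_len / total, 3))  # ≈ 0.9, i.e. a 90/10 train/val split
```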
step: 100, train_loss: 2.3903167247772217, val_loss: 2.385220766067505
step: 200, train_loss: 2.092250108718872, val_loss: 2.1435768604278564
step: 300, train_loss: 1.9623249769210815, val_loss: 1.916853666305542
step: 400, train_loss: 1.8289004564285278, val_loss: 1.8219395875930786
step: 500, train_loss: 1.7200345993041992, val_loss: 1.751213788986206
step: 600, train_loss: 1.7356281280517578, val_loss: 1.6741344928741455
step: 700, train_loss: 1.669792890548706, val_loss: 1.6438542604446411
step: 800, train_loss: 1.604638934135437, val_loss: 1.5747987031936646
step: 900, train_loss: 1.6224688291549683, val_loss: 1.5837516784667969
step: 1000, train_loss: 1.5419563055038452, val_loss: 1.5516916513442993
step: 1100, train_loss: 1.5330239534378052, val_loss: 1.5537033081054688
step: 1200, train_loss: 1.5029696226119995, val_loss: 1.5563535690307617
step: 1300, train_loss: 1.4886131286621094, val_loss: 1.475431203842163
step: 1400, train_loss: 1.5073238611221313, val_loss: 1.488280177116394
step: 1500, train_loss: 1.4790602922439575, val_loss: 1.4852876663208008
step: 1600, train_loss: 1.4555158615112305, val_loss: 1.4525682926177979
step: 1700, train_loss: 1.4737730026245117, val_loss: 1.4860968589782715
step: 1800, train_loss: 1.4608075618743896, val_loss: 1.4416084289550781
step: 1900, train_loss: 1.44414484500885, val_loss: 1.4767403602600098
step: 2000, train_loss: 1.3886847496032715, val_loss: 1.469650387763977
step: 2100, train_loss: 1.3949044942855835, val_loss: 1.4103174209594727
step: 2200, train_loss: 1.4361478090286255, val_loss: 1.4105340242385864
step: 2300, train_loss: 1.437085509300232, val_loss: 1.3743518590927124
step: 2400, train_loss: 1.4046990871429443, val_loss: 1.4338154792785645
step: 2500, train_loss: 1.4029814004898071, val_loss: 1.4249279499053955
step: 2600, train_loss: 1.3771721124649048, val_loss: 1.4320268630981445
step: 2700, train_loss: 1.4096148014068604, val_loss: 1.3782976865768433
step: 2800, train_loss: 1.3926002979278564, val_loss: 1.378352403640747
step: 2900, train_loss: 1.3778668642044067, val_loss: 1.3619288206100464
step: 3000, train_loss: 1.3635691404342651, val_loss: 1.388641119003296
step: 3100, train_loss: 1.3466562032699585, val_loss: 1.3953568935394287
step: 3200, train_loss: 1.3403244018554688, val_loss: 1.3811506032943726
step: 3300, train_loss: 1.3469452857971191, val_loss: 1.4045485258102417
step: 3400, train_loss: 1.3725504875183105, val_loss: 1.363333821296692
step: 3500, train_loss: 1.3027948141098022, val_loss: 1.3362812995910645
step: 3600, train_loss: 1.3611921072006226, val_loss: 1.3206337690353394
step: 3700, train_loss: 1.3972643613815308, val_loss: 1.3262919187545776
step: 3800, train_loss: 1.2789582014083862, val_loss: 1.2885440587997437
step: 3900, train_loss: 1.350998044013977, val_loss: 1.3495311737060547
step: 4000, train_loss: 1.3521102666854858, val_loss: 1.3244529962539673
step: 4100, train_loss: 1.292658805847168, val_loss: 1.3285400867462158
step: 4200, train_loss: 1.334346890449524, val_loss: 1.3347630500793457
step: 4300, train_loss: 1.3297011852264404, val_loss: 1.3277519941329956
step: 4400, train_loss: 1.3203024864196777, val_loss: 1.2987546920776367
step: 4500, train_loss: 1.3363896608352661, val_loss: 1.3728227615356445
step: 4600, train_loss: 1.3244911432266235, val_loss: 1.3296911716461182
step: 4700, train_loss: 1.2891995906829834, val_loss: 1.334154486656189
step: 4800, train_loss: 1.3093699216842651, val_loss: 1.282935380935669
step: 4900, train_loss: 1.3010748624801636, val_loss: 1.304878830909729
1.342018485069275
generating text... torch.Size([1, 501])
(raw generated token ids omitted; the decoded sample follows)
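To interpret the losses above: cross-entropy in nats converts to per-character perplexity via `exp`. A vocabulary size of 65 is an assumption read off the generation dump (the largest token id is 64); note the step-0 loss (~4.75) sits just above the uniform-guessing baseline ln(65) ≈ 4.17, as expected for a randomly initialised model:

```python
import math

# Vocabulary size inferred from the token ids in the dump (max id 64).
vocab_size = 65
uniform_loss = math.log(vocab_size)      # ≈ 4.174, loss of a uniform guesser
start_ppl = math.exp(4.748778820037842)  # perplexity at step 0
final_ppl = math.exp(1.304878830909729)  # val perplexity at step 4900
print(round(uniform_loss, 3), round(start_ppl, 1), round(final_ppl, 2))
```

Perplexity drops from roughly 115 alternatives per character to under 4, which is why the samples go from noise to almost-English.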
DUKE VINCENTIO:
I calm adverdiction, bands and great a canoble
Is some learned withil, by
this should have at zmulteth.

KING HENRY VI:
Do you me thy rightful years did reugn;
Whose purpose thich out feel'd in remembs--yours fresh with child! for that
my hearts, and we will strive, all seven.
Nf, when will myself shall be twelve:
Touch go to win the Anthood, not by in,
It is not ask.

Second Gentleman:
Ay, the poceenting wives hath strange,
You chnow, in't my pirting o'erborn
And give and come

saving model...
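The sample above is the decoded form of the raw token ids the script printed. A sketch of the decoding, using a character vocabulary inferred from the ids themselves (the real vocab would come from `sorted(set(text))` over the training corpus; this reconstruction is an assumption, though it does map the first ids in the dump back to "DUKE VINCENTIO:"):

```python
# Hypothetical char vocab: 13 punctuation/whitespace/digit symbols, then A-Z, a-z.
chars = "\n !$&',-.3:;?" \
    + "".join(chr(c) for c in range(ord("A"), ord("Z") + 1)) \
    + "".join(chr(c) for c in range(ord("a"), ord("z") + 1))
itos = dict(enumerate(chars))
stoi = {ch: i for i, ch in itos.items()}

def decode(ids):
    return "".join(itos[i] for i in ids)

def encode(s):
    return [stoi[ch] for ch in s]

print(decode([16, 33, 23, 17, 1, 34, 21, 26, 15, 17, 26, 32, 21, 27, 10]))
# → DUKE VINCENTIO:
```

Token id 0 is the newline character, which is how the line breaks in the sample were recovered from the flat id array.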