Tengo un marco de datos de 2 años de fechas e ingresos en un día determinado,

import pandas as pd
import datetime
import itertools
import time
import plotly.graph_objects as go

startDate = datetime.date(2018,1,1)
endDate = datetime.date(2019,12,31)

date_range = pd.date_range(start=startDate, end=endDate)
performance_df = pd.DataFrame(date_range)
performance_df.columns = ['Date']
performance_df = performance_df.set_index(['Date'])
performance_df['Revenue'] = [25891.84678700861, 25851.615540667623, 25037.71189951304, 26715.764965288607, 23988.35694961679, 19029.057983049668, 16935.481705163278, 22756.072913397493, 30385.672828716073, 32970.13217533639, 31089.167074855934, 24262.972414940097, 18261.273831731618, 18304.754083985797, 26297.835664941533, 32619.66940484445, 35565.26222544722, 33229.97193979324, 25405.647135516112, 19980.890374561146, 20487.553160719217, 29709.0323217462, 38164.493647661984, 39050.80114673623, 36612.554432511824, 28169.78252364524, 22086.641617812107, 21631.662705640312, 28419.94529036299, 35644.61736420142, 35829.065860994495, 32907.079142030154, 25951.247521574016, 22888.00983435945, 22582.648252027546, 30024.92542296243, 37891.251492167445, 39065.307017542225, 35326.30407697067, 28447.88908662872, 25042.500493029664, 26403.83421252776, 32380.475740928025, 36605.55089473326, 36006.56039455697, 35189.153100968004, 29780.77465095395, 24909.218739056418, 23685.75537938559, 31839.08994457272, 39061.7208522828, 39973.50309446715, 36623.93766798115, 28038.08152342491, 23004.30712890111, 22349.30571082852, 29259.27790736973, 36562.99112657728, 34942.83314919648, 30908.429691071553, 25025.293504644822, 22417.499234687977, 23235.05923247665, 30142.36658055089, 37322.22656885001, 39533.654081050176, 38817.55694852113, 32902.066818425184, 27289.215659267025, 26836.10240333383, 32714.6554385672, 40479.10841583944, 43032.710867507936, 42172.617851188385, 34581.509020848, 28796.31836571319, 28742.324093757332, 34514.108362569794, 41762.5838531726, 43526.12116978522, 39641.50434996709, 30279.354030113776, 23901.27040606382, 24482.3224100694, 34144.561683174434, 40900.767127300045, 41325.58696351466, 32836.06314047833, 28141.78555667737, 30152.48501882366, 31302.601421675394, 30866.3243266088, 35869.875330241855, 40580.31733582241, 41993.5864357607, 39319.78415151001, 35718.297902676924, 36181.313810975684, 41230.810723606075, 46047.43448330563, 47857.354050289476, 45322.56791751129, 37822.96403899934, 31215.19286619295, 29918.90585181318, 35176.29194324105, 43574.87496458186, 46034.04274563455, 43441.151399711365, 34723.430758204755, 27862.663129803153, 28567.807404056617, 37133.90907964683, 43009.01499641166, 45626.712492547886, 43142.632484355025, 33862.51562817326, 27397.539418702294, 27826.66395150021, 33130.40536093521, 39810.27565167983, 43719.59292308625, 41154.292760403885, 35925.79865016479, 32567.778590584392, 33297.526224142544, 32947.89653815421, 42798.953783336874, 47614.60147123756, 45254.62604676405, 38343.504556057334, 33179.627837955115, 33582.52982828824, 42105.00412410267, 48043.66651093638, 48837.40726130669, 48430.69364401822, 41076.68912572308, 34892.890461267096, 34935.159059467, 43160.00734032636, 52344.94145539043, 53619.43675580939, 52103.10220317212, 46364.68259720105, 43350.64074445112, 43840.939241180975, 41388.2098953964, 50023.86997334399, 53997.492172694445, 51246.8781429738, 44896.933023168574, 39424.66568855407, 39201.81892657556, 48299.32823370456, 57482.470637739105, 59523.110347325885, 55947.08067737157, 46045.719863431106, 38883.154337118904, 39475.701405236185, 49427.033216828415, 60480.43065506853, 61135.583867704656, 56363.832578283495, 46645.212577169856, 38947.471275125194, 39329.754583734706, 50567.536867536335, 60653.82734696712, 63896.50017170786, 59898.432410040055, 48184.54638977423, 41190.50886536953, 41376.579389025275, 51676.40473294583, 60212.103940879424, 61073.542419917525, 57551.1469742174, 46747.318316404155, 40323.9814761604, 41029.31091363546, 49818.65317184012, 58397.91950877408, 61499.7188209775, 58381.773792521184, 51645.68301028936, 46686.85877971279, 46284.20382322595, 53168.20709487424, 59629.08096569072, 61559.59693441008, 58214.20592954166, 50189.35908931843, 44830.32689867108, 44553.609729770156, 50596.800094530976, 57784.320856900726, 58856.24561158704, 55747.50121815237, 45909.22320056169, 39700.62514340684, 40026.08060928283, 45670.31405277475, 53989.53076463664, 56840.67697150013, 51990.04207895543, 44479.51204240872, 40155.77712289925, 39704.41534828166, 44546.13749709498, 47853.054952435254, 47513.191562263164, 44746.64958742695, 38570.66136465314, 34913.4920574524, 34960.54662729273, 38809.56679736621, 44328.06662512622, 46786.97399754649, 43176.24626069548, 38761.63401887685, 35505.42439182791, 34491.44625748903, 38371.78994694245, 43751.84248007749, 45179.535352503226, 42830.262169078655, 37807.84955152587, 33490.071062830524, 33451.981211263366, 38489.460640344005, 44884.80556430299, 48173.59627480145, 45230.34903136869, 40408.815586107376, 37482.963570560685, 37298.73472663822, 35613.14542796679, 44369.054329647, 48985.813091293036, 47004.185926539445, 40072.386470837155, 36025.40024944878, 37087.92873340035, 43829.83096193774, 53122.12039373634, 56019.219188405055, 56327.64714973823, 49299.052443800436, 43566.64775098363, 44683.28514853064, 53151.23631794093, 62692.280707330996, 64588.59942255457, 61225.42426070072, 52811.43214958245, 45888.1735729261, 46327.71392726898, 55065.63459685981, 63657.55109568991, 66424.12156809494, 64999.66733479166, 57939.99891629782, 50574.620616169435, 50558.85074509659, 58418.01819988318, 64054.52320755815, 65362.71077771696, 63212.24817914635, 53816.39717322037, 47299.69358112465, 46438.29122104288, 51606.847773139874, 58286.00492998514, 60724.299744674456, 60074.178339144695, 53563.16147623882, 47537.06339596468, 48392.602700494775, 56176.46157312282, 64575.322111131725, 66575.84159174575, 64667.37570830546, 58256.091303140376, 53478.61964952481, 55099.86601961843, 62950.88160139487, 67181.84218779847, 66728.1189827789, 64883.456569064045, 56046.87389471389, 49400.485446729814, 47508.2631477567, 50017.97093003997, 54896.861775391895, 54433.64513960683, 50522.90141287548, 40609.51437863042, 33685.99041915158, 31782.758247954887, 36404.455159974576, 40003.119259204075, 39722.56468522227, 38114.81374102996, 32394.83359664386, 27315.366586900353, 27310.107293953042, 33405.16835200959, 42120.632096858724, 43190.48348102931, 41290.86540942159, 35119.35131893462, 28756.590790603772, 28610.11081953303, 35550.66376207889, 41120.32617529186, 42589.10273496922, 42320.707348300246, 35497.925445967456, 28680.144395217914, 28433.68805319704, 34480.02122168917, 40808.10672190518, 41632.86607595266, 39212.58899530489, 32942.6873470945, 27158.723820603347, 27161.92132942049, 33334.34617535648, 38679.06248665687, 40681.03562440046, 39477.59519930245, 34513.37459740981, 28345.14667273714, 28289.697206577042, 34737.3173677138, 40574.91034815245, 40556.06688657629, 39927.322507441866, 34634.7483828078, 27666.467364275486, 27774.36383185118, 33950.84687537262, 39518.06131165054, 39587.56870083202, 38832.66031065059, 32258.462222065184, 23343.00831465727, 23914.89577468648, 28173.094174897382, 30306.555827203953, 28284.310391780135, 24228.75442600916, 20495.999364892246, 19302.93644485608, 21391.090776536974, 21072.220129100904, 19770.681250398102, 22751.205447107975, 25744.075479601106, 27119.697588885116, 28894.626077292316, 30321.364424584666, 29665.55870322018, 28601.71879337108, 28071.180317842627, 27522.026515632668, 25081.934367325113, 21303.503766392783, 18866.89154435026, 23938.585815421528, 29814.69141061624, 33132.41332574798, 32403.437424673357, 25058.826704400228, 19710.392712492816, 20359.305357642494, 28801.270965356667, 36009.139554077155, 38616.97439807099, 36667.40186755878, 28534.365261187533, 22467.35716166321, 22792.672711448053, 29707.297931805748, 37470.29566599212, 39159.19770376187, 35885.39779973568, 28198.615393579756, 22794.986958484948, 22618.6644213648, 31692.436151849328, 38546.03983274927, 38392.05968074403, 36221.937805245456, 28193.106505443564, 24431.214224965315, 24273.041173434154, 33527.551077655895, 39799.45238198621, 42167.39446971978, 38724.44521961169, 31303.32596831539, 28852.214711208617, 29328.603401220582, 38595.29805477188, 46086.742816943195, 46212.71514312982, 40873.403063823505, 34050.57262699937, 30182.502158521005, 29877.84162199769, 36061.93273317978, 42775.17967463026, 46034.16044887075, 41613.766613228676, 33678.29697864473, 28330.525176789113, 27672.35860253945, 35245.249394927574, 44232.03285856061, 45226.19144817676, 43043.919289296115, 33089.63437761405, 27591.940183796116, 27719.98729388228, 34909.70477643947, 43035.046467701366, 45301.769969577756, 42110.40663131329, 33482.256424365776, 29072.549855117086, 28488.907914432693, 37030.57030038991, 44216.431591844026, 46331.06515629943, 45488.943714074994, 37613.19663707921, 31321.72125252333, 30932.753428207085, 37772.776810934076, 45796.64013962434, 47820.36900857583, 45315.76501111126, 35448.39425262605, 29850.35013283804, 29591.159982436213, 37654.079017344375, 45495.78841466611, 47908.959672341305, 44361.65729402939, 35983.7996238946, 32075.219160365035, 33116.287276080366, 42322.98190708696, 51963.46941425464, 52271.57848952532, 52043.23599364105, 45266.807463334626, 40278.50721681852, 41783.0851672551, 49025.27275062121, 57753.777188678556, 57148.82659973791, 51158.111967429824, 40270.45350840617, 32304.431822795817, 32070.63494283356, 37752.91137050926, 42330.03164511995, 42399.727284674096, 36018.92734513604, 30062.163733450554, 29588.1255976938, 29092.231273784368, 26893.12179696699, 34929.949841169895, 42461.99046702016, 41600.38455975693, 34379.79505825023, 29348.680842359154, 30674.019184850084, 36438.884319303455, 42592.755250922644, 42370.3178916497, 42156.89423782389, 35502.775577422006, 32961.0702792199, 33247.123325205226, 32638.287197185437, 44674.20771450234, 51377.68957910347, 50811.51554085845, 40938.066497686414, 34693.61195056415, 35794.33867394901, 43608.768716936305, 53030.999471673196, 55375.512665613925, 51350.83787518498, 43515.67353483048, 36881.67381692623, 38068.73656246778, 46093.29078747657, 57268.0460688599, 58937.23137518557, 54242.503589531414, 45218.09204062047, 41196.31936112136, 42541.41388049561, 41024.48650669875, 51622.7076655413, 56013.155747602264, 52544.45187791683, 43851.12531567635, 38604.27886985339, 38620.02900914742, 47969.72289541592, 56852.747992644305, 61996.47126627607, 58805.310177720465, 47280.70456188186, 40583.34097614553, 41128.192126370894, 50193.55312608201, 59539.43834004759, 64983.864246308556, 61279.46570907009, 48338.44203029854, 40484.3484226722, 41884.39600353677, 54289.35228313394, 61250.691081910205, 64220.31437336595, 59165.983057045596, 47642.189399246454, 41115.51194818619, 41876.72414814975, 51464.90978190481, 62803.30863417166, 65385.57373290332, 60334.06981497588, 46284.58686223134, 38833.89407337377, 39698.05357331142, 47745.00067522588, 55101.85604010965, 58237.24426215552, 57764.44135307596, 50420.8551702024, 45567.72968395969, 46504.94127248667, 54349.34998968483, 60731.4138465725, 63303.91498179767, 60298.72859863081, 51704.17508237814, 45717.347342456975, 46059.70474391559, 53049.488500429776, 59306.32398907892, 62403.524245874905, 59696.274176988816, 48684.76124917582, 41835.26632192022, 42430.54555033094, 50076.369117078095, 56940.56608534447, 58025.69842660899, 55038.43675221109, 44884.821827167245, 38939.787615900466, 38979.59802389587, 44663.55176765817, 50517.54060893932, 50345.85708898577, 47988.558150172845, 40853.4057042814, 38165.8415118995, 38648.876806740445, 41908.58157309806, 47401.57907842861, 49481.04013850385, 48095.25972992769, 41284.361139943714, 37854.4480306016, 37546.53779346516, 41603.843693113595, 47820.91079534329, 47246.476111540374, 44466.81067665845, 39450.05564794844, 37010.325669384845, 36654.85544497009, 40353.082727944166, 46185.06388406285, 47580.77496864459, 46665.410664712144, 40791.501556263196, 38355.37341411503, 37635.23471064154, 36758.08397128221, 45985.88943874199, 49995.80097644232, 47975.90718646997, 41429.81245724381, 37282.35082487397, 37137.63721104596, 42201.7547964592, 49822.27312845655, 54534.49980287231, 54533.77882552861, 47481.81951410114, 42055.08936215232, 43879.59319217621, 50025.5866728274, 57779.83356382646, 62229.99530852202, 60656.63284916041, 53322.422185201416, 46947.596440932735, 48157.11595496144, 55485.4195651247, 65417.97612369258, 69959.13505419965, 68457.41761185907, 58008.38472436904, 49036.129043870984, 48812.32078456241, 55770.81148868941, 62578.40993961545, 64911.32653171455, 63452.51661006169, 56211.79640881274, 49058.582679465755, 49695.71775694223, 54647.5259012805, 59928.6898365422, 54631.05724966697, 49250.29905728087, 40856.97377954754, 36266.89995539015, 35517.15307352361, 34184.53353052385, 33371.86128513576, 31969.554327793263, 30641.723523056826, 28778.69553058833, 27946.73958770959, 27509.81466494675, 26662.70133275635, 26476.786252324444, 24811.176812171496, 23734.407940954658, 21813.02876068876, 19691.25151218245, 17829.182468339233, 15717.002576485738, 14676.555091672217, 13659.528312206226, 12944.24059105674, 11821.33696211924, 10574.619947022518, 9870.736391613336, 8602.998647942173, 8088.018686973378, 7668.737915280734, 7409.212563574305, 7059.414255782104, 6805.311671542087, 6265.577428451043, 5336.939303601046, 5036.213630383211, 4759.937178041844, 5002.634961970787, 5377.02368302538, 5080.424215798721, 4864.875421529681, 4236.1512408381295, 4106.63390017699, 4042.2341857847173, 4197.684686131461, 4474.035273523109, 4490.267528124243, 4241.475197902689, 3589.2927029890307, 3444.298594012123, 3107.614908066812, 3426.8278986551327, 3828.278281872126, 3941.803702165574, 3732.7576911047563, 3225.113307797762, 3072.114284546333, 3150.7921285990747, 3233.947689115616, 3619.052387506853, 3551.5976360602217, 3435.0323606968946, 2973.892113333866, 2674.3168644471334, 2460.238549245212, 2800.034199553288, 3195.2623366970715, 3107.693557143192, 2961.140811696215, 2340.478044336084, 1931.373924738195, 1847.1236388756024, 2179.3253334294473, 2438.0353828364327, 2379.86512657921, 2255.5989701513035, 1870.926018202182, 1620.6083820631166, 1511.3110067191255, 1685.9676158651248, 2099.8497631541054, 2430.5076190841487, 2701.9755190700494, 2997.6300510919987, 2977.3472468469777, 2893.576703677185, 3207.4670223153535, 3539.6180497104797, 3534.997475538599, 3527.721146658791, 3489.9154298884955, 3287.543337245921]
print(performance_df)

Necesito acceder a los ingresos del año pasado e imprimirlo junto al año actual en una nueva columna de df para poder comparar el rendimiento año tras año. Mi salida deseada se vería así

                 Revenue .  LY Revenue
Date                    
2018-01-01  25891.846787 .   Nan
2018-01-02  25851.615541 .   Nan
2018-01-03  25037.711900 .   Nan
2018-01-04  26715.764965 .   Nan
2018-01-05  23988.356950 .   Nan
...                  ...
2019-12-27   3539.618050 .  25744.075480
2019-12-28   3534.997476 .  27119.697589
2019-12-29   3527.721147 .  28894.626077
2019-12-30   3489.915430 .  30321.364425
2019-12-31   3287.543337 .  29665.558703

¿Cómo vas a lograr esto? Hasta ahora, he podido obtener la fecha del año pasado del índice:

performance_df['Last year dates'] = (performance_df['Revenue'].index - pd.Timedelta(days=365))

Pero quiero los ingresos correspondientes para esa fecha, no solo la fecha en sí misma.

0
miketheking 3 oct. 2019 a las 18:09

4 respuestas

La mejor respuesta

IIUC, necesitas esto.

Esto funciona solo cuando tiene fecha y hora como índice. Lo que estamos haciendo aquí es agrupar por día y mes usando el valor de fecha y hora e incluso si las fechas están entre el año bisiesto y el año normal, esto debería funcionar.

performance_df['LY_Revenue'] = performance_df.groupby([performance_df.index.month,performance_df.index.day])['Revenue'].shift()
print(performance_df)

Salida

                Revenue     LY_Revenue
Date        
2018-01-01  25891.846787    NaN
2018-01-02  25851.615541    NaN
2018-01-03  25037.711900    NaN
2018-01-04  26715.764965    NaN
2018-01-05  23988.356950    NaN
2018-01-06  19029.057983    NaN
2018-01-07  16935.481705    NaN
2018-01-08  22756.072913    NaN
2018-01-09  30385.672829    NaN
2018-01-10  32970.132175    NaN
2018-01-11  31089.167075    NaN
2018-01-12  24262.972415    NaN
2018-01-13  18261.273832    NaN
2018-01-14  18304.754084    NaN
2018-01-15  26297.835665    NaN
2018-01-16  32619.669405    NaN
2018-01-17  35565.262225    NaN
2018-01-18  33229.971940    NaN
2018-01-19  25405.647136    NaN
2018-01-20  19980.890375    NaN
2018-01-21  20487.553161    NaN
2018-01-22  29709.032322    NaN
2018-01-23  38164.493648    NaN
2018-01-24  39050.801147    NaN
2018-01-25  36612.554433    NaN
2018-01-26  28169.782524    NaN
2018-01-27  22086.641618    NaN
2018-01-28  21631.662706    NaN
2018-01-29  28419.945290    NaN
2018-01-30  35644.617364    NaN
...     ...     ...
2019-12-02  2973.892113     28289.697207
2019-12-03  2674.316864     34737.317368
2019-12-04  2460.238549     40574.910348
2019-12-05  2800.034200     40556.066887
2019-12-06  3195.262337     39927.322507
2019-12-07  3107.693557     34634.748383
2019-12-08  2961.140812     27666.467364
2019-12-09  2340.478044     27774.363832
2019-12-10  1931.373925     33950.846875
2019-12-11  1847.123639     39518.061312
2019-12-12  2179.325333     39587.568701
2019-12-13  2438.035383     38832.660311
2019-12-14  2379.865127     32258.462222
2019-12-15  2255.598970     23343.008315
2019-12-16  1870.926018     23914.895775
2019-12-17  1620.608382     28173.094175
2019-12-18  1511.311007     30306.555827
2019-12-19  1685.967616     28284.310392
2019-12-20  2099.849763     24228.754426
2019-12-21  2430.507619     20495.999365
2019-12-22  2701.975519     19302.936445
2019-12-23  2997.630051     21391.090777
2019-12-24  2977.347247     21072.220129
2019-12-25  2893.576704     19770.681250
2019-12-26  3207.467022     22751.205447
2019-12-27  3539.618050     25744.075480
2019-12-28  3534.997476     27119.697589
2019-12-29  3527.721147     28894.626077
2019-12-30  3489.915430     30321.364425
2019-12-31  3287.543337     29665.558703
0
moys 3 oct. 2019 a las 15:23

Dado que sus datos están indexados en el tiempo, puede cambiar con freq

performance_df['LY Revenue'] = performance_df.Revenue.shift(freq='365d')

Salida:

            Revenue  LY Revenue
Date                           
2018-01-01 25891.8%        nan%
2018-01-02 25851.6%        nan%
2018-01-03 25037.7%        nan%
2018-01-04 26715.8%        nan%
2018-01-05 23988.4%        nan%
2018-01-06 19029.1%        nan%
2018-01-07 16935.5%        nan%
2018-01-08 22756.1%        nan%
2018-01-09 30385.7%        nan%
...
2019-12-21  2430.5%    20496.0%
2019-12-22  2702.0%    19302.9%
2019-12-23  2997.6%    21391.1%
2019-12-24  2977.3%    21072.2%
2019-12-25  2893.6%    19770.7%
2019-12-26  3207.5%    22751.2%
2019-12-27  3539.6%    25744.1%
2019-12-28  3535.0%    27119.7%
2019-12-29  3527.7%    28894.6%
2019-12-30  3489.9%    30321.4%
2019-12-31  3287.5%    29665.6%

Pero tenga en cuenta que 365D no es necesariamente un año en general.

0
Quang Hoang 3 oct. 2019 a las 15:17

Difícil de confirmar sin los datos, pero mis pensamientos están usando .loc

pdf = performance_df
pdf.loc['2018-xx-xx','Revenue'] - pdf.loc['2019-xx-xx','Revenue']

Puede ser capaz de usar simplemente:

pdf.loc['2018','Revenue'] - pdf.loc['2019','Revenue']

Por supuesto, también puede simplemente dividir los años en df y luego restar / dividir

pdf18 = pdf.loc['2018-01-01':'2019-01-01','Revenue'] 
pdf19 = pdf.loc['2019-01-01':'2020-01-01','Revenue']
pdf_diff = pdf18 - pdf19  
# = (pdf19-pdf18)/pdf18
#Or slice just year, then subtract pdf18['Revenue'] - pdf19['Revenue']

Esto puede tener problemas porque la forma [0] no coincidirá (no todo el año), así que asegúrese de que estén alineados.

0
Zach Oakes 3 oct. 2019 a las 15:28

Utilice Series.shift:

performance_df['LY Revenue']=performance_df['Revenue'].shift(365)
print(performance_df)


            Revenue  LY Revenue
Date                           
2018-01-01 25891.8%        nan%
2018-01-02 25851.6%        nan%
2018-01-03 25037.7%        nan%
2018-01-04 26715.8%        nan%
2018-01-05 23988.4%        nan%
...             ...         ...
2019-12-27  3539.6%    25744.1%
2019-12-28  3535.0%    27119.7%
2019-12-29  3527.7%    28894.6%
2019-12-30  3489.9%    30321.4%
2019-12-31  3287.5%    29665.6%

[730 rows x 2 columns]

Aquí puede ver el comienzo del año 2019:

print(performance_df[364:366])



            Revenue  LY Revenue
Date                           
2018-12-31 29665.6%        nan%
2019-01-01 28601.7%    25891.8%
1
ansev 3 oct. 2019 a las 15:26
58222015