Normalize time series, reshape with R -
i have dataset, similar 1 below, , still want modify things before reshaping long , using time series. reduced version of data:
rem2008 <- rnorm(30, 10, 5) rem2009 <- rnorm(30, 8, 3) rem2010 <- rnorm(30, 23, 3) id <- sample( letters[1:30], 30) currency <- sample( letters[1:2], 30, replace= true) a2008 <- rnorm(30, 357, 5) a2009 <- rnorm(30, 357, 5) a2010 <- rnorm(30, 357, 5) b2008 <- rnorm(30, 1500, 5) b2009 <- rnorm(30, 1500, 5) b2010 <- rnorm(30, 1500, 5) data <- cbind(currency,id, rem2008, rem2009, rem2010, a2008, a2009, a2010, b2008, b2009, b2010)
first of want "normalize" exchanges rates , remittances (all variables starting with: a*, b* ,rem*,), , generate "treatment" column relevant exchange rate:
a2009 = a2009/2008 a2010 = a2010/2008 treat2008[currency==a] = a2008 treat2008[currency==b] = b2008
ideally, after melting end 4 columns: id, rem, year, currency, treat.
what best way of achieving this? can in stata couple of foreach loops , reshape long, directly r, instead of importing , exporting data time. should operations before reshaping, or there command used in long format trick?
i'm not 100% sure if i've got requirement correctly, let me know if want:
transform(with(list(w=c('rem','a','b')),reshape(data,dir='l',varying=lapply(w,function(pre) grep(paste0('^',pre,'\\d+$'),names(data))),v.names=w,timevar='year',times=c(2008,2009,2010))),treat=ifelse(currency=='a',a,b),a=null,b=null,id=null); ## currency id year rem treat ## 1.2008 j 2008 1.3812811 370.4443 ## 2.2008 x 2008 12.5763558 354.9506 ## 3.2008 m 2008 10.6965128 356.8754 ## 4.2008 <na> 2008 11.3767042 351.8991 ## 5.2008 2008 2.9932609 365.7032 ## 6.2008 b f 2008 8.8211956 1497.4942 ## 7.2008 b k 2008 5.5928628 1495.0110 ## 8.2008 b t 2008 10.7754368 1495.1801 ## 9.2008 <na> 2008 3.7113704 360.5038 ## 10.2008 w 2008 11.4948061 359.6662 ## 11.2008 2008 13.6957023 360.6984 ## 12.2008 p 2008 0.6494516 350.5083 ## 13.2008 y 2008 5.9079508 351.2960 ## 14.2008 c 2008 5.5337997 362.8112 ## 15.2008 v 2008 5.9769623 359.5147 ## 16.2008 h 2008 15.8466935 356.9736 ## 17.2008 b g 2008 11.4490855 1501.6727 ## 18.2008 u 2008 19.8888661 362.6460 ## 19.2008 b e 2008 4.3677369 1508.0391 ## 20.2008 b d 2008 18.5607637 1505.0454 ## 21.2008 r 2008 9.6847677 357.0053 ## 22.2008 <na> 2008 6.8128572 356.8416 ## 23.2008 q 2008 2.7850994 352.5719 ## 24.2008 n 2008 16.2758518 362.7819 ## 25.2008 b o 2008 8.9010772 1488.3458 ## 26.2008 b 2008 7.0311623 357.1095 ## 27.2008 z 2008 13.3984208 360.6529 ## 28.2008 l 2008 7.8227800 352.9813 ## 29.2008 <na> 2008 14.4264659 350.2792 ## 30.2008 s 2008 16.6254484 355.3690 ## 1.2009 j 2009 10.3847732 357.6774 ## 2.2009 x 2009 8.6973661 359.1131 ## 3.2009 m 2009 12.0156849 357.1485 ## 4.2009 <na> 2009 1.3379483 352.8424 ## 5.2009 2009 8.3602428 354.3114 ## 6.2009 b f 2009 6.4862632 1499.4552 ## 7.2009 b k 2009 8.1214409 1504.2514 ## 8.2009 b t 2009 10.4958449 1492.5220 ## 9.2009 <na> 2009 6.5139695 364.0226 ## 10.2009 w 2009 13.1565171 361.7706 ## 11.2009 2009 7.3096138 354.9056 ## 12.2009 p 2009 7.5492308 356.2868 ## 13.2009 y 2009 7.5033149 355.2381 ## 14.2009 c 2009 5.6782097 360.6068 ## 15.2009 v 2009 3.7370571 361.0948 ## 16.2009 h 2009 8.8469938 354.0782 ## 17.2009 b g 2009 8.1174960 1494.2991 ## 18.2009 u 2009 9.2063056 350.2426 ## 19.2009 b e 2009 8.0788092 1507.2134 ## 20.2009 b d 2009 6.6348056 1498.3001 ## 21.2009 r 2009 7.9650947 353.1720 ## 22.2009 <na> 2009 3.5795757 354.2822 ## 23.2009 q 2009 8.5598213 352.8076 ## 24.2009 n 2009 6.9101325 350.0534 ## 25.2009 b o 2009 8.7567846 1507.4454 ## 26.2009 b 2009 7.5298550 360.0886 ## 27.2009 z 2009 7.5084281 361.7335 ## 28.2009 l 2009 10.0847608 367.7977 ## 29.2009 <na> 2009 6.4676085 358.1437 ## 30.2009 s 2009 10.9973672 354.5373 ## 1.2010 j 2010 22.9083834 360.8241 ## 2.2010 x 2010 21.2021885 362.2936 ## 3.2010 m 2010 23.8105198 356.7900 ## 4.2010 <na> 2010 24.1970138 367.8831 ## 5.2010 2010 20.8117066 365.3983 ## 6.2010 b f 2010 24.1926874 1497.4581 ## 7.2010 b k 2010 18.4149211 1492.5715 ## 8.2010 b t 2010 18.6693716 1497.6254 ## 9.2010 <na> 2010 27.6525893 364.0436 ## 10.2010 w 2010 23.3530057 353.5919 ## 11.2010 2010 23.4712772 359.9814 ## 12.2010 p 2010 22.2459282 353.9465 ## 13.2010 y 2010 24.0533707 349.6421 ## 14.2010 c 2010 22.3999464 357.0695 ## 15.2010 v 2010 19.3420145 360.2639 ## 16.2010 h 2010 20.4927189 359.7127 ## 17.2010 b g 2010 22.6278613 1498.9522 ## 18.2010 u 2010 24.4046376 352.0437 ## 19.2010 b e 2010 19.8161101 1498.2641 ## 20.2010 b d 2010 28.4359775 1492.3291 ## 21.2010 r 2010 23.0095938 364.6818 ## 22.2010 <na> 2010 20.6857922 363.6210 ## 23.2010 q 2010 26.9920258 364.8050 ## 24.2010 n 2010 25.0578899 356.2132 ## 25.2010 b o 2010 22.1752929 1505.5753 ## 26.2010 b 2010 19.0098126 355.1049 ## 27.2010 z 2010 24.5348854 351.4599 ## 28.2010 l 2010 29.2015909 354.3519 ## 29.2010 <na> 2010 24.4550315 358.4514 ## 30.2010 s 2010 23.1435648 365.8588
and normalization requirement, if capture above result long
, can use following, although doesn't rows id=na (you're going have fill in missing key values data make sense), , reorders frame:
transform(merge(long,na.omit(subset(long,year==2008)[,c('id','currency','rem','treat')]),c('id','currency'),all.x=t),rem=rem.x/rem.y,treat=treat.x/treat.y,rem.x=null,rem.y=null,treat.x=null,treat.y=null); ## id currency year rem treat ## 1 2010 6.9528542 0.9991664 ## 2 2008 1.0000000 1.0000000 ## 3 2009 2.7930218 0.9688498 ## 4 b 2008 1.0000000 1.0000000 ## 5 b 2009 1.0709261 1.0083424 ## 6 b 2010 2.7036515 0.9943865 ## 7 c 2010 4.0478419 0.9841746 ## 8 c 2008 1.0000000 1.0000000 ## 9 c 2009 1.0260960 0.9939242 ## 10 d b 2008 1.0000000 1.0000000 ## 11 d b 2009 0.3574640 0.9955182 ## 12 d b 2010 1.5320478 0.9915509 ## 13 e b 2008 1.0000000 1.0000000 ## 14 e b 2009 1.8496556 0.9994525 ## 15 e b 2010 4.5369285 0.9935181 ## 16 f b 2008 1.0000000 1.0000000 ## 17 f b 2009 0.7353043 1.0013095 ## 18 f b 2010 2.7425633 0.9999759 ## 19 g b 2008 1.0000000 1.0000000 ## 20 g b 2009 0.7090082 0.9950897 ## 21 g b 2010 1.9763903 0.9981883 ## 22 h 2008 1.0000000 1.0000000 ## 23 h 2009 0.5582864 0.9918890 ## 24 h 2010 1.2931858 1.0076730 ## 25 2008 1.0000000 1.0000000 ## 26 2009 0.5337159 0.9839401 ## 27 2010 1.7137695 0.9980122 ## 28 j 2008 1.0000000 1.0000000 ## 29 j 2010 16.5848818 0.9740304 ## 30 j 2009 7.5182186 0.9655361 ## 31 k b 2008 1.0000000 1.0000000 ## 32 k b 2009 1.4521080 1.0061808 ## 33 k b 2010 3.2925752 0.9983682 ## 34 l 2008 1.0000000 1.0000000 ## 35 l 2009 1.2891531 1.0419753 ## 36 l 2010 3.7328917 1.0038831 ## 37 m 2008 1.0000000 1.0000000 ## 38 m 2009 1.1233273 1.0007651 ## 39 m 2010 2.2260077 0.9997607 ## 40 n 2008 1.0000000 1.0000000 ## 41 n 2009 0.4245635 0.9649144 ## 42 n 2010 1.5395747 0.9818937 ## 43 o b 2008 1.0000000 1.0000000 ## 44 o b 2009 0.9837893 1.0128328 ## 45 o b 2010 2.4913044 1.0115763 ## 46 p 2008 1.0000000 1.0000000 ## 47 p 2009 11.6240089 1.0164859 ## 48 p 2010 34.2534058 1.0098093 ## 49 q 2008 1.0000000 1.0000000 ## 50 q 2009 3.0734347 1.0006686 ## 51 q 2010 9.6915843 1.0346968 ## 52 r 2008 1.0000000 1.0000000 ## 53 r 2009 0.8224353 0.9892627 ## 54 r 2010 2.3758540 1.0215027 ## 55 s 2008 1.0000000 1.0000000 ## 56 s 2009 0.6614779 0.9976596 ## 57 s 2010 1.3920566 1.0295179 ## 58 t b 2008 1.0000000 1.0000000 ## 59 t b 2009 0.9740528 0.9982222 ## 60 t b 2010 1.7325861 1.0016354 ## 61 u 2009 0.4628874 0.9657975 ## 62 u 2008 1.0000000 1.0000000 ## 63 u 2010 1.2270502 0.9707642 ## 64 v 2008 1.0000000 1.0000000 ## 65 v 2009 0.6252435 1.0043951 ## 66 v 2010 3.2360945 1.0020838 ## 67 w 2008 1.0000000 1.0000000 ## 68 w 2009 1.1445619 1.0058508 ## 69 w 2010 2.0316137 0.9831111 ## 70 x 2008 1.0000000 1.0000000 ## 71 x 2009 0.6915649 1.0117269 ## 72 x 2010 1.6858770 1.0206874 ## 73 y 2009 1.2700368 1.0112218 ## 74 y 2008 1.0000000 1.0000000 ## 75 y 2010 4.0713559 0.9952919 ## 76 z 2008 1.0000000 1.0000000 ## 77 z 2009 0.5603965 1.0029960 ## 78 z 2010 1.8311774 0.9745101 ## 79 <na> 2008 na na ## 80 <na> 2008 na na ## 81 <na> 2008 na na ## 82 <na> 2008 na na ## 83 <na> 2009 na na ## 84 <na> 2009 na na ## 85 <na> 2009 na na ## 86 <na> 2009 na na ## 87 <na> 2010 na na ## 88 <na> 2010 na na ## 89 <na> 2010 na na ## 90 <na> 2010 na na
Comments
Post a Comment