python - Subsetting a Pandas series
I have a pandas Series: one specific row of a pandas DataFrame.
Name: ny.gdp.pcap.kd.zg, dtype: int64
ny.gdp.defl.zs_logdiff        0.341671
ny.gdp.disc.cn                0.078261
ny.gdp.disc.kn                0.083890
ny.gdp.frst.rt.zs             0.296574
ny.gdp.minr.rt.zs             0.264811
ny.gdp.mktp.cd_logdiff        0.522725
ny.gdp.mktp.cn_logdiff        0.884601
ny.gdp.mktp.kd_logdiff        0.990679
ny.gdp.mktp.kd.zg             0.992603
ny.gdp.mktp.kn_logdiff       -0.077253
ny.gdp.mktp.pp.cd_logdiff     0.856861
ny.gdp.mktp.pp.kd_logdiff     0.990679
ny.gdp.ngas.rt.zs            -0.018126
ny.gdp.pcap.cd_logdiff        0.523433
ny.gdp.pcap.kd.zg             1.000000
ny.gdp.pcap.kn_logdiff        0.999456
ny.gdp.pcap.pp.cd_logdff      0.857321
ny.gdp.pcap.pp.kd_logdiff     0.999456
The first column is the index of the series. I want the index names, as a list, for which the absolute value in the right column is less than 0.5. To give some context: this series is the row of a correlation matrix corresponding to the variable ny.gdp.pcap.kd.zg. I want to retain that variable along with the variables whose correlation with it is less than 0.5, and drop the rest of the variables from the DataFrame.
Currently I have the following, but it keeps the non-matching entries as NaN:
print(tourism[columns].corr().iloc[14].where(np.absolute(tourism[columns].corr().iloc[14]) < 0.5))
where tourism is the DataFrame, columns is the set of columns on which I did the correlation analysis, and 14 is the row in the correlation matrix corresponding to the column mentioned above.
This gives:
ny.gdp.defl.zs_logdiff        0.341671
ny.gdp.disc.cn                0.078261
ny.gdp.disc.kn                0.083890
ny.gdp.frst.rt.zs             0.296574
ny.gdp.minr.rt.zs             0.264811
ny.gdp.mktp.cd_logdiff             NaN
ny.gdp.mktp.cn_logdiff             NaN
ny.gdp.mktp.kd_logdiff             NaN
ny.gdp.mktp.kd.zg                  NaN
ny.gdp.mktp.kn_logdiff       -0.077253
ny.gdp.mktp.pp.cd_logdiff          NaN
ny.gdp.mktp.pp.kd_logdiff          NaN
ny.gdp.ngas.rt.zs            -0.018126
ny.gdp.pcap.cd_logdiff             NaN
ny.gdp.pcap.kd.zg                  NaN
ny.gdp.pcap.kn_logdiff             NaN
ny.gdp.pcap.pp.cd_logdff           NaN
ny.gdp.pcap.pp.kd_logdiff          NaN
Name: ny.gdp.pcap.kd.zg, dtype: float64
If x is your series, then:
x[x.abs() < 0.5].index
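To get from that index to the reduced DataFrame the question asks for, something like the following might work. It is only a sketch on made-up data: the series x, the DataFrame df, and the column names a, b, c, d and target are hypothetical stand-ins for the correlation row and the tourism frame, not the real data. It also shows why the original attempt kept NaN: Series.where preserves the shape and masks non-matching entries, while boolean indexing drops them.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one row of a correlation matrix.
x = pd.Series(
    {"a": 0.34, "b": -0.08, "c": 0.99, "target": 1.00, "d": -0.72},
    name="target",
)

# Boolean indexing keeps only the entries whose absolute value is below 0.5.
low_names = x[x.abs() < 0.5].index.tolist()

# Series.where keeps every row and fills non-matches with NaN,
# so it needs a dropna() to give the same result.
same = x.where(x.abs() < 0.5).dropna()

# Retain the target variable plus the weakly correlated ones.
keep = ["target"] + low_names

# Hypothetical DataFrame with one column per variable.
df = pd.DataFrame(np.zeros((3, 5)), columns=["a", "b", "c", "target", "d"])
reduced = df[keep]

print(low_names)                 # ['a', 'b']
print(reduced.columns.tolist())  # ['target', 'a', 'b']
```

Note that d is dropped even though -0.72 is less than 0.5, because the comparison is on the absolute value; comparing the raw values would have kept it.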