python - Timedelta operation yields wrong results
I'm trying to add a column to a data frame that indicates the time difference between each row's index and a fixed timestamp. The data frame consists of a DatetimeIndex and string columns.
I use
d["diff"] = d.index-t0
to calculate said time difference. Due to prior filtering, the biggest possible diff value should be between 10 and 20s. However, the diffs come out just under a day (1-10s less), even though the actual difference is about 5s.
I read that a prior version of pandas had issues with this, but those are said to be long fixed.
My workaround is to copy the index, cast it to int64, cast t0 to int64, subtract t0 from the rows, and convert the diff column back to timedeltas, which seems extremely inefficient and ugly.
PS: This happens on OS X and Debian 8, both using pandas 0.16.0.
Edit: As requested, one sample:
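For reference, here is a minimal sketch of the direct subtraction on made-up data. The frame d, its label column, and the timestamps are assumptions for illustration, not the original data. It suggests the subtraction itself behaves as expected, so a multi-day diff points at a row that really is days away from t0:

```python
import pandas as pd

# Hypothetical stand-ins for the real data
t0 = pd.Timestamp("2013-12-12 13:50:48")
d = pd.DataFrame(
    {"label": ["a", "b", "c"]},
    index=pd.DatetimeIndex(
        ["2013-12-12 13:50:52", "2013-12-12 13:50:58", "2013-12-16 13:50:52"],
        name="timestamp",
    ),
)

# DatetimeIndex - Timestamp gives a TimedeltaIndex; no int64 casting needed
d["diff"] = d.index - t0

# Rows on the same day as t0 are seconds away; the last row is on another day
print(d["diff"])
```

The last row in this sketch is four days after t0, so its diff is large even though its time of day is only a few seconds later.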
2013-12-12 13:50:48  # t0
timestamp
2013-12-16 13:50:52   4 days 00:00:04
Name: diff, dtype: timedelta64[ns]
And as I just noticed, the date is totally off. I use indexer_between_time() to get the indices and only looked at the time, not the date, which is even more confusing.
indices = df.index.indexer_between_time(start_time=index, end_time=index + DateOffset(seconds=t_offset))
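A small sketch of what this call does with dates, using made-up timestamps (idx, start and end are hypothetical names, and the bounds are passed as plain time-of-day values here): indexer_between_time compares only the time of day, so an entry from a different date still matches.

```python
import pandas as pd

# Hypothetical index spanning several days
idx = pd.DatetimeIndex([
    "2013-12-12 13:50:50",  # same day as start, inside the window
    "2013-12-13 09:00:00",  # other day, outside the window
    "2013-12-16 13:50:52",  # four days later, but the same time of day
])

start = pd.Timestamp("2013-12-12 13:50:48")
end = start + pd.DateOffset(seconds=10)

# Only the time-of-day part of the bounds is used for matching
positions = idx.indexer_between_time(start_time=start.time(), end_time=end.time())
matched = idx[positions]
print(matched)
```

Here the entry from 2013-12-16 is returned along with the one from 2013-12-12, which would explain a diff of several days plus a few seconds.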
So the eventual cause of this is using between_time to find times in the desired range. Unfortunately, between_time doesn't find times in a range; it finds times matching the same hours of the day, regardless of the day (I have made the same mistake before). To find times in a specific range, you can do:
end_time = index + DateOffset(seconds=t_offset)
df.loc[index:end_time].index
This works as long as the DatetimeIndex is monotonic/sorted; if not, you may want to sort it first.
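Put together, a rough sketch with invented data (df, the label column, start and the 10 second offset are all assumptions): sort the index first, then slice by label with .loc so that both date and time are taken into account.

```python
import pandas as pd

# Hypothetical frame whose index is deliberately out of order
df = pd.DataFrame(
    {"label": ["late", "near", "start"]},
    index=pd.DatetimeIndex([
        "2013-12-16 13:50:52",
        "2013-12-12 13:50:52",
        "2013-12-12 13:50:48",
    ]),
)

# Label slicing over a time range needs a sorted index
df = df.sort_index()

start = pd.Timestamp("2013-12-12 13:50:48")
end_time = start + pd.DateOffset(seconds=10)

# Selects rows whose full timestamp lies in [start, end_time]
in_range = df.loc[start:end_time]
diff = in_range.index - start
print(in_range)
print(diff)
```

The row from 2013-12-16 is left out this time, so every diff stays within the 10 second window.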