python - Timedelta operation yields wrong results


I'm trying to add a column to a data frame that indicates the time difference between each row's index and a fixed timestamp. The data frame consists of a DatetimeIndex and string columns.

I use

 d["diff"] = d.index-t0 

to calculate said time difference. Due to prior filtering, the biggest possible diff value should be between 10 and 20s. However, the diffs all come out just under a day (1-10s less), even though the actual difference is 5s.
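For reference, here is a minimal sketch of what I expect the subtraction to do. The timestamps and the `label` column are made up for illustration and are not my real data:

```python
import pandas as pd

# Hypothetical sample data: three rows a few seconds after t0.
t0 = pd.Timestamp("2013-12-12 13:50:48")
idx = pd.DatetimeIndex([
    "2013-12-12 13:50:50",
    "2013-12-12 13:50:52",
    "2013-12-12 13:50:58",
])
d = pd.DataFrame({"label": ["a", "b", "c"]}, index=idx)

# Subtracting a Timestamp from a DatetimeIndex gives one timedelta per row.
d["diff"] = d.index - t0

# Convert to plain seconds to make the values easy to check.
diff_seconds = d["diff"].dt.total_seconds().tolist()
print(d)
print(diff_seconds)
```

With rows that really are on the same date as `t0`, the diffs are just a few seconds, so a diff of days points at the rows rather than at the subtraction.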

I read that a prior version of pandas had issues with this, but those are said to be long fixed.

My workaround is to copy the index, cast it to int64, cast t0 to int64, subtract t0 from all rows, and convert the diff column back to timedeltas, which seems extremely inefficient and ugly.
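Roughly, the workaround looks like the sketch below. The data and the column names `diff_workaround` and `diff_direct` are hypothetical, and going through NumPy for the int64 conversion is just one way to do the cast, not necessarily the best one:

```python
import pandas as pd

# Hypothetical sample data.
t0 = pd.Timestamp("2013-12-12 13:50:48")
idx = pd.DatetimeIndex(["2013-12-12 13:50:50", "2013-12-12 13:50:53"])
d = pd.DataFrame({"label": ["a", "b"]}, index=idx)

# Workaround: convert both sides to int64 nanoseconds since the epoch.
index_ns = d.index.values.astype("datetime64[ns]").astype("int64")
t0_ns = t0.value  # Timestamp.value is nanoseconds since the epoch

# Subtract the integers and convert the result back to timedeltas.
d["diff_workaround"] = pd.to_timedelta(index_ns - t0_ns, unit="ns")

# The direct subtraction, for comparison.
d["diff_direct"] = d.index - t0
same = bool((d["diff_workaround"] == d["diff_direct"]).all())
print(d)
print(same)
```

Both columns agree on this sample, so the detour through int64 should not be needed.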

PS: This happens on OS X and Debian 8, both using pandas 0.16.0.

Edit: As requested, one sample:

 2013-12-12 13:50:48 # t0
 timestamp
 2013-12-16 13:50:52   4 days 00:00:04
 Name: diff, dtype: timedelta64[ns]

And as I just noticed, the date is totally off: I use indexer_between_time() to get the indices, and I only looked at the time, not the date. That makes it even more confusing.

 indices = df.index.indexer_between_time(start_time=index, end_time=index + DateOffset(seconds=t_offset))

So the eventual cause was using between_time to find the times in the desired range. Unfortunately, between_time doesn't find times in a range; it finds times matching the same hours of the day, regardless of the day (I have made the same mistake before). To find times in a specific range, you can do:

 end_time = index + DateOffset(seconds=t_offset)
 df.loc[index:end_time]  # label-based slice: rows with timestamps from index to end_time

This works as long as the DatetimeIndex is monotonic/sorted; if it is not, you may want to sort it first.
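To illustrate the difference, here is a minimal sketch with made-up data. The frame, the `label` column, and the names `start` and `end` are hypothetical stand-ins for the question's `index` and `end_time`:

```python
import pandas as pd

# Hypothetical data: one row shares the time of day with the others
# but falls on a different date. The index is deliberately unsorted.
idx = pd.DatetimeIndex([
    "2013-12-16 13:50:52",
    "2013-12-12 13:50:50",
    "2013-12-12 13:50:53",
    "2013-12-12 14:30:00",
])
df = pd.DataFrame({"label": ["d", "a", "b", "c"]}, index=idx)

start = pd.Timestamp("2013-12-12 13:50:48")
end = start + pd.DateOffset(seconds=10)

# indexer_between_time compares the time of day only, so the row from
# 2013-12-16 is matched as well.
positions = df.index.indexer_between_time(start.time(), end.time())
by_time_of_day = df.iloc[positions]

# A label-based slice on the sorted index respects the full timestamp.
in_range = df.sort_index().loc[start:end]

print(by_time_of_day)
print(in_range)
```

Only the slice excludes the row from four days later, which is the row that produced the `4 days 00:00:04` diff in the question's sample.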

