Murmur3 hash: different results between Python and Java implementations
I have two different programs that need to hash the same string using Murmur3, in Python and Java respectively.
Python version 2.7.9:
mmh3.hash128('abc')
gives 79267961763742113019008347020647561319L.
Java, Guava 18.0:
HashCode hashCode = Hashing.murmur3_128().newHasher().putString("abc", StandardCharsets.UTF_8).hash();
gives the string "6778ad3f3f3f96b4522dca264174a23b", which, converted to a BigInteger, gives 137537073056680613988840834069010096699.
How can I get the same result from both?
Thanks.
Here's how to get the same result from both:
// UTF_8 is StandardCharsets.UTF_8 (static import); Bytes and Lists are Guava classes.
byte[] mm3_le = Hashing.murmur3_128().hashString("abc", UTF_8).asBytes();
byte[] mm3_be = Bytes.toArray(Lists.reverse(Bytes.asList(mm3_le)));
assertEquals("79267961763742113019008347020647561319", new BigInteger(mm3_be).toString());
The hash code's bytes need to be treated as little endian, but BigInteger interprets bytes as big endian. You were presumably using new BigInteger(hex, 16) to create your BigInteger, but the output of HashCode.toString() is actually a series of pairs of hexadecimal digits representing the hash bytes in the same order they're returned by asBytes() (little endian). (You can also reverse the pairs of hexadecimal digits to get a hex number that will produce the same result when passed to new BigInteger(reversedHex, 16).)
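The hex-pair reversal described above can be sketched with the standard library alone. This is a minimal illustration, not part of Guava: the class name LittleEndianHex and both method names are hypothetical, and the input is assumed to be a hex string with two digits per byte, as HashCode.toString() produces.

```java
import java.math.BigInteger;

public class LittleEndianHex {

    // Reverses the byte order of a hex string, keeping each two-digit
    // pair (one byte) intact: "0102ff" becomes "ff0201".
    static String reverseHexPairs(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        StringBuilder reversed = new StringBuilder(hex.length());
        for (int i = hex.length() - 2; i >= 0; i -= 2) {
            reversed.append(hex, i, i + 2);
        }
        return reversed.toString();
    }

    // Interprets a little-endian hex string as a non-negative integer.
    static BigInteger littleEndianHexToBigInteger(String hex) {
        return new BigInteger(reverseHexPairs(hex), 16);
    }

    public static void main(String[] args) {
        System.out.println(reverseHexPairs("0102ff"));           // ff0201
        System.out.println(littleEndianHexToBigInteger("0102")); // 513
    }
}
```

Note that new BigInteger(String, 16) never yields a negative value for unsigned hex digits, whereas new BigInteger(byte[]) reads the array as two's complement, so a hash whose most significant byte is 0x80 or above would come out negative there. new BigInteger(1, bytes) avoids that if an unsigned value is what the Python side returns.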
I think the documentation of toString() is a little confusing because of the way it refers to "big endian"; it doesn't mean that the output of the method is the hexadecimal number representing the bytes interpreted as big endian.
We have an open issue for adding asBigInteger() to HashCode.