regex - How to get rid of weird characters in a Python string?


I have lines that contain pesky control characters.


When I read the file and tried str.replace(), these control characters were not replaced. I've tried the following, but they are still sticking around:

import io

# Read the file as UTF-8 and try to swap each stray character for plain ASCII.
with io.open('infile', 'r', encoding='utf8') as fin:
    for line in fin:
        line = line.replace(u'\u0094', '"').replace(u'\u0093', '"').replace(u'\u0092', "'").replace(u'\u0096', '"').replace(u'\u0084', '"')

How can these characters be replaced? Is there a canonical way to replace them (they are quotation marks and whitespace of various kinds)?
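One possible approach, as a minimal sketch rather than a canonical answer, is to put all the replacements in a single table and apply it with `str.translate`. The names `C1_TO_ASCII` and `clean_line` are hypothetical. The choice of ASCII replacement for each code point is an assumption: in particular, U+0096 is mapped to a hyphen here because byte 0x96 is an en dash in Windows-1252, whereas the attempt above maps it to a double quote.

```python
# Hypothetical table: code point -> ASCII replacement (targets are assumptions).
C1_TO_ASCII = {
    0x0084: u'"',   # 0x84 is a low double quote in Windows-1252
    0x0092: u"'",   # 0x92 is a right single quote in Windows-1252
    0x0093: u'"',   # 0x93 is a left double quote in Windows-1252
    0x0094: u'"',   # 0x94 is a right double quote in Windows-1252
    0x0096: u'-',   # 0x96 is an en dash in Windows-1252
}


def clean_line(line):
    """Return line with every code point listed in the table replaced."""
    return line.translate(C1_TO_ASCII)
```

Calling `clean_line(line)` inside the read loop would replace all listed characters in one pass. Characters that are not in the table are left unchanged.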

What are these characters anyway? For example, u'\u0084'?
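One way to find out what such a character is, sketched here with the standard `unicodedata` module, is to print its code point, category and name. Code points U+0080 to U+009F are C1 control characters (category `Cc`) and have no Unicode name. They often appear when Windows-1252 bytes were decoded as Latin-1, so the second helper tries that reinterpretation. Whether that is what happened to this particular file is an assumption, and the helper names `describe` and `reinterpret_as_cp1252` are hypothetical.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode category, name) for a single character."""
    return (
        u'U+%04X' % ord(ch),
        unicodedata.category(ch),
        unicodedata.name(ch, u'<unnamed>'),  # control characters have no name
    )


def reinterpret_as_cp1252(ch):
    """Assumption: ch came from a Windows-1252 byte misread as Latin-1."""
    return ch.encode('latin-1').decode('cp1252')


print(describe(u'\u0084'))
print(describe(reinterpret_as_cp1252(u'\u0084')))
```

Under that assumption, u'\u0084' corresponds to U+201E (DOUBLE LOW-9 QUOTATION MARK), which would fit the observation that the characters are quotation marks of various kinds.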

The last time I had this problem, it happened because I was getting characters outside the ASCII range and had the wrong bounds.

