Java - Files.readAllBytes vs Files.lines getting MalformedInputException
I would have thought that the following two approaches to reading a file should behave equally, but they don't. The second approach throws a MalformedInputException.
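import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class Test {
    public static void main(String[] args) {
        // First approach: read the whole file into a String
        try {
            String content = new String(Files.readAllBytes(Paths.get("_template.txt")));
            System.out.println(content);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // Second approach: stream the file line by line
        try (Stream<String> lines = Files.lines(Paths.get("_template.txt"))) {
            lines.forEach(System.out::println);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}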
This is the stack trace:
Exception in thread "main" java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
    at java.util.Iterator.forEachRemaining(Iterator.java:115)
    at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
    at Test.main(Test.java:19)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
    at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
    ... 4 more
What is the difference here, and how do I fix it?
This has to do with character encoding. Computers only deal with numbers. To store text, the characters in the text have to be converted to and from numbers, using some scheme. That scheme is called the character encoding. There are many different character encodings; some of the well-known standard character encodings are ASCII, ISO-8859-1 and UTF-8.
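To make that concrete, here is a small sketch showing that the same character turns into different bytes depending on the encoding. The class name EncodingDemo and the helper encode are made up for this illustration; only String.getBytes(Charset) and StandardCharsets come from the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
class EncodingDemo {
    // Encode text into bytes using the given character encoding.
    static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character 'e' with an acute accent

        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);

        // Java bytes are signed, so 0xE9 prints as -23, and 0xC3 0xA9 as -61, -87.
        System.out.println(Arrays.toString(latin1)); // [-23]
        System.out.println(Arrays.toString(utf8));   // [-61, -87]
    }
}
```

The single byte 0xE9 that ISO-8859-1 produces is not a valid UTF-8 sequence on its own, which is exactly the kind of input that trips up a strict UTF-8 decoder.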
In the first example, you read all the bytes (numbers) in the file and then convert them to characters by passing them to the constructor of class String. This will use the default character encoding of your system (whatever it is on your operating system) to convert the bytes to characters.
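As a rough sketch of why this approach does not throw: the String constructors decode leniently. The example below uses the String(byte[], Charset) overload, which is documented to replace malformed input with the charset's replacement string; for the overload without a charset, the Javadoc leaves the behaviour on invalid bytes unspecified, so treat that part as an assumption about typical JDK behaviour. LenientDecodeDemo and decodeLeniently are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class LenientDecodeDemo {
    // Decode bytes with an explicit charset; malformed input is replaced, not reported.
    static String decodeLeniently(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // This is the charset that new String(byte[]) would pick up implicitly.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // 0xE9 on its own is not valid UTF-8.
        byte[] bytes = {'a', (byte) 0xE9, 'b'};
        String decoded = decodeLeniently(bytes, StandardCharsets.UTF_8);

        // The invalid byte becomes U+FFFD (the replacement character) instead of an exception.
        System.out.println(decoded.equals("a\uFFFDb")); // true
    }
}
```

So even when the bytes do not match the charset, you get a String back, possibly with replacement characters in it, rather than a MalformedInputException.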
In the second example, where you use Files.lines(...), the UTF-8 character encoding will be used, according to the documentation. When a sequence of bytes is found in the file that is not a valid UTF-8 sequence, you'll get a MalformedInputException.
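The following sketch tries to reproduce this with a throwaway file. It writes a few ISO-8859-1 bytes to a temporary file, reads them once with Files.lines(path) and once with an explicit charset. The class FilesLinesDemo, its helper methods and the sample text are all assumptions made for the illustration; the real _template.txt may of course contain something else.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the original question.
class FilesLinesDemo {
    // Write "caf<e-acute>" plus a newline, encoded as ISO-8859-1, to a temp file.
    static Path writeLatin1Sample() {
        try {
            Path file = Files.createTempFile("latin1-sample", ".txt");
            Files.write(file, "caf\u00e9\n".getBytes(StandardCharsets.ISO_8859_1));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if Files.lines(file), which decodes as UTF-8, fails with MalformedInputException.
    static boolean failsAsUtf8(Path file) {
        try (Stream<String> lines = Files.lines(file)) {
            lines.forEach(line -> { }); // force the whole file to be decoded
            return false;
        } catch (UncheckedIOException e) {
            // The stream wraps the decoder's IOException in an UncheckedIOException.
            return e.getCause() instanceof MalformedInputException;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read all lines using an explicitly chosen charset.
    static List<String> readLines(Path file, Charset charset) {
        try (Stream<String> lines = Files.lines(file, charset)) {
            return lines.collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = writeLatin1Sample();
        try {
            System.out.println("Fails as UTF-8: " + failsAsUtf8(file));
            System.out.println(readLines(file, StandardCharsets.ISO_8859_1));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Note that the exception surfaces as an UncheckedIOException from inside the stream pipeline, which matches the stack trace in the question and is why a catch (IOException e) block does not catch it.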
The default character encoding of your system may or may not be UTF-8, so that can explain the difference in behaviour.
You'll have to find out what character encoding is used for the file, and then explicitly use that. For example:
String content = new String(Files.readAllBytes(Paths.get("_template.txt")), StandardCharsets.ISO_8859_1);
And for the second example:
Stream<String> lines = Files.lines(Paths.get("_template.txt"), StandardCharsets.ISO_8859_1);
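If you are not sure which encoding the file uses, one possible heuristic is to run the raw bytes through a strict decoder and see whether it complains. The sketch below does that with CharsetDecoder and CodingErrorAction.REPORT; CharsetCheck and isValidIn are hypothetical names, and the check is only a hint, not proof of the encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of the original answer.
class CharsetCheck {
    // True if the bytes decode in the given charset without malformed or unmappable input.
    static boolean isValidIn(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] latin1Bytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println(isValidIn(latin1Bytes, StandardCharsets.UTF_8)); // false
        System.out.println(isValidIn(utf8Bytes, StandardCharsets.UTF_8));   // true
    }
}
```

In practice you could pass Files.readAllBytes(Paths.get("_template.txt")) to isValidIn. Keep in mind that ISO-8859-1 assigns a character to every byte value, so a check against it always succeeds; the test is mainly useful for ruling UTF-8 in or out.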