Package jodd.io
Class UnicodeInputStream
- java.lang.Object
-
- java.io.InputStream
-
- jodd.io.UnicodeInputStream
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public class UnicodeInputStream extends java.io.InputStream
Unicode input stream for detecting UTF encodings and reading BOM characters. Detects the following BOMs:
- UTF-8
- UTF-16BE
- UTF-16LE
- UTF-32BE
- UTF-32LE
-
-
Field Summary
Fields
Modifier and Type    Field           Description
static byte[]        BOM_UTF16_BE
static byte[]        BOM_UTF16_LE
static byte[]        BOM_UTF32_BE
static byte[]        BOM_UTF32_LE
static byte[]        BOM_UTF8
static int           MAX_BOM_SIZE
-
Constructor Summary
Constructors
Constructor                                                                            Description
UnicodeInputStream(java.io.InputStream in, java.nio.charset.Charset targetEncoding)    Creates a new Unicode stream.
-
Method Summary
Modifier and Type           Method                   Description
void                        close()                  Closes the input stream.
int                         getBOMSize()             Returns the BOM size in bytes.
java.nio.charset.Charset    getDetectedEncoding()    Returns the detected UTF encoding, or null if no UTF encoding has been detected (i.e. no BOM).
protected void              init()                   Detects and decodes the encoding from the BOM character.
int                         read()                   Reads a byte from the stream.
-
-
-
Field Detail
-
MAX_BOM_SIZE
public static final int MAX_BOM_SIZE
- See Also:
- Constant Field Values
-
BOM_UTF32_BE
public static final byte[] BOM_UTF32_BE
-
BOM_UTF32_LE
public static final byte[] BOM_UTF32_LE
-
BOM_UTF8
public static final byte[] BOM_UTF8
-
BOM_UTF16_BE
public static final byte[] BOM_UTF16_BE
-
BOM_UTF16_LE
public static final byte[] BOM_UTF16_LE
-
-
Constructor Detail
-
UnicodeInputStream
public UnicodeInputStream(java.io.InputStream in, java.nio.charset.Charset targetEncoding)
Creates a new Unicode stream. It works in two modes: detect mode and read mode.
Detect mode is active when the target encoding is not specified. In detect mode, the stream tries to detect the encoding from the BOM, if one exists. If there is no BOM, no encoding is detected.
Read mode is active when the target encoding is set. The stream then reads the optional BOM for the given encoding. If there is no BOM, nothing is skipped.
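The read-mode behaviour described above (skip the BOM of a known encoding if present, otherwise skip nothing) can be illustrated with a minimal, standard-library-only sketch. This is not Jodd's implementation: the class ReadModeSketch and the helper skipOptionalBom are hypothetical names introduced for illustration, and the use of java.io.PushbackInputStream is an assumption about one reasonable way to restore bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.util.Arrays;

class ReadModeSketch {

    // UTF-8 byte order mark as defined by Unicode: EF BB BF.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /**
     * Hypothetical helper (not part of Jodd): consumes the given BOM if the
     * stream starts with it, otherwise pushes back everything that was read.
     * The stream needs a pushback buffer of at least bom.length bytes.
     * Returns the number of bytes skipped (0 when no BOM is present).
     */
    static int skipOptionalBom(PushbackInputStream in, byte[] bom) throws IOException {
        byte[] head = new byte[bom.length];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream is shorter than the BOM
            }
            n += r;
        }
        if (n == bom.length && Arrays.equals(head, bom)) {
            return n; // BOM found and consumed
        }
        if (n > 0) {
            in.unread(head, 0, n); // no BOM: restore the bytes, skip nothing
        }
        return 0;
    }

    public static void main(String[] args) throws IOException {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'o', 'k'};
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(withBom), UTF8_BOM.length);
        int skipped = skipOptionalBom(in, UTF8_BOM);
        System.out.println(skipped + " " + (char) in.read());
    }
}
```

In this sketch the caller already knows the encoding, so only that encoding's BOM is tested; a stream without a BOM is left untouched, matching the "nothing is skipped" rule.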
-
-
Method Detail
-
getDetectedEncoding
public java.nio.charset.Charset getDetectedEncoding()
Returns the detected UTF encoding, or null if no UTF encoding has been detected (i.e. no BOM). If the stream has not been read yet, it will be initialized first.
-
init
protected void init() throws java.io.IOException
Detects and decodes the encoding from the BOM character. Reads ahead four bytes and checks for BOM marks. Extra bytes are unread back to the stream, so only BOM bytes are skipped.
- Throws:
java.io.IOException
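The read-ahead-and-unread approach described for init() can be sketched with the standard library alone. This is an illustration under stated assumptions, not the Jodd source: the class BomSniffSketch, its NAMES and BOMS tables, and the detect method are hypothetical, and the BOM byte values are the ones defined by Unicode rather than values taken from this page. The UTF-32 marks are tested before the UTF-16 marks because the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;

class BomSniffSketch {

    // Longest BOM is four bytes (UTF-32), so four bytes are read ahead.
    static final int MAX_BOM_SIZE = 4;

    // Hypothetical lookup tables; the 4-byte marks come first so they win.
    static final String[] NAMES = {"UTF-32BE", "UTF-32LE", "UTF-8", "UTF-16BE", "UTF-16LE"};
    static final byte[][] BOMS = {
        {0x00, 0x00, (byte) 0xFE, (byte) 0xFF},
        {(byte) 0xFF, (byte) 0xFE, 0x00, 0x00},
        {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF},
        {(byte) 0xFE, (byte) 0xFF},
        {(byte) 0xFF, (byte) 0xFE},
    };

    /**
     * Reads up to MAX_BOM_SIZE bytes, matches them against the known BOMs,
     * and unreads everything that is not part of the BOM.
     * Returns the index into NAMES, or -1 if no BOM was found.
     */
    static int detect(PushbackInputStream in) throws IOException {
        byte[] head = new byte[MAX_BOM_SIZE];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // fewer than four bytes available
            }
            n += r;
        }
        int found = -1;
        for (int i = 0; i < BOMS.length && found < 0; i++) {
            if (startsWith(head, n, BOMS[i])) {
                found = i;
            }
        }
        int bomLen = found < 0 ? 0 : BOMS[found].length;
        if (n > bomLen) {
            in.unread(head, bomLen, n - bomLen); // only BOM bytes stay consumed
        }
        return found;
    }

    static boolean startsWith(byte[] data, int len, byte[] prefix) {
        if (len < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {(byte) 0xFF, (byte) 0xFE, 'a', 0x00};
        PushbackInputStream in =
                new PushbackInputStream(new ByteArrayInputStream(raw), MAX_BOM_SIZE);
        int idx = detect(in);
        System.out.println(idx < 0 ? "no BOM" : NAMES[idx]);
    }
}
```

The pushback buffer is sized to MAX_BOM_SIZE so that, in the worst case of no BOM at all, all four read-ahead bytes can be returned to the stream.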
-
close
public void close() throws java.io.IOException
Closes the input stream. If the stream was not used, the encoding will be unavailable.
- Specified by:
close in interface java.lang.AutoCloseable
- Specified by:
close in interface java.io.Closeable
- Overrides:
close in class java.io.InputStream
- Throws:
java.io.IOException
-
read
public int read() throws java.io.IOException
Reads a byte from the stream.
- Specified by:
read in class java.io.InputStream
- Throws:
java.io.IOException
-
getBOMSize
public int getBOMSize()
Returns the BOM size in bytes. Returns -1 if no BOM was found.
-
-