public final class NonStrictUTF8Encoding extends UnicodeEncoding
| Modifier and Type | Field and Description |
|---|---|
| protected static CaseFoldCodeItem[] | EMPTY_FOLD_CODES |
| static NonStrictUTF8Encoding | INSTANCE |
| Modifier | Constructor and Description |
|---|---|
| protected | NonStrictUTF8Encoding() |
| Modifier and Type | Method and Description |
|---|---|
| protected void | asciiApplyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, Object arg) |
| protected CaseFoldCodeItem[] | asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end) |
| protected int | asciiMbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] lower) |
| int | codeToMbc(int code, byte[] bytes, int p): Extracts code point into its multibyte representation |
| int | codeToMbcLength(int code): Returns character length given a code point. Oniguruma equivalent: code_to_mbclen |
| int[] | ctypeCodeRange(int ctype, IntHolder sbOut): utf8_get_ctype_code_range |
| String | getCharsetName() |
| boolean | isCodeCType(int code, int ctype): Checks whether the given code is of the given character type |
| protected boolean | isCodeCTypeInternal(int code, int ctype): ONIGENC_IS_XXXXXX_CODE_CTYPE |
| boolean | isNewLine(byte[] bytes, int p, int end): onigenc_is_mbc_newline_0x0a / used also by multibyte encodings |
| boolean | isReverseMatchAllowed(byte[] bytes, int p, int end): onigenc_always_true_is_allowed_reverse_match |
| int | leftAdjustCharHead(byte[] bytes, int p, int s, int end): utf8_left_adjust_char_head |
| int | length(byte[] bytes, int p, int end): Returns the character length given the stream, the character position and the stream end. Returns 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or otherwise the missing characters in the stream |
| int | mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold): onigenc_ascii_mbc_case_fold |
| int | mbcToCode(byte[] bytes, int p, int end): Returns code point for a character. Oniguruma equivalent: mbc_to_code |
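The summaries of codeToMbcLength, codeToMbc and mbcToCode can be illustrated with ordinary UTF-8 arithmetic. The sketch below is standalone Java and does not call this class. The class name `Utf8CodecSketch` and its method names are hypothetical. It assumes standard UTF-8 byte layouts, and makes no attempt to reproduce whatever leniency the "non-strict" variant applies to malformed input.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative only: standard UTF-8 arithmetic, not the library's implementation. */
final class Utf8CodecSketch {

    /** Number of bytes needed for a code point (compare codeToMbcLength). */
    static int encodedLength(int code) {
        if (code < 0) throw new IllegalArgumentException("negative code point");
        if (code < 0x80) return 1;
        if (code < 0x800) return 2;
        if (code < 0x10000) return 3;
        if (code <= 0x10FFFF) return 4;
        throw new IllegalArgumentException("beyond U+10FFFF: " + code);
    }

    /** Writes the code point at bytes[p..] and returns the byte count (compare codeToMbc). */
    static int encode(int code, byte[] bytes, int p) {
        int len = encodedLength(code);
        switch (len) {
            case 1:
                bytes[p] = (byte) code;
                break;
            case 2:
                bytes[p] = (byte) (0xC0 | (code >>> 6));
                bytes[p + 1] = (byte) (0x80 | (code & 0x3F));
                break;
            case 3:
                bytes[p] = (byte) (0xE0 | (code >>> 12));
                bytes[p + 1] = (byte) (0x80 | ((code >>> 6) & 0x3F));
                bytes[p + 2] = (byte) (0x80 | (code & 0x3F));
                break;
            default:
                bytes[p] = (byte) (0xF0 | (code >>> 18));
                bytes[p + 1] = (byte) (0x80 | ((code >>> 12) & 0x3F));
                bytes[p + 2] = (byte) (0x80 | ((code >>> 6) & 0x3F));
                bytes[p + 3] = (byte) (0x80 | (code & 0x3F));
        }
        return len;
    }

    /** Reads one code point starting at bytes[p], not reading past end (compare mbcToCode). */
    static int decode(byte[] bytes, int p, int end) {
        int b0 = bytes[p] & 0xFF;
        if (b0 < 0x80) return b0;
        if (b0 < 0xC0) throw new IllegalArgumentException("p is not at a character head");
        int len = b0 < 0xE0 ? 2 : b0 < 0xF0 ? 3 : 4;
        if (p + len > end) throw new IllegalArgumentException("truncated sequence");
        // Keep only the payload bits of the lead byte, then append 6 bits per trailing byte.
        int code = b0 & (0xFF >>> (len + 1));
        for (int i = 1; i < len; i++) {
            code = (code << 6) | (bytes[p + i] & 0x3F);
        }
        return code;
    }

    public static void main(String[] args) {
        byte[] buf = new byte[4];
        int n = encode(0x20AC, buf, 0); // U+20AC EURO SIGN
        byte[] expected = "\u20AC".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(Arrays.copyOf(buf, n), expected));
        System.out.println(Integer.toHexString(decode(buf, 0, n)));
    }
}
```

The decoder here rejects truncated input and stray continuation bytes by throwing, which is a choice made for the sketch. The length summary above describes reporting missing characters through the return value instead.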
Inherited methods:
applyAllCaseFold, caseFoldCodesByString, ctypeCodeRange, propertyNameToCType
length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoFourGreatedThan127, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
asciiToLower, asciiToUpper, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isSbWord, isSingleByte, isSpace, isUnicode, isUpper, isUTF8, isWord, isWordGraphPrint, isXDigit, load, maxLength, maxLengthDistance, mbcodeStartPosition, minLength, odigitVal, prevCharHead, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setDummy, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
public static final NonStrictUTF8Encoding INSTANCE
protected static final CaseFoldCodeItem[] EMPTY_FOLD_CODES
public int length(byte[] bytes,
int p,
int end)
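Description copied from class: Encoding
Returns the character length given the stream, the character position and the stream end
Returns: 1 for single-byte encodings; for multibyte encodings, performs sanity validations and returns the character length, or otherwise the missing characters in the stream
public boolean isCodeCType(int code,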
int ctype)
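Description copied from class: Encoding
Checks whether the given code is of the given character type
isCodeCType in class UnicodeEncoding
Parameters:
code - a code point of a character
ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
public String getCharsetName()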
getCharsetName in class UnicodeEncoding
public boolean isNewLine(byte[] bytes,
int p,
int end)
public int codeToMbcLength(int code)
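Description copied from class: Encoding
Returns character length given a code point
Oniguruma equivalent: code_to_mbclen
codeToMbcLength in class Encoding
public int mbcToCode(byte[] bytes,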
int p,
int end)
Description copied from class: Encoding
Returns code point for a character
Oniguruma equivalent: mbc_to_code
public int codeToMbc(int code,
byte[] bytes,
int p)
Description copied from class: Encoding
Extracts code point into its multibyte representation
public int mbcCaseFold(int flag,
byte[] bytes,
IntHolder pp,
int end,
byte[] fold)
mbcCaseFold in class UnicodeEncoding
Parameters:
flag - case fold flag
pp - an IntHolder that points at character head
fold - a buffer where to extract case folded character
Oniguruma equivalent: mbc_case_fold
public int[] ctypeCodeRange(int ctype,
IntHolder sbOut)
ctypeCodeRange in class Encoding
public int leftAdjustCharHead(byte[] bytes,
int p,
int s,
int end)
leftAdjustCharHead in class Encoding
Parameters:
bytes - byte stream
p - position
s - stop
end - end
public boolean isReverseMatchAllowed(byte[] bytes,
int p,
int end)
isReverseMatchAllowed in class Encoding
protected final boolean isCodeCTypeInternal(int code,
int ctype)
protected final int asciiMbcCaseFold(int flag,
byte[] bytes,
IntHolder pp,
int end,
byte[] lower)
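This page gives no description for asciiMbcCaseFold, so the following is only one plausible reading of its signature: fold a single ASCII byte to lower case into the output buffer, advance the position holder by one, and return the number of bytes written. `AsciiFoldSketch` and `IntBox` are hypothetical names. `IntBox` stands in for the IntHolder type named in the signatures, whose fields are not shown here.

```java
/** Illustrative only: an assumed ASCII case fold, not the library's implementation. */
final class AsciiFoldSketch {

    /** Hypothetical mutable int wrapper standing in for IntHolder. */
    static final class IntBox {
        int value;

        IntBox(int value) {
            this.value = value;
        }
    }

    /**
     * Folds the byte at bytes[pp.value] to lower case if it is an ASCII upper-case
     * letter, stores it in lower[0], advances pp by one, and returns 1.
     */
    static int asciiFold(byte[] bytes, IntBox pp, byte[] lower) {
        int b = bytes[pp.value] & 0xFF;
        boolean isUpper = b >= 'A' && b <= 'Z';
        lower[0] = (byte) (isUpper ? b + ('a' - 'A') : b);
        pp.value++;
        return 1;
    }

    public static void main(String[] args) {
        byte[] input = {'Q', '7'};
        byte[] out = new byte[1];
        IntBox pos = new IntBox(0);
        asciiFold(input, pos, out);
        System.out.println((char) out[0] + " next position " + pos.value);
    }
}
```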
protected final void asciiApplyAllCaseFold(int flag,
ApplyAllCaseFoldFunction fun,
Object arg)
protected final CaseFoldCodeItem[] asciiCaseFoldCodesByString(int flag, byte[] bytes, int p, int end)
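For leftAdjustCharHead (utf8_left_adjust_char_head), the usual UTF-8 technique is to step backwards over continuation bytes of the form 10xxxxxx until a lead byte is reached. The sketch below shows that idea in standalone Java. The names `CharHeadSketch`, `start` and `pos` are hypothetical, and how they map onto the p and s parameters documented above is an assumption left open here.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: backing up to a UTF-8 character head, not the library's implementation. */
final class CharHeadSketch {

    /** True for UTF-8 continuation bytes, which have the bit pattern 10xxxxxx. */
    static boolean isContinuation(byte b) {
        return (b & 0xC0) == 0x80;
    }

    /**
     * Returns the index of the character head at or before pos, never moving
     * before start. pos must be a valid index into bytes.
     */
    static int leftAdjust(byte[] bytes, int start, int pos) {
        if (pos <= start) return pos;
        int i = pos;
        while (i > start && isContinuation(bytes[i])) {
            i--;
        }
        return i;
    }

    public static void main(String[] args) {
        // 'a', then U+20AC encoded as three bytes, then 'b'.
        byte[] bytes = "a\u20ACb".getBytes(StandardCharsets.UTF_8);
        for (int pos = 0; pos < bytes.length; pos++) {
            System.out.println(pos + " -> " + leftAdjust(bytes, 0, pos));
        }
    }
}
```

A regex engine that scans bytes rather than characters needs this kind of adjustment so that a candidate match position never lands in the middle of a multibyte character.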
Copyright © 2016. All Rights Reserved.