Trie Parser

Last updated 6 months ago

Almost all of Pronghorn’s string manipulation utilizes a bitwise trie. It is called the TrieParser and is used in route matching, JSON parsing, HTTP header parsing, and much more.

Utilize the TrieParser for fast parsing and matching in your Pronghorn projects.

Example

// Define a tag. This number is an ID used to associate with certain matches. Make sure this is unique.
final int tagIndex = 120;
// Create a new TrieParser
TrieParser map = new TrieParser();
// Define your matches here
// Use %b for bytes and strings
map.setUTF8Value("<strong>%b</strong>", tagIndex);
// Create a new reader. Since we already know all of our content, make sure to set alwaysCompletePayloads to true
TrieParserReader reader = new TrieParserReader(true);
// Our example string. Since it has to be in bytes, use CharSequenceToUTF8Local to convert to a String.
CharSequenceToUTF8Local.get().convert("Wow, it is a beautiful day <strong>today</strong>!").parseSetup(reader);
// This will store the output
StringBuilder tag = new StringBuilder();
// Now do the actual parsing. Make sure we have content and check if we found our tagIndex
while (TrieParserReader.parseHasContent(reader)) {
long val = TrieParserReader.parseNext(reader, map); // The ID the parser found
// The parser found our index
if(val == tagIndex) {
// Capture it and output it to the StringBuilder
TrieParserReader.capturedFieldBytesAsUTF8(reader, 0, tag);
} else {
// Skip over anything else
TrieParserReader.parseSkipOne(reader);
}
}
System.out.println("This is what's inside the strong tag: " + tag);

Supported Types

Name

Description

Example

Optional Signed Integer

Optional signed int, if absent returns zero

%o

Signed Integer

Signed Integer (may be hex if starts with 0x)

%i

Unsigned Integer

Unsigned Integer (may be hex if starts with 0x)

%u

Signed Hex

Signed Integer (may skip prefix 0x, assumed to be hex)

%I

Unsigned Hex

Unsigned Integer (may skip prefix 0x, assumed to be hex)

%U

Decimal

If value is found after dot, capture it. Otherwise, a 0 is captured.

%i%.

Rational

If value is found after dot, capture it. Otherwise, a 0 is captured. Always after i, dot, or u.

%i%/

Bytes/String

Captures bytes and Strings

%b