ObiectumTokenizer

A small C library for tokenizing strings.

Features

Line and column numbers for every token.
Custom special characters - each one is always returned as a separate token, unless it appears inside a string.
Custom comments - both single-line and multi-line.
Strings - text wrapped in quotes is always returned as a single, monolithic token.
UTF-8 is the only supported encoding.

Example

obtokenizer_tokenizer_t tokenizer;
if (obtokenizer_init(&tokenizer, "abc /* comment №1 */ def,123 // №2") ||
    obtokenizer_add_spec_char(&tokenizer, ',')                   || // Count commas as separate tokens.
    obscanner_add_comment_mark(&tokenizer.scanner, false, "//")  || // Enable C-style single-line comments.
    obscanner_add_comment_mark(&tokenizer.scanner, true,  "/*")  || // Enable C-style multi-
    obscanner_add_comment_mark(&tokenizer.scanner, true,  "*/")     // line comments.
    ) {
    // One of the calls above failed; handle the error here.
}

obtokenizer_token_t token;
while (!obtokenizer_get(&tokenizer, &token)) {
    if (token.str[0] == '\0') {
        // No more tokens.
        obtokenizer_free_token(&token);
        break;
    }

    printf("%d:%d: %s\n", token.pos.line, token.pos.col, token.str);

    obtokenizer_free_token(&token); // Must be called before each reuse of a token structure.
}

Output:

$ ./test
1:1: abc
1:22: def
1:25: ,
1:26: 123
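
The columns in the output above count characters, not bytes: `№` takes three bytes in UTF-8, yet `def` is reported at column 22 rather than 24. A minimal sketch of that kind of counting is given below. `utf8_column` is a hypothetical helper written for illustration only; it is not part of ObiectumTokenizer, and the library's own implementation may work differently.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the library): return the 1-based column
 * of the byte at `byte_offset` in a single line of UTF-8 text, counting
 * code points instead of bytes.
 *
 * UTF-8 continuation bytes have the bit pattern 10xxxxxx, so every byte
 * that does NOT match that pattern starts a new code point.
 */
static int utf8_column(const char *line, size_t byte_offset)
{
    assert(line != NULL);

    int col = 1;
    for (size_t i = 0; i < byte_offset && line[i] != '\0'; i++) {
        unsigned char b = (unsigned char)line[i];
        if ((b & 0xC0) != 0x80) {
            col++; /* a code point starts before the target offset */
        }
    }
    return col;
}
```

With the example input, `def` starts at byte offset 23 (15 ASCII bytes, 3 bytes for `№`, then 5 more ASCII bytes), and `utf8_column(line, 23)` evaluates to 22, matching the `1:22: def` line in the output.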

rzhikharevich/obiectumtokenizer

Folders and files

Latest commit

History

Repository files navigation

ObiectumTokenizer

Features

Example

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages