Command Line Interface

Graph Transliterator has a simple command line interface with seven commands: dump, dump-tests, generate-tests, list-bundled, make-json, test, and transliterate.

$ graphtransliterator --help
Usage: main [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  dump            Dump transliterator as JSON.
  dump-tests      Dump BUNDLED tests.
  generate-tests  Generate tests as YAML.
  list-bundled    List BUNDLED transliterators.
  make-json       Make JSON rules of BUNDLED transliterator(s).
  test            Test BUNDLED transliterator.
  transliterate   Transliterate INPUT.

Dump

The dump command will output the specified transliterator as JSON:

$ graphtransliterator dump --help
Usage: dump [OPTIONS]

  Dump transliterator as JSON.

Options:
  -f, --from <CHOICE TEXT>...     Format (bundled/yaml_file) and source (name or
                                  filename) of transliterator  [required]
  -ca, --check-ambiguity / -nca, --no-check-ambiguity
                                  Check for ambiguity.  [default: no-check-
                                  ambiguity]
  -cl, --compression-level INTEGER
                                  Compression level, from 0 to 2  [default: 2]
  --help                          Show this message and exit.

It requires a --from or -f option with two arguments. The first argument specifies the format of the transliterator (bundled or yaml_file) and the second a parameter for that format (the name of the bundled transliterator or the name of a YAML file).

To load a bundled transliterator, use bundled as the first argument and give its (class) name, which will be in CamelCase, as the second:

$ graphtransliterator dump --from bundled Example
{"graphtransliterator_version":"1.2.4","compressed_settings":[["consonant","vowel","whitespace"],[" ","a","b"],[[2],[1],[0]],[["!B!",[0],[1],[2],[1],[0],-5],["A",0,0,[1],0,0,-1],["B",0,0,[2],0,0,-1],[" ",0,0,[0],0,0,-1]],[" ","whitespace",0],[[[1],[1],","]],{"name":"example","version":"1.0.0","description":"An Example Bundled Transliterator","url":"https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample","author":"Author McAuthorson","author_email":"author_mcauthorson@msu.edu","license":"MIT License","keywords":["example"],"project_urls":{"Documentation":"https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example","Source":"https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example","Tracker":"https://github.com/seanpue/graphtransliterator/issues"}},null]}

To load from a YAML file, give yaml_file as the first argument and the name of the file as the second:

$ graphtransliterator dump --from yaml_file ../graphtransliterator/transliterators/example/example.yaml
{"graphtransliterator_version":"1.2.4","compressed_settings":[["consonant","vowel","whitespace"],[" ","a","b"],[[2],[1],[0]],[["!B!",[0],[1],[2],[1],[0],-5],["A",0,0,[1],0,0,-1],["B",0,0,[2],0,0,-1],[" ",0,0,[0],0,0,-1]],[" ","whitespace",0],[[[1],[1],","]],{"name":"example","version":"1.0.0","description":"An Example Bundled Transliterator","url":"https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample","author":"Author McAuthorson","author_email":"author_mcauthorson@msu.edu","license":"MIT License","keywords":["example"],"project_urls":{"Documentation":"https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example","Source":"https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example","Tracker":"https://github.com/seanpue/graphtransliterator/issues"}},null]}

If you want to check for ambiguity in the transliterator before the dump, use the --check-ambiguity or -ca option:

$ graphtransliterator dump --from bundled Example --check-ambiguity
{"graphtransliterator_version":"1.2.4","compressed_settings":[["consonant","vowel","whitespace"],[" ","a","b"],[[2],[1],[0]],[["!B!",[0],[1],[2],[1],[0],-5],["A",0,0,[1],0,0,-1],["B",0,0,[2],0,0,-1],[" ",0,0,[0],0,0,-1]],[" ","whitespace",0],[[[1],[1],","]],{"name":"example","version":"1.0.0","description":"An Example Bundled Transliterator","url":"https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample","author":"Author McAuthorson","author_email":"author_mcauthorson@msu.edu","license":"MIT License","keywords":["example"],"project_urls":{"Documentation":"https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example","Source":"https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example","Tracker":"https://github.com/seanpue/graphtransliterator/issues"}},null]}
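
When the dump is needed inside a script, the command line can be assembled and run from Python. The following is only a sketch: the helper names build_dump_command and dump_settings are hypothetical, and the option names are taken from the help text above. Running dump_settings assumes the graphtransliterator executable is on the PATH.

```python
import json
import subprocess


def build_dump_command(fmt, source, check_ambiguity=False, compression_level=None):
    """Build the argument list for `graphtransliterator dump`."""
    # The dump help text lists only these two formats for --from.
    if fmt not in ("bundled", "yaml_file"):
        raise ValueError("dump accepts only 'bundled' or 'yaml_file'")
    cmd = ["graphtransliterator", "dump", "--from", fmt, source]
    if check_ambiguity:
        cmd.append("--check-ambiguity")
    if compression_level is not None:
        cmd += ["--compression-level", str(compression_level)]
    return cmd


def dump_settings(fmt, source, **options):
    """Run the command and parse its JSON output (needs the CLI installed)."""
    cmd = build_dump_command(fmt, source, **options)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


print(build_dump_command("bundled", "Example", check_ambiguity=True))
```

With the CLI installed, dump_settings("bundled", "Example") should return the parsed JSON as a Python dictionary.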

The compression level of the JSON can be specified using the --compression-level or -cl option. Compression level 0 is human readable and includes the generated graph; compression level 1 is not human readable and includes the graph; compression level 2 is not human readable and does not include the graph. Compression level 2, which is the fastest, is the default. No information is lost at any compression level:

$ graphtransliterator dump --from bundled Example --compression-level 0 # human readable, with graph
{"tokens": {"a": ["vowel"], " ": ["whitespace"], "b": ["consonant"]}, "rules": [{"production": "!B!", "prev_classes": ["consonant"], "prev_tokens": ["a"], "tokens": ["b"], "next_classes": ["consonant"], "next_tokens": ["a"], "cost": 0.22239242133644802}, {"production": "A", "tokens": ["a"], "cost": 0.5849625007211562}, {"production": "B", "tokens": ["b"], "cost": 0.5849625007211562}, {"production": " ", "tokens": [" "], "cost": 0.5849625007211562}], "whitespace": {"default": " ", "token_class": "whitespace", "consolidate": false}, "onmatch_rules": [{"prev_classes": ["vowel"], "next_classes": ["vowel"], "production": ","}], "metadata": {"name": "example", "version": "1.0.0", "description": "An Example Bundled Transliterator", "url": "https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample", "author": "Author McAuthorson", "author_email": "author_mcauthorson@msu.edu", "license": "MIT License", "keywords": ["example"], "project_urls": {"Documentation": "https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example", "Source": "https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example", "Tracker": "https://github.com/seanpue/graphtransliterator/issues"}}, "ignore_errors": false, "onmatch_rules_lookup": {"a": {"a": [0]}}, "tokens_by_class": {"vowel": ["a"], "whitespace": [" "], "consonant": ["b"]}, "graph": {"node": [{"type": "Start", "ordered_children": {"b": [1], "a": [3], " ": [6]}}, {"token": "b", "type": "token", "ordered_children": {"__rules__": [2, 5]}}, {"type": "rule", "accepting": true, "rule_key": 0}, {"token": "a", "type": "token", "ordered_children": {"__rules__": [4]}}, {"type": "rule", "accepting": true, "rule_key": 1}, {"type": "rule", "accepting": true, "rule_key": 2}, {"token": " ", "type": "token", "ordered_children": {"__rules__": [7]}}, {"type": "rule", "accepting": true, "rule_key": 3}], "edge": {"0": {"1": {"token": "b", "cost": 
0.22239242133644802}, "3": {"token": "a", "cost": 0.5849625007211562}, "6": {"token": " ", "cost": 0.5849625007211562}}, "1": {"2": {"cost": 0.22239242133644802, "constraints": {"prev_classes": ["consonant"], "prev_tokens": ["a"], "next_tokens": ["a"], "next_classes": ["consonant"]}}, "5": {"cost": 0.5849625007211562}}, "3": {"4": {"cost": 0.5849625007211562}}, "6": {"7": {"cost": 0.5849625007211562}}}, "edge_list": [[0, 1], [0, 3], [0, 6], [1, 2], [1, 5], [3, 4], [6, 7]]}, "tokenizer_pattern": "(b|a|\\ )", "graphtransliterator_version": "1.2.4"}

$ graphtransliterator dump --from bundled Example --compression-level 1 # not human readable, with graph
{"graphtransliterator_version":"1.2.4","compressed_settings":[["consonant","vowel","whitespace"],[" ","a","b"],[[2],[1],[0]],[["!B!",[0],[1],[2],[1],[0],-5],["A",0,0,[1],0,0,-1],["B",0,0,[2],0,0,-1],[" ",0,0,[0],0,0,-1]],[" ","whitespace",0],[[[1],[1],","]],{"name":"example","version":"1.0.0","description":"An Example Bundled Transliterator","url":"https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample","author":"Author McAuthorson","author_email":"author_mcauthorson@msu.edu","license":"MIT License","keywords":["example"],"project_urls":{"Documentation":"https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example","Source":"https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example","Tracker":"https://github.com/seanpue/graphtransliterator/issues"}},[["Start","rule","token"],[[0,0,{"2":[1],"1":[3],"0":[6]}],[2,0,2,{"-1":[2,5]}],[1,1,0],[2,0,1,{"-1":[4]}],[1,1,1],[1,1,2],[2,0,0,{"-1":[7]}],[1,1,3]],{"0":{"1":[0,-5,2],"3":[0,-1,1],"6":[0,-1,0]},"1":{"2":[[[0],[1],[1],[0]],-5,-1],"5":[0,-1,-1]},"3":{"4":[0,-1,-1]},"6":{"7":[0,-1,-1]}}]]}

$ graphtransliterator dump --from bundled Example --compression-level 2 # default; not human readable, no graph
{"graphtransliterator_version":"1.2.4","compressed_settings":[["consonant","vowel","whitespace"],[" ","a","b"],[[2],[1],[0]],[["!B!",[0],[1],[2],[1],[0],-5],["A",0,0,[1],0,0,-1],["B",0,0,[2],0,0,-1],[" ",0,0,[0],0,0,-1]],[" ","whitespace",0],[[[1],[1],","]],{"name":"example","version":"1.0.0","description":"An Example Bundled Transliterator","url":"https://github.com/seanpue/graphtransliterator/tree/master/transliterator/sample","author":"Author McAuthorson","author_email":"author_mcauthorson@msu.edu","license":"MIT License","keywords":["example"],"project_urls":{"Documentation":"https://github.com/seanpue/graphtransliterator/tree/master/graphtransliterator/transliterators/example","Source":"https://github.com/seanpue/graphtransliterator/tree/graphtransliterator/transliterators/example","Tracker":"https://github.com/seanpue/graphtransliterator/issues"}},null]}
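
The three outputs can be told apart by their shape. The sketch below relies only on what the samples above show: level 0 has no compressed_settings key, and in the compressed forms the final element of compressed_settings is the graph (level 1) or null (level 2). That layout is inferred from these samples, not from a documented format, and describe_dump is a hypothetical helper.

```python
import json


def describe_dump(json_text):
    """Guess which compression level produced a dump, based on its shape."""
    data = json.loads(json_text)
    if "compressed_settings" not in data:
        return "level 0: human readable, with graph"
    # In the samples, the last element holds the graph or null.
    graph = data["compressed_settings"][-1]
    if graph is None:
        return "level 2: compressed, no graph"
    return "level 1: compressed, with graph"


# Abbreviated stand-ins for the real output, not complete dumps.
level_0 = '{"tokens": {"a": ["vowel"]}, "graph": {"node": []}}'
level_1 = '{"compressed_settings": [["vowel"], ["a"], [["Start"], [], {}]]}'
level_2 = '{"compressed_settings": [["vowel"], ["a"], null]}'

for sample in (level_0, level_1, level_2):
    print(describe_dump(sample))
```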

Dump Tests

The dump-tests command dumps the tests of a bundled transliterator:

$ graphtransliterator dump-tests --help
Usage: dump-tests [OPTIONS] BUNDLED

  Dump BUNDLED tests.

Options:
  -t, --to [json|yaml]  Format (json/yaml) in which to dump  [default: yaml]
  --help                Show this message and exit.

By default, it outputs the original YAML tests file, preserving any comments:

$ graphtransliterator dump-tests Example
# YAML declaration of tests for bundled Graph Transliterator
# These are in the form of a dictionary.
# The key is the source text, and the value is the correct transliteration.
' ': ' '
a: A
aa: A,A
babab: BA!B!AB
b: B


To output as JSON, use the --to or -t flag:

$ graphtransliterator dump-tests --to json Example
{" ": " ", "a": "A", "aa": "A,A", "babab": "BA!B!AB", "b": "B"}
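
The JSON form is convenient for checking a candidate implementation against the bundled tests. As a rough illustration, the sketch below loads the output shown above and compares each expected value with the result of a callable. The helper failing_tests is hypothetical, and str.upper is only a stand-in for a real transliterator.

```python
import json

# Output of `graphtransliterator dump-tests --to json Example`, as shown above.
raw = '{" ": " ", "a": "A", "aa": "A,A", "babab": "BA!B!AB", "b": "B"}'
tests = json.loads(raw)


def failing_tests(tests, transliterate):
    """Return {input: (expected, actual)} for every test that does not match."""
    failures = {}
    for source, expected in tests.items():
        actual = transliterate(source)
        if actual != expected:
            failures[source] = (expected, actual)
    return failures


# Stand-in transliterator: plain upper-casing, which ignores context rules.
print(failing_tests(tests, str.upper))
```

Upper-casing passes the single-token tests but fails aa and babab, the two cases that depend on the surrounding tokens.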

Generate Tests

The generate-tests command generates YAML tests, keyed from input to desired output, that cover the entire internal graph. This command can be used to view the output of the transliterator in Unicode. It can also be used to generate starter tests for bundled transliterators:

$ graphtransliterator generate-tests --help
Usage: generate-tests [OPTIONS]

  Generate tests as YAML.

Options:
  -f, --from <CHOICE TEXT>...     Format (bundled/json/json_file/yaml_file) and
                                  source (name, JSON, or filename) of
                                  transliterator  [required]
  -ca, --check-ambiguity / -nca, --no-check-ambiguity
                                  Check for ambiguity.  [default: no-check-
                                  ambiguity]
  --help                          Show this message and exit.

It also requires a --from or -f option with two arguments. The first argument specifies the format of the transliterator (bundled, json, json_file, or yaml_file), and the second a parameter for that format (the name of the bundled transliterator, the actual JSON, or the name of a JSON or YAML file). Ambiguity checking can be turned on using --check-ambiguity or -ca:

$ graphtransliterator generate-tests --from bundled Example
' ': ' '
a: A
aa: A,A
b: B
babab: BA!B!AB
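
The formats accepted by --from differ between commands: per the help texts in this document, dump takes bundled or yaml_file, while generate-tests and transliterate also take json and json_file. A small lookup table, sketched below, can catch a mismatch before the command is run. FROM_FORMATS and from_option are hypothetical names introduced for this illustration.

```python
# Formats accepted by --from, copied from each command's --help text.
FROM_FORMATS = {
    "dump": ("bundled", "yaml_file"),
    "generate-tests": ("bundled", "json", "json_file", "yaml_file"),
    "transliterate": ("bundled", "json", "json_file", "yaml_file"),
}


def from_option(command, fmt, source):
    """Return the two-argument --from option, or raise ValueError."""
    allowed = FROM_FORMATS[command]
    if fmt not in allowed:
        raise ValueError(
            f"{command} does not accept --from {fmt}; use one of {allowed}"
        )
    return ["--from", fmt, source]


print(from_option("generate-tests", "bundled", "Example"))
```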


List Bundled Transliterators

The list-bundled command provides a list of bundled transliterators:

$ graphtransliterator list-bundled --help

Make JSON of Bundled Transliterator(s)

The make-json command makes new JSON files of bundled transliterators:

$ graphtransliterator make-json --help

It also allows regular-expression matching using the --reg-ex or -re flag. Matching starts at the start of the string. This command is for people creating new bundled transliterators.

Test

The test command tests a bundled transliterator:

$ graphtransliterator test --help
Usage: test [OPTIONS] BUNDLED

  Test BUNDLED transliterator.

Options:
  -ca, --check-ambiguity / -nca, --no-check-ambiguity
                                  Check for ambiguity.  [default: no-check-
                                  ambiguity]
  --help                          Show this message and exit.

It can only be used with bundled transliterators, so it only needs the name of the transliterator as its argument. This feature is useful when developing a transliterator. You can write the tests first and then begin developing the transliterator:

$ graphtransliterator test Example
True

Transliterate

The transliterate command will transliterate any following arguments:

$ graphtransliterator transliterate --help
Usage: transliterate [OPTIONS] [INPUT]...

  Transliterate INPUT.

Options:
  -f, --from <CHOICE TEXT>...     Format (bundled/json/json_file/yaml_file) and
                                  source (name, JSON, or filename) of
                                  transliterator  [required]
  -t, --to [json|python]          Format in which to output  [default: python]
  -ca, --check-ambiguity / -nca, --no-check-ambiguity
                                  Check for ambiguity.  [default: no-check-
                                  ambiguity]
  -ie, -nie, --ignore-errors / --no-ignore-errors
                                  Ignore errors.  [default: no-ignore-errors]
  --help                          Show this message and exit.

It also requires a --from or -f option with two arguments. The first argument specifies the format of the transliterator (bundled, json, json_file, or yaml_file), and the second a parameter for that format (the name of the bundled transliterator, the actual JSON, or the name of a JSON or YAML file).

If there is only one input string, the command returns a string:

$ graphtransliterator transliterate --from bundled Example a
A

$ graphtransliterator transliterate -f json_file ../graphtransliterator/transliterators/example/example.json a
A

$ graphtransliterator transliterate -f yaml_file ../graphtransliterator/transliterators/example/example.yaml a
A

Otherwise, it will return a list:

$ graphtransliterator transliterate -f bundled Example a a
['A', 'A']

The transliterate command also accepts an optional --to or -t option that specifies the output format, either a Python string (default) or a JSON string:

$ graphtransliterator transliterate --from bundled Example a
A

$ graphtransliterator transliterate --from bundled Example --to json a
"A"

$ graphtransliterator transliterate --from bundled Example --to python a a
['A', 'A']

$ graphtransliterator transliterate --from bundled Example --to json a a
["A", "A"]
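
For scripts that consume the result, the json format is the easier one to parse, because a single result in the python format is printed as a bare, unquoted string. As a minimal sketch, the hypothetical helper below normalizes both JSON shapes shown above into a list of strings.

```python
import json


def parse_json_output(stdout):
    """Parse `transliterate --to json` output into a list of strings."""
    value = json.loads(stdout)
    # One input yields a JSON string; several inputs yield a JSON array.
    return [value] if isinstance(value, str) else list(value)


print(parse_json_output('"A"'))
print(parse_json_output('["A", "A"]'))
```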