Alphabet

namespace mata

Main namespace including structs and algorithms for all automata.

In particular, this includes:

  1. Alphabets,

  2. Formula graphs and nodes,

  3. Mintermization,

  4. Closed sets.

Typedefs

using Symbol = unsigned
using Word = std::vector<Symbol>
using WordName = std::vector<std::string>

Functions

Word encode_word_utf8(const Word &word)

Encode a word using UTF-8 encoding.

Parameters:

word[in] The word to encode.

Returns:

The UTF-8 encoded word.

Word decode_word_utf8(const Word &word)

Decode a word using UTF-8 encoding.

Parameters:

word[in] The word to decode.

Returns:

The UTF-8 decoded word.

class Alphabet
#include <alphabet.hh>

The abstract interface for NFA alphabets.

Subclassed by EnumAlphabet, IntAlphabet, OnTheFlyAlphabet

Public Functions

virtual Symbol translate_symb(const std::string &symb) = 0

translates a string into a symbol

inline virtual Word translate_word(const WordName &word_name) const

Translate sequence of symbol names to sequence of their respective values.

virtual std::string reverse_translate_symbol(Symbol symbol) const = 0

Translate internal symbol representation back to its original string name.

Throws an exception when the symbol is missing in the alphabet.

Parameters:

symbol[in] Symbol to translate.

Returns:

symbol original name.

inline Symbol operator[](const std::string &symb)

also translates strings to symbols

inline virtual utils::OrdVector<Symbol> get_alphabet_symbols() const

Get a set of all symbols in the alphabet.

The result does not have to equal the list of symbols in the automaton using this alphabet.

inline virtual utils::OrdVector<Symbol> get_complement(const utils::OrdVector<Symbol> &symbols) const

complement of a set of symbols wrt the alphabet

virtual ~Alphabet() = default
inline virtual bool is_equal(const Alphabet &other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

inline virtual bool is_equal(const Alphabet *const other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

bool operator==(const Alphabet&) const = delete
virtual bool empty() const = 0

Checks whether the alphabet has any symbols.

inline virtual void clear()

Protected Functions

inline virtual const void *address() const
class EnumAlphabet : public Alphabet
#include <alphabet.hh>

Enumerated alphabet using a set of integers as symbols maintaining a set of specified symbols.

EnumAlphabet is a version of direct (identity) alphabet (does not give names to symbols, their name is their integer value directly). However, unlike IntAlphabet, EnumAlphabet maintains an ordered set of symbols in the alphabet.

Therefore, calling member functions get_complement() and get_alphabet_symbols() makes sense in the context of EnumAlphabet and the functions give the expected results.

Example:

Alphabet alph{ EnumAlphabet{ 0, 4, 6, 8, 9 } };
CHECK(alph.translate_symb("6") == 6);
CHECK_THROWS(alph.translate_symb("5")); // Throws an exception about an unknown symbol.
CHECK(alph.get_complement({ utils::OrdVector<Symbol>{ 0, 6, 9 } }) == utils::OrdVector<Symbol>{ 4, 8 });

Public Functions

explicit EnumAlphabet() = default
EnumAlphabet(const EnumAlphabet &alphabet) = default
inline explicit EnumAlphabet(const EnumAlphabet *const alphabet)
EnumAlphabet(EnumAlphabet &&rhs) = default
inline virtual utils::OrdVector<Symbol> get_alphabet_symbols() const override

Get a set of all symbols in the alphabet.

The result does not have to equal the list of symbols in the automaton using this alphabet.

inline virtual utils::OrdVector<Symbol> get_complement(const utils::OrdVector<Symbol> &symbols) const override

complement of a set of symbols wrt the alphabet

virtual std::string reverse_translate_symbol(Symbol symbol) const override

Translate internal symbol representation back to its original string name.

Throws an exception when the symbol is missing in the alphabet.

Parameters:

symbol[in] Symbol to translate.

Returns:

symbol original name.

EnumAlphabet &operator=(const EnumAlphabet &rhs) = default
EnumAlphabet &operator=(EnumAlphabet &&rhs) = default
inline void add_symbols_from(const mata::utils::OrdVector<Symbol> &symbols)

Expand alphabet by symbols from the passed symbols.

Adding a symbol name which already exists will throw an exception.

Parameters:

symbols[in] Vector of symbols to add.

inline void add_symbols_from(const EnumAlphabet &alphabet)

Expand alphabet by symbols from the passed alphabet.

Parameters:

symbols_to_add[in] Vector of symbols to add.

inline EnumAlphabet(std::initializer_list<Symbol> symbols)
template<class InputIt>
inline EnumAlphabet(InputIt first, InputIt last)
inline EnumAlphabet(std::initializer_list<std::string> l)
virtual Symbol translate_symb(const std::string &str) override

translates a string into a symbol

virtual Word translate_word(const WordName &word_name) const override

Translate sequence of symbol names to sequence of their respective values.

void add_new_symbol(const std::string &symbol)

Add new symbol to the alphabet with the value identical to its string representation.

Parameters:

symbol[in] User-space representation of the symbol.

Returns:

Result of the insertion as InsertionResult.

void add_new_symbol(Symbol symbol)

Add new symbol to the alphabet.

Parameters:
  • key[in] User-space representation of the symbol.

  • symbol[in] Number of the symbol to be used on transitions.

Returns:

Result of the insertion as InsertionResult.

inline Symbol get_next_value() const

Get the next value for a potential new symbol.

Returns:

Next Symbol value.

inline size_t get_number_of_symbols() const

Get the number of existing symbols, epsilon symbols excluded.

Returns:

The number of symbols.

inline virtual bool empty() const override

Checks whether the alphabet has any symbols.

void update_next_symbol_value(Symbol value)

Update next symbol value when appropriate.

When the newly inserted value is larger or equal to the current next symbol value, update the next symbol value to a value one larger than the new value.

Parameters:

value – The value of the newly added symbol.

size_t erase(const Symbol symbol)

Erase a symbol from the alphabet.

Returns:

Number of symbols erased (0 or 1).

inline void erase(utils::OrdVector<Symbol>::const_iterator pos)

Remove a symbol name value pair from the position pos from the alphabet.

Returns:

Iterator following the last removed element.

inline void erase(utils::OrdVector<Symbol>::const_iterator first, utils::OrdVector<Symbol>::const_iterator last)

Remove a symbol name value pair from the positions between first and last from the alphabet.

Returns:

Iterator following the last removed element.

inline virtual void clear() override
inline Symbol operator[](const std::string &symb)

also translates strings to symbols

inline virtual bool is_equal(const Alphabet &other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

inline virtual bool is_equal(const Alphabet *const other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

bool operator==(const Alphabet&) const = delete

Protected Functions

inline virtual const void *address() const

Private Members

mata::utils::OrdVector<Symbol> symbols_ = {}

Map of string transition symbols to symbol values.

Symbol next_symbol_value_ = {0}

Next value to be used for a newly added symbol.

class IntAlphabet : public Alphabet
#include <alphabet.hh>

Direct alphabet (also identity alphabet or integer alphabet) using integers as symbols.

This alphabet presumes that all integers are valid symbols. Therefore, calling member functions get_complement() and get_alphabet_symbols() makes no sense in this context and the methods will throw exceptions warning about the inappropriate use of IntAlphabet. If one needs these functions, they should use OnTheFlyAlphabet instead of IntAlphabet.

Public Functions

inline IntAlphabet()
virtual Symbol translate_symb(const std::string &symb) override

translates a string into a symbol

inline virtual std::string reverse_translate_symbol(Symbol symbol) const override

Translate internal symbol representation back to its original string name.

Throws an exception when the symbol is missing in the alphabet.

Parameters:

symbol[in] Symbol to translate.

Returns:

symbol original name.

inline virtual utils::OrdVector<Symbol> get_alphabet_symbols() const override

Get a set of all symbols in the alphabet.

The result does not have to equal the list of symbols in the automaton using this alphabet.

inline virtual utils::OrdVector<Symbol> get_complement(const utils::OrdVector<Symbol> &symbols) const override

complement of a set of symbols wrt the alphabet

IntAlphabet(const IntAlphabet&) = default
IntAlphabet &operator=(const IntAlphabet &int_alphabet) = delete
inline virtual bool empty() const override

Checks whether the alphabet has any symbols.

inline virtual void clear() override
inline virtual Word translate_word(const WordName &word_name) const

Translate sequence of symbol names to sequence of their respective values.

inline Symbol operator[](const std::string &symb)

also translates strings to symbols

inline virtual bool is_equal(const Alphabet &other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

inline virtual bool is_equal(const Alphabet *const other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

bool operator==(const Alphabet&) const = delete

Protected Functions

inline virtual const void *address() const override

Private Members

IntAlphabetSingleton &alphabet_instance
class IntAlphabetSingleton

Singleton class implementing integer alphabet_instance for class IntAlphabet.

Users have to use IntAlphabet instead which provides interface identical to other alphabets and can be used in places where an instance of the abstract class Alphabet is required.

Public Functions

IntAlphabetSingleton(IntAlphabetSingleton&) = delete
IntAlphabetSingleton(IntAlphabetSingleton&&) = delete
IntAlphabetSingleton &operator=(const IntAlphabetSingleton&) = delete
IntAlphabetSingleton &operator=(IntAlphabetSingleton&&) = delete
~IntAlphabetSingleton() = default

Public Static Functions

static inline IntAlphabetSingleton &get()

Protected Functions

IntAlphabetSingleton() = default
class OnTheFlyAlphabet : public Alphabet
#include <alphabet.hh>

An alphabet constructed ‘on the fly’.

Should be use anytime the automata have a specific names for the symbols.

Public Types

using StringToSymbolMap = std::unordered_map<std::string, Symbol>
using InsertionResult = std::pair<StringToSymbolMap::const_iterator, bool>

Result of the insertion of a new symbol.

Public Functions

inline explicit OnTheFlyAlphabet(Symbol init_symbol = 0)
OnTheFlyAlphabet(const OnTheFlyAlphabet &alphabet) = default
OnTheFlyAlphabet(OnTheFlyAlphabet &&alphabet) = default
inline explicit OnTheFlyAlphabet(const OnTheFlyAlphabet *const alphabet)
inline explicit OnTheFlyAlphabet(StringToSymbolMap str_sym_map)
inline explicit OnTheFlyAlphabet(const std::vector<std::string> &symbol_names, Symbol init_symbol = 0)

Create alphabet from a list of symbol names.

Parameters:
  • symbol_names – Names for symbols on transitions.

  • init_symbol – Start of a sequence of values to use for new symbols.

template<class InputIt>
inline OnTheFlyAlphabet(InputIt first, InputIt last)
inline OnTheFlyAlphabet(std::initializer_list<std::pair<std::string, Symbol>> name_symbol_map)
virtual utils::OrdVector<Symbol> get_alphabet_symbols() const override

Get a set of all symbols in the alphabet.

The result does not have to equal the list of symbols in the automaton using this alphabet.

virtual utils::OrdVector<Symbol> get_complement(const utils::OrdVector<Symbol> &symbols) const override

complement of a set of symbols wrt the alphabet

virtual std::string reverse_translate_symbol(Symbol symbol) const override

Translate internal symbol representation back to its original string name.

Throws an exception when the symbol is missing in the alphabet.

Parameters:

symbol[in] Symbol to translate.

Returns:

symbol original name.

OnTheFlyAlphabet &operator=(const OnTheFlyAlphabet &rhs) = default
OnTheFlyAlphabet &operator=(OnTheFlyAlphabet &&rhs) = default
void add_symbols_from(const std::vector<std::string> &symbol_names)

Expand alphabet by symbols from the passed symbol_names.

Adding a symbol name which already exists will throw an exception.

Parameters:

symbol_names[in] Vector of symbol names.

void add_symbols_from(const StringToSymbolMap &new_symbol_map)

Expand alphabet by symbols from the passed symbol_map.

The value of the already existing symbols will NOT be overwritten.

Parameters:

new_symbol_map[in] Map of strings to symbols.

virtual Symbol translate_symb(const std::string &str) override

translates a string into a symbol

virtual Word translate_word(const WordName &word_name) const override

Translate sequence of symbol names to sequence of their respective values.

InsertionResult add_new_symbol(const std::string &key)

Add new symbol to the alphabet with the value of next_symbol_value.

Throws an exception when the adding fails.

Parameters:

key[in] User-space representation of the symbol.

Returns:

Result of the insertion as InsertionResult.

InsertionResult add_new_symbol(const std::string &key, Symbol value)

Add new symbol to the alphabet.

Throws an exception when the adding fails.

Parameters:
  • key[in] User-space representation of the symbol.

  • value[in] Number of the symbol to be used on transitions.

Returns:

Result of the insertion as InsertionResult.

inline InsertionResult try_add_new_symbol(const std::string &key, Symbol value)

Try to add symbol to the alphabet map.

Does not throw an exception when the adding fails.

Parameters:
  • key[in] User-space representation of the symbol.

  • value[in] Number of the symbol to be used on transitions.

Returns:

Result of the insertion as InsertionResult.

inline Symbol get_next_value() const

Get the next value for a potential new symbol.

Returns:

Next Symbol value.

inline size_t get_number_of_symbols() const

Get the number of existing symbols, epsilon symbols excluded.

Returns:

The number of symbols.

inline const StringToSymbolMap &get_symbol_map() const

Get the symbol map used in the alphabet.

Returns:

Map mapping strings to symbols used internally in Mata.

inline virtual bool empty() const override

Checks whether the alphabet has any symbols.

void update_next_symbol_value(Symbol value)

Update next symbol value when appropriate.

When the newly inserted value is larger or equal to the current next symbol value, update the next symbol value to a value one larger than the new value.

Parameters:

value – The value of the newly added symbol.

size_t erase(Symbol symbol)

Remove a symbol name value pair specified by its symbol from the alphabet.

Warning

Complexity: O(n), where n is the number of symbols in the alphabet.

Returns:

Number of symbols removed (0 or 1).

size_t erase(const std::string &symbol_name)

Remove a symbol name value pair specified by its symbol_name from the alphabet.

Returns:

Number of symbols removed (0 or 1).

inline void erase(StringToSymbolMap::const_iterator pos)

Remove a symbol name value pair from the position pos from the alphabet.

inline void erase(StringToSymbolMap::const_iterator first, StringToSymbolMap::const_iterator last)

Remove a symbol name value pair from the positions between first and last from the alphabet.

inline virtual void clear() override
inline Symbol operator[](const std::string &symb)

also translates strings to symbols

inline virtual bool is_equal(const Alphabet &other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

inline virtual bool is_equal(const Alphabet *const other_alphabet) const

Check whether two alphabets are equal.

In general, two alphabets are equal if and only if they are of the same class instance.

Parameters:

other_alphabet – The other alphabet to compare with for equality.

Returns:

True if equal, false otherwise.

bool operator==(const Alphabet&) const = delete

Protected Functions

inline virtual const void *address() const

Private Members

StringToSymbolMap symbol_map_ = {}

Map of string transition symbols to symbol values.

Symbol next_symbol_value_ = {}

Next value to be used for a newly added symbol.

namespace std

STL namespace.

Functions

std::ostream &operator<<(std::ostream &os, const mata::Alphabet &alphabet)