Class mata::nfa::Nfa

class Nfa

A class representing an NFA.

Subclassed by Nft

Public Functions

inline explicit Nfa(const size_t num_of_states, utils::SparseSet<State> initial_states = {}, utils::SparseSet<State> final_states = {}, Alphabet *alphabet = nullptr)

Construct a new explicit NFA with num_of_states states and optionally set initial and final states.

Parameters:

num_of_states[in] Number of states for which to preallocate Delta.

Nfa(const Nfa &other) = default

Construct a new explicit NFA from other NFA.

State add_state()

Add a new (fresh) state to the automaton.

Returns:

The newly created state.

State add_state(State state)

Add state state to delta if state is not in delta yet.

Returns:

The requested state.

State insert_word(State source, const Word &word, State target)

Inserts a word into the NFA from a source state source to a target state target.

Creates new states along the path of the word.

Parameters:
  • source – The source state where the word begins. It must already be a part of the automaton.

  • word – The nonempty word to be inserted into the NFA.

  • target – The target state where the word ends.

Returns:

The state target where the inserted word ends.

State insert_word(State source, const Word &word)

Inserts a word into the NFA from a source state source to a new target state.

Creates new states along the path of the word.

Parameters:
  • source – The source state where the word begins. It must already be a part of the automaton.

  • word – The nonempty word to be inserted into the NFA.

Returns:

The newly created target state where the inserted word ends.

size_t num_of_states() const

Get the current number of states in the whole automaton.

This includes the initial and final states as well as states in the transition relation.

Returns:

The number of states.

Nfa &unify_initial(bool force_new_state = false)

Unify initial states into a single new initial state.

Parameters:

force_new_state[in] Whether to force creating a new state even when initial states are already unified.

Returns:

this after unification.

Nfa &unify_final(bool force_new_state = false)

Unify final states into a single new final state.

Parameters:

force_new_state[in] Whether to force creating a new state even when final states are already unified.

Returns:

this after unification.

inline Nfa &swap_final_nonfinal()

Swap final and non-final states in-place.

void clear()

Clear the underlying NFA to a blank NFA.

The whole NFA is cleared, each member is set to its zero value.

bool is_identical(const Nfa &aut) const

Check if this is exactly identical to aut.

This is exact equality of automata, including state numbering (so even stronger than isomorphism), essentially only useful for testing purposes.

Returns:

True if automata are exactly identical, false otherwise.

StateSet get_reachable_states() const

Get set of reachable states.

Reachable states are states accessible from any initial state.

Returns:

Set of reachable states. TODO: with the new get_useful_states, it might be useless now.

StateSet get_terminating_states() const

Get set of terminating states.

Terminating states are states leading to any final state.

Returns:

Set of terminating states. TODO: with the new get_useful_states, it might be useless now.

BoolVector get_useful_states() const

Get the useful states using a modified Tarjan’s algorithm.

A state is useful if it is reachable from an initial state and can reach a final state.

Returns:

BoolVector Bool vector whose ith value is true iff the state i is useful.

void tarjan_scc_discover(const TarjanDiscoverCallback &callback) const

Tarjan’s SCC discover algorihm.

Parameters:

callback – Callbacks class to instantiate callbacks for the Tarjan’s algorithm.

Nfa &trim(StateRenaming *state_renaming = nullptr)

Remove inaccessible (unreachable) and not co-accessible (non-terminating) states in-place.

Remove states which are not accessible (unreachable; state is accessible when the state is the endpoint of a path starting from an initial state) or not co-accessible (non-terminating; state is co-accessible when the state is the starting point of a path ending in a final state).

Parameters:

state_renaming[out] Mapping of trimmed states to new states.

Returns:

this after trimming.

Nfa decode_utf8() const

Decodes automaton from UTF-8 encoding.

Method removes unreachable states from delta.

Returns:

Decoded automaton.

std::vector<State> distances_from_initial() const

Returns vector ret where ret[q] is the length of the shortest path from any initial state to q.

std::vector<State> distances_to_final() const

Returns vector ret where ret[q] is the length of the shortest path from q to any final state.

Run get_shortest_accepting_run_from_state(State q, const std::vector<State> &distances_to_final) const

Get some shortest accepting run from state q.

Assumes that q is a state of this automaton and that there is some accepting run from q

Parameters:

distances_to_final – Vector of the lengths of the shortest runs from states (can be computed using distances_to_final())

void remove_epsilon(Symbol epsilon = EPSILON)

Remove epsilon transitions from the automaton.

Nfa &concatenate(const Nfa &aut)

In-place concatenation.

Nfa &unite_nondet_with(const Nfa &aut)

In-place nondeterministic union with aut.

Does not add epsilon transitions, just unites initial and final states.

Nfa get_one_letter_aut(Symbol abstract_symbol = 'x') const

Unify transitions to create a directed graph with at most a single transition between two states.

Parameters:

abstract_symbol[in] Abstract symbol to use for transitions in digraph.

Returns:

An automaton representing a directed graph.

inline bool is_epsilon(Symbol symbol) const

Check whether symbol is epsilon symbol or not.

Parameters:

symbolSymbol to check.

Returns:

True if the passed symbol is epsilon, false otherwise.

void get_one_letter_aut(Nfa &result) const

Unify transitions to create a directed graph with at most a single transition between two states.

Parameters:

result[out] An automaton representing a directed graph.

std::string print_to_dot(bool decode_ascii_chars = false, bool use_intervals = false, int max_label_length = -1) const

Prints the automaton in DOT format.

Parameters:
  • decode_ascii_chars[in] Whether to use ASCII characters for the output.

  • use_intervals[in] Whether to use intervals (e.g. [1-3] instead of 1,2,3) for labels.

  • max_label_length[in] Maximum label length for the output (-1 means no limit, 0 means no labels). If the label is longer than max_label_length, it will be truncated, with full label displayed on hover.

Returns:

automaton in DOT format

void print_to_dot(std::ostream &output, bool decode_ascii_chars = false, bool use_intervals = false, int max_label_length = -1) const

Prints the automaton to the output stream in DOT format.

Parameters:
  • decode_ascii_chars[in] Whether to use ASCII characters for the output.

  • use_intervals[in] Whether to use intervals (e.g. [1-3] instead of 1,2,3) for labels.

  • max_label_length[in] Maximum label length for the output (-1 means no limit, 0 means no labels). If the label is longer than max_label_length, it will be truncated, with full label displayed on hover.

void print_to_dot(const std::string &filename, bool decode_ascii_chars = false, bool use_intervals = false, int max_label_length = -1) const

Prints the automaton to the file in DOT format.

Parameters:
  • filename – Name of the file to print the automaton to

  • decode_ascii_chars[in] Whether to use ASCII characters for the output.

  • use_intervals[in] Whether to use intervals (e.g. [1-3] instead of 1,2,3) for labels.

  • max_label_length[in] Maximum label length for the output (-1 means no limit, 0 means no labels). If the label is longer than max_label_length, it will be truncated, with full label displayed on hover.

std::string print_to_mata(const Alphabet *alphabet = nullptr) const

Prints the automaton in mata format.

If you need to parse the automaton again, use IntAlphabet in construct()

Parameters:

alphabet[in] If specified, translates the symbols to their symbol names in the alphabet.

Returns:

automaton in mata format

void print_to_mata(std::ostream &output, const Alphabet *alphabet = nullptr) const

Prints the automaton to the output stream in mata format.

If you need to parse the automaton again, use IntAlphabet in construct()

Parameters:

alphabet[in] If specified, translates the symbols to their symbol names in the alphabet.

void print_to_mata(const std::string &filename, const Alphabet *alphabet = nullptr) const

Prints the automaton to the file in mata format.

If you need to parse the automaton again, use IntAlphabet in construct()

Parameters:
  • alphabet[in] If specified, translates the symbols to their symbol names in the alphabet.

  • filename – Name of the file to print the automaton to

StateSet post(const StateSet &states, const Symbol symbol, EpsilonClosureOpt epsilon_closure_opt = EpsilonClosureOpt::NONE) const

Get the set of states reachable from the given set of states over the given symbol.

TODO: Relict from VATA. What to do with inclusion/ universality/ this post function? Revise all of them.

Parameters:
  • states – Set of states to compute the post set from.

  • symbolSymbol to compute the post set for.

  • epsilon_closure_opt – Epsilon closure option. Perform epsilon closure before and/or after the post operation.

Returns:

Set of states reachable from the given set of states over the given symbol.

inline StateSet post(const State state, const Symbol symbol, EpsilonClosureOpt epsilon_closure_opt) const

Get the set of states reachable from the given state over the given symbol.

Parameters:
  • state – A state to compute the post set from.

  • symbolSymbol to compute the post set for.

  • epsilon_closure_opt – Epsilon closure option. Perform epsilon closure before and/or after the post operation.

Returns:

Set of states reachable from the given state over the given symbol.

inline const StateSet &post(const State state, const Symbol symbol) const

Returns a reference to targets (states) reachable from the given state over the given symbol.

This is an optimized shortcut for post(state, symbol, EpsilonClosureOpt::NONE).

Parameters:
  • state – A state to compute the post set from.

  • symbolSymbol to compute the post set for.

Returns:

Set of states reachable from the given state over the given symbol.

bool is_lang_empty(Run *cex = nullptr) const

Check whether the language of NFA is empty.

Currently calls is_lang_empty_scc if cex is null

Parameters:

cex[out] Counter-example path for a case the language is not empty.

Returns:

True if the language is empty, false otherwise.

bool is_lang_empty_scc() const

Check if the language is empty using Tarjan’s SCC discover algorithm.

Returns:

Language empty <-> True

bool is_deterministic() const

Test whether an automaton is deterministic.

I.e., whether it has exactly one initial state and every state has at most one outgoing transition over every symbol. Checks the whole automaton, not only the reachable part

bool is_complete(Alphabet const *alphabet = nullptr) const

Test for automaton completeness with regard to an alphabet.

An automaton is complete if every reachable state has at least one outgoing transition over every symbol.

bool is_complete(const utils::OrdVector<Symbol> &symbols) const

Test for automaton completeness with regard to an alphabet.

An automaton is complete if every reachable state has at least one outgoing transition over every symbol.

bool is_acyclic() const

Is the automaton graph acyclic?

Used for checking language finiteness.

Returns:

true <-> Automaton graph is acyclic.

bool is_flat() const

Is the automaton flat?

Flat automaton is an NFA whose every SCC is a simple loop. Basically each state in an SCC has at most one successor within this SCC.

Returns:

true <-> Automaton graph is flat.

void fill_alphabet(mata::OnTheFlyAlphabet &alphabet_to_fill) const

Fill alphabet_to_fill with symbols from nfa.

Parameters:
  • nfa[in] NFA with symbols to fill alphabet_to_fill with.

  • alphabet_to_fill[out] Alphabet to be filled with symbols from nfa.

bool is_universal(const Alphabet &alphabet, Run *cex = nullptr, const ParameterMap &params = {{"algorithm", "antichains"}}) const

Check whether the language of the automaton is universal.

Parameters:
  • alphabetAlphabet to use for checking the universality.

  • cex – Counterexample path for a case the language is not universal.

  • params – Optional parameters to control the universality check algorithm:

    • ”algorithm”:

      • ”antichains”: The algorithm uses antichains to check the universality.

      • ”naive”: The algorithm uses the naive approach to check the universality.

Returns:

True if the language of the automaton is universal, false otherwise.

bool is_universal(const Alphabet &alphabet, const ParameterMap &params) const

Check whether the language of the automaton is universal.

Parameters:
  • alphabetAlphabet to use for checking the universality.

  • params – Optional parameters to control the universality check algorithm:

    • ”algorithm”:

      • ”antichains”: The algorithm uses antichains to check the universality.

      • ”naive”: The algorithm uses the naive approach to check the universality.

Returns:

True if the language of the automaton is universal, false otherwise.

bool is_in_lang(const Run &word, bool use_epsilon = false, bool match_prefix = false) const

Check whether a run over the word (or its prefix) is in the language of an automaton.

Parameters:
  • word – The run to check.

  • use_epsilon – Whether the automaton uses epsilon transitions.

  • match_prefix – Whether to also match the prefix of the word.

Returns:

True if the run (or its prefix) is in the language of the automaton, false otherwise.

inline bool is_in_lang(const Word &word, const bool use_epsilon = false, const bool match_prefix = false)

Check whether a word (or its prefix) is in the language of an automaton.

Parameters:
  • word – The word to check.

  • use_epsilon – Whether the automaton uses epsilon transitions.

  • match_prefix – Whether to also match the prefix of the word.

Returns:

True if the word (or its prefix) is in the language of the automaton, false otherwise.

StateSet read_word(const Run &word, const bool use_epsilon = false) const

Read a word and return the set of states the automaton ends up in.

Parameters:
  • word – The word to read.

  • use_epsilon – Whether the automaton uses epsilon transitions.

Returns:

Set of all reachable states after reading the word. Note: This returns all reachable states, not just final states. Use is_in_lang() if you need to check language membership, or intersect the result with final states manually if needed. Note: The returned set is empty if the word cannot be read.

inline StateSet read_word(const Word &word, const bool use_epsilon = false) const

Read a word and return the set of states the automaton ends up in.

Parameters:
  • word – The word to read.

  • use_epsilon – Whether the automaton uses epsilon transitions.

Returns:

Set of all reachable states after reading the word. Note: This returns all reachable states, not just final states. Use is_in_lang() if you need to check language membership, or intersect the result with final states manually if needed. Note: The returned set is empty if the word cannot be read.

std::optional<State> read_word_det(const Run &word) const

Read a word and return the state the deterministic automaton ends up in.

If the automaton is nondeterministic, the result is undefined.

Parameters:

word – The word to read.

Returns:

The reachable state after reading the word or std::nullopt if the word cannot be read.

inline std::optional<State> read_word_det(const Word &word) const

Read a word and return the state the deterministic automaton ends up in.

If the automaton is nondeterministic, the result is undefined.

Parameters:

word – The word to read.

Returns:

The reachable state after reading the word or std::nullopt if the word cannot be read.

inline bool is_prefix_in_lang(const Run &word, const bool use_epsilon = false) const

Check whether a prefix of a run is in the language of an automaton.

Parameters:
  • word – The run to check.

  • use_epsilon – Whether the automaton uses epsilon transitions.

Returns:

True if the prefix of the run is in the language of the automaton, false otherwise.

inline bool is_prefix_in_lang(const Word &word, const bool use_epsilon = false) const

Check whether a prefix of a word is in the language of an automaton.

Parameters:
  • word – The word to check.

  • use_epsilon – Whether the automaton uses epsilon transitions.

Returns:

True if the prefix of the word is in the language of the automaton, false otherwise.

std::set<Word> get_words(size_t max_length) const

Get the set of all words in the language of the automaton whose length is <= max_length.

If you have an automaton with finite language (can be checked using is_acyclic), you can get all words by calling get_words(aut.num_of_states())

std::optional<Word> get_word(const std::optional<Symbol> first_epsilon = EPSILON) const

Get any arbitrary accepted word in the language of the automaton.

The automaton is searched using DFS, returning a word for the first reached final state.

Parameters:

first_epsilon – If defined, all symbols >=first_epsilon are assumed to be epsilon and therefore are not in the returned word.

Returns:

std::optional<Word> Some word from the language. If the language is empty, returns std::nullopt.

std::optional<Word> get_word_from_complement(const Alphabet *alphabet = nullptr) const

Get any arbitrary accepted word in the language of the complement of the automaton.

The automaton is lazily determinized and made complete. The algorithm returns an arbitrary word from the complemented NFA constructed until the first macrostate without any final states in the original automaton is encountered.

Parameters:

alphabet[in] Alphabet to use for computing the complement. If nullptr, uses this->alphabet when defined, otherwise uses this->delta.get_used_symbols().

Pre:

The automaton does not contain any epsilon transitions. TODO: Support lazy epsilon closure?

Returns:

An arbitrary word from the complemented automaton, or std::nullopt if the automaton is universal on the chosen set of symbols for the complement.

bool make_complete(const Alphabet *alphabet = nullptr, std::optional<State> sink_state = std::nullopt)

Make NFA complete in place.

For each state 0,…,this->num_of_states()-1, add transitions with “missing” symbols from alphabet (symbols that do not occur on transitions from given state) to sink_state. If sink_state does not belong to the NFA, it is added to it, but only in the case that some transition to sink_state was added. In the case that NFA does not contain any states, this function does nothing.

Parameters:
  • alphabet[in] Alphabet to use for computing “missing” symbols. If nullptr, use this->alphabet when defined, otherwise use this->delta.get_used_symbols().

  • sink_state[in] The state into which new transitions are added. If std::nullopt, add a new sink state.

Returns:

true if a new transition was added to the NFA.

bool make_complete(const utils::OrdVector<Symbol> &symbols, std::optional<State> sink_state = std::nullopt)

Make NFA complete in place.

For each state 0,…,this->num_of_states()-1, add transitions with “missing” symbols from alphabet (symbols that do not occur on transitions from given state) to sink_state. If sink_state does not belong to the NFA, it is added to it, but only in the case that some transition to sink_state was added. In the case that NFA does not contain any states, this function does nothing.

This overloaded version is a more efficient version which does not need to compute the set of symbols to complete to from the alphabet. Prefer this version when you already have the set of symbols precomputed or plan to complete multiple automata over the same set of symbols.

Parameters:
  • symbols[in] Symbols to compute “missing” symbols from.

  • sink_state[in] The state into which new transitions are added. If std::nullopt, add a new sink state.

Returns:

true if a new transition was added to the NFA.

Nfa &complement_deterministic(const mata::utils::OrdVector<Symbol> &symbols, std::optional<State> sink_state = std::nullopt)

Complement deterministic automaton in-place by adding a sink state and swapping final and non-final states.

Parameters:
  • symbols[in] Symbols needed to make the automaton complete.

  • sink_state[in] State to be used as a sink state. Adds a new sink state when not specified.

Returns:

DFA complemented in-place.

Pre:

this is a deterministic automaton.

Public Members

Delta delta

For state q, delta[q] keeps the list of transitions ordered by symbols.

The set of states of this automaton are the numbers from 0 to the number of states minus one.

Alphabet *alphabet = nullptr

The alphabet which can be shared between multiple automata. Key value store for additional attributes for the NFA. Keys are attribute names as strings and the value types are up to the user. For example, we can set up attributes such as “state_dict” for state dictionary attribute mapping states to their respective names, or “transition_dict” for transition dictionary adding a human-readable meaning to each transition.

struct TarjanDiscoverCallback

Structure for storing callback functions (event handlers) utilizing Tarjan’s SCC discover algorithm.