Strings¶
Algorithms and classes used for solving string constraints.
-
namespace mata
Main namespace including structs and algorithms for all automata.
In particular, this includes:
Alphabets,
Formula graphs and nodes,
Mintermization,
Closed sets.
-
namespace applications¶
-
namespace strings¶
NFA algorithms usable for solving string constraints.
Typedefs
-
using Transition = nfa::Transition¶
-
using ParameterMap = nfa::ParameterMap¶
-
using SymbolPost = nfa::SymbolPost¶
Functions
-
std::set<Word> get_shortest_words(const Nfa &nfa)¶
Get shortest words (regarding their length) of the automaton using BFS.
- Returns:
Set of shortest words.
-
std::optional<std::vector<Word>> get_words_of_lengths(const Nft &nft, std::vector<unsigned> lengths)¶
Get the accepting words for each tape of
nft
with specific lengths.This function finds such an accepting word of
nft
that for each tape i, the word on this tape (on the ith index of resulting vector) has the length lengths[i].- Parameters:
nft – Transducer whose accepting words we are looking for
lengths – The lengths of the words of each tape (size of lengths == the levels of
nft
)
- Returns:
std::optional<std::vector<Word>> Either the resulting words of tapes, or std::nullopt if such words of specific lengths do not exist
-
std::set<Symbol> get_accepted_symbols(const Nfa &nfa)¶
Get all the one symbol words accepted by
nfa
.
-
std::set<std::pair<int, int>> get_word_lengths(const Nfa &aut)¶
Get the lengths of all words in the automaton
aut
.The function returns a set of pairs <u,v> where for each such a pair there is a word with length u+k*v for all ks. The disjunction of such formulae of all pairs hence precisely describe lengths of all words in the automaton.
- Parameters:
aut – Input automaton
- Returns:
Set of pairs describing lengths
-
class ShortestWordsMap¶
- #include <strings.hh>
Class mapping states to the shortest words accepted by languages of the states.
Public Functions
-
inline explicit ShortestWordsMap(const Nfa &aut)¶
Maps states in the automaton
aut
to shortest words accepted by languages of the states.- Parameters:
aut – Automaton to compute shortest words for.
Private Types
-
using WordLength = int¶
A length of a word. Pair binding the length of all words in the word set and word set with words of the given length.
-
using LengthWordsPair = std::pair<WordLength, std::set<Word>>¶
Private Functions
-
void insert_initial_lengths()¶
Inserts initial lengths into the shortest words map.
Inserts initial length of length 0 for final state in the automaton (initial states in the reversed automaton).
-
void compute()¶
Computes shortest words for all states in the automaton.
-
void compute_for_state(State state)¶
Computes shortest words for the given
state
.- Parameters:
state – [in] State to compute shortest words for.
-
inline LengthWordsPair map_default_shortest_words(const State state)¶
Creates default shortest words mapping for yet unprocessed
state
.- Parameters:
state – [in] State to map default shortest words.
- Returns:
Created default shortest words map element for the given
state
.
Private Members
-
std::unordered_map<State, LengthWordsPair> shortest_words_map = {}¶
Map mapping states to the shortest words accepted by the automaton from the mapped state.
Private Static Functions
-
static void update_current_words(LengthWordsPair &act, const LengthWordsPair &dst, Symbol symbol)¶
Update words for the current state.
- Parameters:
act – [out] Current state shortest words and length.
dst – [in] Transition target state shortest words and length.
symbol – [in] Symbol to update with.
-
inline explicit ShortestWordsMap(const Nfa &aut)¶
-
namespace replace¶
Enums
Functions
-
Nfa reluctant_nfa(Nfa nfa)¶
Modify
nfa
in-place to remove outgoing transitions from final states.If
nfa
accepts empty string, returned NFA will accept only the empty string.- Parameters:
nfa – NFA to modify.
- Returns:
The reluctant version of
nfa
.
-
Nft create_identity(mata::Alphabet *alphabet, size_t num_of_levels = 2)¶
Create identity transducer over the
alphabet
withnum_of_levels
levels.
-
Nft create_identity_with_single_symbol_replace(mata::Alphabet *alphabet, Symbol from_symbol, Symbol replacement, ReplaceMode replace_mode = ReplaceMode::All)¶
Create identity input/output transducer with 2 levels over the
alphabet
withlevel_cnt
levels with single symbolfrom_symbol
replaced with @to_symbol.
-
Nft create_identity_with_single_symbol_replace(mata::Alphabet *alphabet, Symbol from_symbol, const Word &replacement, ReplaceMode replace_mode = ReplaceMode::All)¶
Create identity input/output transducer with 2 levels over the
alphabet
withlevel_cnt
levels with single symbolfrom_symbol
replaced with wordreplacement
.
-
Nft replace_reluctant_regex(const std::string ®ex, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol begin_marker = BEGIN_MARKER)¶
Create NFT modelling a reluctant leftmost replace of regex
regex
toreplacement
.The most general replace operation, handling any regex as the part to be replaced.
- Parameters:
regex – A string containing regex to be replaced.
replacement – Literal to be replaced with.
alphabet – Alphabet over which to create the NFT.
replace_mode – Whether to replace all or just the single (the leftmost) occurrence of
regex
.begin_marker – Symbol to be used internally as a begin marker of replaced
regex
.
- Returns:
The reluctant leftmost replace NFT.
-
Nft replace_reluctant_regex(nfa::Nfa regex, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol begin_marker = BEGIN_MARKER)¶
Create NFT modelling a reluctant leftmost replace of regex
regex
toreplacement
.The most general replace operation, handling any regex as the part to be replaced.
- Parameters:
regex – NFA representing regex to be replaced.
replacement – Literal to replace with.
alphabet – Alphabet over which to create the NFT.
replace_mode – Whether to replace all or just the single (the leftmost) occurrence of
regex
.begin_marker – Symbol to be used internally as a begin marker of replaced
regex
.
- Returns:
The reluctant leftmost replace NFT.
-
Nft replace_reluctant_literal(const Word &literal, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol end_marker = END_MARKER)¶
Create NFT modelling a reluctant leftmost replace of literal
literal
toreplacement
.- Parameters:
literal – Literal to replace.
replacement – Literal to replace with.
alphabet – Alphabet over which to create the NFT.
replace_mode – Whether to replace all or just the single (the leftmost) occurrence of
literal
.end_marker – Symbol to be used internally as an end marker marking the end of the replaced literal.
- Returns:
The reluctant leftmost replace NFT.
-
Nft replace_reluctant_single_symbol(Symbol from_symbol, Symbol replacement, mata::Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All)¶
Create NFT modelling a reluctant leftmost replace of symbol
from_symbol
toreplacement
.
-
Nft replace_reluctant_single_symbol(Symbol from_symbol, const Word &replacement, mata::Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All)¶
Create NFT modelling a reluctant leftmost replace of symbol
from_symbol
toreplacement
.
Variables
-
class ReluctantReplace¶
- #include <strings.hh>
Implementation of all reluctant replace versions.
Public Static Functions
-
static Nft replace_regex(nfa::Nfa regex, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol begin_marker = BEGIN_MARKER)¶
Create NFT modelling a reluctant leftmost replace of regex
regex
toreplacement
.The most general replace operation, handling any regex as the part to be replaced.
- Parameters:
regex – NFA representing regex to be replaced.
replacement – Literal to replace with.
alphabet – Alphabet over which to create the NFT.
replace_mode – Whether to replace all or just the single (the leftmost) occurrence of
regex
.begin_marker – Symbol to be used internally as a begin marker of replaced
regex
.
- Returns:
The reluctant leftmost replace NFT.
-
static Nft replace_literal(const Word &literal, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol end_marker = END_MARKER)¶
Create NFT modelling a reluctant leftmost replace of literal
literal
toreplacement
.- Parameters:
literal – Literal to replace.
replacement – Literal to replace with.
alphabet – Alphabet over which to create the NFT.
replace_mode – Whether to replace all or just the single (the leftmost) occurrence of
literal
.end_marker – Symbol to be used internally as an end marker marking the end of the replaced literal.
- Returns:
The reluctant leftmost replace NFT.
-
static Nft replace_symbol(Symbol from_symbol, Symbol replacement, mata::Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All)¶
Create NFT modelling a reluctant leftmost replace of symbol
from_symbol
toreplacement
.
-
static Nft replace_symbol(Symbol from_symbol, const Word &replacement, mata::Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All)¶
Create NFT modelling a reluctant leftmost replace of symbol
from_symbol
toreplacement
.
Protected Functions
-
Nft reluctant_leftmost_nft(const std::string ®ex, Alphabet *alphabet, Symbol begin_marker, const Word &replacement, ReplaceMode replace_mode)¶
-
Nft reluctant_leftmost_nft(nfa::Nfa nfa, Alphabet *alphabet, Symbol begin_marker, const Word &replacement, ReplaceMode replace_mode)¶
-
Nft replace_literal_nft(const Word &literal, const Word &replacement, const Alphabet *alphabet, Symbol end_marker, ReplaceMode replace_mode = ReplaceMode::All)¶
-
static Nft replace_regex(nfa::Nfa regex, const Word &replacement, Alphabet *alphabet, ReplaceMode replace_mode = ReplaceMode::All, Symbol begin_marker = BEGIN_MARKER)¶
-
Nfa reluctant_nfa(Nfa nfa)¶
-
namespace seg_nfa¶
Segment Automata including structs and algorithms.
These are automata whose state space can be split into several segments connected by ε-transitions in a chain. No other ε-transitions are allowed. As a consequence, no ε-transitions can appear in a cycle. Segment automaton can have initial states only in the first segment and final states only in the last segment.
Typedefs
-
using VisitedEpsilonsCounterVector = std::vector<unsigned>¶
Projection of VisitedEpsilonsNumberMap to sorted keys (in descending order).
-
using Noodle = std::vector<std::shared_ptr<SegNfa>>¶
A noodle is represented as a sequence of segments (a copy of the segment automata) created as if there was exactly one ε-transition between each two consecutive segments.
-
using SegmentWithEpsilonsCounter = std::pair<std::shared_ptr<Nfa>, VisitedEpsilonsCounterVector>¶
Segment with a counter of visited epsilons.
-
using NoodleWithEpsilonsCounter = std::vector<SegmentWithEpsilonsCounter>¶
Noodles as segments enriched with EpsCntMap.
-
using TransducerNoodle = std::vector<TransducerNoodleElement>¶
Functions
segs_one_initial_final
segments_one_initial_final[init, final] is the pointer to automaton created from one of the segments such that init and final are one of the initial and final states of the segment and the created automaton takes this segment, sets initial={init}, final={final} and trims it; also segments_one_initial_final[unused_state, final] is used for the first segment (where we always want all initial states, only final state changes) and segments_one_initial_final[init, unused_state] is similarly for the last segment TODO: should we use unordered_map? then we need hash
-
std::vector<Noodle> noodlify(const SegNfa &aut, Symbol epsilon, bool include_empty = false)¶
Create noodles from segment automaton
aut
.Segment automaton is a chain of finite automata (segments) connected via ε-transitions. A noodle is a vector of pointers to copy of the segments automata created as if there was exactly one ε-transition between each two consecutive segments.
- Parameters:
automaton – [in] Segment automaton to noodlify.
epsilon – [in] Epsilon symbol to noodlify for.
include_empty – [in] Whether to also include empty noodles.
- Returns:
A list of all (non-empty) noodles.
-
std::vector<NoodleWithEpsilonsCounter> noodlify_mult_eps(const SegNfa &aut, const std::set<Symbol> &epsilons, bool include_empty = false)¶
Create noodles from segment automaton
aut
.Segment automaton is a chain of finite automata (segments) connected via ε-transitions. A noodle is a vector of pointers to copy of the segments automata created as if there was exactly one ε-transition between each two consecutive segments.
- Parameters:
automaton – [in] Segment automaton to noodlify.
epsilons – [in] Epsilon symbols to noodlify for.
include_empty – [in] Whether to also include empty noodles.
- Returns:
A list of all (non-empty) noodles.
-
std::vector<Noodle> noodlify_for_equation(const std::vector<std::reference_wrapper<Nfa>> &lhs_automata, const Nfa &rhs_automaton, bool include_empty = false, const ParameterMap ¶ms = {{"reduce", "false"}})¶
Create noodles for left and right side of equation.
Segment automaton is a chain of finite automata (segments) connected via ε-transitions. A noodle is a copy of the segment automaton with exactly one ε-transition between each two consecutive segments.
Mata cannot work with equations, queries etc. Hence, we compute the noodles for the equation, but represent the equation in a way that libMata understands. The left side automata represent the left side of the equation and the right automaton represents the right side of the equation. To create noodles, we need a segment automaton representing the intersection. That can be achieved by computing a product of both sides. First, the left side has to be concatenated over an epsilon transitions into a single automaton to compute the intersection on, though.
- Parameters:
lhs_automata – [in] Sequence of segment automata for left side of an equation to noodlify.
rhs_automaton – [in] Segment automaton for right side of an equation to noodlify.
include_empty – [in] Whether to also include empty noodles.
params – [in] Additional parameters for the noodlification:
”reduce”: “false”, “forward”, “backward”, “bidirectional”; Execute forward, backward or bidirectional simulation minimization before noodlification.
- Returns:
A list of all (non-empty) noodles.
-
std::vector<Noodle> noodlify_for_equation(const std::vector<Nfa*> &lhs_automata, const Nfa &rhs_automaton, bool include_empty = false, const ParameterMap ¶ms = {{"reduce", "false"}})¶
Create noodles for left and right side of equation.
Segment automaton is a chain of finite automata (segments) connected via ε-transitions. A noodle is a copy of the segment automaton with exactly one ε-transition between each two consecutive segments.
Mata cannot work with equations, queries etc. Hence, we compute the noodles for the equation, but represent the equation in a way that libMata understands. The left side automata represent the left side of the equation and the right automaton represents the right side of the equation. To create noodles, we need a segment automaton representing the intersection. That can be achieved by computing a product of both sides. First, the left side has to be concatenated over an epsilon transitions into a single automaton to compute the intersection on, though.
- Parameters:
lhs_automata – [in] Sequence of pointers to segment automata for left side of an equation to noodlify.
rhs_automaton – [in] Segment automaton for right side of an equation to noodlify.
include_empty – [in] Whether to also include empty noodles.
params – [in] Additional parameters for the noodlification:
”reduce”: “false”, “forward”, “backward”, “bidirectional”; Execute forward, backward or bidirectional simulation minimization before noodlification.
- Returns:
A list of all (non-empty) noodles.
Create noodles for left and right side of equation (both sides are given as a sequence of automata).
- Parameters:
lhs_automata – [in] Sequence of pointers to segment automata for left side of an equation to noodlify.
rhs_automaton – [in] Sequence of pointers to segment automata for right side of an equation to noodlify.
include_empty – [in] Whether to also include empty noodles.
params – [in] Additional parameters for the noodlification:
”reduce”: “false”, “forward”, “backward”, “bidirectional”; Execute forward, backward or bidirectional simulation minimization before noodlification.
- Returns:
A list of all (non-empty) noodles together with the positions reached from the beginning of left/right side.
-
VisitedEpsilonsCounterVector process_eps_map(const VisitedEpsilonsCounterMap &eps_cnt)¶
Process epsilon map to a sequence of values (sorted according to key desc)
- Parameters:
eps_cnt – Epsilon count
- Returns:
Vector of keys (count of epsilons)
-
class Segmentation¶
- #include <strings.hh>
Class executing segmentation operations for a given segment automaton.
Works only with segment automata.
Public Types
-
using EpsilonDepth = size_t¶
Depth of ε-transitions. Dictionary of lists of ε-transitions grouped by their depth. For each depth ‘i’ we have ‘depths[i]’ which contains a list of ε-transitions of depth ‘i’.
-
using EpsilonDepthTransitions = std::unordered_map<EpsilonDepth, std::vector<Transition>>¶
-
using EpsilonDepthTransitionMap = std::unordered_map<EpsilonDepth, std::unordered_map<State, std::vector<Transition>>>¶
Public Functions
-
inline explicit Segmentation(const SegNfa &aut, const std::set<Symbol> &epsilons)¶
Prepare automaton
aut
for segmentation.- Parameters:
aut – [in] Segment automaton to make segments for.
epsilon – [in] Symbol to execute segmentation for.
-
inline const EpsilonDepthTransitions &get_epsilon_depths() const¶
Get segmentation depths for ε-transitions.
- Returns:
Map of depths to lists of ε-transitions.
-
inline const EpsilonDepthTransitionMap &get_epsilon_depth_trans_map() const¶
Get the epsilon depth trans map object (mapping of depths and states to eps-successors)
- Returns:
Map of depths to a map of states to transitions
-
const std::vector<Nfa> &get_segments()¶
Get segment automata.
- Returns:
A vector of segments for the segment automaton in the order from the left (initial state in segment automaton) to the right (final states of segment automaton).
-
const std::vector<Nfa> &get_untrimmed_segments()¶
Get raw segment automata.
- Returns:
A vector of segments for the segment automaton in the order from the left (initial state in segment automaton) to the right (final states of segment automaton) without trimming (the states are same as in the original automaton).
-
inline const VisitedEpsMap &get_visited_eps() const¶
Private Functions
-
void compute_epsilon_depths()¶
Compute epsilon depths with their transitions.
-
void split_aut_into_segments()¶
Split segment
automaton
intosegments
.
-
void update_next_segment(size_t current_depth, const Transition &transition)¶
Propagate changes to the current segment automaton to the next segment automaton.
- Parameters:
current_depth – [in] Current depth.
transition – [in] Current epsilon transition.
-
void update_current_segment(size_t current_depth, const Transition &transition)¶
Update current segment automaton.
- Parameters:
current_depth – [in] Current depth.
transition – [in] Current epsilon transition.
-
std::unordered_map<State, bool> initialize_visited_map() const¶
Initialize map of visited states.
- Returns:
Map of visited states.
-
std::deque<StateDepthTuple> initialize_worklist() const¶
Initialize worklist of states with depths to process.
- Returns:
Queue of state and its depth pairs.
-
void process_state_depth_pair(const StateDepthTuple &state_depth_pair, std::deque<StateDepthTuple> &worklist)¶
Process pair of state and its depth.
- Parameters:
state_depth_pair – [in] Current state depth pair.
worklist – [out] Worklist of state and depth pairs to process.
-
void add_transitions_to_worklist(const StateDepthTuple &state_depth_pair, const SymbolPost &move, std::deque<StateDepthTuple> &worklist)¶
Add states with non-epsilon transitions to the
worklist
.- Parameters:
move[in] – Move from current state.
depth[in] – Current depth.
worklist[out] – Worklist of state and depth pairs to process.
-
void handle_epsilon_transitions(const StateDepthTuple &state_depth_pair, const SymbolPost &move, std::deque<StateDepthTuple> &worklist)¶
Process epsilon transitions for the current state.
- Parameters:
state_depth_pair – [in] Current state depth pair.
move – [in] Move from current state.
worklist – [out] Worklist of state and depth pairs to process.
-
void remove_inner_initial_and_final_states()¶
Remove inner initial and final states.
Remove all initial states for all segments but the first one and all final states for all segments but the last one.
Private Members
-
const std::set<Symbol> epsilons¶
Symbol for which to execute segmentation. Automaton to execute segmentation for. Must be a segment automaton (can be split into
segments
).
-
EpsilonDepthTransitions epsilon_depth_transitions = {}¶
Epsilon depths.
-
EpsilonDepthTransitionMap eps_depth_trans_map = {}¶
-
std::vector<SegNfa> segments = {}¶
Epsilon depths with mapping of states to epsilon transitions.
Segments for
automaton
.
-
VisitedEpsMap visited_eps = {}¶
-
struct StateDepthTuple¶
number of visited eps for each state
Pair of state and its depth.
Public Members
-
EpsilonDepth depth¶
Depth of a state.
-
EpsilonDepth depth¶
-
using EpsilonDepth = size_t¶
-
struct TransducerNoodleElement¶
- #include <strings.hh>
Public Functions
-
using VisitedEpsilonsCounterVector = std::vector<unsigned>¶
-
using Transition = nfa::Transition¶
-
namespace strings¶