Abstract
For a regular language L, let \({{\,\mathrm{Var}\,}}(L)\) be the minimal number of nonterminals necessary to generate L by right linear grammars. Moreover, for natural numbers \(k_1,k_2,\ldots ,k_n\) and an nary regularity preserving operation f, let \(g_f^{{{\,\mathrm{Var}\,}}}(k_1,k_2,\ldots ,k_n)\) be the set of all numbers k such that there are regular languages \(L_1,L_2,\ldots , L_n\) such that \({{\,\mathrm{Var}\,}}(L_i)=k_i\) for \(1\le i\le n\) and \({{\,\mathrm{Var}\,}}(f(L_1,L_2,\ldots , L_n))=k\). We completely determine the sets \(g_f^{{{\,\mathrm{Var}\,}}}\) for the operations reversal, Kleeneclosures \(+\) and \(*\), and union; and we give partial results for product and intersection.
Introduction and definitions
In the last 30 years, the following problem was studied very intensively: How behaves the complexity of languages under operations (on languages). More precisely, for a complexity measure K on regular languages and a regularity preserving nary function f on languages, one is interested in the set \(g_f^K(k_1,k_2,\ldots ,k_n)\) of all numbers k such that there are regular languages \(L_1,L_2,\ldots , L_n\) such that \(K(L_i)=k_i\) for \(1\le i\le n\) and \(K(f(L_1,L_2,\ldots , L_n))=k\) (in many cases only the maximal number \(h_f^K(k_1,k_2,\ldots ,k_n)\) in \(g_f^K(k_1,k_2,\ldots ,k_n)\) is considered).
Most of the research concerns the state complexity \({{\,\mathrm{sc}\,}}\), where \({{\,\mathrm{sc}\,}}(L)\) is defined as the minimal number of states of a deterministic finite automata which accepts L. As an example we mention
(see [12]). For further results concerning the behaviour of the state complexity under operations, we refer to the survey article [5].
Because the family of regular languages can be characterized by nondeterministic finite automata, by incomplete deterministic finite automata, regular expressions etc., it is natural to study the set \(g_f^{K}\) and the function \(h_f^K\) for measures K connected with those characterizations.
The state complexity of a regular language coincides with the number of its left quotients by words. However, using the quotients instead of states gives some further interesting aspects. The behaviour of the quotient complexity under operations was intensively studied by Brzozowski and others. A summary is given in [1].
In 2003, Holzer and Kutrib started the investigation with respect to the nondeterministic state complexity \({{\,\mathrm{nsc}\,}}\) where \({{\,\mathrm{nsc}\,}}(L)\) is defined as the minimal number of states of a nondeterministic finite automata which accepts L [10, 11]. We mention here the facts given in Table 1.
A complete deterministic automaton with r states and s input symbols has \(r\cdot s\) transitions. However, often it is not necessary to require completeness, i.e., it can be that, for some pairs of states and input symbols, no transition is defined. For such incomplete automata, one can study the transition complexity tc where tc(L) is the minimal number of transitions in incomplete deterministic finite automata which accepts L. The behaviour of this measure was studied in [6, 16]. For instance, in [6], it was shown that for all numbers \(m\ge 1\) and \(n\ge 1\),
The behaviour of the size of regular expressions was studied in [4, 7].
We now introduce a further concept which characterizes regular languages. We start with some notation.
For a set M, by \({{\,\mathrm{card}\,}}(M)\) we denote its cardinality. An alphabet is a nonempty finite set. The elements of an alphabet are called letters. A word over an alphabet V is a finite sequence of letters of V. The length of a word is the number of its letters (where we count the letters as often as they occur in the word); the length of a word w is denoted by \(\vert w\vert \). A word w of length \(n\ge 1\) is written as \(w=a_1a_2\ldots a_n\), where \(a_i\in V\) for \(1\le i\le n\). The empty word (of length 0) is designated by \(\lambda \). The set of all nonempty words of V is denoted by \(V^+\), and we set \(V^*=V^+\cup \{\lambda \}\). The reversal of a nonempty word \(w=a_1a_2\ldots a_n\) is defined by \(w^R=a_na_{n1}\ldots a_1\); moreover, we set \(\lambda ^R=\lambda \).
Any subset of \(V^*\) is called a language over V. For two languages L and \(L'\), we define the concatenation by \(LL'=\{ ww' \mid w\in L, w'\in L'\}\), the powers of L by \(L^0=\{\lambda \}\), \(L^n=L^{n1}L\) for \(n\ge 1\), the positive Kleeneclosure, the Kleeneclosure and the reversal by
A right linear grammar is a quadruple \(G=(N,T,P,S)\) where N and T are two disjoint (nonempty) alphabets, P is a finite subset of \(N\times (T^*N\cup T^*)\), and \(S\in N\). The elements of N and T are called nonterminals and terminals, respectively. The elements of P are called rules. For a rule (A, w), we mostly use the notation \(A\rightarrow w\). S is called the axiom or the start symbol.
For two words \(x\in T^*N\) and \(y\in T^*N\cup T^*\), we write \(x\Longrightarrow y\) if and only if there is a rule \(A\rightarrow w\) in P such that \(x=vA\) and \(y=vw\), and then we say that x derives y in one derivation step. Let \(\Longrightarrow ^*\) be the reflexive and transitive closure of \(\Longrightarrow \). If \(x\Longrightarrow ^* y\), we say that y is derived from x.
The language L(G) generated by a right linear grammar \(G=(N,T,P,S)\) is defined by
It is known that a language L is regular if and only if it is generated by a right linear grammar.
For a right linear grammar \(G=(N,T,P,S)\), we define the complexity measure \({{\,\mathrm{Var}\,}}\) and extend it for regular languages by setting
and
A right linear grammar \(G=(N,T,P,S)\) is called to be in normal form, if all rules of P are of the form \(A\rightarrow aB\) or \(A\rightarrow \lambda \) where A and B are nonterminals and a is a terminal. Let \({{\,\mathrm{Varnf}\,}}\) be the complexity if one restricts to right linear grammars in normal form. Then it is known that \({{\,\mathrm{Varnf}\,}}(L)={{\,\mathrm{nsc}\,}}(L)\). However, the complexities of concrete languages differ essentially. For instance, it is obvious that \({{\,\mathrm{nsc}\,}}(\{a^m\}^*)=m\) and \({{\,\mathrm{Var}\,}}(\{a^m\}^*)=1\).
Obviously, contextfree grammars (where the rules have the form \(A\rightarrow w\in (N\cup T)^*\) with \(A\in N\) and \(w\in (N\cup T)^*\)) are a generalization of right linear grammars, and one can define the \({{\,\mathrm{Var}\,}}\)complexity for contextfree grammars, which is denoted by \({{\,\mathrm{Varcf}\,}}\). The behaviour of this measure under operations was studied in [3]. Some results are presented in Table 2 (there are no results for intersection and complement because the class of contextfree languages is not closed under these two operations).
We mention that, again, the \({{\,\mathrm{Var}\,}}\)complexities of languages can considerably differ. For instance, the set \(U=\{aba\}^+\{ab^2a\}^+\cdots \{ab^{2m}a\}^+\) has the complexities \({{\,\mathrm{Varcf}\,}}(U)=m\) (by Lemma 2.3 in [3]) and \({{\,\mathrm{Var}\,}}(U)=2m\) (by Lemma 1 below).
In this paper we continue the research mentioned above by a (partial) determination of the sets \(g_f^{{{\,\mathrm{Var}\,}}}\) with respect to the operations union, concatenation, Kleeneclosure, intersection, and reversal.
By the above remarks, the study of the sets \(g_f^{{{\,\mathrm{Var}\,}}}\) is an intermediate step between the investigation of \(g_f^{{{\,\mathrm{nsc}\,}}}\) and the study of \(g_f^{{{\,\mathrm{Varcf}\,}}}\). Therefore our results sometimes look similar to those of [3, 12,13,14]. In some cases, below, we can use modifications of the proofs in those papers. However, by the above mentioned differences in the complexities for certain languages, mostly, we cannot follow the ideas of that papers. For instance, to prove the first line of Table 1 the authors only use finite and unary languages. However, such languages cannot be used to prove an analogous statement for the measure \({{\,\mathrm{Var}\,}}\) because \({{\,\mathrm{Var}\,}}(L)\le 2\) holds for any finite and/or unary language L.
The complexity of some special languages
In this section we determine the complexities of some languages, which are used later.
For two different letters a and b, a natural number p with \(p\ge 1\), and pairwise different natural numbers \(r_1, r_2,\ldots ,r_p\) with \(1\le r_i\) for \(1\le i\le p\), we set
Lemma 1
For two different letters a and b, natural numbers \(n\ge 1\), \(p_1,p_2, \ldots ,p_n\), \(p_i\ge 1\) for \(1\le i\le n\), and pairwise different natural numbers \(r_{1,1}, r_{1,2},\ldots ,r_{1,p_1}, r_{2,1},r_{2,2},\ldots ,r_{2,p_2}, \ldots ,r_{n,p_n}\), \(r_{i,j}\ge 1\) for \(1\le i\le n\), \(1\le j\le p_i\), let
Then we have
and
Proof
Let \(G=(N,\{a,b\},P,S)\) be a right linear grammar such that
Further, let \(t={{\,\mathrm{card}\,}}(N)\) and \(t'=\max \{ \vert z\vert \mid A\rightarrow z\in P\}\).
For i and j, \(1\le i\le n\), \(1\le j\le p_i\), we consider the word
and its derivation. There are nonterminals \(A_1,A_2,\ldots ,A_{t+1}\) such that
with \(w_i=ab^{r_{i,1}}aab^{r_{i,2}}a\ldots ab^{r_{i,j1}}aw_i'\), \(\vert w_i'\vert \le t'\), \(2r_{i,j}+3\le \vert v_{i,l}\vert \le 2r_{i,j}+3+t'\) for \(1\le l\le t\). Because \(\vert w_i'v_{i,1}v_{i,2}\ldots v_{i,t1}v_{i,t}\vert \le t'+(2r_{i,j}+3+t')t\le (5r_{i,j}+t')(t+1)\), we get \(w_i'v_{i,1}v_{i,2}\ldots v_{i,t1}v_{i,t}=(ab^{r_{i,j}}a)^pw''\) with some natural number p and a proper prefix \(w''\) of \(ab^{r_{i,j}}a\). By the length condition for the words \(v_{i,l}\), we obtain that \(v_{i,l}\) contains at least one occurrence of \(ab^{r_{i,j}}a\) as a subword.
Since N contains only t nonterminals, it follows that there are two numbers \(l_1\) and \(l_2\) such that \(A_{l_1}=A_{l_2}\). We set \(B_{i,j}=A_{l_1}=A_{l_2}\). Then the derivation
can be written as
and by construction \(x_{i,j}=x_{i,j}'(ab^{r_{i,j}}a)^{p(i,j)}x_{i,j}''\) for some natural number \(p(i,j)\ge 1\), a proper suffix \(x_{(i,j)}'\) of \(ab^{r_{i,j}}a\), and a proper prefix \(x_{(i,j)}''\) of \(ab^{r_{i,j}}a\).
If \(B_{i,j}=B_{i,j'}\) for some \(1\le j< j'\le p_i\), then we have a derivation
By construction, \(z\in L(G)=L\). But z contains an occurrence of \(ab^{r_{i,j}}a\) after an occurrence of \(ab^{r_{i,j'}}a\) with \(j<j'\) which contradicts the structure of the words in L (more precisely, in \(P(r_{i,1},r_{i,2},\ldots , p_{i,p_i})\)).
If \(B_{i,j}=B_{i',j'}\) for some \(1\le i<i'\le n\), \(1\le j\le p_i\), and \(1\le j'\le p_{i'}\), then we can analogously show that a word is generated which contains an occurrence of \(ab^{r_{i,j}}a\) as well as an occurrence of \(ab^{r_{i',j'}}a\) which contradicts the structure of words in L, again.
Thus N contains at least the \(p_1+p_2+\cdots +p_n\) different letters \(B_{i,j}\), \(1\le i\le n\), \(1\le j\le p_i\).
Moreover, if \(n\ge 2\) and \(B_{i,j}\) is the axiom for certain \(1\le i\le n\), \(1\le j\le p_i\), then we can as above produce a word which contains an occurrence of \(ab^{r_{i,j}}a\) as well as an occurrence of \(ab^{r_{i',j'}}a\), where \(i'\ne i\). Again, we obtain a contradiction to the structure of L. Thus, N contains a start symbol different from all symbols \(B_{i,j}\).
Hence we obtain \({{\,\mathrm{Var}\,}}(L)\ge p_1+ p_2+\cdots +p_n+1\) for \(n\ge 2\) and \({{\,\mathrm{Var}\,}}(L)\ge p_1\) for \(n=1\).
On the other hand, the right linear grammars \(G_1=(N_1,\{a,b\},P_1,C_1)\) and \(G_{\ge 2}=(N_{\ge 2},\{a,b\},P_{\ge 2},S)\) with
generate L for \(n=1\) and \(n\ge 2\), respectively. Thus
The proof that we need at least \(p_1\) nonterminals for the generation of the set \(P'(r_{1,1},r_{1,2},\ldots ,r_{1,p_1})\) can be analogously given to the proof of this statement for \(P(r_{1,1},r_{1,2},\ldots ,r_{1,p_1})\). To show that that \(p_1\) nonterminals are sufficient we add the rules \(C_i\rightarrow C_{i+1}\) for \(1\le i\le p_11\) and \(C_{p_1}\rightarrow \lambda \) to the rules of \(G_1\), which results in a grammar generating \(P'(r_{1,1},r_{1,2},\ldots ,r_{1,p_1})\) with \(p_1\) nonterminals. \(\square \)
Lemma 2
For three pairwise different letters a, b and c, natural numbers \(p\ge 1\) and \(q\ge 1\), and pairwise different numbers \(r_1,r_2,\ldots , r_p, s_1,s_2,\ldots , s_q\), \(r_i\ge 1\) for \(1\le i\le p\) and \(s_j\ge 1\) for \(1\le j\le q\), we have

(i)
\({{\,\mathrm{Var}\,}}(P(r_1,r_2,\ldots ,r_p)\{c\})=p\),

(ii)
\({{\,\mathrm{Var}\,}}(\{c\}P(r_1,r_2,\ldots ,r_p))=p+1\),

(iii)
\({{\,\mathrm{Var}\,}}(\{c\}P(r_1,r_2,\ldots ,r_p)\cup P(s_1,s_2,\ldots ,s_q))=p+q+1\).
Proof

(i)
The proof that we need at least p nonterminals for the generation of \(P(r_1,r_2,\ldots ,r_p)\{c\}\) can be analogously given to the corresponding statement for the set \(P(r_{1,1},r_{1,2},\ldots ,r_{1,p_1})\) in the proof of Lemma 1. To show that p rules are sufficient we consider the right linear grammar \(G_1'=(\{C_1,C_2,\ldots C_p\}, \{a,b,c\},P_1',C_1)\) with
$$\begin{aligned} P_1'=\{C_p\rightarrow ab^{r_p}aC_p, C_p\rightarrow ab^{r_p}ac\} \cup \bigcup _{i=1}^{p1} \{C_i\rightarrow ab^{r_i}aC_i, C_i\rightarrow ab^{r_i}aC_{i+1}\}, \end{aligned}$$which generates \(P(r_1,r_2,\ldots ,r_p)\{c\}\) with p nonterminals.

(ii)
As in the proof of Lemma 1, we can show the existence of p different nonterminals \(B_i\), \(1\le i\le p\), with derivations
If one of these nonterminals is the axiom, say \(B_i\), we have the derivation \(B_i\Longrightarrow ^* u_i(ab^{r_i}a)^{p_i}v_iB_i\Longrightarrow ^* u_iab^{r_i}av_iz_i\) which produces a word which does not start with c. Thus, we need at least \(p+1\) nonterminals.
Moreover, we modify \(G_1'\) of part i) to \(G_1''\) adding a new nonterminal C which is the axiom of \(G_1''\), by replacing \(C_p\rightarrow ab^{r_p}ac\) by \(C_p\rightarrow ab^{r_p}a\) and adding the rule \(C\rightarrow cC_1\). Then \(G_1''\) generates \(\{c\}P(r_1,r_2,\ldots ,r_p)\) with \(p+1\) nonterminals.

(iii)
As in the proof of Lemma 1, we can show that we need \(p+q\) different nonterminals to generate the words \(cab^{r_1}a\ldots ab^{r_{i1}}a(ab^{r_i}a)^tab^{r_{i+1}}a\ldots ab^{r_p}a\), \(1\le i\le p\), and \(ab^{s_1}a\ldots ab^{s_{j1}}a(ab^{s_j}a)^tab^{s_{j+1}}a\ldots ab^{s_q}a\), \(1\le j\le q\), for sufficiently large t. Furthermore, as in the proof of Lemma 1, we can show that these nonterminals are pairwise different and different from the axiom. Therefore \({{\,\mathrm{Var}\,}}(\{c\}P(r_1,r_2,\ldots ,r_p)\cup P(s_1,s_2,\ldots ,s_q))\ge p+q+1\).
Because the grammar \(G_2'=(\{S,C_1,C_2,\ldots ,C_p,D_1,D_2,\ldots ,D_q\}, \{a,b,c\},P_2',S)\) with
$$\begin{aligned} P_2'= & {} \{S\rightarrow cC_1, S\rightarrow D_1, C_p\rightarrow ab^{r_p}aC_p, C_p\rightarrow ab^{r_p}a, D_q\rightarrow ab^{s_q}aD_q, D_q\rightarrow ab^{s_q}a\} \\&\cup \!\bigcup _{i=1}^{p1} \{C_i\rightarrow ab^{r_i}aC_i, C_i\rightarrow ab^{r_i}aC_{i+1}\} \cup \!\bigcup _{j=1}^{q1} \{D_j\rightarrow ab^{s_j}aD_j, D_j\rightarrow ab^{s_j}aD_{j+1}\} \end{aligned}$$generates \(\{c\}P(r_1,r_2,\ldots ,r_p)\cup P(s_1,s_2,\ldots ,s_q)\) with \(p+q+1\) nonterminals, the statement follows. \(\square \)
Lemma 3
For pairwise different letters a, b, and c, natural numbers \(p\ge 1\), \(q\ge 1\), and pairwise different numbers \(r_1,r_2,\ldots , r_p, s_1,s_2,\ldots , s_q\), \(r_i\ge 1\) for \(1\le i\le p\) and \(s_j\ge 1\) for \(1\le j\le q\), we have

(i)
\({{\,\mathrm{Var}\,}}(P(r_1,r_2,\ldots ,r_p)\cup U(s_1,s_2,\ldots ,s_q))=p+2\),

(ii)
\({{\,\mathrm{Var}\,}}(\{c\}P(r_1,r_2,\ldots ,r_p)\cup U(s_1,s_2,\ldots ,s_q)=p+2\),

(iii)
\({{\,\mathrm{Var}\,}}(P'(r_1,r_2,\ldots ,r_{p'})U(s_1,s_2,\ldots ,s_q)P(r_{p'+1},r_{p'+2},\ldots ,r_p))=p+1\), where \(p> p'\ge 1\) holds in addition,

(iv)
\({{\,\mathrm{Var}\,}}(U(s_1,s_2,\ldots ,s_q)P(r_{1},r_{2},\ldots ,r_p))=p+1\).
Proof
(i) As in Lemma 1, we can show that the generation of
for sufficiently large t requires \(p+1\) different nonterminals which are also different from the axiom. Thus \({{\,\mathrm{Var}\,}}((P(r_1,r_2,\ldots ,r_p)\cup U(s_1,s_2,\ldots ,s_q))\ge p+2\).
On the other hand, the grammar \(H=(\{S,C,C_1,C_2,\ldots ,C_p\},\{a,b\},P,S)\) with
generates \(P(r_1,r_2,\ldots ,r_p)\cup U(s_1,s_2,\ldots ,s_q)\) with \(p+2\) nonterminals. (ii) can be analogously shown. (iii) As in Lemma 1, we can show that the generation of the words
for sufficiently large t requires \(p+1\) different nonterminals. Thus
Since the grammar \(H'=(\{C,C_1,C_2,\ldots C_p\},\{a,b\},P',C_1)\) with
generates \(P'(r_1,r_2,\ldots ,r_{p'})U(s_1,s_2,\ldots ,s_q)P(r_{p'+1},r_{p'+2},\ldots ,r_p)\) with \(p+1\) nonterminals, the statement follows. (iv) can be shown analogously. \(\square \)
Lemma 4

(i)
For different letters a and b, we have \({{\,\mathrm{Var}\,}}(\{b\}\{a,b\}^*)=2\).

(ii)
For different letters a and b, a natural \(p\ge 1\), and pairwise different natural numbers \(r_1,r_2,\ldots ,r_p\), \(r_i\ge 1\) for \(1\le i\le p\), we have \({{\,\mathrm{Var}\,}}(P(r_1,r_2,\ldots ,r_p)\cup \{c\}\{a,b\}^*)=p+2\).
Proof

(i)
Let \(G=(N,\{a,b\},P,S)\) be a right linear grammar with \(L(G)=\{b\}\{a,b\}^*\) and \({{\,\mathrm{Var}\,}}(G)={{\,\mathrm{Var}\,}}(\{b\}\{a,b\}^*)\). Since \(ba^t\in L(G)\) for all (sufficiently large) numbers \(t\ge 1\), there is a nonterminal \(A\in N\) such that \(A\Longrightarrow ^* a^rA\) for some \(r\ge 1\). If \(A=S\), then we have a derivation \(S=A\Longrightarrow ^* a^rA=a^rS\Longrightarrow ^* a^rb\) (because \(S\Longrightarrow ^* b\in L(G)=\{b\}\{a,b\}^*\)), which produces a word not in \(\{b\}\{a,b\}^*\), which is a contradiction. Hence, \({{\,\mathrm{Var}\,}}(\{b\}\{a,b\}^*)={{\,\mathrm{Var}\,}}(G)\ge 2\).
Because \(H=(\{S,S'\},\{a,b\},\{S\rightarrow bS', S'\rightarrow aS', S'\rightarrow bS', S'\rightarrow \lambda \},S)\) generates \(\{b\}\{a,b\}^*\) with two nonterminals, we obtain \({{\,\mathrm{Var}\,}}(\{b\}\{a,b\}^*)=2\).

(ii)
can be shown analogously to the proofs of Lemma 3 and part i). \(\square \)
Behaviour under operations
In this section we study the behaviour of the measure \({{\,\mathrm{Var}\,}}\) for regular languages (presented by right linear grammars) under some operations.
Reversal
We start with the operation reversal.
Theorem 1
We have
Proof

(i)
We first prove that, for a language L with \({{\,\mathrm{Var}\,}}(L)=n\), \({{\,\mathrm{Var}\,}}(L^R)\le n+1\) holds.
Since \({{\,\mathrm{Var}\,}}(L)=n\), L is generated by a right linear grammar \(G=(N,T,P,S)\) with \({{\,\mathrm{card}\,}}(N)=n\). We construct the grammar \(G^R=(N\cup \{S^R\}, T,P^R, S^R)\) where \(S^R\) is a new symbol, i.e., \(S^R\notin N\), and \(P^R\) consists of all rules constructed as follows:

if \(A\rightarrow w\in P\), \(w\in T^*\) and \(A\in N\), then \(S^R\rightarrow w^RA\in P^R\),

if \(A\rightarrow wB\in P\), \(w\in T^*\), \(A\in N\), and \(B\in N\), then \(B\rightarrow w^RA\in P^R\),

if \(S\rightarrow w\in P\) and \(w\in T^*\), then \(S^R\rightarrow w^R\in P^R\), and

\(S\rightarrow \lambda \in P^R\).
By construction, \(G^R\) is right linear and satisfies \(L(G^R)=L^R\). Moreover, we have \({{\,\mathrm{Var}\,}}(G^R)=n+1\). Therefore \({{\,\mathrm{Var}\,}}(L^R)\le n+1\).


(ii)
We now prove that, for a language L with \({{\,\mathrm{Var}\,}}(L)=n\), \({{\,\mathrm{Var}\,}}(L^R)\ge n1\) holds.
Assume that there is a language L such that \({{\,\mathrm{Var}\,}}(L)=n\) and \({{\,\mathrm{Var}\,}}(L^R)\le n2\). Since \((L^R)^R=L\), we obtain from i) that \({{\,\mathrm{Var}\,}}(L)={{\,\mathrm{Var}\,}}((L^R)^R)\le {{\,\mathrm{Var}\,}}(L^R)+1 \le n1\) in contrast to our supposition.

(iii)
We now present witnesses for the values n for \(n\ge 1\), \(n+1\) for \(n\ge 1\), and \(n1\) for \(n\ge 2\).
Let \(n\ge 1\) and \(L_n=P(1,2,\ldots ,n)\). Then we obtain \(L_n^R=P(n,n1,\ldots ,1)\). By Lemma 1, we get \({{\,\mathrm{Var}\,}}(L_n)={{\,\mathrm{Var}\,}}(L_n^R)=n\).
Let \(n\ge 1\) and \(K_n=P(1,2,\ldots ,n)\{c\}\). Then \(K_n^R=\{c\}P(n,n1,\ldots ,1)\) and, by Lemma 2, \({{\,\mathrm{Var}\,}}(K_n)=n\) and \({{\,\mathrm{Var}\,}}(K_n^R)=n+1\).
Let \(n\ge 2\) and \(M_n=\{c\}P(1,2,\ldots ,n1)\). Then \(M_n^R=P(n1,n2,\ldots ,1)\{c\}\). By Lemma 2, we have \({{\,\mathrm{Var}\,}}(M_n)=n\) and \({{\,\mathrm{Var}\,}}(M_n^R)=n1\). \(\square \)
Kleeneclosure
First, we consider the positive closure.
Theorem 2
For \(n\ge 1\), we have \(g_+^{{{\,\mathrm{Var}\,}}}(n)=\{1,2,\ldots ,n\}\).
Proof

(i)
We show that \({{\,\mathrm{Var}\,}}(L)=n\) implies \({{\,\mathrm{Var}\,}}(L^+)\le n\).
Let L be a language with \({{\,\mathrm{Var}\,}}(L)=n\). Then there is a right linear grammar \(G=(N,T,P,S)\) such that \({{\,\mathrm{card}\,}}(N)=n\) and \(L(G)=L\). We construct the grammar \(G'=(N,T,P',S)\) where
$$\begin{aligned} P'=P\cup \{A\rightarrow wS \mid A\rightarrow w\in P\}. \end{aligned}$$By construction, after finishing a derivation in G, we can start a new derivation in \(G'\). Thus \(L(G')=L^+\). Because \({{\,\mathrm{Var}\,}}(G')={{\,\mathrm{card}\,}}(N)=n\), we obtain \({{\,\mathrm{Var}\,}}(L^+)\le n\).

(ii)
We now show that \(n\in g_+^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 1\). Let \(L_n=P(1,2,\ldots ,n)\). By Lemma 1, \({{\,\mathrm{Var}\,}}(L_n)=n\).
Let \(G=(N,\{a,b\}, P, S)\) be a right linear grammar such that \(L(G)=L_n^+\). Since \(L_n\subseteq L_n^+\), we can prove as in the proof of Lemma 1 that there are n different nonterminals \(B_1,B_2,\ldots ,B_n\) in N such that there are derivations
$$\begin{aligned} S\Longrightarrow ^* y_iB_i\Longrightarrow ^* y_ix_iB_i\Longrightarrow ^* y_ix_iy_i' \end{aligned}$$where \(x_i=x_i'ab^iax_i''\) with \(x_i',x_i'', y_i,y_i'\in \{a,b\}^*\). Thus we have \({{\,\mathrm{Var}\,}}(L_n^+)\ge n\). By i) we get \({{\,\mathrm{Var}\,}}(L_n^+)=n\). Hence \(n\in g_+^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 1\).

(iii)
We now prove that all values k with \(3\le k\le n1\) are in \(g_+^{{{\,\mathrm{Var}\,}}}(n)\). Let
$$\begin{aligned} K_{n,k}=P(1)\cup P(2)\cup \ldots \cup P(nk)\cup P(nk+1,nk+2,\ldots ,n1). \end{aligned}$$By Lemma 1, we have \({{\,\mathrm{Var}\,}}(K_{n,k})=n\). Moreover, let
$$\begin{aligned} U=(\{aba, ab^2a,\ldots ,ab^{nk}a\}\cup P(nk+1,nk+2,\ldots ,n1))^+. \end{aligned}$$Because \(ab^ia\in K_{n,k}\) for \(1\le i\le nk\) and \(P(nk+1,nk+2,\ldots ,n1)\subseteq K_{n,k}\), we get \(U\subseteq K_{n,k}^+\). Furthermore, \((ab^ia)^j\in U\) for \(1\le i\le nk\) and \(j\ge 1\) and \(P(nk+1,nk+2,\ldots ,n1)\subseteq U\) imply \(K_{n,k}^+\subseteq U\). Therefore we have \(U=K_{n,k}^+\).
Because the right linear grammar \((\{S,A_1,A_2,\ldots ,A_{k1}\}, \{a,b\}, P,S)\) with
$$\begin{aligned} P= & {} \{S\rightarrow A_1, A_{k1}\rightarrow ab^{n1}aA_{k1}, A_{k1}\rightarrow ab^{n1}a, A_{k1}\rightarrow ab^{n1}aS\} \\&\qquad \cup \bigcup _{i=1}^{nk}\{ S\rightarrow ab^iaS, S\rightarrow ab^ia\} \\&\qquad \cup \bigcup _{j=1}^{k2} \{A_j\rightarrow ab^{nk+j}aA_j, A_j\rightarrow ab^{nk+j}aA_{j+1}\} \end{aligned}$$generates U, we have \({{\,\mathrm{Var}\,}}(K_{n,k}^+)={{\,\mathrm{Var}\,}}(U)\le k\). On the other hand, let \(G'=(N',\{a,b\},P',S')\) be a right linear grammar such that \(L(G')=K_{n,k}^+\). As in the proof of Lemma 1, we can show that, starting from the words \((aba)^t\) and \(ab^{nk+1}a\ldots ab^{nk+j1}a(ab^{nk+j}a)^tab^{nk+j+1}a\ldots ab^{n1}a\), \(1\le j\le k1\), for sufficiently large t, there are k letters \(B_1, B_{nk+1}, \ldots ,B_{n1}\) with derivations
$$\begin{aligned} B_i&\Longrightarrow ^*&u_i(ab^{i}a)^{p_i}v_iB_i \text{ where } p_i \text{ is } \text{ a } \text{ natural } \text{ number }, u_i \text{ and } v_i \text{ are } \\&\text{ a } \text{ proper } \text{ suffix } \text{ and } \text{ a } \text{ proper } \text{ prefix } \text{ of } ab^{i}a, \text{ respectively }, \\ B_i&\Longrightarrow ^*&z_i \in \{a,b\}^*. \end{aligned}$$As in Lemma 1, we can prove that all these nonterminals are different. Thus \({{\,\mathrm{Var}\,}}(K_{n,k}^+)\ge k\).
Combining these estimations we get \({{\,\mathrm{Var}\,}}(K_{n,k}^+)=k\). Thus \(k\in g_+^{{{\,\mathrm{Var}\,}}}(n)\).

(iv)
We show that \(1\in g_+^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 3\). Let \(M_n=P(1)\cup P(2)\cup \ldots \cup P(n1)\). By Lemma 1, \({{\,\mathrm{Var}\,}}(M_n)=n\). Since \(M_n^+=\{aba,ab^2a,\ldots ,ab^{n1}a\}^+\) is generated by the grammar
$$\begin{aligned} G=\left( \{S\},\{a,b\},\bigcup _{i=1}^{n1}\{S\rightarrow ab^iaS, S\rightarrow ab^ia\}, S\right) , \end{aligned}$$we have \({{\,\mathrm{Var}\,}}(M_n^+)=1\). Therefore \(1\in g_+^{{{\,\mathrm{Var}\,}}}(n)\).

(v)
We show that \(2\in g_+^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 3\). Let \(R_n=P(1)\cup P(2)\cup \ldots \cup P(n2)\cup \{c\}P(n1)\).
As in Lemma 2, we can prove that \({{\,\mathrm{Var}\,}}(R_n)=n\).
Moreover, we have \(R_n^+=(\{aba,ab^2a,\ldots ,ab^{n2}a\}\cup \{c\}\{ab^{n1}a\}^+)^+\). Again, as in the proof of Lemma 1, we can show the existence of a nonterminal \(B_{n1}\) with derivations
$$\begin{aligned} B_{n1}&\Longrightarrow ^*&u_{n1}(ab^{n1}a)^{p_{n1}}v_{n1}B_{n1} \text{ where } p_{n1} \text{ is } \text{ a } \text{ natural } \text{ number, } \\&u_{n1} \text{ and } v_{n1} \text{ are } \text{ a } \text{ proper } \text{ suffix } \text{ and } \text{ a } \text{ proper } \text{ prefix } \\&\text{ of } ab^{n1}a, \text{ respectively }, \\ B_{n1}&\Longrightarrow ^*&z_{n1} \in \{a,b\}^*. \end{aligned}$$If \({{\,\mathrm{Var}\,}}(R_n^+)=1\), then \(B_{n1}\) is the start symbol. Then there is a derivation
$$\begin{aligned} B_{n1}\Longrightarrow ^* u_{n1}(ab^{r_{n1}}a)^{p_{n1}}v_{n1}B_{n1}\Longrightarrow ^* u_{n1}(ab^{r_{n1}}a)^{p_{n1}}v_{n1}z_{n1} \end{aligned}$$which derives a word not in \(R_n^+\) since it contains the subword \(ab^{n1}a\) but no c. Therefore \({{\,\mathrm{Var}\,}}(R_n^+)\ge 2.\)
The right linear grammar \((\{S,S'\},\{a,b,c\},P,S)\) with
$$\begin{aligned} P= & {} \{S\rightarrow cS', S'\rightarrow ab^{n1}aS', S'\rightarrow ab^{n1}a, S'\rightarrow ab^{n1}aS\} \\&\qquad \cup \bigcup _{i=1}^{n2}\{S\rightarrow ab^iaS, S\rightarrow ab^ia\} \end{aligned}$$generates \(R_n^+\), which gives \({{\,\mathrm{Var}\,}}(R_n^+)=2\). This proves \(2\in g_+^{{{\,\mathrm{Var}\,}}}(n)\).

(vi)
Finally, we prove \(1\in g_+^{{{\,\mathrm{Var}\,}}}(2)\).
Let \(Q=\{a\} \cup \{a^3\}^*\). Assume that \({{\,\mathrm{Var}\,}}(Q)=1\). Then there is a right linear grammar \((\{S\},\{a\},P,S)\) which generates Q. Obviously, P contains a rule \(S\rightarrow a^kS\), \(k\ge 1\), since otherwise a finite language is generated. Moreover, there is a derivation \(S\Longrightarrow ^* a\). Then we have the derivations
$$\begin{aligned} S\Longrightarrow a^kS\Longrightarrow ^* a^ka=a^{k+1} \text{ and } S\Longrightarrow a^kS\Longrightarrow a^ka^kS\Longrightarrow ^* a^ka^ka=a^{2k+1}. \end{aligned}$$Because \(k+1>1\), we obtain \(k+1=3p\) and \(2k+1=3q\) for certain positive integers p and q. We add these equations and get \(3k+2=3(p+q)\) or equivalently \(2=3(p+qk)\) which is impossible. This contradiction proves \({{\,\mathrm{Var}\,}}(Q)\ge 2\).
The right linear grammar \((\{S,S'\}, \{a\},\{S\rightarrow a, S\rightarrow S', S'\rightarrow a^3S', S'\rightarrow a^3\},S)\) generates Q. Therefore \({{\,\mathrm{Var}\,}}(Q)=2\).
Moreover, we have \(Q^+=\{a\}^+\) and \(\{a\}^+\) is generated by the right linear grammar \((\{S\},\{a\},\{S\rightarrow aS, S\rightarrow a\},S)\). Hence \({{\,\mathrm{Var}\,}}(Q^+)=1\) and \(1\in g_+^{{{\,\mathrm{Var}\,}}}(2)\). \(\square \)
Theorem 3
For \(n\ge 1\), we have \(g_*^{{{\,\mathrm{Var}\,}}}(n)=\{1,2,\ldots ,n+1\}\).
Proof

(i)
We first prove that \({{\,\mathrm{Var}\,}}(L)=n\) implies \({{\,\mathrm{Var}\,}}(L^*)\le n+1\). We start with a grammar \(G=(N,T,P,S)\) such that \(L(G)=L\) and \({{\,\mathrm{Var}\,}}(G)={{\,\mathrm{Var}\,}}(L)=n\). Then we construct the grammar \(G'=(N,T,P',S)\) which generates \(L^+\) as in part i) of the proof of Theorem 2. Now we modify \(G'\) to \(G''=(N\cup \{S'\}, T, P'\cup \{S'\rightarrow \lambda ,\) \(S'\rightarrow S\},S')\) where \(S'\) is a new symbol. Then \(L(G'')=L^*\) and \({{\,\mathrm{Var}\,}}(G'')=n+1\). Thus \({{\,\mathrm{Var}\,}}(L^*)\le n+1\).

(ii)
Let \(n\ge 1\). Let \(L_n=P(1,2,\ldots ,n)\{c\}\). Then \({{\,\mathrm{Var}\,}}(L_n)=n\) by Lemma 2.
Let \(G=(N,\{a,b,c\},P,S)\) be a grammar such that \(L(G)=L_n^*\) and \({{\,\mathrm{Var}\,}}(G)={{\,\mathrm{Var}\,}}(L_n^*)\). Since \(L_n\subseteq L_n^*\), as in Lemma 1, we can prove the existence of n different nonterminals \(B_1,B_2,\ldots ,B_n\) with a derivation
$$\begin{aligned} B_i&\Longrightarrow ^*&u_i(ab^{i}a)^{p_i}v_iB_i \text{ where } p_i \text{ is } \text{ a } \text{ natural } \text{ number }, \nonumber \\&u_i \text{ and } v_i \text{ are } \text{ a } \text{ proper } \text{ suffix } \text{ and } \text{ a } \text{ proper } \text{ prefix } \text{ of } ab^{i}a, \text{ respectively }, \end{aligned}$$(1)\(1\le i\le n\), in N. Moreover, since \(\lambda \in L_n^*\), there is a derivation \(S\Longrightarrow ^* C\Longrightarrow \lambda \) where \(C\rightarrow \lambda \) is applied in the last step. If \(C=B_i\) for some i, \(1\le i\le n\), then we have a derivation
$$\begin{aligned} S\Longrightarrow ^* C=B_i\Longrightarrow u_i(ab^{i}a)^{p_i}v_iB_i = u_i(ab^{i}a)^{p_i}v_iC \Longrightarrow u_i(ab^{i}a)^{p_i}v_i, \end{aligned}$$which produces a word in \(L(G)=L_n^*\) which is not empty and contains no c. This contradicts the fact that a nonempty word of \(L_n^*\) contains a c. Thus we need an additional nonterminal, i.e., \({{\,\mathrm{Var}\,}}(L_n^*)\ge n+1\). By part i) of this proof, \({{\,\mathrm{Var}\,}}(L_n^*)=n+1\). Hence \(n+1\in g_*^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 1\).

(iii)
Let \(n\ge 1\) and \(L_n'=P'(1,2,\ldots ,n)\). Then we have \({{\,\mathrm{Var}\,}}(L_n')=n\) by Lemma 1. Let \(G=(N,\{a,b\},P,S)\) be a grammar such that \(L(G)=(L_n')^*\) and \({{\,\mathrm{Var}\,}}(G)={{\,\mathrm{Var}\,}}((L_n')^*)\). Since \(L_n'\subseteq (L_n')^*\), as in Lemma 1, we can prove the existence of n different nonterminals \(B_1,B_2,\ldots ,B_n\) with a derivation (1). Hence \({{\,\mathrm{Var}\,}}((L_n')^*)\ge ~n\).
Because the grammar \(H=(\{B_1,B_2,\ldots ,B_n\}, \{a,b\}, P,B_1)\) with
$$\begin{aligned} P= \{ B_n\rightarrow ab^naB_n, B_n\rightarrow B_1, B_n\rightarrow \lambda \} \cup \bigcup _{i=1}^{n1} \{B_i\rightarrow ab^iaB_i, B_i\rightarrow B_{i+1}\} \end{aligned}$$generates \((L_n')^*\), we obtain \({{\,\mathrm{Var}\,}}((L_n')^*)=n\).
Hence \(n\in g_*^{{{\,\mathrm{Var}\,}}}(n)\) for \(n\ge 1\).

(iv)
\(1,2,\ldots ,n1\in g_*^{{{\,\mathrm{Var}\,}}}(n)\) can be shown by the witnesses given in the proof of Theorem 2; we only have to add the rule \(S\rightarrow \lambda \) for the axiom S because then the Kleeneclosure of the witness is generated. \(\square \)
Concatenation
For concatenation, we only present some partial results.
Lemma 5
For any positive integers \(n\ge 1\) and \(m\ge 1\), languages \(L_n\) and \(K_m\) with \({{\,\mathrm{Var}\,}}(L_n)=n\) and \({{\,\mathrm{Var}\,}}(K_m)=m\), we have \({{\,\mathrm{Var}\,}}(L_nK_m)\le n+m\).
Proof
Let \(G_n=(N_n,T,P_n,S_n)\) and \(H_m=(N_m,T,P_m,S_m)\) be two right linear grammars such that \(L(G_n)=L_n\), \(L(H_m)=K_m\), \({{\,\mathrm{card}\,}}(N_n)={{\,\mathrm{Var}\,}}(L_n)=n\), and \({{\,\mathrm{card}\,}}(N_m)={{\,\mathrm{Var}\,}}(K_m)=m\). We can assume that \(N_n\cap N_m=\emptyset \). We consider the right linear grammar
Since the finishing of a derivation in \(G_n\) by a rule \(A\rightarrow w\) is replaced by an application of \(A\rightarrow wS_m\), we have to continue with a derivation in \(H_m\). Thus \(L(G)=L_nK_m\). Because \({{\,\mathrm{Var}\,}}(G)=n+m\), we obtain the relation \({{\,\mathrm{Var}\,}}(L_nK_m)\le n+m\). \(\square \)
Theorem 4
For \(n\ge 1\), \(m\ge 1\), we have
Proof

(i)
For \(n\ge 1\) and \(m\ge 1\), let
$$\begin{aligned} L_n=P(1,2,\ldots ,n) \quad \text{ and } \quad K_{m,n}=P(n+1,n+2,\ldots ,n+m). \end{aligned}$$By Lemma 1, \({{\,\mathrm{Var}\,}}(L_n)=n\) and \({{\,\mathrm{Var}\,}}(K_m)=m\). Moreover,
$$\begin{aligned} L_nK_m=P(1,2,\ldots ,n,n+1,\ldots ,n+m). \end{aligned}$$Again, by Lemma 1, \({{\,\mathrm{Var}\,}}(L_nK_m)=n+m\). This proves \(n+m\in g_{\cdot }^{{{\,\mathrm{Var}\,}}}(m,n)\).

(ii)
Let \(n\ge m\ge 2\) and \(n1\ge k'\ge 0\). We consider the languages
$$\begin{aligned}&L_n=P'(1,2,\ldots ,n) \text{ and } K_{m,k'}\\&\quad =U(k'+1,k'+2,\ldots ,n)P(n+1,n+2,\ldots n+m1). \end{aligned}$$Then we obtain
$$\begin{aligned} L_nK_{m,k'}=P'(1,2,\ldots ,k')U(k'+1,k'+2,\ldots ,n)P(n+1,n+2,\ldots n+m1). \end{aligned}$$$$\begin{aligned} {{\,\mathrm{Var}\,}}(L_n)=n,\ {{\,\mathrm{Var}\,}}(K_{m,k'})=m, \quad \text{ and } \quad {{\,\mathrm{Var}\,}}(L_nK_{m,k'})=k'+m. \end{aligned}$$Thus \(k\in g_{\cdot }^{{{\,\mathrm{Var}\,}}}(n,m)\) for \(m\le k\le m+n1\).
If \(n\ge m=1\), then we consider the languages \(L_n=P'(1,2,\ldots ,n)\) and \(K_{m,k'}=U(k'+1,k'+2,\ldots ,n)\).

(iii)
For \(m\ge n\ge 2\) and \(m1\ge k'\ge 0\), we consider
$$\begin{aligned}&L_{n,k'}=P(1,2,\ldots ,n1)U(n+1,n+2,\ldots ,n+mk'),\\&\quad K_m=P'(n+1,n+2,\ldots n+m) \end{aligned}$$and obtain \(k\in g_{\cdot }^{{{\,\mathrm{Var}\,}}}(n,m)\) for \(n\le k\le m+n1\).
For \(m\ge n=1\), we consider the modification analogous to that in ii). \(\square \)
With respect to numbers which are smaller than \(\min \{n,m\}\), we only know that 1 can be obtained if \(n\ge 2\) and \(m\ge 2\).
Lemma 6
For \(n\ge 2\) and \(m\ge 2\), we have \(1\in g_{\cdot }^{{{\,\mathrm{Var}\,}}}(n,m)\).
Proof
Let h be the homomorphism defined by \(h(a)=b\) and \(h(b)=a\).
If \(n\ge 3\) and \(m\ge 3\), we consider the languages
Then we obtain \({{\,\mathrm{Var}\,}}(L_n)=n\) and \({{\,\mathrm{Var}\,}}(K_m)=m\) by Lemma 4 and symmetry. Moreover, \(L_nK_m=\{a,b\}^+\), which gives \({{\,\mathrm{Var}\,}}(L_nK_m)=1\).
If \(n=2\) or \(m=2\), we replace \(P(1,2,\ldots ,n2)\) and \(P(1,2,\ldots ,m2)\) by the empty set. \(\square \)
With respect to state complexity, there are investigations where instead of all regular languages only members of some subregular families are considered. For instance, the restriction to prefixfree languages was considered in [8, 15] (where a language L is prefixfree if and only if no proper prefix of a word in L belongs to L). We now show that such restrictions lead to the situation that small values cannot be obtained for the \({{\,\mathrm{Var}\,}}\)complexity of \(L_nK_m\).
Definition 1
Two languages L and K are well concatenated if and only if, for any words u and v, \(u\in L\) and \(uv\in LK\) implies \(v\in K\).
As an example, the languages \(L=\{x,x^2\}\) and \(K=\{x^2\}\) are not well concatenated since \(x\in L\) in \(xx^3=x^4\in LK\), but \(x^3\) is not in K. On the other hand, \(L'=\{x^2,x^3\}\) and \(K'=\{x^2,x^3\}\) are well concatenated.
Remark 1
If L is prefixfree and K is an arbitrary language, then L and K are well concatenated.
This can be seen as follows. Let us assume that L and K are not well concatenated. Then there are words u and v such that \(u\in L\), \(uv\in LK\), and \(v\notin K\). Then there are words \(u'\) and \(v'\) such that \(u'v'=uv\), \(u'\in L\), and \(v'\in K\). Obviously, \(u\ne u'\). Therefore u is a proper prefix of \(u'\) or \(u'\) is a proper prefix of u. Since both words are in L, we obtain a contradiction to the prefixfreeness of L.
We now show that small numbers cannot occur if L and K are well concatenated.
Lemma 7
Let L and K be well concatenated. Then \({{\,\mathrm{Var}\,}}(LK)\ge {{\,\mathrm{Var}\,}}(K)1\).
Proof
Let \(G=(N,T,P,S)\) be a right linear grammar such that \(L(G)=LK\) and \({{\,\mathrm{Var}\,}}(G)={{\,\mathrm{Var}\,}}(LK)\). Let t be the maximal length of a terminal word w with \(A\rightarrow wB\in P\) or \(A\rightarrow w\in P\).
Let \(S'\) be a symbol not in \(N\cup T\). For any derivation
with \(zu\in L\) (and \(X,Y\in N\)), we set \(p_D=S'\rightarrow vY\) or \(p_D=S'\rightarrow v\), respectively. Let \(P'\) be the set of all rules obtained in this way. Since we have only finitely many rules for any nonterminal X, \(P'\) is finite. Then we construct the the right linear grammar \(G'=(N\cup \{S'\},T,P\cup P',S')\) and prove that \(L(G')=K\).
Let \(w\in K\). We take a word \(w'\in L\). Then \(w'w\in LK\). Hence, in G, there are derivations
with \(zu=w'\) and \(vy=w\) or \(v=w\), respectively. In the former case, by construction \(S'\rightarrow vY\in P'\), and hence we have the derivation \(S'\Longrightarrow vY\Longrightarrow ^* vy=w\) in \(G'\) (because the derivation \(Y\Longrightarrow ^* y\) uses only rules of P), which proves \(w\in L(G')\). Analogously, we conclude in the latter case. Thus \(K\subseteq L(G')\).
Conversely, let \(w\in L(G')\). Then there are derivations
with \(w=vy'\) or \(w=v\). Since \(S'\rightarrow vY\) or \(S'\rightarrow v\) in \(P'\), there are derivations
in G with \(zu\in L\). However, then we also have the derivations
We obtain \(zuvy'\in LK\) or \(zuv\in LK\), respectively. Since \(zu\in L\) and L and K are well concatenated, we get \(vy'=w\in K\) or \(v=w\in K\), respectively. Therefore \(L(G')\subseteq K\).
By definition \({{\,\mathrm{Var}\,}}(K)\le {{\,\mathrm{Var}\,}}(G')={{\,\mathrm{card}\,}}(N)+1={{\,\mathrm{Var}\,}}(LK)+1\). \(\square \)
We note that the witnesses given in the proof of Theorem 4 are not prefixfree. However, if we consider the prefixfree languages \(L_n=P(1,2,\ldots ,n)\{c\}\) and \(K_m=P(n+1,n+2,\ldots ,n+m)\{d\}\), where c and d are additional letters, we also get \({{\,\mathrm{Var}\,}}(L_n)=n\), \({{\,\mathrm{Var}\,}}(K_m)=m\), and \({{\,\mathrm{Var}\,}}(L_nK_m)=n+m\), i.e., the upper bound \(m+n\) is also obtained with prefixfree sets. We do not know whether Theorem 4 also holds for prefixfree languages (together with Lemma 7 and Remark 1 it would be a nearly optimal result).
Intersection
We start with a lemma which is an intermediate consequence of the commutativity of intersection.
Lemma 8
For \(n\ge 1\) and \(m\ge 1\), we have \(g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)=g_{\cap }^{{{\,\mathrm{Var}\,}}}(m,n)\). \(\Box \)
We now give a partial result concerning \(g_{\cap }^{{{\,\mathrm{Var}\,}}}\). In contrast to concatenation, we only know that certain small numbers are in the intersection.
Lemma 9

(i)
For \(n\ge 3\) and \(m\ge 3\), we have \(\{0,1,2,\ldots , n+m3\}\subseteq g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)\).

(ii)
(i) For \(n\ge 3\) and \(m\in \{1,2\}\), we have \(\{0,1,2,\ldots , n\}\subseteq g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)\).
Proof
By Lemma 8, it is sufficient to consider the situation that \(n\ge m\).
We distinguish some cases and give witnesses \(L_n\) with \({{\,\mathrm{Var}\,}}(L_n)=n\) and \(K_m\) with \({{\,\mathrm{Var}\,}}(K_m)=m\) such that \({{\,\mathrm{Var}\,}}(L_n\cap K_m)=k\). Since the proofs that the languages have the required \({{\,\mathrm{Var}\,}}\)complexities follow directly from Lemmas 1–4 or can be given by arguments analogous to those used in the proofs of Lemmas 1–4 in all cases, we only present the witnesses in Table 3 (in order to simplify the notation we omit the possible dependence of the languages by the parameters k or \(k'\)). \(\square \)
We do not know an upper bound for the numbers in \(g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)\). This comes from the fact that all constructions of a right linear grammar for \(L_n\cap K_m\) from right linear grammars \(G_n\) and \(H_m\) for \(L_n\) and \(K_m\), respectively, do not only depend on n and m. For instance, we can transform the given grammars into right linear grammars \(G_n'=(N_n,T,P_n,S_n)\) and \(H_m'=(N_m,T,P_m,S_m)\) in normal form. Then the right linear grammar \((N_n\times N_m, T,P,S)\) with \(S=(S_n,S_m)\) and
generates \(L_n\cap K_m\). However, the blowup of the number of nonterminals in the construction of the normal forms depends on the length of the right hand sides of the rules in the original grammars. Since this length can be arbitrarily large, we have no upper bound for \(g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)\) using this construction. The situation for other constructions—known to us—is similar.
Union
Finally we study the behaviour under union.
We start by proving that \(g_{\cup }^{{{\,\mathrm{Var}\,}}}\) is a symmetric function.
Lemma 10
For \(n\ge 1\) and \(m\ge 1\), we have \(g_{\cup }^{{{\,\mathrm{Var}\,}}}(n,m)=g_{\cup }^{{{\,\mathrm{Var}\,}}}(m,n)\). \(\Box \)
The statement follows immediately from the commutativity of union.
Theorem 5
For \(n\ge 1\) and \(m\ge 1\), we have \(g_{\cup }^{{{\,\mathrm{Var}\,}}}(n,m)=\{1,2,\ldots ,m+n+1\}\)
Proof
We first prove that \({{\,\mathrm{Var}\,}}(L_n)=n\) and \({{\,\mathrm{Var}\,}}(K_m)=m\) imply \({{\,\mathrm{Var}\,}}(L_n\cup K_m)\le n+m+1\). Let \(G_n=(N_n,T,P_n,S_n)\) and \(H_m=(N_m,T,P_m,S_m)\) be right linear grammars with \(L(G_n)=L_n\), \(L(H_m)=K_m\), \({{\,\mathrm{Var}\,}}(G_n)={{\,\mathrm{Var}\,}}(L_n)=n\), and \({{\,\mathrm{Var}\,}}(H_m)={{\,\mathrm{Var}\,}}(K_m)=m\). We assume that \(N_n\cap N_m=\emptyset \). Then, by the standard construction, the right linear grammar
generates \(L_n\cup K_m\) and has \(n+m+1\) nonterminals. Therefore, \({{\,\mathrm{Var}\,}}(L_n\cup K_m)\le n+m+1\).
For \(n\ge 1\) and \(m\ge 1\), let \(L_n=P(1,2,\ldots ,n)\) and \(K_m=P(n+1,n+2,\ldots ,n+m)\). Then we have \({{\,\mathrm{Var}\,}}(L_n)=n\), \({{\,\mathrm{Var}\,}}(K_m)=m\), and \({{\,\mathrm{Var}\,}}(L_n\cup K_m)=n+m+1\) by Lemma 1.
For \(n\ge 1\), \(m\ge 1\), and \(1\le k\le n+m\), in Tables 4 and 5 , we now present witness languages \(L_n\) and \(K_m\) such that the relations \({{\,\mathrm{Var}\,}}(L_n)=n\), \({{\,\mathrm{Var}\,}}(K_m)=m\), and \({{\,\mathrm{Var}\,}}(L_n\cup K_m)=k\) hold. By Lemma 10, we can assume without loss of generality that \(n\ge m\). Because in all cases the \({{\,\mathrm{Var}\,}}\)complexity of the given languages follows directly by Lemmas 1–4 or can be given by arguments similar to those used in the proof of Lemmas 1–4, we only give the witnesses and their union. \(\square \)
Conclusions
In this paper we have started the investigation of the sets \(g_f^{{{\,\mathrm{Var}\,}}}(n,m)\) (and \(g_f^{{{\,\mathrm{Var}\,}}}(n)\)) for regularity preserving binary (and unary, respectively) operations f. Summarizing our results, we obtain Table 6.
We see that we have complete results for reversal, positive Kleeneclosure, Kleeneclosure and union. However, for concatenation and intersection, we have only given some partial results; an exact determination of \(g_.^{{{\,\mathrm{Var}\,}}}(n,m)\) and \(g_{\cap }^{{{\,\mathrm{Var}\,}}}(n,m)\) remains to be done.
In comparison with the measures \({{\,\mathrm{Varnf}\,}}\) of right linear grammars in normal forms (which equals the nondeterministic state complexity \({{\,\mathrm{nsc}\,}}\)) and \({{\,\mathrm{Varcf}\,}}\) of contextfree grammars given in the Tables 1 and 2 , we see that our results correspond more to those of \({{\,\mathrm{Varnf}\,}}\) (\(={{\,\mathrm{nsc}\,}}\)) (those for \({{\,\mathrm{nsc}\,}}\) are more complete) than to those of \({{\,\mathrm{Varcf}\,}}\) where the upper bound for concatenation and the range for reversal are different.
It is left open to determine the range for further operations as—for instance—complement, setsubtraction, and symmetric difference.
The behaviour of other complexity measures defined on special regular languages under operations is already investigated; for \({{\,\mathrm{sc}\,}}\) see [5]. Especially, the behaviour of finite and unary languages is studied. With respect to \({{\,\mathrm{Var}\,}}\), the behaviour of finite and unary languages is not of interest since, for any finite language L, \({{\,\mathrm{Var}\,}}(L)=1\) holds, and for any unary language K, \({{\,\mathrm{Var}\,}}(K)\le 2\) is valid. However, for other special languages as regular prefixfree languages or regular suffixclosed languages (any suffix of a word in L is in L, too) the behaviour can be studied.
Furthermore, for a right linear grammar \(G=(N,T,P,S)\), we can also define the complexity measures
which count the number of rules and the sum of symbols contained in rules, respectively. Then we can extend these measures for regular languages as in Section 1. These measures describe the complexity of a language in more precise manner. For contextfree languages, the investigation of the behaviour of the corresponding complexity measures under operations was done in [2, 9]. A study of the ranges for the measures \({{\,\mathrm{Prod}\,}}\) and \({{\,\mathrm{Symb}\,}}\) for right linear grammars is left as an open field.
References
 1.
Brzozowski, J.: Quotient complexity of regular languages. J. Autom. Lang. Comb. 15, 71–89 (2010)
 2.
Dassow, J., Harbich, R.: Descriptional complexity of union and star on contextfree languages. J. Autom. Lang. Comb. 17, 123–143 (2012)
 3.
Dassow, J., Stiebe, R.: Nonterminal complexity of some operations on contextfree languages. Fundamenta Informaticae 83, 35–49 (2008)
 4.
Ellul, K., Krawetz, B., Shallit, J., Wang, M.W.: Regular expressions: new results and open problems. J. Autom. Lang. Comb. 10, 407–437 (2005)
 5.
Gao, Y., Moreira, N., Reis, R., Yu, Sh: A survey on operational state complexity. J. Autom. Lang. Comb. 21, 251–310 (2016)
 6.
Gao, Y., Salomaa, K., Yu, S.: Transition complexity of incomplete DFAs. Fundamenta Informaticae 110, 143–158 (2011)
 7.
Gruber, H., Holzer, M.: Language operations with regular expressions of polynomial size. Theor. Comput. Sci. 418, 3281–3289 (2009)
 8.
Han, Y.S., Salomaa, K., Wood, D.: State complexity of prefixfree regular languages. In: Leung, H., Pighizzini, G. (eds.) Proceedings of 8th International Workshop of Descriptional Complexity of Formal Systems, pp. 165–176. University of Las Cruces, Las Cruces (2010)
 9.
Harbich, R.: Regel und Symbolkomplexität kontextfreier Sprachen unter ausgewählten Operationen. Dissertation, OttovonGuerickeUniversität Magdeburg (2019)
 10.
Holzer, M., Kutrib, M.: State complexity of basic operations on nondeterministic automata. In: Champernaud, J.M., Maurel, D. (eds.) Implementation and Application of Automata, LNCS 2608, pp. 148–157 (2003)
 11.
Holzer, M., Kutrib, M.: Nondeterministic finite automatarecent results on the descriptional and computational complexity. Int. J. Found. Comput. Sci. 20, 563–580 (2009)
 12.
Hricko, M., Jirásková, G., Szabari, A.: Union and intersection of regular languages and descriptional complexity. In: Mereghetti, C., Palano, B., Pighizzini, G., Wotschke, D. (eds.) Proceedings of 7th International Workshop of Descriptional Complexity of Formal Systems, pp. 170–181. University of Milano, Milan (2005)
 13.
Jirásková, G.: Concatenation of regular languages and descriptional complexity. In: Frid, A.E., Morozov, A., Rybalchenko, A., Wagner, K.W. (eds.) Fourth International Computer Science Symposium in Russia 2009, LNCS 5675, pp. 203–214 (2009)
 14.
Jirásková, G.: The ranges of state complexities for complement, star, and reversal of regular languages. Int. J. Found. Comput. Sci. 25, 101–124 (2014)
 15.
Jirásková, G., Krausová, M.: Complexity in prefixfree regular languages. In: McQuillan, I., Pighizzini, G., Trost, B. (eds.) Proceedings 12th International Workshop of Descriptional Complexity of Formal Systems, pp. 236–244. University of Sasketchewan, Saskatoon (2010)
 16.
Maia, E., Moreira, N., Reis, R.: Incomplete operational transition complexity of regular languages. Inf. Comput. 244, 1–22 (2015)
Acknowledgements
Open Access funding provided by Projekt DEAL. The author is very grateful to the referees for their comments which considerably improved the paper.
Author information
Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Dassow, J. Operational complexity and right linear grammars. Acta Informatica 58, 281–299 (2021). https://doi.org/10.1007/s00236020003863
Received:
Accepted:
Published:
Issue Date:
Mathematics Subject Classification
 68Q05
 68Q42
 68Q45