27

We define a regular tree language as in the book TATA: It is the set of trees accepted by a non-deterministic finite tree automaton (Chapter 1) or, equivalently, the set of trees generated by a regular tree grammar (Chapter 2). Both formalisms hold close resemblances to the well-known string analogues.

Is there a regular tree language in which the average height of a tree of size $n$ is neither $\Theta(n)$ nor $\Theta(\sqrt{n})$?

Obviously there are tree languages such that the height of a tree is linear in its size; and in the book Analytic Combinatorics it is shown e.g. that binary trees of size $n$ have average height $2\sqrt{ \pi n}$. If I understand Proposition VII.16 (p.537) of the mentioned book correctly, then there is a wide subset of regular tree languages that have average height of $\Theta(\sqrt{n})$, namely those in which the tree language is also a simple variety of trees fulfilling some extra conditions.

So I was wondering whether there is a regular tree language showing a different average height or if there is a true dichotomy for regular tree languages.

Note: This question has been asked before on Computer Science, yet it has been unanswered for more than three months. I would like to repost it here because the question is too old to migrate and because there is still an interest in the question. Here is a link to the original post.

john_leo
  • 431
  • 4
  • 7
  • The single tree with constant depth is an obvious answer: o(\sqrt{n}) but not $\Omega(\sqrt{n})$. I believe you probably meant some other question? Replace $\Theta(\sqrt{n})$ with $O(\sqrt{n})$ maybe? – Joseph Stack Jan 19 '14 at 13:30
  • Yes and no. I think a regular tree language with average depth $O(n^{1/3})$ (say) would also be very interesting. But you're right in that we should exclude such degenerated cases. Maybe we should require that the tree language contains infinitely many elements? – john_leo Jan 19 '14 at 13:39
  • What kind of trees do you have in mind? Ranked trees, unranked sibling-ordered trees, unranked unordered trees; and, by the way, what kind of tree automata do you mean, bottom-up or top-down? – f-h Jan 19 '14 at 13:46
  • @JosephStack how can the height of a regular tree be infinite? A tree with $n$ nodes cannot have a height greater than $n$. – john_leo Jan 19 '14 at 13:50
  • Of course you're right; height is finite. I was trying to delete that 'stupid' comment of mine but did not manage, sorry – Joseph Stack Jan 19 '14 at 13:51
  • @f-h just as defined in Tata, chapter 1/2. So ranked trees. I mean bottom-up tree automata. – john_leo Jan 19 '14 at 13:58
  • I believe the "limit" is not well defined in terms of $\Theta$. Because you could extract two infinite subsequences of $n$ for which the average height radically differs. Are you interested in $\lim \sup$ of the average height? – Joseph Stack Jan 19 '14 at 14:28
  • @JosephStack: Asymptotics don't make sense on finite classes (that is, the limit is $0$ on all finite classes), so the question already requires an infinite class of trees. The term "regular tree language" has clear definitions (see the cited references) and the average tree height is canonically defined as the mean height of all trees with $n$ nodes that are in the language. Therefore, the question is quite unambiguous. – Raphael Jan 19 '14 at 14:41
  • @JosephStack My bad, the question does not explicitly ask for a class with mean height in $\Theta(g)$ for $g \not\in {\sqrt{n}, n}$. – Raphael Jan 19 '14 at 14:50
  • 1
    @Raphael: If you do not consider $\lim\sup$, it's not clear to me what the question would be. The answer to "is there an infinite regular tree language such that the mean height is a function $f$ with $f(n)\notin \Theta(\sqrt{n})$ and $f(n)\notin\Theta(n)$" is obviously yes: make sure that for odd $n$ you have $\Theta(n)$ and even ones $\Theta(\sqrt{n})$.

    P.S. every function I can imagine belongs to $\Theta(g)$ for some $g\notin{\sqrt{n},n}$, so it's not a correct fix :)

    – Joseph Stack Jan 19 '14 at 15:10
  • @JosephStack Why consider any limits? $\Theta$ is not typically defined via $\lim$; if anything, $\lim\sup$ is used and your concern is gone. Note that your "..." are not equivalent to what I wrote. (I should have added that in my phrasing, we'd like $g \in \omega(1)$ and probably stricly increasing so nasty corner cases go away.) – Raphael Jan 19 '14 at 15:18
  • Maybe you could just state that you ask whether for every $L$ there is a $c$ and $C$ such that for every $n$ the average depth of size $n$ trees is either at most $C$ or between $c\sqrt n$ and $C\sqrt n$ or at least $cn$. Btw, I think this could be true and if I were you, I would rather ask it on mathoverflow. – domotorp Jan 20 '14 at 11:55
  • @domotorp What would that reformulation change? Btw, I also believe it's true. And I was also thinking about asking it on mathoverflow, since the answer of the question surely involves many analytic tools... – john_leo Jan 20 '14 at 12:10
  • I think half of the above comments are related to people being confused about the exact meaning of your question, so it would change that. – domotorp Jan 20 '14 at 15:05
  • fyi the question seemed not exactly precise over on cs.se awhile back either but was high voted also & still stumped the crowd. maybe some more clarification of bkg/ definitions to make it self contained (as possible) esp wrt math formulations instead of mere citations would help. also presumably the book does not say anything about a dichotomy & that is your own term? no dichotomy seems to be apparent/ clearly defined... – vzn Feb 10 '15 at 22:15
  • For those who stumble over this question: john_leo and me actually worked on this for a while in 2014; find a summary here. – Raphael Oct 14 '18 at 14:26

1 Answers1

2

I believe that the answer is as you suggest that no other asymptotics than $\Theta(1)$, $\Theta(\sqrt{n})$ and $\Theta(n)$ are possible. A promising route to prove this could be to apply the techniques from the paper which derives the $\Theta(\sqrt{n})$ asymptotics to the run trees of the regular language. Notice that a tree is accepted if there exists a run tree so it should be possible to first derive (using loc.cit.) the average height of a randomly generated run tree and take it from there, i.e. show that projecting away the states does not change the average height.

Martin Hofmann
  • 643
  • 6
  • 8
  • 3
    I think this is a comment and not an answer since it is far from clear wether this attempt works out. – Danny Oct 18 '17 at 11:19