This is only an extended comment.
A few times ago I asked (myself :-) how fast a multitape NTM that accepts a (reasonably encoded) NP-complete language can be. I came up with this idea:
3-SAT remains NP-complete even if variables are represented in unary. In particular we can convert a clause - suppose $(x_i \lor \neg x_j \lor x_k)$ - of an arbitrary 3-SAT formula $\varphi$ on $n$ variables and $m$ clauses in a sequence of characters over alphabet $\Sigma = \{ +, -, 1 \}$ in which every variable occurrence is represented in unary:
$ + 1^{i} 0,- 1^{j} ,+ 1^{k} $
For example, $(x_2 \lor -x3 \lor +4)$ can be converted to:
+110-1110+11110
So we can convert a 3-SAT formula $\varphi_i$ in an equivalent string $U(\varphi_i)$ concatenating its clauses. The language $L_U = \{ U(\varphi_i) \mid \varphi_i \in 3-SAT \}$ is NP-complete.
A 2-tape NTM can decide if a string $x \in L_U$ in time $2|x|$ in this way.
- the first head scans the input from left to right and with the internal logic it keeps track when it enters or exit a clause or reach the end of the formula. Whenever it finds a $+$ or $-$, the second head starts moving right with it on the $1^i$ that represents $x_i$. At the end of $1^i$, if the second head is on a $0$ then it guesses a truth value $+$ or $-$ (it makes an assignment) and writes it on the second tape; if it finds a $+$ or $-$ then that variable has already been assigned a value;
- in both cases, using the internal logic, the NTM matches the truth value under the second head (the assignment) with the last seen $+$ or $-$; if they match then the clause is satisfied;
- then the second head can return to the rightmost cell;
- with the internal logic the NTM can keep track if all clauses are satisfied while the first head moves towards the end of the input.
Example:
Tape 1 (formula) Tape 2 (variable assignments)
+110-1110+11110... 0000000000000...
^ ^
+110-1110+11110... 0000000000000...
^ ^
+110-1110+11110... 0000000000000...
^ ^
+110-1110+11110... 0+00000000000... first guess set x2=T; matches +
^ ^ so remember that current clause is satisfied
+110-1110+11110... 0+00000000000...
^ ^
...
+110-1110+11110... 0+00000000000...
^ ^
...
+110-1110+11110... 0++0000000000... second guess set x3=T
^ ^ don't reject because current
clause is satisfied (and in every
case another literal must be parsed)
The time can be reduced to $|x|$ if we add some redundant symbols to the clause representation:
$ + 1^{i} 0^i,- 1^{j} 0^j ,+ 1^{k} 0^k \; ... \; \text{+++}$
($\text{+++}$ marks the end of the formula)
In this way the second head can return to the leftmost cell while the first scans the $0^i$ part.
Using $\text{++}$ as a clause delimiter and $\text{+++}$ as a marker for the end of the formula we can use same representation for CNF formulas with an arbitrary number of literals per clause.
Note that the representation is important. I'm assuming that a $m$-literal $n$-variable CNF $\varphi$ is represented in the obvious way using $|\varphi|=\Theta(m \log n)$ bits. Is that the representation you had in mind?
– Thomas Oct 23 '15 at 16:05