19

(UPDATE: a better formed question is posed here as the comments for the accepted answer below show that this question is not well-defined)


The classical proof of the impossibility of the halting problem depends on demonstrating a contradiction when trying to apply the halting detection algorithm to itself as input. See background below for more information.

The contradiction demonstrated applies because of a self-referential paradox (like the sentence "This sentence is not true"). But if we strictly prohibited such self-references (i.e. accepted the fact that such self-references are not decidable to halt), what result are we left with ? Is the halting problem for the remaining set of non-self-referencing machines are decidable to halt or not ?

The questions is:

If we consider a subset of all possible Turing machines, which are not self-referencing (i.e. does not take them selves as input), what do we know about the halting problem for this subset ?

UPDATE

Maybe a better reformulation of what I am after is a better understanding of what defines a decidable set. I was trying to isolate the classical undecidability proof because it doesn't add any information about the undecidability except for the cases where you run HALT on itself.

Background: Assuming towards a contradiction that there is a Turing machine $Q$ that can decide on input $M$ which is an encoding for a Turing machine and $X$, whether or not $M(X)$ halts. Then consider a Turing machine $K$ that takes $M$ and $X$ and uses $Q$ to decide whether $M(X)$ halts or not, and then does the opposite, i.e. $K$ halts if $M(X)$ doesn't halt, and doesn't halt if $M(X)$ halts. Then $K(K)$ demonstrates a contradiction, as $K$ should halt if it doesn't halt, and doesn't halt when it halts.

Motivation: A colleague is working on formal verification of software systems (esp. when the system is already proven at source code level and we want to reprove it for its compiled version, to neutralize compiler issues), and in his case he cares about a special set of embedded control programs for which we know for sure they wouldn't be self-referencing. One aspect of verification he wants to carry out is whether it is guaranteed that the compiled program will halt if the input source code is proven to terminate.

UPDATE

Based on the comments below I clarify the meaning of non-self-referencing Turing machines.

The goal is to define it as the set that does not lead to the contradiction posed in the proof (cf. "Background" above). It might be defined as follows:

Assuming that there is a Turing machine $Q$ that decides the halting problem for a set of Turing machine $S$, then $S$ is non-self-referencing with respect to $Q$ if it excludes all machines that invokes $Q$ on $S$ (directly or indirectly). (Clearly that means that $Q$ can't be a member of $S$.)

To clarify about what is meant by invoking $Q$ on $S$ indirectly:

Invoking $Q$ on $S$ is denoted by a Turing machine $Q$ with a set of states and a certain possible initial inputs on the tape (corresponding to any member of $S$), with the head initially at the beginning of that input. A machine $W$ invokes $Q$ on $S$ "indirectly" if there is a (finite) sequence of steps that $W$ would take to make its configuration "homomorphic" to the initial configuration of $Q(S)$.

UPDATE 2

From an answer below arguing that there are infinitely many Turing machines doing the same task, and so $Q$ is not unique, we change the definition above by saying that $Q$ is not a single Turing machine, but the (infinite) set of all machines computing the same function (HALT), where HALT is the function that decides whather a Turing machine halts on a particular input.

UPDATE 3

The definition of Turing Machine homomorphism:

A TM A is homomorphic to TM B if the transition graph of A is homomorphic to that of B, in the standard sense of homomorphisms of graphs with labeled nodes AND edges. A transition graph (V,E) of a TM is such that V=states, E=transition arcs between states. Each arc is labeled with the (S,W,D), S=symbol read off the tape and W=the symbol to be written to it, and D=the direction the head show move to.

Mohammad Alaggan
  • 1,812
  • 18
  • 29
  • 5
    "the remaining set of non-self-referencing" Before I can sensibly discuss this set, I'd like a definition of "self-reference". However, I think that will be a tricky thing to define? – Sam Nead Nov 09 '10 at 15:53
  • Indeed it would be tricky. The goal is to define it as the set that does not lead to the contradiction posed in the proof. I would try a shot: assuming that there is a Turing machine $Q$ that decides the halting problem for a set of Turing machine $S$, then $S$ is non-self-referencing with respect to $Q$ if it excludes all machines that invokes $Q$ on $S$ (directly or indirectly). Does that sound plausible enough ? – Mohammad Alaggan Nov 09 '10 at 16:05
  • 1
    There are studies of provably halting programs (this class does not include all halting programs, though). Basically they are a pair of a program and a proof that it halts. For example, if I am not mistaken, Agda only allows programs which halt. I think that people working on logic and programming languages have more to say on this. – Tsuyoshi Ito Nov 09 '10 at 16:17
  • @Tsuyoshi: Thanks for the information, and the link. I am actually more interested on the theoretical results of which classes are known to be halting-decidable (I am not talking either on proving a particular algorithm). The programming language motivation is my colleague's, not mine. – Mohammad Alaggan Nov 09 '10 at 16:21
  • Wouldn't that break the diagonalization method used to prove the undecidability of the Halting problem? – Mohammad Al-Turkistany Nov 09 '10 at 16:33
  • 1
    @M. Alaggan. Now I'd want a definition of "invokes $Q$ on $S$ indirectly," which I suspect to be as difficult to define as the original "self-reference" :) – Rob Simmons Nov 09 '10 at 16:51
  • 2
    This raises an interesting question: Are all uncomputability (undecidability) proofs traceable to Cantor's diagonalization method? Is there any undecidability proof that does not rely directly or indirectly on the diagonalization method? – Mohammad Al-Turkistany Nov 09 '10 at 16:52
  • @Rob Simmons: Invoking $Q$ on $S$ is denoted by a Turing machine $Q$ with a set of states and a certain possible initial inputs on the tape (corresponding to any member of $S$), with the head initially at the beginning of that input. A machine $W$ invoking $Q$ on $S$ "indirectly" means that there is a (finite) sequence of steps that $W$ would take to make its configuration "homomorphic" to the initial configuration of $Q(S)$. – Mohammad Alaggan Nov 09 '10 at 17:01
  • @turkistany: The proof provided in the background (in the question), I don't think it depends on the diagonalization method, or am I wrong ? – Mohammad Alaggan Nov 09 '10 at 17:12
  • @M. Alaggan, If we disallow self-reference, then we can't exclude the existence of always-halting Turing machine $Q$. – Mohammad Al-Turkistany Nov 09 '10 at 17:21
  • @turkistany: I am not sure I got what you mean, can you elaborate more ? – Mohammad Alaggan Nov 09 '10 at 17:25
  • A diagonalization proof relies on deriving a contradiction which leads to the conclusion that TM $Q$ does not exist. If you disallow self-reference then there is no contradiction and you can't exclude the existence of $Q$. – Mohammad Al-Turkistany Nov 09 '10 at 17:39
  • What is the significance of always-halting Turing machine $Q$? What problems does this present? – Matt Groff Nov 09 '10 at 18:41
  • @Matt Groff: turkistany is not speaking of any machine $Q$, he is speaking about $Q$ which decides the halting problem on the set $S$ of non-self-referencing turing machines. If $Q$ always halts, that means that the halding problem for $S$ is decidable. – Mohammad Alaggan Nov 09 '10 at 18:51
  • @turkistany, I think I read that Boolos proposed a proof of Godel's Incompleteness theorem that relies on the Berry paradox and does not use diagonalization. I would imagine there are similar proofs for the halting problem; perhaps Chaitin's proof regarding Kolmogorov complexity could be adapted for this purpose. –  Nov 09 '10 at 20:50
  • Would it make sense to phrase this question by requiring a hypothetical UTM that simulates Turing machines on inputs to automatically reject any machine that attempts execution on itself? E.g., if I ran UTM(100010101010111100,100010101010111100), that would instantly reject without simulating the computation. –  Nov 09 '10 at 20:54
  • @Philip: The problem is that the machine to be tested for halting, doesn't necessarily get executed. Only its discription will be analysed. In this case, it might be buried deep in the discription, and on several recursive levels as well, it might not be always possible to compute if a given description will eventually call itself. However in the exposition above we are not considering the computability of this property, but taking it for granted and reasoning about the resulting sets. – Mohammad Alaggan Nov 09 '10 at 22:59
  • I missed pointing out that the only way, under my version, to call a machine would be via the UTM. So, basically I was discussing a model of computation under which every machine is more or less prohibited from ever calling itself. That said, it may still be possible to call the machine by writing a similar program. –  Nov 09 '10 at 23:05
  • It's a nice idea. But, yes as you said, it is also may be called by "merging" its states with the current machine, so that the new machine has the machine it calls "coded into" its state transition table. This why I said "homomorphic". – Mohammad Alaggan Nov 09 '10 at 23:26
  • 1
    Some of the discussion at http://cstheory.stackexchange.com/questions/2823/is-one-definition-of-the-word-paradox-something-that-can-be-used-to-prove-the-h seems relevant to your question. – András Salamon Nov 10 '10 at 00:50
  • I read the your “definition” of “non-self-referencing”. I cannot call it a definition because you have not defined what it means for a configuration of one Turing machine to be “homomorphic” to a configuration of another Turing machine. Usually, a homomorphism is defined as a mapping between the set of configurations of one machine to the set of configurations of another machine, and you cannot speak of a homomorphism between two configurations alone. I doubt that there is any way to define “non-self-referencing Turing machines” while keeping the halting problem still nontrivial. – Tsuyoshi Ito Nov 10 '10 at 02:14
  • A TM A is homomorphic to TM B if the transition graph of A is homomorphic to that of B, in the standard sense of homomorphisms of graphs with labeled nodes AND edges. A transition graph (V,E) of a TM is such that V=states, E=transition arcs between states, labeled with the symbol read off the tape and the symbol written to it, and the direction the head show move to. – Mohammad Alaggan Nov 10 '10 at 02:37
  • My criticism still applies. I know what a homomorphism means in the standard sense, but you are speaking of a configuration being homomorphic to another configuration without referring to the whole configuration graphs. But I am not sure if it matters. Anyway I do not think that I can answer anything that pleases you (primarily because I do not know much). – Tsuyoshi Ito Nov 10 '10 at 03:12
  • 2

2 Answers2

11

If I understand your motivation correctly, it's seems like this is a "compiler correctness" issue more than a "restricted halting problem" issue. You have a property (termination) that you've proven for some source level program Prog that you want to then extend to the compiled code to get the same property of compiled(Prog). But the compiler can (in general) do arbitrarily goofy things like implement a turing-complete runtime (say the JVM), compile your terminating program into a JVM bytecode and then dump out an executable that starts the JVM and feeds it your compiled bytecode.

In practice it's probably quite possible to utilize the implicit knowledge you have about how your compiler works to implement a some verification procedure that pretty much proves compiled programs correct given correct source programs (indeed, a lot of automatic verification tools for programs are utilizing implicit knowledge of the algorithm-to-code "compiler" in the brains of programmers). However, in general you're probably looking at a problem of compiler correctness. As I understand it, there are two classic ways of doing so.

One option is to have a compiler that takes as input the program Prog and the proof terminates(Prog) and outputs compiled(Prog) and terminates(compiled(Prog)) - the latter is a proof that can then be double-checked independently of the compiler. The classic paper on this is Necula and Lee's The design and implementation of a certifying compiler, I believe.

Alternatively, you can prove a fact about the function compiles() - that whenever compiles() is giving a terminating input it produces a terminating output. An accessible introduction to this way of thinking about compiler correctness is Xavier Leroy's CACM article, Formal verification of a realistic compiler.

(P.S. I hope this answer is helpful - I recognize it's a bit divergent from the question you asked, so let me know if I'm way off-base and/or repeating something you already know.)

Rob Simmons
  • 2,396
  • 13
  • 24
  • Thanks for the great answer. This would definitly be useful for my colleague. However I (independently from my colleague) am more interested on the theoretical implications on the proof of the halting problem, that if we got rid of the case that showed the contradiction, then what else do we know about the decidability of the halting problem ? – Mohammad Alaggan Nov 09 '10 at 16:40
9

I think it'll take a bit more work to get to a "well-defined" question. In particular, this is problematic:

Invoking Q on S is denoted by a Turing machine Q with a set of states and a certain possible initial inputs on the tape (corresponding to any member of S), with the head initially at the beginning of that input. A machine W invokes Q on S "indirectly" if there is a (finite) sequence of steps that W would take to make its configuration "homomorphic" to the initial configuration of Q(S).

One problem is that there are infinitely many Turing machines which compute the same function. In the standard diagonalization argument, I can just replace the Q subroutine with another decider for HALT, since there are infinitely many of them. Or a function that is computably equivalent to HALT. So it's not entirely clear to me how to define your notion of "indirect invocation".

A different question might be: for what sets of Turing machines is the halting problem decidable? Here there are an abundance of answers: resource restricted TMs (e.g. use only f(n) space, where f is some specific computable function), TMs that are operationally restricted in some particular way (e.g. read head only moves one way), etc. But, another interesting question is whether membership in that restricted set is decidable, or whether you have to restrict yourself to "promise problems", where you only guarantee a correct answer on some "promised" subset of inputs, but don't verify membership.

Mark Reitblatt
  • 1,690
  • 15
  • 20
  • Okay, nice, so let's try this: Let H be the (infinite) set of all Turing machines computably equivalent to HALT, then replace my $Q$ with $H$. – Mohammad Alaggan Nov 09 '10 at 22:01
  • It's not quite that simple. Your definition is paradoxical now, since you are looking for a computable HALT. But if this is computable, any computable function is computably equivalent to it. But if your input set contained semi-computable problems (TMs), you'd have a contradiction since deciding the halting problem for such a TM would give you a decision procedure for that problem. – Mark Reitblatt Nov 09 '10 at 22:18
  • Wouldn't an uncomputable HALT mean undecidability ? I was assuming such a computable HALT exists, hoping for contradiction. 2) I am not familiar with the concept that all computable functions are computably equivalent to each others, I was quoting you, and meaning that it is a function that solves the HALT problem. Apparently λx.1 is computable but it doesn't decide HALT. Correct me if I am wrong please. About the semi-computable problems, HALT may take infinite number of steps, that would still not lead to a the contradiction in the original proof that HALT is not decidable.
  • – Mohammad Alaggan Nov 09 '10 at 22:50
  • Right. But the problem is trying to define your notion of "non-self-referencing". Either it's a weak restriction, which allows diagonalization as I argued, or it's a strong restriction which eliminates everything.

  • It's simple. "Computably equivalent" roughly means there is a computable mapping from one to the other that preserves answers. But if I can compute an answer, I can cheat and make the mapping trivial.

  • If the TM deciding HALT doesn't itself terminate, it's not a decider for HALT.

  • – Mark Reitblatt Nov 09 '10 at 22:55
  • 1
    Something else that is a bit confusing is the conflation of TM with the decision problem computed by them. It's not normal to talk about a TM being computationally equivalent to one another. Rather, the functions computed by them can be equivalent (or equal). The problem is that trying to say a TM doesn't simulate another TM is hard to define in general, without giving something concrete to separate the functions computed by them. For example, a Log-space TM can't simulate a TM solving an EXP-space problem. – Mark Reitblatt Nov 09 '10 at 23:06
  • By eliminate everything you mean that it proves the undecidability ? 2) The mapping would need the input, otherwise you'd not be able to map a constant function to HALT, however I was not meaning to use "comptuable equivalence", I was meaning that the two compute the same function (without mapping). 3) So you prove that HALT is undecidable just because such a machine will decide semi-computable functions ? So are semi-computable functions has a prove that they are not halting-decidable ?
  • – Mohammad Alaggan Nov 09 '10 at 23:09