2

It appears that algorithmic complexity theory has already figured out Kolmogorov complexity, when applied to representations of programs themselves, can already serve as a solid theoretical metric of enumeration and comparison for programs, given a common basis language.

Does there exist an online database that attempts to enumerate algorithms similar to the Human Genome Project according to a basis encoding or language?

CinchBlue
  • 309
  • 2
  • 9
  • 1
    Enumerate all algorithms? Why? Most algorithms compute something no one would ever want to compute. – Sasho Nikolov Nov 14 '17 at 06:52
  • I posted some links in reply to your question Does there exist an online database of algorithms? in chat. The only major omission I noticed is that parallel algorithms are seriously underrepresented. I probably noticed that omissions only since I currently study parallel algorithms and hardware architecture. So I guess there are other major omissions as well, which I simply don't notice at the moment. – Thomas Klimpel Nov 14 '17 at 10:43
  • 1
    Online database? How is that connected to ontology? – Emil Jeřábek Nov 14 '17 at 13:35
  • 1
    @SashoNikolov Reifying and relating classes of algorithms can be helpful for teaching and analysis purposes. – CinchBlue Nov 14 '17 at 20:13
  • 1
    @EmilJeřábek In bioinformatics, databases with GO and KEGG are commonly used in analysis to relate genes to pathways, diseases, etc. It is a very helpful reference in the field -- I was wondering if a similar thing existed for algorithms in computer science. – CinchBlue Nov 14 '17 at 20:15
  • seems like wolframian New Kind of Science went in this direction, a kind of zoology of algorithmic behavior... https://en.wikipedia.org/wiki/A_New_Kind_of_Science there are other prjs such as "implement short algorithms in many different languages and see how they compare" eg Rosetta code https://en.wikipedia.org/wiki/Rosetta_Code – vzn Nov 15 '17 at 04:51
  • @SashoNikolov: While I agree that most algorithms compute something no one would want to compute, I am at least sympathetic to the idea that understanding the behavior of algorithms more generally could help us develop lower bounds techniques. (Now, I'm not sure enumerating programs is the best way to go about getting that understanding, but...) – Joshua Grochow Nov 16 '17 at 17:24
  • @JoshuaGrochow I think the space of algorithms is too huge and wild to learn anything from enumeration. I should say I don't understand this question at all: what does anything have to do with Kolmogorov complexity here? what does OP mean by an ontology? are we talking about enumerating small Turing machines or whatever, or some classification of human-created algorithms? – Sasho Nikolov Nov 16 '17 at 17:52
  • @SashoNikolov: I 100% agree (except I feel like I have some idea about "ontology," but not whether the OP is interested in enumerating small TMs vs classification of human-created algorithms). – Joshua Grochow Nov 16 '17 at 18:08
  • @SashoNikolov The space of algorithms is indeed large, but many individual algorithms of interest should be able to be broken down into its constituents. The point is to create a definite system of representation of a program and its subprograms in order to establish a hierarchy of algorithms. Doing this for a small language should be sufficient for establishing a basic framework that could potentially grow into something that could be extended to a higher-level programming language after decades of work. – CinchBlue Nov 16 '17 at 19:45
  • @JoshuaGrochow Enumerating the small Turing machines and figuring out a way to enumerate interesting algorithms would be interesting. – CinchBlue Nov 16 '17 at 19:51

1 Answers1

2

[More of an extended comment.]

I think I disagree with the premise of your first paragraph... That being said, while I don't know of a database, there has been some work systematically studying small Turing machines. See Small universal monotone Turing machines, https://cstheory.stackexchange.com/a/20980/129, https://cstheory.stackexchange.com/a/11614/129. While some of this is interesting, I would say it has yet to "bear much fruit (i.e., understanding)." I hope someday it will.

There's also a mismatch between listing small Turing machines and understanding algorithms that are built by people (much as there is still a gap between listing all genes and understanding what they do, though not quite the same). Namely, an ontology of algorithms that are built by people should including things like those mentioned in Thomas Klimpel's comment. (Note: it should also include holographic algorithms, a relatively recent surprise in the ontology of algorithms!) But an ontology of small TMs looks completely different. Given how small a corner of the space of algorithms people have explored, and given the uncomputable problems involved, I highly doubt there would ever be a "complete" ontology of algorithms - either in the sense of small TMs or in the sense of all algorithms people would build. But I suppose having a partial one might still be interesting for some purposes...

Joshua Grochow
  • 37,260
  • 4
  • 129
  • 228
  • 1
    Even a limited ontology can be of great use, as the representation and relations are enough to support a more precise common ground and common language that people can reference. I want to look more into small/bounded Turing machines now -- thank you for the references. – CinchBlue Nov 14 '17 at 23:53
  • Also, why do you disagree about with the first paragraph? I don't understand – CinchBlue Nov 16 '17 at 06:53
  • 1
    @VermillionAzure: Well, technically I agree, I just don't really know what you mean by "serve as a solid theoretical metric of enumeration and comparison for programs." K is not for comparing programs, its for comparing strings. For enumerating programs, you just enumerate them, you don't need K. For comparing programs, I'm not sure you'd want to use K either, though I guess you could... Very little work has been done on "K for programs," defined as $K(f)$ being the minimum size of a program computing $f$ (in some fixed, prefix-free language, say), though it's an interesting topic. – Joshua Grochow Nov 16 '17 at 17:26
  • Every Turing machine is defined on a finite set of language symbols and on a finite number of internal states. Therefore, since its constituents all have a finite quality, it is possible to come up with a finite, discrete representation for it. – CinchBlue Nov 16 '17 at 19:42
  • @VermillionAzure: Sure, but that seems to have little to do with Kolmogorov complexity. (Or rather, the definition of $K$ relies on the facts you mention, but the facts you mention existed long before Kolmogorov complexity did.) – Joshua Grochow Nov 16 '17 at 20:39