The "infinitive" is, as the name implies, a nonfinite form (in French it is classed as a "mode impersonnel"); the present participle is another nonfinite form of the verb, but it is far less "malleable" than the infinitive: it is quite restricted, because it immediately takes on the value of an adjective. This is the essential difference when we consider the two in real sentences: the first stands in relation to other verbs, whereas the second stands in relation to nouns.
The difference is therefore to be sought in the type of verb that introduces the action, taking into account the scope of these two nonfinite verbal forms.
There is a rule that says that after verbs expressing perception you use the infinitive (ref).
Verbs of perception (ref): regarder, apercevoir, écouter, voir, sentir, entendre.
However, there is an exception: you do not use the infinitive with the verb "apercevoir". This can be verified in the ngrams for common verbs such as:
courir, entrer, monter, arriver, partir, venir, manger, s'en aller, passer, atterrir, jouer, chanter, tirer, travailler, voler, s'exercer, tomber
This restriction itself admits one exception, which is confirmed by the TLFi:
(bold and italics added)
I A 2. Plus rarement. Sans effort d'attention, ni recherche
[…]
... Aimé regarda son père. Soudain, il l'aperçut vieilli.
SYNT. Apercevoir tout à coup, brusquement, soudain, pour la première fois.
[…]
− Insolite. [Suivi de l'inf., p. anal. avec voir + inf.]
7. Il vénérait Verhaeren. (...) Il aimait, (...) en retrouver à tout instant les larges visions tragiques : le moulin, dont les bras, ... comme des bras de plainte, se sont tendus, et sont tombés ... Les astres qui, là-haut, semblent les feux de grands cierges, tenus en main, dont on n'aperçoit pas monter la tige immense.
[…]
Rem. Apercevoir s'oppose à regarder et à voir par l'aspect, qui dans tous ses emplois est perfectif, c'est-à-dire marque l'aboutissement qui de soi est momentané (cf. p. ex. découvrir opposé à chercher); d'où l'impossibilité de construire ce verbe avec un inf. prés. duratif (cf. supra ex. 7).
This usage is not at all common, only literary; the situation the verb applies to must have something strange about it ("insolite" in French), and, on top of that, the verb must be used in a context in which we know that no effort of attention or of search is expended. Note that in this usage the aspect of the infinitive "monter" is neither "durative" nor "punctual" (or "perfective", according to another terminology), because the verb in this particular sense is not dynamic.
The remark "d'où l'impossibilité de construire ce verbe avec un inf. prés. duratif" confirms that you can't have a verb in the infinitive after "apercevoir".
This is why you have to use a present participle.
The idea, in other words, is this: as I said in the introduction, the two ways of apprehending the action refer to the same thing, namely a certain period of time during which what the person does is called "going into something". However, the relations change from one context to the other, because the action strikes the reader from two different points of view, and the terms used refer to those points of view and not to the action itself: one is the action fully seen, the other is the action guessed at and merely revealed by a snapshot. The first case calls for "voir", and the "aspect duratif" of this verb is well suited to the infinitive, as the infinitive is the most general form. The second calls for "apercevoir", and since the "aspect perfectif" of this verb is not compatible with the concept of something that lasts, only a relation with a noun is possible (here a pronoun, "l'"). The only option is then to "insert" the action through a description of what the person (l') represented in the object is doing, and that is done by a kind of adjective, in other words the present participle.
The dictionary's explanation doesn't take the realities of this situation fully into account: in the case of the "verbe duratif" (voir), the action lasts (he saw him the whole time he was going in) and it bears upon both verb and person; in the case of the "verbe perfectif", it doesn't last (he saw him while he was going in, at one point in the whole time it took him to get in) and it bears upon the person only. To show and summarise the difference better, let's say that there is a recognised relation in "on voit entrer" but none in "on voit entrant".
Let's mention, by the way, that the very same point of view is part of the semantics of English, which should help drive the idea home; it can be seen in the following examples:
- He saw him go into the building.
- He noticed him going into the building.
(You can't say, as the ngram shows, "He noticed him go into the building.")