Okay, I've given it a bit of thought and I see that the algorithm is very easy.
Example: Let % denote "commutes". Let A % B, A % C, B % C, B % X.
Compute ABAAXCBA:
Let us keep a "result", an "active multiset" and a "remaining suffix". Let us append a special "$" symbol that commutes with noone onto the string.
Invariant: the product of "result", active multiset and the remaining suffix equals the product of original string.
Invariant: all items in the active multiset commute.
At each new element of the suffix, drop elements from the active multiset that don't commute with it into the product, and insert the element into the multiset.
(1, {}, ABAAXCBA$) ->
(1, {A:1}, BAAXCBA$) ->
(1, {A:1, B:1}, AAXCBA$) ->
(1, {A:2, B:1}, AXCBA$) ->
(1, {A:3, B:1}, XCBA$) -> (drop A^3)
(A^3, {B:1, X:1}, CBA$) ->
(A^3 X, {B:1, C:1}, BA$) ->
(A^3 X, {B:2, C:1}, A$) ->
(A^3 X, {B:2, C:1, A:1}, $) ->
(A^3 X B^2 C A, {$}, )
This version has at most O(|A|) (size of alphabet) overhead per input symbol and it does an optimal number of multiplications.