It seems to me that it would be more natural to treat the Dirac delta as a piecewise-defined function, as described here, with the scaling rule $\delta(ax)=\delta(x)$. This way we keep all the integral-transform properties of the delta function and its derivatives, as long as its argument is the integration variable. At the same time, we can apply arbitrary functions to the delta function, including raising it to any positive power, evaluate it at zero, and consider it outside of integrals.
It also seems entirely natural to me that the derivative of a scaling-invariant function, such as $\operatorname{sign} x$, should itself be scaling-invariant. In particular, this allows the sign function to be generalized to dual numbers while keeping its properties.
So, the question is: why was a different convention, namely $\delta(ax)=\frac{1}{|a|}\delta(x)$, adopted instead, one which does not even allow a value to be ascribed to $\delta$ at zero?
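
For concreteness, this is how I understand the adopted convention to act under an integral. The test function $f$ and the substitution variable $u = ax$ are my own notation, introduced only for illustration, and I assume $a \neq 0$ and that $f$ is continuous at zero:

$$
\int_{-\infty}^{\infty} \delta(ax)\, f(x)\, dx
= \frac{1}{|a|} \int_{-\infty}^{\infty} \delta(u)\, f\!\left(\frac{u}{a}\right) du
= \frac{f(0)}{|a|}.
$$

The factor $\frac{1}{|a|}$ comes from $dx = \frac{du}{a}$ together with the reversal of the integration limits when $a<0$.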