The title is not ideal, but I do not know how to describe the scenario any better.
The context
Consider taking measurements of two different quantities:
- The time needed for a car to traverse the city of Koenigsberg
- The time needed for a person to traverse the city of Koenigsberg
Now, this is how measurements are done for every car and person:
- As they enter the town, I would start a timer.
- Each car or person would choose its own path inside the city but would eventually get out.
- That is when I would stop the timer and record the time.
I get to collect 2 sets of times:
- Let us call $C$ the random variable capturing the times of cars.
- Let us call $H$ the random variable capturing the times of people.
Each random variable follows its own distribution. Plotting the collected values as a histogram, showing the frequencies of ranges of times (bins), gives an approximation of the PDFs of $C$ and $H$: $f_C$ and $f_H$.
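To make the histogram step concrete, here is a minimal sketch in plain Python. The gamma-distributed samples are purely hypothetical stand-ins for my recorded times (I am not claiming the real times follow that distribution), and `histogram_density` is a helper name I made up for illustration. It rescales the bin counts so that the bars have total area 1, which is what lets the histogram be read as an estimate of a PDF.

```python
import random


def histogram_density(times, n_bins=30):
    """Return (edges, heights) with heights scaled so the bars have total area 1."""
    lo, hi = min(times), max(times)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for t in times:
        # The largest value sits on the right edge; assign it to the last bin.
        i = min(int((t - lo) / width), n_bins - 1)
        counts[i] += 1
    # Dividing by (sample size * bin width) turns counts into a density.
    heights = [c / (len(times) * width) for c in counts]
    edges = [lo + k * width for k in range(n_bins + 1)]
    return edges, heights


rng = random.Random(0)
# Hypothetical placeholder data (minutes), not real measurements.
car_times = [rng.gammavariate(4.0, 5.0) for _ in range(2000)]
person_times = [rng.gammavariate(9.0, 10.0) for _ in range(1000)]

edges_c, f_c_hat = histogram_density(car_times)     # rough estimate of f_C
edges_h, f_h_hat = histogram_density(person_times)  # rough estimate of f_H
```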
Question
Consider now this procedure:
- I take all the measurements done for cars.
- I take all the measurements done for people.
- I merge those into one collection.
- I plot the frequency histogram of that set.
By doing so I would get an estimate of a third PDF, $f_X$, describing a third random variable which I will call $X$.
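The merging procedure itself can be sketched as follows. Again, the sample data are hypothetical placeholders (gamma draws with arbitrary parameters and arbitrary sample sizes), and `bin_frequencies` is an illustrative helper, not part of any library. The sketch only reproduces the steps listed above; it does not assume anything about how $X$ relates to $C$ and $H$.

```python
import random


def bin_frequencies(times, lo, hi, n_bins):
    """Relative frequency of observations in n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for t in times:
        # Clamp so that a value equal to hi lands in the last bin.
        i = min(int((t - lo) / width), n_bins - 1)
        counts[i] += 1
    return [c / len(times) for c in counts]


rng = random.Random(1)
# Hypothetical placeholder data (minutes), not real measurements.
car_times = [rng.gammavariate(4.0, 5.0) for _ in range(2000)]
person_times = [rng.gammavariate(9.0, 10.0) for _ in range(1000)]

# Merge both sets into one collection; the car/person label is dropped here.
pooled_times = car_times + person_times

lo, hi = min(pooled_times), max(pooled_times)
freq_x = bin_frequencies(pooled_times, lo, hi, n_bins=40)  # histogram of X
```

Note that once the two sets are merged, the collection no longer records which observation came from a car and which from a person, and the number of observations of each kind (2000 and 1000 in this made-up example) is part of what goes into the pooled histogram.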
- How does $X$ relate to $C$ and $H$?
- How can I mathematically retrieve $f_X$ from $f_C$ and $f_H$? What is the relation connecting the 3 distributions from an analytical perspective?
Some further reflection
This looks as if $X$ is a combination of $C$ and $H$:
$$X = g(C, H)$$
But what is $g$?
If this were a scenario such as $X = C + H$, it would be simple: $X$ would be the sum of two random variables, and there is extensive literature on how to tackle that situation. But here $X$ is not the sum of $C$ and $H$; it is something else.