There are two measuring devices. Device 1 is the standard and is considered to be very accurate.

The same five items are measured on both devices. Device 2 is calibrated to device 1, so the means are already identical.

So far I have thrown all five measurement groups for each device together, so I am analyzing 200 vs 50 measurements. But I feel like this might not be "allowed", as the measured items are very similar but not exactly the same.

Sorry for the noob question, but what is the best way to reach a conclusion as to whether device 2 is "good"?

Frank
  • Where are the numbers 200 and 50 coming from? Is each item measured 10 times on each device? – Peter Flom Nov 24 '23 at 13:00
  • Five items are measured 10 times on device 1 and 40 times on device 2. The items are very similar but not identical. So you could almost group them together, but it's probably technically not allowed. – Frank Nov 24 '23 at 13:09

1 Answer

For each item I would make plots. The first that comes to mind is a parallel strip plot with a box plot overlaid. The box won't be great with only 10 measurements, but it still might help. Then I might do a Tukey mean-difference plot (aka a Bland-Altman plot). Then look. As Yogi Berra may have said, "you can see a lot by looking."
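The numbers behind a mean-difference plot can be sketched in a few lines. This is only an illustration under stated assumptions: the readings are made up, the function names `mean_difference_points` and `limits_of_agreement` are hypothetical, and because the two devices cannot measure the same spot, the pairing is done on per-item means rather than on individual readings.

```python
from statistics import mean, stdev

# Hypothetical readings (made-up values): item label -> repeated measurements.
device1 = {"A": [10.1, 9.9, 10.0], "B": [12.0, 12.2, 11.8], "C": [8.1, 7.9, 8.0]}
device2 = {"A": [10.4, 9.7, 10.2, 9.9], "B": [12.3, 11.6, 12.1, 12.0],
           "C": [8.3, 7.8, 8.0, 8.1]}


def mean_difference_points(ref, test):
    """Return (item, average, difference) using per-item means of both devices."""
    points = []
    for item in sorted(ref.keys() & test.keys()):
        m_ref, m_test = mean(ref[item]), mean(test[item])
        # x-coordinate: average of the two devices; y-coordinate: test minus reference
        points.append((item, (m_ref + m_test) / 2, m_test - m_ref))
    return points


def limits_of_agreement(points):
    """Return (lower, bias, upper) as bias +/- 1.96 * SD of the differences.

    1.96 is the conventional multiplier; with only a handful of items
    the resulting limits are rough.
    """
    diffs = [d for _, _, d in points]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias - half_width, bias, bias + half_width


points = mean_difference_points(device1, device2)
for item, avg, diff in points:
    print(f"item {item}: average={avg:.3f}, difference={diff:+.3f}")
print("lower, bias, upper:", limits_of_agreement(points))
```

Plotting `difference` against `average` for each item, with horizontal lines at the three returned values, gives the usual picture. Whether the limits are narrow enough is still a judgment for the application.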

As to what is "good" -- you decide. That isn't statistically answerable and it depends on the field. What are you measuring? Some things require much more precise measurement than other things. Even the same thing may require more precise measurement in different contexts. For instance, the dice used in Las Vegas are much more precisely measured than the ordinary dice you can buy in a store.

So, think about what degree of precision you need in your application. You may need to consult with subject matter experts.

Peter Flom
  • Thanks for your answer. – Frank Nov 24 '23 at 13:33
  • Didn't mean to send that answer yet,

    Thanks for your answer. I should have been a bit more clear in my question.

    The means are already adjusted to be exactly the same, by calibrating device 2 to match device 1. All that's left to figure out is things like variability, spread, etc.

    Also, each measurement on the same item is taken in a different spot, so no two measurements are really the same. Each measurement is different, and each item is also different. The means are quite close, or even very close, to each other. Maybe it's not so bad to throw all 5 items into one group?

    – Frank Nov 24 '23 at 13:41
  • Yes, I understood that the means are the same. The plots are a good way to measure variability. If each measurement is done in a different spot, then ... that's just noise and a bad experimental design. I don't see how you can adjust for that. Throwing all 5 items into one group seems like a mistake, but, you might do that in the plots and use color to distinguish the item. – Peter Flom Nov 24 '23 at 13:47
  • Device 1 uses a destructive testing method, which is why there can't be two measurements in the same spot.

    The measurements are spread evenly, to create a "complete" picture. We do as many measurements as possible but after 9 evenly spaced measurements, that item is pretty much destroyed and can't be measured anymore.

    Also sadly it's not possible to test the same spot with both devices.

    Thanks for your answers, if you have any idea how I could make a good assessment of the quality of the measurements from device 2, I'd love to hear it.

    – Frank Nov 24 '23 at 13:59
  • I already gave my ideas on how to assess this. – Peter Flom Nov 24 '23 at 14:03
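
One way to put a number on the pooling question raised in these comments is to compare the spread within items against the spread after throwing every item into one group. The sketch below is an assumption-laden illustration, not a method proposed in the thread: the readings are made up, and the helper names `pooled_within_item_sd` and `naive_sd` are hypothetical. It is the numerical counterpart of keeping items distinguishable in the plots.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical readings (made-up values): item label -> repeated measurements.
device2 = {"A": [10.4, 9.7, 10.2, 9.9], "B": [12.3, 11.6, 12.1, 12.0],
           "C": [8.3, 7.8, 8.0, 8.1]}


def pooled_within_item_sd(readings):
    """Pooled SD: sqrt(sum((n_i - 1) * s_i^2) / sum(n_i - 1)) over items."""
    numerator = sum((len(v) - 1) * variance(v) for v in readings.values())
    denominator = sum(len(v) - 1 for v in readings.values())
    return sqrt(numerator / denominator)


def naive_sd(readings):
    """SD of all readings in one group; this also absorbs item-to-item differences."""
    return stdev([x for values in readings.values() for x in values])


print("pooled within-item SD:", round(pooled_within_item_sd(device2), 3))
print("naive all-in-one SD:  ", round(naive_sd(device2), 3))
```

If the two figures are close, the items really are similar enough that grouping changes little. If the naive figure is clearly larger, the difference comes from the items rather than from the device. Computing the pooled within-item SD for each device separately gives a like-for-like comparison of their variability.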