
We have a paper about a novel statistical model and we describe its input data. Should we write

The input data also include a vector of numbers F ...

or

The input data also includes a vector of numbers F ...

?

Google search tells me that:

data is treated as singular when used as a mass noun to mean “information” and as plural when used to mean “individual facts.”

I don't think our context is a mass noun, but it is not "individual facts" either. I would be inclined to use the plural, which sounds more natural to me, but a colleague has a similar feeling about the singular. Which one is correct?

Tomas
  • IMO data is almost always singular in most writing. I don't know about statistics, which may have a completely different convention, but in general English "data includes" generally feels more natural today. Maybe ask on a statistics forum if you are specifically worried about the usage in the field of statistics. If you search here for data singular you will find a lot of relevant questions, but as I said, statisticians may not follow the same conventions. – Stuart F Oct 06 '23 at 10:35
  • This is almost certainly opinion-based: neither usage is incorrect per se. 'I don't think our context is a mass noun' indicates a misunderstanding; you can choose the count (which takes a plural form verb, as the word is plural in form) or noncount (which always takes a singular verb form, like the majority of noncount usages) usage. And while the noncount usage is, as Stuart says, far more common in general, some academics prefer ... even insist on ... the count usage when used in their domain (eg stats). // Choosing 'the data include' emphasises individual items of data. – Edwin Ashworth Oct 06 '23 at 10:44
  • @EdwinAshworth Thanks (and to Stuart as well). With "The input data" I mean a very specific and well-defined "package" of all the data that the model takes as input. And I am saying that this one package also contains this specific thing. But I am not sure if this usage makes the "input data" plural or singular. – Tomas Oct 06 '23 at 10:53
  • As I said, there is no basic rule dictating this. (There may well be pressure from others of a more prescriptivist bent, and the choice may well be affected by seeking to accommodate the intended audience.) But neither should be considered ungrammatical. – Edwin Ashworth Oct 06 '23 at 12:02
  • This also differs between American and British English. I think Brits treat it as plural rather than a mass noun. – Barmar Oct 06 '23 at 15:08
  • Include creates the image of input data as a container of the vector. This is a notionally singular usage driven by the particular choice of verb. But you could also have "the input data comprise several vector elements", where the verb creates a pairing between input data and the input fields. I was under the impression that you'd normally just use input as the collective noun. – Phil Sweet Oct 06 '23 at 18:53
