5

I'll start off with a little GEDCOM background information before firing off my questions.

The GEDCOM 5.5.1 specification distinguishes record structures (which have xref-ids) from substructures, which are contained in records.

For notes, there is a NOTE_RECORD and a NOTE_STRUCTURE. The NOTE_STRUCTURE substructure is embedded in many other records and substructures. It is essentially a wrapper around either an embedded note or a reference to a NOTE_RECORD via its xref-id.
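
To make the two forms concrete, here is a minimal Python sketch of what a writer might emit for each. The function names (`note_structure_lines`, `note_record_lines`) and the xref `N1` are made up for illustration. The sketch assumes multi-line text is split with CONT lines and deliberately ignores the maximum line length (CONC splitting) and the escaping of `@` characters, both of which a real writer would have to handle.

```python
def _line(level, tag, value="", xref=None):
    """Join the parts of one GEDCOM line, skipping empty ones."""
    parts = [str(level)]
    if xref:
        parts.append(f"@{xref}@")
    parts.append(tag)
    if value:
        parts.append(value)
    return " ".join(parts)


def note_structure_lines(level, text=None, xref=None):
    """NOTE_STRUCTURE: either an embedded note (text) or a pointer (xref)."""
    if (text is None) == (xref is None):
        raise ValueError("give exactly one of text or xref")
    if xref is not None:
        # Pointer form: the value is the xref-id of a NOTE_RECORD.
        return [_line(level, "NOTE", f"@{xref}@")]
    # Embedded form: first line on NOTE, further lines on CONT.
    first, *rest = text.split("\n")
    return [_line(level, "NOTE", first)] + [
        _line(level + 1, "CONT", part) for part in rest
    ]


def note_record_lines(xref, text):
    """NOTE_RECORD: a level-0 record that carries its own xref-id."""
    first, *rest = text.split("\n")
    return [_line(0, "NOTE", first, xref=xref)] + [
        _line(1, "CONT", part) for part in rest
    ]
```

For example, `note_structure_lines(1, xref="N1")` produces the pointer form that refers to the record written by `note_record_lines("N1", ...)`, while `note_structure_lines(1, text=...)` keeps the text inside the enclosing record.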


My questions are from the perspective of a GEDCOM writer:

When should a note be embedded and when should it be a stand-alone record? Are there any applications that have an explicit interface to generate NOTE_RECORDs?

I am trying to simplify my GEDCOM writer while not limiting its future expansion. I can visualize an application that offers the user an opportunity to enter free-form notes (as NOTE_RECORDs) that could later be referenced in other contexts. A NOTE_RECORD also offers the opportunity to enter a source citation, user reference, and change date.

It seems to be more common that notes are entered in the context of some other record type, and in that case it is easier to implement them as embedded.

smitchell

2 Answers

6

"When should a note be embedded and when should it be a stand-alone record?" Unhelpfully, perhaps - whenever the user requests it... (but see later)

"Are there any applications that have an explicit interface to generate NOTE_RECORDs?" I thought PAF did, I know my application (Family Historian) does.

"It seems to be more common that notes are entered in the context of some other record type, and in that case it is easier to implement them as embedded." Easier, yes. But if I want to create a common note, accessed from several points, I'd be disenchanted if I couldn't.

Let me ask a couple of questions in return:

  • Simplifying your application is admirable - but what would you do if your user received a GEDCOM with NOTE_RECORDs and you couldn't deal with it? The essence of GEDCOM is transport, i.e. dealing with any conformant GEDCOM;
  • How would you advertise a program that was not 100% GEDCOM compliant in that it would not read NOTE_RECORDs?

Now - a very good answer would be, "It's only going to be me using this software and I'm not going to use someone else's GEDCOM". But only you know that.

Let me go back to the first question I answered and explain a bit more. When would I create a NOTE_RECORD? When the note that I want to create is shared across a number of individuals or families. Or when it's simply a note that doesn't link to any individual or family. Examples of the latter include a NOTE_RECORD with a list of census dates, or with details of when the calendars changed from Julian to Gregorian. Examples of the former include details of organisations that are, or might be (because I might not yet know the answer) applicable to several people. For instance, mini-histories of towns or regiments or railway companies. If you think of the ideal web-based version of that file, the mini-history would be accessed via a hyperlink and never have its text duplicated if I find several people in the same regiment / employed by the same company, etc.

In essence, NOTE_RECORDs to me are about sharing and prevention of duplicated text. I detest duplication because a week later - the duplicates are no longer quite duplicate.
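
As a rough illustration of the "no duplicated text" idea from the writer's side, the Python sketch below plans which notes to write as NOTE_RECORDs: any text attached to more than one owner gets a single record and is referenced by pointer, everything else stays embedded. This is only one heuristic, and the names (`plan_notes`, the `N1`, `N2`, ... xref scheme) are assumptions for the example. An application with an explicit shared-note interface would know what is shared and would not need to guess from the text.

```python
from collections import Counter


def plan_notes(attachments):
    """Decide, per attachment, between an embedded note and a pointer.

    attachments: list of (owner_xref, note_text) pairs.
    Returns (records, plan):
      records maps a generated note xref to its text (one NOTE_RECORD each);
      plan lists (owner_xref, "xref" or "text", value) in input order.
    """
    counts = Counter(text for _, text in attachments)
    xref_for = {}  # note text -> generated xref, for shared notes only
    plan = []
    for owner, text in attachments:
        if counts[text] > 1:
            # Shared text: allocate one record the first time it is seen.
            if text not in xref_for:
                xref_for[text] = f"N{len(xref_for) + 1}"
            plan.append((owner, "xref", xref_for[text]))
        else:
            plan.append((owner, "text", text))
    records = {xref: text for text, xref in xref_for.items()}
    return records, plan
```

With two soldiers sharing the same regimental mini-history, the history is stored once as a record and both individuals point at it, so a later correction only has to be made in one place.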

AdrianB38
  • Another way of considering it - what's the workflow? I am proactive with entering data. I am liable to enter (shared) notes first, before the thing they'll be shared with, and always enter my sources first (then my facts). If someone is reactive and only enters notes in the context of something else, they may never use NOTE_RECORDs. But this also presumes the process of data-entry is using GEDCOM directly. – AdrianB38 Mar 12 '13 at 20:58
  • Hi, @AdrianB38. Thanks for your answer. First, to your questions: NOTE_RECORDs are to be supported, with a separate interface for creating & editing them. I appreciate that NOTE_RECORDs offer sharing & reduction of duplicated text (as well as other benefits). So for widest appeal, especially to proactive note producers, any note input should offer up a selection of pre-entered notes as well as a direct "input box" for the embedded variety. – smitchell Mar 12 '13 at 21:57
  • Sounds good to me. – AdrianB38 Mar 13 '13 at 09:53
  • H'm, I think that the suggestion that you can make notes shareable by writing note records, and sort of private by embedding them, is false. I write that, because I see no word in the GEDCOM standard saying that they should be interpreted that way on import. I know from experience that Gramps puts every note it reads in a global note table, and exports every note as a note record, so I wouldn't count on anything here. – Enno Borgsteede Mar 15 '13 at 17:44
  • @Enno Borgsteede – AdrianB38 Mar 15 '13 at 22:46
  • @Enno Borgsteede - my use of "shareable" was not intended to imply anything about privacy. Simply that the text of what you refer to as an "embedded" note refers only to the person (or whatever) that it's subsidiary to, whereas the note record can refer to anyone / anything. Privacy in the context of a GEDCOM file is a nonsense anyway, given that it's a text file that can be read in Notepad, etc, as I'm sure you appreciate. – AdrianB38 Mar 15 '13 at 22:53
  • I know. I deliberately used the term 'sort of private' because I knew no other way to express the nature of an embedded note. I used it a bit like the term private in a programming language. – Enno Borgsteede Mar 16 '13 at 12:48
  • But my main message is that I think it does not make sense to expect that an embedded note will be treated as embedded by a receiving genealogy program, as is proved by Gramps, which imports all notes as global. And I think that's an important thing to note, because the OP seems to think that it does make a difference. And as far as I know, that difference is not explicitly supported by the current GEDCOM standard. – Enno Borgsteede Mar 16 '13 at 12:51
  • @Enno Borgsteede - Yes. And no. I was talking about the structure of the GEDCOM and not about the structure of the database that might be used to generate a GEDCOM. So - yes. An embedded note in a GEDCOM file may look functionally identical to a shared note once loaded into a database. Quite agree. And this gives rise to a potential oddity in that unless GRAMPS or whatever marks the database with the original type, when the data is exported back into a GEDCOM, then it can switch type. – AdrianB38 Mar 16 '13 at 17:43
  • @Enno Borgsteede - the "No" angle is that I interpreted the question as referring solely to the context of software writing a GEDCOM file. In that context, the embedded note will stay embedded, the shared note will stay shared. In that instance, the OP's assumption is correct because no database is involved. Once it's loaded into a database then, as you point out, what happens after load and what comes out the other end can differ. – AdrianB38 Mar 16 '13 at 17:49
  • OK, I get it, and after running an experiment, I can now say that any assumption about a note type getting through the GEDCOM barrier is unsafe. On one side, I know that Gramps imports all notes as shareable. On the other side, I also know that RootsMagic does the opposite: shared notes are embedded, and thus duplicated, and note records that are not actually used disappear without warning. – Enno Borgsteede Mar 17 '13 at 13:53
2

The official answer can be found in the GEDCOM standard, versions 5.5 and 5.5.1:

Logical GEDCOM record sizes should be constrained so that they will fit in a memory buffer of less than 32K. GEDCOM files with records sizes greater than 32K run the risk of not being able to be loaded in some programs. Use of pointers to records, particularly NOTE records, should ensure that this limit will be sufficient.
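
A writer that wants to honour this constraint could measure each logical record before writing it, and move large embedded notes into NOTE_RECORDs when the record is too big. The Python sketch below only does the measuring. The helper names, the UTF-8 encoding, and the single-character line terminator are assumptions for the example; the quoted text does not say exactly how the size should be counted.

```python
LIMIT = 32 * 1024  # "less than 32K", as quoted from the standard


def record_size(lines, encoding="utf-8", terminator="\n"):
    """Size in bytes of one logical record, counting line terminators."""
    return sum(len((line + terminator).encode(encoding)) for line in lines)


def fits_in_buffer(lines, limit=LIMIT):
    """True if the whole record stays under the limit."""
    return record_size(lines) < limit
```

A record for which `fits_in_buffer` returns false is a candidate for having its notes written as separate NOTE_RECORDs and referenced by pointer, since each NOTE_RECORD then counts as a record of its own.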

Enno Borgsteede
  • Personal note: Since I can't find an official text that tells you when you should embed a note, I think it's safe (and simple) to always use note records on writing. On reading however, your software must support embedded notes to be compliant. – Enno Borgsteede Mar 16 '13 at 18:57
  • The 32K constraint is now an obsolete limit. I doubt if there is any modern program today that will have such a low limit on record size. – lkessler Mar 17 '13 at 14:37
  • Well, you may be surprised. Many popular programs have a long history, so I think it's safe to assume that there is some 16-bit code in those. – Enno Borgsteede Mar 18 '13 at 15:18
  • But that's not the point. For me the original question seemed to be about a possible functional difference between embedded notes and note records, and as far as I can see, there is none. – Enno Borgsteede Mar 18 '13 at 15:20