
Reading through this page on MongoDB website it states:

This MongoDB Wire Protocol Specification is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. You may not use or adapt this material for any commercial purpose, such as to create a commercial database or database-as-a-service offering.

Does this mean that if I want to create a commercial database, or an open-source database with a DAAS offering, I cannot use this message structure, i.e. this series of 16 bytes?

struct MsgHeader {
    int32   messageLength;
    int32   requestID;
    int32   responseTo;
    int32   opCode;
}

If so, how did Amazon get away with it with their DocumentDB? Additionally, FerretDB OSS, which implements the MongoDB wire protocol, is licensed under Apache 2.0. Does that mean that I can take FerretDB, with its implementation of the MongoDB wire protocol, and offer it as a service?

  • Related question on Law.SE: https://law.stackexchange.com/q/82632. Note also that even if the license applies to the protocol and not just the specification document, copyright laws often contain exceptions for interoperability purposes. No Creative Commons license prohibits me from doing things that I'm allowed to do by law. – amon Aug 26 '22 at 11:50

1 Answer


The reasoning that FerretDB is relying on in order to "use" the MsgHeader structure and other elements in the wire protocol (without complying with the BY-NC-SA license) is stated in the README.md document of FerretDB:

FerretDB (previously MangoDB) was founded to become the de-facto open-source substitute to MongoDB.

[...]

In its early days, its ease-to-use and well-documented drivers made MongoDB one of the simplest database solutions available. However, as time passed, MongoDB abandoned its open-source roots; changing the license to SSPL - making it unusable for many open source and early-stage commercial projects.

So, the developers of FerretDB do not consider it to be an adaptation of MongoDB; they consider it to be a substitute for it.

Does this mean that if I want to create a commercial database, or an open-source database with a DAAS offering, I cannot use this message structure, i.e. this series of 16 bytes?

In my opinion, "use" in this sense is too broad and ambiguous a word to carry weight in a copyright license. Although the word appears in the note at the top of the MongoDB Wire Protocol Specification document, that document also states that the governing license for the specification is the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

Note that the license governs only the specification document, not the facts inside it. In the example presented, the fact is that the MsgHeader structure contains four 32-bit integers representing the message length, request ID, response "To", and opcode. The only copyrightable element would be how that fact is represented in the document.

If you translate that fact into code in the language you're writing in, and there is only one way (or only a few ways) to express it in that language, then in my opinion that expression must be excluded from a copyright analysis. I would therefore not consider such code to be based on or adapted from the specification in the copyright sense.
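
To illustrate how little room for variation there is, here is a minimal sketch of the same 16-byte header in Python. The names HEADER, pack_header and unpack_header are my own, chosen for illustration; the only things taken from the protocol are the four signed 32-bit fields, their order, and the little-endian byte order. I would expect most independent implementations to look much like this.

```python
import struct

# "<" selects little-endian byte order; each "i" is one signed 32-bit integer.
# Four of them give the 16-byte header: messageLength, requestID, responseTo, opCode.
HEADER = struct.Struct("<iiii")


def pack_header(message_length, request_id, response_to, op_code):
    """Serialize the four header fields into 16 bytes."""
    return HEADER.pack(message_length, request_id, response_to, op_code)


def unpack_header(data):
    """Read the four header fields from the first 16 bytes of data."""
    return HEADER.unpack_from(data)


# Example values are made up; 2013 is the opcode for OP_MSG.
raw = pack_header(16, 1, 0, 2013)
fields = unpack_header(raw)
```

Apart from naming, the only real choice left to the programmer is which library call to use for reading four integers.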

This is all speculation on my part, but here are some other possible ways that the MsgHeader definition may have made its way into the FerretDB implementation (and probably any other software package that understands this protocol) without being an adaptation of the specification document:

  1. The developers didn't use the above document in their development, but instead used an earlier published version (i.e. before October 2018) from a time when MongoDB's documentation and/or source code were available under an open source license that explicitly allows this particular use.

  2. The developers didn't use the documentation at all, but instead analysed MongoDB's message outputs themselves using protocol analysis tools and developed their own specification of the message header based on that analysis.

  3. The developers did make use of the current protocol specification document, but did so in a way that does not require copyright permission. Even though the document says "you may not use this material for any commercial purpose", a document license cannot prevent you from using the document in a way that is otherwise allowed by copyright law.
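
As a rough sketch of what the traffic analysis in option 2 could look like, the Python below tests two hypotheses against captured messages without consulting any documentation. The function names are hypothetical, and the sample bytes are synthetic stand-ins that I made up for illustration; they are not real MongoDB output.

```python
import struct


def length_prefix_matches(message):
    """Hypothesis: the first 4 bytes are a little-endian int32 equal to the message size."""
    if len(message) < 4:
        return False
    (claimed,) = struct.unpack_from("<i", message, 0)
    return claimed == len(message)


def reply_echoes_request_id(request, reply):
    """Hypothesis: bytes 8..12 of a reply repeat bytes 4..8 of the request it answers."""
    if len(request) < 8 or len(reply) < 12:
        return False
    return request[4:8] == reply[8:12]


# Synthetic stand-ins for a captured request/reply pair (made up, not real traffic).
request = struct.pack("<iiii", 20, 7, 0, 2013) + b"\x00" * 4
reply = struct.pack("<iiii", 16, 99, 7, 2013)

consistent = (
    length_prefix_matches(request)
    and length_prefix_matches(reply)
    and reply_echoes_request_id(request, reply)
)
```

If hypotheses like these hold across many captured messages, the analyst ends up with a description of the header that was derived from observation alone.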

See also: Can I cleanroom code by myself, if public specifications already exist?

See also: https://law.stackexchange.com/questions/81159/is-this-copyright-infringement/81169

... how did Amazon get away with [implementing the protocol] with their DocumentDB?

I don't know the answer to this. But if the information in the MongoDB Atlas vs. Amazon DocumentDB comparison page is accurate, then it appears as though the Amazon implementation is "based on" (in a functional sense; not in a copyright sense) an earlier version of MongoDB, well before it was changed to the SSPL license.

Note that MongoDB, Inc. also offers so-called "commercial licenses" with special terms. I suppose it's possible that Amazon has a special license from MongoDB, Inc. to put the protocol implementation in their product, but if that were true, then I'd be surprised that MongoDB is publishing the above comparison page in such an unflattering way to Amazon.

Brandin
  • I don't understand the fact/document distinction you're making when the purported facts are the creative choices the authors made. Is "Darth Vader is the bad guy in Star Wars, a story in which his son, Luke Skywalker fights him" a fact not entitled to copyright protection? Or is it a creative choice fully entitled to it? Every creative choice protected by copyright is also the fact that that creative choice was made, isn't it? What does a story set in the Star Wars universe take from Star Wars but these "facts"? – David Schwartz Aug 26 '22 at 20:52
  • @DavidSchwartz In copyright law, probably the better terminology is "idea" instead of "fact". That is, whoever wrote that specification had the "idea" to pack 4 pieces of information together as 4 32-bit integers. That "idea" is not protectable, but the expression of the idea is. For the Star Wars example, the idea of a story where the villain is actually the father of the hero is probably not protectable, but the specific expression in Star Wars that involves the specific characters, like Darth Vader and Luke Skywalker, is. – Brandin Aug 27 '22 at 06:10
  • My instinct as a programmer is that the "struct MsgHeader" example shown in the document is basically the only basic way to express the idea of 4 consecutive 32-bit integers in most programming languages. For Star Wars, however, there are basically unlimited different ways one could write a story about a villain who actually turns out to be the father of the hero. – Brandin Aug 27 '22 at 06:13
  • But the choice to use 4 consecutive 32-bit integers for this structure, in that order, for this purpose is one of a basically unlimited number of ways one could create a structure for a purpose. Surely you're not arguing that the entire DB wire protocol is the only basic way to make a DB wire protocol. – David Schwartz Aug 27 '22 at 06:20
  • @DavidSchwartz For this particular example, I believe it's the only way to write a compatible implementation. To be sure, one should probably try a clean room implementation. Someone who has not seen the specification could use some other means (such as analyzing the protocol traffic examples), and then use that information to build his own specification. I haven't tried this myself, but I suspect that whoever does this will still arrive at a similar result: there appear to be 4 items in the header, they are all 32-bits, and the purpose of them appears to be: length, identifier, and so on. – Brandin Aug 27 '22 at 09:21
  • That wouldn't help. Unless those aspects of the protocol are ideas, a clean room implementation would still violate the copyright because you brought those aspects of the protocol into the clean room. You can't watch Star Wars on a TV and then claim that you implemented Star Wars yourself because you only saw some pictures on a TV screen made with the DVD. If you take anything containing the protected expression into the clean room, it is no longer clean. (The 'took only what I needed for compatibility' defense has been rejected by courts.) – David Schwartz Aug 27 '22 at 21:13
  • @DavidSchwartz Can you point to a reference of the case where it's been rejected? I thought the opposite was true. Still, I think this example is much different from copying elements from fiction such as Star Wars. For fiction, the analogous action of a "clean room" implementation is probably to make a 'parody' of Star Wars instead, which has been done, and which has been found to be OK (fair use) with specific requirements (your parody can't be called "Star Wars," for example). For protocol implementations, we'd have to look carefully at what cases there are on this specific topic. – Brandin Aug 29 '22 at 04:24
  • Google v. Oracle. Google held that they took only what they could observe without looking at the code and only what they needed for compatibility. The court held that the rule is that you can take only what the copyright holder needed for compatibility, not only what you needed for compatibility. The designer of the wire protocol didn't need any of this for compatibility, it was creative choices they made which you are stealing by observing their effects because you need them for compatibility. This is precisely what the court rejected. – David Schwartz Aug 29 '22 at 15:51
  • But the clean room implementation idea is pretty clearly pointless. If the things you're taking aren't protected by copyright, why do you need a clean room? And if the things you're taking are protected by copyright, how does the clean room help? These are the things you brought into the clean room, so you can't argue you didn't have access to them and that's the entire point of a clean room. With or without a clean room, the issue is the same -- are these API decisions copyrightable or are they not? – David Schwartz Aug 29 '22 at 16:17
  • @DavidSchwartz Are you talking about the case discussed here? https://opensource.stackexchange.com/questions/762/what-are-the-implications-of-the-google-vs-oracle-case-on-the-state-of-public-a/1454 I think that's a bit different; they copied actual code; 7,000 lines of "creative code" according to an article. I would personally dispute that the "struct msgHeader" example here contains any creative code. Any developer who knows how to write a C-like structure, given a diagram, an explanation, or a specification of what the data look like on the wire, would write similar code. – Brandin Aug 30 '22 at 06:25
  • I think you're missing the point. The only issue is whether what you're taking is protected by copyright or not. The issue is whether creative choices in the wire protocol are protected or not, such as the choice to use four 32-bit values with those meanings in that order. There is no defense available that you took them because you needed them for compatibility. So if they're significant protectable expression, you are screwed. The clean room doesn't help for precisely the reason you stated -- they follow immediately from the design decisions and you took the design decision into the room. – David Schwartz Aug 30 '22 at 22:24
  • @DavidSchwartz At this point I think you should summarize this into your own Answer and show your point of view with examples. All the example cases I've seen seem to point in the other direction -- MongoDB, Inc. cannot really rely on copyright protecting these aspects of their protocol (they could have applied for a patent instead) in an original implementation, but they won't dare bring a court case which they might lose, because that may damage their business model beyond repair. The – Brandin Aug 31 '22 at 05:56
  • You can't patent the choice to use four 32-bit numbers or the choice to put those variables in that particular order because there are dozens of equally good ways to make those choices. Copyright only applies in cases where there are numerous equally good ways to do something. Patent only applies in cases where there aren't. – David Schwartz Aug 31 '22 at 06:04