Sunday, April 22, 2012

The pragmatic approach to the copyrightability of API packages: case-by-case analysis

The first major decision in Oracle v. Google will relate to the copyrightability of the 37 API packages Oracle is asserting (alongside other intellectual property). Absent a genuine factual dispute, this will be decided as a matter of law by Judge William Alsup. A decision could come down very soon. Over the last couple of days, the judge sent the parties three sets of questions, all of which relate to the copyrightability of API packages and have to be answered by 3 PM local time today.

Unless this case is settled, I believe that whichever party loses the decision on this particular issue will make this one the first and most important item of an appeal. That's just my personal belief based on how much depends on this.

Having followed the copyright-related pleadings in Oracle v. Google for well over a year and having spent the better part of this weekend refreshing and augmenting my related knowledge, I'm now going to dissect this question in detail. This is a relatively lengthy post, but I'm doing my best to make this complex issue as clear and comprehensible as possible.

The 37 API specifications in the broader context of the case

In this litigation, Oracle is alleging the violation of a long list of intellectual property rights by Google's Android: two patents and dozens of copyrights. The 37 API packages are a subset of the asserted copyrights, and their copyrightability is just one (though highly important) question related to that particular subset.

Here, we don't have to take the two patents-in-suit into consideration. If they are infringed, they are not infringed by Google's use of the API specifications.

For the copyright part of the case, Oracle stated the infringement issues as recently as in an April 12, 2012 answer to questions from the court.

There are issues with 10 Android source code files that are not part of Android's implementation of the 37 API packages. Eight of those files (and their corresponding object code) contain copyrighted material that was apparently processed by a decompiler. The file names are AclEntryImpl.java, AclImpl.java, GroupImpl.java, OwnerImpl.java, PermissionImpl.java, PrincipalImpl.java, PolicyNodeImpl.java, and AclEnumerator.java. One of them (PolicyNodeImpl.java) was attached to Oracle's amended complaint. I found out about six of the seven other files and blogged about them in January 2011. In addition, Oracle says that "[c]omments contained in two Android source code files, CodeSourceTest.java and CollectionCertStoreParametersTest.java", were copied. Google says that it has removed any infringing material that was contained in those 10 (8+2) files in earlier versions of Android. Oracle argues that most Android devices still contain at least some of this material. At any rate, for the purposes of this post we don't have to go into more detail.

Oracle also says that "[t]wo Android source code files (and corresponding object code), TimSort.java and ComparableTimSort.java, which are part of Android's implementation of the 37 API packages", raise copyright issues. I mentioned the TimSort issue in my previous blog post. Google's opening presentation argued that the rangeCheck function represents only 9 lines of code out of 924 in the file and 15 million lines of code in Android as a whole. Google furthermore claims that Oracle's own damages expert, Professor Iain Cockburn, attached no commercial value (literally $0, according to Google) to those 9 lines. No matter how small this infringement may be, it is yet another incident of direct copying and, as such, detrimental to Google's credibility.

Now we're getting to the last but most important part of the asserted copyrights, the 37 API packages. These are their names:

java.awt.font, java.beans, java.io, java.lang, java.lang.annotation, java.lang.ref, java.lang.reflect, java.net, java.nio, java.nio.channels, java.nio.channels.spi, java.nio.charset, java.nio.charset.spi, java.security, java.security.acl, java.security.cert, java.security.interfaces, java.security.spec, java.sql, java.text, java.util, java.util.jar, java.util.logging, java.util.prefs, java.util.regex, java.util.zip, javax.crypto, javax.crypto.interfaces, javax.crypto.spec, javax.net, javax.net.ssl, javax.security.auth, javax.security.auth.callback, javax.security.auth.login, javax.security.auth.x500, javax.security.cert, javax.sql

Oracle estimates, based on a count of the lines of a sample of the Android documentation, that "Google copied an estimated 103,400 lines from the Java API specifications into Android's documentation". Oracle furthermore alleges infringement by

  • "[t]he selection, arrangement, and structure of all API elements (including names) of the Android class library source code and object code that implements the 37 API packages",

  • "[t]he declarations of the API elements in the Android class library source code and object code that implements the 37 API packages", and

  • "[t]he Android class library source code and object code that implements the 37 API packages as derivative works of the Java API specifications, including the English-language descriptions".
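To make the phrase "selection, arrangement, and structure of all API elements (including names)" more tangible, here is a minimal sketch that uses Java reflection to print the package, class and public method declarations of one class from one of the 37 packages (java.util.zip). The class name ApiStructureSketch and the helper publicDeclarations are my own illustrative inventions; nothing here is taken from the court filings, and the output depends on the Java runtime you use.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Illustrative only: walks the package -> class -> method hierarchy
// of a single class from java.util.zip.
class ApiStructureSketch {

    // Returns "returnType name(paramTypes)" for each public method the class declares.
    static List<String> publicDeclarations(Class<?> type) {
        List<String> out = new ArrayList<>();
        for (Method m : type.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers())) {
                continue; // skip private and package-private helpers
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getSimpleName())
              .append(' ')
              .append(m.getName())
              .append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getSimpleName());
            }
            out.add(sb.append(')').toString());
        }
        return out;
    }

    public static void main(String[] args) {
        Class<?> type = CRC32.class;
        System.out.println("package " + type.getPackage().getName());
        System.out.println("  class " + type.getSimpleName());
        for (String declaration : publicDeclarations(type)) {
            System.out.println("    " + declaration);
        }
    }
}
```

The listing shows names and how they are grouped, and nothing about how any method works internally. Whether that grouping is protectable is the legal question discussed in this post; the sketch only shows what kind of material is meant.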

Oracle explicitly does not bring copyright infringement allegations against these parts and aspects of Android:

  • "Android's use of the Java programming language",

  • "[a]ny particular name of an API element, including names for each package, class, exception, field, method, and parameter name, considered individually" (this was the only item with respect to which a copyright-related summary judgment motion by Google succeeded, but the judge made clear that Oracle could still assert rights in larger structures consisting of such names),

  • "[t]he Android source code implementing the APIs contained in the 37 packages at the line-by-line level, except as set forth [in the infringement allegations described] above",

  • "[t]he idea of APIs",

  • "[t]he Dalvik virtual machine (without associated class libraries)",

  • "Android API packages and associated class libraries other than the 37 API packages and associated class libraries listed [in the infringement allegations described] above", and

  • "[o]ther Android source code and object code except as set forth [in the infringement allegations described] above".
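The two lists above distinguish between "declarations of the API elements" and implementing source code "at the line-by-line level". As a rough illustration of that distinction in programming terms, the following sketch shows one declaration with two independently written implementations. The interface Clamp and both classes are hypothetical names I made up for this purpose; they do not come from Java, Android or the pleadings, and how the distinction maps onto copyright law is for the court to decide.

```java
// Hypothetical API element, invented for illustration.
interface Clamp {
    // The declaration: method name, parameter types and order, return type.
    int clamp(int value, int low, int high);
}

// One implementation of the declaration above (assumes low <= high).
class BranchingClamp implements Clamp {
    @Override
    public int clamp(int value, int low, int high) {
        if (value < low) {
            return low;
        }
        if (value > high) {
            return high;
        }
        return value;
    }
}

// A second implementation: identical declaration, different lines of code.
class MinMaxClamp implements Clamp {
    @Override
    public int clamp(int value, int low, int high) {
        return Math.max(low, Math.min(value, high));
    }
}

class DeclarationVsImplementationSketch {
    public static void main(String[] args) {
        Clamp first = new BranchingClamp();
        Clamp second = new MinMaxClamp();
        // Both calls look the same to the caller and give the same result.
        System.out.println(first.clamp(15, 0, 10) == second.clamp(15, 0, 10));
    }
}
```

A program calling clamp cannot tell the two classes apart, because it only depends on the declaration. The bodies differ line by line.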

Disputed distinction between Java language and Java APIs

Now that I have mentioned that "Android's use of the Java programming language" is explicitly excluded from the targets of Oracle's copyright infringement allegations, it's necessary to explain that there's a dispute between the parties as to whether the Java language (i.e., the keywords of the programming language) and the API packages are separate issues (as Oracle essentially contends) or inextricably linked (as Google would have the court believe).
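For readers who don't program in Java, here is a small sketch of what the distinction looks like in practice. The class and method names are my own and purely illustrative. The first method uses only language constructs (keywords, operators, an array, a loop); the second gets the same result by calling into java.util, one of the 37 packages listed above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class LanguageVsApiSketch {

    // Language only: keywords, operators, an array and a loop.
    static int largestUsingLanguageOnly(int[] values) {
        int best = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] > best) {
                best = values[i];
            }
        }
        return best;
    }

    // Same result, but relying on the java.util API package.
    static int largestUsingApi(List<Integer> values) {
        return Collections.max(values);
    }

    public static void main(String[] args) {
        System.out.println(largestUsingLanguageOnly(new int[] {3, 9, 4}));
        System.out.println(largestUsingApi(Arrays.asList(3, 9, 4)));
    }
}
```

Note that the separation is not perfectly clean even in this toy example: every Java class implicitly extends java.lang.Object, and the main method relies on String and System from java.lang. That small overlap is part of what the parties are arguing about.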

There are two ways in which Google seeks to benefit from blurring or even eliminating the distinction between the Java language and the APIs. Since Oracle declared the language itself freely available in certain contexts, Google hopes to have an equitable defense. Also, Google believes that it can convince the judge (or, if necessary, the jury) of the idea that languages shouldn't be copyrightable.

In its opening presentation, Google highlighted such quotes (from documents and depositions) as "Package java.lang [p]rovides classes that are fundamental to the design of the Java programming language". Oracle points to the fact that, as its copyright expert wrote, "it was not necessary to include any particular class or package (beyond perhaps a very few classes like Object and Class that are tied closely to the Java language) for the Java language to function". That very limited overlap was shown on day 3 of the trial by Dr. Mark Reinhold, the Chief Architect of Oracle's Java Platform Group (see page 20 of his slide deck). Oracle furthermore notes that the Java language itself changes rarely (only three versions so far) compared to the APIs, which change frequently and grow fast (see page 19 of Dr. Reinhold's slides). And Oracle makes references to what Google's own copyright expert wrote last summer:

"'Java' may refer to three very different things: the Java programming language, the Java Application Programming Interfaces (APIs), or software source code that references and implements the APIs."

In other words, Google's position on this has shifted. I don't have the impression at this stage that Judge Alsup buys the conflation Google proposes now against better knowledge and its own earlier arguments. One thing the judge was nevertheless interested in is how the parties view the copyrightability and patentability of programming languages on the one hand and APIs on the other hand. Even though the Java programming language isn't at issue in this litigation, the judge presumably asked those patentability and copyrightability questions also because he wanted to understand which of the two categories of intellectual property rights has, or from a policy point of view should have, scope for the kind of creation Sun made and Oracle acquired. For example, if the programming language appeared more protectable than the API, it could be considered Oracle's fault that it declared the language itself to be freely available. Or in another scenario, if it was clear that APIs are meant to be protected by patent law, the judge probably wouldn't want to stretch the boundaries of copyright law for someone who failed to obtain patent protection. It's even possible that the judge concludes that neither patents nor copyright can protect those APIs, but I think he believes in the rule of law and in intellectual property and will be hesitant to deny all protection for an important contribution to innovation. All of the case law cited by the parties on copyrightable subject matter is more or less useful in terms of what it says about general principles of copyrightability, but there's no single case that would be able to answer the question. And in particular, there hasn't been any high-profile case like this that raised any reasonably similar issue. That's why Judge Alsup's court as well as any appellate courts who may subsequently look at this issue will have to factor in some general policy considerations.

Before we continue with the parties' positions, I'll state my personal, long-standing perspective on this policy question.

Patentability vs. copyrightability of programming languages and APIs

The question of copyrightability vs. patentability reminds me of my discussions with European politicians in 2004 and 2005 during the legislative process concerning a proposed EU law on the patentability of "computer-implemented inventions". As an anti-software patent but not anti-all-IP activist, I accepted some politicians' opinion that copyright law on its own (even if combined with trade secret protection) may not provide sufficient protection for certain kinds of software-related innovation. I told them that software still shouldn't be patented. Instead, I proposed to think about ways to strengthen copyright in connection with software.

In that same political debate I also argued that overreaching "interoperability" privileges would be inconsistent. Either there is a certain kind of protection and then it should, in principle, also apply to interoperability-related inventions and creations, or there isn't. Nevertheless, in connection with industry standards I believe that FRAND commitments must be honored -- I've also had that position for years.

While that legislative process was ongoing in Europe, someone named Greg Aharonian brought a lawsuit, in the same district as the one in which Oracle v. Google is now being tried, aiming to have the courts determine that software shouldn't be copyrightable at all -- but only patentable. A Reuters story on that lawsuit can still be found on the Internet (1, 2). I was shocked when I heard about that, and glad when he lost.

The problem with a patents-only approach to software protection would be that there's a whole lot of innovation that takes real creativity (and hard work and sometimes substantial investment) that I believe should be protected regardless of whether it meets the criteria for patentability. For example, if developer A implements a certain patentable invention and developer B comes up with a faster and/or more efficient and/or more secure implementation, I think it's fair to give either one copyright in what he writes, even if B might be denied a patent because his creation is deemed anticipated by, or obvious over, what A did before him.

In my view, this also applies to API design provided that it truly represents an original work that deserves to be protected as a creative achievement. I can see value in it that deserves to be protected irrespectively of whether novelty, non-obviousness and other criteria for patentability are met.

Theoretically, it is conceivable that a patent constitutes a monopoly on functionality without which it's impossible to interpret or compile a command of a programming language, or to reasonably implement a method of an API class, but that's only an indirect kind of protection. It would not protect structure, selection, sequence, organization, etc., and it would be both unfair and insufficient. Unfair because API design as a creative process deserves protection; insufficient because even a very patent-savvy developer will be lucky to be granted a patent that is essential to the implementation of even one such feature, and an infringer could then throw out that feature and still make use of lots of the rest, which would not be protected by patent law.

I know some people out there claim the copyrightability of APIs, for which Oracle v. Google may become a landmark case, would be undesirable. But I think they generally overstate the implications of this. In particular, Oracle isn't arguing that all APIs should be copyrightable, nor does it advocate the copyrightability of abstract ideas.

Balance is key. In all the decisions on copyrightability that I read, there's one (frequently-cited) sentence in a Ninth Circuit ruling (Rosenthal v. Kalpakian) that I consider very appropriate and pragmatic:

"The guiding consideration in drawing the line [between a non-copyrightable "idea" and copyrightable "expression"] is the preservation of the balance between competition and protection reflected in the patent and copyright laws."

This one was not only quoted but further clarified in CDN Inc. v. Kapes.

I'll be frank: I think there are far more reasons to consider those Java APIs -- also at the level of selection, structure and organization -- deserving of copyright protection than there are reasons to deny it. The question of whether such theories as the "merger doctrine" (inseparability of idea and expression), which is one of the theories based on which Google disputes the copyrightability of the asserted APIs, render a certain way to use such material legal should be a subsequent issue in my opinion, and has to be decided in light of policy considerations concerning competition and innovation as opposed to a bright-line approach.

This is reasonably consistent with what Google's Senior Copyright Counsel, William F. Patry, wrote in this 2003 paper on copyright and software:

"The validity of merger as a doctrine separate from the idea-xpression dichotomy is doubtful, however. If an idea and its alleged expression are truly inseparable, there can be no selectivity sufficient to satisfy the originality requirement. If, on the other hand, an author has choices regarding the content or design of a work and imbues the work with more than a minimal amount of expression, the court should not focus on copyrightability, but instead on the scope of protection. Thus, the better approach [...] is that merger is relevant at the infringement stage as a limiting principle on the scope rather than on the existence of protection. When used at the infringement stage, merger can be applied sensitively to the facts before the court, permitting the court to ensure that the proper balance between protection and competition is preserved. [...] Thus used, merger and other doctrines such as the fair use privilege, can be important, almost surgical tools to strike the appropriate balance in individual cases."

Note the word "surgical" and the last two words, "individual cases": this is about very case-specific considerations. And I'll voice my opinion again on how I view these case-specific issues in Oracle v. Google: I just can't see how it would be good public policy to let Google get away with reckless infringement, hijacking an entire platform and fragmenting it, and ultimately getting, as a Twitter user told me yesterday, to have its Java and drink it.

Even if the approach chosen was to adjudicate Google's defenses such as the merger doctrine at the stage of evaluating copyrightability, one need look no further than at what is happening in this present case to see, from a public policy point of view, a need for protection against such conduct. From this angle it also becomes clear that interoperability privileges must not be applied to such an inappropriate extent that someone who doesn't even want to be truly compatible gets to benefit from them. Otherwise, the floodgates would be opened for "embrace, extend, extinguish" strategies.

Google's affirmative defenses relating to the 37 API packages

Oracle's claims related to the 37 API packages can succeed only if it overcomes all of Google's affirmative defenses. In its March 9, 2012 Opening Copyright Liability Trial Brief, Google made the following claims (literal quotes but reformatted by me):

  • "Google contends that the selection, arrangement and structure of the API elements are not copyrightable, for several reasons.

    • First, the selection, arrangement and structure are uncopyrightable ideas, processes, systems or methods of operation.

    • Second, the selection, arrangement and structure are functional requirements for compatibility, and thus not protectable.

    • Third, the selection, arrangement and structure are unprotectable scenes a faire, and/or their expression has merged into unprotected underlying ideas [the aforementioned "merger doctrine"].

  • Even if they are copyrightable, Google contends that any similarities arising from use of any protectable elements either are not substantial and are therefore noninfringing, or are a fair use.

  • [defenses relating to source files other than API packages]

  • Finally, Google contends that Oracle's copyright claims are barred by the equitable doctrines of laches, estoppel, waiver and implied license."

We can simplify this by ignoring, for the purposes of this post, the defenses relating to non-API-package source files as well as the final point about equitable defenses, because the latter will be considered only after the decision on copyrightability.

That is also true of the "fair use" defense. With respect to "not substantial and therefore noninfringing", a declaration filed by Oracle stated that "when printed out, Oracle's 37 API specifications are more than 11,000 pages long". Further above I mentioned the claim that Google copied an estimated 100,000+ lines into Android's documentation. Unless only a small part of this material is deemed copyrightable, the non-substantiality argument is not going to succeed on its own.

The "fair use" defense is also a difficult one in light of mountains of evidence of reckless infringement and the negative effects of fragmentation on Oracle as a company and the wider Java community.

We're now down to the three second-level bullet points, the non-copyrightability arguments starting with "First", "Second", and "Third". Let me reformat those non-copyrightability claims in a condensed form so we get to focus on the gist of this:

"[T]he selection, arrangement and structure of the API elements are [...]"

  1. "uncopyrightable ideas, processes, systems or methods of operation",

  2. "functional requirements for compatibility, and thus not protectable", and

  3. "unprotectable scenes a faire, and/or their expression has merged into unprotected underlying ideas."

A party is allowed to raise defenses that are mutually exclusive. In this case, Google simultaneously raises defenses of the "too abstract to be protectable" kind as well as others of the "too functional to be protectable" kind. It's obviously hard to be too abstract and too functional at the same time. But it can be correctly deduced from this that Oracle will prevail only if what it claims to be copyrightable steers clear of both extremes. There is, however, a pretty broad corridor in between those extremes that constitutes the scope of copyrightable subject matter.

Copyrightability of selection, arrangement and structure is independent from copyrightability of individual elements

When reading the above defenses, it's important to consider that all of this refers to the "selection, arrangement and structure of the API elements" as opposed to the elements themselves. Therefore, even if every single one of those elements could be struck down with one or more of those non-copyrightability arguments, the "selection, arrangement and structure" might nevertheless be copyrightable.

But it's important to understand in this context that Oracle's legal theory for the protection of the structure, selection and organization of the asserted APIs is not the one of a "compilation", defined by the law as "a work formed by the collection and assembling of preexisting materials or of data that are selected, coordinated, or arranged in such a way that the resulting work as a whole constitutes an original work of authorship". Instead, Oracle argues that "[t]he 37 APIs are original works of authorship" because "[t]he specifications, as written documentation, and the class libraries (source or object code implementations of those specifications), as computer programs, are both protected as literary works."

The difference is that a compilation comprises material that is not protected, such as mere facts (for example, a telephone directory as such, or a dictionary), while Oracle's APIs contain, for the most part if not entirely, material that can be written in any number of ways. Oracle cites the Satava v. Lowry decision:

"Our case law suggests, and we hold today, that a combination of unprotectable elements is eligible for copyright protection only if those elements are numerous enough and their selection and arrangement original enough that their combination constitutes an original work of authorship."

Oracle knows that it has a strong argument for originality because Google itself admitted it's a low hurdle -- much, much lower than most people think. On Wednesday (April 18), Judge Alsup entered an order according to which the jury will be told, among other things, that the following fact is not disputed by Google:

"The Java APIs as a whole meet the low threshold for originality required by the Constitution."

Oracle would have preferred to have this applied to the 37 asserted APIs, but that won't make much of a difference: whether one looks at those 37 APIs or hundreds of APIs, the threshold for originality is easily met. In its Feist v. Rural decision, the Supreme Court said that "[t]he constitutional requirement necessitates independent creation plus a modicum of creativity". It's hard to see how 37 Java APIs cannot meet this -- presumably every single one of those packages can.

If the threshold is only met because it's very low, Google's fair use defense is stronger than if a very significant degree of originality is determined. Again, given the size and complexity of this, and all of the creativity that goes into API design, I doubt that the finding would be that there's little originality. Google engineer Joshua Bloch and Google's copyright expert Owen Astrachan confirmed the creative achievement that API design constitutes in their earlier testimony:

That was page 62 of Oracle's opening slide deck. On page 63, a presentation Joshua Bloch gave in 2005 (which bears the Google logo) stressed that "API design is a noble and rewarding craft". When asked about it on day of the trial, he confirmed this perspective: "Yes, I certainly believe that."

There's a lot more that could be said now about copyrightability of APIs. Chances are I will talk about some more of these issues, and some of the defenses that have been raised. But the purpose of this blog post was not to provide an in-depth paper on copyrightable subject matter and how it relates to the Java APIs. I mostly just wanted to explain the basic approach of making this a case-specific issue, taking into account the creative achievement on the one hand and the way in which a party makes use of such API material on the other hand, which I believe makes more sense than any bright-line, razor-like approach.
