Database Protection And Access Issues, Recommendations
Patent and Trademark Office Report on Recommendations
from the April 1998 Conference on Database Protection and
Access Issues
US Patent and Trademark Office
Department of Commerce
Washington, D.C.
July 1998
I. EXECUTIVE SUMMARY
In the 1991
Feist Publications v. Rural Telephone Service Co.
case, the U.S. Supreme Court ended the "sweat of the brow"
doctrine that had conferred some degree of copyright protection
on non-creative compilations of information. The
Feist
decision has produced subsequent case law in which databases
resulting from a substantial investment have been taken by
others to produce competing products; lack of copyright leaves
the database maker with no recourse against third party
predators -- except what state misappropriation law might offer
-- and only limited recourse, based on contract law, against
contracting parties.
Beginning in the late 1980's, Member States of the European
Union (EU) sought to harmonize the copyright laws of their
various legal systems. That effort resulted in an awareness
that some EU States -- Ireland, the U.K., the Netherlands, and
the Nordic countries -- provided greater protection to
non-creative compilations than other Member States. Eventually,
efforts to harmonize the EU copyright laws for the TRIPS
Agreement left the EU without any intellectual property
protection for non-creative compilations of data. After
considering varied proposals, in March 1996 the EU adopted a
Database Directive requiring all Member States to provide a
sui generis
form of intellectual property protection for databases.
The EU Database Directive became the basis for the EU's
proposal for a draft international treaty that was submitted to
the World Intellectual Property Organization (WIPO). In
anticipation of a WIPO Diplomatic Conference in December 1996,
and because of substantial concerns about provisions of the EU
proposal, the U.S. submitted its own proposal to WIPO.
Ultimately, the 1996 Diplomatic Conference focused on copyright
and neighboring rights; database protection was left
unaddressed. Nonetheless, WIPO established a timetable to
resume discussions on database protection in 1998.
In the United States, a proposal for
sui generis
protection was introduced in the House in 1996 by
then-Congressman Carlos Moorhead. That proposal generated
considerable opposition from the scientific, education, and
library communities. In the 105th Congress, Howard Coble,
Chairman of the House Subcommittee on Courts and Intellectual
Property, introduced H.R. 2652, which would provide a database
maker with protection against misappropriation of any
substantial part of its database, where such misappropriation
would harm the actual or potential market for the database. In
hearings in late 1997 and early 1998, scientists and educators
-- as well as telecommunications companies -- expressed
significant concerns over many aspects of the bill.
Nonetheless, on May 19, 1998, H.R. 2652 passed the House by voice
vote from the suspension calendar. Recently, a corresponding
bill was introduced in the Senate (S. 2291); as of the time of
this memorandum, S. 2291 was co-sponsored by Senators Grams,
Cochran, Faircloth, and Helms.
In an effort to help policy makers understand the concerns of
all parties, the Patent and Trademark Office ("PTO") held a
one-day conference on database protection and access issues ("PTO
Database Conference") on April 28, 1998. At that time, H.R.
2652 had only been approved by the House Judiciary Committee.
The conference was held at the Brookings Institution and
attracted over 175 attendees representing academia, the
business community, libraries, government, non-profits, and the
scientific community.
The conference did not -- and was not expected to -- produce
consensus on any issues, including the most fundamental issue
of whether or not database protection is needed. We believe,
however, that the proceedings helped (a) initiate dialog, then
and subsequently, between various parties, and (b)
identify areas where disparate interests may be accommodated
through further legislative developments. After reviewing the
conference proceedings, we believe that the Administration
should be willing to support database protection legislation
that meets five widely-supported principles:
1. A change in the law to protect commercial database
developers from
Warren Publishing
-like situations is desirable.
2. Consistent with Administration policies, databases generated
with Government funding should not be placed,
de jure
or
de facto
, under exclusive control of private parties.
3. Any database protection regime must carefully define and
describe databases and prohibited acts, so as to avoid
unintended consequences, including undue disruption of existing
business relationships and non-profit research.
4. Any database protection regime should be subject to
exceptions largely co-extensive with "fair use" principles of
copyright law.
5. Consistent with U.S. trade policy, it is desirable to secure
for U.S. companies the benefit of the EU Database Directive and
laws in other countries protecting database products.
This document provides a brief summary of the April 28
conference; our analysis of these principles both generally and
as they relate to H.R. 2652; and a few areas where we believe
further work may be needed to produce an acceptable legal
regime for databases.
II. THE PTO DATABASE ISSUES CONFERENCE
The PTO Database Conference was held on April 28, 1998 at
the Brookings Institution in Washington. Preparations for the
conference began in late January. The format of the conference
was a series of plenary sessions with mid-day "breakout"
sessions devoted to more specialized topics.
In planning conference topics and possible panelists, we
reviewed all testimony given before the House Subcommittee on
Courts and Intellectual Property in its hearings in October 1997 and
February 1998. We also met with representatives of the
Information Industry Association (IIA), the Information
Technology Association of America (ITAA), and the National
Research Council (NRC). We had on-going discussions with
representatives of these organizations as well as conversations
with the American Library Association (ALA), the Association of
American Publishers (AAP), the Association of Research
Libraries (ARL), and the Business Software Alliance (BSA).
The 23 panelists and moderators consisted of 18 Americans and 5
Europeans. These included seven legal or economic academics
(divided roughly equally between supporters and critics of
database protection proposals); six scientists and
representatives of scientific organizations; two library
representatives; and five business groups. The conference
panelists/moderators also included representatives of the State
Department, the Copyright Office, and the European Commission.
Approximately 175 people attended one or more sessions of the
conference. In addition to many people from trade associations
and Washington law firms, participants included:
from the scientific community and related government agencies,
representatives of the Centers for Disease Control, Chemical
Abstracts Service, the House Science Committee, the National
Science Foundation, the National Research Council, OSTP, the
State Department, and the U.S. Geological Survey; from the
private sector, representatives of ABC Cable and News Media,
BellSouth, Dun & Bradstreet, Eli Lilly, Fujitsu, IBM,
Intermetrics, Lexis-Nexis, McGraw-Hill, Reuters, MCI and
several smaller businesses, including information firms for
realtors and insurance companies; from non-profit
organizations, representatives of the Modern Language
Association, the Church of Jesus Christ of Latter Day Saints,
and National Public Radio.
The conference had four plenary sessions with seven mid-day
"breakout" discussion groups. Plenary session topics reflected
neutral statements of general issues that have arisen
repeatedly in Congressional hearings and scholarly writings on
database protection and access issues; several "breakout"
sessions were dedicated to thorny issues identified by specific
groups. The first plenary panel discussed whether there is need
for additional database protection; the second plenary was
devoted to the concerns of the scientific and research
communities; and the third plenary session explored the "fair
use" needs of libraries, non-profit entities, and database
producers who rely on government data. The fourth plenary
session consisted of reports to the Assistant Secretary on
discussions in the mid-day "breakout" sessions. Attachment A is
a program of plenary topics and breakout sessions from the
conference. Individuals interested in obtaining copies of
videotapes of the first and fourth plenary sessions (the only
sessions recorded) may do so for the cost of reproduction by
contacting Justin Hughes, Office of Legislative and
International Affairs, Patent and Trademark Office, Department
of Commerce, Washington, D.C. 20231, justin.hughes@uspto.gov.
III. EMERGING PRINCIPLES AND ISSUES
A. BASIC PRINCIPLES
In light of the conference proceedings and after reviewing
Congressional testimony, scholarly writings, and reports on
these issues, we believe that a set of principles emerges that
should shape the Administration's position on database
protection. These principles could be embodied in any number of
approaches, including H.R. 2652 with appropriate modifications
to reflect these goals. After listing the principles, we
discuss each principle and analyze how H.R. 2652 fulfills or
fails to achieve those goals.
1. A change in the law to protect commercial database
developers from
Warren Publishing
-like situations is desirable.
2. Consistent with Administration policies, databases
generated with Government funding should not be placed,
de jure
or
de facto
, under exclusive control of private parties.
3. Any database protection regime must carefully define
and describe databases and prohibited acts, so as to avoid
unintended consequences, including undue disruption of
existing business relationships and non-profit research.
4. Any database protection regime should be subject to
exceptions largely co-extensive with "fair use" principles of
copyright law.
5. Consistent with U.S. trade policy, it is desirable to
secure for U.S. companies the benefit of the EU Database
Directive and laws in other countries protecting database
products.
The discussion which follows elaborates on each of these
principles.
1. A change in the law to protect commercial database
developers from
Warren Publishing
-like situations is desirable.
There was considerable, albeit not complete, consensus at
the conference that some type of legislative "fix" would be
reasonable to provide commercial database producers with
protection for their products. This has been stated by leading
scientists and by legal scholars identified as critics of
database protection. A handful of people remain who insist that
case law will develop and/or that a combination of technology
and contract, copyright, and trade secrecy law offer database
producers sufficient incentive. But on the whole, there seems
to be agreement that situations like
Warren Publishing
and the
ProCD
case are likely to arise in digital commerce and that some
protection in such situations is desirable. A recent report by
the Japan Institute of Intellectual Property reaches the same
conclusion:
"In today's society the database industry has proved to be
of vital support for governmental, educational, and commercial
purposes. Since databases are plainly open to full-scale
misappropriation a lack of adequate legal protection obviously
could have a range of damaging effects on the everyday life of
society. . . . Once disclosed to the public, information can be
used generally speaking and leaving aside contractual or
tortious liability, freely without the database provider's
permission or an obligation to reimburse him for his
investment. This holds equally true for the off-line as well as
the on-line market."
While the NRC principally advocates scientists' concerns and
believes science has a specific, public-minded paradigm for
data-gathering, in their seminal study of database issues,
Bits of Power
, they recognized the problem existing in the commercial
sector:
"In the private sector, by contrast, commercial compilers of
data have long suffered from a risk of market failure owing to
the intangible, ubiquitous, and above all, invisible nature of
information goods and the ease with which free riders may have
appropriated the fruits of the compilers' investment once the
information goods were made available to the public in print
media."
These sorts of cases are only likely to increase with
digital media. The
ProCD v. Zeidenberg
case provides an example of a fact pattern that may become
commonplace without appropriate legal safeguards. In
ProCD
, defendant Matthew Zeidenberg purchased ProCD's CD-ROM
database of 3,000 telephone directories from around the
country. He then formed a company to sell the telephone
directory information online -- for far less than the price for
the CD-ROM set. ProCD prevailed in this case at the appellate
level because the Seventh Circuit panel ruled that the
"shrink-wrap" license which limited the defendant to
non-commercial use of the CD-ROMs was enforceable. If
Zeidenberg had instead
given
the CD-ROM set to someone else, who later started the same
company, ProCD would have had no privity of contract with
that company and would have lost control of its
database. Similarly, in
Warren Publishing v. Microdos Data Inc.
, Warren Publishing's "Directory of Cable System" listed
cable television systems classified by the principal
communities they served. The directory was apparently taken and
reproduced by Microdos Data in a competitor product sold in
software format. The Eleventh Circuit, sitting en banc, ruled
that there were no copyrightable aspects to Warren Publishing's
database that had been taken by the defendant.
The database protection regime set out in H.R. 2652 would
clearly meet the goal of addressing these situations. At the
same time, this goal could probably be met with a modified "
NBA v. Motorola
" approach (as amended by suggestions of Professors Ginsburg
and Reichman) built on the elements of a misappropriation claim
being: (i) the plaintiff generates or collects information at
some expense, (ii) the defendant's use of the information
constitutes free-riding on the plaintiff's costly efforts to
generate or collect it, (iii) the defendant's use of the
information is in competition with a product or service offered
by the plaintiff or likely to be offered by the plaintiff, and
(iv) the ability of other parties to free ride on the efforts
of the plaintiff would so reduce the incentive to produce the
product or service that the existence or quality of the product
would be substantially threatened. We think
that these are largely the principles that govern H.R. 2652.
Where H.R. 2652 diverges from an
NBA v. Motorola
model, there may be good reasons.
Some participants at the conference also raised concerns about
the constitutionality of different database protection
proposals. We believe that there are two principal concerns.
The first is whether the Supreme Court's interpretation of the
Intellectual Property Clause (Article I, Section 8, Clause 8)
as set forth in
Feist
pre-empts Congressional exercise of Commerce Clause power to
legislate in this area under the doctrine of
Railway Labor Executives' Ass'n v. Gibbons
("Clause 8 pre-emption"). Given Congress's creation of discrete
intellectual property rights in areas previously treated as
related to copyright or patent (trademark, semiconductor mask
protection) and the Supreme Court's continued recognition of
"non-copyright grounds" for protection of information, we
believe that a database protection bill can be properly crafted
to avoid Clause 8 pre-emption.
The second concern is what limits the First Amendment imposes
on any database protection regime. This is not a new problem;
courts have frequently dealt with the relationship between
trademark law and the First Amendment, copyright law and the
First Amendment, and trade secrecy law and the First Amendment.
All of these laws limit "speech" in which citizens may engage
but remain, nonetheless, compatible with the First Amendment.
We believe that First Amendment concerns can be addressed as
long as any database protection regime (a) permits unhampered
independent collection of information, (b) permits use of data
for criticism, news reporting, and
de minimis
personal communications, and (c) recognizes a wide berth of
"fair" uses that do not substantially affect the commercial
activities of the database owner. We understand that the
Department of Justice's Office of Legal Counsel is in the
process of preparing a preliminary analysis of
constitutionality issues concerning H.R. 2652; we look forward
to reviewing this preliminary analysis.
2. Consistent with Administration policies, databases
generated with Government funding should not be placed,
de jure
or
de facto
, under exclusive control of private parties.
There seems to be general agreement that compilations of data generated with U.S. Government funding should not be subject to any protection regime. There are several reasons for this. First, if U.S. Government-funded databases were subject to some type of protection regime, taxpayers might "pay twice" for access to data. Second, the principal argument for a protection regime is that, absent such protection, private parties will lack adequate incentives for database production. But government funding provides the incentive in the case of publicly-financed compilations, such as weather information, census data, and medical studies funded by NIH grants. As the Office of Management and Budget has stated:
"Government information is a valuable national resource. It
provides the public with knowledge of the government, society,
and economy -- past, present, and future. It is a means to
ensure the accountability of government, to manage the
government's operations, to maintain the healthy performance of
the economy, and is itself a commodity in the marketplace."
For many government agencies, the responsibility to make
government-generated information widely available is a
statutory obligation. For example, the Agriculture Department
works under a wide directive to "diffuse among people of the
United States, useful information on subjects connected with
agriculture . . ." (7 U.S.C. section 2201), while statutes such
as the Freedom of Information Act and the Government in the
Sunshine Act "establish a broad and general obligation on the
part of Federal agencies to make government information
available to the public and to avoid erecting barriers that
impede public access."
a. A Wide Definition of Government Data
While there is wide agreement on this general proposition,
some questions have been raised whether data generated by the
government (for example, from government-owned satellites) is
distinct from data generated by non-government entities
funded
by the government (for example, private researchers working
with NIH grants). We believe that
even if it were desirable to draw a distinction of this
sort
, no statutory language could adequately capture this
distinction, particularly in a time when efforts to "reinvent"
government may lead to private parties gathering datasets under
government contracts that might have been gathered previously
by government employees. For example, many private contractors
participate in gathering data for the decennial Census; the
individuals who work for these private entities are sworn in as
"special census employees" only for purposes of statutory
confidentiality requirements and are not federal employees
under Title 5 of the U.S. Code.
H.R. 2652 presently addresses this issue with the following
broad section 1204(a) exclusion:
"Protection under this chapter shall not extend to
collections of information gathered, organized, or maintained
by or for a government entity, whether Federal, State, or
local, including any employee or agent of such entity, or any
person exclusively licensed by such entity, within the scope of
the employment, agency, or license. Nothing in this subsection
shall preclude protection under this chapter for information
gathered, organized, or maintained by such an agent or licensee
that is not within the scope of such agency or license, or by a
Federal or State educational institution in the course of
engaging in education or scholarship."
We believe that this provision serves the general policy
goal of making all forms of government information available to
the public, but we believe that the language can be improved.
In response to concerns raised at the "publicly-funded data"
breakout session about the different government contractual
arrangements with laboratories and private companies, we
suggest that the drafters of H.R. 2652 should examine existing
definitions of "government information" for descriptions that
capture a fuller range of government-sponsored data collection.
For example, OMB Circular A-130 states that "the definition of
'government information' includes information created,
collected, processed, disseminated, or disposed of both by and
for the Federal Government."
At a minimum, we are concerned that the present language does
not adequately cover situations in which the government
contracts for information gathering. It was pointed out at the
conference that government contracts sometimes expressly
preclude the private entity from being an "agent" or "licensee"
of the government -- thus removing their activities from the
ambit of section 1204(a) as presently written. One way to
address this would be inclusion of "contracting for the
government" language. Another possibility would be to add
statutory language to section 1204(a) providing that the exclusion
also applies to data gathering "funded by the government,"
accompanied by discussion in the legislative history making clear that
section 1204(a) applies to databases developed by a private
entity
as a necessary part
of a government-funded contract, whether or not "gather[ing],
organiz[ing], or maintain[ing]" a collection of information was
the
purpose
of the government contract. For example, if a company working
on airport safety under contract from the FAA builds a database
of airport characteristics that is required to complete its
contract with the FAA, then the company should not be able to
assert any exclusionary rights over the airport database. It
may be possible to develop standards for when a database is
necessary for a government contract from existing standards for
when government agencies must collect data. In any case, the
same rationales apply to government contracting as to data
generated by the government itself: government funding already
provides an adequate incentive and there is no reason taxpayers
should pay 'twice' for data gathering.
The distinction that needs to be drawn is between (a)
compilations of data made as a necessary element of a
Government-funded activity, and (b) compilations of data made
by private entities over and above the activity being funded.
This appears to be the intent of the section 1204(a) language
that:
"Nothing in this subsection shall preclude protection under
this chapter for information gathered, organized, or maintained
by [a government] agent or licensee that is not within the
scope of such agency or license . . ."
This appears to protect other activities of a government
licensee
and
to permit protection of value-added databases that the licensee
generates from government data. Nonetheless, we think that this
section could be clarified by express language (or discussion
in the legislative history) that
transformative
developments from government compilations of data can be
protected, i.e. that value-added activities outside the ambit
of a government contract can produce protected databases,
subject to the general principle -- drawn from copyright law --
that where government-funded data and value-added data are
commingled
and
the government-funded data predominates, then the private data
producer should take affirmative steps to distinguish the two
types of information.
b. State and Local Government Data
Another minor question in this area has been whether data
generated from funding by state or local governments should be
treated differently than data generated from funding by the
Federal Government. The above language takes the position that
data generated with funding from
any
level of government, federal, state, or local, may not take
advantage of the H.R. 2652 database protection regime. The
Committee Report notes that this "exclusion is broader than the
similar provision in section 105 of the Copyright Act" in that
it applies to state and local governments. This raises some
interesting questions.
Given the rationale that taxpayers should not "pay" for
databases twice, this does create the possibility that, for
example, a database whose creation was funded by the California
state government will be used by private citizens of Arizona --
giving the Arizonans a free-ride on the California taxpayers'
investment. Nonetheless, we agree with the Committee's approach
because of the importance of developing a strong, clear
principle that government-generated data is not subject to
exclusion.
c. University Generated Databases
Section 1204(a) is currently worded to ensure that data
gathered by state-funded colleges and universities may enjoy
2652 protection:
"Nothing in this subsection shall preclude protection under
this chapter for information gathered, organized, or maintained
by such an agent or licensee that is not within the scope of
such agency or license, or by a Federal or State educational
institution in the course of engaging in education or
scholarship."
According to the Committee Report, "educational institutions
that happen to be government owned should not be disadvantaged
relative to private institutions when producing databases
unrelated to the provision of regulatory government functions."
This is a topic where guiding principles may conflict. What
happens with a database gathered by medical researchers at a
state university working under a federal grant from NIH? Should
this be excluded from 2652 protection on the ground that it is
government-funded research (and data for which the American
public has already paid)? Or should the database be eligible
for 2652 protection on the grounds that it comes from "a
Federal or State educational institution in the course of
engaging in education or scholarship" and the principle that
state-funded schools should not be disadvantaged relative to private
universities?
Administration policies clearly establish that the U.S.
Government has a right to disseminate data produced by any
federal grant to institutions of higher education, hospitals,
and non-profit research organizations. OMB Circular A-110
states the general framework, including the U.S. Government's
right to a "royalty-free, non-exclusive and irrevocable"
license to any copyright and, concerning compilations of
information:
(c) Unless waived by the Federal awarding agency, the Federal
Government has the right to (1) and (2): (1) Obtain, reproduce,
publish or otherwise use the data first produced under an
award. (2) Authorize others to receive, reproduce, publish, or
otherwise use such data for Federal purposes.
In keeping with this policy and our belief that
Government-funded data should not be subject to 2652
protection, we believe that databases resulting from research
directly funded by the government, whether generated by a
for-profit entity or a non-profit entity, should be ineligible
for 2652 protection. No distinction should be drawn between the
research being funded at Sloan-Kettering, Harvard, Michigan
State, or a Kaiser Permanente hospital as long as the research
is directly funded by the government. On the other hand, we
think that a professor working at a state university without
any government grant beyond her state university salary and
laboratory funds should be able to apply 2652 protection to a
database resulting from her work. This would address what might
otherwise be an inequitable situation between private
institutions like Amherst College and USC versus state
institutions like the University of Massachusetts at Amherst
and UCLA. It may be difficult to craft statutory language that
absolutely resolves this problem, but we believe this should be
thoroughly addressed in the legislative history. The issue can
and should be expressly addressed in government grants.
d. Realistic Government Action in an H.R. 2652 Environment
All parties should recognize that § 1204(a), whether as
currently worded or amended along the lines suggested, will
require diligence on the part of government contracting agents
to ensure that delivery of data (in a reasonable form) to the
public is part of the described government-funded activity.
Otherwise, licensees could argue that the form in which they
were making the data available to the public was a value-added
format and "outside" the scope of their government contract. We
think that any future legislative report should be clear on
this point: that when the government contracts with a private
firm to produce data, usually the goal is to not only produce
data, but also to make that data reasonably available to the
relevant public in at least raw form.
At the same time, we think that the discussion about database
protection and the need to keep government-generated data in
the public domain has ignored one fact: that the U.S.
Government has already undertaken some programs intended to
generate scientific data and
not
place it in the public domain. For example, the Sea-viewing
Wide Field-of-view Sensor ("SeaWiFS") is a "cost-sharing
collaboration" between NASA and Orbital Sciences Corporation
(OSC) "wherein NASA's Goddard Space Flight Center . . .
specified the data attributes and bought the research rights to
these data" while "OSC provided the spacecraft, instrument, and
launch" and retains "the operational and commercial rights to
these data." The Space Commercialization Act is a broader
example of government/private sector collaboration in which the
government partially funds research efforts conscious that the
resulting data will be commercialized. Federal agencies are
under direction to ensure that "information systems do not
unnecessarily duplicate information systems available . . .
from the private sector." For example, NOAA buys substantial
amounts of data from private entities and negotiates the terms
for data usage in such buys. How would §1204(a) relate to
these efforts?
There are two possible, alternative answers. First, it would be
credible to take the position that while the Government may
engage in collaborative programs with private entities, both
the Government and the private entities do so without the
benefit of any database protection law, i.e. the results of its
collaborative projects with private industry can be protected
by any of the means now available -- technological means of
controlling access, contract law, etc -- but not by the new
law. This would suggest that the first sentence of
§1204(a) should be written to "govern" all public/private
joint ventures. The second alternative is to say that the
second sentence of §1204(a) governs: depending on how the
government/private entity contract is crafted, certain uses of
data can be outside the government license, contract, or
agency, such that a private company like OSC can enjoy database
protection rights. Given the existence of the Space
Commercialization Act, we think that a final resolution between
these two alternatives is a broader question than H.R. 2652.
Our hope is that H.R. 2652 will be compatible with either view.
3. Any database protection regime must carefully define and
describe databases and prohibited acts, so as to avoid
unintended consequences, including undue disruption of existing
business relationships and non-profit research.
Defining a database or "compilation of information" is one of
the most daunting tasks in drafting any database protection or
access law. We believe that a database protection law should
exclude the following from the ambit of protection: (a)
audio-visual works, despite the fact that they are arguably
"compilations" of film frames; (b) narrative texts, whether
fiction or non-fiction, regardless of length, despite these
being "compilations" of words; and (c) pieces of music, whether
in sheet music or recorded performance form, despite these
being "compilations" of chords, lyrics, musical notes, etc. We
are also unsure that the present bill adequately addresses
concerns about datasets embedded in the nation's
telecommunications infrastructure.
This challenge of defining "compilations of information" is one
area where we believe there is room for improvement of H.R.
2652, either in the statutory language or in legislative
history which can clarify Congress' intent. At present, H.R.
2652 defines a compilation of data as follows:
"§ 1201. As used in this chapter:
"(1) Collection of information. -- The term 'collection of information' means information that has been collected and has been organized for the purpose of bringing discrete items of information together in one place or through one source so that users may access them.
"(2) Information. -- The term 'information' means facts,
data, works of authorship, or any other intangible material
capable of being collected and organized in a systematic way."
The accompanying legislative report explains the subsection as
follows:
"Section 1201 . . . defines 'collection of information' . . . .
The definition is intended to avoid sweeping too broadly,
particularly in the digital environment, where all types of
material when in digital form could be viewed as collections of
information. It makes clear that the statute protects what has
been traditionally thought of as a database, involving a
collection made by gathering together multiple discrete items
with the purpose of forming a body of material that consumers
can use as a resource in order to obtain the items themselves.
This is in contrast to elements of information combined and
ordered in a logical progression or other meaningful way in
order to tell a story, communicate a message, represent
something, or achieve a result. Thus, a novel would not be
considered a 'collection of information' even if it appears in
electronic form, and therefore could be described as made up of
elements of information that have been put together in some
logical way. Similarly, materials such as interface
specifications would not ordinarily be covered, although a
collection of such specifications created in order to provide
consumers access to the individual specifications could be
covered."
In terms of the general definition, we think that this
present language takes a viable approach, but that it can be
improved.
For example, the EU Directive differs from the present
definition in H.R. 2652 in requiring that the information be
"arranged in a systematic or methodical way and individually
accessible by electronic or other means." [Article 1(2)] The
problem with the EU definition is that single frames of films
and specific parts of songs are already "individually
accessible" and will become more so with increasing
digitization; we think that the true difference between a
database, on the one hand, and a film or song, on the other, is
that the elements of a database are intended to be accessed
individually. They are also intended to be accessed in sets and
subsets, as when one uses a column of information in a
spreadsheet database. This suggests a definition of a
compilation based on the intention that elements be accessed in
a particular way: a database is information collected for the
purpose of allowing users to access items of information both
individually and in sets or subsets of related items of
information. We understand that this may have been the intent of
the §1201(1) language that a collection of information is
"information that has been collected . . . for the purpose of
bringing discrete items of information together in one place or
through one source so that users may access them" but the
language could more clearly convey this intention by shifting
where the "purpose" is located and introducing the notion of
accessing data individually or in sets, i.e. a collection is
"information that has been collected . . . in one place or
through one source for the purpose of allowing users to access
items of information both individually and in sets or subsets
of related items of information."
We believe, however, that no abstract definition of a database
will give us a bright line border between databases and
non-database works. Therefore, we think that clear legislative
history on this question is especially important. For example,
where the current legislative report gives the example that "a
novel would not be considered a 'collection of information'
even if it appears in electronic form . . . " we think that the
legislative history should enumerate several examples of works
with a "logical" or "linear" progression (or a representational
nature) that are not intended to be protected as databases:
audio-visual works, video games, computer software code,
fictional narrative texts, non-fictional narrative texts, and
photographs. We think that the single example of a fiction
novel in the present legislative report is especially
troublesome because it does not sufficiently clarify the
important point that a non-fiction narrative text should also
fail to qualify as a database.
A second area of concern with the current definition of a
database relates to computers and the Internet. The statute
expressly states in §1204(b)(2) that "[a] collection of
information that is otherwise subject to protection under this
chapter is not disqualified from such protection solely because
it is incorporated into a computer program." Read by itself,
this strongly suggests that all databases in computer programs
are protected. Many such embedded databases are not intended
for human perception; we believe that these databases should be
protected on a "sweat of the brow" justification to avoid
situations in the future in which competitors steal significant
unprotected value-added from software makers. This appears to
be something the House Subcommittee did not fully consider.
(There was no testimony before the Subcommittee on this subject
during its two days of hearings.)
While we believe that protection should be afforded to datasets
built into software and made through substantial investments,
regardless of whether they are "accessed" by humans or not,
there seems to be some equivocation in the bill and its
legislative report. First, the definition of a "collection of
information" in §1201(1) speaks of information arranged
"so that users may access them." On the one hand, this
ambiguous term coupled with the software inclusion provision of
§1204(b)(2) would suggest that non-human "users" might
qualify. On the other hand, the legislative report states:
". . . material such as interface specifications would not
ordinarily be covered, although a collection of such
specifications created in order to provide consumers access to
the individual specifications could be covered." [discussion of
§1201]
The use of "consumers" in this phrase suggests a human-use
standard is intended, but this is not clear. We agree that the
"interface specification" problem should be resolved as the
legislative report states, but we also believe that the
software-embedded database problem apparent in the Gates Rubber
opinion should be resolved favorably for the parties investing
in these databases; this would, at a minimum, suggest different
language in the legislative history.
4. Any database protection regime should be subject to
exceptions largely co-extensive with "fair use" principles of
copyright law.
There seems to be general agreement that any database
protection regime should be subject to exceptions with
approximately the same scope as copyright "fair use." Some
critics would call for exceptions with at least the same scope
as "fair use." The most significant detractors from this view
are those who argue that such discussions of fair use
demonstrate that any database protection regime is actually a
copyright law -- lurking under a different label and forbidden
by the Supreme Court's ruling in Feist.
A. The 1203(d) Exception
H.R. 2652 does not provide exceptions from liability
parallel to those in the copyright law. Some take the position
that the bill's exceptions are not as broad as copyright fair
use; some argue that the bill gives broader exceptions. We
think this issue merits further attention. The main exception
from liability provided by the bill is § 1203(d) which
provides as follows:
"(d) Nonprofit Educational, Scientific, or Research Uses. --
Nothing in the chapter shall restrict any person from
extracting or using information for nonprofit educational,
scientific, or research purposes in a manner that does not harm
the actual or potential market for the product or service
referred to in section 1202."
We agree that this language does not solve the problem of
databases actually developed for scientists or researchers. At
least one representative of the scientific community at the
April 28 conference has further criticized this proposal as
"illusory"; we understand this criticism, i.e. that §
1203(d) really adds nothing to § 1202. But we believe that
the § 1203(d) language recognizes a wide range of
exceptions. For example, § 1203(d) would permit the
following research uses:
+ A statistician uses lists from the AMA's directory of
physicians and the Martindale-Hubbell directory of attorneys to
do a statistical analysis of the distribution of recently
graduated medical specialists correlated to different legal
specialties, particularly personal injury lawyers, among major
metropolitan areas;
+ A sociologist reproduces some of Warren Publishing's list of
cable operators in a book on the effects of mass media in
America;
+ A statistician and an economist reprint sections of Phillips
Business Information's Electronic Commerce Directory and
Canadian Electronic Commerce Directory in their comprehensive
study of e-commerce developments in NAFTA countries;
+ A biologist specializing in mammalian metabolism integrates
drug testing data from a study done and publicized by a
pharmaceutical company (to promote the efficacy of its drug) in
her scholarly analysis of mammal reactions to certain chemical
compounds;
+ A medical researcher uses grocery shopping data generated
from checkout scanning equipment in supermarkets (which is
marketed back to supermarkets and to food companies) to study
the possible effects of consumption patterns on cancer rates.
One concern is that businesses will try to define their
"actual" and "potential" market broadly to include these
research uses, either in litigation claims or (for the
far-sighted party) in their business plan for any new database.
We think this is a possibility, but not a great danger. As with
any legislation, some private parties will try to manipulate
their behavior to gain undue advantage from statutory language
and courts must curb such activities. We think that this
concern about harm to a "potential" market for a database can
be addressed through some improvement of § 1203(d)
discussed below.
Given the amount of discussion at the conference on fair use,
it is worthwhile to compare directly how § 1203(d) and other
exculpatory provisions of H.R. 2652 would work relative to
copyright's principal fair use provision, 17 U.S.C. § 107.
Section 107 states that "fair use" is the
use of copies "for purposes such as criticism, comment, news
reporting, teaching (including multiple copies for classroom
use), scholarship, or research . . . ." But not all uses in
these categories are fair uses; instead a court must consider
four factors:
"(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
"(2) the nature of the copyrighted work;
"(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
"(4) the effect of the use upon the potential market for or
value of the copyrighted work."
Initially, it should be noted that § 1203(d) of H.R.
2652 offers a stronger exception than 17 U.S.C. §107
because §1203(d) is absolute -- if a party falls into its
description, the exception applies. In contrast, 17 U.S.C.
§107 requires a court to weigh the four factors, so a use
that falls in the "teaching" or "research" description may
still infringe.
Of the four fair use factors, §1203(d) already addresses
"(1)" by stating that the present exclusion applies to
"nonprofit educational, scientific, or research purposes".
There may be some question whether the word "nonprofit"
modifies only "educational" or modifies all three adjectives
"educational, scientific, or research." The legislative report
sheds limited light on this point, particularly because it uses
the same grammatical construction twice. The report does say
that §1203(d) is intended to "alleviate concerns expressed
by members of the research, scientific, and university
communities"; since none of those concerns have been expressed
by for-profit researchers, we take §1203(d) to refer to
nonprofit activities, whether educational, scientific, or
research. We think that for-profit research, as in research
laboratories at companies like Amgen, IBM, or Ford, would fall
outside the ambit of §1203(d).
Opponents of the legislation have criticized H.R. 2652 for not
including the second and third of the four § 107 fair use
factors:
"(2) the nature of the copyrighted work;
"(3) the amount and substantiality of the portion used in
relation to the copyrighted work as a whole."
Supporters of the bill have responded that these criteria
are already built into the H.R. 2652 framework and do not need
to be restated in the exception(s).
As concerns factor (2) of §107 -- which calls for
consideration of the "nature of the copyrighted work" --
proponents of H.R. 2652 argue that since it only covers
databases, a court enforcing H.R. 2652 would not need to engage
in the same type of "nature of the . . . work" analysis. We
generally agree that a court enforcing a law modeled on H.R.
2652 would not face the wide range of works that are
copyrightable -- from feature films to sculptures to
non-fiction scholarly articles. A court enforcing a law modeled
on H.R. 2652 would not face what is arguably the principal
distinction in § 107(2) analysis: whether the work is
fictional or factual. At the same time, we recognize that there
will still be significant variations in the kinds of databases
that would be subject to this law. We address this point in the
"balancing" discussion below. Similarly, the bill's proponents
have pointed out that the third fair use factor of §107
calls for analysis of the "substantiality" of the infringement
and that H.R. 2652 largely achieves the same effect by creating
liability only if there has been a "substantial" taking of the
database. Many critics of H.R. 2652 at the conference seemed to
prefer a balancing test that would allow courts to consider
degrees of substantiality in the taking.
The fourth fair use factor under copyright law is the effect of
the use on the "market" for the protected work. Supporters of
H.R. 2652 say that it provides this test because §1203(d)
shields researchers from liability unless there is "harm" to
"the actual or potential market for the product or service
referred to in section 1202." On this point, we believe that
the § 1203(d) exception could be improved by inserting
"substantially" or a similar standard before "harm" so that any
person may extract or use "information for nonprofit
educational, scientific, or research purposes in a manner that
does not substantially harm the actual or potential market for
the product or service
referred to in section 1202." Such a "substantial harm"
standard is familiar to courts; would focus judges on the
primary market for a database; and, in the face of a database
owner contending that "science" or "research" were his intended
markets, would tend to exculpate researchers who used the
database. Another possibility would be an exemption "for
nonprofit educational, scientific, or research purposes in a
manner that does not unreasonably harm the actual or potential
market for the product or service
referred to in section 1202." This test follows the spirit of
Article 9(2) of the Berne Convention that exemptions from
copyright protection are permitted which do not "unreasonably
prejudice the legitimate interests of the author." Yet another
option that might be considered would be to expose nonprofit
researchers and scientists to liability only for harm to an
actual market and eliminate any potential liability for effects
on "potential" markets. There is a reasonable basis for drawing
this distinction: commercial actors are more likely to know the
potential market of a competitor through market research and
business planning than nonprofit actors who are not market
participants.
H.R. 2652 also provides exceptions for "extracting information
. . . for the sole purpose of verifying the accuracy of
information independently gathered, organized or maintained by
that person" [§ 1203(c)] and for "extracting or using
information for the sole purpose of news reporting, including
news gathering, dissemination, and comment," unless the
information has been gathered by a news agency for a like
purpose -- an exception to the exception intended to capture
the INS case [§ 1203(e)]. The bill also includes an express
protection for independent gathering of the same data [§
1203(b)]. In general, we believe that these are all reasonable,
appropriate, and in the spirit of fair use. To parallel First
Amendment concerns manifest in copyright law, H.R. 2652 could
include an express exception for "criticism" similar to the
existing § 1203(e).
Finally, even if the § 1203(d) exception largely captures
the substantive content of § 107 of the Copyright law, one
of the concerns repeatedly expressed at the April conference
was that H.R. 2652 does not include a "balancing" mechanism to
give judges more leeway in determining what uses of
compilations of data should be shielded from liability. We
would not be opposed to the addition of a balancing mechanism
in H.R. 2652 that indicated to judges that they could exercise
more leeway in considering the "nature" of the work and the
"amount" of the copying above "substantiality" in determining
what kind of liability a non-profit "educational, scientific,
or research" entity should face; the possibility of such a
revision, however, turns on a clear understanding of how the
remedies provisions of H.R. 2652 function.
B. Remedies-Delineated Exceptions
A second area where the fair use-like elements of H.R. 2652
might be clarified or strengthened for the benefit of nonprofit
researchers and educators is the bill's remedies provisions.
According to proponents of H.R. 2652, it has already been
amended to effectively shield scientists, libraries, and
researchers from monetary damages, i.e. such institutions and
individuals would only be subject to injunctive relief. This is
best seen by a review of the remedies provisions.
i. Civil Remedies
Civil remedies are provided in § 1206 of the bill.
Subsection 6(a) provides for federal court jurisdiction
"without regard to the amount in controversy"; subsections
6(b), (c), and (d) empower the court to award, respectively,
temporary and permanent injunctions; impoundment and
"modification or destruction of copies"; and defendant's
profits, treble damages, and attorney's fees.
Subsection 6(d) also provides that where a court determines
that a database producer brought an action "in bad faith
against a nonprofit educational, scientific, or research
institution, library, or archives, or an employee or agent of
such an entity, acting within the scope of his or her
employment" the court shall award costs and attorney fees
against the database producer. This is clearly intended as a
disincentive to frivolous lawsuits against nonprofit entities.
More importantly, subsection 6(e) provides:
"Reduction or Remission of Monetary Relief for Nonprofit
Educational, Scientific, or Research Institutions. -- The court
shall reduce or remit entirely monetary relief under subsection
(d) in any case in which the defendant believed and had
reasonable grounds for believing that his or her conduct was
permissible under this chapter, if the defendant was an
employee or agent of a nonprofit educational, scientific, or
research institution, library, or archives acting within the
scope of his or her employment."
We believe that it would be desirable to consider ways this
exception from monetary liability could be clarified or
strengthened. In particular, we think that the following
changes should be considered:
(a) The existing language in subsection 6(d) concerning costs
and attorney's fees, including the provision for mandatory
costs and fees against a plaintiff who sued a nonprofit entity
in bad faith, could be moved to a new subsection 6(f);
(b) The remaining subsection 6(d) could be amended to make it
clear, immediately, that the monetary damages described therein
are "subject to the limitation described in subsection 6(e)";
and/or
(c) Subsection 6(e) could be amended to clarify that the burden
of proof would fall on the plaintiff to establish that the
defendant knew or had reasonable grounds to know that its
actions were not permitted under the law; and/or
(d) Subsection 6(e) could be amended to eliminate any initial
awarding of damages. As presently written, subsection 6(e)
intimates that a court would first award monetary relief
(damages, profits, etc.) against a nonprofit defendant and then
be required to "reduce or remit entirely" that monetary relief;
and/or
(e) Subsection 6(e) could be amended to require a court to deny
any monetary relief absent a showing that the defendant knew or
had reasonable grounds to know that its actions were not
permitted under the law.
Any of these changes, singly or in combination, could make
it easier for nonprofit institutions to establish the "ground
rules" for when they might face monetary liability. To the
degree that clear ground rules can be established for
researchers so that they know they will, at worst, be subject
only to injunctive relief, we believe that this would
substantially eliminate any "chilling effect" H.R. 2652 might
have on non-profit educational and research activities.
ii. Criminal Remedies
H.R. 2652 includes criminal sanctions in § 1207 which
provide for a fine up to $250,000 and up to five years
imprisonment. Subsection 1207(a)(2) provides a very clear
exception from any criminal liability for any "employee or
agent of a nonprofit educational, scientific, or research
institution, library, or archives acting within the scope of
his or her employment." We believe that some criminal
provisions are desirable to handle LaMacchia-like situations,
i.e. in which judgment-proof individuals
might seek to disseminate protected databases without any
profit incentive. We also believe that the protection against
criminal prosecution for nonprofit entities and individuals is
adequately strong.
The Department of Justice has informally recommended that
§ 1207 be amended to distinguish between "misdemeanor" and
"felony" liability, with the latter available only for damage
to a database producer exceeding $20,000. We understand that
Justice is concerned that a statute establishing a relatively
new form of liability should not have too low a threshold for
criminal liability. We think that such a change would be
appropriate, although it will only impact commercial and
private entities and individuals -- not the nonprofit entities
and individuals already exempted from the criminal provisions
of the bill.
5. Consistent with U.S. trade policy, it is desirable to
secure for U.S. companies the benefit of the EU Database
Directive and laws in other countries protecting database
products.
There was much discussion at the April conference of the
effect of the EU Directive's "reciprocity" provision on
American database producers. Unlike in a "national treatment"
scheme, U.S. companies do not automatically enjoy the
protections afforded by the Directive's sui generis protection
scheme. Presently, a database of a U.S. company is
protected under the EU laws only if the U.S. company has a
substantial economic presence in an EU Member State. A recent
comparative study from Japan has concluded that "the existing
disparity between US and EU database protection gives European
database producers a distinct advantage" and that "[i]t may be
argued that this reciprocity requirement enables European
database producers to grow by exploiting US databases as long
as the US . . . fails to provide an equivalent level of
protection for European databases."
An American firm that does not enjoy protection under the EU
Directive faces several possible competitive disadvantages.
First and most obviously, its noncopyrightable database may be
duplicated and remarketed by others. Second, European data
sources looking for a firm to "process" and market raw data
will be more likely to enter into a contract with a European
company that can guarantee protection of the database versus an
American company that cannot. Thus, even if the American firm
could effectively protect the database with technology and
contract law, it may be at a disadvantage in obtaining
"suppliers" of data.
Could the U.S. force the EU to protect American databases in
the absence of a U.S. database protection law? The U.S. has
already cited the reciprocity provision of the Database
Directive as one reason the EU was placed on the Priority Watch
List in this year's Special 301 review process. Nonetheless,
the U.S. has limited pressure it can bring to bear on the EU.
We believe that the failure of the EU Directive to provide
national treatment probably does not violate TRIPS. Because the
Directive offers copyright protection to databases on virtually
the verbatim terms required by TRIPS (Article 10(2)), the
additional protection of the EU sui generis regime is probably
not subject to the TRIPS national treatment
requirement. This means that in order to protect all U.S.
database producers, the U.S. would have to adopt domestic
legislation that the European Commission would judge to be
comparable to the EU Directive.
A set of more abstract arguments is pitted against the general
desirability of giving American firms the benefit of the EU
Directive's reciprocity provision. First, there is the argument
that given U.S. advocacy of national treatment, we should not
condone the EU's use of reciprocity in their Database Directive
because it will embolden both the EU and other countries to use
reciprocity in other policy areas. The concern is that this
would cause a breakdown of the national treatment doctrine
under international law and "further balkanization of data
availability conditions." We agree that there will be some
superficial inconsistency between opposing the Directive's
reciprocity approach and any U.S. adoption of a database
protection regime that appears intended to meet the reciprocity
requirement. But the U.S. often responds to the acts of other
countries while disagreeing with those acts; a true
inconsistency with our stated international policy would arise
only if a U.S. database protection law required reciprocity.
The question remains whether H.R. 2652 would be sufficiently
comparable to the EU Directive. We believe that H.R. 2652
offers protection that is equivalent to the EU Directive and
would give the United States a strong position to insist with
the EU Commission that U.S. nationals enjoy the full benefits
of the EU Directive:
+ Like the EU Directive, H.R. 2652 protects investment,
qualitative or quantitative, in a database [EU art. 7(1); HR
§ 1202];
+ Like the EU Directive, H.R. 2652 prohibits unauthorized
takings of the whole or a substantial part of a database [EU
art. 7(1); HR § 1202];
+ Like the EU Directive, H.R. 2652 permits insubstantial takings
[EU art. 8(1); HR § 1203(a)], but prohibits unauthorized
repeated takings of insubstantial parts of the database [EU art.
7(5); HR § 1203(a)];
+ Like the EU Directive, H.R. 2652 applies separately from
copyright [EU art. 7(4); HR § 1205(c)];
+ The EU Directive permits exceptions for "teaching or
scientific research" [EU art. 9(b)] of the sort set out in H.R.
2652 [HR § 1203(d)];
+ Like the EU Directive, H.R. 2652 provides a fifteen-year term
of protection [EU art. 10; HR § 1208(c)];
+ Like the EU Directive, H.R. 2652 provides that it does not
alter the effect of any other intellectual property laws [EU
art. 13; HR § 1205(a)].
The principal differences between the two approaches include:
+ While the EU Directive establishes a sui generis property
right "located in the neighborhood of copyright," H.R. 2652
adopts a misappropriation approach that targets particular
acts;
+ The EU Directive appears to permit renewal of protection for
an entire database when the database is revised [EU art. 10(3)]
while H.R. 2652 permits a new term of protection only for the
new elements of the revised database [HR § 1208(c)];
+ The EU Directive arguably has a narrower definition of a
database than H.R. 2652;
+ The EU Directive and H.R. 2652 take different approaches on
the exemptions carved out of the protection regime.
We believe, on the whole, that the comparable aspects of the
two regimes far outweigh the differences. The case that H.R.
2652 provides comparable protection is strengthened by the fact
that direct comparisons are not appropriate: the Directive
provides guidance to the EU Member States for implementing
legislation. Thus, each provision of H.R. 2652 that arguably
diverges from the Directive should be compared to the parallel
provision in each of the fifteen Member States' implementing
laws. Only if all fifteen Member States adopted implementing
legislation completely different from the H.R. 2652 provision
would this be grounds for concluding that the two are not
"comparable" in that respect.
B. OTHER ISSUES
1. Databases Prepared for Scientific Markets
We believe that there remains at least one place where the
interests of database producers and scientists/educators may be
in a "zero sum" conflict: how to handle collections of
information specifically prepared and marketed to scientists
and educators. The problem is apparent in the § 1203(d)
exception that shields "extracting or using information for
nonprofit educational, scientific, or research purposes" as
long as such activity "does not harm the actual or potential
market for the product or service referred to in section 1202."
Many people have pointed out that this does not exempt from
liability extraction/use from databases marketed to the
nonprofit scientific or research communities.
This is a place where the desire to provide proper incentives
for the production of databases runs squarely into the desire
to provide as much access to information as possible to
researchers and educators. If a commercial firm creates a
database intending educators/researchers to be a substantial
part of the market for that database, then consistent
application of the incentive rationale requires that the firm
have the same protection against educators/researchers that it
would have against others in the marketplace. This is also
consistent with Congress' recognition that a number of types of
copyrighted works -- such as informational newsletters targeted
to particular audiences, textbooks, testing materials, and
other materials prepared for the school market -- may not enjoy as
wide a range of fair use as other types of materials.
2. "Sole-Source" Database Issues
Both prior to and during the conference, the debate over
database protection has frequently turned to the issue of "sole
source" databases. Critics of database protection proposals
have often advocated that databases which are the only source
for certain types of information should be treated differently
from other databases. The argument is that otherwise, any
database protection scheme would create a monopoly over
access to the facts in these sole-source databases. A
frequently heard proposal is that such sole source databases
should be subject to some type of mandatory licensing system.
There is an initial problem in defining what is meant by a
"sole source" database. Is it an
absolute
sole source for the data? Or is it a
practical
sole source for the data? We believe that there is a tremendous
difference between the two and that critics of database
protection frequently use the former extreme cases to advocate
mandatory licensing or similar restrictions on a broader range
of compilations.
Examples of an
absolute
sole source database would be (a) measurements of
solar flares during a specific period that were done at only
one telescope, (b) temperature and air content measurements
made inside a cave by the initial spelunkers who discovered it
and opened it to the surface, and (c) historic climatological
measurements for a specific location that were made by only
one party. In fact, scientific measurements are among the most
likely candidates to be absolutely unique datasets. There are
also many unique sources of historic data; e.g., the Mormon
Church's genealogical records might qualify.
If it is correct that these are the vast majority of true
sole-source databases, then access to information in
sole-source databases may not be a significant issue in any
database protection regime that (a) does not apply to
government-funded data and (b) has a reasonably defined
sunset on database protection rights. Critics of database
protection have, however, broadened their view of "sole-source"
databases to include those where, while the raw information
still exists in the world and could be collected independently,
the information has been collected and commercialized by only
one party. The argument is that the information is, for
practical purposes, under the control of a single entity and
because there is no competition the database owner will extract
monopolist rents from users.
The problem with this argument is that it cuts too wide. There
will inevitably be many small markets that can only be viably
served by one firm; we should expect that the number of such
niche markets will only increase with time. Instituting a
mandatory licensing system would, in effect, penalize those who
are "first to market" in serving these niche demands. It is
undesirable to create an IP regime that dissuades firms from
entering such small markets. Our country takes, for example,
the opposite approach with the "orphan drug law" -- which is
intended to give firms an incentive to fill and stay in niche
markets for which R&D costs cannot be easily recovered.
Similarly, in the copyright field, there has been recognition
that fair use should be drawn more narrowly when the producer
of the work is supplying a small market.
H.R. 2652 offers a limited response to possible sole source
monopolist pricing by expressly providing in section 1205(d)
that nothing in the statute affects "Federal and State
antitrust laws, including those regarding single suppliers of
products and services." This raises a minor concern: under
patent and copyright law, courts have developed "misuse"
doctrines independent of antitrust law. Does the express
mention of antitrust law in H.R. 2652 preclude a "database
protection misuse" doctrine? We think the answer is unsettled,
albeit probably 'no.' To clarify this possible ambiguity, we
suggest that § 1205(d) be written in such a way as to ensure
that courts remain free to develop any equitable doctrines
that would be appropriate in this area. We think that
this would be the easiest way to preserve unambiguously the
possible use of doctrines like unclean hands or "misuse"
against database producers.
If such language were not adopted in the act, we would
recommend that the legislative history make clear that express
consideration of the antitrust laws in the statute does not
prevent the courts from denying relief to a database producer
on equitable grounds or from developing a
"database protection misuse" doctrine.
3. Distinguishing Protected from Unprotected Material: the
issues of "perpetual" protection and value-added compilations
of government-generated data
One of the places where a neutral observer might wonder if the
sides are speaking about the same issue is the question of the
duration
of protection. Critics of database protection frequently claim
that a regime of "perpetual" protection would be created or
that proposals call for protection greater than copyright
protection -- yet the current legislative proposal calls for a
15-year duration (and copyright endures for the life of the
author plus 50 years). For reasons we will explore below, this
problem has certain contours in common with the issue of
privately-held, sole source databases from government-generated
data.
The critics' concern about "perpetual protection" is rooted in
the need to provide some type of protection for
revisions
of databases. If legislation were passed that provided
protection to new databases, but did not provide protection to
revisions of databases, this would skew investment. There would
be a disincentive to revise proven, useful databases in favor
of creating new databases. Reassembling (largely) the same
information in a new database would be inefficient not only for
data gatherers, but for data users who -- in order to use the
most current data -- would have to accustom themselves to the
format of the new database. The drafters of H.R. 2652 believe
they resolve this problem with the general definition of what
is protected and the 15 year statute of limitations:
"[N]o action can be maintained more than fifteen years after
the investment of resources that qualified that portion of the
collection of information that is extracted or used. This
language means that new investments in an existing collection,
if they are substantial enough to be worthy of protection, will
themselves be able to be protected, ensuring that producers
have the incentive to make such investment in expanding and
refreshing their collections. At the same time, however,
protection cannot be perpetual; the substantial investment that
is protected under the Act cannot be protected for more than
fifteen years. By focusing on that investment that made the
particular portion of the collection that has been extracted or
used eligible for protection, the provision avoids providing
on-going protection to the entire collection every time there
is an additional substantial investment in its scope or
maintenance." (Legislative Report)
We believe that this does not wholly address the concerns of
those who believe that the bill could create "perpetual
protection." While the bill provides no
de jure
perpetual protection, many users believe that the digital
environment might be manipulated in some situations to produce
de facto
perpetual protection.
This potential problem is limited to a discrete set of
databases. Some databases are revised extensively and
constantly; for these databases, the useful life of the database is
much shorter than 10 or 15 years. Stock exchange price listings
are the most extreme example, but other lists -- realtors' sale
listings and used car valuations -- also fall in this category.
Other databases will be revised rarely, if ever, once a
definitive version is completed, e.g., a database of Union
warships in the Civil War or the passengers on the
Mayflower. The databases for which the "perpetual protection" problem
arises are ones that have value over many years and require
substantial, but not total, revision. An example would be a
historical database of the batting statistics of all baseball
players in the major leagues or a database of medical
compounds. Our understanding of the "perpetual protection"
problem with these databases is as follows. In the classic case
of a copyrighted book, the text loses protection at the end of
its term, although new, revised versions of the text may enjoy
fresh periods of protection. This means that one can find
unprotected texts of
Antigone
or
Pride and Prejudice
in libraries all over the country. At the same time, new
versions of these books can be under some copyright protection
(including new introductions, translations, "notes," artwork,
etc.). It is possible to compare the two versions -- old,
unprotected and new, protected -- side-by-side. In the digital,
on-line environment, content producers may choose not to
alienate copies of their works; instead access to a database
may be licensed to users. The advantage is that the database
user can receive the most current version of the compilation.
The disadvantage is that the user may lack access to any old
version of the database against which to compare old and new
entries.
Imagine that in 2000, a database producer makes a database; we
will designate the first twelve entries alphabetically:
A
B
C
D
E
F
G
H
I
J
K
L
In 2003, it "expands and refreshes" the database, so that the first fifteen entries are as follows:
A
B
BB
C
D
E
F
FF
G
H
I
J
K
KK
L
In theory, under H.R. 2652 in the year 2016, all of the
entries except BB, FF, and KK lose protection -- and can be
copied in their entirety. The problem is that if the database
is provided via on-line services, there may be no means for the
user to know which entries are unprotected because they were
original entries and which entries are protected because they
are the result of maintenance investment within the past 15
years.
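The arithmetic of this example can be sketched in a few lines of code. This is only a rough illustration, assuming that the year of each investment is known and that the fifteen year limit is counted in whole calendar years; the names PROTECTION_YEARS, is_protected, and protected_entries are hypothetical and do not come from H.R. 2652.

```python
from typing import Dict, List

# Assumption: protection is measured in whole calendar years from the
# year of the investment that produced each entry.
PROTECTION_YEARS = 15


def is_protected(investment_year: int, current_year: int) -> bool:
    """True while no more than fifteen years have passed since the investment."""
    return current_year - investment_year <= PROTECTION_YEARS


def protected_entries(entries: Dict[str, int], current_year: int) -> List[str]:
    """Return the sorted labels of entries still protected in current_year."""
    return sorted(
        label for label, year in entries.items()
        if is_protected(year, current_year)
    )


# The example above: twelve original entries made in 2000,
# and three entries (BB, FF, KK) added in 2003.
entries = {label: 2000 for label in "ABCDEFGHIJKL"}
entries.update({"BB": 2003, "FF": 2003, "KK": 2003})

print(protected_entries(entries, 2016))  # ['BB', 'FF', 'KK']
```

The sketch works only because the investment year of every entry is recorded; an on-line user who sees just the current compilation has no such record, which is the difficulty described above.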
Critics of database protection are correct to point out that
this could produce "chilling" effects on those who want to use
the database after the initial term of protection. One
commentator has suggested that new entries be electronically
"tagged," so that a user can readily determine what is
protected and what is not, e.g.,
A B
BB
C D E F
FF
G H I J K
KK
L
To the extent they have considered it, the protection
advocates have not been favorable to the "tagging" idea. We too
recognize that it might create substantial technological
problems or costs, depending on the database.
Another possible solution would be to require any database
producer that wanted to enjoy protection for a revision of
its database after the fifteen year period to make (or have
made) the original, no-longer-protected database available in a
reasonable format. This would be the electronic equivalent of
the old copy of
Wuthering Heights
in the public library. The original database need not be
as
available as the new version -- just as old library books
usually are not as available as books at retail stores -- but it
should reach some standard of public access. On this count, it
is possible that the problem of "perpetual protection" could be
addressed by establishing a limited, well-defined archiving
right for libraries, possibly taking ideas from 17 U.S.C.
§108 and §403 of the Digital Millennium Copyright
Act, which modifies 17 U.S.C. §108 to cover digitized
archiving.
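How an archived original would let a user separate old entries from new ones can be sketched as follows. This is a minimal illustration, assuming that entries carry stable labels and that an original entry is unchanged in the revision; the function name split_by_archive is hypothetical, and a revised entry would call for a more careful comparison than this.

```python
from typing import List, Tuple


def split_by_archive(
    archived_original: List[str], current_version: List[str]
) -> Tuple[List[str], List[str]]:
    """Split the current version into entries found in the archive and entries added later."""
    original = set(archived_original)
    carried_over = [e for e in current_version if e in original]
    added_later = [e for e in current_version if e not in original]
    return carried_over, added_later


# The archived database of 2000 and the revised database of 2003.
archived_2000 = list("ABCDEFGHIJKL")
current = ["A", "B", "BB", "C", "D", "E", "F", "FF",
           "G", "H", "I", "J", "K", "KK", "L"]

carried_over, added_later = split_by_archive(archived_2000, current)
print(added_later)  # ['BB', 'FF', 'KK']
```

The entries carried over from the archived original are those whose protection has lapsed once the fifteen years have run; the entries added later are those that may still be protected.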
This archiving approach does not, however, resolve the similar
problem that could arise when (a) a private entity adds value
to government-generated information, (b) the entity distributes the new,
value-added compilation, and (c) the government withdraws from
supplying the data to the public. In such situations, there is
the possibility that the private entity will use a minimal
amount of value-added processing to claim that the entire
compilation of information is protected. This could frustrate
the goal of making government-generated data widely available;
at the same time, we do not want to adopt any regime which will
take away from incentives to "value-add" to
government-generated data. Copyright law has addressed a
parallel problem: mixtures of privately-generated
(copyrightable) materials with government-created
(noncopyrightable) materials. In such cases, 17 U.S.C. §
403 provides that where a work is "predominantly" U.S.
Government material, the copyright notice should include a
"statement identifying, either affirmatively or negatively,
those portions" protected under the copyright law as contrasted
with the "works of the United States Government". If a
copyright holder fails to include such a statement, 17 U.S.C.
§ 403 provides that the defendant in an infringement
action can claim a defense based on innocent infringement to
mitigate any damages. We think that it would be appropriate to
consider whether a similar provision, possibly linked to
"tagging" or otherwise identifying government-generated data,
should be included in H.R. 2652.
#####
