The Computerworld Honors Program
Honoring those who use Information Technology to benefit society
Los Angeles, CA, US



Media, Arts and Entertainment


The J. Paul Getty Trust

Web-Based Global Art Resources: The Getty Vocabularies

Short Summary
For more than two decades, the J. Paul Getty Trust, an international cultural and philanthropic organization devoted to the visual arts, has produced a suite of electronic resources to support its commitment to making art information accessible to all. These resources provide access to information on the visual arts and related disciplines by promoting standards and practices, and by developing tools and guidelines for creating, managing, preserving, and delivering information in electronic form. Guided by the belief that art has the power to enrich lives and strengthen humanistic values, the Getty continuously employs a range of technologies to serve a growing worldwide audience. The Getty’s electronic resources include three vocabulary databases: the Art & Architecture Thesaurus® (AAT), the Getty Thesaurus of Geographic Names® (TGN), and the Union List of Artist Names® (ULAN). The Web-Based Global Art Resources program is a portfolio of projects supported by Getty Information Technology Services (ITS). To support the three Getty vocabularies, ITS built, enhanced, and maintains a powerful thesaurus construction and publication system. In addition to licensing the vocabularies to hundreds of non-profit cultural and educational institutions and commercial vendors of museum- and heritage-related software products, the Getty makes its vocabulary databases available free of charge on the Web. Hundreds of thousands of searches are conducted in the AAT, TGN, and ULAN each month. In addition to being essential resources for the documentation of art, architecture, and material culture, the Getty vocabularies are also invaluable lookup tools and knowledge bases, and powerful searching assistants that increase both precision and recall in online queries.

Introductory Overview
Since the early 1980s, the Getty has devoted significant institutional efforts to the creation of a number of electronic resources to support its commitment to making art information accessible to all. This case study focuses on the portfolio of projects that developed and delivered the three Getty vocabulary databases (AAT, TGN, and ULAN), as these are the most widely used among the art and art history research resources offered by the Getty. These vocabulary databases are fully compliant with ISO (International Organization for Standardization) and NISO (National Information Standards Organization) standards for thesaurus construction. They contain terms, names, and other information about people, places, things, and concepts relating to art, architecture, and material culture. These structured vocabularies, which grow and are enhanced through the Getty’s efforts and contributions from selected partners, can be used in three ways: at the data entry stage by catalogers or indexers who are describing works of art, architecture, material culture, archival materials, visual surrogates, or bibliographic materials; as knowledge bases providing information for researchers; and as search assistants to enhance end-user access to online resources. The Art & Architecture Thesaurus® (AAT) currently contains 131,000 preferred and variant terms, descriptions, bibliographic citations, and other information relating to fine art, architecture, decorative arts, archival materials, and material culture. The Getty Thesaurus of Geographic Names® (TGN) contains 1.1 million place names (including linguistic and historical variants), place types, coordinates, and descriptive notes, focusing on places that are important for the study of art and architecture. The Union List of Artist Names® (ULAN) contains 293,000 names (including a wealth of variant names, pseudonyms, and language variants), as well as biographical and bibliographic information about artists and architects.
Since 1998, Getty ITS and Web Services have worked closely with the Getty Vocabulary Program to conceive, develop, deploy, and continuously enhance the technologies necessary to support the growth and usage of the vocabulary databases. Three major projects comprise the work completed to date. First, the Vocabulary Coordination System (VCS) project created a single production system that replaced three separate, outdated and disparate data collection and editorial systems that had been used to produce the three vocabularies. The new, more powerful production engine allows Getty staff to efficiently collect, analyze, edit, merge and distribute terminology from Getty departments, as well as from external collaborating institutions. Second, the Vocabularies on the Web project produced unified Web-based access to the three Getty vocabularies and made them available to hundreds of thousands of researchers, scholars, and members of the general public who are interested in the subject areas covered by the vocabularies. This project also enhanced security to protect the Getty’s intellectual property, and added measurement metrics to allow the Getty to gauge the usage volume, usage patterns, and the success of these efforts. Finally, the Vocabulary Contributions project created processes and procedures for making use of and contributing to the vocabulary databases an integral part of the work in all relevant Getty and external projects. These multi-year efforts required out-of-pocket expenditures of $750,000, an expert technology project team, co-led by the Getty Vocabulary Program and Getty ITS, and the provisioning of necessary infrastructure elements from servers to databases and desktop technologies.

Has your project helped those it was designed to help?   Yes

What new advantage or opportunity does your project provide to people?
Until the 1980s, concepts like controlled vocabularies, metadata and schemas were all but unknown in the world of art and material culture information. The Getty Vocabulary Program believes in the value of tools and resources that provide controlled vocabularies and rich metadata and has worked for many years to enhance access to information on the visual arts and related disciplines by promoting standards and practices and providing tools and guidelines for developing, managing, preserving, and delivering information in electronic form. The Getty vocabularies are an embodiment of this belief, and represent our long-term commitment to the field of art and art history research. Today, hundreds of thousands of people outside the Getty, from scholars, researchers, students, and teachers to end-users with an interest in art and art history, can access the Getty vocabularies completely free of charge, using a technology (the World Wide Web) that is easy and widely available. Our vocabularies average over 900,000 searches every quarter. There are four key types of users of these resources: 1) information professionals in museums, visual resource specialists, librarians, and archivists; 2) academics in art history, architectural history, archaeology and history; 3) software system implementers, vendors and vocabulary providers; and 4) the general public. The use of the Getty vocabularies greatly enhances the efficiency and broadens the scope of results of each user's research. In many cases, they enable users to reach information and knowledge that may be otherwise unreachable to them. Students and other non-expert users benefit especially, as these tools help them find information that could only be attained, in the past, by those with many years of research or academic experience. A few examples of this enabling process are illustrated in the appendices (when viewing these jpg files, please enlarge the images to see the details) and described below.

Has your project fundamentally changed how tasks are performed?   Yes

How do you see your project's innovation benefiting other applications, organizations, or global communities?
Many museums, libraries, universities and educational and academic institutions regularly consult the Getty vocabularies in their research work and in the management and cataloging of their cultural heritage collections. The appendices contain examples of how the Getty vocabulary systems can extend, expand, enrich, and enable searching and research in art and art history. Appendix 1 illustrates three retrievals using synonyms. First, a search in the AAT for terminology related to the Koran returns 15 different forms used to denote this particular cultural object. Using these various forms, the AAT enables the researcher to reach the most comprehensive set of information related to the Koran, regardless of the spelling or transliteration that is used in the documents that are being searched. Second, a search in ULAN using any of the many variant names of the Flemish sculptor and architect Giambologna takes users to the information on the same artist. Third, in TGN, users can find information on the city of Cairo by using any of its variant names. Appendix 2 illustrates how a search using a broad term such as “vessels” can retrieve, with the aid of the AAT, more than 20 specific subcategories of vessels ranging from “alembics” to “ewers” in just the alphabetic listing up to the letter “f.” Appendix 3 lists the major groups of users and types of organizations that benefit from the use of the Getty vocabularies (including the Metropolitan Museum of Art, the Carnegie Museum of Art, the Victoria and Albert Museum, Princeton University, and the University of California at Berkeley) and the various delivery mechanisms of our data. Government organizations such as the U.S. Department of Defense and the National Archives and Records Administration, and commercial entities, such as IBM, have licensed the Getty vocabulary data.
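
The two retrieval behaviors described above (any variant name leading to the same record, and a broad term pulling in its more specific subcategories) can be sketched in a few lines of Python. This is only an illustrative sketch: the `Concept` and `Thesaurus` classes are hypothetical names, the sample entries are a small hand-made set rather than actual AAT, TGN, or ULAN records, and nothing here reflects how the Getty's own system is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One vocabulary record: a preferred term plus its variant forms."""
    preferred: str
    variants: list = field(default_factory=list)
    narrower: list = field(default_factory=list)  # more specific child concepts


class Thesaurus:
    """Hypothetical in-memory thesaurus used only for illustration."""

    def __init__(self):
        self._index = {}  # case-folded name -> Concept

    def add(self, concept):
        # Index the preferred term and every variant under the same record.
        for name in [concept.preferred, *concept.variants]:
            self._index[name.casefold()] = concept
        for child in concept.narrower:
            self.add(child)

    def resolve(self, term):
        """Return the record for any preferred or variant name, or None."""
        return self._index.get(term.casefold())

    def expand(self, term):
        """Return the term, its variants, and all narrower terms beneath it."""
        concept = self.resolve(term)
        if concept is None:
            return [term]  # unknown terms pass through unchanged
        terms, stack = [], [concept]
        while stack:
            current = stack.pop()
            terms.append(current.preferred)
            terms.extend(current.variants)
            stack.extend(current.narrower)
        return terms


# Illustrative sample data only; not actual AAT, TGN, or ULAN records.
sample = Thesaurus()
sample.add(Concept("Giambologna", ["Giovanni da Bologna", "Jean Boulogne"]))
sample.add(Concept("Cairo", ["Al-Qahira", "Le Caire"]))
sample.add(Concept("vessels", narrower=[Concept("alembics"), Concept("ewers")]))

print(sample.resolve("jean boulogne").preferred)  # Giambologna
print(sample.expand("vessels"))
```

A search application could pass the output of `expand` to its query engine as a set of alternative terms, which is one way a thesaurus can raise recall without the user knowing every spelling or subcategory.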

The Importance of Technology
How did the technology you used contribute to this project and why was it important?
Technology has played a crucial role in the history of the Getty vocabularies since the early 1980s, when the first production system for the AAT was developed. Complexity and volume of data in the vocabularies mandated the use of technology from the very beginning. The AAT was also the first of the Getty vocabularies to be published in electronic form (in 1992), followed by the ULAN in 1994. In the mid-1990s, all three Getty vocabularies were first made available on the Web in a rudimentary form. It soon became clear that not only was technology important to the creation of the Getty vocabularies, but that the vocabularies themselves were important to technological solutions as they had the potential for enhancing access to materials cataloged in electronic formats. It also became clear that it was time to move the three aging, disparate production systems to a single core production system that would make use of modern technology to allow the Getty Vocabulary Program editors to more efficiently produce these increasingly valuable and relevant thesauri. Because our editorial methods and production requirements were on the cutting edge, and could not be met by any software product on the market, it was up to the Getty's technical team to use modern database tools and programming languages to provide an appropriate production and publication system for the AAT, TGN, and ULAN. The work resulted in the building of the Vocabulary Coordination System and a second project, the re-design of the publicly accessible Getty Vocabularies on the Web. Once the core data structure was in place, technology made it possible for us to plug in a number of different interfaces to aid in the automated collection and distribution of the vocabularies in an even wider range of formats.
A third project developed the interfaces necessary to allow the Getty Vocabulary Program to collect, analyze, merge, and distribute terminology not only from the many Getty departments that create this type of information, but also from many external collaborating institutions as well. The current applications supported by the Getty's technical team are: the core production system (VCS), the Getty Vocabularies on the Web where the data is updated monthly, Web-based forms for the automated contribution of single records, programs to automatically load batches of contributed data in XML format, and programs to produce yearly exports in XML, relational tables, and the MARC (Machine-readable Cataloging) format for institutions and commercial entities that elect to license the entire datasets. In short, technology has made it possible for the Getty to build, maintain, and disseminate its vocabularies through a variety of modalities that we couldn’t have even imagined when production of the Art & Architecture Thesaurus began decades ago.
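
As a rough illustration of what a batch loader for contributed XML data might involve, the sketch below parses a small batch into plain records using Python's standard library. The element names (`contributions`, `record`, `preferred`, `variant`), the `vocabulary` attribute, and the `load_batch` function are all assumptions made for this example; the actual contribution format used by the Getty is not described in this case study.

```python
import xml.etree.ElementTree as ET

# Hypothetical batch format, invented for illustration only.
SAMPLE_BATCH = """<contributions>
  <record vocabulary="ULAN">
    <preferred>Giambologna</preferred>
    <variant>Giovanni da Bologna</variant>
    <variant>Jean Boulogne</variant>
  </record>
  <record vocabulary="TGN">
    <preferred>Cairo</preferred>
    <variant>Le Caire</variant>
  </record>
</contributions>"""


def load_batch(xml_text):
    """Parse a batch of contributed records into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("record"):
        preferred = rec.findtext("preferred")
        if not preferred:
            continue  # skip records that lack a preferred term
        records.append({
            "vocabulary": rec.get("vocabulary"),
            "preferred": preferred.strip(),
            "variants": [v.text.strip() for v in rec.findall("variant") if v.text],
        })
    return records


for record in load_batch(SAMPLE_BATCH):
    print(record["vocabulary"], record["preferred"], record["variants"])
```

In a real pipeline, records parsed this way would presumably go through editorial review and merging before publication; that step is outside the scope of this sketch.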

What are the exceptional aspects of your project?
In the early 1980s, the Getty began work on the AAT, seeking to build a tool that would be useful as an authority file for those whose job it was to catalog and describe not only bibliographic materials about art and material culture, but also visual surrogates of works of art, architecture, and material culture (in the case of slide libraries, photographic archives, and similar repositories), as well as the objects themselves (in the case of museums, archives, and other holding institutions). This effort was the first in its field. In the mid-1980s, we began developing our second vocabulary tool, ULAN, a database of preferred and variant names, biographical information, and bibliographic citations for artists, architects, and other creators in the field of visual arts and architecture. This was also the first tool of its kind. In the late 1980s, work began on the third vocabulary database, TGN, with the data first published on the Web in the mid-1990s.

How is it original?
According to the Online Computer Library Center (OCLC), “The Getty Vocabularies are the premier references for categorizing works of art, architecture, material culture, and the names of artists, architects and others.” The Getty has pioneered work in the fields of authority control, controlled vocabularies, metadata and art information schemas since the early 1980s in an environment where an expression like "authority control" was often rejected by many curators and museum professionals. The Getty worked and persisted through a period of "consciousness raising" and educational outreach, to the point where it is considered a leading authority on standards-based cultural heritage information, both nationally and internationally. Our originality and continuous dedication have been recognized by many of our colleagues. Kay Bearman, Senior Administrator of Collections Management at the Metropolitan Museum of Art, New York, says: “In my opinion, the Getty is the recognized leader in the field of documentation and access tools for art and architecture. The Getty vocabularies—AAT, TGN, and ULAN—are tools that are used by staff in museums like the Met every single day to provide enhanced access to our collections information both internally and for our vast audience of online users. The vocabularies and other research databases produced by the Getty are invaluable resources for the museum community, and I venture to say that no other institution has the knowledge, expertise, and institutional mission to produce—and to share—such powerful tools.”

Is it the first, the only, the best or the most effective application of its kind?   All of the above

Has your project achieved or exceeded its goals?   Achieved

Is it fully operational?   Yes

How many people benefit from it?   100,000s

If possible, include an example of how the project has benefited a specific individual, enterprise or organization. Please include personal quotes from individuals who have directly benefited from your work.
The Getty has received many expressions of recognition of the success and significance of our work. Ann Baird Whiteside, President of the Art Libraries Society of North America and Head of the Rotch Library of Architecture & Planning at MIT, says: “For two decades, the art library and visual resources communities have looked to the Getty for leadership in data standards, research tools, and documentation resources for art, architecture, and other visual collections. The thesauri produced by the Getty Vocabulary Program are tools that every art library and image collection in North America and beyond consults—and, more recently, contributes to—on a regular basis. The leadership of the Getty has helped to shape the art library and visual resources profession, and has helped to create strategic alliances between various library, archival, and museum communities. Our profession owes a profound debt of gratitude to the Getty for providing us with essential tools that enable us to fulfill our professional duties as librarians and information professionals in the digital age.” The Online Computer Library Center (OCLC), a nonprofit organization that provides computer-based cataloging, reference, resource sharing, eContent and preservation services to 57,000 libraries in 112 countries and territories, announced in late 2006 that the Getty vocabularies will be available through its OCLC Terminologies Service. "We are especially pleased to offer the Getty vocabularies because their use by libraries, museums and archives will increase consistency and improve access to digital collections,” said Phyllis B. Spies, Vice President, OCLC Collection Management Services.

How quickly has your targeted audience of users embraced your innovation? Or, how rapidly do you predict they will?
As indicated below, under “Difficulty,” there was a period of years during which training and outreach were necessary to raise awareness, particularly in the museum community, of the importance of metadata standards and controlled vocabularies for documenting and providing access to collections; the library and archival communities already had a long tradition of standards-based documentation, but few tools that were specifically devoted to art and material culture. Since the late 1990s, and especially in the first years of the new millennium, awareness of the importance of the Getty vocabularies, and usage both online and in the form of licensed datasets, has increased tremendously, with hundreds of thousands of Web users and hundreds of institutions and vendors that have acquired licenses for the vocabulary data.

What were the most important obstacles that had to be overcome in order for your work to be successful? Technical problems? Resources? Expertise? Organizational problems?
Internally, building the vocabularies has been a time- and labor-intensive undertaking, both technically and intellectually. Many of the ideas behind the Getty vocabularies were conceived prior to the advent of the World Wide Web. As the Web world and the creation of millions of largely unstructured resources became realities at a very fast speed, the Getty faced the internal challenge of renovating our processes and systems to make the backend production engine robust and powerful enough to handle large amounts of data effectively and the frontend user interface friendly and flexible enough for the ever-changing Web environment. Project management in a world of changing requirements and limited resources was a major challenge; subject matter knowledge of art and art history research was another.

A great external challenge lies in convincing museums to use the standards-based tools for documentation and end-user access. Museum professionals tend to be horrified by an expression like “authority control.” Art historians usually consider themselves authorities on a particular artist, school, or art form and being told the exact name to use for an artist, or what an object in their collections should be called, is abhorrent. Thus, a period of “consciousness raising” and education began in the late 1980s, as museums made the first attempts to control their collection information in electronic information systems. We believe we have been successful in showing art professionals that the use of a controlled vocabulary or thesaurus does not put constraints on their scholarship—rather, it uses the power of expert information and technology to enable users to retrieve relevant information whether or not they use the particular term used by a given institution or scholar. The educational effort continues today, with museums increasingly open to receive the message of standards-based documentation for enhanced and expanded access to collections information.

Often the most innovative projects encounter the greatest resistance when they are originally proposed. If you had to fight for approval or funding, please provide a summary of the objections you faced and how you overcame them.
The allocation of resources in the Getty ITS and Web Services departments, which support 20 to 40 technology-related projects in any given year, is always an issue that demands clear institutional prioritization and effective execution of our tasks once resources are allocated. Getty ITS worked closely with the Getty Trust, program and department executives to prioritize all technology project proposals and obtain funding. ITS worked closely with the Getty Vocabulary Program (the user department) to clearly articulate the benefits and significance of these projects to the overall mission of the Getty Trust. All proposals were collected, documented and presented to Getty executives for decision making. By assigning a project manager who was well-versed in the vocabularies, ITS achieved a high level of teamwork and alignment with users. By holding a high standard of project management discipline across the department, ITS successfully delivered the projects in the portfolio and earned the trust of the internal clients (the Vocabulary Program) and Getty executive management. ITS’ success in delivering projects on time and within budget, together with the large groups of users we can demonstrably show we have served with our vocabulary tools, proved to be the best argument for obtaining approval for additional funding to further support and grow the Getty vocabularies. Recognition and confirmation from our users of the value added by ITS to core Getty projects are often more powerful than self-advertisement. Murtha Baca, Head of the Getty Vocabulary Program, states: “Our production system is the result of extensive analysis and programming—and expert project management—from a dedicated team of technical experts from ITS with specific subject expertise. The database editors who develop and maintain the Getty vocabularies have advanced knowledge of and degrees in art history and information science, and are experts in standards-based thesaurus construction.”