DH09 Tuesday: Christine Borgman keynote
[note to self: read Gibson's Spook Country.]
[OK, I'll admit: I'm tired and punchy. Hopefully, I'll do some justice to Borgman's talk.]
“Scholarship in the Digital Age: Blurring the boundaries between the sciences and the humanities.”
Borgman’s Scholarship in the Digital Age: Information, Infrastructure and the Internet was published by MIT Press in 2007. Well received and worth reading. What’re you waiting for? Here it is! Encourage your local library to get a copy! And now, on to the talk.
Wikipedia article on Humanities, as Wordle sees it: so many definitions. Likewise, the digital humanities. But here’s a usable one: “not a unified field but an array of convergent practices that explore [the humanities]”.
Take-up of DH is difficult, partly because infrastructure isn’t there yet. Whose problem is this? J.Drucker says: “We face a critical juncture. Leaving it [tool building] to “them” is unfair, wrongheaded, and irresponsible. Them is us.”
General outline for Borgman’s talk:
- Scholarly information infrastructure
- Science and, or, vs humanities: publication practices; data; research methods; collaboration; incentives; learning
- Call to action
It’s something we can build, but it’s also emergent. We can build new bridges, but we also have Roman aqueducts. When infrastructure works well, it’s invisible, and when something goes very wrong with it (see DC Metro crash yesterday) we become very aware of it.
Scholarly information infrastructure writ large is also called cyberinfrastructure, eScience, eSocial Science, eHumanities, …eResearch. Its goal is to enable new forms of scholarship that are:
This is very much a time to influence its formation, as our cultural heritage becomes (is converted, is born) digital.
Building a research agenda for digital scholarship, per Amy Friedlander: we need to address questions of scale, language/communication, space/time, and social networking.
2. Science and, or, vs humanities.
First, publication practices. Some things about scholarly journal publication have changed, but some haven’t changed since the first one in 1665: there are titles, subtitles, volume numbers, etc.
Why do scholars publish? To be legitimized (via peer review); to have their work disseminated (by publisher and pre-print distribution); and to have their work preserved, curated and accessed (via libraries).
Digital publication is largely the same, currently, as print publication. Except access/preservation/curation, which is expanded to library, publisher’s server, repository and homepage.
Citation: in the sciences, there’s a very fast drop-off, because stuff becomes irrelevant quickly. But they publish far back (to about 1900). Humanities citation drop off is not nearly as precipitous, because earlier publications remain relevant; but works go out of print long before they become irrelevant.
arXiv.org, the archive that started in physics and now covers other fields, has half a million articles in it. Mostly people are republishing (submitting to arXiv and at the same time submitting to journals), but some people publish *only* on arXiv.
Digitally published scholarship reaches audiences faster, and citation rates increase. The print world risks a closed community.
On to data in digital scholarship. They’re the third stream of scholarly capital, along with human capital and instrumentation. They leverage research investment by replicating and verifying research findings and asking new questions with data that already exist. Data creation, sharing and reuse is a huge issue in the sciences right now, and will be in DH if it’s not already here.
So what are data? Data can be observational, computational, experimental, and records. Are data objective or subjective? They can be facts, or they can be “alleged evidence.” Humanists might start with alleged evidence, but actually that is more pervasive in the sciences than scientists tend to acknowledge.
Scientific data come in many forms. Each field/discipline has many things it would call data (weather in ecology, spectral surveys in astronomy). Scientists generate their own data or acquire them from collaborators, other scientists, or repositories. Social scientists generate their own data but also acquire them from things like government records.
So what are humanities and arts data? Borgman says: newspapers, photographs, letters, diaries, books, birth/death/marriage records, church records, maps — any record of the human experience can be data. They’re found in libraries, archives, museums, public records, corporate records, mass media. Intellectual property is a much hairier issue than in the sciences.
Wired article by Chris Anderson exactly a year ago: “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete.” Provocative, yes, but also serious. What does the data deluge mean for the humanities? One person’s theory, in the humanities, is another person’s data.
The notion of what you *mean* by data needs to be tackled head-on. How it’s captured and curated for future re-use is also important.
The laboratory for the humanities has traditionally been a library. Now there’s a hybrid laboratory: stacks and computers, interesting new partnerships with the library community. What to do with the library building now that the vast majority of a library’s resources is online? We refurbish buildings into hybrid environments.
Problem solving methods: Aristotle – empirical; Leibniz – theory; von Neumann – simulation; and ___ – data. George Djorgovski: “applied CS is now playing the role that math did from the 17th through the 20th centuries: providing an orderly, formal framework and exploratory apparatus for other sciences.” DH are more collaborative than that, but how do you publish the same thing in a CS journal and in a history journal?
Sloan Digital Sky Survey: broke open [the] access to astronomical data. Massive use by non-astronomers, notably schoolchildren. More than 1700 scholarly publications came out of the SDSS. Talk about broad reach.
Just as humanities has gone from a bunch of small projects to big collaborative projects, scientists have gone from “small science” to much larger projects. It changes the questions you can ask, and it also changes professional training in the field.
Will digital biologists know how to do “wet” biology? Will digital humanists be able to use a physical archive?
Rome Reborn Project: used to look more like a digital library than it does now. Has taken many forms over time. Was the biggest booth at last year’s SIGGRAPH, at 3500 sq. ft. Nice example of a cross-over.
On to collaboration. Is it everything? In the sciences, yeah. Nobody gets their own linear collider. This also means people grow up (professionally) in labs, together. The lone scholar has trouble changing.
Scientists, when asked what their data are, are less sure of the answer(s) than we might think. What counts as trustworthy data to one person may be neither data nor trustworthy to someone else. The many different kinds of data, who is generating them, and how to bring them back together again are issues that DH haven’t begun to consider. [vz: I think this conference's sessions so far indicate that we have, in fact, started to consider this. Just started, though.]
We learn new practices, practice listening, and learn/create new pidgin common languages for various communities.
Incentives to share (more for scientists right now, but coming to DH): open science/scholarship; recognition; collaboration; reciprocity; coercion. Incentives not to share: rewards go to publication (not to good metadata); the effort involved in documenting data; and also competition, priority of claims and intellectual property issues.
Finally, learning in sciences vs. humanities. Cyberlearning: the use of *networked* computing to learn. Why is it important? It leverages learning through communications technologies and students’ tech skills; it extends capacity of educational institutions for lifelong learning opportunities.
We need to instill a “platform perspective.” Let’s get a common platform, let’s invest in tools like Zotero (etc) to be able to cross over.
We need to enable students to use data (many teaching opportunities there).
We need to promote open educational resources. (Hello, OER program, Creative Commons, Science Commons.)
Openness matters, and is a necessary though not sufficient condition for interoperability, which trumps all. Also discoverability.
Borgman wraps up with a call to action. Here it is:
- Publication practices: Increase speed and scope of dissemination through online publishing and open access.
- Data: Define, capture, manage, share, and reuse data.
- Research methods: Adapt practices to ask new questions, at scale, with a deluge of data. Don’t ask the same questions faster.
- Collaboration: Find partners whose expertise complements yours, listen closely, and learn. It takes a year or so to get a pidgin language for collaboration.
- Incentives: Identify best practices for documenting, sharing and licensing humanities content. Learn this from the sciences.
- Learning: Build a vibrant digital humanities community, starting in the primary grades.
- Generally: Err toward openness, reusability, and generalizability.
Research questions/problems to consider: what are data? What are the infrastructure requirements? Where are the social studies of digital humanities? What is the humanities laboratory of the 21st century? What is the value proposition for digital humanities in an era of declining budgets? [Not a snarky question; we have to make a case for ourselves.]