An article, a CFP, and a useful site

April 7th, 2009 in DigiLib BLog

The article, from the Public Library of Science, is this: “Clickstream Data Yields High-Resolution Maps of Science.” The authors collected “nearly 1 billion user interactions recorded by the scholarly web portals of some of the most significant publishers, aggregators and institutional consortia,” says the abstract. They proceeded to create maps that illustrate citations in the articles with which the users interacted. These maps “provide a detailed, contemporary view of scientific activity and correct the underrepresentation of the social sciences and humanities that is commonly found in citation data.” The most interesting illustration in this context is Figure 5—check out that big white and yellow cluster in the center. It’s worth the load time to view the larger image.

The CFP is for the next annual meeting of the Text Encoding Initiative Consortium. This year’s theme is text encoding in the era of mass digitization. The first three suggested topics are conceptually larger than TEI, and are intriguing: In-depth encoding vs. mass digitization; Is text encoding sustainable?; Is text encoding scalable? People are bound to talk about crowdsourcing metadata, which I think is the only hope we have of scaling semantic encoding. (The quality control issues, which are the first concern that usually arises when people talk about collaborative knowledge work, are real. But there are ways to deal with them, and data that can be corrected may well be better than no data at all.)

The site I came across today is FairShare. It allows people to track how their online publications are used and/or remixed. Haven’t played with it yet, but it looks promising, particularly in the context of an institutional repository. Imagine a researcher depositing an article, pointing FairShare at it and seeing others respond to her work. Just the psychological boost from that is valuable in spurring future work.

JISC Digital Preservation Policy Report

November 24th, 2008 in DigiLib BLog

In October of this year the UK-based Joint Information Systems Committee (JISC) released a final report resulting from a six-month digital preservation policy study they’d conducted earlier in the year. The report is available here, and appendices are here (both links lead to PDF files).

Although the study was performed in the UK and the report is chiefly aimed at UK audiences, the JISC investigators drew on an international set of data, and their findings will certainly be useful to large institutions outside the Isles.

For all that the report is long and detailed, the authors’ definition of digital preservation is impressively concise: “In contrast to printed materials, digital information will not survive and remain accessible by accident: it requires ongoing active management. [...] Digital preservation is the process of active management by which we ensure that a digital object will be accessible in the future.” (10)

I list some of their recommendations below, as thinking points that jumped out at me. (My present context: recent return from the SPARC Digital Repositories Meeting 2008 held in Baltimore last week, about which soon.) Much more information is available in the report itself. This is what the JISC investigators see as best practices for thinking through a digital preservation policy (DPP below) on an institutional level:

- Have a principles statement, and tie it in to the university’s stated overall aims
- Highlight connections between the DPP and other policies, practices, objectives that may be in place at the same institution; highlight also connections between the DPP and similar policies at other institutions
- Clearly state preservation objectives (archival requirements, long-term research prospects) and an intent to “deliver a reliable and authentic version to [the] user community” (19)
- Speak not only to preservation itself but also to user experience
- State explicitly which relevant governmental statutes the policy will adhere to (Freedom of Information Act, for example)
- Specify what kinds of materials will be preserved (can be presented in different groupings, for example organized by how complex preservation is for given objects, by formats, by priority)
- Specify transparency and accountability as goals, and provide venues for external entities to check on that
- Outline an implementation plan. This is possibly the most difficult step in the process, but crucial.
- The policy should be version controlled.

The report also addresses important topics like intellectual property, financial and staff responsibility, distributed services, standards compliance, auditing and risk assessment – the list goes on. Though the report is sixty pages long, it is an excellent source of information and springboard for a detailed approach to creating – and most importantly implementing – a digital preservation policy.

