International and national research initiatives using archived web material

Big UK Domain Data for the Arts and Humanities

The project will work with the dataset derived from the UK domain crawl from 1996 to 2013, totalling approximately 65 terabytes and constituting many billions of words. A key objective of the project is to develop a theoretical and methodological framework for studying this data, one that will also apply to the much larger ongoing UK domain crawl and to other national contexts.

NetLab

NetLab is a research infrastructure project for the study of internet materials within the national Danish Digital Humanities Lab (DigHumLab). NetLab aims to initiate and conduct research-driven projects that contribute to the establishment and development of a research infrastructure.

WebART: Web Archive Retrieval Tools

The WebART project aims to critically assess the value of Web archives for realistic research scenarios, and to develop information access tools and methods that maximize the archives' utility for research. Our approach is to conduct actual Web archive research hand-in-hand with the development of Web archive access tools tailored to those scenarios. Within the project, our focus is on the use-case of a humanities researcher, although the tools will no doubt be useful for other use-cases as well.

ALEXANDRIA

The ALEXANDRIA project aims to develop the models, tools and techniques necessary to archive and index relevant parts of the Web, and to retrieve and explore this information in a meaningful way. While easy access to the current Web is a good baseline, optimal access to Web archives requires new models and algorithms for retrieval, exploration, and analytics that go far beyond what is needed to access the current state of the Web. This includes taking into account the unique temporal dimension of Web archives, the structured semantic information already available on the Web, and social media and network information.

Atelier DL web Ina

The methodological research workshops of the web legal deposit at Ina were created in November 2009 to encourage, study and promote uses of the web and its archives for study and research. The workshops are aimed at students, researchers and media professionals. Held once a month, they address a range of issues related to the migration of content online and to changing practices in its production, reception and preservation.

Analytical Access to the Domain Dark Archive (AADDA)

AADDA is developing new forms of access to a dark archive of UK websites (1996-2010). The project is funded by the JISC and led by the Institute of Historical Research in partnership with the British Library and the University of Cambridge.

WIRE: Working with Internet Archives for Research

WIRE is a workshop organized by Rutgers University, Northeastern University, and the Internet Archive, taking place 17-18 June 2014 at Harvard, Cambridge, MA. The aim of the workshop is twofold. First, it will provide a forum for presentations and discussions of ongoing research involving community development and historical Internet data. Second, a closing session will be devoted to discussing future research needs and unanswered research questions with regard to data and access to historical Internet records.