DH@Lboro

Online Resources

These are more comprehensive listings of the digital research tools available.

The AEOLIAN Network is creating a total of five case studies on the use of Artificial Intelligence in cultural organisations across the UK and US. The case studies are open access and have been compiled for research purposes.

Contains curated lists of tools and code for studying texts. The project is led by and based at the University of Alberta.

Curated by Alan Liu, English professor at the University of California, Santa Barbara.

More than simply tools, this site also contains tutorials on how to use them.

This is a selection of some of the most useful tools. In the main they are free and open source; where they aren’t, a free trial is usually available. They are organised under the following headings: Capture, Creation, Enrichment, Analysis and Dissemination.

CAPTURE

Transcription

Scribe offers an open source framework for setting up community transcription projects around handwritten or OCR-resistant texts. Scribe is particularly geared toward digital humanities, library, and citizen science projects seeking to extract highly structured, normalisable data from a set of digitised materials.

T-PEN (Transcription for Paleographical and Editorial Notation)

Web-based set of tools to allow collaborative transcription of manuscript pages in TEI-compliant XML. Users attach transcription data (new or uploaded) to the actual lines of the original manuscript in a simple, flexible interface.

Text recognition software.

Format Conversion

A web service to manage the transformation of documents between a variety of formats. The majority of transformations use the Text Encoding Initiative format as a pivot format.

Pandoc can convert documents in reStructuredText, Textile, HTML, or LaTeX formats to a variety of other formats including XHTML, PDF, EPUB, DOCX, ODT, and more.

CREATION

Turtle, the Terse RDF Triple Language, is a concrete syntax for RDF, developed by David Beckett and Tim Berners-Lee. Turtle does not rely on XML and is more readable and easier to edit manually than RDF/XML. SPARQL, the query language for RDF, uses a similar syntax to Turtle for expressing query patterns.

ENRICHMENT

Annotating

Open source, web-based app that integrates a powerful set of textual interpretation tools behind an intuitive interface. You can upload your texts and annotate them with styled text, video, images and weblinks.

 

An online environment that synchronises web based video with timeline based annotations.

Cleanup

Tool for cleaning messy data (e.g. fixing inconsistencies), transforming between different formats and exploring data. Previously Google Refine.

Editing

The TEI is a consortium that develops and maintains a standard for the representation of texts in digital form. Its chief deliverable is a set of Guidelines that specify encoding methods for machine-readable texts, chiefly in the humanities, social sciences and linguistics.

Digital scholarly editing tool with long term research data archive.

Microsoft XML Notepad is an open-source XML editor.

ANALYSIS

Textual analysis tools

CATMA (Computer Aided Textual Markup and Analysis)

A free, open source markup and analysis tool. Also generates basic visualisation options for texts and corpora. Interfaces with the Voyant toolset.

 

A word cloud and concordance tool built in Javascript. Users can paste text into the provided box and generate a word cloud, concordance or list of words ordered by frequency. 
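As a rough illustration of what a frequency list and a concordance are, the following Python sketch computes both for a short sample. It is not the code behind the tool described above (which is built in JavaScript); the function names `tokenize`, `word_frequencies` and `concordance`, the `width` parameter and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Return (word, count) pairs ordered from most to least frequent."""
    return Counter(tokenize(text)).most_common()


def concordance(text, keyword, width=2):
    """List each occurrence of keyword with `width` words of context on each side."""
    tokens = tokenize(text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, standing in for text pasted into such a tool.
sample = "The cat sat on the mat. The dog sat on the log."

print(word_frequencies(sample)[:3])
for line in concordance(sample, "sat"):
    print(line)
```

A word cloud is essentially a visual rendering of the same frequency list, with each word scaled by its count.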

Program which can search Google’s text corpora for a single word or phrase in sources printed between 1500 and 2008.

Open source tool for comparing and collating multiple witnesses to a single textual work. You can upload your comparison sets to a free online workspace, Juxta Commons, where you can analyse your data privately or share visualisations of your work with anyone on the web.

TAPoRware is a set of text analysis tools that enable users to perform text analysis on HTML, XML and plain text files, using documents from the user’s machine or on the Web.

Free text analysis app that allows you to analyse webpages, tweet streams and documents; explores the relationships between words in the text via an intuitive word cloud interface.

Voyant is a web-based reading and analysis environment, designed to facilitate reading and interpretive practices. Voyant Tools is an open-source project and the code is available through .

A text analysis environment that combines visualisation, information retrieval, sense making and natural language processing to make the contents of text navigable, accessible, and useful.

Data analysis

A Java-based package for statistical natural language processing, document classification, clustering, topic modelling, information extraction, and other machine learning applications to text.

Data mining

Weka is a collection of machine learning algorithms for data mining tasks.

Timeline tools

Chronos Timeline allows you to dynamically present historical data in a flexible online environment.

Open source tool that allows you to create timelines and maps in minutes from a spreadsheet.

Visualisation tools

Visualisation and exploration software for all kinds of graphs and networks; open source and free.

A platform for visualising and analysing networks of historical data.

Free, open source data curation and visualisation plugin for WordPress.

A free data storytelling app where you can create and share interactive charts and graphs, maps and live dashboards.

App which can be used to generate graphs and statistics and to share the data and visualisations.

Mapping tools

ArcGIS is a closed-source platform for building a complete geographic information system (GIS) that lets you easily create, edit, and analyse geographic knowledge. A free trial is available.

A plugin for Omeka, Neatline is a tool for the creation of interlinked timelines and maps as interpretive expressions of the literary or historical content of archival collections.

Free editable map of the world.

User-friendly, open source geographic information system (GIS); supports numerous vector, raster and database formats and functionalities.

Open source software developed by the Center for Geographic Analysis at Harvard. Allows you to build your own mapping portal and share it.

DISSEMINATION

Publishing

Open source content management system for supporting resources like blogs and web sites.

Omeka is a web-publishing platform that allows anyone with an account to create or collaborate on a website to display collections and build digital exhibitions. A free basic trial is available for a hosted account;  is also free.

An open-source, web-based viewer for high-resolution zoomable images, implemented in pure JavaScript, for desktop and mobile.

Scalar is a free, open source authoring and publishing platform.

Free, easy-to-use web publishing platform, originally designed around blogging, that has since evolved into a robust content or learning management system. Many themes and plugins provide extra functionality, including a digital humanities toolkit designed for non-technical users and a plugin that allows readers to comment paragraph-by-paragraph, line-by-line or block-by-block in the margins of a text.