PhiloLogic is a "primary full-text search, retrieval and analysis tool" developed by the University of Chicago. PhiloLogic builds on the premise that the "wide array of XML data specifications and the recent deployment of basic XML processing tools provides an important opportunity for the collaborative development of higher-level, interoperable tools for Humanities Computing applications". Recognizing that a one-size-fits-all approach does not suit the TEI community, PhiloLogic focuses on the development of "specialized, interoperable tools" that can be implemented for end-user applications in a cost-effective manner. PhiloLogic is committed to the open source development of these applications, drawing on a wide range of technical abilities and expertise that is "not well supported by the commercial sector".
Tesseract is a free raw OCR engine originally developed by HP Labs and now maintained by Google. It works with the Leptonica Image Processing Library, and is capable of reading a variety of image formats. It can convert images to text in over 40 languages. [Credit to TAPoR for this exceptional annotation]
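As a rough illustration of converting an image to text, the sketch below calls the `tesseract` command-line program from Python. It assumes Tesseract is installed and on the PATH; the helper names `build_tesseract_command` and `ocr_image` and the file name `page.png` are hypothetical and not part of Tesseract itself.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Build the argument list for one OCR run.

    Passing "stdout" as the output base asks tesseract to print the
    recognized text instead of writing it to a file.
    """
    return ["tesseract", image_path, "stdout", "-l", language]


def ocr_image(image_path, language="eng"):
    """Run tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Only show the command here; call ocr_image("page.png") to run it.
    print(" ".join(build_tesseract_command("page.png", "fra")))
```

The language code passed to `-l` (here `fra` for French) must match a language data file installed alongside Tesseract.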
PDF Extract is an open source set of tools that "allow you to identify and extract the individual references from a scholarly journal article". PDF Extract uses the visual cues in an academic article's formatting to "identify semantically important areas of a PDF" and to extract the relevant material. PDF Extract was created to assist "small and medium-sized publishers to meet CrossRef’s linking requirements and to participate in CrossRef’s Cited-by service".
"Sheetsee.js is a client-side library for connecting Google Spreadsheets to a website and visualizing the information in tables, maps and charts". Sheetsee.js' "features are divided into modules": sheetsee-core, which gets new users started working with and visualizing data simply; sheetsee-tables, which contains all of the function necessary to sort data into columns etc.; sheetsee-maps, built on map box.js and transforms spreadsheet data into a map; and finally sheetsee-charts, which includes basic line, bar, and pie charts.
"VisualEyes is web-based authoring tool developed at the University of Virginia to weave images, maps, charts, video and data into highly interactive and compelling dynamic visualizations". VisualEyes facilitates the presentation of traditional and multimedia primary resources in a manner that encourages "active inquiry and hands-on learning among general and targeted audiences". The aim of VisualEyes is to reveal relationships between multiple and unlikely datasets. "VisualEyes is fairly available for academic and non-profit use"
Kaleidoscope is a tool designed to spot the differences in text files (text scope), images (image scope) and folders (folder scope) in seconds. Text scope allows users to compare text files to spot differences and discrepancies. It also facilitates instantaneous merging of documents. Image scope provides four comparative layouts that assist the user in spotting and analyzing the differences between image files: two-up, one-up, split, and difference. Finally, folder scope allows users to compare directories and to copy files from one to the other.
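The kind of text comparison described above can be illustrated with Python's standard `difflib` module. This is only a sketch of line-based diffing in general, not Kaleidoscope's own method; the function name `text_differences` and the sample strings are hypothetical.

```python
import difflib


def text_differences(old_text, new_text):
    """Return a unified diff of two texts as a list of lines.

    Lines starting with "-" appear only in the old text, and lines
    starting with "+" appear only in the new text.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old",
            tofile="new",
            lineterm="",
        )
    )


if __name__ == "__main__":
    old = "The quick brown fox\njumps over the lazy dog"
    new = "The quick brown fox\nleaps over the lazy dog"
    for line in text_differences(old, new):
        print(line)
```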
"Graphviz is open source graph visualization software" that has been in constant development since 1988. Graph visualizations provide an avenue for representing information as abstract diagrams or networks. While originally designed for bioinformatics and software engineering, Graphviz is flexible in structure and is highly applicable to humanities work. Graphviz functions by transforming simple text language into useful diagrams. The user is able to customize the Graphviz graphics by altering the colour, fonts, nodes, layouts, hyperlinks, and shapes to create specialized diagrams specifically suited to the project's needs.
CulturalAnalytics provides "code for statistical analysis and plotting of image properties". CulturalAnalytics uses R to generate visualizations such as histograms, colour clouds, and image scatter charts. It was designed for use in the digital humanities and is of value to any scholar interested in analyzing digital or digitized images.