The DHmine Framework
DHmine is an experimental toolkit for Digital Humanities projects. Its first prototype, which focused on automated document processing, was created in 2015. A more advanced framework and a set of new modules were published in 2019. See Installation and References for demonstrations.
The central component of our system is a cloud-based framework that integrates many kinds of modules. This dockerized framework is open source, and you can easily integrate your own software into it.
Main components of the system
- How can it help me?
If you're curious how DHmine could help your project, read this summary.
- Shared Document Storage
The system includes a corpus repository called Shared Document Storage. It stores documents, provides functions for handling them, and offers various groupware capabilities. The repository also works as a command interface for the automated document processing subsystem, the Agent-based Information Processing Framework.
- Data Repository
The Data Repository can store any kind of information, including texts and numeric data. Its main purpose is to provide scalable and flexible storage for research data. The Data Repository and the Shared Document Storage are coupled by automated mechanisms that can synchronize selected content between them.
- Agent-based Information Processing Framework
This information processing subsystem is an extensible framework of automated tools for tasks such as character recognition, document transformation, XML tagging, and entity recognition.
- Stylometry analysis
DHmine allows you to perform stylometric analysis through a Web-based interface. Our solution relies on three tools: Stylo to perform the analysis, a heuristic corpus analyzer to suggest analysis options, and an optimizer that tries to find the best parameter settings for a given corpus.
- Publishing support
All data stored in the system can be shared with the public through the Shared Document Storage. DHmine users can reference data, or even services, in their scientific papers to support their findings.
- Installation
- Technical details
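To give a rough idea of what "an extensible framework of automated tools" can look like in practice, here is a minimal Python sketch of an agent pipeline. It is not DHmine's actual API: the `Document` class, the `AGENTS` registry, the `register` decorator, `run_pipeline`, and both example agents are hypothetical names invented for this illustration, and the "entity recognition" step is only a toy stand-in that picks out four-digit numbers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical document record: raw text plus annotations added by agents."""
    name: str
    text: str
    annotations: Dict[str, List[str]] = field(default_factory=dict)


# In this sketch an agent is any callable that takes a document and returns one.
Agent = Callable[[Document], Document]

# Hypothetical registry that maps a task name to the agent implementing it.
AGENTS: Dict[str, Agent] = {}


def register(task: str) -> Callable[[Agent], Agent]:
    """Decorator that adds an agent to the registry under the given task name."""
    def decorator(agent: Agent) -> Agent:
        AGENTS[task] = agent
        return agent
    return decorator


@register("normalize")
def normalize_whitespace(doc: Document) -> Document:
    # Collapse runs of spaces and line breaks into single spaces.
    doc.text = " ".join(doc.text.split())
    return doc


@register("tag_years")
def tag_years(doc: Document) -> Document:
    # Toy stand-in for entity recognition: collect four-digit tokens.
    tokens = [tok.strip(".,;") for tok in doc.text.split()]
    doc.annotations["years"] = [t for t in tokens if len(t) == 4 and t.isdigit()]
    return doc


def run_pipeline(doc: Document, tasks: List[str]) -> Document:
    """Apply the registered agents for the listed tasks, in order."""
    for task in tasks:
        doc = AGENTS[task](doc)
    return doc


doc = run_pipeline(
    Document("letter.txt", "Written  in 1848,\n revised in 1867."),
    ["normalize", "tag_years"],
)
print(doc.text)
print(doc.annotations["years"])
```

The point of the registry is extensibility: under these assumptions, a new tool only needs to be wrapped in a function and registered under a task name, and existing pipelines are unaffected.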