Document Management
Scope
- Document Management System (DMS) Project for scanning, storing, indexing, and sharing books.
- The DMS project is limited to scanned documents. Expanding it to include other digitally stored content is possible but outside the scope of this project.
Ownership and Copyright Issues
- Access to all copyrighted content is restricted to Accounts in the DMS. The DMS Accounts are the Owners of the content.
- Contemporary or possibly contested materials will require an additional Check-In/Check-Out protocol and content expiration for checked-out documents.
- Content explicitly requiring only a single owner is owned by the Account that checks out the document. The Check-out Account is the document's owner from check-out until check-in or content expiration.
- Anonymous access can be granted only to the overall Titles catalog and to documents whose licenses allow public access (e.g. Creative Commons, GPL).
- This DMS is not to be used to illegally store or transmit content.
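The single-owner check-out rule above can be illustrated with a minimal sketch. It assumes a loan period chosen at check-out; the class name, field names, and the use of a fixed expiration timestamp are hypothetical and not part of the project design.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class SingleOwnerDocument:
    """Hypothetical model of content that may have only one owner at a time."""
    title: str
    checked_out_by: Optional[str] = None
    expires_at: Optional[datetime] = None

    def current_owner(self, now: datetime) -> Optional[str]:
        """Return the Check-out Account, or None once checked in or expired."""
        if self.checked_out_by and self.expires_at and now < self.expires_at:
            return self.checked_out_by
        return None

    def check_out(self, account: str, now: datetime, loan: timedelta) -> bool:
        """Grant the document unless another Account's check-out is still active."""
        if self.current_owner(now) not in (None, account):
            return False
        self.checked_out_by = account
        self.expires_at = now + loan
        return True

    def check_in(self) -> None:
        """Release ownership before the content expires."""
        self.checked_out_by = None
        self.expires_at = None
```

In this sketch an expired check-out needs no explicit check-in: ownership simply lapses, and the next Account can check the document out.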
Proof of Concept
Physical Requirements
Bill of Materials
Item | Specification | Notes | Cost |
Document Scanner | Automatic Document Feeder (ADF), Duplex, and Flatbed | Epson GT3000? (looking into it) | Used but Functional or $2500 |
Binding Paper Cutter | 300-page capacity Guillotine Cutter | Band Saw (for proof of concept) or Stack Paper Cutter | New Blade for saw or $600 |
Belt Sander | 220 Grit? | For use if the Band Saw leaves a burr or there are other separation issues (spilled beer, etc.) | |
Input PC Workstation | 2.5 GHz, 2 GB RAM, FireWire or SCSI for Epson GT scanner, Windows 7 32-bit, Adobe Acrobat X Pro, 1 Gb Ethernet | 1280x1024 monitor or touch screen, keyboard, mouse, etc. | |
Input Operational Footprint | A sturdy induction workspace, 3'x5' minimum | 1/4 for Terminal, 1/4 for Document Prep, 1/2 for Scanner | |
DMS Server VM | LAMP or WIMP, 500 GB drive space (expandable VHD is OK) | | |
DMS Backup Cache VM | Windows or Linux, 500 GB drive space (expandable VHD is OK) | Different Host from DMS server | |
DMS Public Interface Server VM | LAMP or WIMP, minimal, for the web application that provides access to the documents over the Internet | If security requires this | |
Subversion Edge Server VM | LAMP, CollabNet Subversion Edge, Ubuntu 12.04 LTS (server), 120 GB (expandable VHD is OK), 2 GB RAM (1 GB will work if RAM is a constraint) | Can be used for the DMS and other software development projects at UAS | |
System Constraints
System constraints are requirements or conditions that are expected to limit the overall performance of the system. In this case the system constraints will limit the rate at which new documents can be added to the DMS. Exploiting these constraints will maximise the throughput of the DMS.
- Document Preparation
The ADF requires that the binding be removed and the pages manually separated prior to loading.
- Document Scanning
An operator must load the ADF, operate the scanner, and monitor its operation. The operator must be skilled enough to correct misfeeds and jams.
- Document Management
The scanner operator must identify the document and enter that information into the DMS. The operator must associate the scanned document file with its record in the DMS.
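As a rough illustration of why these constraints bound throughput: if the three steps are staffed separately and run as a pipeline, the sustained rate is set by the slowest step. The sketch below assumes exactly that; the rates are hypothetical placeholders, not measurements from this project, and a single operator performing all three steps in sequence would be slower still.

```python
def pipeline_throughput(stage_rates):
    """Return (bottleneck stage, sustained rate) for stages that run in series."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


# Hypothetical rates in books per hour; placeholders only.
rates = {"preparation": 4.0, "scanning": 6.0, "data entry": 10.0}
stage, rate = pipeline_throughput(rates)
print(f"bottleneck: {stage} at {rate} books/hour")
```

Under these assumed numbers, effort spent speeding up scanning or data entry would not raise throughput until document preparation is improved.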
Software
The architecture of the DMS will be Client-Server.
DMS client workstation performs document and data input
- Windows 7 32-bit operating system (32-bit for easier scanner driver support)
- Acrobat X Pro (current version) with the SDK for COM automation
- Visual Studio Professional (MSDN) for VB and C# DMS application development
- MySQL Client to interface with MySQL running on the DMS server.
DMS server provides a central database server and documents repository
- MySQL Server for storing Document Indexing, User Account, and other DMS information
- Repository File Server for storing PDF documents
Backup cache file server
- Maintain a backup of the documents
Because human operation is a constraint, as soon as a document is committed to the repository, a copy should be transmitted to a backup to minimise the possibility that a document would need to be re-scanned.
- Maintain a backup of the database
- Maintain a backup of the source code repository.
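The "back up as soon as a document is committed" requirement could look something like the following sketch. It assumes the repository and the backup cache are both reachable as directories (for example, mounted shares); the function names and the SHA-256 verification step are illustrative additions, not part of the stated design.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large scans do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def commit_document(scan: Path, repository: Path, backup: Path) -> str:
    """Copy a scan into the repository, then immediately to the backup cache."""
    expected = sha256_of(scan)
    for target_dir in (repository, backup):
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / scan.name
        shutil.copy2(scan, target)
        # Verify each copy so a bad transfer is caught while the pages are still at hand.
        if sha256_of(target) != expected:
            raise IOError(f"checksum mismatch after copying to {target}")
    return expected
```

Returning the checksum lets the caller store it alongside the document's database record, so later integrity checks do not require the original scan.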
Public facing server(s)
- Custom Web Application available to DMS accounts for accessing the documents
- Subversion Edge for software developers' version controlled source code repository
Note
- For proof of concept, some of the above servers can run on the same machine. Only the backup cache must be on a different physical machine from the main data.
- Technically, the scanner workstation does not store any critical data, so it does not need to be backed up, other than perhaps an image backup to ease rebuilding after a hardware failure or a catastrophic software update.
Scanner PC Client Software
General Requirements
- Connection to Network Shared File Storage for the PDF documents.
- Scanner Interface Drivers (e.g. TWAIN or vendor proprietary)
Optional Support Software
- Software Development Tools for the Scanner Application
- Database Management Tools
Adobe Acrobat Manual PDF Creation
- Import from Scanner Feature
Custom Application
- Automate Acrobat's Import from Scanner method to scan documents
- Transfer scanned documents to the server
- Associate Scanned Documents with Database Records
- Manage Document Lifecycle properties (e.g. expiration)
- Manage Document Security attributes (e.g. ownership, security level, access control)
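One way the custom application might associate a scanned file with a database record is sketched below. The table layout and column names are hypothetical, and Python's built-in sqlite3 module is used only as a self-contained stand-in for the MySQL server named elsewhere in this plan.

```python
import sqlite3

# Hypothetical schema: one index record per scanned PDF.
SCHEMA = """
CREATE TABLE documents (
    id             INTEGER PRIMARY KEY,
    title          TEXT NOT NULL,
    file_path      TEXT NOT NULL UNIQUE,  -- location of the PDF in the repository
    owner_account  TEXT,                  -- security attribute: ownership
    security_level INTEGER NOT NULL DEFAULT 0,
    expires_on     TEXT                   -- lifecycle: ISO date, NULL if no expiration
)
"""


def register_document(conn, title, file_path, owner=None, security_level=0, expires_on=None):
    """Insert one index record for a scanned PDF and return its id."""
    cursor = conn.execute(
        "INSERT INTO documents (title, file_path, owner_account, security_level, expires_on) "
        "VALUES (?, ?, ?, ?, ?)",
        (title, file_path, owner, security_level, expires_on),
    )
    conn.commit()
    return cursor.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
doc_id = register_document(conn, "Example Title", "repository/example-title.pdf", owner="alice")
```

The UNIQUE constraint on file_path is a design choice in this sketch: it prevents two records from pointing at the same repository file.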
DMS Server Software
- Store and secure document files
- Database for indexing documents
- Provide a Public Interface for authenticating users and processing their requests for documents.
- Assign Security to documents and transmit them
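The request-processing rule implied by the ownership policy (anonymous users see only publicly licensed documents, everything else requires an Account) might be sketched as follows. The license labels, function name, and parameters are hypothetical.

```python
from typing import Optional

# Hypothetical license labels; the policy names Creative Commons and GPL as examples.
PUBLIC_LICENSES = {"creative-commons", "gpl"}


def may_access(account: Optional[str], license_name: str, owner: Optional[str] = None) -> bool:
    """Decide whether a request for a document should be served."""
    if license_name in PUBLIC_LICENSES:
        return True   # anonymous access is allowed for publicly licensed documents
    if account is None:
        return False  # copyrighted content requires a DMS Account
    # Respect an active single-owner check-out, if there is one.
    return owner is None or owner == account
```

For example, `may_access(None, "gpl")` is true, while `may_access("bob", "copyrighted", owner="alice")` is false because another Account currently holds the document.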
IMPLEMENTATION PHASE 1
Prepare Scanner Workstation and Document Server for Development
(TBD) = To Be Determined
Scanner Workstation Setup
- 1. PC Workstation, Windows 7 Pro/32, install updates, test
- 2. Document Scanner, FireWire or SCSI, install drivers, test
- 3. Install Acrobat, test Import from Scanner
Scanner Workstation Development Environment Setup
- 1. Install Microsoft .NET 4 extended (from the optional updates)
- 2. Install Visual Studio 2010 Pro
- 3. Install SQL Server Express 2008 Management Studio Express
Document Server Setup
- 1. Server VM, Windows Server 2008 R2 Std, install updates, test
- 2. Install File Services Feature, Configure Share for Documents Repository
Note: The only systems that should be able to access the Repository are the Server and Scanner Workstation.
- 3. Configure Scanner Workstation to access the Document Repository Share, test
- 4. Enable Remote Desktop for administrative and developer access through RDP client.
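The "test" in step 3 (Scanner Workstation access to the Document Repository Share) could be automated with a small probe like the one below. This is only a sketch: the share path in the comment is a hypothetical example, and the check covers read/write access, not the restriction that other systems must be denied.

```python
import os
import uuid
from pathlib import Path


def can_write_to_share(share: Path) -> bool:
    """Write, read back, and delete a small probe file on the share."""
    probe = share / f"dms-probe-{uuid.uuid4().hex}.tmp"
    payload = os.urandom(32)
    try:
        probe.write_bytes(payload)
        return probe.read_bytes() == payload
    except OSError:
        return False
    finally:
        try:
            probe.unlink()
        except OSError:
            pass  # nothing to clean up, or the share is unreachable


# Hypothetical usage on the workstation:
# can_write_to_share(Path(r"\\dms-server\documents"))
```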
Document Server Development Environment Setup
- 1. Install Microsoft .NET 4 extended (from the optional updates)
- 2. Install IIS Role with ASP.NET extension and other options (TBD)
- 3. Install Visual Studio 2010 Pro with SQL Server Express 2008. Include Visual Basic, C#, and C
- 4. Install SQL Server Express 2008 Management Studio Express
- 5. Install Acrobat