Advances in the electronic environment have led to a pronounced increase in the amount of classified information being produced. The staggering volume of these records and the scarcity of resources make their eventual human review for declassification impossible. Human review as it is done today is estimated to require two full-time employees (FTEs) per gigabyte. At one intelligence agency alone, the growth of classified records is approximately 1 petabyte (1 million gigabytes, ~49 million cubic feet of paper) every 18 months. The Government cannot dedicate 2 million FTEs a year to review 1 petabyte, much less over 20 million FTEs a year to review the tens of petabytes of classified records being created across the Government.
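The workload figures above follow from simple multiplication, sketched here. The sketch assumes the conversion stated above (1 petabyte = 1 million gigabytes) and takes 10 petabytes as an illustrative lower bound for "tens of petabytes"; the function name ftes_required is hypothetical.

```python
# Figures taken from the text above.
GB_PER_PB = 1_000_000   # 1 petabyte = 1 million gigabytes
FTE_PER_GB = 2          # estimated human review cost per gigabyte


def ftes_required(petabytes: float) -> int:
    """Return the estimated number of FTEs needed to review the given volume."""
    return round(petabytes * GB_PER_PB * FTE_PER_GB)


one_agency = ftes_required(1)        # one agency's growth over 18 months
government_wide = ftes_required(10)  # assumed lower bound for "tens of petabytes"

print(f"1 PB:  {one_agency:,} FTEs")
print(f"10 PB: {government_wide:,} FTEs")
```

Any volume above 10 petabytes pushes the estimate past 20 million FTEs, which is the threshold the text cites.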
A Technological Solution for Both Declassification and Classification
Technology can be employed to address the challenges of mass declassification in a more accurate, cost-effective, and efficient manner. Existing technologies such as information retrieval tools, natural language processing, optical character recognition software, predictive analytics, and cloud computing can serve as a foundation for future innovation, but the Public Interest Declassification Board (PIDB) believes the most necessary component of a new system will be a robust context accumulation capability.
Context accumulation is a means by which computers predict classification and declassification dispositions. Human inputs, either as a priori rules or as individual decisions based on classification and declassification guidance, direct the process. The system ingests the decisions of human reviewers to classify or declassify (in full or in part) pieces, categories, or associations of information, using reviewers’ determinations as the basis for future, automated decisions. The greater the body of knowledge (e.g., reviewer decisions, classification and declassification guides, previously released documents, open source material) ingested into the system, the better the predictions the computer would generate.
Based on these data points, the computer learns how to sort information into release and withholding bins. In instances of conflicting context, the system would require human input by reviewers and subject matter experts. The decisions of these individuals would train the system to better sort information. As the system learns through more human input, its declassification review ability will evolve. For those areas in which the system’s aptitude is sufficiently advanced, meticulous human review will no longer be necessary. In time, reviewers will be able to focus exclusively on evaluating those pieces of information identified by the system as posing unique challenges. Reviewers’ decisions on the rationale for the withholding or declassifying of this information will provide context to the system to address and sort all other appearances of that information. As more of these review precedents are established, the volume of records reviewable by the system will continue to increase.
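The feedback loop described above can be illustrated with a deliberately small sketch. It is not a description of any existing system: the class name ContextAccumulator, the thresholds min_decisions and min_agreement, and the use of a simple tally in place of a trained model are all assumptions made for illustration. Reviewer decisions are recorded per piece of information; the system predicts a disposition only when it has enough consistent precedent, and otherwise routes the item to a human.

```python
from collections import Counter, defaultdict

RELEASE = "release"
WITHHOLD = "withhold"
NEEDS_REVIEW = "needs_human_review"


class ContextAccumulator:
    """Toy precedent store: tallies reviewer decisions per item of information."""

    def __init__(self, min_decisions: int = 3, min_agreement: float = 0.9):
        # Both thresholds are hypothetical tuning knobs, not values from the text.
        self.min_decisions = min_decisions
        self.min_agreement = min_agreement
        self._history = defaultdict(Counter)

    def record_decision(self, item: str, decision: str) -> None:
        """Ingest one human reviewer's determination for an item."""
        if decision not in (RELEASE, WITHHOLD):
            raise ValueError(f"unknown decision: {decision!r}")
        self._history[item][decision] += 1

    def predict(self, item: str) -> str:
        """Sort an item into a bin, or defer to a human when context is thin or conflicting."""
        counts = self._history.get(item)
        if not counts:
            return NEEDS_REVIEW
        total = sum(counts.values())
        decision, votes = counts.most_common(1)[0]
        if total < self.min_decisions or votes / total < self.min_agreement:
            return NEEDS_REVIEW
        return decision


acc = ContextAccumulator()
for _ in range(3):
    acc.record_decision("1950s logistics schedule", RELEASE)
acc.record_decision("source identity", WITHHOLD)
acc.record_decision("source identity", RELEASE)

print(acc.predict("1950s logistics schedule"))  # consistent precedent
print(acc.predict("source identity"))           # conflicting context
print(acc.predict("never seen before"))         # no precedent
```

In this sketch only the first item has enough agreeing precedent to be sorted automatically; the other two are deferred to reviewers, whose decisions would then be recorded and widen what the system can handle on its own.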
Because a context accumulation tool ingests both declassification and classification guidance, any such system could serve concurrently as an automated classification tool. Allowing computers to classify information minimizes the user burden. Moreover, automating classification reduces the potential for over-classification by ensuring that classification determinations are made in the strictest accordance with current policy and only in appropriate circumstances. The rationale for classification determinations would be digitally imprinted on documents, creating metadata which the system could later use to locate and declassify this information as policy guidance changes.
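A minimal sketch of how imprinted rationale metadata might be used when guidance changes follows. The record layout, the field name guidance_item, and the identifiers shown are hypothetical; no actual marking standard or guide numbering is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassifiedPortion:
    """One classified portion of a document, tagged with the reason it was classified."""
    document_id: str
    portion_id: str
    guidance_item: str  # hypothetical identifier of the guide entry that justified classification


def portions_to_revisit(portions, rescinded_items):
    """Return the portions whose classification rationale has since been rescinded."""
    return [p for p in portions if p.guidance_item in rescinded_items]


holdings = [
    ClassifiedPortion("DOC-001", "para-2", "GUIDE-A/item-4"),
    ClassifiedPortion("DOC-001", "para-5", "GUIDE-B/item-1"),
    ClassifiedPortion("DOC-002", "para-1", "GUIDE-A/item-4"),
]

# Suppose policy changes and GUIDE-A/item-4 no longer justifies classification.
for portion in portions_to_revisit(holdings, {"GUIDE-A/item-4"}):
    print(portion.document_id, portion.portion_id)
```

Because each portion carries its own rationale, a change to one guidance entry identifies exactly the affected portions without re-reading the documents.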
The accuracy and consistency afforded by this system will enhance information security and thus national security.
Approaches to Context Accumulation
A context accumulation system can be implemented in two ways. Policy guidance (“rules”) can be input at the outset and used as the basis for initial classification decisions, or decision-making criteria can be based entirely on the ingestion of documents whose classification status is then determined by individual reviewers based on existing policy guidance. Assigning rules at the outset entails a potentially protracted battle over classification standards but ensures that the computer begins with standardized criteria. Developing rules organically based on the ingestion of documents increases the likelihood of poor review due to human error but mitigates intra- or interagency disputes over classification guidance and allows for quicker implementation.
Employing these technologies, and context accumulation tools in particular, would:
- Improve consistency and accuracy of classification and declassification decisions.
- Minimize instances of over-classification by automating routine classification and consistently aligning classification decisions with established guidance.
- Facilitate the immediate implementation of changes in classification and declassification guidance.
- Reduce the administrative burden on declassification reviewers, allowing them to focus exclusively on the rationale for declassification or classification.
- Audit individual human inputs to measure reviewer performance and identify areas for improvement.
- Identify to cleared users exactly what information is and is not classified or available through open sources.
- Reinforce classification standards in real time as documents are created in the classified environment.
- Replace document-level, pass/fail reviews with redaction-level reviews, significantly increasing the volume and quality of material declassified.
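The difference between document-level and redaction-level review in the last item can be shown with a short sketch. The sample text, the flag function, and the "[REDACTED]" marker are invented for illustration; a real sensitivity test would come from the context accumulation system rather than a keyword match.

```python
REDACTION_MARK = "[REDACTED]"


def document_level_review(portions, is_sensitive):
    """Pass/fail: release the whole document only if no portion is sensitive."""
    if any(is_sensitive(p) for p in portions):
        return []  # the entire document is withheld
    return list(portions)


def redaction_level_review(portions, is_sensitive):
    """Release every portion, replacing only the sensitive ones with a marker."""
    return [REDACTION_MARK if is_sensitive(p) else p for p in portions]


document = [
    "The convoy departed on schedule.",
    "The report was supplied by a confidential source.",
    "Weather delayed the return trip.",
]


def flag(portion: str) -> bool:
    # Hypothetical stand-in for a real sensitivity determination.
    return "confidential source" in portion.lower()


print(document_level_review(document, flag))
print(redaction_level_review(document, flag))
```

Under pass/fail review, one sensitive sentence withholds all three; under redaction-level review, two of the three are released.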
Technology and the National Declassification Center (NDC)
A research laboratory could be created within the NDC for launching and evaluating pilot projects that incorporate these technologies. Digital collections of historically significant records could be used to field test a new system (based on either of the two approaches outlined above). The NDC provides an ideal interagency environment in which to share classification guidance and draw on the expertise of subject matter experts. Successful projects could encourage future interagency cooperation and innovation in this area.