Article | Proceedings of the 2nd International Conference on Historical Cryptology, HistoCrypt 2019, June 23-26, 2019, Mons, Belgium | Using the Entropy of N-Grams to Evaluate the Authenticity of Substitution Ciphers and Z340 in Particular Linköping University Electronic Press Conference Proceedings
Göm menyn

Title:
Using the Entropy of N-Grams to Evaluate the Authenticity of Substitution Ciphers and Z340 in Particular
Author:
Tom S. Juzek: Saarland University, Campus A2.2, Rl.25, 66123 Saarbrilcken, Germany
Download:
Full text (pdf)
Year:
2019
Conference:
Proceedings of the 2nd International Conference on Historical Cryptology, HistoCrypt 2019, June 23-26, 2019, Mons, Belgium
Issue:
158
Article no.:
013
Pages:
117-125
No. of pages:
9
Publication type:
Abstract and Fulltext
Published:
2019-06-12
ISBN:
978-91-7685-087-9
Series:
Linköping Electronic Conference Proceedings
ISSN (print):
1650-3686
ISSN (online):
1650-3740
Series:
NEALT Proceedings Series
Publisher:
Linköping University Electronic Press, Linköpings universitet


Export in BibTex, RIS or text

The present paper uses information theoretic entropy as a means to evaluate the authenticity of homophonic substitution ciphers. We motivate the use of entropy on n-grams and then validate its applicability, by using it on various true ciphers and pseudo-ciphers. Differences in entropy allow us to apply further formal analyses, e.g. support-vector machines, in order to make predictions about a potential cipher’s status. We train several support-vector machines and validate them. We then apply the models to two classic ciphers, the Zodiac Killer’s first major cipher, z408, which has been solved, and his second cipher, z340, which remains unsolved. The models correctly identify z408 as a substitution cipher. z340 is classified as an advanced cipher or pseudo-cipher.

Keywords: cryptography cryptology substitution ciphers entropy n-gram-based analysis support-vector machines z340

Proceedings of the 2nd International Conference on Historical Cryptology, HistoCrypt 2019, June 23-26, 2019, Mons, Belgium

Author:
Tom S. Juzek
Title:
Using the Entropy of N-Grams to Evaluate the Authenticity of Substitution Ciphers and Z340 in Particular
References:

Asa Ben-Hur, David Hom, Hava T. Siegelmann, and Vladimir Vapnik. 2002. Support vector clustering. Journal of Machine Learning Research, 2: 125-137.

Eric Corlett and Gerald Penn. 2010. An exact a* method for deciphering letter-substitution ciphers. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, ACL ’10, pages 1040---1047, Stroudsburg, PA, USA. Association for Computational Linguistics.

Edward W. Forgy. 1965. Cluster analysis of multivariate data: Efficiency versus interpretability of classification. Biometrics, 21(3):768-769.

Robert Graysmith. 1987. Zodiac. Berkley, New York City, NY.

Jeffrey Hoffstein, Jill Pipher, and Joseph H. Silverman. 2008. An Introduction to Mathematical Cryptography. Springer Publishing Company, Incorporated, 1 edition.

Dan Jurafsky, 2019. Lecture CS 124: From Languages to Information. Introduction to N-grams lecture notes. http://web.stanford.edu/class/cs124/.

Kevin Knight, 2013. Decipherment Tutorial. Workshop at the 20!3 Annual Meeting of the Association for Computational Linguistics - lecture notes. https://kevincrawfordknight.github.io/extra/acl-tutorial-13-decipher-final.pdf.

Stuart P. Lloyd. 1982. Least squares quantization in pcm. IEEE Transactions on Information Theory, 28:129-137.

Project Gutenberg, 2018. Project Gutenberg. http://www.gutenberg.org.

Python Software Foundation, 2018. Python: A dynamic, open source programming language. https://www.python.org/.

Sujith Ravi and Kevin Knight. 2008. Attacking decipherment problems optimally with low-order ngram models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP ’08, pages 812-819, Stroudsburg, PA, USA. Association for Computational Linguistics.

Sujith Ravi and Kevin Knight. 2011. Bayesian inference for zodiac and other homophonic ciphers. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT ’11, pages 239-247, Stroudsburg, PA, USA. Association for Computational Linguistics.

Claude Elwood Shannon. 1948. A mathematical theory of communication. The Bell System Technical Journal, 27(3):379-423, 7.

The Wikimedia Foundation, 2018a. Wikipedia, the free encyclopedia. https://wikipedia.org/

The Wikimedia Foundation, 2018b. Wikisource, the free library. https://en.wikisource.org/wiki/Author:Zodiac_Killer#Letters.

Zodiackillerciphers.com, 2012. Annotated solution to the 408 cipher, based on the Harden worksheets. http://zodiackillerciphers.corn/408/key.html#1.

Proceedings of the 2nd International Conference on Historical Cryptology, HistoCrypt 2019, June 23-26, 2019, Mons, Belgium

Author:
Tom S. Juzek
Title:
Using the Entropy of N-Grams to Evaluate the Authenticity of Substitution Ciphers and Z340 in Particular
Note: the following are taken directly from CrossRef
Citations:
No citations available at the moment


Responsible for this page: Peter Berkesand
Last updated: 2019-10-02