As we've discussed here and on the podcast, I'd like to use document retrieval and natural language processing ("big data") technology to prove that the "Flavian Signature" is an objective, verifiable fact that can be statistically validated. The idea is to use passages from Luke and the other gospels as a kind of "search query", and look for sections in Josephus' Wars that the document comparison / retrieval algorithm judges to be good matches. Hopefully, the linear sequence of matches will emerge from this process, creating a statistically verifiable link between the Gospels and JW. Also, at the level of individual passages and paragraphs, we can get an idea of the objective statistical strength of each parallel.
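A minimal sketch of how the "linear sequence" idea could be checked, assuming each Gospel passage has already been assigned the position of its best-matching section in JW. It counts how many pairs of passages keep the same relative order in both texts, then shuffles the positions many times to see how often chance alone does as well. The function names and the position numbers are made up for illustration, and a permutation test is only one possible choice here, not a settled method.

```python
import random

def count_concordant_pairs(positions):
    """Count pairs (i, j), i < j, whose matched positions are also in increasing order."""
    n = len(positions)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if positions[i] < positions[j]
    )

def order_permutation_test(positions, trials=10000, seed=0):
    """Estimate how often a random shuffle is at least as ordered as the observed sequence."""
    rng = random.Random(seed)
    observed = count_concordant_pairs(positions)
    shuffled = list(positions)
    at_least_as_ordered = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if count_concordant_pairs(shuffled) >= observed:
            at_least_as_ordered += 1
    # Add-one smoothing so the estimate is never exactly zero.
    return observed, (at_least_as_ordered + 1) / (trials + 1)

# Hypothetical data: for each Gospel passage, taken in Gospel order,
# the index of its best-matching section in Josephus.
example_positions = [12, 40, 38, 95, 130, 210, 205, 400, 512, 640]
observed, p_value = order_permutation_test(example_positions)
print(observed, p_value)
```

A small p-value here would only say that the ordering is unlikely under random shuffling. It says nothing about whether the individual matches were picked fairly in the first place, so the match selection has to be automatic and fixed before the ordering is tested.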
So far, what I've been able to accomplish is to download some tools from the Internet: a trained set of "Context Vectors" from Google Research, and utility packages known as 'NLTK' (Natural Language Toolkit) and 'Gensim', an open source document retrieval system by Radim Rehurek. This software is in Python, so I have a learning curve: in the past, I've used C++ and Tcl/Tk for doing this sort of work, but use of those languages is fading fast, while Java and Python are in the ascendancy. Python is conceptually pretty similar to Tcl/Tk: interpreted, dynamically typed, and syntactically very compact. But I'm finding it to be a huge improvement in terms of the readability of the code, and I feel that Tcl/Tk has been defeated by a worthy opponent. Learning Python is, I think, time well spent.
Also, I've obtained the texts of the KJV Bible and all of Josephus' works (Whiston translation) in plaintext (UTF-8) format from gutenberg.org, and written some glue code to parse the texts and return the chapter and verse sections as documents.
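To illustrate the kind of glue code involved, here is a rough sketch that splits a plain-text book into one document per verse. It assumes each verse begins with a "chapter:verse" marker such as "23:1"; the actual layout of the downloaded files may differ, and the names `VERSE_MARKER` and `split_into_verses` are invented for this example.

```python
import re

# Assumed layout: each verse starts with a "chapter:verse" marker such as "23:1".
VERSE_MARKER = re.compile(r"(\d+):(\d+)\s+")

def split_into_verses(book_name, text):
    """Return a dict mapping (book, chapter, verse) to the verse text."""
    documents = {}
    markers = list(VERSE_MARKER.finditer(text))
    for index, marker in enumerate(markers):
        start = marker.end()
        # A verse runs until the next marker, or to the end of the text.
        end = markers[index + 1].start() if index + 1 < len(markers) else len(text)
        key = (book_name, int(marker.group(1)), int(marker.group(2)))
        # Collapse line breaks and repeated spaces left over from the plain-text wrap.
        documents[key] = " ".join(text[start:end].split())
    return documents

sample = (
    "23:1 Then spake Jesus to the multitude, and to his disciples,\n"
    "23:2 Saying, The scribes and the Pharisees sit in Moses' seat:\n"
)
verses = split_into_verses("Matthew", sample)
print(verses[("Matthew", 23, 2)])
```

Josephus would need a different pattern keyed on book, chapter and section headings, but the same split-at-markers approach should carry over.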
Finally, I've written some code that can compare two documents on a "bag of words" basis, word for word, looking for statistically significant word matches. Words are considered "matched" either if they are identical, or if they are judged similar based on "context vector" cosine distance. Matches are evaluated using a modified TF-IDF function. I tried this out using Matt. 23:1-25:1 as the "search query" across Josephus' Wars, parsed into 824 "documents" (by Whiston book, chapter and section numbers, but not verse numbers). The Flavian Signature match for this pericope is considered to be JW Book VI, Chapter 5, sections 1-3 and extending into the first verse of VI.6.1 (verses 271-316), according to Giles' analysis, which is similar to but not exactly the same as Joe's. See:
http://postflaviana.org/community/i...city-destruction-of-the-temple-doomsday.1492/
for the color coded match analysis. I feel this is the most powerful of any of our parallels, which is why I started here.
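For anyone who wants to see the general shape of the matching step described above, here is a toy sketch. It is not the actual scoring code: it uses a plain smoothed IDF weight rather than the modified TF-IDF function, three-dimensional made-up vectors in place of the trained context vectors, and an arbitrary similarity threshold of 0.8. The section labels "A" and "B" and all function names are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def inverse_document_frequency(documents):
    """Smoothed IDF: words that are rare across the collection get larger weights."""
    total = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {word: math.log((1 + total) / (1 + count)) + 1.0
            for word, count in doc_freq.items()}

def match_score(query, document, idf, vectors, threshold=0.8):
    """Sum IDF-weighted credit for each distinct query word matched in the document."""
    doc_words = set(document)
    score = 0.0
    for word in set(query):
        if word in doc_words:
            similarity = 1.0  # identical word
        else:
            # Otherwise take the closest document word by vector similarity.
            similarity = max(
                (cosine(vectors[word], vectors[other])
                 for other in doc_words
                 if word in vectors and other in vectors),
                default=0.0,
            )
        if similarity >= threshold:
            # Words unseen in the collection fall back to a weight of 1.0 (an assumption).
            score += similarity * idf.get(word, 1.0)
    return score

# Toy vectors and sections, invented for illustration only.
vectors = {
    "temple": [0.9, 0.1, 0.0],
    "sanctuary": [0.8, 0.2, 0.1],
    "fire": [0.0, 0.9, 0.2],
    "bread": [0.1, 0.0, 0.9],
}
sections = {
    "A": ["the", "sanctuary", "was", "set", "on", "fire"],
    "B": ["they", "ate", "the", "bread"],
}
query = ["temple", "fire"]
idf = inverse_document_frequency(list(sections.values()))
ranked = sorted(sections,
                key=lambda name: match_score(query, sections[name], idf, vectors),
                reverse=True)
print(ranked)
```

In this toy case "temple" gets credit against "sanctuary" through the vectors while "fire" matches exactly, so section "A" ranks first. Changing the threshold or the weighting changes the ranking, so the scoring choices need to be fixed in advance.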
The algorithm found JW VI.5.2 and VI.5.3 in the top 5 matches, with VI.5.3 at #2. (Actually, by tweaking the scoring system, it's possible to get various other results.) The top winner by this scoring formula was JW V.10.5, which reads:
It is therefore impossible to go distinctly over every instance
of these men's iniquity. I shall therefore speak my mind here at once
briefly:--That neither did any other city ever suffer such miseries,
nor did any age ever breed a generation more fruitful in wickedness than
this was, from the beginning of the world. Finally, they brought
the Hebrew nation into contempt, that they might themselves appear
comparatively less impious with regard to strangers. They confessed
what was true, that they were the slaves, the scum, and the spurious
and abortive offspring of our nation, while they overthrew the city
themselves, and forced the Romans, whether they would or no, to gain a
melancholy reputation, by acting gloriously against them, and did almost
draw that fire upon the temple, which they seemed to think came too
slowly; and indeed when they saw that temple burning from the upper
city, they were neither troubled at it, nor did they shed any tears on
that account, while yet these passions were discovered among the Romans
themselves; which circumstances we shall speak of hereafter in their
proper place, when we come to treat of such matters.