31 January 2022

Literal matches: minimal uniquity and maximal extension

Let us consider the raw Greek texts of the Old and New Testaments. Each biblical book will be treated as a separate text, without spaces and punctuation, and each psalm will be handled internally as a book of its own (see Acts 13:33, which refers to Psalm 2). Otherwise no chapters or verses will be numbered: there is no numbering in the Greek manuscripts either. Even if the Hebrew original was indeed split into passages in some books, the Greek text lost all such special features (this includes Psalms 119 and 145, some parts of Lamentations, and the last verses of Proverbs). We will not distinguish between the Greek characters σ and ς (there is no difference between them in the manuscripts). As a result we work with 24 letters: each book is a stream of 24 letters and carries no other information.

For convenience we will always convert the Greek letters to Latin characters. This speeds up some algorithmic steps because the Latin alphabet is simpler to handle in computer programs. Most Greek characters have a natural counterpart among the Latin characters. Let's give it a short try:

In bibref, enter the command text1 αβγδεζηθικλμνξοπρστυφχψω: it will copy the given text to the first clipboard and inform you that the text is stored internally as abgdeqhuiklmnjoprstyfxcv.
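The conversion can be imitated in a few lines of Python. This is only a sketch, not the code of bibref: the letter table is read off the alphabet example above, while the function name `to_latin` and the decision to drop accents and every non-letter are our own assumptions.

```python
import unicodedata

# Letter table taken from the alphabet example in the text.
GREEK = "αβγδεζηθικλμνξοπρστυφχψω"
LATIN = "abgdeqhuiklmnjoprstyfxcv"
TO_LATIN = dict(zip(GREEK, LATIN))
TO_LATIN["ς"] = "s"  # final sigma is not distinguished from σ


def to_latin(greek_text: str) -> str:
    """Convert Greek text to the 24-letter Latin form (hypothetical helper)."""
    # Lowercase, then decompose accented letters so that the diacritics
    # become separate combining marks, which are simply skipped below.
    decomposed = unicodedata.normalize("NFD", greek_text.lower())
    # Spaces, punctuation and diacritics are not in the table and are dropped.
    return "".join(TO_LATIN[ch] for ch in decomposed if ch in TO_LATIN)


print(to_latin("αβγδεζηθικλμνξοπρστυφχψω"))  # abgdeqhuiklmnjoprstyfxcv
print(to_latin("ιησου χριστου"))             # ihsoyxristoy
```

The second call reproduces the Latin form ihsoyxristoy that is used for searching below.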

We can check whether the stored text is present in the Bible. After issuing the command find1 LXX or find1 SBLGNT the answer is clear: no. A full list of the ordered Greek alphabet does not occur in the texts. But we can surely ask whether the words “Ιησου χριστου” (Jesus Christ) appear in the New Testament: first we store the text in the second clipboard, either in Greek or in its Latin form, for example with the command latintext2 ihsoyxristoy, and then start a search with find2 SBLGNT. You will find 106 occurrences, but this is, of course, just a part of the story, because the word “Jesus” also appears on its own (913 occurrences in total), and so does “Christ” (249 occurrences in total). By the inclusion–exclusion principle, which subtracts the 106 joint occurrences that would otherwise be counted twice, we may conclude that there are \(913+249-106=1056\) explicit mentions of Jesus Christ in the New Testament.
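The counting argument can be tried out on a toy text. The sketch below is not bibref code: `count_occurrences` and `mentions` are hypothetical helpers, and the toy string is made up so that it contains one joint mention, one separate “Jesus” and one separate “Christ”.

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count all occurrences of pattern in text, overlapping ones included."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def mentions(total_a: int, total_b: int, joint: int) -> int:
    """Inclusion-exclusion: joint occurrences are counted once, not twice."""
    return total_a + total_b - joint


# Made-up toy text in Latin form: "Jesus Christ", then "Jesus", then "Christ".
toy = "ihsoyxristoy" + "kai" + "ihsoy" + "kai" + "xristoy"

jesus = count_occurrences(toy, "ihsoy")
christ = count_occurrences(toy, "xristoy")
both = count_occurrences(toy, "ihsoyxristoy")

print(mentions(jesus, christ, both))  # 3 mentions in the toy text
print(mentions(913, 249, 106))        # 1056, the figure quoted above
```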

Interestingly enough, the Old Testament also contains the words “Jesus” (275 times, though used differently, for example for another person, Joshua) and “Christ” (12 times), but “Jesus Christ” never appears. The revelation of this piece of information is delayed until the New Testament.

Quotations, in general, appear in texts that were written later than the quoted text. In most cases we assume that the quoted text is unique, otherwise it is difficult to identify where the quoted text comes from. For a longer quotation it is usually unequivocal that the quoted text comes from a certain book, but there can be exceptions. The longer the text, the easier the identification of its source.

To be as precise as possible, we will call a passage from the New Testament a quotation, and its source in the Old Testament the quoted text. There can be multiple quotations of the same quoted text. For each quotation we expect an introduction, either an explicit one or some implicit indication that a quotation follows. In many cases the source of the quoted text is explicitly given in the introduction, but in other cases there is no exact source given.

Some quoted texts in the Old Testament are not completely unique. Such passages are, for example, the Ten Commandments and some parts of the Psalms. In such cases we make the connection with the first appearance of the quoted text: for example, for the Ten Commandments we always choose the passages from Exodus and not from Deuteronomy.

Some quotations in the New Testament are repeated. For example, Romans 4:9 repeats the quoted text that already appears in Romans 4:3. In such cases we acknowledge the fact of a quotation, but we do not count it as an extra one, only as a “repeated quotation”. We do not show these repetitions in the diagrams either.

An example of two quotations of the same quoted text

Let us consider Deuteronomy in LXX. It contains 108557 Greek characters. We search for the passage “Ουκ εκπειρασεις κυριον τον θεον σου” (“Thou shalt not tempt the Lord thy God”) from Deuteronomy 6:16, which is expected to be quoted in Luke 4:12. We issue the command lookup1 LXX Deuteronomy 6:16 6:16-33, where the suffix -33 drops the last 33 characters of the verse, and this stores the Greek text ουκεκπειρασειςκυριοντονθεονσου in clipboard 1. The stored text begins at position 23212 and has a length of 30 characters. It is unique in LXX, which we can check with the command find1 LXX. Finally, we can obtain the length of the passage with the command length1.
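The effect of the offsets in such a lookup can be sketched as plain string slicing. The helpers `normalize` and `lookup` are hypothetical and do not show how bibref works internally; the unaccented verse text is typed in by hand here and is an assumption about the underlying LXX edition.

```python
def normalize(greek_text: str) -> str:
    """Drop spaces and treat final sigma as sigma, as described earlier."""
    return greek_text.replace(" ", "").replace("ς", "σ")


def lookup(verse: str, skip: int = 0, drop: int = 0) -> str:
    """Return the verse without its first `skip` and last `drop` letters."""
    letters = normalize(verse)
    return letters[skip:len(letters) - drop]


# Deuteronomy 6:16 without accents, typed in by hand (an assumption).
DEUT_6_16 = ("ουκ εκπειρασεις κυριον τον θεον σου "
             "ον τροπον εξεπειρασασθε εν τω πειρασμω")

passage = lookup(DEUT_6_16, drop=33)  # mimics "6:16 6:16-33"
print(passage)
print(len(passage))
```

With this verse text, dropping the last 33 letters leaves exactly the 30 letters of the quoted passage.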

Our aim is to find this quoted text in the New Testament. As a candidate, we may consider Luke, which contains 94627 Greek characters. The command find1 SBLGNT will, however, find two matches, one in Matthew and one in Luke. Both results seem like good candidates, and by reading the Greek text we find their introductions as well (“Παλιν γεγραπται”, “it is written again”, in Matthew, and “Ειρηται”, “it is said”, in Luke). So we identify both Matthew 4:7 and Luke 4:12 and their near neighborhood as verified quotations and visualize them with the following diagrams:



Extension of a match

Now let us consider another example from Psalm 91, the passage “τοις αγγελοις αυτου εντελειται περι σου του διαφυλαξαι σε” (“he shall give his angels charge over thee, to keep thee”), which has a length of 49 characters. This text is unique in LXX and matches the passage in Luke 4:10. On the other hand, the text can be extended to a longer match by adding three more letters before the word τοις at the beginning. To find the longest possible extension we can use the bibref command extend:
  1. lookup1 LXX Psalms 91:11+3 91:11-20
  2. find1 LXX
  3. find1 SBLGNT
  4. extend LXX SBLGNT Luke 4:10+15 4:10
We will call the text obtained in the last step the maximal extension of a quoted passage.
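The idea behind such an extension can be sketched as follows: starting from a literal match, we step to the left and to the right as long as the two texts agree. This is a minimal sketch and not the implementation in bibref; `extend_match` is a hypothetical helper, and the two verses are typed in by hand without accents, so only extensions inside these single verses can show up.

```python
def normalize(greek_text: str) -> str:
    """Drop spaces and treat final sigma as sigma."""
    return greek_text.replace(" ", "").replace("ς", "σ")


def extend_match(a: str, i: int, b: str, j: int, length: int):
    """Grow the common part a[i:i+length] == b[j:j+length] in both directions.

    Returns the new start in a, the new start in b and the new length.
    """
    assert a[i:i + length] == b[j:j + length], "not a literal match"
    # Step to the left while the preceding letters agree.
    while i > 0 and j > 0 and a[i - 1] == b[j - 1]:
        i, j, length = i - 1, j - 1, length + 1
    # Step to the right while the following letters agree.
    while (i + length < len(a) and j + length < len(b)
           and a[i + length] == b[j + length]):
        length += 1
    return i, j, length


# Verse texts typed in by hand (assumptions about the editions used).
psalm = normalize("οτι τοις αγγελοις αυτου εντελειται περι σου "
                  "του διαφυλαξαι σε εν πασαις ταις οδοις σου")
luke = normalize("γεγραπται γαρ οτι τοις αγγελοις αυτου εντελειται "
                 "περι σου του διαφυλαξαι σε")
seed = normalize("τοις αγγελοις αυτου εντελειται περι σου του διαφυλαξαι σε")

start_a, start_b, length = extend_match(
    psalm, psalm.find(seed), luke, luke.find(seed), len(seed))
print(psalm[start_a:start_a + length], length)
```

On these two verses the 49-letter seed grows by the three letters of οτι on the left, which agrees with the remark above.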

Minimally unique texts in the Old Testament

Finally, we have a look at Psalm 82:6. It is well known that its beginning is quoted in the last part of John 10:34: “εγω ειπα θεοι εστε” (“I said, Ye are gods”). This text is unique in LXX, but the last two words alone are not unique, because they are also present at the end of Isaiah 41:23:
  1. lookup1 LXX Psalms 82:6+7 82:6-20
  2. find1 LXX
We get two occurrences, one in Psalm 82:6 and one in Isaiah 41:23. This suggests introducing the notion of a minimally unique passage in the Old Testament. If we prepend another letter to these two words, we get a unique passage:
  1. lookup1 LXX Psalms 82:6+6 82:6-20
  2. find1 LXX
That is, the passage αθεοιεστε is unique. But, interestingly, if we remove the last 3 characters, we still have a unique match:
  1. lookup1 LXX Psalms 82:6+6 82:6-23
  2. find1 LXX
This is not true anymore if we remove another character from the end:
  1. lookup1 LXX Psalms 82:6+6 82:6-24
  2. find1 LXX
Now we are ready to define the notion of minimal uniqueness. A passage in the Old Testament is minimally unique if it is unique in the Old Testament, but after removing either its first or its last character the resulting passage is no longer unique. The bibref program has an internal algorithm to find minimally unique passages in clipboard 1: one needs to issue the command minunique1 in the following way:
  1. Clipboard 1 must be prepared, e.g. by using lookup1 LXX Psalms 82:6+6 82:6-20
  2. The command minunique1 LXX must be issued.
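The definition can be checked by brute force on a small example. The sketch below is not the internal algorithm of bibref: `minimally_unique` is a hypothetical helper, the passage is assumed to occur in the corpus, and the toy corpus in Latin form is made up so that it imitates the situation of Psalm 82:6 (the separator # only keeps the three made-up texts apart).

```python
def count_occurrences(text: str, pattern: str) -> int:
    """Count all occurrences of pattern in text, overlapping ones included."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def minimally_unique(corpus: str, passage: str) -> list:
    """List the minimally unique substrings of a passage found in the corpus."""
    found = []
    for start in range(len(passage)):
        for end in range(start + 1, len(passage) + 1):
            candidate = passage[start:end]
            if count_occurrences(corpus, candidate) != 1:
                continue
            # First unique candidate from this start: it is minimal if both
            # trimmed forms are non-unique (a single letter has none to test).
            trimmed = (candidate[1:], candidate[:-1])
            if all(t == "" or count_occurrences(corpus, t) != 1
                   for t in trimmed):
                found.append(candidate)
            break  # longer candidates still contain this unique one
    return found


# Made-up corpus: the passage, a text sharing its end, a text sharing "aueoi".
passage = "egveipaueoieste"
corpus = passage + "#" + "kaiueoieste" + "#" + "taueoix"

print(minimally_unique(corpus, passage))
```

In this toy corpus the string aueoie plays the role of the passage found above: it is unique, but neither ueoie nor aueoi is.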

Practical uses

Now we can formulate the following useful statements:
  1. Let us assume that a New Testament quotation contains a unique passage \(U\) from the Old Testament. Then \(U\) can be extended to \(E\) where \(E\) is the maximal extension: here \(U\) is a substring of \(E\).
  2. Let us assume that a New Testament quotation quotes a unique passage \(U\) from the Old Testament. Then there is a minimally unique text \(M\) that is a substring of \(U\).
These statements will help us find probable quoted texts: we start with short texts \(S\) in the Old Testament and extend them until they are minimally unique (\(M\)). Then, if there is a match in the New Testament, that is, if \(M\) is a part of a book in the New Testament, we extend \(M\) to the maximal extension \(E\). In the next blog entry we will find an effective algorithm to do this mechanically.

Exercises

  1. Find the word Maria in the New Testament. What can we conclude about Mary's importance compared to that of Jesus, according to the obtained statistics?
  2. Find all minimally unique passages in Psalm 82:6.
  3. We know that Mark 12:11 is a part of a quotation. Find the maximal extension of verse 11. How long is it?



Zoltán Kovács
Linz School of Education
Johannes Kepler University
Altenberger Strasse 69
A-4040 Linz