17 July 2022

On the Wuppertal Project, concerning Matthew

In this blog entry I give an overview of my first analysis of the Wuppertal project, after a detailed look at its entries on the Gospel of Matthew. The Wuppertal project was a major research project a couple of years ago, led by the well-known theologians Martin Karrer and Siegfried Kreuzer. It aimed at collecting all quotations of the Septuagint in the New Testament in a manually created database, built by digitizing ancient manuscripts.

In this analysis I focus on those aspects that were important for my own research in recent months. Surprisingly, I learned that several basic concepts are very similar in the Wuppertal project and in mine. There are, however, a couple of substantial differences. I also learned that I may need to extend my database with important entries that exist in the Wuppertal database, and there are indeed good reasons to add them to my research as well. On the other hand, I found some issues in the Wuppertal database that its authors or maintainers might address someday.

The major difference between the Wuppertal project (W) and bibref (B) is that W uses several variants of the texts: basically all available manuscripts are listed in the database. By contrast, B uses only one plausible version of each text.

The traditional way to store Bible verses is to enumerate the words in a verse. This is how W stores a Bible verse – the database calls this a counted text. In B, however, I chose a different way, to avoid a biased decision on where exactly the spaces are meant to be. As is well known, the earliest manuscripts were written in scriptio continua. Thus, it seems more objective to store a book as continuous script without any spaces and to refer to a part of it by the positions of its first and last letters. These technical differences between W and B affect how quotations are multiplied in the W database: for example, W stores the Matthew 4:15-16 quotation (taken from Isaiah 9:1-2) as two entries, while B stores the passages as one entry. I think B has a more logical and unbiased approach here. (In fact, the numbering of verses is sometimes slightly different in the two projects. In the example above, W uses the numbering Isaiah 8:23 and Isaiah 9:1 for the two quoted parts.)
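The idea of continuous storage can be illustrated with a few lines of Python. This is only a minimal sketch and not the actual bibref code: the function names and the choice of 1-based, inclusive letter positions are my assumptions for the illustration.

```python
def to_continuous(text: str) -> str:
    """Keep only the letters of a text, lowercased, with no spaces."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def passage(book: str, start: int, end: int) -> str:
    """Return the letters at positions start..end of a continuous text.

    Positions are 1-based and inclusive (an assumption of this sketch).
    """
    return book[start - 1:end]


# The opening words of the Matthew 13:35 quotation, as cited in this post.
book = to_continuous("ανοιξω εν παραβολαις το στομα μου")
first_two_words = passage(book, 1, 8)  # the letters of "ανοιξω εν"
third_word = passage(book, 9, 18)      # the letters of "παραβολαις"
```

With this representation, no decision about word boundaries is stored at all: a passage is just a pair of letter positions in the continuous text.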

Also, in W accents are removed from the reading text (Lesetext in German), and the result is shown as the base text (Basistext in German). I take the same approach in B. In fact, accents are not present in the original manuscripts, so this is reasonable.
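Removing accents from polytonic Greek can be sketched with Python's standard unicodedata module. This is one possible way to do it, not necessarily how W or B produce their base texts; the function name strip_accents is hypothetical.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove accents and breathings from Greek text.

    The text is decomposed (NFD) so that every diacritic becomes a
    separate combining character, and those characters are dropped.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# An accented form of the first words of the Matthew 13:35 quotation.
base_text = strip_accents("ἀνοίξω ἐν παραβολαῖς τὸ στόμα μου")
```

The result still contains spaces; dropping them as well gives the continuous script described above.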

W has a system to classify the quotations (Zitate in German). Most quotations are marked (markiert in German), which means that some introduction is given before the quotation. I use a similar system in B. Sometimes, however, it is difficult to tell whether a quotation is indeed marked, or whether the text considered introductory refers to something else.

In some cases, however, I found that the introductory text is present, but W misses that marking and classifies the quotation as unmarked. Such an example is Matthew 9:13: its introduction “ει δε εγνωκειτε τι εστιν” (“if however you had known what is”) seems like a marking.

W lists all quoted texts from the Old Testament, not just the first occurrence. This is most visible where Deuteronomy repeats the Law. For example, Matthew 15:4 quotes Exodus 20:12 and Deuteronomy 5:16 at the same time, but B lists only the first occurrence (that is, Exodus 20:12). Repetitions sometimes occur in a different way as well: Matthew 4:10 quotes Deuteronomy 6:13, but the quoted text is repeated in Deuteronomy 10:20. In this case W lists both occurrences, but B only the first one.

W uses a classification term called free quotation (freies Zitat in German). This is useful to express that a quotation is not a verbatim copy of the quoted text. In B, this fact is usually labelled with a fuzzy marker in the string comparison approach. In a couple of cases, however, W claims that the quotation is free while B identifies the match a bit differently. For example, Matthew 13:35 is identified as a free quotation in W: the quotation is “ανοιξω εν παραβολαις το στομα μου ερευξομαι κεκρυμμενα απο καταβολης κοσμου”, while the quoted text reads “ανοιξω εν παραβολαις το στομα μου φθεγξομαι προβληματα απ αρχης”. By contrast, B identifies only the first 6 words (28 letters) of the two texts as a match and ignores the rest of both texts.
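The count of 28 letters can be checked with a short Python sketch. Note that this only measures the common beginning of the two texts, which happens to be where the match lies in this example; it is not the matching algorithm of bibref, and the function names are made up for the illustration.

```python
def letters(text: str) -> str:
    """Keep only the letters of a text (continuous script)."""
    return "".join(ch for ch in text if ch.isalpha())


def common_prefix_length(a: str, b: str) -> int:
    """Count how many leading letters two texts share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


quotation = letters("ανοιξω εν παραβολαις το στομα μου ερευξομαι "
                    "κεκρυμμενα απο καταβολης κοσμου")
source = letters("ανοιξω εν παραβολαις το στομα μου φθεγξομαι "
                 "προβληματα απ αρχης")
shared = common_prefix_length(quotation, source)
```

Here shared is 28: the two texts agree up to the end of “μου” and diverge at the first letter of the next word.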

A more delicate difference occurs at Matthew 4:10. W identifies this as a free (marked) quotation, and B contains the internal comments “changed” and “fuzzy” that refer to a similar observation. B identifies the quotation as a class 6 (4/5) passage, and several quotations are of this type in B. The main challenge, however, is to decide whether a quotation is closer to being “free” or not. For example, the structure diagrams of Matthew 2:6, 2:18, 4:6, 12:18 and 26:31 suggest a class 4/5 identification in B. But W classifies Matthew 19:18 (a class 4 quotation in B) and Matthew 19:19 (a class 1 quotation in B) as “free” quotations, without mentioning marking at all. I think these are minor issues in W and they could be fixed someday.

Another type of difference is the classification of Matthew 13:13-15. The first verse is identified as a marked free quotation in W; however, W admits (in an internal comment) that the expression used in Matthew 13:13 is not a real marking and differs from the other ones in the database. B skips this first verse.

During my work on B I skipped some verses that are stored in W. For example, W identifies Matthew 7:23 as a quotation of Psalms 6:8. I think the introductory text “και τοτε ομολογησω αυτοις οτι” (“and then I will declare to them”) can hardly be identified as an introduction for a quotation. Also, the quoted text is not completely unique because the next phrases seem to be non-unique expressions in the Old Testament. (Actually, getrefs SBLGNT LXX Psalms 6:8 returns a 20-character match from Luke 13:27, which has a similar message to Matthew 7:23. Luke's text seems a bit closer to Psalms 6:8, and because of the similarity one may argue that Matthew also quotes Psalms 6:8 here.)

The last entry in W and B is Matthew 27:9. This is connected to Exodus 9:12 and Jeremiah 18:2 in W, but to Zechariah 11:13 in B. I could not figure out why Exodus 9:12 is mentioned in W, but Jeremiah 18:2 is quite clear: Matthew refers to Jeremiah, and to the potter's field (or potter's house; both wordings are taken from KJV). In my opinion, however, this reference is incorrect here. Instead, the correct link is Zechariah 11:13, referring only to the 30 silver coins (“και ελαβον τους τριακοντα αργυρους” in LXX, “Και ελαβον τα τριακοντα αργυρια” in SBLGNT), with a 24% Jaccard distance. There is also a possible explanation of why Matthew mentions Jeremiah and not Zechariah (in short: in the Hebrew Bible the collection of the Prophets began with the book of Jeremiah, thus “Jeremiah” could simply refer to all books of the Prophets).
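For readers unfamiliar with the Jaccard distance: it is one minus the ratio between the number of elements two sets share and the number of elements in their union. The sketch below assumes that the sets are the pairs of neighbouring letters (bigrams) of the continuous texts. Since this choice is only my assumption for the illustration, the result need not reproduce the 24% reported above exactly.

```python
def letters(text: str) -> str:
    """Lowercase a text and keep only its letters."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def bigrams(text: str) -> set:
    """Return the set of neighbouring letter pairs (an assumed choice)."""
    s = letters(text)
    return {s[i:i + 2] for i in range(len(s) - 1)}


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| for the bigram sets of the two texts."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 0.0  # two empty texts are treated as identical
    return 1.0 - len(x & y) / len(x | y)


lxx = "και ελαβον τους τριακοντα αργυρους"
sblgnt = "Και ελαβον τα τριακοντα αργυρια"
distance = jaccard_distance(lxx, sblgnt)
```

A distance of 0 means that the two sets are identical, and a distance of 1 means that they have nothing in common.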

I found some inconsistency in B as well. Matthew 15:4 is split into two quotations in B, while W lists them as a single entry. On the other hand, Matthew 19:4-5 is listed as a single entry in B, but W splits it into two quotations. In both cases there are introductory texts for each part, so a split could be expected in both. That is, in the first case I suggest that W should split the entry into two, and in the second case B should do the same. One may still say that for Matthew 19:4-5 the source given in Matthew 19:4 is valid for both quoted texts, so the two parts are somewhat related. Nevertheless, I fixed my version accordingly by assuming that Matthew 19:4-5 is a combination of two quotations.

Missing database entries in B

The most interesting differences were, of course, those entries in W that I had missed during my earlier research. After detecting them, I was extremely happy, like the one who found a treasure hidden in the field.

In fact, I was very cautious with each candidate entry. For example, Matthew 2:23 was a real candidate because of the introduction: “οπως πληρωθη το ρηθεν δια των προφητων οτι” (“so that it should be fulfilled that having been spoken through the prophets that”). The following words, “Ναζωραιος κληθησεται” (“a Nazarene will be called”), are considered to reflect Judges 13:5 and 16:17, which contain the single words “ναζιραιον” and “ναζιραιος” (nazirite). Is this an intentional word play between the two, or just a random match? W assumes that it is intentional, but B assumes here that the connection is between the Hebrew version of Isaiah 11:1 and Matthew 2:23 – the connection is the branch of Jesse, where the Hebrew word נצר (nê-ṣer or nëtzer) sounds very similar to “Ναζωρ”. In this case I was not completely convinced by the decision of W and I kept my former choice of Isaiah 11:1. Actually, this entry is a hidden one in B, because the Greek text of LXX does not play any role here.

Another difference was that W identifies Matthew 2:6 as a mosaic of quotations. Besides Micah 5:2 it lists II_Samuel 5:2 (and some similar passages that are internal quotations of LXX of this). I learned that W's entry is correct, and also that the getrefs algorithm does not find the match because II_Samuel 5:2 is repeated in I_Chronicles 11:2. So I decided to add this entry to B as well.

A major surprise for me was that W lists Matthew 12:40 and its quoted text Jonah 1:17. I missed this one in B. The beautiful thing here is that a verbatim quotation of 51 letters can also be found with the getrefs algorithm.
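Finding the longest verbatim match between two continuous texts can be sketched with Python's difflib module. This is a stand-in for illustration, not the getrefs algorithm itself. The two Greek fragments are typed here without accents as I know them from the standard editions, which is an assumption of this sketch, so please check them against LXX and SBLGNT.

```python
from difflib import SequenceMatcher


def letters(text: str) -> str:
    """Lowercase a text and keep only its letters (continuous script)."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def longest_verbatim_match(a: str, b: str) -> str:
    """Return the longest run of letters shared by both texts."""
    x, y = letters(a), letters(b)
    matcher = SequenceMatcher(None, x, y, autojunk=False)
    m = matcher.find_longest_match(0, len(x), 0, len(y))
    return x[m.a:m.a + m.size]


# Fragments of Matthew 12:40 and of the quoted verse in Jonah (assumed wording).
matthew = ("ωσπερ γαρ ην ιωνας εν τη κοιλια του κητους "
           "τρεις ημερας και τρεις νυκτας")
jonah = ("και ην ιωνας εν τη κοιλια του κητους "
         "τρεις ημερας και τρεις νυκτας")
match = longest_verbatim_match(matthew, jonah)
```

For these two fragments the shared run begins at “ην ιωνας” and is 51 letters long, in agreement with the length mentioned above.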

I also failed to finalize Matthew 8:17 because I was not persistent enough in analyzing the text of Isaiah 53:4. The missing entry has now been added in full.

W lists Matthew 13:32 as a quotation of Psalm 104:12. I think, however, that this is a false positive. W lists this as a free quotation (without introduction), but, in fact, only the expression “τα πετεινα του ουρανου κατασκηνωσει…” (“the birds of the air and perch…”) matches, and not even fully. Actually, as the getrefs algorithm points out, Matthew 8:20 (and Luke 9:58 and Mark 4:32) could be even better matches (because they match fully, since the word “και” is not inserted as it is in Matthew 13:32). Anyway, I found no evidence that Matthew 13:32 should be listed as a quotation of Psalm 104:12, so I did not make any changes to B concerning this difference.

W mentions Malachi 3:1 as a quoted part of Matthew 11:10. B contains only a link to Exodus 23:20. It is correct that both passages from LXX are similar to Matthew 11:10, but the text from Exodus matches fully (on 43 characters). The matching text in Malachi is shorter and fuzzier (“ιδου εγω [εξ]αποστελλω τον αγγελον μου”, 31/29 characters long), and Malachi can be considered chronologically newer. So it is quite reasonable to connect Matthew 11:10 to Exodus 23:20 and not to Malachi 3:1. Therefore, I did not insert this link from W into B.

Also, W mentions Leviticus 20:9 as a source of quotation for Matthew 15:4. By contrast, B does not have such an entry. In fact, Leviticus 20:9 is quite similar to Exodus 21:17 (which is mentioned by both W and B), but the latter is the closer one (the Jaccard distances are 58% and 35%). Also, Leviticus can be considered chronologically newer. So I decided not to include Leviticus 20:9 in B and to keep the original state.

W lists Matthew 18:16 as a quotation of Deuteronomy 19:15. There is only one introductory word that supports this entry (“ινα”), which is of course quite short and leaves some uncertainty, but later Paul uses the same quoted text in II_Corinthians 13:1 (actually, without any introduction). This creates an interesting situation: Paul's clear allusion to Deuteronomy 19:15 makes it more credible that Matthew's text is a quotation, but II_Corinthians 13:1 itself cannot be considered a complete quotation because it lacks an introduction! That is, it does make sense to add such entries to the database with a kind of zero-introduction: we can assume that Paul wanted to express that he used a quoted text, but he did not add a textual sign of it. (Some may argue that the first words of II_Corinthians 13:1 can be considered an unusual introduction of the quotation.) Anyway, for the moment I added Matthew 18:16 to B to have at least the first occurrence of Deuteronomy registered in the database, and put a comment on the conditional addition of II_Corinthians 13:1 for the future.

Another interesting entry is Matthew 21:9 in W. This is shown as a marked quotation – which seems actually false, that is, this should be identified as unmarked, at least according to the natural meaning of marking. I would prefer saying that the evidence for the second part of Matthew 21:9 comes from the context, namely, that Psalms 118:26 and 148:1 were cried by the multitudes. The word “λεγοντες” (“saying”) is not about someone who wrote the quoted text (and therefore the passage was cited by Matthew later) but about the crowd who repeated the two psalms. Now I think Matthew 21:9 can be indeed identified as a verse that should be added to the database, but the same idea as for II_Corinthians 13:1 should be used. That is, here we can identify a sort of contextual evidence that there is indeed a quotation. I postpone the addition of this entry to B for the moment. (Interestingly, Jesus himself quotes Psalms 118:26 according to Matthew 23:39 and Luke 13:35.)

A very difficult question is how to handle Matthew 22:24. In this passage the Sadducees come to Jesus and cite Moses, but their citation is very fuzzy, far from being literal. There are, however, several words that match (they are colored in red) with Deuteronomy 25:5: Εαν τις αποθανη μη εχων τεκνα, επιγαμβρευσει ο αδελφος αυτου την γυναικα αυτου και αναστησει σπερμα τω αδελφω αυτου. In the parallel passages of the Gospels (in Mark 12:19 and Luke 20:28) we find different formulations; those two are more similar to each other than to Matthew 22:24, but they are still far from being literal quotations. So, I tend to avoid adding Matthew 22:24 (and the two parallel passages) to B, because the citation is somewhat vague. The Gospels may want to communicate that the Sadducees did not quote the Scripture exactly, or alternatively, that the conversation here was just informal. Satan seems to do a much better job when citing Psalm 91:11 in Matthew 4:6 (which is literal). Anyway, since Matthew 22:24 is at least an attempt to quote the Scripture (regardless of whether it is done in a positive way or not), one can argue that this should also be added to the database. I postpone the decision to a later date.

I found that W adds Joshua 22:5 as a second quotation source for Matthew 22:37. After checking this, I found this correct, and added this extension to B as well (and also for the parallel passages in Mark and Luke). It is very encouraging that Joshua is listed now in B as a source for some quotations – this gives me additional evidence that the full Old Testament belongs to God's Word.

Missing database entries in W

The Wuppertal database does not contain the four sentences by Jesus in Matthew 5:21, 5:27, 5:38 and 5:43. Here Jesus quotes the Law. One can argue that the way Jesus cites the Law here is different from the other quotations, but all of these are verbatim (or almost verbatim) matches from Exodus and Leviticus. Also, Matthew 12:7 is missing from W.

The verses Matthew 24:15 and 24:21 are not present in W. These are actually quite fuzzy quotations, but it is clear that Daniel was quoted; at least the expression “βδελυγμα των ερημωσεων” (Daniel 9:27, it reads “βδελυγμα της ερημωσεως” in Matthew 24:15) is taken from Daniel: this is unique in LXX/Daniel (we do not consider minor grammatical changes of this expression). Matthew 24:21 contains the words “γαρ τοτε” (“for then”) – this may indicate that a quotation begins, and we can identify multiple words from Daniel 12:1 that are quoted mostly in a fuzzy way by Matthew/Jesus. The fuzzy form may dissuade us from adding Matthew 24:21 to the database. For now I have decided to keep these entries because it seems to me that Jesus explains the Scripture here with some short/fuzzy quotations, but the Scripture is indeed used and directly mentioned.


Writing this summary took me longer than expected, about 2 months, though with several breaks (due to my teaching obligations in the meantime). I also ordered the book (a collection of papers) by de Vries and Karrer, and since I received it a couple of weeks ago, I am ready to continue this work at a higher level of study!

