Some time ago I shared with the B-GREEK mailing list some of the results of my stylometrical analysis of the NT canon using the concept of a lexical contact. My analysis had been directed to investigating the authorship of the disputed Pauline epistles. This message presents the result of a cluster analysis on the issue. Briefly, this analysis shows a complex relation between the Pastorals (1, 2 Tim. & Tit.) and the rest of the Pauline Corpus. Although stylistically distinct overall and in terms of vocabulary, the Pastorals are nonetheless quite close to the final chapters of Paul's letters in terms of shared phraseology. In addition, this method indicates that Colossians and Ephesians are quite consistent with Paul's style, and that the Johannine epistles are related to the Gospel of John, especially chapters 14-17. Hebrews, however, is not stylistically Pauline, nor is Revelation Johannine.
A "lexical contact" between two books or corpora is a shared word or phrase (each word of which is reduced to its lexical form). The "order" of a lexical contact is the number of words compared at a time. Thus, "first-order lexical contacts" comprise the shared vocabulary between two corpora, and "third-order lexical contacts" are the shared three-word phrases. Although lexical contacts of other orders are possible, the third order is used because that order generates the most contacts.
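The definition above can be sketched in a few lines of Python. This is only an illustration, not the program used for the analysis: the function names `ngrams` and `lexical_contacts` are hypothetical, the lemma lists are toy data in transliteration, and the sketch assumes that a "phrase" means a run of contiguous words, which the definition does not state explicitly.

```python
def ngrams(lemmas, order):
    """Return the set of distinct runs of `order` consecutive lemmas."""
    return {tuple(lemmas[i:i + order]) for i in range(len(lemmas) - order + 1)}


def lexical_contacts(lemmas_a, lemmas_b, order):
    """Return the lexical contacts of the given order shared by two corpora."""
    return ngrams(lemmas_a, order) & ngrams(lemmas_b, order)


# Toy data: each corpus is a list of words already reduced to lexical form.
a = ["charis", "su", "kai", "eirene", "apo", "theos"]
b = ["eleos", "kai", "eirene", "apo", "theos", "pater"]

first = lexical_contacts(a, b, 1)   # shared vocabulary
third = lexical_contacts(a, b, 3)   # shared three-word phrases
```

Here `first` holds the four shared words and `third` the two shared three-word phrases, ("kai", "eirene", "apo") and ("eirene", "apo", "theos").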
One further concept is defined with respect to a supercorpus, in this case the NT canon. An "exclusively shared lexical contact" is a contact found in only two corpora of the supercorpus. I shall use the term "characteristic" as shorthand for this concept.
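As a rough sketch of how characteristics might be collected, assume each corpus has already been reduced to a set of words or phrases. The function name `exclusive_contacts` and the single-letter items are hypothetical placeholders, not data from the analysis.

```python
from collections import defaultdict


def exclusive_contacts(corpora):
    """Map each pair of corpus names to the items found in exactly those two."""
    holders = defaultdict(set)          # item -> names of corpora containing it
    for name, items in corpora.items():
        for item in items:
            holders[item].add(name)

    result = defaultdict(set)
    for item, names in holders.items():
        if len(names) == 2:             # shared by two corpora and no others
            result[tuple(sorted(names))].add(item)
    return dict(result)


# Toy supercorpus of three corpora.
corpora = {
    "t1": {"a", "b", "c"},
    "tt": {"b", "c", "d"},
    "rm": {"c", "d", "e"},
}
characteristics = exclusive_contacts(corpora)
```

Item "c" occurs in all three corpora, so it is a contact but not a characteristic; "b" is exclusive to t1 and tt, and "d" to rm and tt.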
Lexical contacts, especially for phrases, tend to indicate common authorship because (it is hoped) an author has certain pet expressions that recur. However, as we shall see, the method cannot distinguish common authorship from literary dependence, in which large amounts of one work have been incorporated into another. It can show, on the other hand, that two corpora are sufficiently distinct to cast doubt upon a thesis that they have a common originator of expression.
Cluster analysis is a procedure which hierarchically groups the closest two items (or clusters) at a time into a larger cluster until all the items have been clustered. The closeness is measured by a distance function. For this analysis, the distance function is calculated by counting the number of contacts. For each book, the number of contacts to another is calculated and normalized to account for the length of the books. The pair with the greatest number of normalized contacts is then combined to form a corpus and put back into the analysis.
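The merging procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the original program: each corpus is represented as a set of words or phrases, a merged corpus is taken to be the union of its members, and the `score` function (shared items divided by the geometric mean of the two corpus sizes) is an assumed stand-in for the normalization, whose exact formula is not given here.

```python
from itertools import combinations
from math import sqrt


def score(a, b):
    """Shared items scaled by corpus size (an assumed normalization)."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def cluster(corpora):
    """Repeatedly merge the two closest corpora; return the merge history."""
    active = dict(corpora)              # name -> set of words or phrases
    merges = []
    while len(active) > 1:
        # Pick the pair with the greatest normalized number of contacts.
        x, y = max(combinations(sorted(active), 2),
                   key=lambda pair: score(active[pair[0]], active[pair[1]]))
        # Combine the pair into one corpus and put it back into the analysis.
        active[f"({x},{y})"] = active.pop(x) | active.pop(y)
        merges.append((x, y))
    return merges


# Toy data: the letters stand in for words or phrases.
toy = {
    "mk": {"a", "b", "c", "d"},
    "mt": {"a", "b", "c", "e"},
    "jn": {"x", "y", "a"},
    "j1": {"x", "y", "z"},
}
history = cluster(toy)
```

With this toy input, mk and mt are merged first, then j1 and jn, and finally the two resulting clusters are joined.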
Two different normalizations were used in this analysis. The first normalized the contacts based on the number of distinct words or phrases in each corpus. The second normalized based on the total number of contacts of each corpus.
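The exact formulas are not reproduced here, so the following is only one plausible reading of the two normalizations. Both functions are hypothetical, and dividing by the geometric mean of the two corpus-level quantities is an assumption made for the sake of a symmetric score.

```python
from math import sqrt


def by_distinct_items(a, b):
    """Contacts scaled by the number of distinct items in each corpus."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def by_total_contacts(name_a, name_b, corpora):
    """Contacts scaled by each corpus's total contacts with all the others."""
    def total(name):
        return sum(len(corpora[name] & items)
                   for other, items in corpora.items() if other != name)

    shared = len(corpora[name_a] & corpora[name_b])
    denom = sqrt(total(name_a) * total(name_b))
    return shared / denom if denom else 0.0


# Toy data: the letters stand in for distinct words or phrases.
corpora = {
    "co": {"a", "b", "c", "d"},
    "ep": {"a", "b", "c", "e", "g", "h"},
    "pm": {"a", "f"},
}
first_norm = by_distinct_items(corpora["co"], corpora["ep"])
second_norm = by_total_contacts("co", "ep", corpora)
```

In this toy case the second score is higher than the first, because the extra items in "ep" that make no contact with any other corpus count against the pair only under the first normalization.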
In addition to performing the cluster analysis upon each book of the New Testament, I have also performed it upon each chapter. The results are quite interesting and will be mentioned (with a link to the appropriate page) where relevant.
The following is a presentation and analysis of my results.
EC31 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
*--------------------------------------------------------jd,p2
\-hb-*--*--*-----------c1-*--*--*-----pp----------------------------q1,q2
\ \ \ \ \ \-------------------------------------co,ep
\ \ \ \ \----------c2,pm
\ \ \ \-----ga,rm
\ \ \----------------------------jm,p1
\ \------------------------------------------------ t1-t2,tt
\--rv-ac-*-----------------------------lk-mk,mt
\-------------------------jn----------j1-------------j2,j3
Notes:
SC31 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
*--*-----ac-------hb,rv
\ \-------*--------------------lk-------------mk,mt
\ \----------------------------jn-------------------j1----------j2,j3
\--*--------*--------c1----c2-------------*-----pp-------------q1----pm,q2
\ \ \----------------------co,ep
\ \---------jm----------ga,rm
\-----------*--------------p1-------------------jd,p2
\-------------------------------t1----t2,tt
Notes:
EC11
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
j3-*--q2-q1-pp-ga-*--c2-p1-rm-*--*--c1-jm-hb-rv-jn-ac-lk-mk,mt
\ \ \ \----------------------------t2----------t1,tt
\ \ \---------------------------jd,p2
\ \--------------------------------------------ep----co,pm
\-------------------------------------------------------------j1,j2
Notes:
SC11 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
*--ac----*-----lk----------mk,mt
\ \-------------------jn,rv
\---*-----*--------*-----------------c2-------ga----pp----------q1,q2
\ \ \-------------c1-------------------co,ep
\ \---------rm----------------jm,p1
\--------hb-------------*-----------*--------------tt-------j1-pm-j2,j3
\ \----------------jd,p2
\---------------t1,t2
Notes:
EC32 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
*--------rv-*--ac-hb-*--*--*--c1-------------ga,rm
\ \ \ \ \----------------------c2,pm
\ \ \ \------*--*-----------------------------------jd,p2
lk-mk,mt \ \ \ \----------pp-------------q1,q2
\ \ \------------------------co,ep
\ \------------*--------------------------tt-t1,t2
\ \----------------jm,p1
\----------------------jn----------j1-------------------j2,j3
Notes:
SC32 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
rv-ac-*--*--lk-mk,mt
\ \----------jn----------------------------------j1----------------j2,j3
\---------hb----jm-*--*--c1-*--c2-------*--------------pp,pm
\ \ \ \----q1,q2
\ \ \---------------------co,ep
\ \-----------ga,rm
\----------*--------------------------tt-t2,tt
\-------p1-------------p2,jd
Notes:
EC12
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
rv-ac-*--jm-hb-p1-c1-*--------rm-*-----------------q1----------ep-------co,pm
\ \ \----ga----------------c2,q2
\ \------------pp----*-----------t1-------------t1,tt
\ \----------------------jd,p2
\-------------*--lk----------------mk-j3,mt
\----------------------------------jn-------------j1,j2
Notes:
SC12 - [by chapters]
25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
*--pp-*--ga-*--p1-jm-*--c2-*--rm-c1-------------hb-------rv-jn-ac-lk-mk,mt
\ \ \ \ \-------t2-------------t1,tt
\ \ \ \---------------------co,ep
\ \ \--------------------------------------jd,p2
\ \----------------------------q1,q2
\------------------------------------j1----------------------------pm-j2,j3
Notes:
Stephen Carlson