User Tools

Site Tools


sql_analysis

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revisionBoth sides next revision
sql_analysis [2016/07/21 04:00] – [Presence and count of specific mail headers] danielsql_analysis [2016/10/04 16:17] – [Duplicate messages] daniel
Line 67: Line 67:
 </code> </code>
  
 +A stricter version, comparing the md5 hashes of bodies (text and html parts) in addition to the headers:
 +<code sql>
 +select b2.mail_id from body b1, body b2, header h1, header h2
 + where b1.mail_id < b2.mail_id
 + and h1.mail_id = b1.mail_id
 + and h2.mail_id = b2.mail_id
 + and md5(h1.lines) = md5(h2.lines)
 + and md5(b1.bodytext) is not distinct from md5(b2.bodytext)
 + and md5(b1.bodyhtml) is not distinct from md5(b2.bodyhtml);
 +</code>
 +
 +The ''IS NOT DISTINCT'' comparator behaves as expected when ''bodytext'' or ''bodyhtml'' is NULL, as opposed
 +to the simple equality operator, for which ''NULL=NULL'' is false
 ====== Messages with specific attachment types ====== ====== Messages with specific attachment types ======
 To retrieve all messages containing pdf files or any image file: To retrieve all messages containing pdf files or any image file:
sql_analysis.txt · Last modified: 2019/01/24 17:55 by daniel