User Tools

Site Tools


sql_analysis

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision Both sides next revision
sql_analysis [2016/07/21 06:00]
daniel [Presence and count of specific mail headers]
sql_analysis [2016/10/04 18:17]
daniel [Duplicate messages]
Line 67: Line 67:
 </​code>​ </​code>​
  
 +A stricter version, comparing the md5 hashes of bodies (text and html parts) in addition to the headers:
 +<code sql>
 +select b2.mail_id from body b1, body b2, header h1, header h2
 + where b1.mail_id < b2.mail_id
 + and h1.mail_id = b1.mail_id
 + and h2.mail_id = b2.mail_id
 + and md5(h1.lines) = md5(h2.lines)
 + and md5(b1.bodytext) is not distinct from md5(b2.bodytext)
 + and md5(b1.bodyhtml) is not distinct from md5(b2.bodyhtml);​
 +</​code>​
 +
 +The ''​IS NOT DISTINCT''​ comparator behaves as expected when ''​bodytext''​ or ''​bodyhtml''​ is NULL, as opposed
 +to the simple equality operator, for which ''​NULL=NULL''​ is false
 ====== Messages with specific attachment types ====== ====== Messages with specific attachment types ======
 To retrieve all messages containing pdf files or any image file: To retrieve all messages containing pdf files or any image file:
sql_analysis.txt ยท Last modified: 2019/01/24 18:55 by daniel