User Tools

Site Tools



This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
sql_analysis [2016/07/21 06:00]
daniel [Presence and count of specific mail headers]
sql_analysis [2019/01/24 18:55]
daniel [Presence and count of specific mail headers]
Line 52: Line 52:
 <code sql> <code sql>
 select FIELD||':'​||arr[1],​ count(*) select FIELD||':'​||arr[1],​ count(*)
- from (select FIELD, regexp_matches(lines,​ '​(?:​^|\n)'​ || FIELD || ':​\s*([^\n]*)\n', 'g') as arr+ from (select FIELD, regexp_matches(lines,​ '​(?:​^|\n)'​ || FIELD || ':​\s*([^\n]*)',​ 'gi') as arr
        from header, ​        from header, ​
          ​(VALUES ('​X-Priority'​),​ ('​Importance'​),​ ('​Precedence'​),​ ('​Priority'​),​          ​(VALUES ('​X-Priority'​),​ ('​Importance'​),​ ('​Precedence'​),​ ('​Priority'​),​
Line 67: Line 67:
 </​code>​ </​code>​
-====== Messages with specific attachment types ====== +A stricter version, comparing the md5 hashes of bodies (text and html parts) in addition to the headers:
-To retrieve all messages containing pdf files or any image file:+
 <code sql> <code sql>
-select ​distinct ​mail_id ​FROM attachments WHERE content_type='​application/​pdf'​ OR content_type like '​image/​%'​;+select ​b2.mail_id from body b1, body b2, header h1, header h2 
 + where b1.mail_id < b2.mail_id 
 + and h1.mail_id = b1.mail_id 
 + and h2.mail_id = b2.mail_id 
 + and md5(h1.lines) = md5(h2.lines) 
 + and md5(b1.bodytext) is not distinct from md5(b2.bodytext) 
 + and md5(b1.bodyhtml) is not distinct from md5(b2.bodyhtml);
 </​code>​ </​code>​
-====== ​Messages sent or received today  ​====== +The ''​IS NOT DISTINCT''​ comparator behaves as expected when ''​bodytext''​ or ''​bodyhtml''​ is NULL, as opposed 
-<​code>​ +to the simple equality operator, for which ''​NULL=NULL''​ is false 
-select ​mail_id ​from mail where msg_date>​=date_trunc('day',now());+ 
 +====== Hierarchical view of tags  ​====== 
 +Output a list of tags sorted by hierarchy level and names, 
 +with an indentation to represent the hierarchy. 
 +<​code ​sql> 
 +WITH RECURSIVE tagr(a,​_tag_id,​name,​level) as ( 
 + ​select 
 +  array[row_number() over (order by name)] as a, 
 +  tag_id, 
 +  name, 
 +  1 as level 
 + from tags where parent_id is null 
 + ​select 
 +   ​tagr.a || row_number() over (order by,​ 
 +   ​tag_id,​ 
 +   ​,​ 
 +   ​tagr.level+1 
 + FROM tags JOIN tagr ON tagr._tag_id=tags.parent_id 
 +select repeat(' ​  ​', ​level-1|| name from tagr order by a; 
 </​code>​ </​code>​
sql_analysis.txt · Last modified: 2019/01/24 18:55 by daniel