User Tools

Site Tools


sql_analysis

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
sql_analysis [2016/07/21 06:00]
daniel [Presence and count of specific mail headers]
sql_analysis [2019/01/24 18:55] (current)
daniel [Presence and count of specific mail headers]
Line 52: Line 52:
 <code sql> <code sql>
 select FIELD||':'​||arr[1],​ count(*) select FIELD||':'​||arr[1],​ count(*)
- from (select FIELD, regexp_matches(lines,​ '​(?:​^|\n)'​ || FIELD || ':​\s*([^\n]*)\n', 'g') as arr+ from (select FIELD, regexp_matches(lines,​ '​(?:​^|\n)'​ || FIELD || ':​\s*([^\n]*)',​ 'gi') as arr
        from header, ​        from header, ​
          ​(VALUES ('​X-Priority'​),​ ('​Importance'​),​ ('​Precedence'​),​ ('​Priority'​),​          ​(VALUES ('​X-Priority'​),​ ('​Importance'​),​ ('​Precedence'​),​ ('​Priority'​),​
Line 67: Line 67:
 </​code>​ </​code>​
  
-====== Messages with specific attachment types ====== +A stricter version, comparing the md5 hashes of bodies (text and html parts) in addition to the headers:
-To retrieve all messages containing pdf files or any image file:+
 <code sql> <code sql>
-select ​distinct ​mail_id ​FROM attachments WHERE content_type='​application/​pdf'​ OR content_type like '​image/​%'​;+select ​b2.mail_id from body b1, body b2, header h1, header h2 
 + where b1.mail_id < b2.mail_id 
 + and h1.mail_id = b1.mail_id 
 + and h2.mail_id = b2.mail_id 
 + and md5(h1.lines) = md5(h2.lines) 
 + and md5(b1.bodytext) is not distinct from md5(b2.bodytext) 
 + and md5(b1.bodyhtml) is not distinct from md5(b2.bodyhtml);
 </​code>​ </​code>​
  
-====== ​Messages sent or received today  ​====== +The ''​IS NOT DISTINCT''​ comparator behaves as expected when ''​bodytext''​ or ''​bodyhtml''​ is NULL, as opposed 
-<​code>​ +to the simple equality operator, for which ''​NULL=NULL''​ is false 
-select ​mail_id ​from mail where msg_date>​=date_trunc('day',now());+ 
 +====== Hierarchical view of tags  ​====== 
 + 
 +Output a list of tags sorted by hierarchy level and names, 
 +with an indentation to represent the hierarchy. 
 + 
 +<​code ​sql> 
 +WITH RECURSIVE tagr(a,​_tag_id,​name,​level) as ( 
 + ​select 
 +  array[row_number() over (order by name)] as a, 
 +  tag_id, 
 +  name, 
 +  1 as level 
 + from tags where parent_id is null 
 +UNION ALL 
 + ​select 
 +   ​tagr.a || row_number() over (order by tags.name),​ 
 +   ​tag_id,​ 
 +   ​tags.name,​ 
 +   ​tagr.level+1 
 + FROM tags JOIN tagr ON tagr._tag_id=tags.parent_id 
 +
 +select repeat(' ​  ​', ​level-1|| name from tagr order by a; 
 </​code>​ </​code>​
  
  
sql_analysis.1469073641.txt.gz · Last modified: 2016/07/21 06:00 by daniel