jaromail

a commandline tool to easily and privately handle your e-mail
git clone git://parazyd.org/jaromail.git

commit 1c96a2c33cec3aa3b5face8314f2f771db7f3203
parent ea82a54e416ef2800c1cf72b76c3a1e3ddbad1fd
Author: Jaromil <jaromil@dyne.org>
Date:   Sun,  8 Nov 2015 18:20:57 +0100

New replay command and fixes to stats

Replay now saves the output of certain commands for replay, to
facilitate usage on the commandline (see manual updates).
Also some fixes to stats are included.
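
As an illustration of the replay idea described in this message, here is a minimal sketch in portable shell. It is not the code from this commit: the CACHE variable and its default location are assumptions made for the example (the diff below uses $MAILDIRS/cache), and the error handling is simplified.

```shell
#!/bin/sh
# Hypothetical sketch of the replay idea, not the code from this commit.
# CACHE is an assumed location chosen for this example only.
CACHE="${CACHE:-${TMPDIR:-/tmp}/replay-sketch}"
mkdir -p "$CACHE"

# Copy stdin to stdout while keeping a per-command copy for later,
# then point "replay.last" at the most recently saved output.
save_replay() {
    tee "$CACHE/replay.$1" && ln -sf "$CACHE/replay.$1" "$CACHE/replay.last"
}

# Print the saved output of the named command, or the last one saved
# when no name is given. Fails if nothing has been saved under that name.
replay() {
    f="$CACHE/replay.${1:-last}"
    [ -r "$f" ] || { echo "nothing to replay: ${1:-last}" >&2; return 1; }
    cat "$f"
}
```

With these two functions, a long pipeline stage can be written as `some_command | save_replay search`, and its output fetched again later with `replay search` or simply `replay`, without re-running the original command.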

Diffstat:
M doc/jaromail-manual.org | 231 +++++++++++++++++++++++++++++++++++++++++++++++--------------------------------
M src/jaro                |  40 +++++++++++++++++++++++++---------------
M src/zlibs/filters       |  16 ++++++++--------
M src/zlibs/helpers       |  58 +++++++++++++++++++++++++++++++++++++++++++++++++++-------
M src/zlibs/parse         |  53 ++++++++++++++++++++++++++++++++++++++---------------
M src/zlibs/stats         |  31 ++++++++++++++++++++++++-------
6 files changed, 285 insertions(+), 144 deletions(-)

diff --git a/doc/jaromail-manual.org b/doc/jaromail-manual.org @@ -1,4 +1,4 @@ -#+TITLE: Jaro Mail 3.3 +#+TITLE: Jaro Mail 4 #+AUTHOR: by Jaromil @ dyne.org #+DATE: November 2015 @@ -159,8 +159,8 @@ and actions involved in managing one's email communication: Some dependencies are needed in order to build this software. The Makefile for GNU/Linux configures the build environment automatically on Debian and Fedora systems, using their packaging to install all needed packages. The dependencies to be installed on the system for JaroMail are - - to *build*: bison flex make autoconf automake sqlite3 libgnome-keyring-dev - - to *run*: procmail fetchmail msmtp mutt mairix pinentry abook wipe + - to *build*: gcc bison flex make autoconf automake sqlite3 libglib2.0-dev libgnome-keyring-dev + - to *run*: fetchmail msmtp mutt pinentry abook wipe notmuch alot To install all needed components (done automatically, requires root): @@ -459,94 +459,6 @@ Below a recapitulation of keys commonly used in our workflow | *C* | Copy a message to another folder | -* Addressbook - -Addressbooks are the files storing the whitelist, the blacklist and optionally other custom lists of addresses. The format we use is native *abook* database files, by convention in /$JAROMAILDIR/whitelist.abook/ and /$JAROMAILDIR/blacklist.abook/. More custom addressbooks can be used by specifying them using *-l* on the commandline, for instance *-l family* will query the /$JAROMAILDIR/family.abook/ addressbook; when not used, *whitelist* is the default. - -Addressbooks can be edited using a interactive console interface, for instance to add or delete entries by hand: use the *abook* command and optionally the *-l* option. - -: jaro abook - -This will open the current whitelist for edit. To edit the blacklist add *-l blacklist* instead. 
- -To quickly dump to the console all names and addresses in the Jaro Mail addressbook, one can use the *list* command - -: jaro list - -To match a string across the addressbook, simply use the composite command *addr* followed by strings, for instance: - -: jaro addr dyne - -will list all addresses containing 'dyne' in your whitelist. - -** Address lists - -Jaro Mail handles lists of addresses as plain text files or streams with entries formatted as '/Name <email>/' and newline terminated. This simple format conforms (or is normalized to) the RFC822 standard and UTF-8 charset encoding, both produced on /stdout/ and read from /stdin/ by various useful commands to take advantage of console piping. - -Such lists of addresses are the output of the *extract* command, which is able to read the output of other commands and extract a list of email addresses found. - -: jaro search open source date:2w.. | jaro extract - -Will print to stdout the list of addresses found among the results of a search for /open source/ through all the emails archived in the past 2 weeks - -: jaro search date:1y.. and folder:known | jaro extract - -Will print a sorted list of unique addresses found in the emails matching the search expression '/date:1y.. and folder:known/', meaning all messages stored in the '/known/' folder and not older than 1 year from now. - -The *import* command is complementary to extraction: it reads an address list from stdin and imports it inside an addressbook specified using '-l' or a /group/ list file provided as argument. - -: jaro search folder:unsorted | jaro extract | jaro import -l blacklist - -Will extract all addresses found in unsorted (the maildir collecting all non-mailinglist emails in which we are not an explicit recipient) and put them into our blacklist. - -** Export to VCard and other formats - -VCard is an exchange format useful to interface with other addressbook software and mobile phones, as well with spyware as Google and Apple mail. 
Jaro Mail supports converting address lists to a variety of formats thanks to /abook/: - -: jaro addr | jaro export vcard - -Will take the list of addresses in whitelist and convert it to the *vcard* format on stdout, ready to be redirected to a file. - -Here below a list of output formats supported as argument to export: - -| Format | Description | -|---------+-------------------------------------| -| abook | abook native format | -| ldif | ldif / Netscape addressbook (.4ld) | -| vcard | vCard 2 file | -| mutt | mutt alias | -| muttq | mutt query format (internal use) | -| html | html document | -| pine | pine addressbook | -| csv | comma separated values | -| allcsv | comma separated values (all fields) | -| palmcsv | Palm comma separated values | -| elm | elm alias | -| text | plain text | -| wl | Wanderlust address book | -| spruce | Spruce address book | -| bsdcal | BSD calendar | -| custom | Custom format | - -Of course *export* works with any list of addresses from stdin, for instance the result of *extract* operations on search queries, so that multiple commands can be concatenated. - - -** Addressbook in brief - -Here a roundup on the addressbook commands that are available from the /jaro/ commandline script. 
Arguments '-l abook' take the string to identify - -| Command | Arguments | Function (print on stdout, import from stdin) | -|-----------+-------------+--------------------------------------------------| -| *abook* | -l listname | edit the addressbook (default whitelist) | -| *addr* | search expr | print list of addresses matching expression | -| *extract* | maildir | print address list of all mails in maildir | -| *extract* | gpg keyring | print address list of gpg public keyring | -| *extract* | gpg pubkey | print address list of gpg key signatures | -| *extract* | vcard file | print address list of entries in VCard file | -| *import* | -l listname | import address list from stdin to addressbook | -| *export* | format | convert address list to a format (default vcard) | - - * Searching @@ -578,7 +490,33 @@ With the *addr* command the search will be run on the whitelist addressbook entr : jaro addr joe -Will list all addresses matching the string 'joe' inside the /whitelist/ addressbook. Also the blacklist can be searched this way adding the switch *-l blacklist*: +Will list all addresses matching the string 'joe' inside the /whitelist/ addressbook. Also the blacklist can be searched this way adding the switch *-l blacklist*. + +** Compute and visualize statistics + +The *stats* command is useful to quickly visualize statistics regarding folder usage as well the frequency of emails found in a stream from stdin. Such streams can be produced by the *search* and *extract* commands for instance and passed to stats in order to have a more graphical (yet ASCII based) visualization of results. + +For example lets visualize the frequency of email domain hosts in our whitelist: + +: jaro addr | jaro stat emails + +Will print out bars and domains in descending order, highlighting the most frequent email domain in our contacts, which turns out to be very often gmail.com, unfortunately for our own privacy. 
+ +To visualize the frequency of traffic across our filtered folders in the past month: + +: jaro search date:1M.. | jaro stat folders + +Will show quantities of mails filed to folders during the past month, quickly highlighting the mailinglists that have seen more recent activity. + +To see who is most active in a folder: + +: jaro search folder:org.dyne.dng | jaro extract stdin from | jaro stat names + +Will give an overview on who is the most prolific writer in the dng mailinglist, filed into the folder by a rule in *Filters.txt* like: + +: to dng@lists.dyne save org.dyne.dng + +Please note the *extract* command is there to extract email addresses and names found in the /From:/ field of all search hits, the command is explained better in the next chapter: /Addressbook/. ** Combining terms @@ -724,6 +662,94 @@ Examples: : EET +* Addressbook + +Addressbooks are the files storing the whitelist, the blacklist and optionally other custom lists of addresses. The format we use is native *abook* database files, by convention in /$JAROMAILDIR/whitelist.abook/ and /$JAROMAILDIR/blacklist.abook/. More custom addressbooks can be used by specifying them using *-l* on the commandline, for instance *-l family* will query the /$JAROMAILDIR/family.abook/ addressbook; when not used, *whitelist* is the default. + +Addressbooks can be edited using a interactive console interface, for instance to add or delete entries by hand: use the *abook* command and optionally the *-l* option. + +: jaro abook + +This will open the current whitelist for edit. To edit the blacklist add *-l blacklist* instead. + +To quickly dump to the console all names and addresses in the Jaro Mail addressbook, one can use the *list* command + +: jaro list + +To match a string across the addressbook, simply use the composite command *addr* followed by strings, for instance: + +: jaro addr dyne + +will list all addresses containing 'dyne' in your whitelist. 
+ +** Address lists + +Jaro Mail handles lists of addresses as plain text files or streams with entries formatted as '/Name <email>/' and newline terminated. This simple format conforms (or is normalized to) the RFC822 standard and UTF-8 charset encoding, both produced on /stdout/ and read from /stdin/ by various useful commands to take advantage of console piping. + +Such lists of addresses are the output of the *extract* command, which is able to read the output of other commands and extract a list of email addresses found. + +: jaro search open source date:2w.. | jaro extract stdin + +Will print to stdout the list of addresses found among the results of a search for /open source/ through all the emails archived in the past 2 weeks. + +: jaro search date:1y.. and folder:known | jaro extract + +Will print a sorted list of unique addresses found in the emails matching the search expression '/date:1y.. and folder:known/', meaning all messages stored in the '/known/' folder and not older than 1 year from now. + +The *import* command is complementary to extraction: it reads an address list from stdin and imports it inside an addressbook specified using '-l' or a /group/ list file provided as argument. + +: jaro search folder:unsorted | jaro extract | jaro import -l blacklist + +Will extract all addresses found in unsorted (the maildir collecting all non-mailinglist emails in which we are not an explicit recipient) and put them into our blacklist. + +** Export to VCard and other formats + +VCard is an exchange format useful to interface with other addressbook software and mobile phones, as well with spyware as Google and Apple mail. Jaro Mail supports converting address lists to a variety of formats thanks to /abook/: + +: jaro addr | jaro export vcard + +Will take the list of addresses in whitelist and convert it to the *vcard* format on stdout, ready to be redirected to a file. 
+ +Here below a list of output formats supported as argument to export: + +| Format | Description | +|---------+-------------------------------------| +| abook | abook native format | +| ldif | ldif / Netscape addressbook (.4ld) | +| vcard | vCard 2 file | +| mutt | mutt alias | +| muttq | mutt query format (internal use) | +| html | html document | +| pine | pine addressbook | +| csv | comma separated values | +| allcsv | comma separated values (all fields) | +| palmcsv | Palm comma separated values | +| elm | elm alias | +| text | plain text | +| wl | Wanderlust address book | +| spruce | Spruce address book | +| bsdcal | BSD calendar | +| custom | Custom format | + +Of course *export* works with any list of addresses from stdin, for instance the result of *extract* operations on search queries, so that multiple commands can be concatenated. + + +** Addressbook in brief + +Here a roundup on the addressbook commands that are available from the /jaro/ commandline script. Arguments '-l abook' take the string to identify + +| Command | Arguments | Function (print on stdout, import from stdin) | +|-----------+-------------+--------------------------------------------------| +| *abook* | -l listname | edit the addressbook (default whitelist) | +| *addr* | search expr | print list of addresses matching expression | +| *extract* | maildir | print address list of all mails in maildir | +| *extract* | gpg keyring | print address list of gpg public keyring | +| *extract* | gpg pubkey | print address list of gpg key signatures | +| *extract* | vcard file | print address list of entries in VCard file | +| *import* | -l listname | import address list from stdin to addressbook | +| *export* | format | convert address list to a format (default vcard) | + + * Storage and backup Most existing e-mail systems have their own storage format which is @@ -888,6 +914,27 @@ For more information about Tomb please refer to its own documentation: environme * Advanced usage +** Replay: avoid 
repeating long operations + + Working on the commandline can have some disadvantages. One of them is that if one runs a long operation to see its result and forgets to save it also on a file (i.e. using tee) the operation needs to be re-run and saved. + + Jaro Mail helps the user to *replay* the last output print by saving it everytime in its own cache. Replay can also save per-command outputs so that long pipe chains can be repeated selectively by naming the command. Only some commands have the replay capability, to have a list of available replays on your system do, based on your last run commands: + +: jaro replay list + +To replay the last search command and pipe it into headers to have a better view of it: + +: jaro replay search | jaro headers + +For instance imagine giving the command that searches for all mails sent to /nettime-l/ and extracts all addresses in the /From:/ including duplicates, then sorts them and eliminates duplicates + +: jaro search to:nettime-l | jaro extract stdin from | sort | uniq + +Depending from the size of your nettime archives, this operation may take some time and one may not want to repeat it in order to compute some stats on the extract result. So one can go on and send the old output to a new command: + +: jaro replay extract | jaro stat names + +This will print out statistics about the most prolific write to the nettime list according to your archives. ** Send anonymous emails diff --git a/src/jaro b/src/jaro @@ -22,8 +22,8 @@ # this source code; if not, write to: # Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
-VERSION=3.3 -DATE=Mar/2015 +VERSION=4.0 +DATE=Nov/2015 JAROMAILEXEC=$0 # default permission on files @@ -103,7 +103,7 @@ GNOMEKEY=${GNOMEKEY:-0} # global variables for binaries called -vars+=(rm mkdir mutt SQL) +vars+=(rm mkdir mutt SQL OS) # load zsh modules zmodload zsh/regex @@ -516,7 +516,8 @@ main() { subcommands_opts[edit]="" subcommands_opts[preview]="" - subcommands_opts[later]="" + subcommands_opts[replay]="" + subcommands_opts[remember]="" subcommands_opts[backup]="" @@ -657,10 +658,15 @@ main() { exitcode=$? ;; - later|remember) - cat | deliver remember - exitcode=$? - ;; + remember) + cat | deliver remember + exitcode=$? + ;; + + replay) + replay ${PARAM} + exitcode=$? + ;; update|init) [[ "$PARAM" = "" ]] || { @@ -693,18 +699,19 @@ main() { for i in ${search_results=}; do print - "$i" done - } + } | save_replay $subcommand ;; alot) alot_search ${PARAM} ;; notmuch) notice "Command: notmuch ${PARAM}" - nm ${PARAM} + nm ${PARAM} | save_replay $subcommand exitcode=$? ;; - addr|list) search_addressbook ${PARAM} ;; + addr|list) search_addressbook ${PARAM} + ;; complete) complete ${PARAM} exitcode=$? @@ -730,7 +737,7 @@ main() { exitcode=$? ;; - stats) stats ${PARAM} | sort -n + stat*) stats ${PARAM} | sort -n exitcode=$? ;; @@ -751,7 +758,7 @@ main() { error "error updating filters, operation aborted." break } - filter_maildir ${PARAM} + filter_maildir ${PARAM} | save_replay $subcommand exitcode=$? ;; @@ -795,7 +802,9 @@ main() { folders=(`imap_list_folders`) exitcode=$? notice "List of folders for $login on $imap" - for f in $folders; do print "$f"; done + for f in $folders; do print - "$f"; done \ + | save_replay $subcommand | column + ;; # interactive) # read_account @@ -815,7 +824,8 @@ main() { # ;; extract|parse) - extract_addresses ${PARAM} | sort | uniq + extract_addresses ${PARAM} \ + | save_replay $subcommand exitcode=$? 
;; diff --git a/src/zlibs/filters b/src/zlibs/filters @@ -565,18 +565,18 @@ EOF { test -r "${MAILDIRS}/Applications.txt" } && { - # here is the tweak to open attachments - # with Mutt without blocking it (fork) + # here is the tweak to open attachments + # with Mutt without blocking it (fork) - apptypes=`cat "${MAILDIRS}/Applications.txt"` - for t in ${(f)apptypes}; do - eval `print $t | awk ' + apptypes=`cat "${MAILDIRS}/Applications.txt"` + for t in ${(f)apptypes}; do + eval `print $t | awk ' { print "_type=" $1 "; _app=" $2 ";" }'` - cat <<EOF >> $MAILDIRS/.mutt/mailcap + cat <<EOF >> $MAILDIRS/.mutt/mailcap ${_type}; a="${MAILDIRS}/tmp" && f=\`basename %s\` && rm -f "\$a"/"\$f" && cp %s "\$a"/"\$f" && ${_app} "\$a"/"\$f" EOF - done - cat <<EOF >> $MAILDIRS/.mutt/mailcap + done + cat <<EOF >> $MAILDIRS/.mutt/mailcap application/*; a="${MAILDIRS}/tmp" && f=\`basename %s\` && rm -f "\$a"/"\$f" && cp %s "\$a"/"\$f" && jaro preview "\$a"/"\$f" EOF } # Applications.txt diff --git a/src/zlibs/helpers b/src/zlibs/helpers @@ -119,7 +119,7 @@ e_parse() { # check if an email address was found isemail "$_e" || continue # avoid duplicates -# [[ "${(v)e_addr[$_e]}" = "" ]] || continue + [[ "${(v)e_addr[$_e]}" = "" ]] || continue # extract also the name using comma separator _n="${(Q)_p[(ws:,:)2]}" @@ -158,6 +158,50 @@ BEGIN { head=1 } next }' "$1" } +save_replay() { + fn save_replay + _cmd="$1" + req=(_cmd) + ckreq || return 1 + + tee $MAILDIRS/cache/replay.$_cmd + [[ $? = 0 ]] && \ + ln -sf $MAILDIRS/cache/replay.$_cmd $MAILDIRS/cache/replay.last + return 0 +} + +replay() { + fn "replay $*" + + arg=$1 + + if [[ "$arg" = "" ]]; then + + if [[ -r $MAILDIRS/cache/replay.last ]]; then + notice "Replay last stdout from `stat -c %z $MAILDIRS/cache/replay.last`" + cat $MAILDIRS/cache/replay.last + return $? + else + # never run a command? 
+ error "There is nothing to replay" + return 1 + fi + + elif [[ -r $MAILDIRS/cache/replay.$arg ]]; then + notice "Replay stdout of command '$arg' from `stat -c %z $MAILDIRS/cache/replay.last`" + cat $MAILDIRS/cache/replay.$arg + return $? + + elif [[ "$arg" = "list" ]]; then + notice "Listing available replays:" + ls -l $MAILDIRS/cache/replay.* + return $? + else + error "Nothing to replay for command: $arg" + return 1 + fi + return 1 +} ######### ## Editor @@ -226,12 +270,12 @@ open_folder() { ## Open a File preview_file() { case $OS in - GNU) - xdg-open "${PARAM}" & - ;; - MAC) - open -g "${PARAM}" - ;; + GNU) + xdg-open $* + ;; + MAC) + open -g $* + ;; esac } diff --git a/src/zlibs/parse b/src/zlibs/parse @@ -31,6 +31,8 @@ extract_mails() { act "$_tot emails to parse" + typeset -a _match + [[ ${_tot} -gt 100 ]] && { act "operation will take a while, showing progress" _prog=0 @@ -39,18 +41,27 @@ extract_mails() { # learn from senders, recipients or all _action=${1:-all} + # optional second argument limits parsing to header fields + [[ "$_action" = "all" ]] || _arg="-x $_action" - _found=0 - for m in ${mailpaths}; do - - # e_parse fills in e_addr(map) and e_parsed(newline term str) - hdr $m | e_parse $_action - for _e in ${(k)e_addr}; do + act "parsing $_action fields" + _match=() - print - "${(v)e_addr[$_e]} <$_e>" - _found=$(( $_found + 1 )) + for m in ${mailpaths}; do + # use RFC822 parser in fetchaddr + _parsed=`hdr $m | ${WORKDIR}/bin/fetchaddr ${=_arg} -a` + for _p in ${(f)_parsed}; do + + _e="${(Q)_p[(ws:,:)1]:l}" + # check if an email address was found + isemail "$_e" || continue + # extract also the name using comma separator + _n="${(Q)_p[(ws:,:)2]}" + + func "match: ${_n} <$_e>" + _match+=("${_n} <$_e>") done - + [[ $_tot -gt 100 ]] && { c=$(( $c + 1 )) [[ $c -gt 99 ]] && { @@ -61,7 +72,13 @@ extract_mails() { } done - notice "${_found} addresses extracted (including duplicates)" + _found=0 + for _l in ${_match}; do + print - "$_l" + _found=$(( $_found 
+ 1 )) + done + + notice "${#_match} addresses extracted (including duplicates)" } # extract all addresses found into a maildir @@ -145,12 +162,18 @@ extract_addresses() { # without arguments just list all entries in the active list # default is whitelist - arg=${PARAM[1]} + arg=${1} func "extract() arg: $arg (param: $PARAM)" # no arg means parse from stdin - [[ "$arg" = "" ]] && { + stdin=0 + [[ "$arg" = "" ]] && stdin=1 + [[ "$arg" = "stdin" ]] && stdin=1 + [[ "$arg" = "in" ]] && stdin=1 + + + [[ $stdin = 1 ]] && { read_stdin # Extract all entries found in stdin. Supports two formats (autodetected) @@ -194,7 +217,7 @@ BEGIN { header=1 } elif stdin_is_pathlist; then act "stdin seems a stream of full paths to single email files inside maildirs" # is a list of files - extract_mails ${=PARAM} + extract_mails "$2" _res=$? else error "Cannot process stream from stdin, unknown format" @@ -254,7 +277,7 @@ BEGIN { header=1 } [[ "$i" =~ "[User ID not found]" ]] && { act "looking up: $i" - ${=_gpg} --recv-key ${i[(w)1]} +o ${=_gpg} --recv-key ${i[(w)1]} } done @@ -371,6 +394,6 @@ BEGIN { date=""; from=""; subj="" } /^From:/ { from=$NF } /^Date:/ { date=sprintf("%02d %s %s", $3, $4, $5)} /^Subject:/ { subj=$0} -END { printf("%s :%s: %s\t%s\n", date, folder, from, subj) }' +END { printf("%s :%s:\t%s\t%s\n", date, folder, from, subj) }' done } diff --git a/src/zlibs/stats b/src/zlibs/stats @@ -32,22 +32,39 @@ stats() { case $1 in - timecloud) timecloud ;; + # timecloud) timecloud ;; - weeks) weeks ;; + # weeks) weeks ;; + + domain*) + _domain="" + for i in "${(f)$(cat)}"; do + _domain=${i[(ws:@:)-1]/>/} + num=${count[$_domain]:-0} + count[$_domain]=$(( $num + 1 )) + done + + ;; email*) _email="" for i in "${(f)$(cat)}"; do - _email=${i[(ws:@:)-1]/>/} + _email=${i[(ws:<:)2]/>/} num=${count[$_email]:-0} count[$_email]=$(( $num + 1 )) done + ;; + + name*) + _name="" + for i in "${(f)$(cat)}"; do + _name=${i[(ws:<:)1]//} + num=${count[$_name]:-0} + count[$_name]=$(( $num + 1 )) + 
done + ;; - ;; - folder*) # simple stats - #list_maildirs - #for i in ${maildirs}; do + folder*) _folder=""