weechat/.weechat/python/grep.py at master

# -*- coding: utf-8 -*-
###
# Copyright (c) 2009-2011 by Elián Hanisch <lambdae2@gmail.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
###

###
# Search in Weechat buffers and logs (for Weechat 0.3.*)
#
#   Inspired by xt's grep.py
#   Originally I just wanted to add some fixes in grep.py, but then
#   I got carried away and rewrote everything, so new script.
#
#   Commands:
#   * /grep
#     Search in logs or buffers, see /help grep
#   * /logs:
#     Lists logs in ~/.weechat/logs, see /help logs
#
#   Settings:
#   * plugins.var.python.grep.clear_buffer:
#     Clear the results buffer before each search. Valid values: on, off
#
#   * plugins.var.python.grep.go_to_buffer:
#     Automatically go to grep buffer when search is over. Valid values: on, off
#
#   * plugins.var.python.grep.log_filter:
#     Comma-separated list of patterns that grep uses to exclude logs, e.g.
#     if you use '*server/*' any log in the 'server' folder will be excluded
#     when using the command '/grep log'
#
#   * plugins.var.python.grep.show_summary:
#     Shows summary for each log. Valid values: on, off
#
#   * plugins.var.python.grep.max_lines:
#     Grep prints only the last matched lines, up to the number defined here.
#
#   * plugins.var.python.grep.size_limit:
#     Size limit in KiB, used to decide whether grepping should run in the background. If
#     the logs to grep have a total size bigger than this value, grep runs as a new process.
#     It can also force or disable background processing: '0' forces grep to always run in
#     the background, while '' (empty string) disables it.
#
#   * plugins.var.python.grep.timeout_secs:
#     Timeout (in seconds) for background grepping.
#
#   * plugins.var.python.grep.default_tail_head:
#     Default number of lines returned when using the --head or --tail options.
#     Can be overridden in the command with the --number option.
#
#
#   TODO:
#   * try to figure out why hook_process chokes on long outputs (using a tempfile as a
#   workaround now)
#   * possibly add option for defining time intervals
#
#
#   History:
#
#   2022-11-11, anonymous2ch
#   version 0.8.6: ignore utf-8 decoding errors
#
#   2021-05-02, Sébastien Helleu <flashcode@flashtux.org>
#   version 0.8.5: add compatibility with WeeChat >= 3.2 (XDG directories)
#
#   2020-10-11, Thom Wiggers <thom@thomwiggers.nl>
#   version 0.8.4: Python3 compatibility fix
#
#   2020-05-06, Dominique Martinet <asmadeus@codewreck.org> and hexa-
#   version 0.8.3: more python3 compatibility fixes...
#
#   2019-06-30, dabbill <dabbill@gmail.com>
#               and Sébastien Helleu <flashcode@flashtux.org>
#   version 0.8.2: make script compatible with Python 3
#
#   2018-04-10, Sébastien Helleu <flashcode@flashtux.org>
#   version 0.8.1: fix infolist_time for WeeChat >= 2.2 (WeeChat returns a long
#                  integer instead of a string)
#
#   2017-09-20, mickael9
#   version 0.8:
#   * use weechat 1.5+ api for background processing (old method was unsafe and buggy)
#   * add timeout_secs setting (was previously hardcoded to 5 mins)
#
#   2017-07-23, Sébastien Helleu <flashcode@flashtux.org>
#   version 0.7.8: fix modulo by zero when nick is empty string
#
#   2016-06-23, mickael9
#   version 0.7.7: fix get_home function
#
#   2015-11-26
#   version 0.7.6: fix a typo
#
#   2015-01-31, Nicd-
#   version 0.7.5:
#   '~' is now expanded to the home directory in the log file path so
#   paths like '~/logs/' should work.
#
#   2015-01-14, nils_2
#   version 0.7.4: make q work to quit grep buffer (requested by: gb)
#
#   2014-03-29, Felix Eckhofer <felix@tribut.de>
#   version 0.7.3: fix typo
#
#   2011-01-09
#   version 0.7.2: bug fixes
#
#   2010-11-15
#   version 0.7.1:
#   * use TempFile so temporary files are guaranteed to be deleted.
#   * enable Archlinux workaround.
#
#   2010-10-26
#   version 0.7:
#   * added templates.
#   * using --only-match shows only unique strings.
#   * fixed bug that inverted -B -A switches when used with -t
#
#   2010-10-14
#   version 0.6.8: by xt <xt@bash.no>
#   * suppress highlights when printing in grep buffer
#
#   2010-10-06
#   version 0.6.7: by xt <xt@bash.no>
#   * better temporary file:
#    use tempfile.mkstemp to create a temp file in log dir,
#    makes it safer with regards to write permission and multi user
#
#   2010-04-08
#   version 0.6.6: bug fixes
#   * use WEECHAT_LIST_POS_END in log file completion, makes completion faster
#   * disable bytecode if using python 2.6
#   * use single quotes in command string
#   * fix bug that could change buffer's title when using /grep stop
#
#   2010-01-24
#   version 0.6.5: disabling bytecode is a 2.6 feature; delete the bytecode manually instead
#
#   2010-01-19
#   version 0.6.4: bug fix
#   version 0.6.3: added options --invert --only-match (replaces --exact, which is still available
#   but removed from help)
#   * use new 'irc_nick_color' info
#   * don't generate bytecode when spawning a new process
#   * show active options in buffer title
#
#   2010-01-17
#   version 0.6.2: removed 2.6-ish code
#   version 0.6.1: fixed bug when grepping in grep's buffer
#
#   2010-01-14
#   version 0.6.0: implemented grep in background
#   * improved context lines presentation.
#   * grepping for big (or many) log files runs in a weechat_process.
#   * added /grep stop.
#   * added 'size_limit' option
#   * fixed an infolist leak when grepping buffers
#   * added 'default_tail_head' option
#   * results are sorted by line count
#   * don't die if log is corrupted (has NULL chars in it)
#   * changed presentation of /logs
#   * log path completion doesn't suck anymore
#   * removed all tabs, because I learned how to configure Vim so that spaces aren't annoying
#   anymore. This was the script's original policy.
#
#   2010-01-05
#   version 0.5.5: rename script to 'grep.py' (FlashCode <flashcode@flashtux.org>).
#
#   2010-01-04
#   version 0.5.4.1: fix index error when using --after/before-context options.
#
#   2010-01-03
#   version 0.5.4: new features
#   * added --after-context and --before-context options.
#   * added --context as a shortcut for using both -A -B options.
#
#   2009-11-06
#   version 0.5.3: improvements for long grep output
#   * grep buffer input accepts the same flags as /grep to repeat a search with different
#     options.
#   * tweaks in grep's output.
#   * max_lines option added to limit grep's output.
#   * code in update_buffer() optimized.
#   * time stats in buffer title.
#   * added go_to_buffer config option.
#   * added --buffer to search only in buffers.
#   * refactoring.
#
#   2009-10-12, omero
#   version 0.5.2: made it python-2.4.x compliant
#
#   2009-08-17
#   version 0.5.1: some refactoring, show_summary option added.
#
#   2009-08-13
#   version 0.5: rewritten from xt's grep.py
#   * fixed searching in non weechat logs, for cases like, if you're
#     switching from irssi and rename and copy your irssi logs to %h/logs
#   * fixed "timestamp rainbow" when you /grep in grep's buffer
#   * allow to search in other buffers other than current or in logs
#     of currently closed buffers with cmd 'buffer'
#   * allow to search in any log file in %h/logs with cmd 'log'
#   * added --count to return the number of matched lines
#   * added --matchcase for case-sensitive search
#   * added --hilight to color matches
#   * added --head and --tail options, and --number
#   * added command /logs to list files in %h/logs
#   * added config option to clear the buffer before a search
#   * added config option to filter logs we don't want to grep
#   * added the possibility to repeat last search with another regexp by writing
#     it in grep's buffer
#   * changed spaces for tabs in the code, which is my preference
#
###

from os import path
import sys, getopt, time, os, re

try:
    import cPickle as pickle
except ImportError:
    import pickle

try:
    import weechat
    from weechat import WEECHAT_RC_OK, prnt, prnt_date_tags
    import_ok = True
except ImportError:
    import_ok = False

SCRIPT_NAME    = "grep"
SCRIPT_AUTHOR  = "Elián Hanisch <lambdae2@gmail.com>"
SCRIPT_VERSION = "0.8.6"
SCRIPT_LICENSE = "GPL3"
SCRIPT_DESC    = "Search in buffers and logs"
SCRIPT_COMMAND = "grep"

### Default Settings ###
settings = {
    'clear_buffer'      : 'off',
    'log_filter'        : '',
    'go_to_buffer'      : 'on',
    'max_lines'         : '4000',
    'show_summary'      : 'on',
    'size_limit'        : '2048',
    'default_tail_head' : '10',
    'timeout_secs'      : '300',
}

### Class definitions ###
class linesDict(dict):
    """
    Class for handling matched lines in more than one buffer.
    linesDict[buffer_name] = matched_lines_list
    """
    def __setitem__(self, key, value):
        assert isinstance(value, list)
        if key not in self:
            dict.__setitem__(self, key, value)
        else:
            dict.__getitem__(self, key).extend(value)

    def get_matches_count(self):
        """Return the sum of total matches stored."""
        if dict.__len__(self):
            return sum(map(lambda L: L.matches_count, self.values()))
        else:
            return 0

    def __len__(self):
        """Return the sum of total lines stored."""
        if dict.__len__(self):
            return sum(map(len, self.values()))
        else:
            return 0

    def __str__(self):
        """Returns buffer count or buffer name if there's just one stored."""
        n = len(self.keys())
        if n == 1:
            return list(self.keys())[0]
        elif n > 1:
            return '%s logs' %n
        else:
            return ''

    def items(self):
        """Returns a list of items sorted by line count."""
        items = list(dict.items(self))
        items.sort(key=lambda i: len(i[1]))
        return items

    def items_count(self):
        """Returns a list of items sorted by match count."""
        items = list(dict.items(self))
        items.sort(key=lambda i: i[1].matches_count)
        return items

    def strip_separator(self):
        """Strips leading and trailing separators from every stored list."""
        for L in self.values():
            L.strip_separator()

    def get_last_lines(self, n):
        """Trims the stored lists so at most 'n' lines remain in total. Lists with the most
        lines are served first and each keeps its last lines."""
        total_lines = len(self)
        #debug('total: %s n: %s' %(total_lines, n))
        if n >= total_lines:
            # nothing to do
            return
        for k, v in reversed(list(self.items())):
            l = len(v)
            if n > 0:
                if l > n:
                    del v[:l-n]
                    v.stripped_lines = l-n
                n -= l
            else:
                del v[:]
                v.stripped_lines = l

class linesList(list):
    """Class for list of matches, since sometimes I need to add lines that aren't matches, I need an
    independent counter."""
    _sep = '...'
    def __init__(self, *args):
        list.__init__(self, *args)
        self.matches_count = 0
        self.stripped_lines = 0

    def append(self, item):
        """Append lines, can be a string or a list with strings."""
        if isinstance(item, str):
            list.append(self, item)
        else:
            self.extend(item)

    def append_separator(self):
        """Adds a separator to the list, making sure it doesn't add two together."""
        s = self._sep
        if not self or self[-1] != s:
            self.append(s)

    def onlyUniq(self):
        s = set(self)
        del self[:]
        self.extend(s)

    def count_match(self, item=None):
        if item is None or isinstance(item, str):
            self.matches_count += 1
        else:
            self.matches_count += len(item)

    def strip_separator(self):
        """Removes separators if they are first and/or last in the list."""
        if self:
            s = self._sep
            if self[0] == s:
                del self[0]
            # the list may be empty now if it only held a separator
            if self and self[-1] == s:
                del self[-1]

### Misc functions ###
now = time.time
def get_size(f):
    try:
        return os.stat(f).st_size
    except OSError:
        return 0

sizeDict = {0:'b', 1:'KiB', 2:'MiB', 3:'GiB', 4:'TiB'}
def human_readable_size(size):
    power = 0
    # stop at the largest unit in sizeDict so the unit is never empty
    while size >= 1024 and power < 4:
        power += 1
        size /= 1024.0
    return '%.2f %s' %(size, sizeDict.get(power, ''))

def color_nick(nick):
    """Returns coloured nick, with coloured mode if any."""
    if not nick: return ''
    wcolor = weechat.color
    config_string = lambda s : weechat.config_string(weechat.config_get(s))
    config_int = lambda s : weechat.config_integer(weechat.config_get(s))
    # prefix and suffix
    prefix = config_string('irc.look.nick_prefix')
    suffix = config_string('irc.look.nick_suffix')
    prefix_c = suffix_c = wcolor(config_string('weechat.color.chat_delimiters'))
    if nick[0] == prefix:
        nick = nick[1:]
    else:
        prefix = prefix_c = ''
    if nick and nick[-1] == suffix:
        # suffix_c already holds the delimiter colour, so only strip the suffix here
        nick = nick[:-1]
    else:
        suffix = suffix_c = ''
    # nick mode
    modes = '@!+%'
    if nick and nick[0] in modes:
        mode, nick = nick[0], nick[1:]
        mode_color = wcolor(config_string('weechat.color.nicklist_prefix%d' \
            %(modes.find(mode) + 1)))
    else:
        mode = mode_color = ''
    # nick color
    nick_color = ''
    if nick:
        nick_color = weechat.info_get('irc_nick_color', nick)
        if not nick_color:
            # probably we're in WeeChat 0.3.0
            #debug('no irc_nick_color')
            color_nicks_number = config_int('weechat.look.color_nicks_number')
            idx = (sum(map(ord, nick))%color_nicks_number) + 1
            nick_color = wcolor(config_string('weechat.color.chat_nick_color%02d' %idx))
    return ''.join((prefix_c, prefix, mode_color, mode, nick_color, nick, suffix_c, suffix))

### Config and value validation ###
boolDict = {'on':True, 'off':False}
def get_config_boolean(config):
    value = weechat.config_get_plugin(config)
    try:
        return boolDict[value]
    except KeyError:
        default = settings[config]
        error("Error while fetching config '%s'. Using default value '%s'." %(config, default))
        error("'%s' is invalid, allowed: 'on', 'off'" %value)
        return boolDict[default]

def get_config_int(config, allow_empty_string=False):
    value = weechat.config_get_plugin(config)
    try:
        return int(value)
    except ValueError:
        if value == '' and allow_empty_string:
            return value
        default = settings[config]
        error("Error while fetching config '%s'. Using default value '%s'." %(config, default))
        error("'%s' is not a number." %value)
        return int(default)

def get_config_log_filter():
    log_filter = weechat.config_get_plugin('log_filter')
    if log_filter:
        return log_filter.split(',')
    else:
        return []

def get_home():
    options = {
        'directory': 'data',
    }
    home = weechat.string_eval_path_home(
        weechat.config_string(weechat.config_get('logger.file.path')),
        {}, {}, options,
    )
    return home

def strip_home(s, dir=''):
    """Strips home dir from the beginning of the log path, this makes them shorter."""
    if not dir:
        global home_dir
        dir = home_dir
    l = len(dir)
    if s[:l] == dir:
        return s[l:]
    return s

### Messages ###
script_nick = SCRIPT_NAME
def error(s, buffer=''):
    """Error msg"""
    prnt(buffer, '%s%s %s' %(weechat.prefix('error'), script_nick, s))
    if weechat.config_get_plugin('debug'):
        import traceback
        # sys.exc_type only exists in Python 2, sys.exc_info() works in both
        if sys.exc_info()[0]:
            trace = traceback.format_exc()
            prnt('', trace)

def say(s, buffer=''):
    """normal msg"""
    prnt_date_tags(buffer, 0, 'no_highlight', '%s\t%s' %(script_nick, s))



### Log files and buffers ###
cache_dir = {} # note: don't remove, needed for completion if the script was loaded recently
def dir_list(dir, filter_list=(), filter_excludes=True, include_dir=False):
    """Returns a list of files in 'dir' and its subdirs."""
    global cache_dir
    from os import walk
    from fnmatch import fnmatch
    #debug('dir_list: listing in %s' %dir)
    key = (dir, include_dir)
    try:
        return cache_dir[key]
    except KeyError:
        pass

    filter_list = filter_list or get_config_log_filter()
    dir_len = len(dir)
    if filter_list:
        def filter(file):
            file = file[dir_len:] # pattern shouldn't match home dir
            for pattern in filter_list:
                if fnmatch(file, pattern):
                    return filter_excludes
            return not filter_excludes
    else:
        filter = lambda f : not filter_excludes

    file_list = []
    extend = file_list.extend
    join = path.join
    def walk_path():
        for basedir, subdirs, files in walk(dir):
            #if include_dir:
            #    subdirs = map(lambda s : join(s, ''), subdirs)
            #    files.extend(subdirs)
            files_path = map(lambda f : join(basedir, f), files)
            files_path = [ file for file in files_path if not filter(file) ]
            extend(files_path)

    walk_path()
    cache_dir[key] = file_list
    #debug('dir_list: got %s' %str(file_list))
    return file_list

def get_file_by_pattern(pattern, all=False):
    """Returns the first log whose path matches 'pattern',
    if all is True returns all logs that match."""
    if not pattern: return []
    #debug('get_file_by_filename: searching for %s.' %pattern)
    # do envvar expansion and check file
    file = path.expanduser(pattern)
    file = path.expandvars(file)
    if path.isfile(file):
        return [file]
    # let's see if there's a matching log
    global home_dir
    file = path.join(home_dir, pattern)
    if path.isfile(file):
        return [file]
    else:
        from fnmatch import fnmatch
        file = []
        file_list = dir_list(home_dir)
        n = len(home_dir)
        for log in file_list:
            basename = log[n:]
            if fnmatch(basename, pattern):
                file.append(log)
        #debug('get_file_by_filename: got %s.' %file)
        if not all and file:
            file.sort()
            return [ file[-1] ]
        return file

def get_file_by_buffer(buffer):
    """Given buffer pointer, finds log's path or returns None."""
    #debug('get_file_by_buffer: searching for %s' %buffer)
    infolist = weechat.infolist_get('logger_buffer', '', '')
    if not infolist: return
    try:
        while weechat.infolist_next(infolist):
            pointer = weechat.infolist_pointer(infolist, 'buffer')
            if pointer == buffer:
                file = weechat.infolist_string(infolist, 'log_filename')
                if weechat.infolist_integer(infolist, 'log_enabled'):
                    #debug('get_file_by_buffer: got %s' %file)
                    return file
                #else:
                #    debug('get_file_by_buffer: got %s but log not enabled' %file)
    finally:
        #debug('infolist gets freed')
        weechat.infolist_free(infolist)

def get_file_by_name(buffer_name):
    """Given a buffer name, returns its log path or None. buffer_name should be in 'server.#channel'
    or '#channel' format."""
    #debug('get_file_by_name: searching for %s' %buffer_name)
    # common mask options
    config_masks = ('logger.mask.irc', 'logger.file.mask')
    # since there's no buffer pointer, we try to replace some local vars in mask, like $channel and
    # $server, then replace the local vars left with '*', and use it as a mask to get the path with
    # get_file_by_pattern
    for config in config_masks:
        mask = weechat.config_string(weechat.config_get(config))
        #debug('get_file_by_name: mask: %s' %mask)
        if '$name' in mask:
            mask = mask.replace('$name', buffer_name)
        elif '$channel' in mask or '$server' in mask:
            if '.' in buffer_name and \
                    '#' not in buffer_name[:buffer_name.find('.')]: # the dot isn't part of the channel name
                #    ^ I'm assuming channel starts with #, i'm lazy.
                server, channel = buffer_name.split('.', 1)
            else:
                server, channel = '*', buffer_name
            if '$channel' in mask:
                mask = mask.replace('$channel', channel)
            if '$server' in mask:
                mask = mask.replace('$server', server)
        # change the unreplaced vars by '*'
        try:
            from string import letters
        except ImportError:
            from string import ascii_letters as letters
        if '%' in mask:
            # vars for time formatting
            mask = mask.replace('%', '$')
        if '$' in mask:
            masks = mask.split('$')
            masks = map(lambda s: s.lstrip(letters), masks)
            mask = '*'.join(masks)
            if mask[0] != '*':
                mask = '*' + mask
        #debug('get_file_by_name: using mask %s' %mask)
        file = get_file_by_pattern(mask)
        #debug('get_file_by_name: got file %s' %file)
        if file:
            return file
    return None

def get_buffer_by_name(buffer_name):
    """Given a buffer name returns its buffer pointer or None."""
    #debug('get_buffer_by_name: searching for %s' %buffer_name)
    pointer = weechat.buffer_search('', buffer_name)
    if not pointer:
        try:
            infolist = weechat.infolist_get('buffer', '', '')
            while weechat.infolist_next(infolist):
                short_name = weechat.infolist_string(infolist, 'short_name')
                name = weechat.infolist_string(infolist, 'name')
                if buffer_name in (short_name, name):
                    #debug('get_buffer_by_name: found %s' %name)
                    pointer = weechat.buffer_search('', name)
                    return pointer
        finally:
            weechat.infolist_free(infolist)
    #debug('get_buffer_by_name: got %s' %pointer)
    return pointer

def get_all_buffers():
    """Returns list with pointers of all open buffers."""
    buffers = []
    infolist = weechat.infolist_get('buffer', '', '')
    while weechat.infolist_next(infolist):
        buffers.append(weechat.infolist_pointer(infolist, 'pointer'))
    weechat.infolist_free(infolist)
    grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
    if grep_buffer and grep_buffer in buffers:
        # remove it from list
        del buffers[buffers.index(grep_buffer)]
    return buffers

### Grep ###
def make_regexp(pattern, matchcase=False):
    """Returns a compiled regexp, or None if the pattern matches every line."""
    if pattern in ('.', '.*', '.?', '.+'):
        # because I don't need to use a regexp if we're going to match all lines
        return None
    # matching takes a lot more time if pattern starts or ends with .* and it isn't needed.
    if pattern[:2] == '.*':
        pattern = pattern[2:]
    if pattern[-2:] == '.*':
        pattern = pattern[:-2]
    try:
        if not matchcase:
            regexp = re.compile(pattern, re.IGNORECASE)
        else:
            regexp = re.compile(pattern)
    except Exception as e:
        raise Exception('Bad pattern, %s' % e)
    return regexp

def check_string(s, regexp, hilight='', exact=False):
    """Checks 's' with a regexp and returns it if it is a match."""
    if not regexp:
        return s

    elif exact:
        matchlist = regexp.findall(s)
        if matchlist:
            if isinstance(matchlist[0], tuple):
                # join tuples (when there's more than one match group in regexp)
                return [ ' '.join(t) for t in matchlist ]
            return matchlist

    elif hilight:
        matchlist = regexp.findall(s)
        if matchlist:
            if isinstance(matchlist[0], tuple):
                # flatten matchlist
                matchlist = [ item for L in matchlist for item in L if item ]
            matchlist = list(set(matchlist)) # remove duplicates if any
            # apply hilight
            color_hilight, color_reset = hilight.split(',', 1)
            for m in matchlist:
                s = s.replace(m, '%s%s%s' % (color_hilight, m, color_reset))
            return s

    # no need for findall() here
    elif regexp.search(s):
        return s

def grep_file(file, head, tail, after_context, before_context, count, regexp, hilight, exact, invert):
    """Return a list of lines that match 'regexp' in 'file', if no regexp returns all lines."""
    if count:
        tail = head = after_context = before_context = False
        hilight = ''
    elif exact:
        before_context = after_context = False
        hilight = ''
    elif invert:
        hilight = ''
    #debug(' '.join(map(str, (file, head, tail, after_context, before_context))))

    lines = linesList()
    # define these locally as it makes the loop run slightly faster
    append = lines.append
    count_match = lines.count_match
    separator = lines.append_separator
    if invert:
        def check(s):
            if check_string(s, regexp, hilight, exact):
                return None
            else:
                return s
    else:
        check = lambda s: check_string(s, regexp, hilight, exact)

    try:
        file_object = open(file, 'r', errors='ignore')
    except IOError:
        # file doesn't exist
        return lines
    if tail or before_context:
        # for these options, I need to seek in the file, but it is slower and uses a good deal of
        # memory if the log is too big, so we do this *only* for these options.
        file_lines = file_object.readlines()

        if tail:
            # instead of searching in the whole file and later picking the last few lines, we
            # reverse the log, search until count is reached and reverse it again, which is a lot
            # faster
            file_lines.reverse()
            # don't invert context switches
            before_context, after_context = after_context, before_context

        if before_context:
            before_context_range = list(range(1, before_context + 1))
            before_context_range.reverse()

        limit = tail or head

        line_idx = 0
        while line_idx < len(file_lines):
            line = file_lines[line_idx]
            line = check(line)
            if line:
                if before_context:
                    separator()
                    trimmed = False
                    for id in before_context_range:
                        if line_idx - id < 0:
                            # a negative index would wrap around to the end of the log
                            continue
                        try:
                            context_line = file_lines[line_idx - id]
                            if check(context_line):
                                # match in before context, that means we appended these same lines in a
                                # previous match, so we delete them merging both paragraphs
                                if not trimmed:
                                    del lines[id - before_context - 1:]
                                    trimmed = True
                            else:
                                append(context_line)
                        except IndexError:
                            pass
                append(line)
                count_match(line)
                if after_context:
                    id, offset = 0, 0
                    while id < after_context + offset:
                        id += 1
                        try:
                            context_line = file_lines[line_idx + id]
                            _context_line = check(context_line)
                            if _context_line:
                                offset = id
                                context_line = _context_line # so match is hilighted with --hilight
                                count_match()
                            append(context_line)
                        except IndexError:
                            pass
                    separator()
                    line_idx += id
                if limit and lines.matches_count >= limit:
                    break
            line_idx += 1

        if tail:
            lines.reverse()
    else:
        # do a normal grep
        limit = head

        for line in file_object:
            line = check(line)
            if line:
                count or append(line)
                count_match(line)
                if after_context:
                    id, offset = 0, 0
                    while id < after_context + offset:
                        id += 1
                        try:
                            context_line = next(file_object)
                            _context_line = check(context_line)
                            if _context_line:
                                offset = id
                                context_line = _context_line
                                count_match()
                            count or append(context_line)
                        except StopIteration:
                            pass
                    separator()
                if limit and lines.matches_count >= limit:
                    break

    file_object.close()
    return lines

 842def grep_buffer(buffer, head, tail, after_context, before_context, count, regexp, hilight, exact,
 843        invert):
 844    """Return a list of lines that match 'regexp' in 'buffer', if no regexp returns all lines."""
 845    lines = linesList()
 846    if count:
 847        tail = head = after_context = before_context = False
 848        hilight = ''
 849    elif exact:
 850        before_context = after_context = False
 851    #debug(' '.join(map(str, (tail, head, after_context, before_context, count, exact, hilight))))
 852
 853    # Using /grep in grep's buffer can lead to some funny effects
 854    # We should take measures if that's the case
 855    def make_get_line_funcion():
 856        """Returns a function that gets lines from the infolist, depending on whether the buffer
 857        is grep's or not."""
 858        string_remove_color = weechat.string_remove_color
 859        infolist_string = weechat.infolist_string
 860        grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
 861        if grep_buffer and buffer == grep_buffer:
 862            def function(infolist):
 863                prefix = infolist_string(infolist, 'prefix')
 864                message = infolist_string(infolist, 'message')
 865                if prefix: # only our messages have prefix, ignore it
 866                    return None
 867                return message
 868        else:
 869            infolist_time = weechat.infolist_time
 870            def function(infolist):
 871                prefix = string_remove_color(infolist_string(infolist, 'prefix'), '')
 872                message = string_remove_color(infolist_string(infolist, 'message'), '')
 873                date = infolist_time(infolist, 'date')
 874                # since WeeChat 2.2, infolist_time returns a long integer
 875                # instead of a string
 876                if not isinstance(date, str):
 877                    date = time.strftime('%F %T', time.localtime(int(date)))
 878                return '%s\t%s\t%s' %(date, prefix, message)
 879        return function
 880    get_line = make_get_line_funcion()
 881
 882    infolist = weechat.infolist_get('buffer_lines', buffer, '')
 883    if tail:
 884        # like with grep_file() if we need the last few matching lines, we move the cursor to
 885        # the end and search backwards
 886        infolist_next = weechat.infolist_prev
 887        infolist_prev = weechat.infolist_next
 888    else:
 889        infolist_next = weechat.infolist_next
 890        infolist_prev = weechat.infolist_prev
 891    limit = head or tail
 892
 893    # define these locally as it makes the loop run slightly faster
 894    append = lines.append
 895    count_match = lines.count_match
 896    separator = lines.append_separator
 897    if invert:
 898        def check(s):
 899            if check_string(s, regexp, hilight, exact):
 900                return None
 901            else:
 902                return s
 903    else:
 904        check = lambda s: check_string(s, regexp, hilight, exact)
 905
 906    if before_context:
 907        before_context_range = range(before_context, 0, -1) # reusable, unlike a reversed() iterator
 908
 909    while infolist_next(infolist):
 910        line = get_line(infolist)
 911        if line is None: continue
 912        line = check(line)
 913        if line:
 914            if before_context:
 915                separator()
 916                trimmed = False
 917                for id in before_context_range:
 918                    if not infolist_prev(infolist):
 919                        trimmed = True
 920                for id in before_context_range:
 921                    context_line = get_line(infolist)
 922                    if check(context_line):
 923                        if not trimmed:
 924                            del lines[id - before_context - 1:]
 925                            trimmed = True
 926                    else:
 927                        append(context_line)
 928                    infolist_next(infolist)
 929            count or append(line)
 930            count_match(line)
 931            if after_context:
 932                id, offset = 0, 0
 933                while id < after_context + offset:
 934                    id += 1
 935                    if infolist_next(infolist):
 936                        context_line = get_line(infolist)
 937                        _context_line = check(context_line)
 938                        if _context_line:
 939                            context_line = _context_line
 940                            offset = id
 941                            count_match()
 942                        append(context_line)
 943                    else:
 944                        # in the main loop infolist_next would start over and cause an infinite loop;
 945                        # replacing it with a stub avoids that
 946                        infolist_next = lambda x: 0
 947                separator()
 948            if limit and lines.matches_count >= limit:
 949                break
 950    weechat.infolist_free(infolist)
 951
 952    if tail:
 953        lines.reverse()
 954    return lines
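# Illustrative sketch, not part of the script: the after-context loops above keep
# extending their window whenever a context line matches again, which merges
# neighbouring paragraphs. The helper name `grep_with_after_context` and the '--'
# separator are assumptions made for this standalone example.

```python
import re

def grep_with_after_context(lines, pattern, after_context):
    """Return matching lines plus `after_context` trailing lines per match."""
    regexp = re.compile(pattern)
    out = []
    idx = 0
    while idx < len(lines):
        if regexp.search(lines[idx]):
            out.append(lines[idx])
            step, offset = 0, 0
            # a match inside the context window moves the window forward
            while step < after_context + offset:
                step += 1
                if idx + step >= len(lines):
                    break
                context_line = lines[idx + step]
                if regexp.search(context_line):
                    offset = step
                out.append(context_line)
            out.append('--')  # paragraph separator
            idx += step       # skip the lines already emitted as context
        idx += 1
    return out
```

# With after_context=1, two adjacent matches end up in a single paragraph instead
# of two overlapping ones.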
 955
 956### this is our main grep function
 957hook_file_grep = None
 958def show_matching_lines():
 959    """
 960    Greps buffers in search_in_buffers or files in search_in_files and updates grep buffer with the
 961    result.
 962    """
 963    global pattern, matchcase, number, count, exact, hilight, invert
 964    global tail, head, after_context, before_context
 965    global search_in_files, search_in_buffers, matched_lines, home_dir
 966    global time_start
 967    matched_lines = linesDict()
 968    #debug('buffers:%s \nlogs:%s' %(search_in_buffers, search_in_files))
 969    time_start = now()
 970
 971    # buffers
 972    if search_in_buffers:
 973        regexp = make_regexp(pattern, matchcase)
 974        for buffer in search_in_buffers:
 975            buffer_name = weechat.buffer_get_string(buffer, 'name')
 976            matched_lines[buffer_name] = grep_buffer(buffer, head, tail, after_context,
 977                    before_context, count, regexp, hilight, exact, invert)
 978
 979    # logs
 980    if search_in_files:
 981        size_limit = get_config_int('size_limit', allow_empty_string=True)
 982        background = False
 983        if size_limit or size_limit == 0:
 984            size = sum(map(get_size, search_in_files))
 985            if size > size_limit * 1024:
 986                background = True
 987        elif size_limit == '':
 988            background = False
 989
 990        regexp = make_regexp(pattern, matchcase)
 991
 992        global grep_options, log_pairs
 993        grep_options = (head, tail, after_context, before_context,
 994                        count, regexp, hilight, exact, invert)
 995
 996        log_pairs = [(strip_home(log), log) for log in search_in_files]
 997
 998        if not background:
 999            # run grep normally
1000            for log_name, log in log_pairs:
1001                matched_lines[log_name] = grep_file(log, *grep_options)
1002            buffer_update()
1003        else:
1004            global hook_file_grep, grep_stdout, grep_stderr, pattern_tmpl
1005            grep_stdout = grep_stderr = b''
1006            hook_file_grep = weechat.hook_process(
1007                'func:grep_process',
1008                get_config_int('timeout_secs') * 1000,
1009                'grep_process_cb',
1010                ''
1011            )
1012            if hook_file_grep:
1013                buffer_create("Searching for '%s' in %s worth of data..." % (
1014                    pattern_tmpl,
1015                    human_readable_size(size)
1016                ))
1017    else:
1018        buffer_update()
1019
1020
1021def grep_process(*args):
1022    result = {}
1023    try:
1024        global grep_options, log_pairs
1025        for log_name, log in log_pairs:
1026            result[log_name] = grep_file(log, *grep_options)
1027    except Exception as e:
1028        result = e
1029
1030    return pickle.dumps(result, 0)
1031
1032def grep_process_cb(data, command, return_code, out, err):
1033    global grep_stdout, grep_stderr, matched_lines, hook_file_grep
1034
1035    if isinstance(out, str):
1036        out = out.encode()
1037    grep_stdout += out
1038
1039    if isinstance(err, str):
1040        err = err.encode()
1041    grep_stderr += err
1042
1043    def set_buffer_error(message):
1044        error(message)
1045        grep_buffer = buffer_create()
1046        title = weechat.buffer_get_string(grep_buffer, 'title')
1047        title = title + ' %serror' % color_title
1048        weechat.buffer_set(grep_buffer, 'title', title)
1049
1050    if return_code == weechat.WEECHAT_HOOK_PROCESS_ERROR:
1051        set_buffer_error("Background grep timed out")
1052        hook_file_grep = None
1053        return WEECHAT_RC_OK
1054
1055    elif return_code >= 0:
1056        hook_file_grep = None
1057        if grep_stderr:
1058            set_buffer_error(grep_stderr)
1059            return WEECHAT_RC_OK
1060
1061        try:
1062            data = pickle.loads(grep_stdout)
1063            if isinstance(data, Exception):
1064                raise data
1065            matched_lines.update(data)
1066        except Exception as e:
1067            set_buffer_error(repr(e))
1068            return WEECHAT_RC_OK
1069        else:
1070            buffer_update()
1071
1072    return WEECHAT_RC_OK
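# Illustrative sketch, not part of the script: grep_process() and grep_process_cb()
# pass either the result dict or the raised exception through pickle, and the
# callback may receive the output in several chunks. The names `run_worker` and
# `collect` are assumptions for this standalone example; no WeeChat API is used.

```python
import pickle

def run_worker(job):
    """Run job() and return the pickled result, or the pickled exception."""
    try:
        result = job()
    except Exception as e:
        result = e
    return pickle.dumps(result, 0)

def collect(chunks):
    """Join the output chunks and unpickle; re-raise if the worker failed."""
    data = pickle.loads(b''.join(chunks))
    if isinstance(data, Exception):
        raise data
    return data

payload = run_worker(lambda: {'irc.example.log': ['line one']})
# simulate the callback receiving the output in two pieces
matches = collect([payload[:5], payload[5:]])
```

# Pickling the exception lets the parent report the real error instead of a
# truncated or empty result.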
1073
1074def get_grep_file_status():
1075    global search_in_files, matched_lines, time_start
1076    elapsed = now() - time_start
1077    if len(search_in_files) == 1:
1078        log = '%s (%s)' %(strip_home(search_in_files[0]),
1079                human_readable_size(get_size(search_in_files[0])))
1080    else:
1081        size = sum(map(get_size, search_in_files))
1082        log = '%s log files (%s)' %(len(search_in_files), human_readable_size(size))
1083    return 'Searching in %s, running for %.4f seconds. Interrupt it with "/grep stop" or "stop"' \
1084        ' in grep buffer.' %(log, elapsed)
1085
1086### Grep buffer ###
1087def buffer_update():
1088    """Updates our buffer with new lines."""
1089    global pattern_tmpl, matched_lines, pattern, count, hilight, invert, exact
1090    time_grep = now()
1091
1092    buffer = buffer_create()
1093    if get_config_boolean('clear_buffer'):
1094        weechat.buffer_clear(buffer)
1095    matched_lines.strip_separator() # remove first and last separators of each list
1096    len_total_lines = len(matched_lines)
1097    max_lines = get_config_int('max_lines')
1098    if not count and len_total_lines > max_lines:
1099        weechat.buffer_clear(buffer)
1100
1101    def _make_summary(log, lines, note):
1102        return '%s matches "%s%s%s"%s in %s%s%s%s' \
1103                %(lines.matches_count, color_summary, pattern_tmpl, color_info,
1104                  invert and ' (inverted)' or '',
1105                  color_summary, log, color_reset, note)
1106
1107    if count:
1108        make_summary = lambda log, lines : _make_summary(log, lines, ' (not shown)')
1109    else:
1110        def make_summary(log, lines):
1111            if lines.stripped_lines:
1112                if lines:
1113                    note = ' (last %s lines shown)' %len(lines)
1114                else:
1115                    note = ' (not shown)'
1116            else:
1117                note = ''
1118            return _make_summary(log, lines, note)
1119
1120    global weechat_format
1121    if hilight:
1122        # we don't want colors if there's match highlighting
1123        format_line = lambda s : '%s %s %s' %split_line(s)
1124    else:
1125        def format_line(s):
1126            global nick_dict, weechat_format
1127            date, nick, msg = split_line(s)
1128            if weechat_format:
1129                try:
1130                    nick = nick_dict[nick]
1131                except KeyError:
1132                    # cache nick
1133                    nick_c = color_nick(nick)
1134                    nick_dict[nick] = nick_c
1135                    nick = nick_c
1136                return '%s%s %s%s %s' %(color_date, date, nick, color_reset, msg)
1137            else:
1138                #no formatting
1139                return msg
1140
1141    prnt(buffer, '\n')
1142    print_line('Search for "%s%s%s"%s in %s%s%s.' %(color_summary, pattern_tmpl, color_info,
1143        invert and ' (inverted)' or '', color_summary, matched_lines, color_reset),
1144            buffer)
1145    # print last <max_lines> lines
1146    if matched_lines.get_matches_count():
1147        if count:
1148            # with count we sort by matched lines instead of just lines.
1149            matched_lines_items = matched_lines.items_count()
1150        else:
1151            matched_lines_items = matched_lines.items()
1152
1153        matched_lines.get_last_lines(max_lines)
1154        for log, lines in matched_lines_items:
1155            if lines.matches_count:
1156                # matched lines
1157                if not count:
1158                    # print lines
1159                    weechat_format = True
1160                    if exact:
1161                        lines.onlyUniq()
1162                    for line in lines:
1163                        #debug(repr(line))
1164                        if line == linesList._sep:
1165                            # separator
1166                            prnt(buffer, context_sep)
1167                        else:
1168                            if '\x00' in line:
1169                                # log was corrupted
1170                                error("Found garbage in log '%s', maybe it's corrupted" %log)
1171                                line = line.replace('\x00', '')
1172                            prnt_date_tags(buffer, 0, 'no_highlight', format_line(line))
1173
1174                # summary
1175                if count or get_config_boolean('show_summary'):
1176                    summary = make_summary(log, lines)
1177                    print_line(summary, buffer)
1178
1179            # separator
1180            if not count and lines:
1181                prnt(buffer, '\n')
1182    else:
1183        print_line('No matches found.', buffer)
1184
1185    # set title
1186    global time_start
1187    time_end = now()
1188    # total time
1189    time_total = time_end - time_start
1190    # percent of the total time used for grepping
1191    time_grep_pct = (time_grep - time_start)/time_total*100
1192    #debug('time: %.4f seconds (%.2f%%)' %(time_total, time_grep_pct))
1193    if not count and len_total_lines > max_lines:
1194        note = ' (last %s lines shown)' %len(matched_lines)
1195    else:
1196        note = ''
1197    title = "'q': close buffer | Search in %s%s%s %s matches%s | pattern \"%s%s%s\"%s %s | %.4f seconds (%.2f%%)" \
1198            %(color_title, matched_lines, color_reset, matched_lines.get_matches_count(), note,
1199              color_title, pattern_tmpl, color_reset, invert and ' (inverted)' or '', format_options(),
1200              time_total, time_grep_pct)
1201    weechat.buffer_set(buffer, 'title', title)
1202
1203    if get_config_boolean('go_to_buffer'):
1204        weechat.buffer_set(buffer, 'display', '1')
1205
1206    # free matched_lines so it can be removed from memory
1207    del matched_lines
1208
1209def split_line(s):
1210    """Splits log's line 's' in 3 parts, date, nick and msg."""
1211    global weechat_format
1212    if weechat_format and s.count('\t') >= 2:
1213        date, nick, msg = s.split('\t', 2) # date, nick, message
1214    else:
1215        # looks like log isn't in weechat's format
1216        weechat_format = False # incoming lines won't be formatted
1217        date, nick, msg = '', '', s
1218    # remove tabs
1219    if '\t' in msg:
1220        msg = msg.replace('\t', '    ')
1221    return date, nick, msg
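# Illustrative sketch, not part of the script: how a tab-separated log line is
# split into date, nick and message, with a fallback for lines in another format.
# `split_log_line` is a hypothetical stateless variant of split_line() that does
# not touch the `weechat_format` global.

```python
def split_log_line(s):
    """Split 'date<TAB>nick<TAB>message'; anything else is treated as message only."""
    if s.count('\t') >= 2:
        # maxsplit=2 keeps any further tabs inside the message part
        date, nick, msg = s.split('\t', 2)
    else:
        date, nick, msg = '', '', s
    return date, nick, msg.replace('\t', '    ')
```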
1222
1223def print_line(s, buffer=None, display=False):
1224    """Prints 's' in script's buffer as 'script_nick'. For displaying search summaries."""
1225    if buffer is None:
1226        buffer = buffer_create()
1227    say('%s%s' %(color_info, s), buffer)
1228    if display and get_config_boolean('go_to_buffer'):
1229        weechat.buffer_set(buffer, 'display', '1')
1230
1231def format_options():
1232    global matchcase, number, count, exact, hilight, invert
1233    global tail, head, after_context, before_context
1234    options = []
1235    append = options.append
1236    insert = options.insert
1237    chars = 'cHmov'
1238    for i, flag in enumerate((count, hilight, matchcase, exact, invert)):
1239        if flag:
1240            append(chars[i])
1241
1242    if head or tail:
1243        n = get_config_int('default_tail_head')
1244        if head:
1245            append('h')
1246            if head != n:
1247                insert(-1, ' -')
1248                append('n')
1249                append(head)
1250        elif tail:
1251            append('t')
1252            if tail != n:
1253                insert(-1, ' -')
1254                append('n')
1255                append(tail)
1256
1257    if before_context and after_context and (before_context == after_context):
1258        append(' -C')
1259        append(before_context)
1260    else:
1261        if before_context:
1262            append(' -B')
1263            append(before_context)
1264        if after_context:
1265            append(' -A')
1266            append(after_context)
1267
1268    s = ''.join(map(str, options)).strip()
1269    if s and s[0] != '-':
1270        s = '-' + s
1271    return s
1272
1273def buffer_create(title=None):
1274    """Returns our buffer pointer, creates and cleans the buffer if needed."""
1275    buffer = weechat.buffer_search('python', SCRIPT_NAME)
1276    if not buffer:
1277        buffer = weechat.buffer_new(SCRIPT_NAME, 'buffer_input', '', '', '')
1278        weechat.buffer_set(buffer, 'time_for_each_line', '0')
1279        weechat.buffer_set(buffer, 'nicklist', '0')
1280        weechat.buffer_set(buffer, 'title', title or 'grep output buffer')
1281        weechat.buffer_set(buffer, 'localvar_set_no_log', '1')
1282    elif title:
1283        weechat.buffer_set(buffer, 'title', title)
1284    return buffer
1285
1286def buffer_input(data, buffer, input_data):
1287    """Repeats last search with 'input_data' as regexp."""
1288    try:
1289        cmd_grep_stop(buffer, input_data)
1290    except:
1291        return WEECHAT_RC_OK
1292    if input_data in ('q', 'Q'):
1293        weechat.buffer_close(buffer)
1294        return weechat.WEECHAT_RC_OK
1295
1296    global search_in_buffers, search_in_files
1297    global pattern
1298    try:
1299        if pattern and (search_in_files or search_in_buffers):
1300            # check if the buffer pointers are still valid
1301            for pointer in search_in_buffers[:]: # iterate over a copy, the list is modified below
1302                infolist = weechat.infolist_get('buffer', pointer, '')
1303                if not infolist:
1304                    del search_in_buffers[search_in_buffers.index(pointer)]
1305                weechat.infolist_free(infolist)
1306            try:
1307                cmd_grep_parsing(input_data)
1308            except Exception as e:
1309                error('Argument error, %s' % e, buffer=buffer)
1310                return WEECHAT_RC_OK
1311            try:
1312                show_matching_lines()
1313            except Exception as e:
1314                error(e)
1315    except NameError:
1316        error("There isn't any previous search to repeat.", buffer=buffer)
1317    return WEECHAT_RC_OK
1318
1319### Commands ###
1320def cmd_init():
1321    """Resets global vars."""
1322    global home_dir, cache_dir, nick_dict
1323    global pattern_tmpl, pattern, matchcase, number, count, exact, hilight, invert
1324    global tail, head, after_context, before_context
1325    hilight = ''
1326    head = tail = after_context = before_context = invert = False
1327    matchcase = count = exact = False
1328    pattern_tmpl = pattern = number = None
1329    home_dir = get_home()
1330    cache_dir = {} # avoids walking the dir tree more than once per command
1331    nick_dict = {} # nick cache, so nick colors aren't recalculated every time
1332
1333def cmd_grep_parsing(args):
1334    """Parses args for /grep and grep input buffer."""
1335    global pattern_tmpl, pattern, matchcase, number, count, exact, hilight, invert
1336    global tail, head, after_context, before_context
1337    global log_name, buffer_name, only_buffers, all
1338    opts, args = getopt.gnu_getopt(args.split(), 'cmHeahtivn:bA:B:C:o', ['count', 'matchcase', 'hilight',
1339        'exact', 'all', 'head', 'tail', 'number=', 'buffer', 'after-context=', 'before-context=',
1340        'context=', 'invert', 'only-match'])
1341    #debug(opts, 'opts: '); debug(args, 'args: ')
1342    if len(args) >= 2:
1343        if args[0] == 'log':
1344            del args[0]
1345            log_name = args.pop(0)
1346        elif args[0] == 'buffer':
1347            del args[0]
1348            buffer_name = args.pop(0)
1349
1350    def tmplReplacer(match):
1351        """This function will replace templates with regexps"""
1352        s = match.groups()[0]
1353        t = match.group() # whole '%{...}' text, used in the error message and as fallback
1354        tmpl_key, _, tmpl_args = s.partition(' ')
1355        try:
1356            template = templates[tmpl_key]
1357            if callable(template):
1358                r = template(tmpl_args)
1359                if not r:
1360                    error("Template %s returned empty string "\
1361                          "(WeeChat doesn't have enough data)." %t)
1362                return r
1363            else:
1364                return template
1365        except:
1366            return t
1367
1368    args = ' '.join(args) # rejoin the pattern to keep its spaces
1369    if args:
1370        pattern_tmpl = args
1371        pattern = _tmplRe.sub(tmplReplacer, args)
1372        debug('Using regexp: %s', pattern)
1373    if not pattern:
1374        raise Exception('No pattern for grep the logs.')
1375
1376    def positive_number(opt, val):
1377        try:
1378            number = int(val)
1379            if number < 0:
1380                raise ValueError
1381            return number
1382        except ValueError:
1383            if len(opt) == 1:
1384                opt = '-' + opt
1385            else:
1386                opt = '--' + opt
1387            raise Exception("argument for %s must be a positive integer." % opt)
1388
1389    for opt, val in opts:
1390        opt = opt.strip('-')
1391        if opt in ('c', 'count'):
1392            count = not count
1393        elif opt in ('m', 'matchcase'):
1394            matchcase = not matchcase
1395        elif opt in ('H', 'hilight'):
1396            # hilight must be always a string!
1397            if hilight:
1398                hilight = ''
1399            else:
1400                hilight = '%s,%s' %(color_hilight, color_reset)
1401            # we pass the colors in the variable itself because check_string() must not use
1402            # weechat's module when applying the colors (this is for grep in a hooked process)
1403        elif opt in ('e', 'exact', 'o', 'only-match'):
1404            exact = not exact
1405            invert = False
1406        elif opt in ('a', 'all'):
1407            all = not all
1408        elif opt in ('h', 'head'):
1409            head = not head
1410            tail = False
1411        elif opt in ('t', 'tail'):
1412            tail = not tail
1413            head = False
1414        elif opt in ('b', 'buffer'):
1415            only_buffers = True
1416        elif opt in ('n', 'number'):
1417            number = positive_number(opt, val)
1418        elif opt in ('C', 'context'):
1419            n = positive_number(opt, val)
1420            after_context = n
1421            before_context = n
1422        elif opt in ('A', 'after-context'):
1423            after_context = positive_number(opt, val)
1424        elif opt in ('B', 'before-context'):
1425            before_context = positive_number(opt, val)
1426        elif opt in ('i', 'v', 'invert'):
1427            invert = not invert
1428            exact = False
1429    # number check
1430    if number is not None:
1431        if number == 0:
1432            head = tail = False
1433            number = None
1434        elif head:
1435            head = number
1436        elif tail:
1437            tail = number
1438    else:
1439        n = get_config_int('default_tail_head')
1440        if head:
1441            head = n
1442        elif tail:
1443            tail = n
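# Illustrative sketch, not part of the script: getopt.gnu_getopt() accepts options
# mixed in with the pattern words, which is why "/grep foo -t -n 5 bar" works.
# The `parse` helper and its reduced option set are assumptions for this example.

```python
import getopt

def parse(argline):
    """Return (options dict, pattern) for a reduced set of /grep options."""
    opts, args = getopt.gnu_getopt(argline.split(), 'tn:C:',
                                   ['tail', 'number=', 'context='])
    # drop the leading dashes, as cmd_grep_parsing() does with opt.strip('-')
    options = {opt.strip('-'): val for opt, val in opts}
    return options, ' '.join(args)

options, pattern = parse('foo -t -n 5 bar')
```

# Options that take a value ('n:' and 'number=') keep it as a string, so it still
# has to be validated and converted, as positive_number() does above.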
1444
1445def cmd_grep_stop(buffer, args):
1446    global hook_file_grep, pattern, matched_lines
1447    if hook_file_grep:
1448        if args == 'stop':
1449            weechat.unhook(hook_file_grep)
1450            hook_file_grep = None
1451
1452            s = 'Search for \'%s\' stopped.' % pattern
1453            say(s, buffer)
1454            grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
1455            if grep_buffer:
1456                weechat.buffer_set(grep_buffer, 'title', s)
1457            matched_lines = {}
1458        else:
1459            say(get_grep_file_status(), buffer)
1460        raise Exception
1461
1462def cmd_grep(data, buffer, args):
1463    """Search in buffers and logs."""
1464    global pattern, matchcase, head, tail, number, count, exact, hilight
1465    try:
1466        cmd_grep_stop(buffer, args)
1467    except:
1468        return WEECHAT_RC_OK
1469
1470    if not args:
1471        weechat.command('', '/help %s' %SCRIPT_COMMAND)
1472        return WEECHAT_RC_OK
1473
1474    cmd_init()
1475    global log_name, buffer_name, only_buffers, all
1476    log_name = buffer_name = ''
1477    only_buffers = all = False
1478
1479    # parse
1480    try:
1481        cmd_grep_parsing(args)
1482    except Exception as e:
1483        error('Argument error, %s' % e)
1484        return WEECHAT_RC_OK
1485
1486    # find logs
1487    log_file = search_buffer = None
1488    if log_name:
1489        log_file = get_file_by_pattern(log_name, all)
1490        if not log_file:
1491            error("Couldn't find any log for %s. Try /logs" %log_name)
1492            return WEECHAT_RC_OK
1493    elif all:
1494        search_buffer = get_all_buffers()
1495    elif buffer_name:
1496        search_buffer = get_buffer_by_name(buffer_name)
1497        if not search_buffer:
1498            # there's no buffer, try in the logs
1499            log_file = get_file_by_name(buffer_name)
1500            if not log_file:
1501                error("Logs or buffer for '%s' not found." %buffer_name)
1502                return WEECHAT_RC_OK
1503        else:
1504            search_buffer = [search_buffer]
1505    else:
1506        search_buffer = [buffer]
1507
1508    # make the log list
1509    global search_in_files, search_in_buffers
1510    search_in_files = []
1511    search_in_buffers = []
1512    if log_file:
1513        search_in_files = log_file
1514    elif not only_buffers:
1515        #debug(search_buffer)
1516        for pointer in search_buffer:
1517            log = get_file_by_buffer(pointer)
1518            #debug('buffer %s log %s' %(pointer, log))
1519            if log:
1520                search_in_files.append(log)
1521            else:
1522                search_in_buffers.append(pointer)
1523    else:
1524        search_in_buffers = search_buffer
1525
1526    # grepping
1527    try:
1528        show_matching_lines()
1529    except Exception as e:
1530        error(e)
1531    return WEECHAT_RC_OK
1532
1533def cmd_logs(data, buffer, args):
1534    """List files in Weechat's log dir."""
1535    cmd_init()
1536    global home_dir
1537    sort_by_size = False
1538    filter = []
1539
1540    try:
1541        opts, args = getopt.gnu_getopt(args.split(), 's', ['size'])
1542        if args:
1543            filter = args
1544        for opt, var in opts:
1545            opt = opt.strip('-')
1546            if opt in ('size', 's'):
1547                sort_by_size = True
1548    except Exception as e:
1549        error('Argument error, %s' % e)
1550        return WEECHAT_RC_OK
1551
1552    # if there's a filter, filter_excludes should be False
1553    file_list = dir_list(home_dir, filter, filter_excludes=not filter)
1554    if sort_by_size:
1555        file_list.sort(key=get_size)
1556    else:
1557        file_list.sort()
1558
1559    file_sizes = map(lambda x: human_readable_size(get_size(x)), file_list)
1560    # calculate column length
1561    if file_list:
1562        L = file_list[:]
1563        L.sort(key=len)
1564        biggest = L[-1]
1565        column_len = len(biggest) + 3
1566    else:
1567        column_len = ''
1568
1569    buffer = buffer_create()
1570    if get_config_boolean('clear_buffer'):
1571        weechat.buffer_clear(buffer)
1572    file_list = list(zip(file_list, file_sizes))
1573    msg = 'Found %s logs.' %len(file_list)
1574
1575    print_line(msg, buffer, display=True)
1576    for file, size in file_list:
1577        separator = column_len and '.'*(column_len - len(file))
1578        prnt(buffer, '%s %s %s' %(strip_home(file), separator, size))
1579    if file_list:
1580        print_line(msg, buffer)
1581    return WEECHAT_RC_OK
1582
1583
1584### Completion ###
1585def completion_log_files(data, completion_item, buffer, completion):
1586    #debug('completion: %s' %', '.join((data, completion_item, buffer, completion)))
1587    global home_dir
1588    l = len(home_dir)
1589    completion_list_add = weechat.hook_completion_list_add
1590    WEECHAT_LIST_POS_END = weechat.WEECHAT_LIST_POS_END
1591    for log in dir_list(home_dir):
1592        completion_list_add(completion, log[l:], 0, WEECHAT_LIST_POS_END)
1593    return WEECHAT_RC_OK
1594
1595def completion_grep_args(data, completion_item, buffer, completion):
1596    for arg in ('count', 'all', 'matchcase', 'hilight', 'exact', 'head', 'tail', 'number', 'buffer',
1597            'after-context', 'before-context', 'context', 'invert', 'only-match'):
1598        weechat.hook_completion_list_add(completion, '--' + arg, 0, weechat.WEECHAT_LIST_POS_SORT)
1599    for tmpl in templates:
1600        weechat.hook_completion_list_add(completion, '%{' + tmpl, 0, weechat.WEECHAT_LIST_POS_SORT)
1601    return WEECHAT_RC_OK


### Templates ###
# template placeholder
_tmplRe = re.compile(r'%\{(\w+.*?)(?:\}|$)')
# will match 999.999.999.999 but I don't care
ipAddress = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
domain = r'[\w-]{2,}(?:\.[\w-]{2,})*\.[a-z]{2,}'
url = r'\w+://(?:%s|%s)(?::\d+)?(?:/[^\])>\s]*)?' % (domain, ipAddress)

# With args, match URLs that contain any of the given words; otherwise match any URL.
def make_url_regexp(args):
    #debug('make url: %s', args)
    if args:
        words = r'(?:%s)' %'|'.join(map(re.escape, args.split()))
        return r'(?:\w+://|www\.)[^\s]*%s[^\s]*(?:/[^\])>\s]*)?' %words
    else:
        return url

# Convert a pattern with '*' and '?' wildcards into a regexp; any other character is escaped.
def make_simple_regexp(pattern):
    s = ''
    for c in pattern:
        if c == '*':
            s += '.*'
        elif c == '?':
            s += '.'
        else:
            s += re.escape(c)
    return s
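
# Illustrative sketch, not part of the original script: one way a pattern
# converted the way make_simple_regexp() does it could be matched against a
# whole string. The helper name is hypothetical, and the conversion is repeated
# inline only so that the example can be run on its own.
def _example_wildcard_match(pattern, text):
    import re
    regexp = ''.join('.*' if c == '*' else '.' if c == '?' else re.escape(c)
            for c in pattern)
    # re.match() only anchors at the start, so anchor the end explicitly.
    return re.match(regexp + r'\Z', text) is not None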

templates = {
            'ip': ipAddress,
           'url': make_url_regexp,
        'escape': lambda s: re.escape(s),
        'simple': make_simple_regexp,
        'domain': domain,
        }
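
# Illustrative sketch, not part of the original script: one way a
# '%{name args}' placeholder could be expanded with a table shaped like
# `templates` above (plain strings are used as-is, callables receive the text
# after the name). _example_expand is a hypothetical helper; the expansion code
# this script actually uses lives elsewhere and may differ.
def _example_expand(pattern, table):
    import re
    # same placeholder syntax as _tmplRe
    placeholder = re.compile(r'%\{(\w+.*?)(?:\}|$)')
    def replace(match):
        name, _, args = match.group(1).partition(' ')
        entry = table.get(name)
        if entry is None:
            # unknown template name: leave the placeholder untouched
            return match.group(0)
        return entry(args) if callable(entry) else entry
    return placeholder.sub(replace, pattern)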

### Main ###
def delete_bytecode():
    global script_path
    bytecode = path.join(script_path, SCRIPT_NAME + '.pyc')
    if path.isfile(bytecode):
        os.remove(bytecode)
    return WEECHAT_RC_OK

if __name__ == '__main__' and import_ok and \
        weechat.register(SCRIPT_NAME, SCRIPT_AUTHOR, SCRIPT_VERSION, SCRIPT_LICENSE,
        SCRIPT_DESC, 'delete_bytecode', ''):
    import sys

    home_dir = get_home()

    # so that we can import ourselves
    script_path = path.dirname(__file__)
    sys.path.append(script_path)
    delete_bytecode()

    # check python version
    if sys.version_info > (2, 6):
        bytecode = 'B'
    else:
        bytecode = ''


    weechat.hook_command(SCRIPT_COMMAND, cmd_grep.__doc__,
            "[log <file> | buffer <name> | stop] [-a|--all] [-b|--buffer] [-c|--count] [-m|--matchcase] "
            "[-H|--hilight] [-o|--only-match] [-i|-v|--invert] [(-h|--head)|(-t|--tail) [-n|--number <n>]] "
            "[-A|--after-context <n>] [-B|--before-context <n>] [-C|--context <n>] <expression>",
# help
"""
     log <file>: Search in one log that matches <file> in the logger path.
                 Use '*' and '?' as wildcards.
  buffer <name>: Search in buffer <name>; if there's no buffer with <name>, it will
                 try to search for a log file.
           stop: Stop a currently running search.
       -a --all: Search in all open buffers.
                 If used with 'log <file>', search in all logs that match <file>.
    -b --buffer: Search only in buffers, not in file logs.
     -c --count: Just count the number of matched lines instead of showing them.
 -m --matchcase: Don't do case insensitive search.
   -H --hilight: Colour exact matches in output buffer.
-o --only-match: Print only the matching part of the line (unique matches).
 -v -i --invert: Print lines that don't match the regular expression.
      -t --tail: Print the last 10 matching lines.
      -h --head: Print the first 10 matching lines.
-n --number <n>: Override the default number of lines for --tail or --head.
-A --after-context <n>: Show <n> lines of trailing context after matching lines.
-B --before-context <n>: Show <n> lines of leading context before matching lines.
-C --context <n>: Same as using both --after-context and --before-context simultaneously.
   <expression>: Expression to search.

Grep buffer:
  The input line accepts most arguments of /grep; it repeats the last search using the new
  arguments provided. You can't search in different logs from the buffer's input.
  Boolean arguments like --count, --tail, --head, --hilight, ... are toggleable.

Python regular expression syntax:
  See https://docs.python.org/3/library/re.html#regular-expression-syntax

Grep Templates:
     %{url [text]}: Matches anything like a URL, or a URL with text.
             %{ip}: Matches anything that looks like an IP address.
         %{domain}: Matches anything like a domain.
    %{escape text}: Escapes text in pattern.
 %{simple pattern}: Converts a pattern with '*' and '?' wildcards into a regexp.

Examples:
  Search for URLs with the word 'weechat' said by 'nick'
    /grep nick\\t.*%{url weechat}
  Search for '*.*' string
    /grep %{escape *.*}
""",
            # completion template
            "buffer %(buffers_names) %(grep_arguments)|%*"
            "||log %(grep_log_files) %(grep_arguments)|%*"
            "||stop"
            "||%(grep_arguments)|%*",
            'cmd_grep', '')
    weechat.hook_command('logs', cmd_logs.__doc__, "[-s|--size] [<filter>]",
            "-s --size: Sort logs by size.\n"
            " <filter>: Only show logs that match <filter>. Use '*' and '?' as wildcards.", '--size', 'cmd_logs', '')

    weechat.hook_completion('grep_log_files', "list of log files",
            'completion_log_files', '')
    weechat.hook_completion('grep_arguments', "list of arguments",
            'completion_grep_args', '')

    # settings
    for opt, val in settings.items():
        if not weechat.config_is_set_plugin(opt):
            weechat.config_set_plugin(opt, val)

    # colors
    color_date        = weechat.color('brown')
    color_info        = weechat.color('cyan')
    color_hilight     = weechat.color('lightred')
    color_reset       = weechat.color('reset')
    color_title       = weechat.color('yellow')
    color_summary     = weechat.color('lightcyan')
    color_delimiter   = weechat.color('chat_delimiters')
    color_script_nick = weechat.color('chat_nick')

    # pretty [grep]
    script_nick = '%s[%s%s%s]%s' %(color_delimiter, color_script_nick, SCRIPT_NAME, color_delimiter,
            color_reset)
    script_nick_nocolor = '[%s]' %SCRIPT_NAME
    # paragraph separator when using context options
    context_sep = '%s\t%s--' %(script_nick, color_info)

    # -------------------------------------------------------------------------
    # Debug

    if weechat.config_get_plugin('debug'):
        try:
            # custom debug module I use, allows me to inspect script's objects.
            import pybuffer
            debug = pybuffer.debugBuffer(globals(), '%s_debug' % SCRIPT_NAME)
        except Exception:
            # pybuffer is not usable: fall back to a plain print
            def debug(s, *args):
                try:
                    string_types = basestring  # Python 2
                except NameError:
                    string_types = str  # Python 3
                if not isinstance(s, string_types):
                    s = str(s)
                if args:
                    s = s %args
                prnt('', '%s\t%s' %(script_nick, s))
    else:
        def debug(*args):
            pass

# vim:set shiftwidth=4 tabstop=4 softtabstop=4 expandtab textwidth=100: