[PATCH 4 of 4 celerytask] Migrate task logs from redis and database to logs files

Denis Laxalde denis.laxalde at logilab.fr
Mon Jun 25 09:40:52 CEST 2018


Philippe Pepiot wrote:
> # HG changeset patch
> # User Philippe Pepiot <philippe.pepiot at logilab.fr>
> # Date 1529670043 -7200
> #      Fri Jun 22 14:20:43 2018 +0200
> # Node ID c1d93ed176f23c9d97a544d46b6a2374fa944afe
> # Parent  82130ff7c6d5d8e733c34881bc08a6b066c6a962
> # Available At https://hg.logilab.org/review/cubes/celerytask
> #              hg pull https://hg.logilab.org/review/cubes/celerytask -r c1d93ed176f2
> Migrate task logs from redis and database to logs files
> 
> Storing logs in redis and then in the database took too much memory in redis
> and too much storage in the database.
> Using files is far simpler, but it requires a shared file system (nfs)
> when the worker and the cubicweb instance (the reader) are not on the same
> server.
> 
> Use the new cw_celerytask_helpers filelogger instead of redislogger.
> Logs are stored in the celerytask-log-dir directory as gzip files with a
> predictable filename based on task_id (which is unique).
> 
> Drop task_logs attribute from CeleryTask and update tests accordingly.
> celery-monitor no longer copies logs from redis to the database when the
> task has ended.
> 

> diff --git a/migration/0.6.0_Any.py b/migration/0.6.0_Any.py
> --- a/migration/0.6.0_Any.py
> +++ b/migration/0.6.0_Any.py
> @@ -1,1 +1,4 @@
> +from cubes.celerytask.migration.utils import migrate_task_logs_to_bfss
>  option_added('celerytask-log-dir')
> +migrate_task_logs_to_bfss(cnx)
> +drop_attribute('CeleryTask', 'task_logs')

You can also run i18ncube to clean up the catalogs for this attribute along
the way.

> diff --git a/migration/__init__.py b/migration/__init__.py
> new file mode 100644
> diff --git a/migration/utils.py b/migration/utils.py
> new file mode 100644
> --- /dev/null
> +++ b/migration/utils.py
> @@ -0,0 +1,46 @@
> +# -*- coding: utf-8 -*-
> +# copyright 2018 LOGILAB S.A. (Paris, FRANCE), all rights reserved.
> +# contact http://www.logilab.fr -- mailto:contact at logilab.fr
> +#
> +# This program is free software: you can redistribute it and/or modify it under
> +# the terms of the GNU Lesser General Public License as published by the Free
> +# Software Foundation, either version 2.1 of the License, or (at your option)
> +# any later version.
> +#
> +# This program is distributed in the hope that it will be useful, but WITHOUT
> +# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
> +# FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more
> +# details.
> +#
> +# You should have received a copy of the GNU Lesser General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +import gzip
> +
> +from logilab.common.shellutils import ProgressBar
> +from cw_celerytask_helpers.filelogger import get_log_filename
> +from cw_celerytask_helpers.redislogger import get_task_logs, flush_task_logs
> +
> +
> +def migrate_task_logs_to_bfss(cnx):
> +    """Migrate logs from redis and from database to logs files in
> +    celerytask-log-dir"""
> +    to_flush = set()
> +    rset = cnx.execute('Any X, T WHERE X is CeleryTask, X task_id T')
> +    pb = ProgressBar(len(rset))
> +    for eid, task_id in rset:
> +        entity = cnx.entity_from_eid(eid)
> +        logs = get_task_logs(task_id)
> +        if logs is not None:
> +            to_flush.add(task_id)
> +        else:
> +            if entity.task_logs is not None:
> +                logs = entity.task_logs.read()
> +        if logs is not None:
> +            fname = get_log_filename(task_id)
> +            with gzip.open(fname, 'wb') as f:
> +                f.write(logs)
> +        pb.update()
> +    cnx.commit()

What is this commit() for? It probably does not hurt, but I don't see
any db write operation.

The patch LGTM otherwise.

> +    for task_id in to_flush:
> +        flush_task_logs(task_id)
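
For reference, the layout the commit message describes (one gzip file per
task, named after the unique task_id) could be sketched roughly as below.
This is only an illustration under my own assumptions: log_filename, the
".log.gz" suffix and the read/write helpers are hypothetical names, not the
actual cw_celerytask_helpers.filelogger implementation.

```python
import gzip
import os


def log_filename(log_dir, task_id):
    # Hypothetical naming scheme: one gzip file per task, named after task_id.
    # The real get_log_filename() may build the path differently.
    return os.path.join(log_dir, "{}.log.gz".format(task_id))


def write_task_log(log_dir, task_id, logs):
    """Write raw log bytes for task_id as a gzip file and return its path."""
    fname = log_filename(log_dir, task_id)
    with gzip.open(fname, "wb") as f:
        f.write(logs)
    return fname


def read_task_log(log_dir, task_id):
    """Return the log bytes for task_id, or None if no log file exists."""
    fname = log_filename(log_dir, task_id)
    if not os.path.exists(fname):
        return None
    with gzip.open(fname, "rb") as f:
        return f.read()
```

Since the filename depends only on task_id, the reader can locate a task's
log without any database lookup, provided log_dir is on a file system that
both the worker and the reader can see.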

