
Improved the executing user guide

Ask Solem · 14 years ago
commit c9a0fe4555
1 changed file with 128 additions and 72 deletions
      docs/userguide/executing.rst


@@ -14,7 +14,7 @@ Basics
 ======
 
 Executing tasks is done with :meth:`~celery.task.Base.Task.apply_async`,
-and its shortcut: :meth:`~celery.task.Base.Task.delay`.
+and the shortcut: :meth:`~celery.task.Base.Task.delay`.
 
 ``delay`` is simple and convenient, as it looks like calling a regular
 function:
@@ -23,7 +23,7 @@ function:
 
     Task.delay(arg1, arg2, kwarg1="x", kwarg2="y")
 
-The same thing using ``apply_async`` is written like this:
+The same using ``apply_async`` is written like this:
 
 .. code-block:: python
 
@@ -32,12 +32,12 @@ The same thing using ``apply_async`` is written like this:
 
 While ``delay`` is convenient, it doesn't give you as much control as using
 ``apply_async``.  With ``apply_async`` you can override the execution options
-available as attributes on the ``Task`` class:  ``routing_key``, ``exchange``,
-``immediate``, ``mandatory``, ``priority``, and ``serializer``.
-In addition you can set a countdown/eta, or provide a custom broker connection.
+available as attributes on the ``Task`` class (see :ref:`task-options`).
+In addition you can set countdown/eta, task expiry, provide a custom broker
+connection and more.
 
-Let's go over these in more detail. The following examples use this simple
-task, which adds together two numbers:
+Let's go over these in more detail.  All the examples use a simple task,
+called ``add``, taking two positional arguments and returning the sum:
 
 .. code-block:: python
 
@@ -45,7 +45,6 @@ task, which adds together two numbers:
     def add(x, y):
         return x + y
 
-
 .. note::
 
     You can also execute a task by name using
@@ -63,8 +62,8 @@ ETA and countdown
 =================
 
 The ETA (estimated time of arrival) lets you set a specific date and time that
-is the earliest time at which your task will execute. ``countdown`` is
-a shortcut to set this by seconds in the future.
+is the earliest time at which your task will be executed.  ``countdown`` is
+a shortcut to set the eta by seconds into the future.
 
 .. code-block:: python
 
@@ -72,22 +71,24 @@ a shortcut to set this by seconds in the future.
     >>> result.get()    # this takes at least 3 seconds to return
     20
 
-Note that your task is guaranteed to be executed at some time *after* the
-specified date and time has passed, but not necessarily at that exact time.
+The task is guaranteed to be executed at some time *after* the
+specified date and time, but not necessarily at that exact time.
+Possible reasons for broken deadlines may include many items waiting
+in the queue, or heavy network latency.  To make sure your tasks
+are executed in a timely manner, you should monitor queue lengths.  Use
+Munin, or similar tools, to receive alerts, so appropriate action can be
+taken to ease the workload.  See :ref:`monitoring-munin`.
 
-While ``countdown`` is an integer, ``eta`` must be a :class:`~datetime.datetime` object,
-specifying an exact date and time in the future. This is good if you already
-have a :class:`~datetime.datetime` object and need to modify it with a
-:class:`~datetime.timedelta`, or when using time in seconds is not very readable.
+While ``countdown`` is an integer, ``eta`` must be a :class:`~datetime.datetime`
+object, specifying an exact date and time (including millisecond precision
+and timezone information):
 
 .. code-block:: python
 
-    from datetime import datetime, timedelta
+    >>> from datetime import datetime, timedelta
 
-    def add_tomorrow(username):
-        """Add this tomorrow."""
-        tomorrow = datetime.now() + timedelta(days=1)
-        add.apply_async(args=[10, 10], eta=tomorrow)
+    >>> tomorrow = datetime.now() + timedelta(days=1)
+    >>> add.apply_async(args=[10, 10], eta=tomorrow)
 
 .. _executing-expiration:
 
@@ -96,7 +97,9 @@ Expiration
 
 The ``expires`` argument defines an optional expiry time,
 either as seconds after task publish, or a specific date and time using
-:class:~datetime.datetime`.
+:class:`~datetime.datetime`:
+
+.. code-block:: python
 
     >>> # Task expires after one minute from now.
     >>> add.apply_async(args=[10, 10], expires=60)
@@ -107,7 +110,7 @@ either as seconds after task publish, or a specific date and time using
     ...                 expires=datetime.now() + timedelta(days=1)
 
 
-When a worker receives a task that has been expired it will mark
+When a worker receives an expired task it will mark
 the task as :state:`REVOKED` (:exc:`~celery.exceptions.TaskRevokedError`).
 
 .. _executing-serializers:
@@ -115,27 +118,76 @@ the task as :state:`REVOKED` (:exc:`~celery.exceptions.TaskRevokedError`).
 Serializers
 ===========
 
-Data passed between celery and workers has to be serialized to be
-transferred. The default serializer is :mod:`pickle`, but you can 
-change this for each
-task. There is built-in support for using :mod:`pickle`, ``JSON``, ``YAML``
-and ``msgpack``. You can also add your own custom serializers by registering
-them into the Carrot serializer registry.
+Data transferred between clients and workers needs to be serialized.
+The default serializer is :mod:`pickle`, but you can
+change this globally or for each individual task.
+There is built-in support for :mod:`pickle`, ``JSON``, ``YAML``
+and ``msgpack``, and you can also add your own custom serializers by registering
+them into the Carrot serializer registry (see
+`Carrot: Serialization of Data`_).
+
+.. _`Carrot: Serialization of Data`:
+    http://packages.python.org/carrot/introduction.html#serialization-of-data
+
+Each option has its advantages and disadvantages.
+
+json -- JSON is supported in many programming languages, is now
+    a standard part of Python (since 2.6), and is fairly fast to decode
+    using modern Python libraries such as :mod:`cjson` or :mod:`simplejson`.
+
+    The primary disadvantage to JSON is that it limits you to the following
+    data types: strings, unicode, floats, boolean, dictionaries, and lists.
+    Decimals and dates are notably missing.
+
+    Also, binary data will be transferred using base64 encoding, which will
+    cause the transferred data to be around 34% larger than an encoding which
+    supports native binary types.
+
+    However, if your data fits inside the above constraints and you need
+    cross-language support, the default setting of JSON is probably your
+    best choice.
+
+    See http://json.org for more information.
+
+pickle -- If you have no desire to support any language other than
+    Python, then using the pickle encoding will gain you the support of
+    all built-in Python data types (except class instances), smaller
+    messages when sending binary files, and a slight speedup over JSON
+    processing.
+
+    See http://docs.python.org/library/pickle.html for more information.
+
+yaml -- YAML has many of the same characteristics as json,
+    except that it natively supports more data types (including dates,
+    recursive references, etc.)
+
+    However, the Python libraries for YAML are a good bit slower than the
+    libraries for JSON.
 
-The default serializer (pickle) supports Python objects, like ``datetime`` and
-any custom datatypes you define yourself. But since pickle has poor support
-outside of the Python language, you need to choose another serializer if you
-need to communicate with other languages. In that case, ``JSON`` is a very
-popular choice.
+    If you need a more expressive set of data types and need to maintain
+    cross-language compatibility, then YAML may be a better fit than the above.
 
-The serialization method is sent with the message, so the worker knows how to
-deserialize any task. Of course, if you use a custom serializer, this must
-also be registered in the worker.
+    See http://yaml.org/ for more information.
 
-When sending a task the serialization method is taken from the following
-places in order: The ``serializer`` argument to ``apply_async``, the
-Task's ``serializer`` attribute, and finally the global default
-:setting:`CELERY_TASK_SERIALIZER` configuration directive.
+msgpack -- msgpack is a binary serialization format that is closer to JSON
+    in features.  It is very young however, and support should be considered
+    experimental at this point.
+
+    See http://msgpack.org/ for more information.
+
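The ~34% figure above is easy to check with plain Python; this sketch is independent of Celery and only illustrates the base64 growth:

```python
import base64

payload = bytes(range(256)) * 4          # 1 KiB of raw binary data

# JSON has no native binary type, so bytes must be base64-encoded
# before they can travel inside a JSON message.
encoded = base64.b64encode(payload)

overhead = len(encoded) / len(payload)   # ~1.34: the encoded form is ~34% larger
```

Base64 emits four output characters for every three input bytes, which is where the roughly one-third growth comes from.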
+The encoding used is available as a message header, so the worker knows how to
+deserialize any task.  If you use a custom serializer, this serializer must
+be available for the worker.
+
+The client uses the following order to decide which serializer
+to use when sending a task:
+
+    1. The ``serializer`` argument to ``apply_async``
+    2. The task's ``serializer`` attribute
+    3. The default :setting:`CELERY_TASK_SERIALIZER` setting.
+
+
+Using the ``serializer`` argument to ``apply_async``:
 
 .. code-block:: python
 
@@ -146,13 +198,13 @@ Task's ``serializer`` attribute, and finally the global default
 Connections and connection timeouts.
 ====================================
 
-Currently there is no support for broker connection pools in celery,
-so this is something you need to be aware of when sending more than
-one task at a time, as ``apply_async``/``delay`` establishes and
-closes a connection every time.
+Currently there is no support for broker connection pools, so 
+``apply_async`` establishes and closes a new connection every time
+it is called.  This is something you need to be aware of when sending
+more than one task at a time.
 
-If you need to send more than one task at the same time, it's a good idea to
-establish the connection yourself and pass it to ``apply_async``:
+You can handle the connection manually by creating a
+publisher:
 
 .. code-block:: python
 
@@ -171,9 +223,15 @@ establish the connection yourself and pass it to ``apply_async``:
     print([res.get() for res in results])
 
 
-The connection timeout is the number of seconds to wait before we give up
-establishing the connection. You can set this with the ``connect_timeout``
-argument to ``apply_async``:
+.. note::
+
+    This particular example is better expressed as a task set.
+    See :ref:`sets-taskset`.  Task sets already reuse connections.
+
+
+The connection timeout is the number of seconds to wait before giving up
+on establishing the connection.  You can set this by using the
+``connect_timeout`` argument to ``apply_async``:
 
 .. code-block:: python
 
@@ -191,20 +249,20 @@ Routing options
 ===============
 
 Celery uses the AMQP routing mechanisms to route tasks to different workers.
-You can route tasks using the following entities: exchange, queue and routing key.
 
 Messages (tasks) are sent to exchanges, a queue binds to an exchange with a
 routing key. Let's look at an example:
 
-Our application has a lot of tasks, some process video, others process images,
-and some gather collective intelligence about users. Some of these have
-higher priority than others so we want to make sure the high priority tasks
-get sent to powerful machines, while low priority tasks are sent to dedicated
-machines that can handle these at their own pace.
+Let's pretend we have an application with a lot of different tasks: some
+process video, others process images, and some gather collective intelligence
+about its users.  Some of these tasks are more important, so we want to make
+sure the high priority tasks get sent to dedicated nodes.
 
-For the sake of example we have only one exchange called ``tasks``.
-There are different types of exchanges that matches the routing key in
-different ways, the exchange types are:
+For the sake of this example we have a single exchange called ``tasks``.
+There are different types of exchanges, each type interpreting the routing
+key in different ways, implementing different messaging scenarios.
+
+The most common types used with Celery are ``direct`` and ``topic``.
 
 * direct
 
@@ -212,17 +270,15 @@ different ways, the exchange types are:
 
 * topic
 
-    In the topic exchange the routing key is made up of words separated by dots (``.``).
-    Words can be matched by the wild cards ``*`` and ``#``, where ``*`` matches one
-    exact word, and ``#`` matches one or many.
+    In the topic exchange the routing key is made up of words separated by
+    dots (``.``).  Words can be matched by the wild cards ``*`` and ``#``,
+    where ``*`` matches exactly one word, and ``#`` matches zero or more words.
 
     For example, ``*.stock.#`` matches the routing keys ``usd.stock`` and
     ``euro.stock.db`` but not ``stock.nasdaq``.
 
-(there are also other exchange types, but these are not used by celery)
-
-So, we create three queues, ``video``, ``image`` and ``lowpri`` that bind to
-our ``tasks`` exchange. For the queues we use the following binding keys::
+We create three queues, ``video``, ``image`` and ``lowpri``, that bind to
+the ``tasks`` exchange.  For the queues we use the following binding keys::
 
     video: video.#
     image: image.#
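One way to declare such bindings is the ``CELERY_QUEUES`` setting; this is a sketch in the configuration style of this Celery generation, using the exchange and binding keys from the example above (the ``lowpri`` queue, whose binding key is not shown here, would be declared the same way):

```python
# Hypothetical CELERY_QUEUES declaration for the bindings above.
CELERY_QUEUES = {
    "video": {"exchange": "tasks",
              "exchange_type": "topic",
              "binding_key": "video.#"},
    "image": {"exchange": "tasks",
              "exchange_type": "topic",
              "binding_key": "image.#"},
}
```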
@@ -245,7 +301,7 @@ listen to different queues:
 
 
 Later, if the crop task is consuming a lot of resources,
-we can bind some new workers to handle just the ``"image.crop"`` task,
+we can bind new workers to handle just the ``"image.crop"`` task,
 by creating a new queue that binds to ``"image.crop"``.
 
 .. seealso::
@@ -257,20 +313,20 @@ by creating a new queue that binds to ``"image.crop``".
 AMQP options
 ============
 
-.. warning::
-    The ``mandatory`` and ``immediate`` flags are not supported by
-    :mod:`amqplib` at this point.
-
 * mandatory
 
-This sets the delivery to be mandatory. An exception will be raised
+This sets the delivery to be mandatory.  An exception will be raised
 if there are no running workers able to take on the task.
 
+Not supported by :mod:`amqplib`.
+
 * immediate
 
 Request immediate delivery. Will raise an exception
 if the task cannot be routed to a worker immediately.
 
+Not supported by :mod:`amqplib`.
+
 * priority
 
 A number between ``0`` and ``9``, where ``0`` is the highest priority.