.. _next-steps:

============
 Next Steps
============

The :ref:`first-steps` guide is intentionally minimal.  In this guide
we will demonstrate what Celery offers in more detail, including
how to add Celery support for your application and library.

This document does not document all of Celery's features and
best practices, so it's recommended that you also read the
:ref:`User Guide <guide>`.

.. contents::
    :local:
    :depth: 1
 
Using Celery in your Application
================================

.. _project-layout:

Our Project
-----------

Project layout::

    proj/__init__.py
        /celery.py
        /tasks.py

:file:`proj/celery.py`
~~~~~~~~~~~~~~~~~~~~~~

.. literalinclude:: ../../examples/next-steps/proj/celery.py
    :language: python
 
In this module we created our :class:`@Celery` instance (sometimes
referred to as the *app*).  To use Celery within your project
you simply import this instance.

- The ``broker`` argument specifies the URL of the broker to use.

  See :ref:`celerytut-broker` for more information.

- The ``backend`` argument specifies the result backend to use.

  It's used to keep track of task state and results.
  While results are disabled by default, we use the amqp result backend here
  because we demonstrate how retrieving results works later; you may want to use
  a different backend for your application.  They all have different
  strengths and weaknesses.  If you don't need results it's better
  to disable them.  Results can also be disabled for individual tasks
  by setting the ``@task(ignore_result=True)`` option.

  See :ref:`celerytut-keeping-results` for more information.

- The ``include`` argument is a list of modules to import when
  the worker starts.  You need to add our tasks module here so
  that the worker is able to find our tasks.
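For reference, the included module looks roughly like the sketch below.
This is a sketch, not the canonical file (which is what the
``literalinclude`` above renders); the broker and backend URLs are
placeholders you should adapt to your setup:

.. code-block:: python

    from __future__ import absolute_import

    from celery import Celery

    # Placeholder URLs: point these at your actual broker/backend.
    app = Celery('proj',
                 broker='amqp://',
                 backend='amqp://',
                 include=['proj.tasks'])

    # Optional configuration, see the application user guide.
    app.conf.update(
        CELERY_TASK_RESULT_EXPIRES=3600,
    )

    if __name__ == '__main__':
        app.start()

Creating the :class:`@Celery` instance doesn't connect to the broker;
the connection is only established when messages are actually sent.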
 
:file:`proj/tasks.py`
~~~~~~~~~~~~~~~~~~~~~

.. literalinclude:: ../../examples/next-steps/proj/tasks.py
    :language: python
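The tasks module defines the ``add``, ``mul``, and ``xsum`` tasks used in the
examples throughout this guide.  A self-contained sketch of it (in the real
project the app is imported from ``proj.celery`` rather than created inline):

.. code-block:: python

    from celery import Celery

    # Created inline here so the sketch is self-contained;
    # in the project layout above this would be:
    #     from proj.celery import app
    app = Celery('tasks', broker='amqp://')

    @app.task
    def add(x, y):
        return x + y

    @app.task
    def mul(x, y):
        return x * y

    @app.task
    def xsum(numbers):
        return sum(numbers)

Because tasks are just decorated functions, calling one directly (e.g.
``add(2, 2)``) runs it in the current process without sending a message.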
 
Starting the worker
-------------------

The :program:`celery` program can be used to start the worker (you need
to run the worker in the directory above ``proj``):

.. code-block:: bash

    $ celery -A proj worker -l info

When the worker starts you should see a banner and some messages::

     -------------- celery@halcyon.local v3.1 (Cipater)
     ---- **** -----
     --- * ***  * -- [Configuration]
     -- * - **** --- . broker:      amqp://guest@localhost:5672//
     - ** ---------- . app:         __main__:0x1012d8590
     - ** ---------- . concurrency: 8 (processes)
     - ** ---------- . events:      OFF (enable -E to monitor this worker)
     - ** ----------
     - *** --- * --- [Queues]
     -- ******* ---- . celery:      exchange:celery(direct) binding:celery
     --- ***** -----

     [2012-06-08 16:23:51,078: WARNING/MainProcess] celery@halcyon.local has started.
 
- The *broker* is the URL you specified in the ``broker`` argument in our
  ``celery`` module; you can also specify a different broker on the
  command-line by using the :option:`-b` option.

- *Concurrency* is the number of prefork worker processes used
  to process your tasks concurrently; when all of these are busy doing work,
  new tasks will have to wait for one of the tasks to finish before
  they can be processed.

  The default concurrency number is the number of CPUs on that machine
  (including cores); you can specify a custom number using the
  :option:`-c` option.  There is no recommended value, as the optimal
  number depends on a number of factors, but if your tasks are mostly
  I/O-bound then you can try to increase it.  Experimentation has shown
  that adding more than twice the number of CPUs is rarely effective,
  and likely to degrade performance instead.

  In addition to the default prefork pool, Celery also supports using
  Eventlet, Gevent, and threads (see :ref:`concurrency`).
 
- *Events* is an option that, when enabled, causes Celery to send
  monitoring messages (events) for actions occurring in the worker.
  These can be used by monitor programs like ``celery events``
  and Flower, the real-time Celery monitor, which you can read about in
  the :ref:`Monitoring and Management guide <guide-monitoring>`.

- *Queues* is the list of queues that the worker will consume
  tasks from.  The worker can be told to consume from several queues
  at once, and this is used to route messages to specific workers
  as a means for Quality of Service, separation of concerns,
  and emulating priorities, all described in the :ref:`Routing Guide
  <guide-routing>`.

You can get a complete list of command-line arguments
by passing in the ``--help`` flag:

.. code-block:: bash

    $ celery worker --help

These options are described in more detail in the :ref:`Workers Guide <guide-workers>`.
 
Stopping the worker
~~~~~~~~~~~~~~~~~~~

To stop the worker simply hit Ctrl+C.  A list of signals supported
by the worker is detailed in the :ref:`Workers Guide <guide-workers>`.

In the background
~~~~~~~~~~~~~~~~~

In production you will want to run the worker in the background; this is
described in detail in the :ref:`daemonization tutorial <daemonizing>`.

The daemonization scripts use the :program:`celery multi` command to
start one or more workers in the background:

.. code-block:: bash

    $ celery multi start w1 -A proj -l info
    celery multi v3.1.1 (Cipater)
    > Starting nodes...
        > w1.halcyon.local: OK
 
You can restart it too:

.. code-block:: bash

    $ celery multi restart w1 -A proj -l info
    celery multi v3.1.1 (Cipater)
    > Stopping nodes...
        > w1.halcyon.local: TERM -> 64024
    > Waiting for 1 node.....
        > w1.halcyon.local: OK
    > Restarting node w1.halcyon.local: OK
    celery multi v3.1.1 (Cipater)
    > Stopping nodes...
        > w1.halcyon.local: TERM -> 64052

or stop it:

.. code-block:: bash

    $ celery multi stop w1 -A proj -l info

The ``stop`` command is asynchronous so it will not wait for the
worker to shut down.  You will probably want to use the ``stopwait`` command
instead, which will ensure that all currently executing tasks are completed:

.. code-block:: bash

    $ celery multi stopwait w1 -A proj -l info
 
.. note::

    :program:`celery multi` doesn't store information about workers,
    so you need to use the same command-line arguments when
    restarting.  When stopping, only the same pidfile and logfile
    arguments must be used.

By default it will create pid and log files in the current directory;
to protect against multiple workers launching on top of each other
you are encouraged to put these in a dedicated directory:

.. code-block:: bash

    $ mkdir -p /var/run/celery
    $ mkdir -p /var/log/celery
    $ celery multi start w1 -A proj -l info --pidfile=/var/run/celery/%n.pid \
                                            --logfile=/var/log/celery/%n%I.log

With the multi command you can start multiple workers, and there is a powerful
command-line syntax to specify arguments for different workers too,
e.g.:

.. code-block:: bash

    $ celery multi start 10 -A proj -l info -Q:1-3 images,video -Q:4,5 data \
        -Q default -L:4,5 debug

For more examples see the :mod:`~celery.bin.multi` module in the API
reference.
 
.. _app-argument:

About the :option:`--app` argument
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :option:`--app` argument specifies the Celery app instance to use;
it must be in the form of ``module.path:attribute``.

But it also supports a shortcut form: if only a package name is specified,
it'll try to search for the app instance in the following order.

With ``--app=proj``:

1) an attribute named ``proj.app``, or
2) an attribute named ``proj.celery``, or
3) any attribute in the module ``proj`` where the value is a Celery
   application.

If none of these are found it'll try a submodule named ``proj.celery``:

4) an attribute named ``proj.celery.app``, or
5) an attribute named ``proj.celery.celery``, or
6) any attribute in the module ``proj.celery`` where the value is a Celery
   application.

This scheme mimics the practices used in the documentation,
i.e. ``proj:app`` for a single contained module, and ``proj.celery:app``
for larger projects.
 
.. _calling-tasks:

Calling Tasks
=============

You can call a task using the :meth:`delay` method::

    >>> add.delay(2, 2)

This method is actually a star-argument shortcut to another method called
:meth:`apply_async`::

    >>> add.apply_async((2, 2))

The latter enables you to specify execution options like the time to run
(countdown), the queue it should be sent to, and so on::

    >>> add.apply_async((2, 2), queue='lopri', countdown=10)

In the above example the task will be sent to a queue named ``lopri`` and the
task will execute, at the earliest, 10 seconds after the message was sent.

Applying the task directly will execute the task in the current process,
so that no message is sent::

    >>> add(2, 2)
    4

These three methods - :meth:`delay`, :meth:`apply_async`, and applying
(``__call__``) - represent the Celery calling API, which is also used for
signatures.

A more detailed overview of the Calling API can be found in the
:ref:`Calling User Guide <guide-calling>`.
 
Every task invocation will be given a unique identifier (a UUID): this
is the task id.

The ``delay`` and ``apply_async`` methods return an :class:`~@AsyncResult`
instance, which can be used to keep track of the task's execution state.
But for this you need to enable a :ref:`result backend <task-result-backends>` so that
the state can be stored somewhere.

Results are disabled by default because there is no result
backend that suits every application, so to choose one you need to consider
the drawbacks of each individual backend.  For many tasks
keeping the return value isn't even very useful, so it's a sensible default to
have.  Also note that result backends are not used for monitoring tasks and workers:
for that Celery uses dedicated event messages (see :ref:`guide-monitoring`).

If you have a result backend configured you can retrieve the return
value of a task::

    >>> res = add.delay(2, 2)
    >>> res.get(timeout=1)
    4

You can find the task's id by looking at the :attr:`id` attribute::

    >>> res.id
    d6b3aea2-fb9b-4ebc-8da4-848818db9114
 
You can also inspect the exception and traceback if the task raised an
exception; in fact ``result.get()`` will propagate any errors by default::

    >>> res = add.delay(2)
    >>> res.get(timeout=1)
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "/opt/devel/celery/celery/result.py", line 113, in get
        interval=interval)
    File "/opt/devel/celery/celery/backends/amqp.py", line 138, in wait_for
        raise meta['result']
    TypeError: add() takes exactly 2 arguments (1 given)

If you don't wish for the errors to propagate then you can disable that
by passing the ``propagate`` argument::

    >>> res.get(propagate=False)
    TypeError('add() takes exactly 2 arguments (1 given)',)
 
In this case it will return the exception instance raised instead,
and so to check whether the task succeeded or failed you will have to
use the corresponding methods on the result instance::

    >>> res.failed()
    True

    >>> res.successful()
    False

So how does it know if the task has failed or not?  It can find out by looking
at the task's *state*::

    >>> res.state
    'FAILURE'

A task can only be in a single state, but it can progress through several
states.  The stages of a typical task can be::

    PENDING -> STARTED -> SUCCESS

The started state is a special state that is only recorded if the
:setting:`CELERY_TRACK_STARTED` setting is enabled, or if the
``@task(track_started=True)`` option is set for the task.

The pending state is actually not a recorded state, but rather
the default state for any task id that is unknown, as you can see
in this example::

    >>> from proj.celery import app

    >>> res = app.AsyncResult('this-id-does-not-exist')
    >>> res.state
    'PENDING'
 
If the task is retried the stages can become even more complex.
For example, for a task that is retried two times the stages would be::

    PENDING -> STARTED -> RETRY -> STARTED -> RETRY -> STARTED -> SUCCESS
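How a task enters the ``RETRY`` state is up to the task itself: a bound task
can call ``self.retry()`` when it hits a transient error.  A hedged sketch
(the flaky I/O is simulated here with a random failure; the task name and
parameters are illustrative, not part of the example project):

.. code-block:: python

    import random

    from celery import Celery

    app = Celery('tasks', broker='amqp://')

    @app.task(bind=True, max_retries=2, default_retry_delay=5)
    def flaky(self):
        try:
            # Stand-in for a transient failure (network, etc.).
            if random.random() < 0.5:
                raise IOError('simulated transient failure')
            return 'ok'
        except IOError as exc:
            # Records the RETRY state and re-queues the task;
            # after max_retries is exceeded the task ends in FAILURE.
            raise self.retry(exc=exc)

Each retry re-delivers the message, so the task cycles through
``STARTED -> RETRY`` until it either succeeds or exhausts ``max_retries``.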
 
To read more about task states you should see the :ref:`task-states` section
in the tasks user guide.

Calling tasks is described in detail in the
:ref:`Calling Guide <guide-calling>`.
 
.. _designing-workflows:

*Canvas*: Designing Workflows
=============================

You just learned how to call a task using the task's ``delay`` method,
and this is often all you need, but sometimes you may want to pass the
signature of a task invocation to another process or as an argument to another
function; for this Celery uses something called *signatures*.

A signature wraps the arguments and execution options of a single task
invocation in such a way that it can be passed to functions or even serialized
and sent across the wire.

You can create a signature for the ``add`` task using the arguments ``(2, 2)``
and a countdown of 10 seconds like this::

    >>> add.signature((2, 2), countdown=10)
    tasks.add(2, 2)

There is also a shortcut using star arguments::

    >>> add.s(2, 2)
    tasks.add(2, 2)
 
And there's that calling API again…
-----------------------------------

Signature instances also support the calling API, which means that they
have the ``delay`` and ``apply_async`` methods.

But there is a difference in that the signature may already have
an argument signature specified.  The ``add`` task takes two arguments,
so a signature specifying two arguments would make a complete signature::

    >>> s1 = add.s(2, 2)
    >>> res = s1.delay()
    >>> res.get()
    4
 
But you can also make incomplete signatures to create what we call
*partials*::

    # incomplete partial: add(?, 2)
    >>> s2 = add.s(2)

``s2`` is now a partial signature that needs another argument to be complete,
and this can be resolved when calling the signature::

    # resolves the partial: add(8, 2)
    >>> res = s2.delay(8)
    >>> res.get()
    10

Here you added the argument 8, which was prepended to the existing argument 2,
forming a complete signature of ``add(8, 2)``.

Keyword arguments can also be added later; these are then merged with any
existing keyword arguments, with new arguments taking precedence::

    >>> s3 = add.s(2, 2, debug=True)
    >>> s3.delay(debug=False)   # debug is now False.
 
As stated, signatures support the calling API, which means that:

- ``sig.apply_async(args=(), kwargs={}, **options)``

  Calls the signature with optional partial arguments and partial
  keyword arguments.  Also supports partial execution options.

- ``sig.delay(*args, **kwargs)``

  Star argument version of ``apply_async``.  Any arguments will be prepended
  to the arguments in the signature, and keyword arguments are merged with any
  existing keys.

So this all seems very useful, but what can you actually do with these?
To get to that we must introduce the canvas primitives…
 
The Primitives
--------------

.. topic:: \ 

    .. hlist::
        :columns: 2

        - :ref:`group <canvas-group>`
        - :ref:`chain <canvas-chain>`
        - :ref:`chord <canvas-chord>`
        - :ref:`map <canvas-map>`
        - :ref:`starmap <canvas-map>`
        - :ref:`chunks <canvas-chunks>`

These primitives are signature objects themselves, so they can be combined
in any number of ways to compose complex workflows.

.. note::

    These examples retrieve results, so to try them out you need
    to configure a result backend.  The example project
    above already does that (see the backend argument to :class:`~celery.Celery`).

Let's look at some examples:
 
Groups
~~~~~~

A :class:`~celery.group` calls a list of tasks in parallel,
and it returns a special result instance that lets you inspect the results
as a group, and retrieve the return values in order.

.. code-block:: python

    >>> from celery import group
    >>> from proj.tasks import add

    >>> group(add.s(i, i) for i in xrange(10))().get()
    [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

- Partial group

.. code-block:: python

    >>> g = group(add.s(i) for i in xrange(10))
    >>> g(10).get()
    [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
 
Chains
~~~~~~

Tasks can be linked together so that after one task returns the other
is called:

.. code-block:: python

    >>> from celery import chain
    >>> from proj.tasks import add, mul

    # (4 + 4) * 8
    >>> chain(add.s(4, 4) | mul.s(8))().get()
    64

or a partial chain:

.. code-block:: python

    # (? + 4) * 8
    >>> g = chain(add.s(4) | mul.s(8))
    >>> g(4).get()
    64

Chains can also be written like this:

.. code-block:: python

    >>> (add.s(4, 4) | mul.s(8))().get()
    64
 
Chords
~~~~~~

A chord is a group with a callback:

.. code-block:: python

    >>> from celery import chord
    >>> from proj.tasks import add, xsum

    >>> chord((add.s(i, i) for i in xrange(10)), xsum.s())().get()
    90

A group chained to another task will be automatically converted
to a chord:

.. code-block:: python

    >>> (group(add.s(i, i) for i in xrange(10)) | xsum.s())().get()
    90

Since these primitives are all of the signature type they
can be combined almost however you want, e.g.::

    >>> upload_document.s(file) | group(apply_filter.s() for filter in filters)

Be sure to read more about workflows in the :ref:`Canvas <guide-canvas>` user
guide.
 
Routing
=======

Celery supports all of the routing facilities provided by AMQP,
but it also supports simple routing where messages are sent to named queues.

The :setting:`CELERY_ROUTES` setting enables you to route tasks by name
and keep everything centralized in one location::

    app.conf.update(
        CELERY_ROUTES = {
            'proj.tasks.add': {'queue': 'hipri'},
        },
    )

You can also specify the queue at runtime
with the ``queue`` argument to ``apply_async``::

    >>> from proj.tasks import add
    >>> add.apply_async((2, 2), queue='hipri')

You can then make a worker consume from this queue by
specifying the :option:`-Q` option:

.. code-block:: bash

    $ celery -A proj worker -Q hipri

You may specify multiple queues by using a comma separated list.
For example, you can make the worker consume from both the default
queue and the ``hipri`` queue, where
the default queue is named ``celery`` for historical reasons:

.. code-block:: bash

    $ celery -A proj worker -Q hipri,celery

The order of the queues doesn't matter as the worker will
give equal weight to the queues.

To learn more about routing, including making use of the full
power of AMQP routing, see the :ref:`Routing Guide <guide-routing>`.
 
Remote Control
==============

If you're using RabbitMQ (AMQP), Redis or MongoDB as the broker then
you can control and inspect the worker at runtime.

For example you can see what tasks the worker is currently working on:

.. code-block:: bash

    $ celery -A proj inspect active

This is implemented by using broadcast messaging, so all remote
control commands are received by every worker in the cluster.

You can also specify one or more workers to act on the request
using the :option:`--destination` option, which is a comma separated
list of worker host names:

.. code-block:: bash

    $ celery -A proj inspect active --destination=celery@example.com

If a destination is not provided then every worker will act and reply
to the request.

The :program:`celery inspect` command contains commands that
do not change anything in the worker; they only reply with information
and statistics about what is going on inside the worker.
For a list of inspect commands you can execute:

.. code-block:: bash

    $ celery -A proj inspect --help

Then there is the :program:`celery control` command, which contains
commands that actually change things in the worker at runtime:

.. code-block:: bash

    $ celery -A proj control --help

For example you can force workers to enable event messages (used
for monitoring tasks and workers):

.. code-block:: bash

    $ celery -A proj control enable_events

When events are enabled you can then start the event dumper
to see what the workers are doing:

.. code-block:: bash

    $ celery -A proj events --dump

or you can start the curses interface:

.. code-block:: bash

    $ celery -A proj events

When you're finished monitoring you can disable events again:

.. code-block:: bash

    $ celery -A proj control disable_events

The :program:`celery status` command also uses remote control commands
and shows a list of online workers in the cluster:

.. code-block:: bash

    $ celery -A proj status

You can read more about the :program:`celery` command and monitoring
in the :ref:`Monitoring Guide <guide-monitoring>`.
 
Timezone
========

All times and dates, internally and in messages, use the UTC timezone.

When the worker receives a message, for example with a countdown set, it
converts that UTC time to local time.  If you wish to use
a different timezone than the system timezone then you must
configure that using the :setting:`CELERY_TIMEZONE` setting::

    app.conf.CELERY_TIMEZONE = 'Europe/London'
 
Optimization
============

The default configuration is not optimized for throughput;
it tries to walk the middle way between many short tasks and fewer long
tasks, a compromise between throughput and fair scheduling.

If you have strict fair scheduling requirements, or want to optimize
for throughput, then you should read the :ref:`Optimizing Guide
<guide-optimizing>`.

If you're using RabbitMQ then you should install the :mod:`librabbitmq`
module, an AMQP client implemented in C:

.. code-block:: bash

    $ pip install librabbitmq
 
What to do now?
===============

Now that you have read this document you should continue
to the :ref:`User Guide <guide>`.

There's also an :ref:`API reference <apiref>` if you are so inclined.
 
 