
Userguide: Added some sections on Granularity, Data-locality and state

Ask Solem, 15 years ago
commit 7bd2d1f623
1 changed file with 86 additions and 3 deletions

docs/userguide/tasks.rst  +86 -3

@@ -24,12 +24,11 @@ a task, ``celery.decorators.task``:
     from celery.decorators import task
     from django.contrib.auth import User
 
-    @task()
+    @task
     def create_user(username, password):
         User.objects.create(username=username, password=password)
 
-Note the parens after ``@task()`` the task decorator takes any execution
-options the ``Task`` class does:
+The task decorator takes the same execution options the ``Task`` class does:
 
 .. code-block:: python
 
@@ -296,5 +295,89 @@ This means that your workers must optimally be updated with the same software
 as the client, this is a drawback, but the alternative is a technical
 challenge that has yet to be solved.
 
+Performance and Strategies
+==========================
+
+Granularity
+-----------
+
+Task granularity is how finely your work is split up into tasks, and thus
+the degree of parallelization you can achieve. It's better to have many small
+tasks than just a few long-running ones.
+
+With smaller tasks you can process more tasks in parallel, and the tasks
+won't run long enough to block the worker from processing other waiting tasks.
+
+But there's a limit: sending messages takes processing power too. If
+your tasks are so short that the overhead of passing them around outweighs
+just executing them inline, you should reconsider your strategy. There is no
+universal answer here.
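+
+As a rough sketch of this idea (the ``import_feed`` task, the
+``import_all_feeds`` helper and the ``Feed`` model are hypothetical names used
+only for illustration), splitting one long-running import job into one task
+per feed could look like this:
+
+.. code-block:: python
+
+    from celery.decorators import task
+
+    @task
+    def import_feed(feed_url):
+        # Each task handles a single feed, so it finishes quickly and doesn't
+        # block the worker from picking up other waiting tasks.
+        Feed.objects.import_feed(feed_url)
+
+    def import_all_feeds(feed_urls):
+        # Send one message per feed instead of running one long task.
+        for feed_url in feed_urls:
+            import_feed.delay(feed_url)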
+
+Data locality
+-------------
+
+The worker processing the task should ideally be as close to the data as
+possible. The best case is having a copy in memory; the worst is a full
+transfer from another continent.
+
+If the data is far away, you could try to run another worker at that location,
+or if that's not possible, cache often-used data, or preload data you know
+is going to be used.
+
+The easiest way to share data between workers is to use a distributed caching
+system, like `memcached`_.
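+
+As a minimal sketch of the caching idea, here is a task using Django's cache
+framework (which can be configured to use memcached); the ``ReportData`` model
+and ``generate_report`` function are made-up names for illustration:
+
+.. code-block:: python
+
+    from celery.decorators import task
+    from django.core.cache import cache
+
+    @task
+    def render_report(report_id):
+        # Try the cache first, and only hit the database on a miss.
+        data = cache.get("report-data-%s" % report_id)
+        if data is None:
+            data = ReportData.objects.get(pk=report_id)
+            cache.set("report-data-%s" % report_id, data, 60 * 15)
+        return generate_report(data)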
+
+.. _`memcached`: http://memcached.org/
+
+.. seealso::
+
+    The paper `Distributed Computing Economics`_ by Jim Gray is a good
+    introduction to the topic of data locality.
+
+.. _`Distributed Computing Economics`:
+    http://research.microsoft.com/pubs/70001/tr-2003-24.pdf
+
+State
+-----
+
+Since celery is a distributed system, you can't know in which process, or
+even on what machine, the task will run. You can't even know if the task will
+run in a timely manner, so be wary of the state you pass on to tasks.
+
+One gotcha is Django model objects: they shouldn't be passed as arguments
+to tasks. It's almost always better to re-fetch the object from the database
+in the task instead, as passing the object itself invites race conditions.
+
+Imagine the following scenario where you have an article, and a task
+that automatically expands some abbreviations in it.
+
+.. code-block:: python
+
+    class Article(models.Model):
+        title = models.CharField(max_length=200)
+        body = models.TextField()
+
+    @task
+    def expand_abbreviations(article):
+        article.body = article.body.replace("MyCorp", "My Corporation")
+        article.save()
+
+First, an author creates an article and saves it. The author then clicks on a
+button that initiates the abbreviation task:
+
+    >>> article = Article.objects.get(id=102)
+    >>> expand_abbreviations.delay(article)
+
+Now, the queue is very busy, so the task won't be run for another 2 minutes.
+In the meantime another author makes some changes to the article. When the
+task is finally run, the body of the article is reverted to the old version,
+because the task had the old body in its argument.
+
+Fixing the race condition is easy: just pass the article id instead, and
+re-fetch the article in the task body:
+
+.. code-block:: python
+
+    @task
+    def expand_abbreviations(article_id):
+        article = Article.objects.get(id=article_id)
+        article.body = article.body.replace("MyCorp", "My Corporation")
+        article.save()
 
+    >>> expand_abbreviations.delay(article_id)
 
+There might even be performance benefits to this approach, as sending large
+messages may be expensive.