.. _guide-optimizing:

============
Optimizing
============

Introduction
============

The default configuration makes a lot of compromises. It's not optimal for
any single case, but works well enough for most situations.

There are optimizations that can be applied based on specific use cases.

Optimizations can apply to different properties of the running environment,
be it the time tasks take to execute, the amount of memory used, or
responsiveness at times of high load.

Ensuring Operations
===================

In the book `Programming Pearls`_, Jon Bentley presents the concept of
back-of-the-envelope calculations by asking the question:

    ❝ How much water flows out of the Mississippi River in a day? ❞

The point of this exercise [*]_ is to show that there is a limit
to how much data a system can process in a timely manner.
Back-of-the-envelope calculations can be used as a means to plan for this
ahead of time.

In Celery, if a task takes 10 minutes to complete
and 10 new tasks come in every minute, the queue will never
be empty unless about 100 tasks can run concurrently. This is why it's very
important that you monitor queue lengths!

A way to do this is by :ref:`using Munin <monitoring-munin>`.
You should set up alerts that notify you as soon as any queue has
reached an unacceptable size. This way you can take appropriate action,
like adding new worker nodes or revoking unnecessary tasks.

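
The arithmetic behind the example above can be sketched as a
back-of-the-envelope calculation. The helper names below are hypothetical
and not part of Celery, and the sketch assumes a steady arrival rate and a
fixed task duration:

.. code-block:: python

    def required_worker_processes(arrivals_per_minute, task_minutes):
        """Processes needed so the queue does not grow without bound.

        Little's law: tasks in progress = arrival rate * time per task.
        """
        return arrivals_per_minute * task_minutes


    def backlog_growth_per_minute(arrivals_per_minute, task_minutes, processes):
        """Net number of tasks added to the queue each minute (never below zero)."""
        completed_per_minute = processes / float(task_minutes)
        return max(0.0, arrivals_per_minute - completed_per_minute)


    # 10 new tasks per minute, 10 minutes per task.
    needed = required_worker_processes(10, 10)

    # With only 8 processes, the backlog keeps growing every minute.
    growth = backlog_growth_per_minute(10, 10, processes=8)

Comparing ``needed`` with the concurrency you actually run gives a rough
idea of whether a queue can ever drain, before any monitoring is in place.
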
.. [*] The chapter is available to read for free here:
       `The back of the envelope`_. The book is a classic text. Highly
       recommended.

.. _`Programming Pearls`: http://www.cs.bell-labs.com/cm/cs/pearls/

.. _`The back of the envelope`:
    http://books.google.com/books?id=kse_7qbWbjsC&pg=PA67

.. _optimizing-worker-settings:

Worker Settings
===============

.. _optimizing-connection-pools:

Broker Connection Pools
-----------------------

You should enable the :setting:`BROKER_POOL_LIMIT` setting,
as this will drastically improve overall performance.

This setting will be enabled by default in version 3.0.

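
A minimal configuration sketch follows. The value 10 is only an
illustrative assumption, not a recommendation from this guide, and may need
tuning for your deployment:

.. code-block:: python

    # Hypothetical value: allow up to 10 pooled broker connections.
    BROKER_POOL_LIMIT = 10
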
.. _optimizing-prefetch-limit:

Prefetch Limits
---------------

*Prefetch* is a term inherited from AMQP that is often misunderstood
by users.

The prefetch limit is a **limit** for the number of tasks (messages) a worker
can reserve for itself. If it is zero, the worker will keep
consuming messages, not respecting that there may be other
available worker nodes that may be able to process them sooner [*]_,
or that the messages may not even fit in memory.

The workers' default prefetch count is the
:setting:`CELERYD_PREFETCH_MULTIPLIER` setting multiplied by the number
of child worker processes [*]_.

If you have many tasks with a long duration, you want
the multiplier value to be 1, which means the worker will only reserve one
task per worker process at a time.

However, if you have many short-running tasks, and throughput or round-trip
latency is important to you, this number should be large. The worker is
able to process more tasks per second if the messages have already been
prefetched and are available in memory. You may have to experiment to find
the value that works best for you. Values between 50 and 150, say 64 or 128,
might make sense in these circumstances.

If you have a combination of long- and short-running tasks, the best option
is to use two worker nodes that are configured separately, and route
the tasks according to their run time (see :ref:`guide-routing`).

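
As a rough sketch of how the multiplier translates into a prefetch count
for two such worker nodes: the helper and the concurrency values below are
hypothetical, and only the multiplication rule comes from the text above.

.. code-block:: python

    def prefetch_count(prefetch_multiplier, concurrency):
        """Maximum number of messages a worker node reserves at once."""
        return prefetch_multiplier * concurrency


    # Node for long-running tasks: one reserved message per child process.
    long_running = prefetch_count(prefetch_multiplier=1, concurrency=4)

    # Node for short-running tasks: a deep buffer keeps the processes busy.
    short_running = prefetch_count(prefetch_multiplier=64, concurrency=4)

With four child processes each, the first node reserves at most 4 messages
while the second may hold up to 256, which is why mixing both kinds of
tasks on a single node tends to serve neither well.
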
.. [*] RabbitMQ and other brokers deliver messages round-robin,
       so this doesn't apply to an active system. If there is no prefetch
       limit and you restart the cluster, there will be timing delays between
       nodes starting. If there are 3 offline nodes and one active node,
       all messages will be delivered to the active node.

.. [*] This is the concurrency setting: :setting:`CELERYD_CONCURRENCY` or the
       :option:`-c` option to :program:`celeryd`.

Reserve one task at a time
--------------------------

When using early acknowledgement (the default), a prefetch multiplier of 1
means the worker will reserve at most one extra task for every active
worker process.

When users ask if it's possible to disable "prefetching of tasks", often
what they really want is to have a worker only reserve as many tasks as there
are child processes.

But this is not possible without enabling late acknowledgements.
With late acknowledgements, a task that has been started will be
retried if the worker crashes mid-execution, so the task must be `idempotent`_
(see also notes at :ref:`faq-acks_late-vs-retry`).

.. _`idempotent`: http://en.wikipedia.org/wiki/Idempotent

You can enable this behavior by using the following configuration options:

.. code-block:: python

    CELERY_ACKS_LATE = True
    CELERYD_PREFETCH_MULTIPLIER = 1

.. _optimizing-rate-limits:

Rate Limits
-----------

The system responsible for enforcing rate limits introduces some overhead,
so if you're not using rate limits it may be a good idea to
disable them completely. This will disable one thread, and the worker won't
spend as many CPU cycles when the queue is inactive.

Set the :setting:`CELERY_DISABLE_RATE_LIMITS` setting to disable
the rate limit subsystem:

.. code-block:: python

    CELERY_DISABLE_RATE_LIMITS = True