
.. _guide-security:

==========
 Security
==========

.. contents::
    :local:

Introduction
============

While Celery is written with security in mind, it should be treated as an
unsafe component.

Depending on your `Security Policy`_, there are
various steps you can take to make your Celery installation more secure.

.. _`Security Policy`: http://en.wikipedia.org/wiki/Security_policy

Areas of Concern
================

Broker
------

It is imperative that the broker is guarded from unwanted access, especially
if it is accessible to the public.
By default, workers trust that the data they get from the broker has not
been tampered with. See `Message Signing`_ for information on how to make
the broker connection more trustworthy.

The first line of defence should be to put a firewall in front of the broker,
allowing only white-listed machines to access it.

Keep in mind that both firewall misconfiguration and temporarily disabling
the firewall are common in the real world. A solid security policy includes
monitoring of firewall equipment to detect if it has been disabled, be it
accidentally or on purpose.

In other words, one should not blindly trust the firewall either.

If your broker supports fine-grained access control, like RabbitMQ,
this is something you should look at enabling. See for example
http://www.rabbitmq.com/access-control.html.

If supported by your broker backend, you can enable end-to-end SSL encryption
and authentication using :setting:`broker_use_ssl`.

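
As a minimal sketch, the setting might look like the following in a
configuration module. It assumes an AMQP transport that hands these options
to Python's :mod:`ssl` module, and the file paths are hypothetical
placeholders rather than locations Celery expects.

```python
import ssl

# Hypothetical certificate locations; substitute the paths used on your hosts.
broker_use_ssl = {
    'keyfile': '/etc/ssl/private/client.key',  # private key for this client
    'certfile': '/etc/ssl/certs/client.pem',   # certificate shown to the broker
    'ca_certs': '/etc/ssl/certs/ca.pem',       # CA used to verify the broker
    'cert_reqs': ssl.CERT_REQUIRED,            # refuse brokers without a valid cert
}
```

Requiring a certificate is what lets the client reject a broker it cannot
verify. Without that check the connection is encrypted but not authenticated.
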
Client
------

In Celery, "client" refers to anything that sends messages to the
broker, e.g. web-servers that apply tasks.

Having the broker properly secured doesn't matter if arbitrary messages
can be sent through a client.

*[Need more text here]*

Worker
------

The default permissions of tasks running inside a worker are the same as
the privileges of the worker itself. This applies to resources such as
memory, file-systems and devices.

An exception to this rule is when using the multiprocessing based task pool,
which is currently the default. In this case, the task will have access to
any memory copied as a result of the :func:`fork` call (does not apply
under MS Windows), and access to memory contents written
by parent tasks in the same worker child process.

Limiting access to memory contents can be done by launching every task
in a subprocess (:func:`fork` + :func:`execve`).

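
The idea can be illustrated with the standard :mod:`subprocess` module, which
on POSIX systems typically starts the child with a fork followed by an exec.
This is only a sketch of the isolation effect, not how Celery runs tasks.
``SECRET`` and ``run_isolated`` are hypothetical names used for illustration.

```python
import subprocess
import sys

# Stand-in for sensitive data that lives only in the parent's memory.
SECRET = "parent-only-value"


def run_isolated(source):
    """Run Python source in a fresh interpreter and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# The child starts from a new process image, so the parent's globals are gone.
output = run_isolated("print('SECRET' in globals())")
```

Here ``output`` is the string ``False``: the new interpreter never sees the
parent's variables. A process created by :func:`fork` alone would start with
a copy of them.
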
Limiting file-system and device access can be accomplished by using
`chroot`_, `jail`_, `sandboxing`_, virtual machines or other
mechanisms as enabled by the platform or additional software.

Note also that any task executed in the worker will have the
same network access as the machine on which it's running. If the worker
is located on an internal network it's recommended to add firewall rules for
outbound traffic.

.. _`chroot`: http://en.wikipedia.org/wiki/Chroot
.. _`jail`: http://en.wikipedia.org/wiki/FreeBSD_jail
.. _`sandboxing`:
    http://en.wikipedia.org/wiki/Sandbox_(computer_security)

Serializers
===========

The default `pickle` serializer is convenient because it supports
arbitrary Python objects, whereas other serializers only
work with a restricted set of types.

But for the same reasons the `pickle` serializer is inherently insecure [*]_,
and should be avoided whenever clients are untrusted or
unauthenticated.

.. [*] http://nadiana.com/python-pickle-insecure

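
The risk can be shown with the standard library alone. In this sketch a
hypothetical ``Payload`` class uses ``__reduce__`` to tell :mod:`pickle` which
callable to invoke at load time. The expression is deliberately harmless, but
the receiving side evaluates whatever the sender chose.

```python
import json
import pickle


class Payload:
    """An object that tells pickle how it should be rebuilt on load."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and invokes it when the
        # bytes are loaded. The expression here is harmless; a hostile
        # sender could name any importable callable instead.
        return (eval, ("1 + 1",))


wire_bytes = pickle.dumps(Payload())

# Loading runs eval("1 + 1") on the receiving side; no Payload is rebuilt.
unpickled = pickle.loads(wire_bytes)

# JSON decoding, by contrast, can only yield basic types such as dict and list.
decoded = json.loads('{"task": "add", "args": [2, 2]}')
```

After loading, ``unpickled`` is the integer ``2`` rather than a ``Payload``
instance, so the sender decided what code ran in the receiver.
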
You can disable untrusted content by specifying
a white-list of accepted content-types in the :setting:`accept_content`
setting:

.. versionadded:: 3.0.18

.. note::

    This setting was first supported in version 3.0.18. If you're
    running an earlier version it will simply be ignored, so make
    sure you're running a version that supports it.

.. code-block:: python

    accept_content = ['json']

This accepts a list of serializer names and content-types, so you could
also specify the content type for json:

.. code-block:: python

    accept_content = ['application/json']

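
A white-list only helps if the senders actually produce an accepted format.
As a sketch, assuming both clients and workers load the same configuration
module, the serializer settings could be kept consistent with the white-list
like this:

```python
# Accept only JSON-encoded messages on the consuming side.
accept_content = ['json']

# Make sure messages produced with this configuration use a white-listed
# serializer, otherwise the worker would discard its own clients' tasks.
task_serializer = 'json'
result_serializer = 'json'
```

Whether results also need to be restricted depends on your deployment. The
point is that the sending and receiving settings have to agree.
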
Celery also comes with a special `auth` serializer that validates
communication between Celery clients and workers, making sure
that messages originate from trusted sources.
Using `Public-key cryptography`_ the `auth` serializer can verify the
authenticity of senders. To enable this, read :ref:`message-signing`
for more information.

.. _`pickle`: http://docs.python.org/library/pickle.html
.. _`Public-key cryptography`:
    http://en.wikipedia.org/wiki/Public-key_cryptography

.. _message-signing:

Message Signing
===============

Celery can use the `pyOpenSSL`_ library to sign messages using
`Public-key cryptography`_, where
messages sent by clients are signed using a private key
and then later verified by the worker using a public certificate.

Ideally, certificates should be signed by an official
`Certificate Authority`_, but they can also be self-signed.

To enable this you should configure the :setting:`task_serializer`
setting to use the `auth` serializer.
Also required is configuring the
paths used to locate private keys and certificates on the file-system:
the :setting:`security_key`,
:setting:`security_certificate` and :setting:`security_cert_store`
settings respectively.

With these configured it is also necessary to call the
:func:`celery.setup_security` function. Note that this will also
disable all insecure serializers so that the worker won't accept
messages with untrusted content types.

This is an example configuration using the `auth` serializer,
with the private key and certificate files located in `/etc/ssl`.

.. code-block:: python

    from celery import Celery

    app = Celery()
    app.conf.update(
        task_serializer='auth',  # sign outgoing task messages
        security_key='/etc/ssl/private/worker.key',
        security_certificate='/etc/ssl/certs/worker.pem',
        security_cert_store='/etc/ssl/certs/*.pem',
    )
    app.setup_security()

.. note::

    While relative paths are not disallowed, using absolute paths
    is recommended for these files.

    Also note that the `auth` serializer won't encrypt the contents of
    a message, so if needed this will have to be enabled separately.

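
The difference between signing and encrypting can be sketched with the
standard library. This example uses an HMAC with a shared secret as a
stand-in for the certificate-based signatures described above, so it is not
the `auth` serializer's actual scheme. ``SECRET``, ``sign`` and ``verify``
are hypothetical names.

```python
import hashlib
import hmac
import json

SECRET = b"demo-shared-secret"  # hypothetical key, for illustration only


def sign(body):
    """Wrap the body in an envelope carrying a signature over its bytes."""
    signature = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
    return {"body": body.decode("utf-8"), "signature": signature}


def verify(envelope):
    """Return True only if the signature matches the envelope's body."""
    expected = hmac.new(
        SECRET, envelope["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, envelope["signature"])


envelope = sign(json.dumps({"task": "add", "args": [2, 2]}).encode("utf-8"))

# Changing the arguments in transit invalidates the signature...
tampered = dict(envelope, body=envelope["body"].replace("2, 2", "2, 3"))

# ...but the signed body itself is still readable by anyone who sees it.
readable = "add" in envelope["body"]
```

``verify(envelope)`` succeeds and ``verify(tampered)`` fails, yet ``readable``
is true. Signing protects integrity and origin, not confidentiality.
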
.. _`pyOpenSSL`: http://pypi.python.org/pypi/pyOpenSSL
.. _`X.509`: http://en.wikipedia.org/wiki/X.509
.. _`Certificate Authority`:
    http://en.wikipedia.org/wiki/Certificate_authority

Intrusion Detection
===================

The most important part when defending your systems against
intruders is being able to detect if the system has been compromised.

Logs
----

Logs are usually the first place to look for evidence
of security breaches, but they are useless if they can be tampered with.

A good solution is to set up centralized logging with a dedicated logging
server. Access to it should be restricted.
In addition to having all of the logs in a single place, if configured
correctly, it can make it harder for intruders to tamper with your logs.

This should be fairly easy to set up using syslog (see also `syslog-ng`_ and
`rsyslog`_). Celery uses the :mod:`logging` library, and already has
support for using syslog.

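
As a rough sketch of the :mod:`logging` side, the standard
``SysLogHandler`` can forward records over UDP. The loopback address and the
logger name below are placeholders. In a real deployment the address would be
that of the dedicated logging server.

```python
import logging
from logging.handlers import SysLogHandler

# The loopback address keeps the sketch runnable; in practice this would be
# the address of the dedicated logging server (deployment-specific).
handler = SysLogHandler(address=('127.0.0.1', 514))  # UDP is the default
handler.setFormatter(logging.Formatter('%(name)s: %(levelname)s %(message)s'))

logger = logging.getLogger('myapp.tasks')  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

# The record leaves the machine as a datagram; nothing is written locally.
logger.info('worker started')
```

Because UDP is connectionless, the sender gets no confirmation of delivery.
That is the trade-off for a log path that needs no reply channel.
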
A tip for the paranoid is to send logs using UDP and cut the
transmit part of the logging server's network cable :-)

.. _`syslog-ng`: http://en.wikipedia.org/wiki/Syslog-ng
.. _`rsyslog`: http://www.rsyslog.com/

Tripwire
--------

`Tripwire`_ is a (now commercial) data integrity tool, with several
open source implementations, used to keep
cryptographic hashes of files in the file-system, so that administrators
can be alerted when they change. This way, when the damage is done and your
system has been compromised, you can tell exactly what files intruders
have changed (password files, logs, backdoors, rootkits and so on).
Often this is the only way you will be able to detect an intrusion.

Some open source implementations include:

* `OSSEC`_
* `Samhain`_
* `Open Source Tripwire`_
* `AIDE`_

Also, the `ZFS`_ file-system comes with built-in integrity checks
that can be used.

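
The underlying technique of hashing files and comparing against a baseline
can be sketched in a few lines. This is only an illustration of the idea,
not a replacement for the tools listed above. ``snapshot`` and
``changed_files`` are hypothetical helper names, and the demonstration runs
against a temporary directory.

```python
import hashlib
import os
import tempfile


def snapshot(root):
    """Map each file path under root to the SHA-256 digest of its contents."""
    digests = {}
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            with open(path, "rb") as handle:
                digests[path] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def changed_files(baseline, current):
    """Return sorted paths added, removed or modified since the baseline."""
    return sorted(
        path
        for path in set(baseline) | set(current)
        if baseline.get(path) != current.get(path)
    )


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "passwd")
    with open(target, "w") as handle:
        handle.write("root:x:0:0\n")
    baseline = snapshot(root)

    # Simulate an intruder appending an account after the baseline was taken.
    with open(target, "a") as handle:
        handle.write("intruder:x:0:0\n")
    detected = changed_files(baseline, snapshot(root)) == [target]
```

For the comparison to mean anything, the baseline has to be stored where an
intruder cannot rewrite it, for example on a separate host or on read-only
media.
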
.. _`Tripwire`: http://tripwire.com/
.. _`OSSEC`: http://www.ossec.net/
.. _`Samhain`: http://la-samhna.de/samhain/index.html
.. _`AIDE`: http://aide.sourceforge.net/
.. _`Open Source Tripwire`: http://sourceforge.net/projects/tripwire/
.. _`ZFS`: http://en.wikipedia.org/wiki/ZFS