<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Tutorial: Creating a click counter using carrot and celery &mdash; Celery v0.4.2 (stable) documentation</title>
<link rel="stylesheet" href="../static/nature.css" type="text/css" />
<link rel="stylesheet" href="../static/pygments.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../',
VERSION: '0.4.2 (stable)',
COLLAPSE_MODINDEX: false,
FILE_SUFFIX: '.html',
HAS_SOURCE: true
};
</script>
<script type="text/javascript" src="../static/jquery.js"></script>
<script type="text/javascript" src="../static/doctools.js"></script>
<link rel="top" title="Celery v0.4.2 (stable) documentation" href="../index.html" />
</head>
<body>
<div class="related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="../modindex.html" title="Global Module Index"
accesskey="M">modules</a> |</li>
<li><a href="../index.html">Celery v0.4.2 (stable) documentation</a> &raquo;</li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body">
<div class="section" id="tutorial-creating-a-click-counter-using-carrot-and-celery">
<h1>Tutorial: Creating a click counter using carrot and celery<a class="headerlink" href="#tutorial-creating-a-click-counter-using-carrot-and-celery" title="Permalink to this headline">¶</a></h1>
<div class="section" id="introduction">
<h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
<p>A click counter should be easy, right? Just a simple view that increments
a click in the DB and forwards you to the real destination.</p>
<p>This would work well for most sites, but when traffic starts to increase,
you are likely to bump into problems. One database write for every click is
not good if you have millions of clicks a day.</p>
<p>So what can you do? In this tutorial we will send the individual clicks as
messages using <tt class="docutils literal"><span class="pre">carrot</span></tt>, and then process them later with a <tt class="docutils literal"><span class="pre">celery</span></tt>
periodic task.</p>
<p>Celery and carrot are excellent in tandem, and while this might not be
the perfect example, you&#8217;ll at least see one example of how they can be used
to solve a task.</p>
</div>
<div class="section" id="the-model">
<h2>The model<a class="headerlink" href="#the-model" title="Permalink to this headline">¶</a></h2>
<p>The model is simple: <tt class="docutils literal"><span class="pre">Click</span></tt> stores the URL in a unique field, along with the number of
clicks for that URL. Its manager, <tt class="docutils literal"><span class="pre">ClickManager</span></tt>, implements the
<tt class="docutils literal"><span class="pre">increment_clicks</span></tt> method, which takes a URL and the amount to add to
its count.</p>
<p><em>clickmuncher/models.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">models</span>
<span class="kn">from</span> <span class="nn">django.utils.translation</span> <span class="kn">import</span> <span class="n">ugettext_lazy</span> <span class="k">as</span> <span class="n">_</span>


<span class="k">class</span> <span class="nc">ClickManager</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Manager</span><span class="p">):</span>

    <span class="k">def</span> <span class="nf">increment_clicks</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">for_url</span><span class="p">,</span> <span class="n">increment_by</span><span class="o">=</span><span class="mf">1</span><span class="p">):</span>
        <span class="sd">&quot;&quot;&quot;Increment the click count for a URL.</span>
<span class="sd">            &gt;&gt;&gt; Click.objects.increment_clicks(&quot;http://google.com&quot;, 10)</span>
<span class="sd">        &quot;&quot;&quot;</span>
        <span class="n">click</span><span class="p">,</span> <span class="n">created</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_or_create</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">for_url</span><span class="p">,</span>
                                <span class="n">defaults</span><span class="o">=</span><span class="p">{</span><span class="s">&quot;click_count&quot;</span><span class="p">:</span> <span class="n">increment_by</span><span class="p">})</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">created</span><span class="p">:</span>
            <span class="n">click</span><span class="o">.</span><span class="n">click_count</span> <span class="o">+=</span> <span class="n">increment_by</span>
            <span class="n">click</span><span class="o">.</span><span class="n">save</span><span class="p">()</span>

        <span class="k">return</span> <span class="n">click</span><span class="o">.</span><span class="n">click_count</span>


<span class="k">class</span> <span class="nc">Click</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">URLField</span><span class="p">(</span><span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL&quot;</span><span class="p">),</span> <span class="n">verify_exists</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">unique</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">click_count</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">PositiveIntegerField</span><span class="p">(</span><span class="n">_</span><span class="p">(</span><span class="s">u&quot;click_count&quot;</span><span class="p">),</span>
                                              <span class="n">default</span><span class="o">=</span><span class="mf">0</span><span class="p">)</span>
    <span class="n">objects</span> <span class="o">=</span> <span class="n">ClickManager</span><span class="p">()</span>

    <span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
        <span class="n">verbose_name</span> <span class="o">=</span> <span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL clicks&quot;</span><span class="p">)</span>
        <span class="n">verbose_name_plural</span> <span class="o">=</span> <span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL clicks&quot;</span><span class="p">)</span>
</pre></div>
</div>
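<p>To see the create-or-increment behaviour of <tt class="docutils literal"><span class="pre">increment_clicks</span></tt> without a database, here is a minimal sketch that mimics it with a plain dictionary. The <tt class="docutils literal"><span class="pre">click_table</span></tt> dictionary is a hypothetical stand-in for the <tt class="docutils literal"><span class="pre">Click</span></tt> table and is not part of the tutorial&#8217;s code; the sketch is written for Python 3.</p>
<div class="highlight-python"><pre>
```python
# Hypothetical in-memory stand-in for the Click table: maps URL to click count.
click_table = {}


def increment_clicks(for_url, increment_by=1):
    """Mimic ClickManager.increment_clicks without a database."""
    if for_url not in click_table:
        # "Created" branch: a new row starts at increment_by.
        click_table[for_url] = increment_by
    else:
        # "Existing" branch: add to the stored count.
        click_table[for_url] += increment_by
    return click_table[for_url]


first = increment_clicks("http://google.com", 10)
second = increment_clicks("http://google.com")
```
</pre></div>
<p>The first call takes the &#8220;created&#8221; path and stores the increment as the initial count; the second call adds the default increment of one to the existing count.</p>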
</div>
<div class="section" id="using-carrot-to-send-clicks-as-messages">
<h2>Using carrot to send clicks as messages<a class="headerlink" href="#using-carrot-to-send-clicks-as-messages" title="Permalink to this headline">¶</a></h2>
<p>The model is normal Django stuff, nothing new there. But now we get on to
the messaging. I have made it a habit to put a project&#8217;s messaging-related
code in its own <tt class="docutils literal"><span class="pre">messaging.py</span></tt> module, and I will do so
here as well; maybe you will want to adopt the practice. This module has two
functions:</p>
<ul>
<li><p class="first"><tt class="docutils literal"><span class="pre">send_increment_clicks</span></tt></p>
<p>This function sends a simple message to the broker. The message body
contains only the URL to increment, as plain text, so the exchange and
routing key play a role here. We use an exchange called <tt class="docutils literal"><span class="pre">clicks</span></tt>, with a
routing key of <tt class="docutils literal"><span class="pre">increment_click</span></tt>, so any consumer binding a queue to
this exchange using this routing key will receive these messages.</p>
</li>
<li><p class="first"><tt class="docutils literal"><span class="pre">process_clicks</span></tt></p>
<p>This function processes all currently gathered clicks sent using
<tt class="docutils literal"><span class="pre">send_increment_clicks</span></tt>. Instead of issuing one database query for every
click, it processes all of the messages first, calculates the new click count
and issues one update per URL. A message that has been received will not be
deleted from the broker until it has been acknowledged by the receiver, so
if the receiver dies in the middle of processing the message, it will be
re-sent at a later point in time. This guarantees delivery, and we respect
this feature here by not acknowledging a message until its clicks have
actually been written to disk.</p>
<p><strong>Note</strong>: This could probably be optimized further with
some hand-written SQL, but it will do for now. Let&#8217;s say it&#8217;s an exercise
left for the picky reader, albeit a discouraged one if you can survive
without doing it.</p>
</li>
</ul>
<p>On to the code...</p>
<p><em>clickmuncher/messaging.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">carrot.connection</span> <span class="kn">import</span> <span class="n">DjangoAMQPConnection</span>
<span class="kn">from</span> <span class="nn">carrot.messaging</span> <span class="kn">import</span> <span class="n">Publisher</span><span class="p">,</span> <span class="n">Consumer</span>
<span class="kn">from</span> <span class="nn">clickmuncher.models</span> <span class="kn">import</span> <span class="n">Click</span>


<span class="k">def</span> <span class="nf">send_increment_clicks</span><span class="p">(</span><span class="n">for_url</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Send a message for incrementing the click count for a URL.&quot;&quot;&quot;</span>
    <span class="n">connection</span> <span class="o">=</span> <span class="n">DjangoAMQPConnection</span><span class="p">()</span>
    <span class="n">publisher</span> <span class="o">=</span> <span class="n">Publisher</span><span class="p">(</span><span class="n">connection</span><span class="o">=</span><span class="n">connection</span><span class="p">,</span>
                          <span class="n">exchange</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                          <span class="n">routing_key</span><span class="o">=</span><span class="s">&quot;increment_click&quot;</span><span class="p">,</span>
                          <span class="n">exchange_type</span><span class="o">=</span><span class="s">&quot;direct&quot;</span><span class="p">)</span>

    <span class="n">publisher</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">for_url</span><span class="p">)</span>

    <span class="n">publisher</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
    <span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>


<span class="k">def</span> <span class="nf">process_clicks</span><span class="p">():</span>
    <span class="sd">&quot;&quot;&quot;Process all currently gathered clicks by saving them to the</span>
<span class="sd">    database.&quot;&quot;&quot;</span>
    <span class="n">connection</span> <span class="o">=</span> <span class="n">DjangoAMQPConnection</span><span class="p">()</span>
    <span class="n">consumer</span> <span class="o">=</span> <span class="n">Consumer</span><span class="p">(</span><span class="n">connection</span><span class="o">=</span><span class="n">connection</span><span class="p">,</span>
                        <span class="n">queue</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                        <span class="n">exchange</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                        <span class="n">routing_key</span><span class="o">=</span><span class="s">&quot;increment_click&quot;</span><span class="p">,</span>
                        <span class="n">exchange_type</span><span class="o">=</span><span class="s">&quot;direct&quot;</span><span class="p">)</span>

    <span class="c"># First process the messages: save the number of clicks</span>
    <span class="c"># for every URL.</span>
    <span class="n">clicks_for_url</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="n">messages_for_url</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">message</span> <span class="ow">in</span> <span class="n">consumer</span><span class="o">.</span><span class="n">iterqueue</span><span class="p">():</span>
        <span class="n">url</span> <span class="o">=</span> <span class="n">message</span><span class="o">.</span><span class="n">body</span>
        <span class="n">clicks_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span> <span class="o">=</span> <span class="n">clicks_for_url</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="mf">0</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1</span>
        <span class="c"># We also need to keep the message objects so we can ack the</span>
        <span class="c"># messages as processed when we are finished with them.</span>
        <span class="k">if</span> <span class="n">url</span> <span class="ow">in</span> <span class="n">messages_for_url</span><span class="p">:</span>
            <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="n">message</span><span class="p">]</span>

    <span class="c"># Then increment the clicks in the database so we only need</span>
    <span class="c"># one UPDATE/INSERT for each URL.</span>
    <span class="k">for</span> <span class="n">url</span><span class="p">,</span> <span class="n">click_count</span> <span class="ow">in</span> <span class="n">clicks_for_url</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">Click</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">increment_clicks</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">click_count</span><span class="p">)</span>
        <span class="c"># Now that the clicks have been registered for this URL we can</span>
        <span class="c"># acknowledge the messages.</span>
        <span class="k">for</span> <span class="n">message</span> <span class="ow">in</span> <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]:</span>
            <span class="n">message</span><span class="o">.</span><span class="n">ack</span><span class="p">()</span>

    <span class="n">consumer</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
    <span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
</div>
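<p>The batching and acknowledge-after-write pattern in <tt class="docutils literal"><span class="pre">process_clicks</span></tt> can be tried without a broker. The following is only a sketch for Python 3: <tt class="docutils literal"><span class="pre">FakeMessage</span></tt>, <tt class="docutils literal"><span class="pre">process_batch</span></tt> and the <tt class="docutils literal"><span class="pre">write_count</span></tt> callback are hypothetical names made up for illustration and are not part of carrot.</p>
<div class="highlight-python"><pre>
```python
from collections import defaultdict


class FakeMessage:
    """Hypothetical stand-in for a broker message with a body and ack()."""

    def __init__(self, body):
        self.body = body
        self.acked = False

    def ack(self):
        self.acked = True


def process_batch(messages, write_count):
    """Group messages by URL, write once per URL, then acknowledge."""
    messages_for_url = defaultdict(list)
    for message in messages:
        messages_for_url[message.body].append(message)

    for url, url_messages in messages_for_url.items():
        # One write per URL; if this raises, the messages stay unacked.
        write_count(url, len(url_messages))
        for message in url_messages:
            message.ack()


written = {}
batch = [
    FakeMessage("http://a.example"),
    FakeMessage("http://b.example"),
    FakeMessage("http://a.example"),
]
process_batch(batch, lambda url, count: written.update({url: count}))
```
</pre></div>
<p>Three messages result in only two writes, and each message is acknowledged only after the write for its URL has returned.</p>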
</div>
<div class="section" id="view-and-urls">
<h2>View and URLs<a class="headerlink" href="#view-and-urls" title="Permalink to this headline">¶</a></h2>
<p>This is also simple stuff, so the code should need little explanation.
The interface is as follows: if you have a link to <a class="reference external" href="http://google.com">http://google.com</a> whose
clicks you want to count, you replace the URL with:</p>
<blockquote>
<a class="reference external" href="http://mysite/clickmuncher/count/?u=http://google.com">http://mysite/clickmuncher/count/?u=http://google.com</a></blockquote>
<p>and the <tt class="docutils literal"><span class="pre">count</span></tt> view will send off an increment message and forward you to
that site.</p>
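<p>When the target URL has a query string of its own, it is safer to percent-encode it before placing it in the <tt class="docutils literal"><span class="pre">u</span></tt> parameter, otherwise an <tt class="docutils literal"><span class="pre">&amp;</span></tt> in the target would be read as a separator. A minimal sketch using the Python 3 standard library; the <tt class="docutils literal"><span class="pre">counted_link</span></tt> helper is a hypothetical name, and the base URL is the placeholder used above:</p>
<div class="highlight-python"><pre>
```python
from urllib.parse import parse_qs, urlencode, urlsplit


def counted_link(target_url, counter_base="http://mysite/clickmuncher/count/"):
    """Return the click-counting URL for target_url (hypothetical helper)."""
    # urlencode percent-encodes the target so it survives as one parameter.
    return counter_base + "?" + urlencode({"u": target_url})


link = counted_link("http://google.com/search?q=celery")
# Decoding the query string gives back the original target.
recovered = parse_qs(urlsplit(link).query)["u"][0]
```
</pre></div>
<p>Django decodes the query string in the same way, so the value read from the <tt class="docutils literal"><span class="pre">u</span></tt> parameter in the view is the original target URL.</p>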
<p><em>clickmuncher/views.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.http</span> <span class="kn">import</span> <span class="n">HttpResponseRedirect</span>
<span class="kn">from</span> <span class="nn">clickmuncher.messaging</span> <span class="kn">import</span> <span class="n">send_increment_clicks</span>


<span class="k">def</span> <span class="nf">count</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="n">request</span><span class="o">.</span><span class="n">GET</span><span class="p">[</span><span class="s">&quot;u&quot;</span><span class="p">]</span>
    <span class="n">send_increment_clicks</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">HttpResponseRedirect</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</pre></div>
</div>
<p><em>clickmuncher/urls.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.conf.urls.defaults</span> <span class="kn">import</span> <span class="n">patterns</span><span class="p">,</span> <span class="n">url</span>
<span class="kn">from</span> <span class="nn">clickmuncher</span> <span class="kn">import</span> <span class="n">views</span>

<span class="n">urlpatterns</span> <span class="o">=</span> <span class="n">patterns</span><span class="p">(</span><span class="s">&quot;&quot;</span><span class="p">,</span>
    <span class="n">url</span><span class="p">(</span><span class="s">r&#39;^$&#39;</span><span class="p">,</span> <span class="n">views</span><span class="o">.</span><span class="n">count</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">&quot;clickmuncher-count&quot;</span><span class="p">),</span>
<span class="p">)</span>
</pre></div>
</div>
</div>
<div class="section" id="creating-the-periodic-task">
<h2>Creating the periodic task<a class="headerlink" href="#creating-the-periodic-task" title="Permalink to this headline">¶</a></h2>
<p>Processing the clicks every 30 minutes is easy using celery periodic tasks.</p>
<p><em>clickmuncher/tasks.py</em>:</p>
<div class="highlight-python"><pre>from celery.task import PeriodicTask
from celery.registry import tasks
from clickmuncher.messaging import process_clicks
from datetime import timedelta


class ProcessClicksTask(PeriodicTask):
    run_every = timedelta(minutes=30)

    def run(self, **kwargs):
        process_clicks()
tasks.register(ProcessClicksTask)</pre>
</div>
<p>We subclass from <a title="celery.task.base.PeriodicTask" class="reference external" href="../reference/celery.task.base.html#celery.task.base.PeriodicTask"><tt class="xref docutils literal"><span class="pre">celery.task.base.PeriodicTask</span></tt></a>, set the <tt class="docutils literal"><span class="pre">run_every</span></tt>
attribute and in the body of the task just call the <tt class="docutils literal"><span class="pre">process_clicks</span></tt>
function we wrote earlier. Finally, we register the task in the task registry
so the celery workers are able to recognize and find it.</p>
</div>
<div class="section" id="finishing">
<h2>Finishing<a class="headerlink" href="#finishing" title="Permalink to this headline">¶</a></h2>
<p>There are still ways to improve this application. The URLs could be cleaned
so that <a class="reference external" href="http://google.com">http://google.com</a> and <a class="reference external" href="http://google.com/">http://google.com/</a> are treated as the same URL. Maybe it&#8217;s
even possible to update the click count using a single UPDATE query?</p>
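<p>As one possible starting point for the URL clean-up, here is a rough sketch using the Python 3 standard library. The <tt class="docutils literal"><span class="pre">normalize_url</span></tt> function is a hypothetical helper, not part of the tutorial&#8217;s code, and it only handles two cases: letter case in the scheme and host, and a missing path. It does not treat user names, default ports or trailing slashes on longer paths specially.</p>
<div class="highlight-python"><pre>
```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Lower-case the scheme and host and treat an empty path as "/"."""
    parts = urlsplit(url)
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, parts.fragment))


a = normalize_url("http://google.com")
b = normalize_url("HTTP://Google.com/")
```
</pre></div>
<p>Both calls return <tt class="docutils literal"><span class="pre">http://google.com/</span></tt>, so applying such a function before sending the message would make the two spellings share a single counter.</p>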
<p>If you have any questions regarding this tutorial, please send a mail to the
mailing list or come join us in the #celery IRC channel on Freenode:</p>
<blockquote>
<a class="reference external" href="http://celeryq.org/introduction.html#getting-help">http://celeryq.org/introduction.html#getting-help</a></blockquote>
</div>
</div>
</div>
</div>
</div>
<div class="sphinxsidebar">
<div class="sphinxsidebarwrapper">
<h3><a href="../index.html">Table Of Contents</a></h3>
<ul>
<li><a class="reference external" href="">Tutorial: Creating a click counter using carrot and celery</a><ul>
<li><a class="reference external" href="#introduction">Introduction</a></li>
<li><a class="reference external" href="#the-model">The model</a></li>
<li><a class="reference external" href="#using-carrot-to-send-clicks-as-messages">Using carrot to send clicks as messages</a></li>
<li><a class="reference external" href="#view-and-urls">View and URLs</a></li>
<li><a class="reference external" href="#creating-the-periodic-task">Creating the periodic task</a></li>
<li><a class="reference external" href="#finishing">Finishing</a></li>
</ul>
</li>
</ul>
<h3>This Page</h3>
<ul class="this-page-menu">
<li><a href="../sources/tutorials/clickcounter.txt"
rel="nofollow">Show Source</a></li>
</ul>
<div id="searchbox" style="display: none">
<h3>Quick search</h3>
<form class="search" action="../search.html" method="get">
<input type="text" name="q" size="18" />
<input type="submit" value="Go" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
<p class="searchtip" style="font-size: 90%">
Enter search terms or a module, class or function name.
</p>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="related">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../genindex.html" title="General Index"
>index</a></li>
<li class="right" >
<a href="../modindex.html" title="Global Module Index"
>modules</a> |</li>
<li><a href="../index.html">Celery v0.4.2 (stable) documentation</a> &raquo;</li>
</ul>
</div>
<div class="footer">
&copy; Copyright 2009, Ask Solem.
Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 0.6.1.
</div>
</body>
</html>