  1. <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  2. "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
  3. <html xmlns="http://www.w3.org/1999/xhtml">
  4. <head>
  5. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
  6. <title>Tutorial: Creating a click counter using carrot and celery &mdash; Celery v0.7.0 (unstable) documentation</title>
  7. <link rel="stylesheet" href="../static/nature.css" type="text/css" />
  8. <link rel="stylesheet" href="../static/pygments.css" type="text/css" />
  9. <script type="text/javascript">
  10. var DOCUMENTATION_OPTIONS = {
  11. URL_ROOT: '../',
  12. VERSION: '0.7.0 (unstable)',
  13. COLLAPSE_MODINDEX: false,
  14. FILE_SUFFIX: '.html',
  15. HAS_SOURCE: true
  16. };
  17. </script>
  18. <script type="text/javascript" src="../static/jquery.js"></script>
  19. <script type="text/javascript" src="../static/doctools.js"></script>
  20. <link rel="top" title="Celery v0.7.0 (unstable) documentation" href="../index.html" />
  21. <link rel="up" title="Tutorials" href="index.html" />
  22. <link rel="next" title="Frequently Asked Questions" href="../faq.html" />
  23. <link rel="prev" title="Tutorials" href="index.html" />
  24. </head>
  25. <body>
  26. <div class="related">
  27. <h3>Navigation</h3>
  28. <ul>
  29. <li class="right" style="margin-right: 10px">
  30. <a href="../genindex.html" title="General Index"
  31. accesskey="I">index</a></li>
  32. <li class="right" >
  33. <a href="../modindex.html" title="Global Module Index"
  34. accesskey="M">modules</a> |</li>
  35. <li class="right" >
  36. <a href="../faq.html" title="Frequently Asked Questions"
  37. accesskey="N">next</a> |</li>
  38. <li class="right" >
  39. <a href="index.html" title="Tutorials"
  40. accesskey="P">previous</a> |</li>
  41. <li><a href="../index.html">Celery v0.7.0 (unstable) documentation</a> &raquo;</li>
  42. <li><a href="index.html" accesskey="U">Tutorials</a> &raquo;</li>
  43. </ul>
  44. </div>
  45. <div class="document">
  46. <div class="documentwrapper">
  47. <div class="bodywrapper">
  48. <div class="body">
  49. <div class="section" id="tutorial-creating-a-click-counter-using-carrot-and-celery">
  50. <h1>Tutorial: Creating a click counter using carrot and celery<a class="headerlink" href="#tutorial-creating-a-click-counter-using-carrot-and-celery" title="Permalink to this headline">¶</a></h1>
  51. <div class="section" id="introduction">
  52. <h2>Introduction<a class="headerlink" href="#introduction" title="Permalink to this headline">¶</a></h2>
<p>A click counter should be easy, right? Just a simple view that increments
a click count in the database and forwards you to the real destination.</p>
  55. <p>This would work well for most sites, but when traffic starts to increase,
  56. you are likely to bump into problems. One database write for every click is
  57. not good if you have millions of clicks a day.</p>
  58. <p>So what can you do? In this tutorial we will send the individual clicks as
  59. messages using <tt class="docutils literal"><span class="pre">carrot</span></tt>, and then process them later with a <tt class="docutils literal"><span class="pre">celery</span></tt>
  60. periodic task.</p>
<p>Celery and carrot work well in tandem, and while this might not be
the perfect example, you&#8217;ll at least see one example of how they can be used
to solve a task.</p>
  64. </div>
  65. <div class="section" id="the-model">
  66. <h2>The model<a class="headerlink" href="#the-model" title="Permalink to this headline">¶</a></h2>
<p>The model is simple: <tt class="docutils literal"><span class="pre">Click</span></tt> stores the URL in a unique field, along with the number of
clicks for that URL. Its manager, <tt class="docutils literal"><span class="pre">ClickManager</span></tt>, implements the
<tt class="docutils literal"><span class="pre">increment_clicks</span></tt> method, which takes a URL and the amount to add to
its count.</p>
  71. <p><em>clickmuncher/models.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.db</span> <span class="kn">import</span> <span class="n">models</span>
<span class="kn">from</span> <span class="nn">django.utils.translation</span> <span class="kn">import</span> <span class="n">ugettext_lazy</span> <span class="k">as</span> <span class="n">_</span>


<span class="k">class</span> <span class="nc">ClickManager</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Manager</span><span class="p">):</span>

    <span class="k">def</span> <span class="nf">increment_clicks</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">for_url</span><span class="p">,</span> <span class="n">increment_by</span><span class="o">=</span><span class="mf">1</span><span class="p">):</span>
        <span class="sd">&quot;&quot;&quot;Increment the click count for an URL.</span>

<span class="sd">            &gt;&gt;&gt; Click.objects.increment_clicks(&quot;http://google.com&quot;, 10)</span>

<span class="sd">        &quot;&quot;&quot;</span>
        <span class="n">click</span><span class="p">,</span> <span class="n">created</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">get_or_create</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">for_url</span><span class="p">,</span>
                                            <span class="n">defaults</span><span class="o">=</span><span class="p">{</span><span class="s">&quot;click_count&quot;</span><span class="p">:</span> <span class="n">increment_by</span><span class="p">})</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">created</span><span class="p">:</span>
            <span class="n">click</span><span class="o">.</span><span class="n">click_count</span> <span class="o">+=</span> <span class="n">increment_by</span>
            <span class="n">click</span><span class="o">.</span><span class="n">save</span><span class="p">()</span>

        <span class="k">return</span> <span class="n">click</span><span class="o">.</span><span class="n">click_count</span>


<span class="k">class</span> <span class="nc">Click</span><span class="p">(</span><span class="n">models</span><span class="o">.</span><span class="n">Model</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">URLField</span><span class="p">(</span><span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL&quot;</span><span class="p">),</span> <span class="n">verify_exists</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="n">unique</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
    <span class="n">click_count</span> <span class="o">=</span> <span class="n">models</span><span class="o">.</span><span class="n">PositiveIntegerField</span><span class="p">(</span><span class="n">_</span><span class="p">(</span><span class="s">u&quot;click_count&quot;</span><span class="p">),</span>
                                              <span class="n">default</span><span class="o">=</span><span class="mf">0</span><span class="p">)</span>

    <span class="n">objects</span> <span class="o">=</span> <span class="n">ClickManager</span><span class="p">()</span>

    <span class="k">class</span> <span class="nc">Meta</span><span class="p">:</span>
        <span class="n">verbose_name</span> <span class="o">=</span> <span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL clicks&quot;</span><span class="p">)</span>
        <span class="n">verbose_name_plural</span> <span class="o">=</span> <span class="n">_</span><span class="p">(</span><span class="s">u&quot;URL clicks&quot;</span><span class="p">)</span>
</pre></div>
  94. </div>
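<p>To see the create-or-increment behaviour of <tt class="docutils literal"><span class="pre">increment_clicks</span></tt> without a database, here is a rough sketch in plain Python 3. The <tt class="docutils literal"><span class="pre">InMemoryClicks</span></tt> class is a hypothetical stand-in for the table and is not part of the tutorial&#8217;s application; it only mirrors the counting logic.</p>

```python
class InMemoryClicks:
    """Hypothetical stand-in for the Click table: maps URL -> click count."""

    def __init__(self):
        self.counts = {}

    def increment_clicks(self, for_url, increment_by=1):
        # A new URL starts at the increment itself; an existing URL
        # has the increment added to its current count.
        self.counts[for_url] = self.counts.get(for_url, 0) + increment_by
        return self.counts[for_url]


clicks = InMemoryClicks()
clicks.increment_clicks("http://google.com", 10)     # first call creates the entry
total = clicks.increment_clicks("http://google.com")  # second call adds one more
print(total)
```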
  95. </div>
  96. <div class="section" id="using-carrot-to-send-clicks-as-messages">
  97. <h2>Using carrot to send clicks as messages<a class="headerlink" href="#using-carrot-to-send-clicks-as-messages" title="Permalink to this headline">¶</a></h2>
<p>The model is ordinary Django, nothing new there. Now we get on to
the messaging. I usually put a project&#8217;s messaging-related
code in its own <tt class="docutils literal"><span class="pre">messaging.py</span></tt> module, and I will do so
here; you may want to adopt the practice too. In this module we have two
functions:</p>
  103. <ul>
  104. <li><p class="first"><tt class="docutils literal"><span class="pre">send_increment_clicks</span></tt></p>
<p>This function sends a simple message to the broker. The message body contains
only the URL to increment, as plain text, so it is the exchange and
routing key that determine where the message goes. We use an exchange called <tt class="docutils literal"><span class="pre">clicks</span></tt>, with a
routing key of <tt class="docutils literal"><span class="pre">increment_click</span></tt>, so any consumer binding a queue to
this exchange using this routing key will receive these messages.</p>
  110. </li>
  111. <li><p class="first"><tt class="docutils literal"><span class="pre">process_clicks</span></tt></p>
<p>This function processes all currently gathered clicks sent using
<tt class="docutils literal"><span class="pre">send_increment_clicks</span></tt>. Instead of issuing one database query for every
click, it processes all of the messages first, calculates the new click count
and issues one update per URL. A message that has been received will not be
deleted from the broker until it has been acknowledged by the receiver, so
if the receiver dies in the middle of processing the message, it will be
re-sent at a later point in time. This guarantees delivery, and we respect
this feature here by not acknowledging the message until the clicks have
actually been written to disk.</p>
<p><strong>Note</strong>: This could probably be optimized further with
some hand-written SQL, but it will do for now. Let&#8217;s say it&#8217;s an exercise
left for the picky reader, albeit a discouraged one if you can survive
without doing it.</p>
  125. </li>
  126. </ul>
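<p>The batching and late-acknowledgement idea behind <tt class="docutils literal"><span class="pre">process_clicks</span></tt> can be sketched without a broker. This is only an illustration in Python 3: <tt class="docutils literal"><span class="pre">FakeMessage</span></tt>, <tt class="docutils literal"><span class="pre">process_batch</span></tt> and the <tt class="docutils literal"><span class="pre">write_count</span></tt> callback are hypothetical names, not carrot APIs.</p>

```python
from collections import defaultdict


class FakeMessage:
    """Hypothetical stand-in for a broker message with a body and ack()."""

    def __init__(self, body):
        self.body = body
        self.acked = False

    def ack(self):
        self.acked = True


def process_batch(messages, write_count):
    """Tally clicks per URL, write each total once, then ack its messages.

    write_count(url, count) is an assumed callback that persists the count.
    """
    messages_for_url = defaultdict(list)
    for message in messages:
        messages_for_url[message.body].append(message)

    for url, url_messages in messages_for_url.items():
        write_count(url, len(url_messages))
        # Acknowledge only after the write returned, so a failure before
        # this point leaves the messages unacknowledged for redelivery.
        for message in url_messages:
            message.ack()


written = {}


def record(url, count):
    written[url] = written.get(url, 0) + count


batch = [FakeMessage("http://a.example/"),
         FakeMessage("http://b.example/"),
         FakeMessage("http://a.example/")]
process_batch(batch, record)
print(written)
```

<p>Three messages result in two writes, one per distinct URL, and every message is acknowledged only after its URL&#8217;s write has completed.</p>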
  127. <p>On to the code...</p>
  128. <p><em>clickmuncher/messaging.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">carrot.connection</span> <span class="kn">import</span> <span class="n">DjangoBrokerConnection</span>
<span class="kn">from</span> <span class="nn">carrot.messaging</span> <span class="kn">import</span> <span class="n">Publisher</span><span class="p">,</span> <span class="n">Consumer</span>
<span class="kn">from</span> <span class="nn">clickmuncher.models</span> <span class="kn">import</span> <span class="n">Click</span>


<span class="k">def</span> <span class="nf">send_increment_clicks</span><span class="p">(</span><span class="n">for_url</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;Send a message for incrementing the click count for an URL.&quot;&quot;&quot;</span>
    <span class="n">connection</span> <span class="o">=</span> <span class="n">DjangoBrokerConnection</span><span class="p">()</span>
    <span class="n">publisher</span> <span class="o">=</span> <span class="n">Publisher</span><span class="p">(</span><span class="n">connection</span><span class="o">=</span><span class="n">connection</span><span class="p">,</span>
                          <span class="n">exchange</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                          <span class="n">routing_key</span><span class="o">=</span><span class="s">&quot;increment_click&quot;</span><span class="p">,</span>
                          <span class="n">exchange_type</span><span class="o">=</span><span class="s">&quot;direct&quot;</span><span class="p">)</span>

    <span class="n">publisher</span><span class="o">.</span><span class="n">send</span><span class="p">(</span><span class="n">for_url</span><span class="p">)</span>

    <span class="n">publisher</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
    <span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>


<span class="k">def</span> <span class="nf">process_clicks</span><span class="p">():</span>
    <span class="sd">&quot;&quot;&quot;Process all currently gathered clicks by saving them to the</span>
<span class="sd">    database.&quot;&quot;&quot;</span>
    <span class="n">connection</span> <span class="o">=</span> <span class="n">DjangoBrokerConnection</span><span class="p">()</span>
    <span class="n">consumer</span> <span class="o">=</span> <span class="n">Consumer</span><span class="p">(</span><span class="n">connection</span><span class="o">=</span><span class="n">connection</span><span class="p">,</span>
                        <span class="n">queue</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                        <span class="n">exchange</span><span class="o">=</span><span class="s">&quot;clicks&quot;</span><span class="p">,</span>
                        <span class="n">routing_key</span><span class="o">=</span><span class="s">&quot;increment_click&quot;</span><span class="p">,</span>
                        <span class="n">exchange_type</span><span class="o">=</span><span class="s">&quot;direct&quot;</span><span class="p">)</span>

    <span class="c"># First process the messages: save the number of clicks</span>
    <span class="c"># for every URL.</span>
    <span class="n">clicks_for_url</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="n">messages_for_url</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">for</span> <span class="n">message</span> <span class="ow">in</span> <span class="n">consumer</span><span class="o">.</span><span class="n">iterqueue</span><span class="p">():</span>
        <span class="n">url</span> <span class="o">=</span> <span class="n">message</span><span class="o">.</span><span class="n">body</span>
        <span class="n">clicks_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span> <span class="o">=</span> <span class="n">clicks_for_url</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="mf">0</span><span class="p">)</span> <span class="o">+</span> <span class="mf">1</span>
        <span class="c"># We also need to keep the message objects so we can ack the</span>
        <span class="c"># messages as processed when we are finished with them.</span>
        <span class="k">if</span> <span class="n">url</span> <span class="ow">in</span> <span class="n">messages_for_url</span><span class="p">:</span>
            <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">message</span><span class="p">)</span>
        <span class="k">else</span><span class="p">:</span>
            <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="n">message</span><span class="p">]</span>

    <span class="c"># Then increment the clicks in the database so we only need</span>
    <span class="c"># one UPDATE/INSERT for each URL.</span>
    <span class="k">for</span> <span class="n">url</span><span class="p">,</span> <span class="n">click_count</span> <span class="ow">in</span> <span class="n">clicks_for_url</span><span class="o">.</span><span class="n">items</span><span class="p">():</span>
        <span class="n">Click</span><span class="o">.</span><span class="n">objects</span><span class="o">.</span><span class="n">increment_clicks</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">click_count</span><span class="p">)</span>
        <span class="c"># Now that the clicks have been registered for this URL we can</span>
        <span class="c"># acknowledge the messages.</span>
        <span class="k">for</span> <span class="n">message</span> <span class="ow">in</span> <span class="n">messages_for_url</span><span class="p">[</span><span class="n">url</span><span class="p">]:</span>
            <span class="n">message</span><span class="o">.</span><span class="n">ack</span><span class="p">()</span>

    <span class="n">consumer</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
    <span class="n">connection</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
</pre></div>
  174. </div>
  175. </div>
  176. <div class="section" id="view-and-urls">
  177. <h2>View and URLs<a class="headerlink" href="#view-and-urls" title="Permalink to this headline">¶</a></h2>
<p>This is also simple, so the code needs little explanation.
The interface is as follows: if you have a link to <a class="reference external" href="http://google.com">http://google.com</a> and
want to count its clicks, you replace the URL with:</p>
  181. <blockquote>
  182. <a class="reference external" href="http://mysite/clickmuncher/count/?u=http://google.com">http://mysite/clickmuncher/count/?u=http://google.com</a></blockquote>
  183. <p>and the <tt class="docutils literal"><span class="pre">count</span></tt> view will send off an increment message and forward you to
  184. that site.</p>
  185. <p><em>clickmuncher/views.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.http</span> <span class="kn">import</span> <span class="n">HttpResponseRedirect</span>
<span class="kn">from</span> <span class="nn">clickmuncher.messaging</span> <span class="kn">import</span> <span class="n">send_increment_clicks</span>


<span class="k">def</span> <span class="nf">count</span><span class="p">(</span><span class="n">request</span><span class="p">):</span>
    <span class="n">url</span> <span class="o">=</span> <span class="n">request</span><span class="o">.</span><span class="n">GET</span><span class="p">[</span><span class="s">&quot;u&quot;</span><span class="p">]</span>
    <span class="n">send_increment_clicks</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">HttpResponseRedirect</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
</pre></div>
  193. </div>
  194. <p><em>clickmuncher/urls.py</em>:</p>
<div class="highlight-python"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">django.conf.urls.defaults</span> <span class="kn">import</span> <span class="n">patterns</span><span class="p">,</span> <span class="n">url</span>
<span class="kn">from</span> <span class="nn">clickmuncher</span> <span class="kn">import</span> <span class="n">views</span>

<span class="n">urlpatterns</span> <span class="o">=</span> <span class="n">patterns</span><span class="p">(</span><span class="s">&quot;&quot;</span><span class="p">,</span>
    <span class="n">url</span><span class="p">(</span><span class="s">r&#39;^$&#39;</span><span class="p">,</span> <span class="n">views</span><span class="o">.</span><span class="n">count</span><span class="p">,</span> <span class="n">name</span><span class="o">=</span><span class="s">&quot;clickmuncher-count&quot;</span><span class="p">),</span>
<span class="p">)</span>
</pre></div>
  201. </div>
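<p>When the destination URL has a query string of its own, it has to be percent-encoded before it is placed in the <tt class="docutils literal"><span class="pre">u</span></tt> parameter; otherwise its <tt class="docutils literal"><span class="pre">&amp;</span></tt> characters are parsed as separate parameters of the counting URL. A small Python 3 sketch of building such a link follows. The <tt class="docutils literal"><span class="pre">counted_link</span></tt> helper and <tt class="docutils literal"><span class="pre">COUNT_ENDPOINT</span></tt> constant are hypothetical names, and the host is just the example host used above.</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Example endpoint taken from the text; adjust to wherever the view is mounted.
COUNT_ENDPOINT = "http://mysite/clickmuncher/count/"


def counted_link(target_url):
    """Return the click-counting URL that forwards to target_url."""
    # urlencode percent-encodes the target so it stays a single parameter.
    return COUNT_ENDPOINT + "?" + urlencode({"u": target_url})


target = "http://google.com/search?q=celery&hl=en"
link = counted_link(target)

# Parsing the query string again gives back the original target URL.
recovered = parse_qs(urlsplit(link).query)["u"][0]
print(link)
print(recovered == target)
```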
  202. </div>
  203. <div class="section" id="creating-the-periodic-task">
  204. <h2>Creating the periodic task<a class="headerlink" href="#creating-the-periodic-task" title="Permalink to this headline">¶</a></h2>
  205. <p>Processing the clicks every 30 minutes is easy using celery periodic tasks.</p>
  206. <p><em>clickmuncher/tasks.py</em>:</p>
<div class="highlight-python"><pre>from celery.task import PeriodicTask
from celery.registry import tasks
from clickmuncher.messaging import process_clicks
from datetime import timedelta


class ProcessClicksTask(PeriodicTask):
    run_every = timedelta(minutes=30)

    def run(self, **kwargs):
        process_clicks()


tasks.register(ProcessClicksTask)</pre>
  216. </div>
<p>We subclass from <a title="celery.task.base.PeriodicTask" class="reference external" href="../reference/celery.task.base.html#celery.task.base.PeriodicTask"><tt class="xref docutils literal"><span class="pre">celery.task.base.PeriodicTask</span></tt></a>, set the <tt class="docutils literal"><span class="pre">run_every</span></tt>
attribute and in the body of the task just call the <tt class="docutils literal"><span class="pre">process_clicks</span></tt>
function we wrote earlier. Finally, we register the task in the task registry
so the celery workers are able to recognize and find it.</p>
  221. </div>
  222. <div class="section" id="finishing">
  223. <h2>Finishing<a class="headerlink" href="#finishing" title="Permalink to this headline">¶</a></h2>
<p>There are still ways to improve this application. The URLs could be normalized
so that <a class="reference external" href="http://google.com">http://google.com</a> and <a class="reference external" href="http://google.com/">http://google.com/</a> are counted as the same URL. Maybe it&#8217;s
even possible to update the click count using a single UPDATE query?</p>
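<p>One possible way to clean the URLs, sketched in Python 3 with the standard library. The <tt class="docutils literal"><span class="pre">normalize_url</span></tt> function is a hypothetical helper, and the policy it applies (lower-case scheme and host, empty path treated as <tt class="docutils literal"><span class="pre">/</span></tt>, fragment dropped) is an assumption; other policies may suit your site better.</p>

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Return a canonical form of url to use as the counting key."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    # Simplification: this lower-cases the whole network location,
    # which is fine for plain host names.
    netloc = parts.netloc.lower()
    # An empty path and "/" refer to the same resource.
    path = parts.path or "/"
    # The fragment is never sent to the server, so it is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


same = normalize_url("http://google.com") == normalize_url("http://google.com/")
print(same)
```

<p>The view could apply such a function before calling <tt class="docutils literal"><span class="pre">send_increment_clicks</span></tt>, so both spellings end up on the same row.</p>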
  227. <p>If you have any questions regarding this tutorial, please send a mail to the
  228. mailing-list or come join us in the #celery IRC channel at Freenode:
  229. <a class="reference external" href="http://celeryq.org/introduction.html#getting-help">http://celeryq.org/introduction.html#getting-help</a></p>
  230. </div>
  231. </div>
  232. </div>
  233. </div>
  234. </div>
  235. <div class="sphinxsidebar">
  236. <div class="sphinxsidebarwrapper">
  237. <h3><a href="../index.html">Table Of Contents</a></h3>
  238. <ul>
  239. <li><a class="reference external" href="">Tutorial: Creating a click counter using carrot and celery</a><ul>
  240. <li><a class="reference external" href="#introduction">Introduction</a></li>
  241. <li><a class="reference external" href="#the-model">The model</a></li>
  242. <li><a class="reference external" href="#using-carrot-to-send-clicks-as-messages">Using carrot to send clicks as messages</a></li>
  243. <li><a class="reference external" href="#view-and-urls">View and URLs</a></li>
  244. <li><a class="reference external" href="#creating-the-periodic-task">Creating the periodic task</a></li>
  245. <li><a class="reference external" href="#finishing">Finishing</a></li>
  246. </ul>
  247. </li>
  248. </ul>
  249. <h4>Previous topic</h4>
  250. <p class="topless"><a href="index.html"
  251. title="previous chapter">Tutorials</a></p>
  252. <h4>Next topic</h4>
  253. <p class="topless"><a href="../faq.html"
  254. title="next chapter">Frequently Asked Questions</a></p>
  255. <h3>This Page</h3>
  256. <ul class="this-page-menu">
  257. <li><a href="../sources/tutorials/clickcounter.txt"
  258. rel="nofollow">Show Source</a></li>
  259. </ul>
  260. <div id="searchbox" style="display: none">
  261. <h3>Quick search</h3>
  262. <form class="search" action="../search.html" method="get">
  263. <input type="text" name="q" size="18" />
  264. <input type="submit" value="Go" />
  265. <input type="hidden" name="check_keywords" value="yes" />
  266. <input type="hidden" name="area" value="default" />
  267. </form>
  268. <p class="searchtip" style="font-size: 90%">
  269. Enter search terms or a module, class or function name.
  270. </p>
  271. </div>
  272. <script type="text/javascript">$('#searchbox').show(0);</script>
  273. </div>
  274. </div>
  275. <div class="clearer"></div>
  276. </div>
  296. <div class="footer">
  297. &copy; Copyright 2009, Ask Solem.
  298. Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 0.6.2.
  299. </div>
  300. </body>
  301. </html>