DonaldRauscher.comhttp://www.donaldrauscher.com/A Blog About D4T4 & M47HSun, 09 Sep 2018 00:00:00 -0500Using tf.Transform For Input Pipelineshttp://www.donaldrauscher.com/tf-transform.html<p>When initially building <a href="movie-reviews-tf.html">my movie classification model</a>, I used a version of the dataset that had already been preprocessed into TFRecords. Though convenient, this created a problem when deploying the model; I wasn't able to replicate the preprocessing in my serving environment leading to training-serving skew. My solution: <a href="https://github.com/tensorflow/transform">tf.Transform …</a></p>Donald RauscherSun, 09 Sep 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-09-09:/tf-transform.htmltensorflowtf-transformapache-beamcloud-mlClassifying Movie Reviews with TensorFlowhttp://www.donaldrauscher.com/movie-reviews-tf.html<p>09-09-2018 Update: My initial deployment of this model had training-serving skew since I was simply splitting words by spaces and feeding into the model. To properly serve this model, I needed to replicate the preprocessing in the <a href="https://www.tensorflow.org/guide/saved_model#prepare_serving_inputs">serving input receiver</a>. There is a nifty tool for this in the TF …</p>Donald RauscherWed, 29 Aug 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-08-29:/movie-reviews-tf.htmltensorflowcloud-mlnlpTwo Options for Hosting a Private PyPI Repositoryhttp://www.donaldrauscher.com/private-pypi.html<p>A few years back, I read <a href="https://medium.com/airbnb-engineering/using-r-packages-and-education-to-scale-data-science-at-airbnb-906faa58e12d">an interesting post</a> about how Airbnb's data science team developed their own internal R package, Rbnb, to standardize solutions to common problems and reduce redundancy across projects. 
I really like this idea and have implemented a similar solution for Python at places that I …</p>Donald RauscherSat, 11 Aug 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-08-11:/private-pypi.htmlpython pypi gcp cloud_build git_ops ci_cdBuilding Pipelines in K8s with Brigadehttp://www.donaldrauscher.com/brigade-crypto.html<p>Kubernetes started as a deployment option for stateless services. However, people are increasingly using Kubernetes clusters to execute complex workflows for CI/CD, ETL, machine learning, etc. And there are a number of tools/projects that have sprung up to help orchestrate these workflows. Two that I have been exploring …</p>Donald RauscherSat, 14 Jul 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-07-14:/brigade-crypto.htmlk8s gke brigade pipelines cryptocurrencyTopic Modeling Fake Newshttp://www.donaldrauscher.com/fake-news.html<p>I decided to change things up a little bit and take on an unsupervised learning task: topic modeling. For this, I explored an endlessly entertaining dataset, <a href="https://www.kaggle.com/mrisdal/fake-news/data">a database of fake news articles</a> compiled by Kaggle. It is comprised of ~13K different articles from 200 different sources circa Oct'16 - Nov'16 (a …</p>Donald RauscherTue, 26 Jun 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-06-26:/fake-news.htmlnlp topic_modeling kaggle sklearnMoving My Blog from Jekyll/GitHub to Pelican/GCShttp://www.donaldrauscher.com/pelican-blog.html<p>I recently moved this blog from Jekyll/GitHub to Pelican/GCS. Mainly, I wanted to move to a Python-based framework where I would have more flexibility to customize (e.g. add/create plugins). And cost isn't really a consideration as both options are free. 
<a href="https://jekyllrb.com/docs/github-pages/">GitHub pages</a> is actually powered by …</p>Donald RauscherMon, 28 May 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-05-28:/pelican-blog.htmlstatic_site blog pelican jekyll google_container_builderUsing Word2Vec for "Code Names"http://www.donaldrauscher.com/w2v-code-names.html<p><a href="https://en.wikipedia.org/wiki/Codenames_(board_game)">"Code Names"</a> Rules: People are divided into two teams. The board is comprised of 25 words divided into 4 categories: blue team, red team, neutral, and the death word. People are divided evenly into two teams (red and blue). In each round, two people from either team take turns giving …</p>Donald RauscherSat, 12 May 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-05-12:/w2v-code-names.htmlword2vecnlpdashdockerDoc2Vec + Dask + K8s for the Toxic Comment Classification Challengehttp://www.donaldrauscher.com/kaggle-jigsaw.html<p>The goal of <a href="https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge">this Kaggle challenge</a> was to build a model to flag toxic Wikipedia comments. The training dataset included 159,571 Wikipedia comments which were labeled by human raters. Each comment was evaluated on 6 dimensions: toxic, severe toxic, obscene, threat, insult, and identity hate.</p>
<h2>Model Approach</h2>
<p>This challenge …</p>Donald RauscherThu, 22 Mar 2018 00:00:00 -0500tag:www.donaldrauscher.com,2018-03-22:/kaggle-jigsaw.htmlkagglenlpdoc2vecgkedasksklearnSetting up Apache Airflow on GKEhttp://www.donaldrauscher.com/airflow-gke.html<p>Historically, I have used <a href="https://luigi.readthedocs.io/en/latest/">Luigi</a> for a lot of my data pipelining. Recently, however, I have started experimenting with <a href="https://airflow.apache.org/">Airflow</a> for <a href="https://www.quora.com/Which-is-a-better-data-pipeline-scheduling-platform-Airflow-or-Luigi">a variety of reasons</a>. Some things I really like about Airflow:</p>
<ul>
<li><strong>Easier to parallelize</strong> - Luigi can only be scaled <em>locally</em>. You can create multiple worker threads by passing <code>--workers …</code></li></ul>Donald RauscherTue, 06 Feb 2018 00:00:00 -0600tag:www.donaldrauscher.com,2018-02-06:/airflow-gke.htmletlpipeliningairflowgkegcpQuick and Easy BI: Setting up Redash on GKEhttp://www.donaldrauscher.com/redash-gke.html<p>Professionally, I have worked quite a lot with BI platforms Looker and Tableau. They are great BI platforms for an organization, though probably too heavy (and too expensive) for a small project or a bootstrapping startup. Sometimes you just need something where you can write queries and dump them into …</p>Donald RauscherSun, 31 Dec 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-12-31:/redash-gke.htmlbusiness_intelligenceredashgkegcpHigh Cardinality Categoricals with Sklearnhttp://www.donaldrauscher.com/sklearn-hcc.html<p>I used <a href="/kaggle-renthop">a Bayesian approach</a> to encode high cardinality categorical variables in a Kaggle competition a few months back. My original implementation was in R. However, I have recently been doing most of my modeling in sklearn, so I decided to implement this approach there as well.</p>
<p>This approach lends …</p>Donald RauscherMon, 18 Dec 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-12-18:/sklearn-hcc.htmlpythonsklearnmachine_learningfeature_engineeringModel Stacking with Sklearnhttp://www.donaldrauscher.com/sklearn-stack.html<p><a href="https://rd.springer.com/content/pdf/10.1007%2FBF00117832.pdf">Stacking</a>, also called meta ensembling, is a technique used to boost predictive accuracy by blending the predictions of multiple models. This technique is most effective when you have multiple, well-performing models which are <em>not</em> overly similar. Participants in Kaggle competitions will observe that winning solutions are often blends of multiple …</p>Donald RauscherSun, 10 Dec 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-12-10:/sklearn-stack.htmlpythonsklearnmachine_learningmodel_stackingIdentifying Frequent Item Sets using Apache Beam/Dataflowhttp://www.donaldrauscher.com/dataflow-apriori.html<p>I have used Google's <strong>serverless DW service</strong>, BigQuery, for several of my projects this past year. I recently started familiarizing myself with Google's <strong>serverless data pipeline service</strong>, Dataflow. This post shows how to build a pipeline to identify frequently purchased item sets in <a href="https://www.kaggle.com/c/instacart-market-basket-analysis">market basket data</a> from Instacart (3 …</p>Donald RauscherFri, 24 Nov 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-11-24:/dataflow-apriori.htmlgcpdataflowserverlessapache_beamaprioriassociation_rulesHow to Deploy a Shiny App on Google Kubernetes Enginehttp://www.donaldrauscher.com/shiny-on-docker.html<p><a href="https://shiny.rstudio.com/">Shiny</a> is an awesome tool for building interactive apps powered by R. There are a couple of options for <a href="https://shiny.rstudio.com/deploy/">deploying</a> Shiny apps. You can deploy to <a href="http://www.shinyapps.io/">Shinyapps.io</a>. 
You can also deploy on your own machine using the open-source Shiny Server. This tutorial shows how to set up a Docker container for …</p>Donald RauscherMon, 02 Oct 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-10-02:/shiny-on-docker.htmlgcpgkedockercontainerskubernetesshinyfarkleHow to Stream Raw Google Analytics Data into BigQueryhttp://www.donaldrauscher.com/ga-bq-stream.html<p>08-13-2018 Update: <a href="https://cloud.google.com/functions/docs/release-notes">As of 07-24-2018</a>, you can now write Google Cloud Functions in Python! I re-wrote the Cloud Function in this post in Python.</p>
<p>I have been using Google Analytics for a while for my own projects. The Google Analytics interface is great for helping you track activity on your …</p>Donald RauscherTue, 19 Sep 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-09-19:/ga-bq-stream.htmlgoogle_analyticsbigquerycloud_functionsetlAdd Some Game Theory to Your Fantasy Football Drafthttp://www.donaldrauscher.com/fantasy-football.html<p>Does your projected starting quarterback have an <a href="https://www.pro-football-reference.com/players/M/McCoJo01.htm">18-42</a> starting record? Did your team decide to complement its <a href="https://www.sbnation.com/2017/7/27/16053650/odell-beckham-jr-highest-paid-player-nfl">crazy wide receiver</a> with...<a href="http://bleacherreport.com/articles/2685133-brandon-marshall-comments-on-jets-season-and-locker-room-tension">another crazy wide receiver</a>? Did your team sign Mike Glennon, who has a <a href="https://www.pro-football-reference.com/players/G/GlenMi00.htm">5-13</a> starting record, to a $43.5M deal because he's...tall, then trade the #3 pick …</p>Donald RauscherThu, 31 Aug 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-08-31:/fantasy-football.htmlfantasy_footballgame_theoryspreadsheet_modelingUsing Google Dataproc for the Kaggle Instacart Challengehttp://www.donaldrauscher.com/kaggle-instacart.html<p>I recently competed in <a href="https://www.kaggle.com/c/instacart-market-basket-analysis">this Kaggle competition</a>. It's a challenging problem because we're not just trying to predict whether someone will buy a specific product; we're trying to predict the <em>entirety</em> of someone's next order. And there are 49,688 possible products. 
Furthermore, in the train orders, 60% of the …</p>Donald RauscherSun, 30 Jul 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-07-30:/kaggle-instacart.htmlsparkdataprocgcpkaggleDealing with High Cardinality Categorical Variables & Other Learnings from the Kaggle Renthop Challengehttp://www.donaldrauscher.com/kaggle-renthop.html<p>I recently completed the <a href="kaggle.com/c/two-sigma-connect-rental-listing-inquiries">Kaggle Renthop Competition</a>. I had a lot of fun with it. One of my biggest takeaways from the competition was developing a transferable approach for dealing with high cardinality categorical variables like ZIP codes, NAICS industry codes, ICD10 diagnosis codes etc. I developed a simple Bayesian …</p>Donald RauscherTue, 20 Jun 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-06-20:/kaggle-renthop.htmlkagglerpredictive_modelingA simple Salesforce.com bulk API client for Python 3http://www.donaldrauscher.com/sfdc-bulk-api.html<p>I recently put together <a href="https://github.com/donaldrauscher/sfdc-bulk">my first Python package</a>! My company (AbleTo) recently migrated our CRM to Salesforce.com. Though not the most elegant interface, many tools (e.g. Marketo) provide off-the-shelf integrations with Salesforce, which was very appealing to us. To migrate all of our customer data (several million records …</p>Donald RauscherWed, 15 Mar 2017 00:00:00 -0500tag:www.donaldrauscher.com,2017-03-15:/sfdc-bulk-api.htmlsfdcapi_clientbulk_apipython3Post-Specific Resources in Jekyllhttp://www.donaldrauscher.com/jekyll-resources.html<p>I set up this blog on Jekyll last year (largely just to have a repository for my 538 Riddler solutions haha). I really like Jekyll because it is simple, supports Markdown and Liquid templating, can be <a href="https://help.github.com/articles/using-jekyll-as-a-static-site-generator-with-github-pages/">hosted for free</a> on Github. 
However, I did recently notice that my site was …</p>Donald RauscherSat, 11 Mar 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-03-11:/jekyll-resources.htmljekyllblogliquid_templatingAnalyzing Citi Bike Data w/ BigQueryhttp://www.donaldrauscher.com/citi-bike.html<p>I've recently started using Google Cloud Platform for some of my big data analyses. In particular, I have been playing with BigQuery. Unlike AWS Redshift, BigQuery is a fully elastic, multi-tenant database. It is very easy to set up and gives you essentially infinite scale! I decided to take BigQuery for …</p>Donald RauscherMon, 06 Mar 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-03-06:/citi-bike.htmlcitibikegcpbigqueryBuilding an Optimal Portfolio of ETFshttp://www.donaldrauscher.com/etf-portfolio.html<p>Exchange traded funds (ETFs) have taken the market by storm. Over the last few years, we’ve seen a <a href="http://www.icifactbook.org/ch3/16_fb_ch3">huge shift</a> in assets towards passive investing, motivated by ETFs' low fee structure and the revelation that most active managers cannot beat their benchmark. This shouldn't be terribly surprising. It …</p>Donald RauscherSun, 26 Feb 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-02-26:/etf-portfolio.htmlinvestingportfolio_optimizationetfsluigi538 Riddler: 100-Sided Diehttp://www.donaldrauscher.com/big-die.html<p>This week's <a href="https://fivethirtyeight.com/features/how-long-will-it-take-to-blow-out-the-birthday-candles/">Riddler</a> involves a game played with a 100-sided die (I seriously want one). I started by thinking about the problem as an <a href="https://en.wikipedia.org/wiki/Absorbing_Markov_chain">absorbing Markov Chain</a> with 101 states, 1 state representing the end of the game and 100 states, one for each potential previous roll. 
The transition matrix is …</p>Donald RauscherSat, 14 Jan 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-01-14:/big-die.html538fivethirtyeightriddlerprobability538 Riddler: Martin Gardner's 'Hip' Gamehttp://www.donaldrauscher.com/hip-game-riddler.html<p>I began <a href="https://fivethirtyeight.com/features/dont-throw-out-that-calendar/">this week's Riddler</a> by deriving an expression for the number of squares on a n-sized board:
<img src="/images/hip-square-cnt.png" style="display:block; margin-left:auto; margin-right:auto;">
<div class="equation" data-expr="\begin{aligned} S(n) = & \sum_{i=1}^{n-1} i^2*(n-i) = n\sum_{i=1}^{n-1} i^2 - \sum_{i=1}^{n-1} i^3 \\
= & n \left( \frac{n(n-1)(2n-1)}{6}\right) - \frac{n^2(n-1)^2}{4} = \frac{n^2(n^2-1)}{12}
\end{aligned}"></div>
This expression is a polynomial with a degree of 4, which confirms that the number of squares grows more quickly than the area of the board, making it increasingly difficult to achieve …</p>Donald RauscherMon, 09 Jan 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-01-09:/hip-game-riddler.html538fivethirtyeightriddlerlinear_programming538 Riddler: Dice Poker Riddlerhttp://www.donaldrauscher.com/dice-poker-riddler.html<p>In <a href="http://fivethirtyeight.com/features/can-you-deal-with-these-card-game-puzzles/">this week's Riddler</a>, we have another game theory problem. We can describe each player's strategy with a 6 number tuple. For player A, <span class="inline-equation" data-expr="a_{i}"></span> represents the probability that player A raises given a roll of i. For player B, <span class="inline-equation" data-expr="b_{i}"></span> represents the probability that player B calls a raise from player …</p>Donald RauscherSun, 01 Jan 2017 00:00:00 -0600tag:www.donaldrauscher.com,2017-01-01:/dice-poker-riddler.html538fivethirtyeightriddlergame_theorylinear_programming538 Riddler: Rebel vs. Stormtroopershttp://www.donaldrauscher.com/stormtrooper-riddler.html<p>In <a href="http://fivethirtyeight.com/features/build-your-own-death-star-and-defeat-the-stormtroopers/">this week's Riddler</a>, we are rebels trying to defeat a group of 9 advancing stormtroopers. Fortunately for us, we are more accurate than the notoriously inaccurate stormtroopers, and the stormtroopers are clumped together, making them easy to pick off.</p>
<p>First, the hit / miss probabilities for the stormtroopers / rebel with …</p>Donald RauscherMon, 26 Dec 2016 00:00:00 -0600tag:www.donaldrauscher.com,2016-12-26:/stormtrooper-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Untangling the Tangled Wireshttp://www.donaldrauscher.com/untangle-riddler.html<p>The strategy for <a href="http://fivethirtyeight.com/features/everythings-mixed-up-can-you-sort-it-all-out/">this week's Riddler</a> is to continuously split the wires into halves until only pairs remain. Then, we form circuits between the pairs to pinpoint individual wires. Using this approach, we can determine the optimal number of trips when N is a power of 2. For <span class="inline-equation" data-expr="N = 2^{2} = 4"></span>, we need …</p>Donald RauscherSat, 17 Dec 2016 00:00:00 -0600tag:www.donaldrauscher.com,2016-12-17:/untangle-riddler.html538fivethirtyeightriddlerlogic538 Riddler: The CGold Warhttp://www.donaldrauscher.com/gold-war-riddler.html<p><a href="http://fivethirtyeight.com/features/how-much-gold-would-push-you-into-a-war/">This week's Riddler</a> challenges us with some game theory. Each player has a hefty $1 trillion in gold and an army whose strength is uniformly distributed between 0 and 1. Each player knows their own army's strength but not their opponent's army's strength (obviously). Each player then simultaneously declares "Peace …</p>Donald RauscherThu, 15 Dec 2016 00:00:00 -0600tag:www.donaldrauscher.com,2016-12-15:/gold-war-riddler.html538fivethirtyeightriddlergame_theory538 Riddler: Allison, Bob, and the Technicolor Dream Maphttp://www.donaldrauscher.com/map-game-riddler.html<p><svg id="map-game-riddler" style="display:block; margin-left:auto; margin-right:auto;" width="580" height="400" xmlns="http://www.w3.org/2000/svg">
<g>
<ellipse stroke="#000000" ry="172" rx="239" cy="202" cx="285" fill="#ffffaa"/>
<text x="285" y="65" dy="0.3em" text-anchor="middle" font-size="24">9</text>
</g>
<g>
<ellipse stroke="#000000" ry="93" rx="121" cy="171" cx="354" fill="#ffaaff"/>
<text x="354" y="130" dy="0.3em" text-anchor="middle" font-size="24">8</text>
</g>
<g>
<ellipse stroke="#000000" ry="73" rx="86" cy="181" cx="187" fill="#ffd4aa"/>
<text x="165" y="145" dy="0.3em" text-anchor="middle" font-size="24">7</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="51" cy="191" cx="366" fill="#56aaff"/>
<text x="366" y="191" dy="0.3em" text-anchor="middle" font-size="24">6</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="51" cy="191" cx="210" fill="#ffaaaa"/>
<text x="210" y="191" dy="0.3em" text-anchor="middle" font-size="24">5</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="51" cy="290" cx="285" fill="#56ffaa"/>
<text x="285" y="290" dy="0.3em" text-anchor="middle" font-size="24">4</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="51" cy="191" cx="285" fill="#56ffaa"/>
<text x="285" y="191" dy="0.3em" text-anchor="middle" font-size="24">3</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="124" cy="240" cx="374" fill="#ffaaaa"/>
<text x="374" y="240" dy="0.3em" text-anchor="middle" font-size="24">2</text>
</g>
<g>
<ellipse stroke="#000000" ry="39" rx="124" cy="240" cx="197" fill="#56aaff"/>
<text x="197" y="240" dy="0.3em" text-anchor="middle" font-size="24">1</text>
</g>
</svg></p>
<p><a class="animate">Animate</a></p>
<p>This week's Riddler was very interesting! I'll start with my big ah-ha: any shape touching N shapes will "bury" at least N-2 shape(s). Take a simple example where we have 3 touching circles, 3 different colors. We can't draw …</p>Donald RauscherSun, 13 Nov 2016 00:00:00 -0600tag:www.donaldrauscher.com,2016-11-13:/map-game-riddler.html538fivethirtyeightriddlerUsing the US Census API(s)http://www.donaldrauscher.com/census-api.html<p>The other day I was building a model and wanted to layer in some ZIP-level census data. Some quick Googling led me to <a href="http://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml">this .gov site</a> where you can search a database of stock reports and data cuts. However, I wasn't able to find what I was looking for. Not …</p>Donald RauscherThu, 10 Nov 2016 00:00:00 -0600tag:www.donaldrauscher.com,2016-11-10:/census-api.htmlapicensusdata-scraping538 Riddler: Chance of Being THE Deciding Votehttp://www.donaldrauscher.com/election-riddler.html<p><a href="http://fivethirtyeight.com/features/a-puzzle-will-you-yes-you-decide-the-election/">This week's Riddler</a> tasked us with calculating the probability of being the deciding vote in a toss-up election. For simplicity, I'm going to assume that there are an even number of other voters (an odd number of total voters). We can model the number of votes for "our" candidate as …</p>Donald RauscherSun, 06 Nov 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-11-06:/election-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Betting on Cubs to Win!http://www.donaldrauscher.com/cubs-riddler.html<p>I approached <a href="http://fivethirtyeight.com/features/cubs-world-series-puzzles-for-fun-and-profit/">this week's Riddler</a> by building a decision tree from the bottom up. Let's start with the easiest case: betting on game 7. At this point, we must be net-even in betting, and we must wager $100 on the Cubs to win. 
Knowing this, we can extrapolate how we …</p>Donald RauscherSun, 02 Oct 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-10-02:/cubs-riddler.html538fivethirtyeightriddlerlogicgambling538 Riddler: How Big A Table Can The Carpenter Build?http://www.donaldrauscher.com/table-riddler.html<p>In <a href="http://fivethirtyeight.com/features/how-big-a-table-can-the-carpenter-build/">this week's Riddler</a>, the largest circular table that we can carve out of our 4x8 piece of wood with two congruent semi-circles has a radius of ~2.70 feet. We can fit the largest semi-circles in the wood by orienting them diagonally:</p>
<p><img src='/images/table-riddler.jpg' style="display:block; margin-left:auto; margin-right:auto;"></p>
<p>From the above graph, we can use …</p>Donald RauscherSun, 25 Sep 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-09-25:/table-riddler.html538fivethirtyeightriddlergeometry538 Riddler: Who Gets The $100 Bill?http://www.donaldrauscher.com/dollar-riddler.html<p>I modelled this week's <a href="http://fivethirtyeight.com/features/who-keeps-the-money-you-found-on-the-floor/">Riddler</a> as an <a href="https://en.wikipedia.org/wiki/Absorbing_Markov_chain">absorbing Markov chain</a>. This MC has 5 transient states (representing the dollar bill sitting in front of someone) and 5 absorbing states (which represent someone winning). The transition probability matrix is the following:
<div class="equation" data-expr="
\begin{matrix}
& 0 & \frac{1}{3} & 0 & 0 & \frac{1}{3} & \frac{1}{3} & 0 & 0 & 0 & 0 \\
& \frac{1}{3} & 0 & \frac{1}{3} & 0 & 0 & 0 & \frac{1}{3} & 0 & 0 & 0 \\
& 0 & \frac{1}{3} & 0 & \frac{1}{3} & 0 & 0 & 0 & \frac{1}{3} & 0 & 0 \\
& 0 & 0 & \frac{1}{3} & 0 & \frac{1}{3} & 0 & 0 & 0 & \frac{1}{3} & 0 \\
& \frac{1}{3} & 0 & 0 & \frac{1}{3} & 0 & 0 & 0 & 0 & 0 & \frac{1}{3} \\
& 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
& 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
& 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
& 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
& 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{matrix}
"></div></p>
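</div></p>">
<p>As a minimal sketch (not the code from the original post), the absorption probabilities can be read off the matrix above by splitting it into the transient block Q (top-left 5x5) and the transient-to-absorbing block R (top-right 5x5), then solving (I - Q)B = R. The helper name <code>solve</code> and the choice of exact <code>Fraction</code> arithmetic are assumptions made for illustration.</p>

```python
from fractions import Fraction


def solve(a, b):
    """Solve a @ x = b exactly by Gauss-Jordan elimination (a is square)."""
    n = len(a)
    # Augmented matrix [a | b]; copies keep the inputs untouched.
    aug = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


n = 5
third = Fraction(1, 3)
zero = Fraction(0)

# Q: the bill moves to either neighbour around the table with probability 1/3.
Q = [[third if (i - j) % n in (1, n - 1) else zero for j in range(n)]
     for i in range(n)]
# R: the person holding the bill wins outright with probability 1/3.
R = [[third if i == j else zero for j in range(n)] for i in range(n)]

I_minus_Q = [[(1 if i == j else 0) - Q[i][j] for j in range(n)]
             for i in range(n)]

# B[i][j] = probability that person j wins when the bill starts at person i.
B = solve(I_minus_Q, R)
print([str(p) for p in B[0]])
```

<p>Under these assumptions, row 0 of <code>B</code> gives each person's chance of winning when the bill starts in front of person 0, and every row sums to 1.</p>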
<p>From this transition matrix we can calculate the absorbing probabilities …</p>Donald RauscherSun, 11 Sep 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-09-11:/dollar-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Escaping the Angry Ramhttp://www.donaldrauscher.com/angry-ram-riddler.html<p>Link to <a href="http://fivethirtyeight.com/features/can-you-outrun-the-angry-ram-coming-right-for-oh-god/">this week's Riddler</a>. I began by forming expressions for two known facts of the problem. Firstly, we know that the derivative of the ram's path will be the slope of the line passing through the ram's current position (x,y) and the location of the fleeing person …</p>Donald RauscherWed, 24 Aug 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-08-24:/angry-ram-riddler.html538fivethirtyeightriddlerdiffeq538 Riddler: Hungry Bearshttp://www.donaldrauscher.com/hungry-bears.html<p>My first intuition after reading <a href="http://fivethirtyeight.com/features/should-the-grizzly-bear-eat-the-salmon/">this problem</a> was "why would the bear ever reject the first fish?" If the first fish is big, then it makes sense to eat it; there are no guarantees the next fish will be as big. If the first fish is small, then eat it …</p>Donald RauscherFri, 05 Aug 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-08-05:/hungry-bears.html538fivethirtyeightriddlerprobability538 Riddler: Traitorous Generalshttp://www.donaldrauscher.com/traitorous-generals.html<p>We begin by selecting one general at random. Our goal will be to determine if this specific general is loyal or traitorous, which we can figure out by polling the other generals. As we go around the circle, there are two stop conditions:</p>
<ol>
<li>The selected general receives <span class="inline-equation" data-expr="\left( \lceil N/2 \rceil \right)-1"></span> loyal votes</li>
<li>The …</li></ol>Donald RauscherSat, 30 Jul 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-07-30:/traitorous-generals.html538fivethirtyeightriddlerlogic538 Riddler: Defending Against an Alien Invasionhttp://www.donaldrauscher.com/alien-invasion.html<p>This <a href="http://fivethirtyeight.com/features/solve-the-puzzle-stop-the-alien-invasion/">week's Riddler</a> was the first one that I got wrong! I never really felt confident in my answer, thus no post. In the end, I did not model random points on the surface of the sphere correctly. A <a href="http://mathworld.wolfram.com/SpherePointPicking.html">good link</a> supplied by the 538 folks demonstrates how to do …</p>Donald RauscherFri, 29 Jul 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-07-29:/alien-invasion.html538fivethirtyeightriddlergeometry538 Riddler: A Variation of the Drunkard's Walk ... In a Barhttp://www.donaldrauscher.com/bar-riddler.html<p>This week’s <a href="http://fivethirtyeight.com/features/how-long-will-you-be-stuck-playing-this-bar-game/">Riddler</a> is a variation on the well-known OR problem, <a href="https://en.wikipedia.org/wiki/Random_walk">the drunkard’s walk</a>. We can model this problem as an absorbing Markov Chain with X+Y+1 states. The transition probability matrix is the following:
<div class="equation" data-expr="
\begin{matrix}
& 1 & 0 & 0 & 0 & 0 & \cdots & 0 \\
& 0.5 & 0 & 0.5 & 0 & 0 & \cdots & 0 \\
& 0 & 0.5 & 0 & 0.5 & 0 & \cdots & 0 \\
& 0 & 0 & 0.5 & 0 & 0.5 & \cdots & 0 \\
& \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
& 0 & 0 & 0 & 0 & 0 & \cdots & 1
\end{matrix}
"></div></p>
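</div></p>">
<p>A hedged sketch of the expected-duration calculation: rather than forming the fundamental matrix N explicitly, the code below solves the equivalent system (I - Q)t = 1 with the Thomas algorithm, which suits the tridiagonal structure of the matrix above. The function name <code>expected_steps</code> and the assumption that the walk starts in state X (with states 0 and X+Y absorbing) are illustrative choices, not taken from the original post.</p>

```python
from fractions import Fraction


def expected_steps(x, y):
    """Expected number of steps before absorption, starting from state x.

    States are 0..x+y; states 0 and x+y absorb, and every interior state
    moves one step left or right with probability 1/2 each.
    """
    if x < 1 or y < 1:
        raise ValueError("x and y must both be at least 1")
    m = x + y - 1                 # transient states are 1..x+y-1
    half = Fraction(1, 2)

    # (I - Q) has 1 on the diagonal and -1/2 on both off-diagonals;
    # the right-hand side is a vector of ones.
    c_prime = [Fraction(0)] * m
    d_prime = [Fraction(0)] * m
    c_prime[0] = -half
    d_prime[0] = Fraction(1)

    # Forward sweep of the Thomas algorithm.
    for i in range(1, m):
        denom = 1 - (-half) * c_prime[i - 1]
        c_prime[i] = -half / denom
        d_prime[i] = (1 - (-half) * d_prime[i - 1]) / denom

    # Back substitution.
    t = [Fraction(0)] * m
    t[m - 1] = d_prime[m - 1]
    for i in range(m - 2, -1, -1):
        t[i] = d_prime[i] - c_prime[i] * t[i + 1]

    return t[x - 1]               # transient state x sits at index x - 1


print(expected_steps(3, 5))
```

<p>For this symmetric walk the result agrees with the classical closed form X·Y, which makes a convenient sanity check on the linear solve.</p>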
<p>Once we compute the fundamental matrix (N), calculating the expected number of …</p>Donald RauscherThu, 14 Jul 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-07-14:/bar-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Defending Riddler Headquartershttp://www.donaldrauscher.com/laser-riddler.html<p>The challenging part of this problem was creating an exhaustive state-space of bisectors. Odd-numbered polygons are a real pain. I started with a known bisector: a line that goes through one point and intersects the mid-point of the opposite side. If we shift the line slightly on one side, how …</p>Donald RauscherThu, 14 Jul 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-07-14:/laser-riddler.html538fivethirtyeightriddler538 Riddler: Defeating Roger Federerhttp://www.donaldrauscher.com/tennis-riddler.html<p>I began <a href="http://fivethirtyeight.com/features/can-you-figure-out-how-to-beat-roger-federer-at-wimbledon/">this challenge</a> at the obvious starting point: I win my first 71 points against Roger. This puts me up 2 sets to 0, up 5-Nill in the 3rd, and winning 40-Love in what is possibly the final game of the match. Things look pretty good; I've got a …</p>Donald RauscherThu, 07 Jul 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-07-07:/tennis-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Puzzle of the Robot Pizza Cutterhttp://www.donaldrauscher.com/pizza-riddler.html<p>The robot's first cut will create 2 pizza slices. The robot's second cut will create either 3 or 4 pizza slices. The robot's third cut will create 4, 5, or 6 slices if starting with 3 slices or 5, 6, or 7 slices if starting with 4 slices. I began …</p>Donald RauscherSun, 26 Jun 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-06-26:/pizza-riddler.html538fivethirtyeightriddlergeometry538 Riddler: Puzzle of the Picky Eaterhttp://www.donaldrauscher.com/sandwich-riddler.html<p>I found it easiest to think about this problem in terms of polar coordinates. 
The furthest point that we can eat along trajectory <span class="inline-equation" data-expr="\theta"></span> is the following:
<div class="equation" data-expr="r \left( \theta \right) = \frac{1}{2 \left( 1 + \cos( \theta ) \right)} \quad \forall \theta \in \left[ 0, \frac{\pi}{4} \right]"></div></p>
<p>Plotting this, it forms this weird, rounded rectangle shape:
<br>
<img src="/images/sandwich-riddler.jpg" width="400px" style = "display: block; margin-left: auto; margin-right: auto;"></p>
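<p>As a rough numerical sketch of the area under the polar curve r(θ) above, the code below applies the polar area formula (the integral of r²/2 over θ) with Simpson's rule. It assumes a unit-square sandwich, so that the wedge θ ∈ [0, π/4] is one of eight congruent pieces; that symmetry factor, the helper <code>simpson</code>, and the step count are illustrative assumptions rather than details from the original post.</p>

```python
import math


def r(theta):
    """Furthest edible point along direction theta (from the equation above)."""
    return 1.0 / (2.0 * (1.0 + math.cos(theta)))


def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of intervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Polar area element is (1/2) r^2 d(theta); integrate over one wedge.
wedge = simpson(lambda t: 0.5 * r(t) ** 2, 0.0, math.pi / 4)

# Assumed eightfold symmetry of the unit square about its centre.
area = 8.0 * wedge
print(round(area, 4))
```

<p>Under these assumptions the numerical value matches the closed form (4√2 - 5)/3, roughly 0.219 of the sandwich.</p>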
<p>I integrated the above equation to get the area. Unlike a regular integral where each …</p>Donald RauscherWed, 22 Jun 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-06-22:/sandwich-riddler.html538fivethirtyeightriddlergeometry538 Riddler: Puzzle of Baseball Divisional Champshttp://www.donaldrauscher.com/baseball-riddler.html<p>For <a href="http://fivethirtyeight.com/features/can-you-solve-the-puzzle-of-the-baseball-division-champs/">this week's Riddler</a>, I estimated that the division leader would have 88.8 wins after 162 games. I assumed that each team plays the other teams in its division 19 times for a total of 76 intradivision games and 86 interdivision games, consistent with the <a href="https://en.wikipedia.org/wiki/Major_League_Baseball_schedule">actual scheduling rules</a>.</p>
<p>Interdivision …</p>Donald RauscherThu, 26 May 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-05-26:/baseball-riddler.html538fivethirtyeightriddlerbaseball538 Riddler: Puzzle of the Monsters' Gemshttp://www.donaldrauscher.com/monster-riddler.html<p>There are three ways that this game can end: slaying a rare monster (the most likely), slaying an uncommon monster, or slaying a common monster (the least likely). I began by thinking about the probability of each of these events happening. The probability of the game ending by slaying a …</p>Donald RauscherThu, 26 May 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-05-26:/monster-riddler.html538fivethirtyeightriddlerprobability538 Riddler: Puzzle of the Overflowing Martinihttp://www.donaldrauscher.com/martini-riddler.html<p>Here's my solution to <a href="http://fivethirtyeight.com/features/can-you-solve-the-puzzle-of-the-overflowing-martini-glass/">this week's 538 Riddler</a>:
<a href="/images/martini-riddler.jpg"><img src="/images/martini-riddler.jpg" width="885x"></a></p>
<p>If the liquid reaches <span class="inline-equation" data-expr="p"></span> fraction of the way up the glass when upright, then the liquid goes <span class="inline-equation" data-expr="p^2"></span> fraction of the way up the glass on the opposite side just before it begins to pour. </p>
<p><a href="https://en.wikipedia.org/wiki/Dandelin_spheres">Dandelin spheres</a> were key to my proof. They …</p>Donald RauscherThu, 19 May 2016 00:00:00 -0500tag:www.donaldrauscher.com,2016-05-19:/martini-riddler.html538fivethirtyeightriddlergeometry