Math

Engineering

$$\mathrm{Re} = \frac{ \mbox{inertial} }{ \mbox{viscous} } = \frac{\rho U L}{\mu} = \frac{ U L}{\nu}$$
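As a quick numerical illustration of the Reynolds number above, here is a minimal sketch in Python. The function names and the fluid properties (rough values for water near room temperature) are illustrative assumptions, not taken from the manuscript. It shows that the two forms of the definition agree when $\nu = \mu / \rho$.

```python
def reynolds_number(density, velocity, length, dynamic_viscosity):
    """Re = rho * U * L / mu, with all inputs in SI units."""
    return density * velocity * length / dynamic_viscosity


def reynolds_number_kinematic(velocity, length, kinematic_viscosity):
    """Re = U * L / nu, with all inputs in SI units."""
    return velocity * length / kinematic_viscosity


# Assumed, approximate values for water in a 5 cm pipe (illustration only).
rho = 1000.0  # density, kg/m^3
mu = 1.0e-3   # dynamic viscosity, Pa*s
U = 1.0       # characteristic velocity, m/s
L = 0.05      # characteristic length, m
nu = mu / rho  # kinematic viscosity, m^2/s

re_dynamic = reynolds_number(rho, U, L, mu)
re_kinematic = reynolds_number_kinematic(U, L, nu)

# Both forms give a value of about 5e4 for these inputs.
print(f"Re (rho*U*L/mu) = {re_dynamic:.3g}")
print(f"Re (U*L/nu)     = {re_kinematic:.3g}")
```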

Coming out of our MIT Chem Eng PhD program, I always thought the curriculum could be made much simpler with a unifying glue: a mathematical derivation tree, built from first principles, covering every problem that could be asked in a PhD qualifying exam.  The key would be to define a consistent set of quantities, with variable names and representations that never change, that can be cross-indexed locally within a section and globally across the entire book, each labeled by its SI units.  To this end, Hal Alper and I wrote an outline.

Since then, we’ve added Erik Allen to the team, partnered with a LaTeX-in-the-cloud company to develop the sub-indexable quantity technology, and translated the early work into a manuscript outline.

It’s not meant to replace the traditional chemical engineering textbook pillars (physics / stat mech, thermo, transport, and mathematical methods) but instead to be the glue between them all.  For every undergrad and grad textbook, we can cross-reference, section by section, where we cover the material.

Please click the accompanying images for more information.

 

Baseball

$$P(X=k \ | \ p, n)  = {n\choose k}p^k(1-p)^{n-k}$$
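The binomial formula above can be evaluated directly. The following is a minimal sketch in Python; the function name `binomial_pmf` and the example numbers are illustrative assumptions, and it says nothing about how the formula was actually applied in the work described here. It treats each of the $n$ trials as independent with the same success probability $p$.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k | p, n) = C(n, k) * p^k * (1 - p)^(n - k)."""
    if not 0 <= k <= n:
        return 0.0  # outside the support of the distribution
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: chance of exactly 5 successes in 10 independent
# trials when each trial succeeds with probability 0.5.
prob_five_of_ten = binomial_pmf(5, 10, 0.5)

# Sanity check: the probabilities over k = 0..n should sum to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(prob_five_of_ten)
print(total)
```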

During the 2005 and 2006 Major League Baseball seasons, a group of primarily MIT Chemical Engineering PhD students, including Erik Allen, Ben Wang, Ryan Bennett, Tyler Martin, Chris Peiffer, and Joel Moxley, joined forces to turn $500 into more than $200k in the first 4 months and complete an overall ~1000X return.  The BBHedge (MIT Baseball Hedge Fund) operations were shut down following the U.S. Congress’ Unlawful Internet Gambling Enforcement Act (UIGEA), which took effect October 13, 2006.  The same core methodology was later featured in the Academy Award-nominated film “Moneyball”.

More recently, in late 2014, Rho AI (whose team includes original BBHedge founder Erik Allen) began building a comprehensive data set for Major League Baseball to enable the application of deep learning techniques. Specifically, the data set is structured for straightforward access to the full data state at any given point in time in the past. At present, the data set contains 161,501,228 rows encompassing 1,478,170,204 data points. Pit Rho has considered several uses of this data set, including i) optimum pitching staff usage, ii) predicting the type & location of the next pitch thrown, iii) fielder positioning for any batter / pitcher combination, iv) optimum location of the first baseman for holding runners, and v) optimum leadoff distance.
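To make the idea of "the full data state at any given point in time" concrete, here is a minimal sketch in Python of an as-of lookup. The actual schema and storage behind the data set are not described here, so the class `PointInTimeStore`, its methods, the key name, and the sample values are all hypothetical. The sketch keeps a time-ordered history per key and uses binary search to return the latest value recorded at or before the requested time.

```python
from bisect import bisect_right
from collections import defaultdict


class PointInTimeStore:
    """Hypothetical store: per-key history with as-of (point-in-time) reads."""

    def __init__(self):
        self._times = defaultdict(list)   # key -> sorted list of timestamps
        self._values = defaultdict(list)  # key -> values aligned with _times

    def record(self, key, timestamp, value):
        # Assumption: records for a key are appended in time order.
        times = self._times[key]
        if times and timestamp < times[-1]:
            raise ValueError("records must be appended in time order")
        times.append(timestamp)
        self._values[key].append(value)

    def as_of(self, key, timestamp):
        """Return the latest value recorded at or before `timestamp`."""
        times = self._times.get(key, [])
        i = bisect_right(times, timestamp)
        return self._values[key][i - 1] if i else None


# Made-up sample data; ISO date strings sort chronologically.
store = PointInTimeStore()
store.record("pitcher_a_pitch_count", "2014-04-01", 95)
store.record("pitcher_a_pitch_count", "2014-04-06", 102)

before_debut = store.as_of("pitcher_a_pitch_count", "2014-03-31")
between_starts = store.as_of("pitcher_a_pitch_count", "2014-04-03")
after_second = store.as_of("pitcher_a_pitch_count", "2014-04-06")
```

A read like `as_of` never sees values recorded after the requested time, which is the property needed to train or backtest a model on past states without leaking future information.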


Racing

Strategies employed during a race are based on a complex set of interacting factors, requiring the crew chief and race engineers to account for hundreds of potential data points at all times. One of the complexities in predicting the behavior of competitors, and which strategies they will employ, is assimilating all of the available data and creating a model that accurately reflects how a human being will interpret the information. This also requires that the unique tendencies of particular crew chiefs, drivers, and teams are taken into account; the same data presented to different people often results in different outcomes.

Rho AI approached this challenge by developing algorithms that account not only for directly measurable factors but also for those aspects of decision making that are only indirectly quantifiable. The current model is able to run in under 1 second, process nearly 1,000 data points, learn competitor and track tendencies automatically, and predict the strategy of every car in a field with over 85% accuracy.


$100B markets & data-enabled business models