Home Data Artificial Intelligence Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in...

Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in Python

Vincy Davis

December 11, 2019 - 4:00 pm

4151

4 min read

Bayes’ theorem, named after 18th-century British mathematician Thomas Bayes, is a mathematical formula for determining conditional probability. This theorem is used to revise or update existing predictions or theories using new or additional evidence. Bayes theorem is also used in the field of data science as it provides a rule for moving from a prior probability to a posterior probability.

In Bayesian statistics, a prior probability is the probability of an event before a new data is collected and a posterior probability is a conditional probability that is allotted after the relevant evidence is acquired. Hence, the Bayes algorithm is one of the most popular machine learning techniques in the field of data science.

In this post, we are going to discuss a specific Bayesian implementation called probabilistic programming (PP) in Python, considering that modern Bayesian statistics is mainly done by writing code. The probabilistic programming enables flexible specification of complex Bayesian statistical models, thus giving users the ability to focus more on model design, evaluation, and interpretation, and less on mathematical or computational details.

How to use PyMC3 in probabilistic programming?

In the paper, the researchers have utilized a simple Bayesian linear regression model with normal priors for the parameters. The unknown variables in the model are also assigned a prior distribution. The artificial data in the model are then simulated using NumPy’s random module, followed by the PyMC3 model to retrieve the corresponding parameters. The straightforward PyMC3 model structure is used to generate the unknown data as it is close to the statistical notation.

Firstly, the necessary components are imported from PyMC to build the required model. It is represented in the full format initially and then explained partly. The paper states, “Following instantiation of the model, the subsequent specification of the model components is performed inside a with statement:

with basic_model:

This creates a context manager, with our basic model as the context, that includes all statements until the indented block ends.”

This means that all the PyMC3 objects introduced in the indented code block below the with statements are added to the model behind the scenes. In the absence of this context manager idiom, users would be forced to manually associate each of the variables with the basic model immediately after we create them. Also, if a user tries to create a new random variable without a with model: statement, it will cause an error due to the absence of an obvious model for the variable to be added to.

Next, to obtain posterior estimates for the unknown variables in the model, the posterior estimates are calculated analytically. The researchers have explained two approaches to obtain posterior estimates, users can choose either of them depending on the structure of the model and the goals of the analysis. The first approach is called finding the maximum a posteriori (MAP) point using optimization methods and the second approach is computing summaries based on samples drawn from the posterior distribution using Markov Chain Monte Carlo (MCMC) sampling methods.

For producing a posterior analysis of the required model, PyMC3 provides plotting and summarization functions for inspecting the sampling output. A simple posterior plot can be created using traceplot. In the traceplot, the left column consists of the smoothed histogram while the right column contains the samples of the Markov chain plotted in sequential order. In addition, the summary function of PyMC3 also provides a text-based output of common posterior statistics.

You can also learn more about the practical implementation of PyMC3 and its loss functions in the book ‘Bayesian Analysis with Python’ by Packt Publishing.

Top 6 Cybersecurity Books from Packt to Accelerate Your Career

Your Quick Introduction to Extended Events in Analysis Services from Blog…

Logging the history of my past SQL Saturday presentations from Blog…

Storage savings with Table Compression from Blog Posts – SQLServerCentral

Daily Coping 31 Dec 2020 from Blog Posts – SQLServerCentral

Learning Essential Linux Commands for Navigating the Shell Effectively

Exploring the Strategy Behavioral Design Pattern in Node.js

How to integrate a Medium editor in Angular 8

Implementing memory management with Golang’s garbage collector

How to create sales analysis app in Qlik Sense using DAR…

Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in Python

Further Reading

How to use PyMC3 in probabilistic programming?

Read Next

Must Read in Cloud & Networking

Top life hacks for prepping for your IT certification exam

Learning Essential Linux Commands for Navigating the Shell Effectively

ServiceNow Partners with IBM on AIOps from DevOps.com

Must Read in Data

Learn Transformers for Natural Language Processing with Denis Rothman

Scientific Analysis of Donald Trump’s Tweets on COVID-19 with Transformers

Distributed training in TensorFlow 2.x

Interviews

Learn Transformers for Natural Language Processing with Denis Rothman

Clean Coding in Python with Mariano Anaya

Bringing AI to the B2B world: Catching up with Sidetrade CTO Mark Sheldon [Interview]

On Adobe InDesign 2020, graphic designing industry direction and more: Iman Ahmed, an Adobe Certified Partner and Instructor [Interview]

Is DevOps experiencing an identity crisis? [Interview]

MobilePro

datapro

Programming

Subscribe to our newsletter