Matheus Rabetti

Experimentation & Data Science @ Uber

About Me

“Generating numbers is easy, generating numbers you should trust is hard!”

I am a Data Scientist with 10 years of experience, specializing in complex experimentation, causal inference, and probabilistic time series forecasting. My objective is to conduct analyses and build models that prioritize interpretability and clarity over pure performance. By focusing on transparency, I ensure that the insights derived are easily understood, allowing stakeholders to grasp how different actions impact outcomes. This approach provides actionable insights, fosters trust in the results, and drives more strategic and informed business decisions.

I have held a variety of impactful roles across sectors: Research Assistant at a Brazilian National Research Institute, Statistician at the Ministry of Labor, Product Analyst at Globo (the largest media company in Latin America), Marketing Scientist at Uber, Marketing Data Scientist at Glovo, Consultant at Toptal, Lead Product Analyst & Experimentation Ambassador at Vista, and Senior Data Scientist for different areas of my current company, Preply.

One of my key strengths is my versatility. I have proven myself in multiple areas of business, from marketing to supply, product, pricing, and sales, which has given me a broad understanding of diverse business functions. This cross-functional expertise enables me to see the bigger picture and drive results across various domains.

Throughout my career, I have delivered measurable business value through optimal decision-making and effective stakeholder management. My approach balances quick, heuristic insights with rigorous analysis to drive MVP experimentation and informed decision-making. I am driven by intellectual curiosity, a passion for solving complex problems, and a deep commitment to continuous learning and growth.

Experience

@Preply

  • App Core Product Team
    • Implement and QA event taxonomies for self-service analysis on customer and user behaviour.
    • Develop and maintain KPIs and dashboards to monitor product metrics and identify insights.
  • Ranking & Pricing
    • Built price elasticity curves for different marketplace segments using Conditional Average Treatment Effect (CATE) modeling, which identified a group of segments, together 28% of the user base, that would generate incremental revenue if the price of their tutors increased by a certain amount.
  • Supply & Demand
    • Tutor demand forecast
    • Causal Inference Model - Training impact on Tutor Performance, which led us to shift the strategy from proposing a certain course to tutors right after profile creation approval to proposing it 14 days later, resulting in 3% incremental revenue.
    • Tutor Community Message Classification (NLP) to track the sentiment of the messages and flag the critical ones to be addressed by the internal team.
    • Retention Experimentation Framework using Survival Analysis
  • B2B Sales & Marketing
    • Lead Scoring Model
    • Hours Utilization Impact on Client Retention, an important input for defining the new OKRs for sales agents.
    • Improved the B2B user experience through tutor selection: identified what makes a tutor a good fit for specific B2B users' needs, in collaboration with the user research team.

@Vista

  • Technical leader of the analytics team for Next Best Action - working on personalised email experience
  • Building the experimentation culture, creating standards, and supporting experiment creation and analysis.
  • Leading a Data Product team of 3 analysts, coaching them and managing their careers.
  • Implemented Bayesian Experimentation Framework

Experimentation Culture Awards 2022 - Building a culture of experimentation in public

@Toptal

  • Strategic advisor for Toptal business functions and leaders
  • Leading Talent Operations analytical function.
  • Holding weekly analysis prioritization meetings
  • Holding monthly business performance review meetings
  • OKRs definition
  • Improved retention of newcomer talents in the network by 5% and increased the talents allocated to a job within 10 days by 2%.
    • Discovery - discovered a talent retention issue; the metric wasn’t being tracked.
    • Problem Statement - explored the data and found that many talents leave the network after finishing a job.
    • Action Item - implement a new job matching process for jobs with a scheduled end.
  • Probability to convert based on Technical Steps Grade - regression model to predict, when a talent scores high in the first steps of the funnel, whether we can skip further steps in the screening process.

@Glovo

  • Experimentation/Causality POC
  • Identifying heterogeneous treatment effects of marketing interventions (Retargeting). Segment Discovery leads to more efficient retargeting audiences.
  • Multi-Touch Attribution (building the whole pipeline for U-shaped and Markov chain models)
  • UMAP embedding clustering to identify users that churned due to bad user experience.
  • Marketing First Order Prediction - predicting total conversions for a specific time window.
  • Incrementality Tests for Retargeting and Acquisition
  • Built more reliable, precise, and faster A/B tests with variance reduction (CUPED), bias correction (CUPED / Difference-in-Differences), identification of novelty or primacy effects, randomization unit correction (Linear Mixed Model), and Bayesian Inference.
  • Moving the company's decision-making culture from correlation studies to causal analysis (instrumental variables, propensity score matching, mediation modeling, regression discontinuity, causal diagrams, and other causal inference methods)

@Uber

  • Methods for geo experimentation: Bayesian Structural Time Series (CausalImpact), Synthetic Control, Matching Similar Cities and Regions
  • Market Saturation, Total Addressable Market and Opportunity Size for Engagement, Churn and Acquisition segments
  • Marketing and Incentives Impact in Acquisition Metrics - short and long term impact inference.
  • LatAm POC for Marketing Incrementality Tests & Measurement Methodology
  • Measuring Offline and Brand Campaigns Business Impact
  • Marketing Business Results Report sent monthly to key stakeholders
  • Interviewing marketing data analysts

@Globo.com - largest media company in Latin America

  • Daily business insights giving guidance for product changes (working alongside a software engineer and a designer)
  • Implemented a churn prediction model to anticipate cancellations and take action to retain customers.
  • User Path Navigation with Markov Chain
  • Building the metrics and the statistical environment for the A/B testing platform.
  • Formulating success metrics for quality of experience on the video player, estimating the marginal effects of these metrics on engagement, and using the estimates to prioritize which metric to act on.
  • Computer Vision for predicting when the video credits start

@Ministry of Labor

  • Time series prediction of admissions and dismissals for the main Brazilian cities
  • Monthly report of labor market monitoring metrics of affiliated municipalities
  • Public Dashboard informing policy makers and the general population about the current labor market situation

@IPEA - Institute of Economic and Applied Research

  • An econometric analysis of production diversity in family agriculture.
  • Panel of social vulnerability in partnership with the UN.
  • Investigated the population and employment dynamics in central urban areas of twelve selected capitals using spatial analysis methods. Kernel heat maps were used to conceptualize and delimit central areas for each city.

Skills

  • Modeling Frameworks: econml, pymc, bsts, Scikit-learn, Caret, XGBoost, LightGBM
  • Programming Languages: R, Python, SQL
  • Compute Instance: Google Compute Engine, Databricks
  • Data Warehouse (OLAP): Hive, Vertica, Presto, AWS Redshift, Google BigQuery, Snowflake, dbt
  • Product Event Tracking: Google Analytics, Amplitude
  • Cloud Storage: AWS S3, Google Cloud Storage
  • Distributed Machine Learning: Spark
  • Job Scheduler: Luigi
  • Visualisation: Tableau, Matplotlib, ggplot, Looker, Google Spreadsheet
  • Version Control: Git

Courses

  • MITx: 15.071x The Analytics Edge: MIT’s The Analytics Edge is an edX course focused on using statistical tools to gain insight about data and make predictions. It has around 75 datasets and progresses from linear regression to clustering, with classification techniques such as CART and Random Forest in between.

  • Data Science Specialization: This Specialization covers the concepts and tools you’ll need throughout the entire data science pipeline, from asking the right kinds of questions to making inferences and publishing results.

  • Master Statistics with R: This Specialization showed how to analyze and visualize data in R, create reproducible data analysis reports, demonstrate a conceptual understanding of the unified nature of statistical inference, and perform frequentist and Bayesian statistical inference and modeling to understand natural phenomena and make data-based decisions.

  • Digital Marketing: This program offers you the opportunity to master platform-specific skills valued by top employers, while at the same time establishing a broad-based understanding of the whole digital marketing ecosystem. Run live campaigns on major marketing platforms, learn and apply new marketing techniques, analyze results, and produce actionable insights.

  • Causal Inference: This series of 7 courses, created by Duke University with support from eBay, focuses on how to use the advanced methods of instrumental variables and regression discontinuity to find causal effects. It guides you through the concepts that data scientists always need to consider when examining and making inferences about data.

Publications

  1. Rabetti, M.S.; Nadalin, V.G.; Oliveira, C.A.P.; Furtado, B.A.; Cavalcanti, C.B. (2016) Population and Employment Dynamics in the Urban Centers of the Brazilian Metropolis.

  2. Rabetti, M.S.; Sambuichi, R.H.R.; Galindo, E.P.; Pereira, R.M.; Constantino, M. (2016) Production Diversity in Family Agriculture Establishments in Brazil: an econometric analysis based on the registration of the Declaration of Aptitude to Pronaf (DAP).

  3. Rabetti, M.S. and Carvalho, C.H.R. (2015) Traffic accidents on Brazilian federal highways.

Links

  • I worked alongside a BI team on the development, calculation, and construction of an interactive platform to analyse the Brazilian labour market, Painel de Monitoramento do Mercado de Trabalho.

  • I carried out all the analysis procedures for the IVS project, a platform developed in partnership between the United Nations Development Programme and the Institute for Applied Economic Research, IVS.

  • Some results of the third publication listed above - Mapping the Economic Centers of Brazil, RPubs Portfolio

About This Site

This site is powered by Jekyll using the Minimal Mistakes theme. All blog posts are released under a Creative Commons Attribution-ShareAlike 4.0 International License.

All R blog posts are compiled with knitr R markdown. You can find the reproducible sources of each blog post here.