A simple, minimal website for RobustiPy!
A practical, end‑to‑end guide for fitting robust specifications, managing compute, and interpreting outputs.
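- Prepare a pandas.DataFrame with your outcome(s), main predictor(s), and candidate controls.
- Fit the model with draws and/or kfold.
- Call results.summary() for key robustness statistics.
- Plot the results with results.plot().

Install the latest release with pip:
pip install robustipy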
For the latest development version:
git clone https://github.com/RobustiPy/robustipy.git
cd robustipy
pip install .
RobustiPy expects a tidy pandas.DataFrame where each column corresponds to a variable.
Recommendations:
- Make sure the y, x, and controls columns are numeric or properly encoded.

Both model classes, OLSRobust and LRobust, accept single outcomes or lists of outcomes (run separately) and produce a unified results object.
import pandas as pd
from robustipy.models import OLSRobust
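# Load your data (replace the path with your own file)
data = pd.read_csv("your_data.csv")
y = "your_outcome"
x = "your_key_predictor"
controls = ["control_1", "control_2", "control_3", "control_4"]
model = OLSRobust(y=y, x=x, data=data)
model.fit(
    controls=controls,
    draws=1000,              # bootstrap resamples per specification
    kfold=10,                # cross-validation folds
    oos_metric="pseudo-r2",  # out-of-sample metric, required with kfold
    seed=192735,             # for reproducibility
    n_cpu=4,                 # parallel processes
)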
results = model.get_results()
results.summary(digits=3)
results.plot(ic="aic", project_name="union_example", figpath="./figures")
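The draws and seed arguments in the call above concern bootstrap resampling and reproducibility. As a rough, library-independent illustration of the idea, the sketch below resamples a small made-up list with replacement and shows that a fixed seed reproduces the same draws. The helper name bootstrap_means and the data are hypothetical, and this is not a description of how RobustiPy resamples internally.

```python
import random
from statistics import mean

def bootstrap_means(values, draws, seed):
    """Resample `values` with replacement `draws` times and return each resample's mean."""
    rng = random.Random(seed)  # a dedicated generator keeps the run reproducible
    n = len(values)
    return [mean(rng.choices(values, k=n)) for _ in range(draws)]

# Made-up sample, for illustration only
sample = [2.1, 3.4, 1.9, 4.0, 2.8, 3.1]

run_a = bootstrap_means(sample, draws=1000, seed=192735)
run_b = bootstrap_means(sample, draws=1000, seed=192735)

# Same seed, same resamples
print(run_a == run_b)

# A simple 95% percentile interval from the bootstrap distribution
ordered = sorted(run_a)
lower, upper = ordered[24], ordered[974]
print(lower <= mean(sample) <= upper)
```

Changing the seed gives a different set of resamples, which is why fixing it matters when results need to be repeatable.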
# Logistic counterpart for a binary outcome
from robustipy.models import LRobust
model = LRobust(y="outcome_binary", x="treatment", data=data)
model.fit(
    controls=controls,
    draws=500,
    kfold=5,
    oos_metric="pseudo-r2",
    seed=123,
)
results = model.get_results()
results.summary()
results.plot(ic="bic", oddsratio=True)
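The oddsratio=True option used above relates to a standard transformation: a logistic coefficient is on the log-odds scale, and exponentiating it gives an odds ratio. A minimal sketch of that arithmetic follows, with made-up numbers and a hypothetical helper name (to_odds_ratio is not part of RobustiPy).

```python
import math

def to_odds_ratio(coef, ci_low, ci_high):
    """Exponentiate a log-odds coefficient and its interval bounds."""
    return math.exp(coef), math.exp(ci_low), math.exp(ci_high)

# Hypothetical log-odds estimate and interval, not output from a real model
odds_ratio, or_low, or_high = to_odds_ratio(0.405, 0.10, 0.71)

# A coefficient of 0 maps to an odds ratio of 1 (no association)
print(to_odds_ratio(0.0, 0.0, 0.0))  # (1.0, 1.0, 1.0)
print(or_low < odds_ratio < or_high)
```

Because exp is monotone, the ordering of the interval bounds is preserved, but the interval is no longer symmetric around the point estimate.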
RobustiPy builds many “reasonable” specifications by varying which controls are included. The control list you pass to fit() defines the candidate set. The full specification space can be large, so RobustiPy lets you sample or downscale.
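To get a feel for how quickly the specification space grows, the sketch below enumerates every subset of a candidate control list. It assumes the space is simply all subsets, from no controls to the full set; the helper name enumerate_specs is hypothetical, and RobustiPy's own enumeration may differ in detail.

```python
from itertools import combinations

def enumerate_specs(controls):
    """Return every subset of the candidate controls, from the empty set to the full set."""
    specs = []
    for size in range(len(controls) + 1):
        specs.extend(combinations(controls, size))
    return specs

controls = ["control_1", "control_2", "control_3", "control_4"]
specs = enumerate_specs(controls)

print(len(specs))  # 16, i.e. 2 ** 4
print(specs[0])    # () is the model with no controls
print(specs[-1])   # the model with all four controls
```

Under this assumption, each extra candidate control doubles the count, so 10 controls give 1,024 specifications and 20 give over a million.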
| Parameter | Purpose | Notes |
|---|---|---|
| controls | Candidate control variables | Required list of column names |
| draws | Bootstrap resamples per spec | If None, bootstrapping is skipped |
| kfold | Folds for cross‑validation | Requires oos_metric |
| oos_metric | OOS metric | 'pseudo-r2' or 'rmse' |
| n_cpu | Parallel processes | Defaults to all available CPUs minus one if not provided |
| seed | Reproducibility | Propagated to all random ops |
| group | Fixed effects grouping | De‑means outcomes by group |
| z_specs_sample_size | Sample specs | Randomly samples control‑set combinations |
| composite_sample | Composite bootstrap | Reduces compute by sampling before specs |
| rescale_y/x/z | Standardization | Rescales variables to mean 0, sd 1 |
| threshold | Workload warning | Warns if specs × draws × folds is too large |
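Two rows in the table describe simple transformations: group de-means by group, and the rescale options standardize to mean 0 and sd 1. The sketch below shows both operations in plain Python on made-up data. The function names are hypothetical, and the use of the population standard deviation is an assumption; RobustiPy may use a different convention.

```python
from collections import defaultdict
from statistics import mean, pstdev

def demean_by_group(values, groups):
    """Subtract each group's mean from the values belonging to that group."""
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)
    group_means = {g: mean(vs) for g, vs in by_group.items()}
    return [value - group_means[group] for value, group in zip(values, groups)]

def standardize(values):
    """Rescale to mean 0 and sd 1 (population sd, an assumption of this sketch)."""
    centre, spread = mean(values), pstdev(values)
    return [(value - centre) / spread for value in values]

# Made-up outcome and grouping variable
y = [10.0, 12.0, 20.0, 24.0]
firm = ["a", "a", "b", "b"]

print(demean_by_group(y, firm))  # [-1.0, 1.0, -2.0, 2.0]
z = standardize([1.0, 2.0, 3.0])
print(z)
```

After de-meaning, only within-group variation remains, which is the sense in which a grouping variable acts as a fixed effect.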
model.fit(
    controls=controls,
    draws=200,
    kfold=5,
    z_specs_sample_size=300,
    threshold=200000,
    n_cpu=4,
)
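To see how these settings interact, the sketch below estimates the workload as the plain product specs × draws × folds, which is an assumption based on the threshold row of the table rather than a confirmed formula. It also draws a reproducible random sample of specification indices. The helper name estimated_workload and the choice of 10 controls are hypothetical.

```python
import random

def estimated_workload(n_specs, draws, kfold):
    """Rough count of model fits, assuming the plain product specs x draws x folds."""
    return n_specs * draws * kfold

n_controls = 10                # hypothetical number of candidate controls
full_space = 2 ** n_controls   # every subset of the controls: 1024
draws, kfold, threshold = 200, 5, 200_000

full = estimated_workload(full_space, draws, kfold)
sampled = estimated_workload(300, draws, kfold)
print(full, full > threshold)        # 1024000 True
print(sampled, sampled > threshold)  # 300000 True

# Reproducibly pick 300 distinct specifications out of the full space
rng = random.Random(123)
chosen = rng.sample(range(full_space), k=300)
print(len(set(chosen)))  # 300
```

In this made-up case, sampling 300 specifications cuts the estimate by about 70 percent but still leaves it above the threshold, so reducing draws or kfold would be the next lever to try.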
After get_results(), the results object includes:
- results.summary() for a compact robustness overview.
- results.plot() for the multi‑panel figure.
- results.summary_df with per‑spec metrics (ICs, OOS performance, etc.).
- results.estimates, results.p_values, results.r2_values for full matrices of outcomes.

results.plot() accepts the following options:
- ic: 'aic', 'bic', or 'hqic'.
- specs: list of specific control sets to highlight (max 3).
- ci: confidence interval width.
- loess: smooth CI bounds.
- colormap: colormap for highlights and colorbars.
- highlights: toggle full‑model and null‑model highlights.
- oddsratio: show exponentiated coefficients for logistic models.
- figpath, project_name, ext: output location and file format.

specs = [
    ["age", "tenure", "hours"],
    ["age", "tenure", "hours", "collgrad"],
]
results.plot(specs=specs, ic="hqic")
results.plot(...) saves a combined multi‑panel figure and a set of individual panels. Use figpath and project_name to control the output directory and filename prefix.
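The ic option selects among three information criteria. For reference, the sketch below computes them from their textbook definitions, given a log-likelihood, a parameter count, and a sample size. The function name and the input numbers are hypothetical, and how RobustiPy counts parameters or computes these values internally is not confirmed here.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Textbook AIC, BIC and HQIC; lower values indicate a preferred model."""
    aic = 2 * n_params - 2 * log_likelihood
    bic = n_params * math.log(n_obs) - 2 * log_likelihood
    hqic = 2 * n_params * math.log(math.log(n_obs)) - 2 * log_likelihood
    return {"aic": aic, "bic": bic, "hqic": hqic}

# Made-up fit statistics, for illustration only
ics = information_criteria(log_likelihood=-120.0, n_params=5, n_obs=200)
print(ics["aic"])  # 250.0
print(ics["aic"] < ics["hqic"] < ics["bic"])
```

The three criteria share the same fit term and differ only in how heavily they penalize extra parameters, with BIC the strictest at typical sample sizes.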
- When you set kfold, also set oos_metric.
- For help reading the outputs, see INTERPRETATION_GUIDE.MD.