Sub-Matrix Factorization for Real-Time Vote Prediction
The predikon library is a Python implementation of the algorithm proposed in
Alexander Immer*, Victor Kristof*, Matthias Grossglauser, Patrick Thiran, Sub-Matrix Factorization for Real-Time Vote Prediction, KDD 2020
The predikon algorithm enables you to predict aggregate vote outcomes (e.g., national) from partial outcomes (e.g., regional) that are revealed sequentially.
See the usage documentation for more details on how to use this library, or read the paper cited above for details on how the algorithm works.
It is the algorithm powering predikon.ch, a platform for real-time vote prediction in Switzerland.
To install the Predikon library from PyPI, run
pip install predikon
Given a dataset Y of historical vote results collected in an array of R regions and V votes, a vector y of partial results, and an optional weighting w per region (e.g., the number of valid votes in each region), you can predict the unobserved entries of y after observing at least one regional result (one entry of y) of an ongoing referendum or election:
from predikon import LogisticSubSVD
model = LogisticSubSVD(Y, w)
pred = model.fit_predict(y)
# All unobserved entries in `y` are now filled in.
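As a rough illustration of the inputs described above, the following sketch builds synthetic Y, w, and y using NumPy only. The sizes, the random values, and the use of NaN to mark regions that have not reported yet are assumptions made for this example; refer to the usage documentation for the exact conventions expected by the library.

```python
import numpy as np

rng = np.random.default_rng(0)
R, V = 6, 4  # Hypothetical numbers of regions and historical votes.

# Historical results: one row per region, one column per past vote
# (synthetic fractions of "yes").
Y = rng.uniform(0.2, 0.8, size=(R, V))

# Weights: number of valid votes per region (synthetic).
w = rng.integers(1000, 50000, size=R).astype(float)

# Ongoing vote: assume only regions 0 and 3 have reported so far.
# Unobserved entries are marked with NaN (an assumption of this sketch).
y_full = rng.uniform(0.2, 0.8, size=R)
y = np.full(R, np.nan)
observed = [0, 3]
y[observed] = y_full[observed]

n_missing = int(np.isnan(y).sum())
print(n_missing, "regions still unobserved")
```

Arrays shaped like these (Y with one row per region, y and w with one entry per region) are what the snippet above passes to the model.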
You can then obtain a prediction for the aggregate outcome (assuming the weights are the number of valid votes in this example) as:
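N = w.sum()  # Total number of valid votes.
ypred = pred.dot(w) / N  # Predicted aggregate outcome.
ytrue = y.dot(w) / N  # True aggregate outcome (meaningful once `y` is fully observed).
print(abs(ypred - ytrue))  # Absolute error of the aggregate prediction.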
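To make the weighted aggregation concrete, here is a small self-contained sketch with made-up numbers. It does not use the predikon model: as a stand-in, the unobserved regions are filled with the weighted mean of the observed ones, which is only a naive baseline chosen for illustration. The variable names y_true, fill, agg_pred, and agg_true are hypothetical.

```python
import numpy as np

# Made-up data for four regions.
w = np.array([1000.0, 3000.0, 2000.0, 4000.0])  # Valid votes per region.
y_true = np.array([0.60, 0.40, 0.55, 0.50])     # Final results, known after the vote.
y = np.array([0.60, np.nan, np.nan, 0.50])      # Partial results (NaN = not yet reported).

observed = ~np.isnan(y)

# Naive stand-in for a model prediction: fill unobserved regions with
# the weighted mean of the observed regions.
fill = np.average(y[observed], weights=w[observed])
pred = np.where(observed, y, fill)

# Aggregate outcomes as weighted averages over all regions.
N = w.sum()
agg_pred = pred.dot(w) / N
agg_true = y_true.dot(w) / N
error = abs(agg_pred - agg_true)
print(agg_pred, agg_true, error)
```

Note that the true aggregate is computed from the complete results: a weighted sum over a vector that still contains NaN entries would itself be NaN.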
Have a look at the example notebook for a complete example of how to use the predikon library (with Swiss referenda).
You can find further information in:
- The example notebook using Swiss referenda
- The usage documentation describing the setup in more detail
- The scientific paper introducing the algorithm
And don't hesitate to reach out to us if you have questions, or to open issues!
Requirements:
- Python 3.5 or later
- NumPy 1.0.0 or later
- scikit-learn 0.16.1 or later