Model for US Elections (DNN)
Storyboard
Book:
Allan J. Lichtman, Predicting the Next President, Rowman & Littlefield Publishers (2020)
Program
ID:(1789, 0)
Loading the Lichtman model data
Description
Load the data from the file election.csv:
# import pandas and numpy
import pandas as pd
import numpy as np

# load the data
election = pd.read_csv('election.csv')
ID:(13856, 0)
Structure of the study data
Description
Show all columns of the data loaded into election, using head:
# show the first 5 records
election.head(5)
ID:(13857, 0)
Building the training and evaluation data
Description
Split the data into training sets (X_train_v, y_train_v and X_train_e, y_train_e) and evaluation sets (X_eval_v, y_eval_v and X_eval_e, y_eval_e), one pair for the popular vote and one for the Electoral College:
# import train_test_split
from sklearn.model_selection import train_test_split

# separate the two targets from the features; pop() removes each column
# from election, and therefore from X, which is the same object
X = election
y_v = election.pop('vote-victory')
y_e = election.pop('electoral-victory')

# build train and evaluation data for the vote and electorate results
X_train_v, X_eval_v, y_train_v, y_eval_v = train_test_split(
    X, y_v, test_size=0.33, random_state=42)
X_train_e, X_eval_e, y_train_e, y_eval_e = train_test_split(
    X, y_e, test_size=0.33, random_state=42)
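As a rough illustration of what test_size=0.33 and random_state=42 do, the following plain-Python sketch splits row indices with a seeded shuffle. It is not scikit-learn's implementation, and its permutation differs from the one train_test_split would produce. The helper name split_indices and the row count 39 are assumptions made for illustration, and rounding the evaluation share up is assumed to match how scikit-learn treats a fractional test_size.

```python
import math
import random

def split_indices(n_rows, test_size, seed):
    """Return (train, evaluation) index lists from one seeded shuffle."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_eval = math.ceil(test_size * n_rows)   # evaluation share, rounded up
    return indices[n_eval:], indices[:n_eval]

# hypothetical row count, chosen only for illustration
train_a, eval_a = split_indices(39, test_size=0.33, seed=42)
train_b, eval_b = split_indices(39, test_size=0.33, seed=42)

print(len(train_a), len(eval_a))
print(eval_a == eval_b)   # same seed, same rows
```

Because both calls in the original code use random_state=42 on the same X, the vote and college targets are split along the same rows.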
ID:(13858, 0)
Creating the feature column list that defines the model
Description
To define the model, create the list feature_columns with the columns the model uses as inputs:
import tensorflow as tf

# the 13 keys of the Lichtman model, used as numeric inputs
NUMERIC_COLUMNS = ['party-mandate',
                   'nomination-contest',
                   'incumbency',
                   'third-party',
                   'short-term-economy',
                   'long-term-economy',
                   'policy-change',
                   'social-unrest',
                   'scandal',
                   'foreign-military-failure',
                   'foreign-military-success',
                   'incumbent-charisma',
                   'challenger-charisma']

feature_columns = []
for feature_name in NUMERIC_COLUMNS:
    feature_columns.append(
        tf.feature_column.numeric_column(feature_name, dtype=tf.int32))
ID:(13859, 0)
Defining the DNN models
Description
With the columns in feature_columns, tf.estimator.DNNClassifier defines the models DNN_model_v for the popular vote and DNN_model_e for the Electoral College:
# define the DNN model for voting
DNN_model_v = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32])

# define the DNN model for electorate
DNN_model_e = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64, 32])
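To give a sense of the size of a network with hidden_units=[64,32], the sketch below counts the weights and biases of a fully connected network with 13 inputs, one per key in NUMERIC_COLUMNS. The single output unit is an assumption about how a two-class classifier is typically built, and the helper name count_parameters is hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix plus bias vector
    return total

# 13 input keys, two hidden layers, one output logit (assumed)
layer_sizes = [13, 64, 32, 1]
print(count_parameters(layer_sizes))
```

Such a count is useful for comparing the capacity of the model with the number of training rows.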
ID:(13860, 0)
Input functions for training and evaluation
Description
To run training and evaluation, the data must be loaded into tensors by input functions: train_input_fn, which also shuffles and repeats the data, and eval_input_fn:
# input function for training
def train_input_fn(features, labels, batch_size):
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    # shuffle with a 10-element buffer, repeat without end, then batch
    dataset = dataset.shuffle(10).repeat().batch(batch_size)
    return dataset

# input function for evaluation or prediction
def eval_input_fn(features, labels, batch_size):
    features = dict(features)
    if labels is None:
        inputs = features
    else:
        inputs = (features, labels)
    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    assert batch_size is not None, 'batch_size must not be None'
    dataset = dataset.batch(batch_size)
    return dataset
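The chain shuffle(10).repeat().batch(batch_size) can be pictured with a plain-Python analogue. The sketch below is not TensorFlow code: buffered_shuffle, repeat_forever and batch are hypothetical helpers that mimic a 10-element shuffle buffer, endless repetition and fixed-size batching on a made-up list of 25 row numbers.

```python
import itertools
import random

def buffered_shuffle(items, buffer_size, rng):
    """Yield items in random order using a bounded buffer."""
    buffer = []
    for item in items:
        buffer.append(item)
        if len(buffer) >= buffer_size:
            # buffer is full: emit one randomly chosen element
            yield buffer.pop(rng.randrange(len(buffer)))
    while buffer:
        # input exhausted: drain what is left, still in random order
        yield buffer.pop(rng.randrange(len(buffer)))

def repeat_forever(make_epoch):
    """Restart the epoch generator each time it is exhausted."""
    while True:
        yield from make_epoch()

def batch(stream, batch_size):
    """Group a stream into lists of batch_size elements."""
    stream = iter(stream)
    while True:
        chunk = list(itertools.islice(stream, batch_size))
        if not chunk:
            return
        yield chunk

rng = random.Random(0)
rows = list(range(25))   # stand-in for 25 training rows (made up)
stream = batch(repeat_forever(lambda: buffered_shuffle(rows, 10, rng)), 10)

# the stream never ends, so the caller decides how many batches to draw
first_batches = list(itertools.islice(stream, 5))
for b in first_batches:
    print(b)
```

Because the training stream repeats without end, the steps argument of train is what stops each training call. eval_input_fn has no repeat, so evaluation and prediction stop after one pass over the data.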
ID:(13861, 0)
Training the DNN model for the popular vote
Description
Train the model DNN_model_v with the data from train_input_fn, in 100 rounds of 40 steps each:
# train the DNN model v
batch_size = 10
train_steps = 40
for i in range(0, 100):
    DNN_model_v.train(
        input_fn=lambda: train_input_fn(X_train_v, y_train_v, batch_size),
        steps=train_steps)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.7023675, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 40...
INFO:tensorflow:Saving checkpoints for 40 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 40...
INFO:tensorflow:Loss for final step: 0.5706555.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
...
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-3960
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3960...
INFO:tensorflow:Saving checkpoints for 3960 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3960...
INFO:tensorflow:loss = 0.19702096, step = 3960
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 4000...
INFO:tensorflow:Saving checkpoints for 4000 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 4000...
INFO:tensorflow:Loss for final step: 0.25778225.
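The step numbers in the log follow from the loop parameters. As a quick check, assuming each call to train continues from the last saved checkpoint, as the "Restoring parameters" line indicates:

```python
batch_size = 10
train_steps = 40
rounds = 100

total_steps = rounds * train_steps            # global step after the loop
last_round_start = total_steps - train_steps  # step restored in the last round
examples_seen = total_steps * batch_size      # rows drawn, with repetition

print(total_steps, last_round_start, examples_seen)
```

These values match the checkpoint model.ckpt-3960 restored in the last round and the final save at step 4000.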
ID:(13862, 0)
Training the DNN model for the Electoral College
Description
Train the model DNN_model_e with the data from train_input_fn, in 100 rounds of 40 steps each:
# train the DNN model e
batch_size = 10
train_steps = 40
for i in range(0, 100):
    DNN_model_e.train(
        input_fn=lambda: train_input_fn(X_train_e, y_train_e, batch_size),
        steps=train_steps)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.66710687, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 40...
INFO:tensorflow:Saving checkpoints for 40 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 40...
INFO:tensorflow:Loss for final step: 0.74084073.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
...
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-3960
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3960...
INFO:tensorflow:Saving checkpoints for 3960 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3960...
INFO:tensorflow:loss = 0.32061976, step = 3960
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 4000...
INFO:tensorflow:Saving checkpoints for 4000 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 4000...
INFO:tensorflow:Loss for final step: 0.2767122.
ID:(13863, 0)
Evaluating the DNN model for victory by popular vote
Description
Evaluate the popular-vote model with the data built by eval_input_fn:
# evaluate the DNN model of victory by votes
eval_result_v = DNN_model_v.evaluate(
    input_fn=lambda: eval_input_fn(X_eval_v, y_eval_v, batch_size))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-27T13:08:31
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.37481s
INFO:tensorflow:Finished evaluation at 2021-07-27-13:08:32
INFO:tensorflow:Saving dict for global step 4000: accuracy = 0.46153846, accuracy_baseline = 0.61538464, auc = 0.9, auc_precision_recall = 0.9057735, average_loss = 0.81627464, global_step = 4000, label/mean = 0.3846154, loss = 0.74616987, precision = 0.41666666, prediction/mean = 0.7725815, recall = 1.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 4000: C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000
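The reported metrics can be turned back into a confusion matrix, which makes the result easier to read. The sketch below assumes 13 evaluation rows, the smallest count consistent with label/mean = 0.3846154 (5/13) and accuracy = 0.46153846 (6/13). The function name confusion_from_metrics is hypothetical.

```python
def confusion_from_metrics(n_rows, label_mean, precision, recall):
    """Recover (TP, FP, FN, TN) counts from summary metrics."""
    positives = round(label_mean * n_rows)   # rows whose label is 1
    tp = round(recall * positives)           # recall = TP / (TP + FN)
    fn = positives - tp
    predicted_pos = round(tp / precision)    # precision = TP / (TP + FP)
    fp = predicted_pos - tp
    tn = n_rows - tp - fn - fp
    return tp, fp, fn, tn

# metrics copied from the evaluation log; n_rows=13 is an assumption
tp, fp, fn, tn = confusion_from_metrics(
    n_rows=13, label_mean=0.3846154, precision=0.41666666, recall=1.0)

print(tp, fp, fn, tn)
print((tp + tn) / 13)   # accuracy implied by the counts
```

Under this assumption the model assigns class 1 to 12 of the 13 rows, so the recall of 1.0 comes at the price of 7 false positives, and the accuracy stays below accuracy_baseline = 0.61538464, the score obtained by always predicting the majority class.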
ID:(13864, 0)
Evaluating the DNN model for victory by Electoral College
Description
Evaluate the Electoral College model with the data built by eval_input_fn:
# evaluate the DNN model of victory by college
eval_result_e = DNN_model_e.evaluate(
    input_fn=lambda: eval_input_fn(X_eval_e, y_eval_e, batch_size))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2021-07-27T13:08:53
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Inference Time : 0.37231s
INFO:tensorflow:Finished evaluation at 2021-07-27-13:08:54
INFO:tensorflow:Saving dict for global step 4000: accuracy = 0.61538464, accuracy_baseline = 0.53846157, auc = 0.78571427, auc_precision_recall = 0.77785635, average_loss = 0.7589465, global_step = 4000, label/mean = 0.46153846, loss = 0.80362004, precision = 0.54545456, prediction/mean = 0.7483164, recall = 1.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 4000: C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000
ID:(13865, 0)
Predicting with the DNN model for victory by popular vote
Description
Predict the result of the popular-vote model using the evaluation data built by eval_input_fn:
# forecast with the DNN model of victory by voting
predictions_v = DNN_model_v.predict(
    input_fn=lambda: eval_input_fn(X_eval_v, labels=None,
                                   batch_size=batch_size))
results_v = list(predictions_v)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
ID:(13866, 0)
Predicting with the DNN model for victory by Electoral College
Description
Predict the result of the Electoral College model using the evaluation data built by eval_input_fn:
# forecast with the DNN model of victory by college
predictions_e = DNN_model_e.predict(
    input_fn=lambda: eval_input_fn(X_eval_e, labels=None,
                                   batch_size=batch_size))
results_e = list(predictions_e)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
ID:(13868, 0)
Histogram of the predicted probabilities
Description
Show the distribution of the class-1 probabilities predicted by both models on the evaluation data:
# show histogram of probabilities
import numpy
from matplotlib import pyplot

bins = numpy.linspace(0, 1, 10)
# probability of class 1 for each evaluation row
prob_v = [pred['probabilities'][1] for pred in results_v]
prob_e = [pred['probabilities'][1] for pred in results_e]
pyplot.hist([prob_v, prob_e], bins, label=['vote', 'college'])
pyplot.title('predicted probabilities')
pyplot.xlabel('probability')
pyplot.ylabel('frequency')
pyplot.legend(loc='upper right')
pyplot.show()
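One detail of the binning is easy to miss: numpy.linspace(0, 1, 10) returns 10 bin edges, which define 9 bins rather than 10. The sketch below reproduces this in plain Python. The helpers linspace and histogram are stand-ins written for illustration and assume values between 0 and 1, and the probabilities are made up.

```python
from bisect import bisect_right

def linspace(start, stop, num):
    """Plain-Python stand-in for numpy.linspace with the endpoint included."""
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]

def histogram(values, edges):
    """Count values per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    return counts

edges = linspace(0, 1, 10)
probs = [0.05, 0.48, 0.52, 0.77, 0.93, 1.0]   # made-up probabilities

print(len(edges) - 1)            # number of bins
print(histogram(probs, edges))
```

Ten edges give nine bins of width 1/9. If ten bins of width 0.1 are wanted, numpy.linspace(0, 1, 11) would provide the eleven edges needed.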
ID:(13867, 0)
ROC curves
Description
To use this forecast, a probability threshold must be set: at or above it an incumbent victory is predicted, and below it a challenger victory. Choosing the threshold means weighing how often a victory is predicted and actually observed (true positive) against how often a victory is predicted but does not occur (false positive).
The true positive rate is
$TPR=\displaystyle\frac{TP}{TP+FN}$
where $TP$ is the number of true positives and $FN$ the number of false negatives. The false positive rate is
$FPR=\displaystyle\frac{FP}{FP+TN}$
where $FP$ is the number of false positives and $TN$ the number of true negatives.
The plot of the true positive rate against the false positive rate is called the ROC (Receiver Operating Characteristic) curve:
from sklearn.metrics import roc_curve
from matplotlib import pyplot as plt

# prob_v and prob_e are the class-1 probabilities computed for the histogram
fpr_v, tpr_v, _ = roc_curve(y_eval_v, prob_v)
fpr_e, tpr_e, _ = roc_curve(y_eval_e, prob_e)
plt.plot(fpr_v, tpr_v, label='vote')
plt.plot(fpr_e, tpr_e, label='college')
plt.title('ROC curve')
plt.xlabel('false positive rate')
plt.ylabel('true positive rate')
plt.legend(loc='lower right')
plt.xlim(0,)
plt.ylim(0,)
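The two rates can be computed by hand for a handful of thresholds. The sketch below uses made-up labels and scores, not the election data, and a hypothetical helper rates_at. A prediction counts as positive when its score is at or above the threshold.

```python
def rates_at(threshold, labels, scores):
    """Return (FPR, TPR) when scores >= threshold are predicted positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return fp / (fp + tn), tp / (tp + fn)

# made-up example: 1 = positive class, scores = predicted probabilities
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

for threshold in (1.0, 0.75, 0.5, 0.0):
    fpr, tpr = rates_at(threshold, labels, scores)
    print(threshold, round(fpr, 2), round(tpr, 2))
```

Lowering the threshold moves the point from (0, 0) towards (1, 1). Plotting the true positive rate against the false positive rate over all thresholds gives the ROC curve.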
ID:(13869, 0)
Loading and inspecting the scenario data
Description
Load the scenario data into election_secnarios and show 6 rows with head:
# load and show 6 scenarios
election_secnarios = pd.read_csv('election-secnarios.csv')
election_secnarios.head(6)
ID:(13870, 0)
Prediction
Description
Predict the outcomes of the scenarios contained in election_secnarios:
predictions_v = DNN_model_v.predict(
    input_fn=lambda: eval_input_fn(election_secnarios, labels=None,
                                   batch_size=batch_size))
predictions_e = DNN_model_e.predict(
    input_fn=lambda: eval_input_fn(election_secnarios, labels=None,
                                   batch_size=batch_size))
results_v = list(predictions_v)
results_e = list(predictions_e)

# format the predicted class and its probability for scenario j
def x(res, j):
    class_id = int(res[j]['class_ids'][0])
    probability = int(res[j]['probabilities'][class_id] * 100)
    winner = 'Challenger' if class_id == 0 else 'Incumbent'
    return '%s%% probability to %s' % (probability, winner)

print('Predictions for the scenarios:')
for i in range(0, 6):
    print(x(results_v, i) + ',' + x(results_e, i))
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
Predictions for the scenarios:
68% probability to Incumbent,60% probability to Incumbent
80% probability to Incumbent,72% probability to Incumbent
86% probability to Incumbent,76% probability to Incumbent
90% probability to Incumbent,76% probability to Incumbent
89% probability to Incumbent,76% probability to Incumbent
93% probability to Incumbent,75% probability to Incumbent
ID:(13871, 0)