Model for US elections (DNN)

Storyboard

Book:

Allan J. Lichtman - Predicting the Next President-Rowman & Littlefield Publishers (2020)

Program

USElection.ipynb

election.csv

election-secnarios.csv

>Model

ID:(1789, 0)



Loading the Lichtman model data

Description

>Top


Load the data from the file election.csv:

# import pandas and numpy
import pandas as pd
import numpy as np

# load data
election = pd.read_csv('election.csv')

ID:(13856, 0)



Structure of the study data

Description

>Top


Show all columns of the data loaded into election with head:

# show 5 records
election.head(5)


ID:(13857, 0)



Building the training and evaluation data

Description

>Top


Build the data sets X_train, y_train for training and X_eval, y_eval for evaluation, once for each of the two targets:

# import train_test_split
from sklearn.model_selection import train_test_split

# build train and evaluation data for the vote and electorate results
X = election
y_v = election.pop('vote-victory')
y_e = election.pop('electoral-victory')
X_train_v, X_eval_v, y_train_v, y_eval_v = train_test_split(X, y_v, test_size=0.33, random_state=42)
X_train_e, X_eval_e, y_train_e, y_eval_e = train_test_split(X, y_e, test_size=0.33, random_state=42)
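
The effect of test_size and random_state can be illustrated with a small sketch in plain Python. It is not scikit-learn's implementation: the helper split_indices and the record count of 39 are assumptions made only for illustration. It shows why two calls with the same seed pick the same rows, so X_train_v and X_train_e hold the same elections and only the labels differ.

```python
import math
import random

def split_indices(n_records, test_size, seed):
    """Return (train, test) index lists produced by a seeded shuffle."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # a fractional test size is rounded up to whole records
    n_test = math.ceil(test_size * n_records)
    return indices[n_test:], indices[:n_test]

# hypothetical data set of 39 elections, split twice with the same seed
train_a, test_a = split_indices(39, 0.33, seed=42)
train_b, test_b = split_indices(39, 0.33, seed=42)

print(len(train_a), len(test_a))                 # 26 13
print(train_a == train_b and test_a == test_b)   # True
```

Because the partition is identical for both targets, differences between the two models later on come from the labels, not from which elections were held out.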

ID:(13858, 0)



Creating the list of feature columns that defines the model

Description

>Top


To define the model, the list feature_columns is built; it names the columns that the model uses as inputs:

import tensorflow as tf

NUMERIC_COLUMNS = ['party-mandate', 'nomination-contest', 'incumbency', 'third-party', 'short-term-economy', 'long-term-economy', 'policy-change', 'social-unrest', 'scandal', 'foreign-military-failure', 'foreign-military-success', 'incumbent-charisma', 'challenger-charisma']

feature_columns = []
for feature_name in NUMERIC_COLUMNS:
  feature_columns.append(tf.feature_column.numeric_column(feature_name, dtype=tf.int32))

ID:(13859, 0)



Defining the DNN models

Description

>Top


With the columns in feature_columns, tf.estimator.DNNClassifier defines the models DNN_model_v for the popular vote and DNN_model_e for the Electoral College:

# define the DNN model for voting
DNN_model_v = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64,32])
# define the DNN model for electorate
DNN_model_e = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[64,32])
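
To get a feeling for the size of such a network, the weights and biases of a fully connected stack can be counted by hand. The sketch below assumes 13 inputs, the hidden layers [64, 32] and a single output logit for the two-class case; the helper dense_parameter_count is hypothetical and does not inspect the estimator itself.

```python
def dense_parameter_count(n_inputs, hidden_units, n_outputs):
    """Count weights and biases of a fully connected network."""
    sizes = [n_inputs] + list(hidden_units) + [n_outputs]
    # each layer has n_in * n_out weights plus n_out biases
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

n_parameters = dense_parameter_count(13, [64, 32], 1)
print(n_parameters)  # 3009
```

A few thousand parameters against a few dozen elections means the network can easily memorize the training data, which is why the evaluation on held-out records matters.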

ID:(13860, 0)



Input functions for training and evaluation

Description

>Top


To run the training, the data must be loaded into tensors. train_input_fn does this for training and also shuffles the records; eval_input_fn does it for evaluation and prediction:

# input function for training
def train_input_fn(features, labels, batch_size):
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
    dataset = dataset.shuffle(10).repeat().batch(batch_size)
    return dataset

# input function for evaluation or prediction
def eval_input_fn(features, labels, batch_size):
    features=dict(features)
    if labels is None:
        inputs = features
    else:
        inputs = (features, labels)

    dataset = tf.data.Dataset.from_tensor_slices(inputs)
    
    assert batch_size is not None, 'batch_size must not be None'
    dataset = dataset.batch(batch_size)

    return dataset
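
What the chain shuffle(10).repeat().batch(batch_size) does can be mimicked with plain Python generators. This is only a sketch of the semantics, not TensorFlow code: the helpers buffered_shuffle, repeat and batch and the 26 records are assumptions for illustration. A shuffle buffer of 10 draws each output from the 10 elements currently buffered, so with more than 10 records the order is only partially random.

```python
import random
from itertools import islice

def buffered_shuffle(items, buffer_size, rng):
    """Yield items in an order drawn from a sliding buffer."""
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)
        else:
            # emit a random buffered element and put the new item in its slot
            slot = rng.randrange(buffer_size)
            yield buffer[slot]
            buffer[slot] = item
    # input exhausted: emit what is left in random order
    rng.shuffle(buffer)
    yield from buffer

def repeat(make_iterator):
    """Restart the iterator forever, like an endless sequence of epochs."""
    while True:
        yield from make_iterator()

def batch(iterator, batch_size):
    """Group consecutive elements into lists of batch_size."""
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk

rng = random.Random(0)
records = range(26)  # hypothetical number of training records
pipeline = batch(repeat(lambda: buffered_shuffle(records, 10, rng)), 10)
first_batches = list(islice(pipeline, 3))
print(first_batches)
```

Three batches of 10 already need 30 elements, so the third batch mixes the end of the first pass with the start of the second. Because repeat comes before batch, batches are never cut short at the end of a pass.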

ID:(13861, 0)



Training the DNN model for the popular vote

Description

>Top


With the input function train_input_fn, the model DNN_model_v is trained in 100 rounds of 40 steps each:

# train the DNN model v
batch_size = 10
train_steps = 40

for i in range(0,100):    
    DNN_model_v.train(input_fn=lambda:train_input_fn(X_train_v, y_train_v,batch_size),steps=train_steps)

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...

INFO:tensorflow:Saving checkpoints for 0 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

INFO:tensorflow:loss = 0.7023675, step = 0

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 40...

INFO:tensorflow:Saving checkpoints for 40 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 40...

INFO:tensorflow:Loss for final step: 0.5706555.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

...

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-3960

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3960...

INFO:tensorflow:Saving checkpoints for 3960 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3960...

INFO:tensorflow:loss = 0.19702096, step = 3960

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 4000...

INFO:tensorflow:Saving checkpoints for 4000 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 4000...

INFO:tensorflow:Loss for final step: 0.25778225.

ID:(13862, 0)



Training the DNN model for the Electoral College

Description

>Top


With the input function train_input_fn, the model DNN_model_e is trained in 100 rounds of 40 steps each:

# train the DNN model e
batch_size = 10
train_steps = 40

for i in range(0,100):    
    DNN_model_e.train(input_fn=lambda:train_input_fn(X_train_e, y_train_e,batch_size),steps=train_steps)

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...

INFO:tensorflow:Saving checkpoints for 0 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...

INFO:tensorflow:loss = 0.66710687, step = 0

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 40...

INFO:tensorflow:Saving checkpoints for 40 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 40...

INFO:tensorflow:Loss for final step: 0.74084073.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Create CheckpointSaverHook.

INFO:tensorflow:Graph was finalized.

...

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-3960

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3960...

INFO:tensorflow:Saving checkpoints for 3960 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3960...

INFO:tensorflow:loss = 0.32061976, step = 3960

INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 4000...

INFO:tensorflow:Saving checkpoints for 4000 into C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt.

INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 4000...

INFO:tensorflow:Loss for final step: 0.2767122.

ID:(13863, 0)



Evaluating the DNN model of victory by popular vote

Description

>Top


Evaluate the popular-vote victory model with the evaluation data provided by eval_input_fn:

# evaluate the DNN model of victory by votes
eval_result_v = DNN_model_v.evaluate(input_fn=lambda:eval_input_fn(X_eval_v, y_eval_v,batch_size))

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Starting evaluation at 2021-07-27T13:08:31

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Inference Time : 0.37481s

INFO:tensorflow:Finished evaluation at 2021-07-27-13:08:32

INFO:tensorflow:Saving dict for global step 4000: accuracy = 0.46153846, accuracy_baseline = 0.61538464, auc = 0.9, auc_precision_recall = 0.9057735, average_loss = 0.81627464, global_step = 4000, label/mean = 0.3846154, loss = 0.74616987, precision = 0.41666666, prediction/mean = 0.7725815, recall = 1.0

INFO:tensorflow:Saving 'checkpoint_path' summary for global step 4000: C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000
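
The metrics in the log above can be tied together with a small worked example. The counts used here (5 true positives, 7 false positives, 1 true negative, 0 false negatives, 13 records in total) are inferred from the rounded values in the log and are an assumption, not something read from the notebook; any multiple of them gives the same ratios. The helper binary_metrics is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Derive the usual ratios from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        'accuracy': (tp + tn) / total,
        'precision': tp / (tp + fp),
        'recall': tp / (tp + fn),
        'label_mean': (tp + fn) / total,
    }

# assumed counts for the popular-vote model at the default threshold of 0.5
metrics_v = binary_metrics(tp=5, fp=7, tn=1, fn=0)
for name, value in metrics_v.items():
    print(name, round(value, 4))
```

Under these assumed counts the model predicts an incumbent victory for 12 of 13 records. That explains the recall of 1.0 together with an accuracy below accuracy_baseline, while the auc of 0.9 suggests that a threshold other than 0.5 would separate the classes better.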

ID:(13864, 0)



Evaluating the DNN model of victory by Electoral College

Description

>Top


Evaluate the Electoral College victory model with the evaluation data provided by eval_input_fn:

# evaluate the DNN model of victory by college
eval_result_e = DNN_model_e.evaluate(input_fn=lambda:eval_input_fn(X_eval_e, y_eval_e,batch_size))

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Starting evaluation at 2021-07-27T13:08:53

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Inference Time : 0.37231s

INFO:tensorflow:Finished evaluation at 2021-07-27-13:08:54

INFO:tensorflow:Saving dict for global step 4000: accuracy = 0.61538464, accuracy_baseline = 0.53846157, auc = 0.78571427, auc_precision_recall = 0.77785635, average_loss = 0.7589465, global_step = 4000, label/mean = 0.46153846, loss = 0.80362004, precision = 0.54545456, prediction/mean = 0.7483164, recall = 1.0

INFO:tensorflow:Saving 'checkpoint_path' summary for global step 4000: C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000
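
The value accuracy_baseline in the two evaluation logs is the accuracy obtained by always predicting the more frequent class, so it follows from label/mean alone. The sketch below recomputes it from the logged label/mean values; the function name accuracy_baseline is chosen here only for illustration.

```python
def accuracy_baseline(label_mean):
    """Accuracy of always predicting the majority class."""
    return max(label_mean, 1.0 - label_mean)

baseline_v = accuracy_baseline(0.3846154)    # label/mean of the vote model
baseline_e = accuracy_baseline(0.46153846)   # label/mean of the college model
print(round(baseline_v, 4))  # 0.6154
print(round(baseline_e, 4))  # 0.5385
```

Compared with these baselines, the logged accuracy of the vote model (0.46) is below its baseline, while that of the college model (0.62) is above it.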

ID:(13865, 0)



Predicting with the DNN model of victory by popular vote

Description

>Top


Predict the result of the popular-vote victory model with the evaluation data provided by eval_input_fn:

# forecast with the DNN model of victory by voting
predictions_v = DNN_model_v.predict(
    input_fn=lambda:eval_input_fn(X_eval_v,labels=None,
    batch_size=batch_size))

results_v = list(predictions_v)

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

ID:(13866, 0)



Predicting with the DNN model of victory by Electoral College

Description

>Top


Predict the result of the Electoral College victory model with the evaluation data provided by eval_input_fn:

# forecast with the DNN model of victory by college
predictions_e = DNN_model_e.predict(
    input_fn=lambda:eval_input_fn(X_eval_e,labels=None,
    batch_size=batch_size))

results_e = list(predictions_e)

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

ID:(13868, 0)



Histogram of the predicted probabilities

Description

>Top


Show the distribution of the class-1 probabilities that both models predict for the evaluation data:

# show histogram of probabilities
import numpy
from matplotlib import pyplot

bins = numpy.linspace(0, 1, 10)
prob_v = [pred['probabilities'][1] for pred in results_v]
prob_e = [pred['probabilities'][1] for pred in results_e]

pyplot.hist([prob_v,prob_e], bins, label=['vote','college'])
pyplot.title('predicted probabilities')
pyplot.xlabel('probability')
pyplot.ylabel('frequency')
pyplot.legend(loc='upper right')
pyplot.show()
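
A detail of the binning is easy to miss: 10 equally spaced edges between 0 and 1 give 9 bins of width 1/9, not 10 bins of width 0.1. The sketch below rebuilds the edges and counts a few values by hand; the helper histogram_counts and the sample values are hypothetical, and the last bin is treated as closed on the right, as NumPy-based histograms do.

```python
def histogram_counts(values, edges):
    """Count values per bin; the last bin also includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            if edges[k] <= value < edges[k + 1] or (is_last and value == edges[-1]):
                counts[k] += 1
                break
    return counts

n_edges = 10
edges = [i / (n_edges - 1) for i in range(n_edges)]
sample = [0.05, 0.5, 0.77, 0.9, 1.0]  # hypothetical probabilities

print(len(edges) - 1)                   # 9
print(histogram_counts(sample, edges))  # [1, 0, 0, 0, 1, 0, 1, 0, 2]
```

If bins of width 0.1 are wanted, 11 edges are needed instead of 10.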


ID:(13867, 0)



ROC curves

Description

>Top


To use this forecast, a threshold on the predicted probability must be defined: at or above it a victory of the incumbent (class 1) is predicted, below it a defeat. To choose the threshold, the probability that a victory is predicted and also observed (true positive) is compared with the probability that a victory is predicted although it does not occur (false positive).
The true-positive rate TPR, or sensitivity, is defined as

$TPR=\displaystyle\frac{TP}{TP+FN}$

where the true positives TP are the cases correctly predicted as positive and the false negatives FN are the cases predicted as negative that are actually positive.
The false-positive rate FPR, or fall-out, is defined as

$FPR=\displaystyle\frac{FP}{FP+TN}$

where the false positives FP are the cases predicted as positive that are actually negative and the true negatives TN are the cases correctly predicted as negative.
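
Once a threshold is fixed, both rates can be computed directly from the labels and the predicted probabilities. The following is a minimal sketch in plain Python; the function rates_at_threshold and the six sample records are hypothetical and serve only to make the two formulas concrete.

```python
def rates_at_threshold(labels, probabilities, threshold):
    """Return (TPR, FPR) when class 1 is predicted for p >= threshold."""
    tp = fp = tn = fn = 0
    for label, p in zip(labels, probabilities):
        predicted_positive = p >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)

labels = [1, 0, 1, 0, 0, 1]                      # hypothetical outcomes
probabilities = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3]   # hypothetical predictions

tpr_low, fpr_low = rates_at_threshold(labels, probabilities, 0.5)
tpr_high, fpr_high = rates_at_threshold(labels, probabilities, 0.75)
print(tpr_low, fpr_low)
print(tpr_high, fpr_high)
```

Raising the threshold lowers both rates at once; each threshold contributes one point (FPR, TPR) to the ROC curve.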

from sklearn.metrics import roc_curve
from matplotlib import pyplot as plt

# ROC points of each model from the class-1 probabilities prob_v and prob_e
# that were computed for the histogram
fpr_v, tpr_v, _ = roc_curve(y_eval_v, prob_v)
fpr_e, tpr_e, _ = roc_curve(y_eval_e, prob_e)
plt.plot(fpr_v,tpr_v,label='vote')
plt.plot(fpr_e,tpr_e,label='college')
plt.title('ROC curve')
plt.xlabel('false positive rate')
plt.ylabel('true positive rate')
plt.legend(loc='lower right')
plt.xlim(0,)
plt.ylim(0,)
plt.show()


The plot of the true-positive rate against the false-positive rate is called a ROC (Receiver Operating Characteristic) curve.
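
One common way to turn a ROC curve into a concrete threshold is to pick the point where TPR minus FPR is largest (Youden's J statistic). This is offered as one possible choice, not as the method used in the notebook; the functions roc_points and best_threshold and the sample data are hypothetical.

```python
def roc_points(labels, probabilities):
    """Return (threshold, FPR, TPR) for every distinct probability."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(probabilities), reverse=True):
        tp = sum(1 for y, p in zip(labels, probabilities) if y == 1 and p >= threshold)
        fp = sum(1 for y, p in zip(labels, probabilities) if y == 0 and p >= threshold)
        points.append((threshold, fp / negatives, tp / positives))
    return points

def best_threshold(labels, probabilities):
    """Pick the threshold that maximizes TPR - FPR."""
    return max(roc_points(labels, probabilities), key=lambda pt: pt[2] - pt[1])[0]

labels = [1, 1, 1, 0, 1, 0]                            # hypothetical outcomes
probabilities = [0.95, 0.90, 0.85, 0.70, 0.60, 0.55]   # hypothetical predictions

chosen = best_threshold(labels, probabilities)
print(chosen)  # 0.85
```

In this example the best threshold lies well above 0.5, which is the situation to expect when a model assigns high probabilities to almost every record.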

ID:(13869, 0)



Loading and inspecting the data for prediction

Description

>Top


Load the scenarios into election_secnarios and show the first 6 rows with head:

# load and show 6 scenarios
election_secnarios = pd.read_csv('election-secnarios.csv')
election_secnarios.head(6)


ID:(13870, 0)



Prediction

Description

>Top


Predicted results for the scenarios contained in election_secnarios:

predictions_v = DNN_model_v.predict(
    input_fn=lambda:eval_input_fn(election_secnarios,labels=None,
    batch_size=batch_size))

predictions_e = DNN_model_e.predict(
    input_fn=lambda:eval_input_fn(election_secnarios,labels=None,
    batch_size=batch_size))

results_v = list(predictions_v)
results_e = list(predictions_e)

def describe_prediction(res,j):
    # predicted class (0 = challenger, 1 = incumbent) and its probability in percent
    class_id = int(res[j]['class_ids'][0])
    probability = int(res[j]['probabilities'][class_id] *100)

    if class_id == 0:
        return ('%s%% probability to %s' % (probability,'Challenger'))
    else:
        return ('%s%% probability to %s' % (probability,'Incumbent'))

print ('Predictions for the scenarios:')

# first value: popular-vote model, second value: Electoral College model
for i in range(0,6):    
    print (describe_prediction(results_v,i)+','+describe_prediction(results_e,i))

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmpuhex661k\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

INFO:tensorflow:Calling model_fn.

INFO:tensorflow:Done calling model_fn.

INFO:tensorflow:Graph was finalized.

INFO:tensorflow:Restoring parameters from C:\Users\KLAUSS~1\AppData\Local\Temp\tmp05qx1i4u\model.ckpt-4000

INFO:tensorflow:Running local_init_op.

INFO:tensorflow:Done running local_init_op.

Predictions for the scenarios:

68% probability to Incumbent,60% probability to Incumbent

80% probability to Incumbent,72% probability to Incumbent

86% probability to Incumbent,76% probability to Incumbent

90% probability to Incumbent,76% probability to Incumbent

89% probability to Incumbent,76% probability to Incumbent

93% probability to Incumbent,75% probability to Incumbent
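
How a percentage like the ones printed above comes about can be sketched without TensorFlow. The example assumes that the binary classifier turns a single logit into the probability of class 1 with a sigmoid, which is the usual construction for two-class estimators; the logit value is hypothetical.

```python
import math

def sigmoid(logit):
    """Map a real-valued logit to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))

logit = 0.75  # hypothetical model output for one scenario
p_incumbent = sigmoid(logit)
probabilities = [1.0 - p_incumbent, p_incumbent]

# the predicted class is the more probable one
class_id = max(range(2), key=lambda k: probabilities[k])
percent = int(probabilities[class_id] * 100)
print(class_id, percent)  # 1 67
```

Two consequences follow: the printed percentage belongs to the predicted class and is therefore never below 50, and int() truncates rather than rounds, so a probability of 0.679 is reported as 67%.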

ID:(13871, 0)