To catch a protagonist in DraCor#
by Ingo Börner
In the paper To Catch a Protagonist: Quantitative Dominance Relations in German-Language Drama (1730-1930) (Fischer et al. 2018), an algorithm is described that identifies the quantitatively dominant characters of a play based on a set of network-based and count-based measures:
In order to systematically describe the extent of this deviation, we calculate eight values for each character of the 465 dramas of our corpus, three count-based measures (number of scenes a character appears in, number of speech acts, number of spoken words) and five network-related measures (degree, weighted degree, betweenness centrality, closeness centrality, eigenvector centrality). For each measurement a ranking is created. The rankings are then merged into two meta-rankings: one count-based and one network-based. The two meta-rankings are then combined into an overall ranking.
The original algorithm was implemented in the tool Dramavis by Christopher Kittel. Dramavis operates on XML “zwischenformat” files created in the DLINA project.
The following notebook adapts the code of the respective modules to work with data returned by the DraCor API. The aim is to be able to recreate the *_chars.csv files that were used in the study. The data can be found in the repository on GitHub in the folder allmetrics.
The implementation will be tested with the play Emilia Galotti. The original algorithm operated on the corresponding LINA and produced the file 88_Emilia Galotti_chars.csv as output. In DraCor the play can be accessed here.
Step 1. Get the basic measures#
We need to get the following basic measures on characters:
- Network measures
  - betweenness
  - degree
  - closeness
  - ~~closeness corrected~~
  - weighted degree
  - eigenvector centrality
- Count-based measures
  - frequency/appearances
  - number of speech acts
  - number of words
Network and count-based metrics via Dracor API#
The package requests and the standard-library module json are used to query the API and parse the response:
# if not installed, uncomment the following line and run the cell:
# !pip install requests
import requests
import json
# set corpus and playname
corpusname = "ger"
playname = "lessing-emilia-galotti"
# base url of the DraCor-API
api_base = "https://dracor.org/api/"
To retrieve the network data and the speech amounts of the individual characters, the API endpoint /corpora/{corpusname}/play/{playname}/cast is used as follows:
# send a request to the endpoint and parse results
request_url = api_base + "corpora/" + corpusname + "/play/" + playname + "/cast"
r = requests.get(request_url)
character_data = json.loads(r.text)
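The request above assumes that the API answers successfully. As a minimal sketch, the same call could be wrapped in a small function that fails loudly on HTTP errors. The names `build_cast_url` and `fetch_cast` and the `timeout` value are illustrative assumptions; they are not part of DraCor or Dramavis.

```python
def build_cast_url(api_base, corpusname, playname):
    """Assemble the URL of the cast endpoint used in this notebook."""
    return f"{api_base.rstrip('/')}/corpora/{corpusname}/play/{playname}/cast"


def fetch_cast(api_base, corpusname, playname, timeout=30):
    """Request the cast data and return the parsed JSON (a list of dicts)."""
    # imported here so that build_cast_url can be used without the dependency
    import requests

    url = build_cast_url(api_base, corpusname, playname)
    response = requests.get(url, timeout=timeout)
    # raise an exception for 4xx/5xx answers instead of parsing an error page
    response.raise_for_status()
    return response.json()


# the URL that fetch_cast would request for Emilia Galotti
cast_url = build_cast_url("https://dracor.org/api/", "ger", "lessing-emilia-galotti")
print(cast_url)
```

Calling `fetch_cast(api_base, corpusname, playname)` would then yield the same list as `character_data`, provided the endpoint is reachable.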
The API function returns data on the characters, including the network and count-based metrics:
character_data
[{'id': 'der_prinz',
'name': 'Der Prinz',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 17,
'numOfSpeechActs': 157,
'numOfWords': 4002,
'degree': 8,
'weightedDegree': 20,
'closeness': 0.75,
'betweenness': 0.46717171717171724,
'eigenvector': 0.32076106311648156},
{'id': 'der_kammerdiener',
'name': 'Der Kammerdiener',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 2,
'numOfSpeechActs': 6,
'numOfWords': 33,
'degree': 1,
'weightedDegree': 2,
'closeness': 0.4444444444444444,
'betweenness': 0,
'eigenvector': 0.05575792046031641},
{'id': 'conti',
'name': 'Conti',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 2,
'numOfSpeechActs': 24,
'numOfWords': 604,
'degree': 1,
'weightedDegree': 2,
'closeness': 0.4444444444444444,
'betweenness': 0,
'eigenvector': 0.05575792046031641},
{'id': 'marinelli',
'name': 'Marinelli',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 19,
'numOfSpeechActs': 221,
'numOfWords': 4343,
'degree': 9,
'weightedDegree': 30,
'closeness': 0.8,
'betweenness': 0.24696969696969698,
'eigenvector': 0.4489846359321899},
{'id': 'camillo_rota',
'name': 'Camillo Rota',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 1,
'numOfSpeechActs': 6,
'numOfWords': 78,
'degree': 1,
'weightedDegree': 1,
'closeness': 0.4444444444444444,
'betweenness': 0,
'eigenvector': 0.05575792046031641},
{'id': 'claudia',
'name': 'Claudia',
'isGroup': False,
'gender': 'FEMALE',
'numOfScenes': 13,
'numOfSpeechActs': 73,
'numOfWords': 1581,
'degree': 7,
'weightedDegree': 19,
'closeness': 0.6,
'betweenness': 0.04545454545454544,
'eigenvector': 0.38292603187412266},
{'id': 'pirro',
'name': 'Pirro',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 4,
'numOfSpeechActs': 25,
'numOfWords': 263,
'degree': 5,
'weightedDegree': 7,
'closeness': 0.5454545454545454,
'betweenness': 0.026515151515151516,
'eigenvector': 0.2719436343371554},
{'id': 'odoardo',
'name': 'Odoardo',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 12,
'numOfSpeechActs': 108,
'numOfWords': 2441,
'degree': 6,
'weightedDegree': 15,
'closeness': 0.6666666666666666,
'betweenness': 0.05505050505050505,
'eigenvector': 0.3542503929627511},
{'id': 'angelo',
'name': 'Angelo',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 2,
'numOfSpeechActs': 28,
'numOfWords': 487,
'degree': 2,
'weightedDegree': 2,
'closeness': 0.48,
'betweenness': 0,
'eigenvector': 0.1253177208861109},
{'id': 'emilia',
'name': 'Emilia',
'isGroup': False,
'gender': 'FEMALE',
'numOfScenes': 7,
'numOfSpeechActs': 64,
'numOfWords': 1702,
'degree': 6,
'weightedDegree': 13,
'closeness': 0.6666666666666666,
'betweenness': 0.05505050505050505,
'eigenvector': 0.3513647060457318},
{'id': 'appiani',
'name': 'Appiani',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 5,
'numOfSpeechActs': 48,
'numOfWords': 852,
'degree': 4,
'weightedDegree': 8,
'closeness': 0.5217391304347826,
'betweenness': 0.003787878787878788,
'eigenvector': 0.2529584931569895},
{'id': 'battista',
'name': 'Battista',
'isGroup': False,
'gender': 'MALE',
'numOfScenes': 4,
'numOfSpeechActs': 11,
'numOfWords': 152,
'degree': 4,
'weightedDegree': 7,
'closeness': 0.6,
'betweenness': 0.012121212121212121,
'eigenvector': 0.26144507771860326},
{'id': 'orsina',
'name': 'Orsina',
'isGroup': False,
'gender': 'FEMALE',
'numOfScenes': 6,
'numOfSpeechActs': 64,
'numOfWords': 2111,
'degree': 4,
'weightedDegree': 8,
'closeness': 0.6,
'betweenness': 0.012121212121212121,
'eigenvector': 0.2619466686163178}]
Each character is represented by a dictionary, for example Marinelli:
{'betweenness': 0.24696969696969698,
'closeness': 0.8,
'degree': 9,
'eigenvector': 0.44898463593218985,
'gender': 'MALE',
'id': 'marinelli',
'isGroup': False,
'name': 'Marinelli',
'numOfScenes': 19,
'numOfSpeechActs': 221,
'numOfWords': 4343,
'weightedDegree': 30}
We do not yet know anything about the network metrics of the play as a whole, though. To retrieve this information, we would have to use the API endpoint /corpora/{corpusname}/play/{playname}/metrics. Its response also tells us, in the field numConnectedComponents, whether the network consists of several sub-networks. This could be relevant because some network metrics, e.g. closeness, can also be calculated differently.
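To illustrate why the number of connected components matters for closeness, here is a minimal sketch in plain Python. It compares closeness computed only within a node's own component with a variant that is additionally scaled by the share of reachable nodes (often attributed to Wasserman and Faust). The toy network and the function names are hypothetical, and it is an assumption that this scaling is a useful comparison here: the text does not say how DraCor or Dramavis (closeness_corrected) handle disconnected networks.

```python
from collections import deque


def bfs_distances(graph, source):
    """Return shortest-path lengths from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def closeness(graph, node, scale_by_component=False):
    """Closeness of node, computed within its connected component.

    With scale_by_component=True the value is multiplied by the share of
    the other nodes that are reachable from node.
    """
    distances = bfs_distances(graph, node)
    reachable = len(distances) - 1  # reachable nodes, excluding node itself
    total = sum(distances.values())
    if total == 0:
        return 0.0  # isolated node
    value = reachable / total
    if scale_by_component:
        value *= reachable / (len(graph) - 1)
    return value


# hypothetical toy network with two components: a-b-c and d-e
toy_graph = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b"},
    "d": {"e"},
    "e": {"d"},
}

plain = closeness(toy_graph, "d")
scaled = closeness(toy_graph, "d", scale_by_component=True)
print(plain, scaled)
```

In the unscaled variant a character in a tiny sub-network can reach the maximum closeness, which would distort a ranking; the scaled variant penalises small components.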
Preparation: Get the metrics and construct a pandas data frame#
In the Dramavis implementation an object of the class DramaAnalyzer is created, which contains the information on characters in a pandas data frame. We will create the same data structure to be able to use the same methods for calculating means and ranking.
The columns of the table are:
name,betweenness,degree,closeness,closeness_corrected,strength,eigenvector_centrality,avg_distance,avg_distance_corrected,frequency,speech_acts,words,lines,chars ...
We will not include all columns, but only the ones that are relevant for the rankings:
name,betweenness,degree,closeness,~~closeness_corrected~~,strength,eigenvector_centrality,~~avg_distance,avg_distance_corrected~~,frequency,speech_acts,words,~~lines,chars~~ …
The following columns are named differently to follow the conventions of the DraCor API output:
- name → id (later this will be used to construct URIs)
- strength → weightedDegree
- eigenvector_centrality → eigenvector
- frequency → numOfScenes
- speech_acts → numOfSpeechActs
- words → numOfWords
The package pandas is used to handle the data as a dataframe. Therefore we need to import the package.
# if not installed, uncomment the following line and run the cell:
# !pip install pandas
import pandas as pd
First, we need to transform the parsed JSON API response into a list of lists, which is then turned into the data frame df.
# columns
cols = ["id","betweenness","degree","closeness","weightedDegree","eigenvector","numOfScenes","numOfSpeechActs","numOfWords"]
# prepare the data for the data frame
df_data = []
for character in character_data:
    row = []
    for key in cols:
        row.append(character[key])
    df_data.append(row)
# construct the data frame
df = pd.DataFrame(df_data, columns = cols)
#turn the column "id" to the index
df = df.set_index('id')
#output
df
| id | betweenness | degree | closeness | weightedDegree | eigenvector | numOfScenes | numOfSpeechActs | numOfWords |
|---|---|---|---|---|---|---|---|---|
| der_prinz | 0.467172 | 8 | 0.750000 | 20 | 0.320761 | 17 | 157 | 4002 |
| der_kammerdiener | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 6 | 33 |
| conti | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 24 | 604 |
| marinelli | 0.246970 | 9 | 0.800000 | 30 | 0.448985 | 19 | 221 | 4343 |
| camillo_rota | 0.000000 | 1 | 0.444444 | 1 | 0.055758 | 1 | 6 | 78 |
| claudia | 0.045455 | 7 | 0.600000 | 19 | 0.382926 | 13 | 73 | 1581 |
| pirro | 0.026515 | 5 | 0.545455 | 7 | 0.271944 | 4 | 25 | 263 |
| odoardo | 0.055051 | 6 | 0.666667 | 15 | 0.354250 | 12 | 108 | 2441 |
| angelo | 0.000000 | 2 | 0.480000 | 2 | 0.125318 | 2 | 28 | 487 |
| emilia | 0.055051 | 6 | 0.666667 | 13 | 0.351365 | 7 | 64 | 1702 |
| appiani | 0.003788 | 4 | 0.521739 | 8 | 0.252958 | 5 | 48 | 852 |
| battista | 0.012121 | 4 | 0.600000 | 7 | 0.261445 | 4 | 11 | 152 |
| orsina | 0.012121 | 4 | 0.600000 | 8 | 0.261947 | 6 | 64 | 2111 |
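As a possible shortcut, pandas can also build the data frame directly from the list of dictionaries, so the intermediate list of lists is not strictly needed. The sketch below uses two records copied from the API response shown above; `sample_character_data` and `sample_df` are illustrative names.

```python
import pandas as pd

# two records copied from the API response shown above
sample_character_data = [
    {"id": "der_prinz", "name": "Der Prinz", "isGroup": False, "gender": "MALE",
     "numOfScenes": 17, "numOfSpeechActs": 157, "numOfWords": 4002,
     "degree": 8, "weightedDegree": 20, "closeness": 0.75,
     "betweenness": 0.46717171717171724, "eigenvector": 0.32076106311648156},
    {"id": "marinelli", "name": "Marinelli", "isGroup": False, "gender": "MALE",
     "numOfScenes": 19, "numOfSpeechActs": 221, "numOfWords": 4343,
     "degree": 9, "weightedDegree": 30, "closeness": 0.8,
     "betweenness": 0.24696969696969698, "eigenvector": 0.4489846359321899},
]

cols = ["id", "betweenness", "degree", "closeness", "weightedDegree",
        "eigenvector", "numOfScenes", "numOfSpeechActs", "numOfWords"]

# build the frame from the dictionaries, keep the relevant columns in the
# desired order, and use the character id as index
sample_df = pd.DataFrame(sample_character_data)[cols].set_index("id")
print(sample_df)
```

Applied to the full `character_data`, the same expression should give a frame equivalent to `df`.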
We can now query the data, e.g. output the values of a single character by requesting a row by its index value, which is the id of the character.
# get the values of a single character
df.loc["marinelli"]
betweenness 0.246970
degree 9.000000
closeness 0.800000
weightedDegree 30.000000
eigenvector 0.448985
numOfScenes 19.000000
numOfSpeechActs 221.000000
numOfWords 4343.000000
Name: marinelli, dtype: float64
Step 2. Calculate the ranks#
In Dramavis the function get_character_ranks creates the rankings of the count-based and network-based measures. We will adapt this function to operate on the created data frame and rename the columns:
metrics_to_rank = ['degree', 'closeness', 'betweenness', 'weightedDegree', 'eigenvector', 'numOfScenes', 'numOfSpeechActs', 'numOfWords']
for metric in metrics_to_rank:
    df[metric + "_rank"] = df[metric].rank(method='min', ascending=False)
df
| id | betweenness | degree | closeness | weightedDegree | eigenvector | numOfScenes | numOfSpeechActs | numOfWords | degree_rank | closeness_rank | betweenness_rank | weightedDegree_rank | eigenvector_rank | numOfScenes_rank | numOfSpeechActs_rank | numOfWords_rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| der_prinz | 0.467172 | 8 | 0.750000 | 20 | 0.320761 | 17 | 157 | 4002 | 2.0 | 2.0 | 1.0 | 2.0 | 5.0 | 2.0 | 2.0 | 2.0 |
| der_kammerdiener | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 6 | 33 | 11.0 | 11.0 | 10.0 | 10.0 | 11.0 | 10.0 | 12.0 | 13.0 |
| conti | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 24 | 604 | 11.0 | 11.0 | 10.0 | 10.0 | 11.0 | 10.0 | 10.0 | 8.0 |
| marinelli | 0.246970 | 9 | 0.800000 | 30 | 0.448985 | 19 | 221 | 4343 | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| camillo_rota | 0.000000 | 1 | 0.444444 | 1 | 0.055758 | 1 | 6 | 78 | 11.0 | 11.0 | 10.0 | 13.0 | 11.0 | 13.0 | 12.0 | 12.0 |
| claudia | 0.045455 | 7 | 0.600000 | 19 | 0.382926 | 13 | 73 | 1581 | 3.0 | 5.0 | 5.0 | 3.0 | 2.0 | 3.0 | 4.0 | 6.0 |
| pirro | 0.026515 | 5 | 0.545455 | 7 | 0.271944 | 4 | 25 | 263 | 6.0 | 8.0 | 6.0 | 8.0 | 6.0 | 8.0 | 9.0 | 10.0 |
| odoardo | 0.055051 | 6 | 0.666667 | 15 | 0.354250 | 12 | 108 | 2441 | 4.0 | 3.0 | 3.0 | 4.0 | 3.0 | 4.0 | 3.0 | 3.0 |
| angelo | 0.000000 | 2 | 0.480000 | 2 | 0.125318 | 2 | 28 | 487 | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | 8.0 | 9.0 |
| emilia | 0.055051 | 6 | 0.666667 | 13 | 0.351365 | 7 | 64 | 1702 | 4.0 | 3.0 | 3.0 | 5.0 | 4.0 | 5.0 | 5.0 | 5.0 |
| appiani | 0.003788 | 4 | 0.521739 | 8 | 0.252958 | 5 | 48 | 852 | 7.0 | 9.0 | 9.0 | 6.0 | 9.0 | 7.0 | 7.0 | 7.0 |
| battista | 0.012121 | 4 | 0.600000 | 7 | 0.261445 | 4 | 11 | 152 | 7.0 | 5.0 | 7.0 | 8.0 | 8.0 | 8.0 | 11.0 | 11.0 |
| orsina | 0.012121 | 4 | 0.600000 | 8 | 0.261947 | 6 | 64 | 2111 | 7.0 | 5.0 | 7.0 | 6.0 | 7.0 | 6.0 | 5.0 | 4.0 |
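To make the effect of `rank(method='min', ascending=False)` explicit, the following sketch recomputes the degree ranking in plain Python: the highest value gets rank 1, and tied characters share the lowest rank of their group, so the next rank after a tie is skipped. The function name `min_rank_descending` is illustrative; the degree values are taken from the table above. Unlike pandas, the sketch returns integers rather than floats.

```python
def min_rank_descending(values):
    """Rank values so that the largest gets rank 1; ties share the lowest rank."""
    return {
        key: 1 + sum(other > value for other in values.values())
        for key, value in values.items()
    }


# degree values from the table above
degree = {
    "der_prinz": 8, "der_kammerdiener": 1, "conti": 1, "marinelli": 9,
    "camillo_rota": 1, "claudia": 7, "pirro": 5, "odoardo": 6,
    "angelo": 2, "emilia": 6, "appiani": 4, "battista": 4, "orsina": 4,
}

degree_rank = min_rank_descending(degree)
# appiani, battista and orsina tie, so the rank following them is 10
print(degree_rank["appiani"], degree_rank["angelo"])
```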
Step 3. Rank on average and standard deviation of the individual rankings#
In Dramavis the individual rankings are then used for the calculation of an average ranking and the standard deviation, which are then also ranked. This is done by the function get_centrality_ranks.
The following columns will be added to the data frame:
(1) centrality_rank_avg: the average of all rankings
(2) centrality_rank_std: the standard deviation of the rankings (in the code below, this value is additionally divided by the number of rankings)
(3) centrality_rank_avg_rank: a ranking created from the average of all rankings (1)
(4) centrality_rank_std_rank: a ranking created from the standard deviation of all rankings (2)
The following Dramavis code is adapted accordingly to operate on the data frame:
ranks = [c for c in df.columns if c.endswith("rank")]
df['centrality_rank_avg'] = df[ranks].sum(axis=1)/len(ranks)
df['centrality_rank_std'] = df[ranks].std(axis=1)/len(ranks)
for metric in ['centrality_rank_avg', 'centrality_rank_std']:
    df[metric + "_rank"] = df[metric].rank(method='min', ascending=True)
df
| id | betweenness | degree | closeness | weightedDegree | eigenvector | numOfScenes | numOfSpeechActs | numOfWords | degree_rank | closeness_rank | betweenness_rank | weightedDegree_rank | eigenvector_rank | numOfScenes_rank | numOfSpeechActs_rank | numOfWords_rank | centrality_rank_avg | centrality_rank_std | centrality_rank_avg_rank | centrality_rank_std_rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| der_prinz | 0.467172 | 8 | 0.750000 | 20 | 0.320761 | 17 | 157 | 4002 | 2.0 | 2.0 | 1.0 | 2.0 | 5.0 | 2.0 | 2.0 | 2.0 | 2.250 | 0.145621 | 2.0 | 9.0 |
| der_kammerdiener | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 6 | 33 | 11.0 | 11.0 | 10.0 | 10.0 | 11.0 | 10.0 | 12.0 | 13.0 | 11.000 | 0.133631 | 12.0 | 7.0 |
| conti | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 24 | 604 | 11.0 | 11.0 | 10.0 | 10.0 | 11.0 | 10.0 | 10.0 | 8.0 | 10.125 | 0.123879 | 11.0 | 5.0 |
| marinelli | 0.246970 | 9 | 0.800000 | 30 | 0.448985 | 19 | 221 | 4343 | 1.0 | 1.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.125 | 0.044194 | 1.0 | 1.0 |
| camillo_rota | 0.000000 | 1 | 0.444444 | 1 | 0.055758 | 1 | 6 | 78 | 11.0 | 11.0 | 10.0 | 13.0 | 11.0 | 13.0 | 12.0 | 12.0 | 11.625 | 0.132583 | 13.0 | 6.0 |
| claudia | 0.045455 | 7 | 0.600000 | 19 | 0.382926 | 13 | 73 | 1581 | 3.0 | 5.0 | 5.0 | 3.0 | 2.0 | 3.0 | 4.0 | 6.0 | 3.875 | 0.169525 | 4.0 | 11.0 |
| pirro | 0.026515 | 5 | 0.545455 | 7 | 0.271944 | 4 | 25 | 263 | 6.0 | 8.0 | 6.0 | 8.0 | 6.0 | 8.0 | 9.0 | 10.0 | 7.625 | 0.188243 | 7.0 | 12.0 |
| odoardo | 0.055051 | 6 | 0.666667 | 15 | 0.354250 | 12 | 108 | 2441 | 4.0 | 3.0 | 3.0 | 4.0 | 3.0 | 4.0 | 3.0 | 3.0 | 3.375 | 0.064694 | 3.0 | 2.0 |
| angelo | 0.000000 | 2 | 0.480000 | 2 | 0.125318 | 2 | 28 | 487 | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | 10.0 | 8.0 | 9.0 | 9.625 | 0.093003 | 10.0 | 3.0 |
| emilia | 0.055051 | 6 | 0.666667 | 13 | 0.351365 | 7 | 64 | 1702 | 4.0 | 3.0 | 3.0 | 5.0 | 4.0 | 5.0 | 5.0 | 5.0 | 4.250 | 0.110801 | 5.0 | 4.0 |
| appiani | 0.003788 | 4 | 0.521739 | 8 | 0.252958 | 5 | 48 | 852 | 7.0 | 9.0 | 9.0 | 6.0 | 9.0 | 7.0 | 7.0 | 7.0 | 7.625 | 0.148467 | 7.0 | 10.0 |
| battista | 0.012121 | 4 | 0.600000 | 7 | 0.261445 | 4 | 11 | 152 | 7.0 | 5.0 | 7.0 | 8.0 | 8.0 | 8.0 | 11.0 | 11.0 | 8.125 | 0.253876 | 9.0 | 13.0 |
| orsina | 0.012121 | 4 | 0.600000 | 8 | 0.261947 | 6 | 64 | 2111 | 7.0 | 5.0 | 7.0 | 6.0 | 7.0 | 6.0 | 5.0 | 4.0 | 5.875 | 0.140749 | 6.0 | 8.0 |
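As a quick plausibility check, the two aggregated values can be recomputed by hand for a single character. The sketch below uses the eight individual ranks of Der Prinz from the table and only the standard library; it assumes that pandas' `std` computes the sample standard deviation (its default, `ddof=1`), which corresponds to `statistics.stdev`.

```python
import statistics

# the eight individual ranks of Der Prinz, read from the table above
# (degree, closeness, betweenness, weightedDegree, eigenvector,
#  numOfScenes, numOfSpeechActs, numOfWords)
der_prinz_ranks = [2, 2, 1, 2, 5, 2, 2, 2]

# average of the rankings
rank_avg = sum(der_prinz_ranks) / len(der_prinz_ranks)

# sample standard deviation, divided by the number of rankings
# exactly as in the adapted code above
rank_std = statistics.stdev(der_prinz_ranks) / len(der_prinz_ranks)

print(rank_avg, round(rank_std, 6))
```

Both values should agree with the row for der_prinz in the columns centrality_rank_avg and centrality_rank_std.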
Based on the calculation of centrality_rank_avg_rank, the “central” characters can already be queried as follows:
df[df["centrality_rank_avg_rank"] == 1].index.tolist()
['marinelli']
Additional Step: Create Rankings and combined rankings of network-based and count-based metrics separately#
In addition to a ranking that combines all metrics and rankings derived thereof, the function get_structural_ranking_measures treats network-based and count-based values separately and only then aggregates them to a combined overall ranking.
The function adds the following columns to the data frame:
(1) avg_graph_rank: a ranking based on the rankings of the network values (degree, closeness, betweenness, strength (here weightedDegree) and eigenvector centrality (here eigenvector))
(2) avg_content_rank: a ranking based on the rankings of the count-based values (frequency (here numOfScenes), speech acts and words)
(3) overall_avg: the mean of the two rankings (1) and (2)
(4) overall_avg_rank: a ranking created from the overall average (3)
The following code is adapted accordingly to operate on the dataframe. The ranking stability measures are not implemented here.
#renamed the columns to match the DraCor values here:
graph_ranks = ['degree_rank', 'closeness_rank', 'betweenness_rank', 'weightedDegree_rank', 'eigenvector_rank']
content_ranks = ['numOfScenes_rank', 'numOfSpeechActs_rank', 'numOfWords_rank']
avg_graph_rank = df[graph_ranks].mean(axis=1).rank(method='min')
avg_content_rank = df[content_ranks].mean(axis=1).rank(method='min')
df["avg_graph_rank"] = avg_graph_rank
df["avg_content_rank"] = avg_content_rank
df["overall_avg"] = df[["avg_graph_rank", "avg_content_rank"]].mean(axis=1)
df["overall_avg_rank"] = df["overall_avg"].rank(method='min')
df
| id | betweenness | degree | closeness | weightedDegree | eigenvector | numOfScenes | numOfSpeechActs | numOfWords | degree_rank | closeness_rank | ... | numOfSpeechActs_rank | numOfWords_rank | centrality_rank_avg | centrality_rank_std | centrality_rank_avg_rank | centrality_rank_std_rank | avg_graph_rank | avg_content_rank | overall_avg | overall_avg_rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| der_prinz | 0.467172 | 8 | 0.750000 | 20 | 0.320761 | 17 | 157 | 4002 | 2.0 | 2.0 | ... | 2.0 | 2.0 | 2.250 | 0.145621 | 2.0 | 9.0 | 2.0 | 2.0 | 2.0 | 2.0 |
| der_kammerdiener | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 6 | 33 | 11.0 | 11.0 | ... | 12.0 | 13.0 | 11.000 | 0.133631 | 12.0 | 7.0 | 11.0 | 12.0 | 11.5 | 12.0 |
| conti | 0.000000 | 1 | 0.444444 | 2 | 0.055758 | 2 | 24 | 604 | 11.0 | 11.0 | ... | 10.0 | 8.0 | 10.125 | 0.123879 | 11.0 | 5.0 | 11.0 | 10.0 | 10.5 | 11.0 |
| marinelli | 0.246970 | 9 | 0.800000 | 30 | 0.448985 | 19 | 221 | 4343 | 1.0 | 1.0 | ... | 1.0 | 1.0 | 1.125 | 0.044194 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| camillo_rota | 0.000000 | 1 | 0.444444 | 1 | 0.055758 | 1 | 6 | 78 | 11.0 | 11.0 | ... | 12.0 | 12.0 | 11.625 | 0.132583 | 13.0 | 6.0 | 13.0 | 13.0 | 13.0 | 13.0 |
| claudia | 0.045455 | 7 | 0.600000 | 19 | 0.382926 | 13 | 73 | 1581 | 3.0 | 5.0 | ... | 4.0 | 6.0 | 3.875 | 0.169525 | 4.0 | 11.0 | 4.0 | 4.0 | 4.0 | 4.0 |
| pirro | 0.026515 | 5 | 0.545455 | 7 | 0.271944 | 4 | 25 | 263 | 6.0 | 8.0 | ... | 9.0 | 10.0 | 7.625 | 0.188243 | 7.0 | 12.0 | 7.0 | 8.0 | 7.5 | 7.0 |
| odoardo | 0.055051 | 6 | 0.666667 | 15 | 0.354250 | 12 | 108 | 2441 | 4.0 | 3.0 | ... | 3.0 | 3.0 | 3.375 | 0.064694 | 3.0 | 2.0 | 3.0 | 3.0 | 3.0 | 3.0 |
| angelo | 0.000000 | 2 | 0.480000 | 2 | 0.125318 | 2 | 28 | 487 | 10.0 | 10.0 | ... | 8.0 | 9.0 | 9.625 | 0.093003 | 10.0 | 3.0 | 10.0 | 8.0 | 9.0 | 9.0 |
| emilia | 0.055051 | 6 | 0.666667 | 13 | 0.351365 | 7 | 64 | 1702 | 4.0 | 3.0 | ... | 5.0 | 5.0 | 4.250 | 0.110801 | 5.0 | 4.0 | 5.0 | 5.0 | 5.0 | 5.0 |
| appiani | 0.003788 | 4 | 0.521739 | 8 | 0.252958 | 5 | 48 | 852 | 7.0 | 9.0 | ... | 7.0 | 7.0 | 7.625 | 0.148467 | 7.0 | 10.0 | 9.0 | 7.0 | 8.0 | 8.0 |
| battista | 0.012121 | 4 | 0.600000 | 7 | 0.261445 | 4 | 11 | 152 | 7.0 | 5.0 | ... | 11.0 | 11.0 | 8.125 | 0.253876 | 9.0 | 13.0 | 8.0 | 11.0 | 9.5 | 10.0 |
| orsina | 0.012121 | 4 | 0.600000 | 8 | 0.261947 | 6 | 64 | 2111 | 7.0 | 5.0 | ... | 5.0 | 4.0 | 5.875 | 0.140749 | 6.0 | 8.0 | 6.0 | 5.0 | 5.5 | 6.0 |
13 rows × 24 columns
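The two-stage aggregation can be traced by hand for the count-based side. The sketch below takes the three count-based ranks of every character from the Step 2 table, averages them, and ranks the averages in ascending order with ties sharing the lowest rank, which mirrors `mean(axis=1).rank(method='min')`. The helper `min_rank_ascending` is an illustrative name, not a Dramavis function.

```python
def min_rank_ascending(values):
    """Rank values so that the smallest gets rank 1; ties share the lowest rank."""
    return {
        key: 1 + sum(other < value for other in values.values())
        for key, value in values.items()
    }


# (numOfScenes_rank, numOfSpeechActs_rank, numOfWords_rank) per character,
# read from the Step 2 table
content_ranks = {
    "der_prinz": (2, 2, 2), "der_kammerdiener": (10, 12, 13),
    "conti": (10, 10, 8), "marinelli": (1, 1, 1),
    "camillo_rota": (13, 12, 12), "claudia": (3, 4, 6),
    "pirro": (8, 9, 10), "odoardo": (4, 3, 3),
    "angelo": (10, 8, 9), "emilia": (5, 5, 5),
    "appiani": (7, 7, 7), "battista": (8, 11, 11),
    "orsina": (6, 5, 4),
}

# stage 1: mean of the individual count-based ranks
mean_content_rank = {
    name: sum(ranks) / len(ranks) for name, ranks in content_ranks.items()
}
# stage 2: rank of that mean across all characters
avg_content_rank = min_rank_ascending(mean_content_rank)

print(avg_content_rank["emilia"], avg_content_rank["orsina"])
```

Emilia and Orsina end up with the same avg_content_rank because their mean ranks coincide, even though their individual ranks differ; this is the kind of information that is flattened by the meta-ranking.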
Based on the calculation of overall_avg_rank, the “central” characters can be queried as follows:
df[df["overall_avg_rank"] == 1].index.tolist()
['marinelli']
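Filtering for rank 1 returns a list, so a tie for the first place would yield several characters. If more than the top position is of interest, the ranking can also be sorted. The following sketch rebuilds the overall_avg_rank column from the table above as a pandas Series (the variable names are illustrative) and selects the three best-ranked characters.

```python
import pandas as pd

# overall_avg_rank values copied from the table above
overall_avg_rank = pd.Series(
    {
        "der_prinz": 2.0, "der_kammerdiener": 12.0, "conti": 11.0,
        "marinelli": 1.0, "camillo_rota": 13.0, "claudia": 4.0,
        "pirro": 7.0, "odoardo": 3.0, "angelo": 9.0, "emilia": 5.0,
        "appiani": 8.0, "battista": 10.0, "orsina": 6.0,
    },
    name="overall_avg_rank",
)

# the three characters with the smallest (best) overall rank
top_three = overall_avg_rank.nsmallest(3).index.tolist()
print(top_three)
```

On the full data frame the equivalent call would be `df["overall_avg_rank"].nsmallest(3)`.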