Words in a Play
installing dependencies
Installing spaCy (a Python library for natural language processing) and its dependencies can be a bit tedious, but it should work as follows:
!pip install spacy pydracor
# !pip install spacy-transformers # first check if really required → has many dependencies
Requirement already satisfied: spacy in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (3.7.4)
Collecting pydracor
Using cached pydracor-2.0.0-py3-none-any.whl.metadata (8.0 kB)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (8.2.3)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (1.1.2)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (0.3.4)
Requirement already satisfied: typer<0.10.0,>=0.3.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (0.9.0)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (6.4.0)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (4.66.2)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (2.31.0)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (2.6.3)
Requirement already satisfied: jinja2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (3.1.3)
Requirement already satisfied: setuptools in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (65.5.0)
Requirement already satisfied: packaging>=20.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (24.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (3.3.0)
Requirement already satisfied: numpy>=1.19.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy) (1.26.4)
Requirement already satisfied: annotated-types>=0.4.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (0.6.0)
Requirement already satisfied: pydantic-core==2.16.3 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (2.16.3)
Requirement already satisfied: typing-extensions>=4.6.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy) (4.10.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy) (2024.2.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from thinc<8.3.0,>=8.2.2->spacy) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from thinc<8.3.0,>=8.2.2->spacy) (0.1.4)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from typer<0.10.0,>=0.3.0->spacy) (8.1.7)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from weasel<0.4.0,>=0.1.0->spacy) (0.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from jinja2->spacy) (2.1.5)
Using cached pydracor-2.0.0-py3-none-any.whl (19 kB)
Installing collected packages: pydracor
Successfully installed pydracor-2.0.0
Afterwards, download a language model (see the hints on selecting a model):
!python -m spacy download en_core_web_sm
Collecting en-core-web-sm==3.7.1
Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
Requirement already satisfied: spacy<3.8.0,>=3.7.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from en-core-web-sm==3.7.1) (3.7.4)
Requirement already satisfied: spacy-legacy<3.1.0,>=3.0.11 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.0.12)
Requirement already satisfied: spacy-loggers<2.0.0,>=1.0.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.0.5)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.0.10)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.0.8)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.0.9)
Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (8.2.3)
Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.1.2)
Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.4.8)
Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.0.10)
Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.3.4)
Requirement already satisfied: typer<0.10.0,>=0.3.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.9.0)
Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (6.4.0)
Requirement already satisfied: tqdm<5.0.0,>=4.38.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (4.66.2)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.31.0)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.6.3)
Requirement already satisfied: jinja2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.1.3)
Requirement already satisfied: setuptools in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (65.5.0)
Requirement already satisfied: packaging>=20.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (24.0)
Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.3.0)
Requirement already satisfied: numpy>=1.19.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (1.26.4)
Requirement already satisfied: annotated-types>=0.4.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.6.0)
Requirement already satisfied: pydantic-core==2.16.3 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.16.3)
Requirement already satisfied: typing-extensions>=4.6.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (4.10.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.3.2)
Requirement already satisfied: idna<4,>=2.5 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (3.6)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.2.1)
Requirement already satisfied: certifi>=2017.4.17 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from requests<3.0.0,>=2.13.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2024.2.2)
Requirement already satisfied: blis<0.8.0,>=0.7.8 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.7.11)
Requirement already satisfied: confection<1.0.0,>=0.0.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from thinc<8.3.0,>=8.2.2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.1.4)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from typer<0.10.0,>=0.3.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (8.1.7)
Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from weasel<0.4.0,>=0.1.0->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (0.16.0)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages (from jinja2->spacy<3.8.0,>=3.7.2->en-core-web-sm==3.7.1) (2.1.5)
Installing collected packages: en-core-web-sm
Successfully installed en-core-web-sm-3.7.1
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_sm')
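Because the model is installed as a regular Python package (it can be imported as `en_core_web_sm`), the download can be skipped when the package is already present. The following is a minimal sketch of that check; the helper names `model_installed` and `ensure_model` are hypothetical and not part of spaCy.

```python
import importlib.util
import subprocess
import sys


def model_installed(model_name):
    """Return True if a package with this name can be imported."""
    return importlib.util.find_spec(model_name) is not None


def ensure_model(model_name="en_core_web_sm"):
    """Download the spaCy model only if it is not installed yet."""
    if not model_installed(model_name):
        # Same command as in the cell above, run with the current interpreter
        subprocess.run(
            [sys.executable, "-m", "spacy", "download", model_name],
            check=True,
        )


# Usage (may download about 13 MB on the first run):
# ensure_model("en_core_web_sm")
```

Running the download through `sys.executable` keeps the model in the same environment as the notebook kernel.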
load play from DraCor
import pydracor
play = pydracor.Play(play_name = "a-midsummer-night-s-dream")
play.spoken_text()
---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
Cell In[3], line 3
1 import pydracor
----> 3 play = pydracor.Play(play_name = "a-midsummer-night-s-dream")
4 play.spoken_text()
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:1022, in Play.__init__(self, play_id, play_name, play_title)
1020 self.title = self.play_id_to_play_title()[self.id]
1021 elif play_name is not None:
-> 1022 play_names = list(self.play_id_to_play_name().values())
1023 assert play_name in play_names, f"No such play_name {play_name} in the corpora"
1024 if play_names.count(play_name) > 1:
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:390, in DraCor.play_id_to_play_name(self)
377 @lru_cache()
378 def play_id_to_play_name(self):
379 """Map play id to the play name.
380
381 Returns
(...)
387 }
388 """
--> 390 return self.play_id_to_field('name')
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:322, in DraCor.play_id_to_field(self, field)
309 @lru_cache()
310 def play_id_to_field(self, field):
311 """Map play id to the field value.
312
313 Returns
(...)
319 }
320 """
--> 322 return {
323 play['id']: play[field]
324 for corpus_name in self.corpora_names()
325 for play in Corpus(corpus_name).corpus_info()['plays']
326 }
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:325, in <dictcomp>(.0)
309 @lru_cache()
310 def play_id_to_field(self, field):
311 """Map play id to the field value.
312
313 Returns
(...)
319 }
320 """
322 return {
323 play['id']: play[field]
324 for corpus_name in self.corpora_names()
--> 325 for play in Corpus(corpus_name).corpus_info()['plays']
326 }
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:538, in Corpus.__init__(self, corpus_name)
524 def __init__(self, corpus_name):
525 """Set corpusname, title, repository url and number of plays attributes from corpus_info method.
526
527 Parameters
(...)
535 If there is no such corpus_name
536 """
--> 538 assert corpus_name in self.corpora_names(), f'No such corpusname "{corpus_name}"'
539 super().__init__()
540 self.corpus_name = corpus_name
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:186, in DraCor.corpora_names(self)
176 @lru_cache()
177 def corpora_names(self):
178 """Get all available corpora names.
179
180 Returns
(...)
183 ['cal', ...]
184 """
--> 186 return [corpus['name'] for corpus in self.corpora()]
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:174, in DraCor.corpora(self, include)
140 """List available corpora.
141
142 Get info about the corpora of Drama Corpus.
(...)
170 If include parameter is not equal either 'metrics' or ''
171 """
173 assert include in ['', 'metrics'], "Include parameter should be either 'metrics' or ''"
--> 174 return self.make_get_json_request(f"{self._base_url}/corpora/?include={include}")
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/pydracor/dracor.py:53, in DraCor.make_get_json_request(self, url)
35 def make_get_json_request(self, url):
36 """Base method to send GET request and retrieve json from response.
37
38 Parameters
(...)
50 Client or Server Error can be raised.
51 """
---> 53 response = requests.get(url)
54 response.raise_for_status()
55 result = self.transform_dict(response.json())
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/requests/api.py:73, in get(url, params, **kwargs)
62 def get(url, params=None, **kwargs):
63 r"""Sends a GET request.
64
65 :param url: URL for the new :class:`Request` object.
(...)
70 :rtype: requests.Response
71 """
---> 73 return request("get", url, params=params, **kwargs)
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/requests/api.py:59, in request(method, url, **kwargs)
55 # By using the 'with' statement we are sure the session is closed, thus we
56 # avoid leaving sockets open which can trigger a ResourceWarning in some
57 # cases, and look like a memory leak in others.
58 with sessions.Session() as session:
---> 59 return session.request(method=method, url=url, **kwargs)
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
584 send_kwargs = {
585 "timeout": timeout,
586 "allow_redirects": allow_redirects,
587 }
588 send_kwargs.update(settings)
--> 589 resp = self.send(prep, **send_kwargs)
591 return resp
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
700 start = preferred_clock()
702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
705 # Total elapsed time of the request (approximately)
706 elapsed = preferred_clock() - start
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/requests/adapters.py:486, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
483 timeout = TimeoutSauce(connect=timeout, read=timeout)
485 try:
--> 486 resp = conn.urlopen(
487 method=request.method,
488 url=url,
489 body=request.body,
490 headers=request.headers,
491 redirect=False,
492 assert_same_host=False,
493 preload_content=False,
494 decode_content=False,
495 retries=self.max_retries,
496 timeout=timeout,
497 chunked=chunked,
498 )
500 except (ProtocolError, OSError) as err:
501 raise ConnectionError(err, request=request)
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/connectionpool.py:793, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
790 response_conn = conn if not release_conn else None
792 # Make the request on the HTTPConnection object
--> 793 response = self._make_request(
794 conn,
795 method,
796 url,
797 timeout=timeout_obj,
798 body=body,
799 headers=headers,
800 chunked=chunked,
801 retries=retries,
802 response_conn=response_conn,
803 preload_content=preload_content,
804 decode_content=decode_content,
805 **response_kw,
806 )
808 # Everything went great!
809 clean_exit = True
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/connectionpool.py:467, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
464 try:
465 # Trigger any extra validation we need to do.
466 try:
--> 467 self._validate_conn(conn)
468 except (SocketTimeout, BaseSSLError) as e:
469 self._raise_timeout(err=e, url=url, timeout_value=conn.timeout)
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/connectionpool.py:1099, in HTTPSConnectionPool._validate_conn(self, conn)
1097 # Force connect early to allow us to validate the connection.
1098 if conn.is_closed:
-> 1099 conn.connect()
1101 # TODO revise this, see https://github.com/urllib3/urllib3/issues/2791
1102 if not conn.is_verified and not conn.proxy_is_verified:
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/connection.py:653, in HTTPSConnection.connect(self)
650 # Remove trailing '.' from fqdn hostnames to allow certificate validation
651 server_hostname_rm_dot = server_hostname.rstrip(".")
--> 653 sock_and_verified = _ssl_wrap_socket_and_match_hostname(
654 sock=sock,
655 cert_reqs=self.cert_reqs,
656 ssl_version=self.ssl_version,
657 ssl_minimum_version=self.ssl_minimum_version,
658 ssl_maximum_version=self.ssl_maximum_version,
659 ca_certs=self.ca_certs,
660 ca_cert_dir=self.ca_cert_dir,
661 ca_cert_data=self.ca_cert_data,
662 cert_file=self.cert_file,
663 key_file=self.key_file,
664 key_password=self.key_password,
665 server_hostname=server_hostname_rm_dot,
666 ssl_context=self.ssl_context,
667 tls_in_tls=tls_in_tls,
668 assert_hostname=self.assert_hostname,
669 assert_fingerprint=self.assert_fingerprint,
670 )
671 self.sock = sock_and_verified.socket
673 # Forwarding proxies can never have a verified target since
674 # the proxy is the one doing the verification. Should instead
675 # use a CONNECT tunnel in order to verify the target.
676 # See: https://github.com/urllib3/urllib3/issues/3267.
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/connection.py:806, in _ssl_wrap_socket_and_match_hostname(sock, cert_reqs, ssl_version, ssl_minimum_version, ssl_maximum_version, cert_file, key_file, key_password, ca_certs, ca_cert_dir, ca_cert_data, assert_hostname, assert_fingerprint, server_hostname, ssl_context, tls_in_tls)
803 if is_ipaddress(normalized):
804 server_hostname = normalized
--> 806 ssl_sock = ssl_wrap_socket(
807 sock=sock,
808 keyfile=key_file,
809 certfile=cert_file,
810 key_password=key_password,
811 ca_certs=ca_certs,
812 ca_cert_dir=ca_cert_dir,
813 ca_cert_data=ca_cert_data,
814 server_hostname=server_hostname,
815 ssl_context=context,
816 tls_in_tls=tls_in_tls,
817 )
819 try:
820 if assert_fingerprint:
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/util/ssl_.py:465, in ssl_wrap_socket(sock, keyfile, certfile, cert_reqs, ca_certs, server_hostname, ssl_version, ciphers, ssl_context, ca_cert_dir, key_password, ca_cert_data, tls_in_tls)
462 except NotImplementedError: # Defensive: in CI, we always have set_alpn_protocols
463 pass
--> 465 ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls, server_hostname)
466 return ssl_sock
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/urllib3/util/ssl_.py:509, in _ssl_wrap_socket_impl(sock, ssl_context, tls_in_tls, server_hostname)
506 SSLTransport._validate_ssl_context_for_tls_in_tls(ssl_context)
507 return SSLTransport(sock, ssl_context, server_hostname)
--> 509 return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/ssl.py:517, in SSLContext.wrap_socket(self, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, session)
511 def wrap_socket(self, sock, server_side=False,
512 do_handshake_on_connect=True,
513 suppress_ragged_eofs=True,
514 server_hostname=None, session=None):
515 # SSLSocket class handles server_hostname encoding before it calls
516 # ctx._wrap_socket()
--> 517 return self.sslsocket_class._create(
518 sock=sock,
519 server_side=server_side,
520 do_handshake_on_connect=do_handshake_on_connect,
521 suppress_ragged_eofs=suppress_ragged_eofs,
522 server_hostname=server_hostname,
523 context=self,
524 session=session
525 )
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/ssl.py:1104, in SSLSocket._create(cls, sock, server_side, do_handshake_on_connect, suppress_ragged_eofs, server_hostname, context, session)
1101 if timeout == 0.0:
1102 # non-blocking
1103 raise ValueError("do_handshake_on_connect should not be specified for non-blocking sockets")
-> 1104 self.do_handshake()
1105 except:
1106 try:
File /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/ssl.py:1382, in SSLSocket.do_handshake(self, block)
1380 if timeout == 0.0 and block:
1381 self.settimeout(None)
-> 1382 self._sslobj.do_handshake()
1383 finally:
1384 self.settimeout(timeout)
KeyboardInterrupt:
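The traceback shows that `Play(play_name=...)` resolves the name by requesting the play list of every corpus, each time through `requests.get(url)` without a timeout, so the cell can hang for a long time. A possible workaround is to request the spoken text of one known play directly. The sketch below assumes the base URL and path layout of the public DraCor API (v1) and that the play lives in a corpus named `shake`; both are assumptions that should be checked against the DraCor API documentation. The helper names are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: base URL of the public DraCor API, version 1
BASE_URL = "https://dracor.org/api/v1"


def spoken_text_url(corpus_name, play_name, base_url=BASE_URL):
    """Build the (assumed) URL of the spoken-text endpoint for one play."""
    return (
        f"{base_url}/corpora/{quote(corpus_name)}"
        f"/plays/{quote(play_name)}/spoken-text"
    )


def fetch_spoken_text(corpus_name, play_name, timeout=30):
    """Fetch the spoken text as a string; give up after `timeout` seconds."""
    url = spoken_text_url(corpus_name, play_name)
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (needs network access; "shake" is an assumed corpus name):
# text = fetch_spoken_text("shake", "a-midsummer-night-s-dream")
```

With the text fetched this way, `nlp(text)` would take the place of `nlp(play.spoken_text())` in the next section.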
tokenise text and detect parts of speech
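import spacy
# Load the model by name
nlp = spacy.load("en_core_web_sm")
# Alternative: import the installed model package and load it directly
import en_core_web_sm
nlp = en_core_web_sm.load()
# Run the pipeline on the spoken text of the play
doc = nlp(play.spoken_text())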
text and parts of speech
print([(w.text, w.pos_, w.lemma_) for w in doc])
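Printing every token of a whole play produces a very long list. As a minimal sketch, the output can be limited to the first few tokens and whitespace tokens can be skipped. The example uses hand-made stand-in tokens (a hypothetical `Token` namedtuple) so that it runs without a model; with spaCy, `doc` could be passed instead, since its tokens provide the same `text`, `pos_`, `lemma_` and `is_space` attributes.

```python
from collections import namedtuple
from itertools import islice

# Stand-in for spaCy tokens (hypothetical, for illustration only)
Token = namedtuple("Token", ["text", "pos_", "lemma_", "is_space"])


def preview(tokens, limit=20):
    """Return tab-separated rows for the first `limit` non-whitespace tokens."""
    visible = (t for t in tokens if not t.is_space)
    return [f"{t.text}\t{t.pos_}\t{t.lemma_}" for t in islice(visible, limit)]


sample = [
    Token("Now", "ADV", "now", False),
    Token("\n", "SPACE", "\n", True),
    Token("fair", "ADJ", "fair", False),
    Token("Hippolyta", "PROPN", "Hippolyta", False),
]
for row in preview(sample, limit=2):
    print(row)
```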
most frequent parts of speech
from collections import Counter

def count_words(doc, word_type):
    cnt = Counter()
    for w in doc:
        if w.pos_ == word_type:
            cnt[w.lemma_] += 1 # better than w.text
    return cnt

def print_top(words, n):
    for w, cnt in words.most_common(n):
        print(cnt, w, sep='\t')

word_types = Counter()
for w in doc:
    word_types[w.pos_] += 1
print_top(word_types, 10)
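Absolute counts depend on the length of the play, so relative shares can be easier to compare across plays. The following is a minimal sketch that turns the part-of-speech counts into shares of all tokens; the `Token` namedtuple and the function name `pos_shares` are hypothetical stand-ins, and with spaCy the `doc` object could be passed directly.

```python
from collections import Counter, namedtuple

# Stand-in for spaCy tokens (hypothetical); only pos_ is used here
Token = namedtuple("Token", ["text", "pos_"])


def pos_shares(tokens):
    """Return (tag, share) pairs, most frequent part of speech first."""
    counts = Counter(t.pos_ for t in tokens)
    total = sum(counts.values())
    return [(pos, n / total) for pos, n in counts.most_common()]


sample = [
    Token("the", "DET"), Token("lovers", "NOUN"), Token("flee", "VERB"),
    Token("the", "DET"), Token("city", "NOUN"), Token(".", "PUNCT"),
]
for pos, share in pos_shares(sample):
    print(f"{share:6.1%}\t{pos}")
```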
most frequent words per part of speech
for word_type in ["NOUN", "VERB", "ADJ"]:
    print("---", word_type, "---")
    print_top(count_words(doc, word_type), 10)
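Such frequency lists may contain very common function-like words. spaCy tokens carry an `is_stop` flag that can be used to leave those out. The sketch below is a variant of `count_words` with that extra filter; it uses a hypothetical `Token` namedtuple as a stand-in so that it runs without a model, and the function name `count_content_words` is likewise an assumption.

```python
from collections import Counter, namedtuple

# Stand-in for spaCy tokens (hypothetical, for illustration only)
Token = namedtuple("Token", ["lemma_", "pos_", "is_stop"])


def count_content_words(tokens, word_type):
    """Count lemmas of one part of speech, skipping stop words."""
    return Counter(
        t.lemma_ for t in tokens
        if t.pos_ == word_type and not t.is_stop
    )


sample = [
    Token("make", "VERB", True),
    Token("love", "VERB", False),
    Token("love", "VERB", False),
    Token("dream", "NOUN", False),
]
print(count_content_words(sample, "VERB").most_common(10))
```

Whether stop words should be removed depends on the question asked of the text, so the filter is best kept optional.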