Giving It a Try

A blog to push myself into producing output. Even the trial and error gets dumped here unfiltered.

I wanted a web crawler and extractor for news, but it failed to install with an error (news-please)

This is exactly why I dislike Python.

Source

news-please

Install it.

pip3 install news-please 

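It's super easy, we promise!

Or so it said, but I was fooled: it errored out. My impression is that broken promises and errors are an everyday affair with Python.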

Error: pip3 install news-please fails with TypeError: unsupported operand type(s) for -=: 'Retry' and 'int'

Collecting news-please
  Downloading https://www.piwheels.org/simple/news-please/news_please-1.4.23-py3-none-any.whl (84kB)
    100% |████████████████████████████████| 92kB 177kB/s 
Collecting Scrapy>=1.1.0 (from news-please)
  Downloading https://files.pythonhosted.org/packages/29/4b/585e8e111ffb01466c59281f34febb13ad1a95d7fb3919fd57c33fc732a5/Scrapy-1.7.3-py2.py3-none-any.whl (234kB)
    100% |████████████████████████████████| 235kB 713kB/s 
Collecting dotmap>=1.2.17 (from news-please)
  Downloading https://files.pythonhosted.org/packages/35/a6/0f88e89673285daf190891985ad8b57d1d70ec4ccaf2d53b692e25f52ad4/dotmap-1.3.8-py3-none-any.whl
Collecting six>=1.10.0 (from news-please)
  Using cached https://files.pythonhosted.org/packages/73/fb/00a976f728d0d1fecfe898238ce23f502a721c0ac0ecfedb80e0d88c64e9/six-1.12.0-py2.py3-none-any.whl
Collecting langdetect>=1.0.7 (from news-please)
  Downloading https://www.piwheels.org/simple/langdetect/langdetect-1.0.7-py3-none-any.whl (994kB)
    100% |████████████████████████████████| 1.0MB 294kB/s 
Collecting PyDispatcher>=2.0.5 (from news-please)
  Downloading https://www.piwheels.org/simple/pydispatcher/PyDispatcher-2.0.5-py3-none-any.whl
Collecting hjson>=1.5.8 (from news-please)
  Downloading https://www.piwheels.org/simple/hjson/hjson-3.0.1-py3-none-any.whl (51kB)
    100% |████████████████████████████████| 61kB 118kB/s 
Collecting bs4 (from news-please)
  Downloading https://www.piwheels.org/simple/bs4/bs4-0.0.1-py3-none-any.whl
Collecting ago>=0.0.9 (from news-please)
  Downloading https://www.piwheels.org/simple/ago/ago-0.0.93-py3-none-any.whl
Collecting warcio>=1.3.3 (from news-please)
  Downloading https://files.pythonhosted.org/packages/90/c4/86bc02bc3bc33c34ab24e52af8a1c34eb6e03e7cd5b3904057ebcea311da/warcio-1.7.1-py2.py3-none-any.whl (41kB)
    100% |████████████████████████████████| 51kB 1.6MB/s 
Collecting plac>=0.9.6 (from news-please)
  Downloading https://files.pythonhosted.org/packages/d6/2d/06ad623272c7b38324e362d606a685ca051be28b5eaac3897eb120bfe70c/plac-1.1.0-py2.py3-none-any.whl
Collecting elasticsearch>=2.4 (from news-please)
  Downloading https://files.pythonhosted.org/packages/df/63/86cfe50a11621240450322509f8eacac14f3381cbf43aee2fe56db8c7c8a/elasticsearch-7.0.4-py2.py3-none-any.whl (83kB)
    100% |████████████████████████████████| 92kB 1.0MB/s 
Collecting readability-lxml>=0.6.2 (from news-please)
  Downloading https://www.piwheels.org/simple/readability-lxml/readability_lxml-0.7.1-py3-none-any.whl
Collecting lxml>=3.3.5 (from news-please)
  Downloading https://www.piwheels.org/simple/lxml/lxml-4.4.1-cp35-cp35m-linux_armv7l.whl (5.3MB)
    100% |████████████████████████████████| 5.3MB 56kB/s 
Collecting awscli>=1.11.117 (from news-please)
  Downloading https://files.pythonhosted.org/packages/77/af/53874472e07a236608dcc9d48e23bae6a0adbff74c35a268d3a89edc8ae7/awscli-1.16.240-py2.py3-none-any.whl (2.1MB)
    100% |████████████████████████████████| 2.1MB 164kB/s 
Collecting hurry.filesize>=0.9 (from news-please)
  Downloading https://www.piwheels.org/simple/hurry-filesize/hurry.filesize-0.9-py3-none-any.whl
Collecting newspaper3k; python_version >= "3.0" (from news-please)
  Downloading https://files.pythonhosted.org/packages/d7/b9/51afecb35bb61b188a4b44868001de348a0e8134b4dfa00ffc191567c4b9/newspaper3k-0.2.8-py3-none-any.whl (211kB)
    100% |████████████████████████████████| 215kB 720kB/s 
Collecting PyMySQL>=0.7.9 (from news-please)
  Downloading https://files.pythonhosted.org/packages/ed/39/15045ae46f2a123019aa968dfcba0396c161c20f855f11dea6796bcaae95/PyMySQL-0.9.3-py2.py3-none-any.whl (47kB)
    100% |████████████████████████████████| 51kB 1.6MB/s 
Collecting python-dateutil>=2.4.0 (from news-please)
  Downloading https://files.pythonhosted.org/packages/41/17/c62faccbfbd163c7f57f3844689e3a78bae1f403648a6afb1d0866d87fbb/python_dateutil-2.8.0-py2.py3-none-any.whl (226kB)
    100% |████████████████████████████████| 235kB 857kB/s 
Collecting beautifulsoup4>=4.3.2 (from news-please)
  Downloading https://files.pythonhosted.org/packages/1a/b7/34eec2fe5a49718944e215fde81288eec1fa04638aa3fb57c1c6cd0f98c3/beautifulsoup4-4.8.0-py3-none-any.whl (97kB)
    100% |████████████████████████████████| 102kB 1.4MB/s 
Collecting Twisted>=13.1.0; python_version != "3.4" (from Scrapy>=1.1.0->news-please)
  Downloading https://www.piwheels.org/simple/twisted/Twisted-19.7.0-cp35-cp35m-linux_armv7l.whl (3.1MB)
    100% |████████████████████████████████| 3.1MB 99kB/s 
Collecting pyOpenSSL (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/01/c8/ceb170d81bd3941cbeb9940fc6cc2ef2ca4288d0ca8929ea4db5905d904d/pyOpenSSL-19.0.0-py2.py3-none-any.whl (53kB)
    100% |████████████████████████████████| 61kB 527kB/s 
Collecting service-identity (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/e9/7c/2195b890023e098f9618d43ebc337d83c8b38d414326685339eb024db2f6/service_identity-18.1.0-py2.py3-none-any.whl
Collecting w3lib>=1.17.0 (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/6a/45/1ba17c50a0bb16bd950c9c2b92ec60d40c8ebda9f3371ae4230c437120b6/w3lib-1.21.0-py2.py3-none-any.whl
Collecting parsel>=1.5 (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/86/c8/fc5a2f9376066905dfcca334da2a25842aedfda142c0424722e7c497798b/parsel-1.5.2-py2.py3-none-any.whl
Collecting queuelib (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/4c/85/ae64e9145f39dd6d14f8af3fa809a270ef3729f3b90b3c0cf5aa242ab0d4/queuelib-1.5.0-py2.py3-none-any.whl
Collecting cssselect>=0.9 (from Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/3b/d4/3b5c17f00cce85b9a1e6f91096e1cc8e8ede2e1be8e96b87ce1ed09e92c5/cssselect-1.1.0-py2.py3-none-any.whl
Collecting urllib3>=1.21.1 (from elasticsearch>=2.4->news-please)
  Downloading https://files.pythonhosted.org/packages/e6/60/247f23a7121ae632d62811ba7f273d0e58972d75e58a94d329d51550a47d/urllib3-1.25.3-py2.py3-none-any.whl (150kB)
    100% |████████████████████████████████| 153kB 558kB/s 
Collecting chardet (from readability-lxml>=0.6.2->news-please)
  Using cached https://files.pythonhosted.org/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl
Collecting botocore==1.12.230 (from awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/13/73/8cb67ed4e5526f070a9fc79cc6b224d8fcc6cf3be8363ce50b4b951c63df/botocore-1.12.230-py2.py3-none-any.whl (5.7MB)
    100% |████████████████████████████████| 5.7MB 57kB/s 
Collecting s3transfer<0.3.0,>=0.2.0 (from awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/16/8a/1fc3dba0c4923c2a76e1ff0d52b305c44606da63f718d14d3231e21c51b0/s3transfer-0.2.1-py2.py3-none-any.whl (70kB)
    100% |████████████████████████████████| 71kB 1.4MB/s 
Collecting colorama<=0.3.9,>=0.2.5 (from awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/db/c8/7dcf9dbcb22429512708fe3a547f8b6101c0d02137acbd892505aee57adf/colorama-0.3.9-py2.py3-none-any.whl
Collecting rsa<=3.5.0,>=3.1.2 (from awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/e1/ae/baedc9cb175552e95f3395c43055a6a5e125ae4d48a1d7a924baca83e92e/rsa-3.4.2-py2.py3-none-any.whl (46kB)
    100% |████████████████████████████████| 51kB 1.8MB/s 
Collecting docutils<0.16,>=0.10 (from awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/22/cd/a6aa959dca619918ccb55023b4cb151949c64d4d5d55b3f4ffd7eee0c6e8/docutils-0.15.2-py3-none-any.whl (547kB)
    100% |████████████████████████████████| 552kB 514kB/s 
Collecting PyYAML<=5.2,>=3.10; python_version != "2.6" (from awscli>=1.11.117->news-please)
  Downloading https://www.piwheels.org/simple/pyyaml/PyYAML-5.1.2-cp35-cp35m-linux_armv7l.whl (45kB)
    100% |████████████████████████████████| 51kB 106kB/s 
Collecting setuptools (from hurry.filesize>=0.9->news-please)
  Downloading https://files.pythonhosted.org/packages/b2/86/095d2f7829badc207c893dd4ac767e871f6cd547145df797ea26baea4e2e/setuptools-41.2.0-py2.py3-none-any.whl (576kB)
    100% |████████████████████████████████| 583kB 450kB/s 
Collecting Pillow>=3.3.0 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/pillow/Pillow-6.1.0-cp35-cp35m-linux_armv7l.whl (1.1MB)
    100% |████████████████████████████████| 1.1MB 236kB/s 
Collecting tldextract>=2.0.1 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://files.pythonhosted.org/packages/1e/90/18ac0e5340b6228c25cc8e79835c3811e7553b2b9ae87296dfeb62b7866d/tldextract-2.2.1-py2.py3-none-any.whl (48kB)
    100% |████████████████████████████████| 51kB 1.4MB/s 
Collecting feedparser>=5.2.1 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/feedparser/feedparser-5.2.1-py3-none-any.whl (44kB)
    100% |████████████████████████████████| 51kB 103kB/s 
Collecting tinysegmenter==0.3 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/tinysegmenter/tinysegmenter-0.3-py3-none-any.whl
Collecting requests>=2.10.0 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl (57kB)
    100% |████████████████████████████████| 61kB 1.4MB/s 
Collecting feedfinder2>=0.0.4 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/feedfinder2/feedfinder2-0.0.4-py3-none-any.whl
Collecting nltk>=3.2.1 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/nltk/nltk-3.4.5-py3-none-any.whl (1.4MB)
    100% |████████████████████████████████| 1.5MB 135kB/s 
Collecting jieba3k>=0.35.1 (from newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://www.piwheels.org/simple/jieba3k/jieba3k-0.35.1-py3-none-any.whl (7.4MB)
    100% |████████████████████████████████| 7.4MB 42kB/s 
Collecting soupsieve>=1.2 (from beautifulsoup4>=4.3.2->news-please)
  Downloading https://files.pythonhosted.org/packages/0b/44/0474f2207fdd601bb25787671c81076333d2c80e6f97e92790f8887cf682/soupsieve-1.9.3-py2.py3-none-any.whl
Collecting incremental>=16.10.1 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/f5/1d/c98a587dc06e107115cf4a58b49de20b19222c83d75335a192052af4c4b7/incremental-17.5.0-py2.py3-none-any.whl
Collecting constantly>=15.1 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/b9/65/48c1909d0c0aeae6c10213340ce682db01b48ea900a7d9fce7a7910ff318/constantly-15.1.0-py2.py3-none-any.whl
Collecting PyHamcrest>=1.9.0 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/9a/d5/d37fd731b7d0e91afcc84577edeccf4638b4f9b82f5ffe2f8b62e2ddc609/PyHamcrest-1.9.0-py2.py3-none-any.whl (52kB)
    100% |████████████████████████████████| 61kB 1.7MB/s 
Collecting attrs>=17.4.0 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/23/96/d828354fa2dbdf216eaa7b7de0db692f12c234f7ef888cc14980ef40d1d2/attrs-19.1.0-py2.py3-none-any.whl
Collecting zope.interface>=4.4.2 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://www.piwheels.org/simple/zope-interface/zope.interface-4.6.0-cp35-cp35m-linux_armv7l.whl (163kB)
    100% |████████████████████████████████| 163kB 327kB/s 
Collecting Automat>=0.3.0 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/a3/86/14c16bb98a5a3542ed8fed5d74fb064a902de3bdd98d6584b34553353c45/Automat-0.7.0-py2.py3-none-any.whl
Collecting hyperlink>=17.1.1 (from Twisted>=13.1.0; python_version != "3.4"->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/7f/91/e916ca10a2de1cb7101a9b24da546fb90ee14629e23160086cf3361c4fb8/hyperlink-19.0.0-py2.py3-none-any.whl
Collecting cryptography>=2.3 (from pyOpenSSL->Scrapy>=1.1.0->news-please)
  Downloading https://www.piwheels.org/simple/cryptography/cryptography-2.7-cp35-cp35m-linux_armv7l.whl (782kB)
    100% |████████████████████████████████| 788kB 190kB/s 
Collecting pyasn1 (from service-identity->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/a1/71/8f0d444e3a74e5640a3d5d967c1c6b015da9c655f35b2d308a55d907a517/pyasn1-0.4.7-py2.py3-none-any.whl (76kB)
    100% |████████████████████████████████| 81kB 1.1MB/s 
Collecting pyasn1-modules (from service-identity->Scrapy>=1.1.0->news-please)
  Downloading https://files.pythonhosted.org/packages/be/70/e5ea8afd6d08a4b99ebfc77bd1845248d56cfcf43d11f9dc324b9580a35c/pyasn1_modules-0.2.6-py2.py3-none-any.whl (95kB)
    100% |████████████████████████████████| 102kB 1.1MB/s 
Collecting jmespath<1.0.0,>=0.7.1 (from botocore==1.12.230->awscli>=1.11.117->news-please)
  Downloading https://files.pythonhosted.org/packages/83/94/7179c3832a6d45b266ddb2aac329e101367fbdb11f425f13771d27f225bb/jmespath-0.9.4-py2.py3-none-any.whl
Collecting requests-file>=1.4 (from tldextract>=2.0.1->newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://files.pythonhosted.org/packages/23/9c/6e63c23c39e53d3df41c77a3d05a49a42c4e1383a6d2a5e3233161b89dbf/requests_file-1.4.3-py2.py3-none-any.whl
Collecting idna (from tldextract>=2.0.1->newspaper3k; python_version >= "3.0"->news-please)
  Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests>=2.10.0->newspaper3k; python_version >= "3.0"->news-please)
  Downloading https://files.pythonhosted.org/packages/18/b0/8146a4f8dd402f60744fa380bc73ca47303cccf8b9190fd16a827281eac2/certifi-2019.9.11-py2.py3-none-any.whl (154kB)
    100% |████████████████████████████████| 163kB 698kB/s 
Collecting cffi!=1.11.3,>=1.8 (from cryptography>=2.3->pyOpenSSL->Scrapy>=1.1.0->news-please)
Exception:
Traceback (most recent call last):
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 594, in urlopen
    chunked=chunked)
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 391, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 387, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.5/http/client.py", line 1198, in getresponse
    response.begin()
  File "/usr/lib/python3.5/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
    raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 215, in main
    status = self.run(options, args)
  File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 353, in run
    wb.build(autobuilding=True)
  File "/usr/lib/python3/dist-packages/pip/wheel.py", line 749, in build
    self.requirement_set.prepare_files(self.finder)
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 380, in prepare_files
    ignore_dependencies=self.ignore_dependencies))
  File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 554, in _prepare_file
    require_hashes
  File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 278, in populate_link
    self.link = finder.find_requirement(self, upgrade)
  File "/usr/lib/python3/dist-packages/pip/index.py", line 465, in find_requirement
    all_candidates = self.find_all_candidates(req.name)
  File "/usr/lib/python3/dist-packages/pip/index.py", line 423, in find_all_candidates
    for page in self._get_pages(url_locations, project_name):
  File "/usr/lib/python3/dist-packages/pip/index.py", line 568, in _get_pages
    page = self._get_page(location)
  File "/usr/lib/python3/dist-packages/pip/index.py", line 683, in _get_page
    return HTMLPage.get_page(link, session=self.session)
  File "/usr/lib/python3/dist-packages/pip/index.py", line 792, in get_page
    "Cache-Control": "max-age=600",
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 501, in get
    return self.request('GET', url, **kwargs)
  File "/usr/lib/python3/dist-packages/pip/download.py", line 386, in request
    return super(PipSession, self).request(method, url, *args, **kwargs)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 488, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/sessions.py", line 609, in send
    r = adapter.send(request, **kwargs)
  File "/usr/share/python-wheels/CacheControl-0.11.7-py2.py3-none-any.whl/cachecontrol/adapter.py", line 47, in send
    resp = super(CacheControlAdapter, self).send(request, **kw)
  File "/usr/share/python-wheels/requests-2.12.4-py2.py3-none-any.whl/requests/adapters.py", line 423, in send
    timeout=timeout
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/connectionpool.py", line 643, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/share/python-wheels/urllib3-1.19.1-py2.py3-none-any.whl/urllib3/util/retry.py", line 315, in increment
    total -= 1
TypeError: unsupported operand type(s) for -=: 'Retry' and 'int'

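Ugh... It looks like I have to redo the environment setup from scratch yet again. I have nothing but painful memories of that. Absolutely not. To manage Python itself and its packages, I would have to start by setting up management tools such as pyenv, anyenv, or docker. Then I would have to learn how to use them... Then I would have to write shell scripts in Bash for the configuration, and it keeps spreading from there... I end up in tears wondering when I will ever get to start... Ugh, my head...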
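As a side note on the last line of the traceback: the statement `total -= 1` only fails if `total` is itself a `Retry` object rather than a number. One plausible reading, and it is only a guess from the file paths in the log, is that two separately loaded copies of urllib3 were involved, so an `isinstance` check against one copy's `Retry` rejected a `Retry` from the other copy and wrapped it again. The sketch below imitates that mechanism with a toy class. `make_retry_class`, `RetryA`, `RetryB`, and `reproduce` are hypothetical names, and none of this is urllib3's actual code.

```python
def make_retry_class():
    """Return a fresh toy Retry class, imitating one loaded copy of a library."""

    class Retry:
        def __init__(self, total=10):
            self.total = total

        @classmethod
        def from_int(cls, retries):
            # Pass through an object of *this* class, wrap anything else.
            if isinstance(retries, cls):
                return retries
            return cls(total=retries)

        def increment(self):
            total = self.total
            if total is not None:
                total -= 1  # same statement as the failing line in the traceback
            return type(self)(total=total)

    return Retry


# Two classes with the same name but different identities (assumption:
# this stands in for two copies of the same module being loaded).
RetryA = make_retry_class()
RetryB = make_retry_class()


def reproduce():
    """Return the TypeError message produced by nesting one Retry in another."""
    foreign = RetryA(total=3)
    # isinstance(foreign, RetryB) is False, so the Retry ends up as `total`.
    wrapped = RetryB.from_int(foreign)
    try:
        wrapped.increment()
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

Note also that the first exception in the log is `RemoteDisconnected`, a network failure. The `TypeError` only shows up once the retry logic runs, so the same install might get further on a later attempt, though I have not verified that here.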

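Thoughts

Wherever Python spreads across the world, there awaits the despair of environment setup, where you are sure to give up.

Even after struggling to build a Python environment, an OS update breaks it, OS and all. That has been my experience. I have had enough. I do not want to struggle with Python. I use Python precisely because I want to write code the easy way. As with the official documentation, writing about Python somehow strikes me as full of false promises. Please stop claiming things are easy when they are not. I, at least, had an extremely hard time. I will not be fooled again.

I have to put up a mental wall like this, or it flips my trauma switch. Snakes are scary.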

Target environment

$ uname -a
Linux raspberrypi 4.19.42-v7+ #1218 SMP Tue May 14 00:48:17 BST 2019 armv7l GNU/Linux
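
The `uname` output alone does not record the Python or pip versions, although the paths in the traceback (`/usr/lib/python3.5`, `/usr/lib/python3/dist-packages/pip`) suggest Python 3.5 with the distribution's packaged pip. A minimal sketch for capturing those details next to the kernel information might look like the following. `describe_environment` is a hypothetical helper name, and it assumes pip can be imported by the same interpreter that runs the script.

```python
import platform
import sys


def describe_environment():
    """Collect interpreter and platform details relevant to a pip failure."""
    info = {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
    }
    try:
        import pip  # may be absent in minimal installs
        info["pip"] = pip.__version__
    except ImportError:
        info["pip"] = None
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("{}: {}".format(key, value))
```

Running it with `python3` prints one `key: value` line per field, which can be pasted alongside the `uname -a` line.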