Time to finally save the data to a DB.
Development environment
- Linux Mint 17.3 MATE 32bit
- SQLite 3.8.2
Hatena Fotolife AtomAPI - Hatena Developer Center
Deliverables
Preparation
1. Create the DB in advance
http://ytyaru.hatenablog.com/entry/2017/06/30
http://ytyaru.hatenablog.com/entry/2017/07/01
http://ytyaru.hatenablog.com/entry/2017/07/02
2. Fetch the XML and save it to a file
http://ytyaru.hatenablog.com/entry/2017/06/23
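3. Install dataset
Overview
- Extract data from the XML document with BeautifulSoup
- Insert the data into an SQLite3 file with dataset
In essence, this simply converts an XML file into a sqlite3 file.
The data becomes queryable with SQL, and superfluous information such as markup tags can be dropped. Because the result is a single file, it is also easy to store.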
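The conversion described above can be sketched roughly as follows. To keep the sketch runnable without extra installs, it swaps in the standard library: `xml.etree.ElementTree` stands in for BeautifulSoup and `sqlite3` stands in for dataset. The sample feed, the `photos` table, its columns, and the function names are all hypothetical and are not taken from the actual Hatena Fotolife response.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Assumption: an Atom 1.0 namespace. The real feed may use a different one.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample data standing in for the XML file saved earlier.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>sample feed</title>
  <entry>
    <id>tag:example.invalid,2017:photo-1</id>
    <title>first photo</title>
    <updated>2017-06-23T10:00:00+09:00</updated>
  </entry>
  <entry>
    <id>tag:example.invalid,2017:photo-2</id>
    <title>second photo</title>
    <updated>2017-06-24T11:30:00+09:00</updated>
  </entry>
</feed>"""


def extract_entries(xml_text):
    """Pull only the wanted fields out of each <entry>, dropping the markup."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.findall("atom:entry", ATOM_NS):
        rows.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM_NS),
        })
    return rows


def save_entries(con, rows):
    """Insert the rows; re-running with the same ids overwrites instead of duplicating."""
    with con:  # commits on success, rolls back on error
        con.execute(
            "CREATE TABLE IF NOT EXISTS photos ("
            "id TEXT PRIMARY KEY, title TEXT, updated TEXT)"
        )
        con.executemany(
            "INSERT OR REPLACE INTO photos (id, title, updated) "
            "VALUES (:id, :title, :updated)",
            rows,
        )


if __name__ == "__main__":
    con = sqlite3.connect(":memory:")  # pass a file path to get a single .db file
    save_entries(con, extract_entries(SAMPLE_FEED))
    for row in con.execute("SELECT id, title, updated FROM photos ORDER BY id"):
        print(row)
    con.close()
```

Keying the table on the entry id means the conversion can be re-run over the same XML file without producing duplicate rows, which seems convenient while the schema is still changing.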
Installing dataset
$ sudo pip3 install dataset
[sudo] password for mint:
Downloading/unpacking dataset
  Downloading dataset-0.8.0-py2.py3-none-any.whl
Downloading/unpacking normality>=0.3.9 (from dataset)
  Downloading normality-0.4.0.tar.gz
  Running setup.py (path:/tmp/pip_build_root/normality/setup.py) egg_info for package normality
Downloading/unpacking PyYAML>=3.10 (from dataset)
  Downloading PyYAML-3.12.tar.gz (253kB): 253kB downloaded
  Running setup.py (path:/tmp/pip_build_root/PyYAML/setup.py) egg_info for package PyYAML
Downloading/unpacking six>=1.7.3 (from dataset)
  Downloading six-1.10.0-py2.py3-none-any.whl
Downloading/unpacking sqlalchemy>=0.9.1 (from dataset)
  Downloading SQLAlchemy-1.1.6.tar.gz (5.2MB): 5.2MB downloaded
  Running setup.py (path:/tmp/pip_build_root/sqlalchemy/setup.py) egg_info for package sqlalchemy
    warning: no files found matching '*.jpg' under directory 'doc'
    warning: no files found matching '*.mako' under directory 'doc'
    warning: no files found matching 'distribute_setup.py'
    warning: no files found matching 'sa2to3.py'
    warning: no files found matching 'ez_setup.py'
    no previously-included directories found matching 'doc/build/output'
Downloading/unpacking alembic>=0.6.2 (from dataset)
  Downloading alembic-0.9.1.tar.gz (999kB): 999kB downloaded
  Running setup.py (path:/tmp/pip_build_root/alembic/setup.py) egg_info for package alembic
    warning: no files found matching '*.jpg' under directory 'docs'
    warning: no files found matching '*.sty' under directory 'docs'
    warning: no files found matching '*.dat' under directory 'tests'
    no previously-included directories found matching 'docs/build/output'
Requirement already satisfied (use --upgrade to upgrade): chardet in /usr/lib/python3/dist-packages (from normality>=0.3.9->dataset)
Downloading/unpacking Mako (from alembic>=0.6.2->dataset)
  Downloading Mako-1.0.6.tar.gz (575kB): 575kB downloaded
  Running setup.py (path:/tmp/pip_build_root/Mako/setup.py) egg_info for package Mako
    warning: no files found matching '*.xml' under directory 'examples'
    warning: no files found matching '*.mako' under directory 'examples'
    warning: no files found matching 'distribute_setup.py'
    warning: no files found matching 'ez_setup.py'
    no previously-included directories found matching 'doc/build/output'
Downloading/unpacking python-editor>=0.3 (from alembic>=0.6.2->dataset)
  Downloading python-editor-1.0.3.tar.gz
  Running setup.py (path:/tmp/pip_build_root/python-editor/setup.py) egg_info for package python-editor
Downloading/unpacking MarkupSafe>=0.9.2 (from Mako->alembic>=0.6.2->dataset)
  Downloading MarkupSafe-0.23.tar.gz
  Running setup.py (path:/tmp/pip_build_root/MarkupSafe/setup.py) egg_info for package MarkupSafe
Installing collected packages: dataset, normality, PyYAML, six, sqlalchemy, alembic, Mako, python-editor, MarkupSafe
  Running setup.py install for normality
  Running setup.py install for PyYAML
    checking if libyaml is compilable
    i686-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c build/temp.linux-i686-3.4/check_libyaml.c -o build/temp.linux-i686-3.4/check_libyaml.o
    build/temp.linux-i686-3.4/check_libyaml.c:2:18: fatal error: yaml.h: No such file or directory
     #include <yaml.h>
                      ^
    compilation terminated.
    libyaml is not found or a compiler error: forcing --without-libyaml
    (if libyaml is installed correctly, you may need to specify the option --include-dirs or uncomment and modify the parameter include_dirs in setup.cfg)
  Found existing installation: six 1.5.2
    Not uninstalling six at /usr/lib/python3/dist-packages, owned by OS
  Running setup.py install for sqlalchemy
    building 'sqlalchemy.cprocessors' extension
    i686-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c lib/sqlalchemy/cextension/processors.c -o build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/processors.o
    i686-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/processors.o -o build/lib.linux-i686-3.4/sqlalchemy/cprocessors.cpython-34m.so
    building 'sqlalchemy.cresultproxy' extension
    i686-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c lib/sqlalchemy/cextension/resultproxy.c -o build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/resultproxy.o
    i686-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/resultproxy.o -o build/lib.linux-i686-3.4/sqlalchemy/cresultproxy.cpython-34m.so
    building 'sqlalchemy.cutils' extension
    i686-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c lib/sqlalchemy/cextension/utils.c -o build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/utils.o
    i686-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-i686-3.4/lib/sqlalchemy/cextension/utils.o -o build/lib.linux-i686-3.4/sqlalchemy/cutils.cpython-34m.so
    warning: no files found matching '*.jpg' under directory 'doc'
    warning: no files found matching '*.mako' under directory 'doc'
    warning: no files found matching 'distribute_setup.py'
    warning: no files found matching 'sa2to3.py'
    warning: no files found matching 'ez_setup.py'
    no previously-included directories found matching 'doc/build/output'
  Running setup.py install for alembic
    warning: no files found matching '*.jpg' under directory 'docs'
    warning: no files found matching '*.sty' under directory 'docs'
    warning: no files found matching '*.dat' under directory 'tests'
    no previously-included directories found matching 'docs/build/output'
    Installing alembic script to /usr/local/bin
  Running setup.py install for Mako
    warning: no files found matching '*.xml' under directory 'examples'
    warning: no files found matching '*.mako' under directory 'examples'
    warning: no files found matching 'distribute_setup.py'
    warning: no files found matching 'ez_setup.py'
    no previously-included directories found matching 'doc/build/output'
    Installing mako-render script to /usr/local/bin
  Running setup.py install for python-editor
  Running setup.py install for MarkupSafe
    building 'markupsafe._speedups' extension
    i686-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.4m -c markupsafe/_speedups.c -o build/temp.linux-i686-3.4/markupsafe/_speedups.o
    i686-linux-gnu-gcc -pthread -shared -Wl,-O1 -Wl,-Bsymbolic-functions -Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-Bsymbolic-functions -Wl,-z,relro -g -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -D_FORTIFY_SOURCE=2 build/temp.linux-i686-3.4/markupsafe/_speedups.o -o build/lib.linux-i686-3.4/markupsafe/_speedups.cpython-34m.so
Successfully installed dataset normality PyYAML six sqlalchemy alembic Mako python-editor MarkupSafe
Cleaning up...
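After an install like the one above, a quick way to confirm the packages are visible to Python 3 is to ask the import system whether it can locate them. This is only a small sketch: `missing_modules` is a hypothetical helper name, and the list of modules checked is picked from the "Successfully installed" line of the log.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names assumed to match the packages reported by pip above.
    missing = missing_modules(["dataset", "sqlalchemy", "alembic"])
    if missing:
        print("not importable:", ", ".join(missing))
    else:
        print("all modules found")
```

`find_spec` only locates a module without executing it, so the check stays fast and has no side effects.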
Open issues
- Paginate through the feed so that every item is fetched
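One possible shape for the pagination task is a loop that keeps following "next page" links until none remain. The sketch below assumes the feed advertises the following page through an Atom `<link rel="next" href="...">` element; whether the Fotolife feed really does this has not been confirmed here. `fetch_all_titles`, the injected `fetch` callable, and the `FAKE_PAGES` data are hypothetical, and no network access is performed.

```python
import xml.etree.ElementTree as ET

# Assumption: an Atom 1.0 namespace. The real feed may use a different one.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical two-page feed used instead of real HTTP responses.
FAKE_PAGES = {
    "https://example.invalid/feed?page=1": (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        '<link rel="next" href="https://example.invalid/feed?page=2"/>'
        "<entry><title>a</title></entry><entry><title>b</title></entry>"
        "</feed>"
    ),
    "https://example.invalid/feed?page=2": (
        '<feed xmlns="http://www.w3.org/2005/Atom">'
        "<entry><title>c</title></entry>"
        "</feed>"
    ),
}


def fetch_all_titles(first_url, fetch, max_pages=100):
    """Collect entry titles from every page, following rel="next" links."""
    titles = []
    seen = set()  # guards against a feed whose links loop back on themselves
    url = first_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        root = ET.fromstring(fetch(url))
        for entry in root.findall("atom:entry", ATOM_NS):
            titles.append(entry.findtext("atom:title", default="", namespaces=ATOM_NS))
        # Look for the link to the following page; stop when there is none.
        url = None
        for link in root.findall("atom:link", ATOM_NS):
            if link.get("rel") == "next":
                url = link.get("href")
                break
    return titles


if __name__ == "__main__":
    print(fetch_all_titles("https://example.invalid/feed?page=1", FAKE_PAGES.__getitem__))
```

Passing the fetch function in as an argument keeps the loop testable offline; a real version would hand in a function that performs the authenticated HTTP request.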
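Thoughts
The open issue above matters, but saving to the other DBs matters just as much. Above all, I want to get to the main goal: saving the blog post bodies.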