While extracting citations from the hewiki dumps of 2019/05/01, the following error occurs:
$ mwcites extract /mnt/data/xmldatadumps/public/hewiki/20190501/hewiki-20190501-pages-meta-history*.xml*.bz2 > hewiki-20190501-citations.tsv
Traceback (most recent call last):
File "/srv/home/bmansurov/venv/mwcites/bin/mwcites", line 11, in <module>
sys.exit(main())
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/mwcites.py", line 49, in main
module.main(sys.argv[2:])
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 58, in main
run(dump_files, extractors)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 65, in run
for page_id, title, rev_id, timestamp, type, id in cites:
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/map.py", line 87, in map
Failed while processing dump '/mnt/data/xmldatadumps/public/hewiki/20190501/hewiki-20190501-pages-meta-history1.xml-p13702p18009.bz2':
Traceback (most recent call last):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/processor.py", line 35, in run
for out in self.process_dump(dump, path):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 94, in process_dump
for cite in extract_cite_history(page, extractors):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 116, in extract_cite_history
for revision in page:
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/page.py", line 72, in load_revisions
yield Revision.from_element(sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/revision.py", line 99, in from_element
values = consume_tags(cls.TAG_MAP, element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/util.py", line 7, in consume_tags
value_map[tag_name] = tag_map[tag_name](sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/revision.py", line 20, in <lambda>
'contributor': lambda e: Contributor.from_element(e),
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/contributor.py", line 40, in from_element
values = consume_tags(cls.TAG_MAP, element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/util.py", line 7, in consume_tags
value_map[tag_name] = tag_map[tag_name](sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/contributor.py", line 14, in <lambda>
'id': lambda e: int(e.text),
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
re_raise(error, path)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/map.py", line 12, in re_raise
raise error
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
Failed while processing dump '/mnt/data/xmldatadumps/public/hewiki/20190501/hewiki-20190501-pages-meta-history1.xml-p6536p13701.bz2':
Traceback (most recent call last):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/processor.py", line 35, in run
for out in self.process_dump(dump, path):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 94, in process_dump
for cite in extract_cite_history(page, extractors):
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mwcites/utilities/extract.py", line 116, in extract_cite_history
for revision in page:
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/page.py", line 72, in load_revisions
yield Revision.from_element(sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/revision.py", line 99, in from_element
values = consume_tags(cls.TAG_MAP, element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/util.py", line 7, in consume_tags
value_map[tag_name] = tag_map[tag_name](sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/revision.py", line 20, in <lambda>
'contributor': lambda e: Contributor.from_element(e),
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/contributor.py", line 40, in from_element
values = consume_tags(cls.TAG_MAP, element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/util.py", line 7, in consume_tags
value_map[tag_name] = tag_map[tag_name](sub_element)
File "/srv/home/bmansurov/venv/mwcites/lib/python3.5/site-packages/mw/xml_dump/iteration/contributor.py", line 14, in <lambda>
'id': lambda e: int(e.text),
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
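Both tracebacks end at `'id': lambda e: int(e.text)` in `mw/xml_dump/iteration/contributor.py`, so `e.text` was `None` there. That is what you would see if a `<contributor>` in the dump carried an `<id>` tag with no text, although I have not inspected the dump to confirm it. Below is a minimal sketch of a None-safe parser, assuming that is the cause. The helper name `parse_optional_int` and the XML fragments are hypothetical and are not part of `mw` or `mwcites`.

```python
import xml.etree.ElementTree as ET


def parse_optional_int(element):
    """Return int(element.text), or None when the tag has no text.

    Hypothetical helper for illustration; not part of the mw library.
    """
    text = element.text
    if text is None or not text.strip():
        return None
    return int(text)


# Hypothetical fragments imitating a <contributor> element from a dump.
with_id = ET.fromstring(
    "<contributor><username>Example</username><id>42</id></contributor>"
)
empty_id = ET.fromstring(
    "<contributor><username>Example</username><id /></contributor>"
)

print(parse_optional_int(with_id.find("id")))   # 42
print(parse_optional_int(empty_id.find("id")))  # None
```

If the assumption holds, replacing the `'id'` lambda with a guard like this would let the revision load with a contributor id of `None` instead of aborting the whole dump file. Whether downstream code accepts a `None` id is a separate question.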