Word pages look unformatted #129

dbogdanov · 2020-07-15T20:08:47Z

The pages for each particular word look unformatted, with lots of metadata tags output as raw text.
Is this intentional? I've just installed the app and am testing it.

An example (EN.quickdic) rendered in QuickDic compared to the same Wiktionary page in Firefox Android:

App version: 5.5.6

rdoeffinger · 2020-07-15T20:49:32Z

"Intentional" is the wrong word.
The Wiktionary data looks like what is shown on the left side, and there is no code that is easy to use or integrate to convert it to what is shown on the right side.
Support has been added for some specific, common tags. It would be possible to add support for some more, and some others could simply be removed, since they increase dictionary size without much benefit (online links, for example, are of somewhat questionable use in an offline dictionary).
It would take some work, though, and would only improve things, not completely fix them.
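
To illustrate the kind of per-tag handling described above, here is a minimal Java sketch, not QuickDic's actual code. The class name `TemplateSketch`, its methods, and the choice of tags (`l`, `lb`, `audio`, `wikipedia`) are assumptions picked for illustration. It renders a few known tags as plain text, drops a few that are of little use offline, and leaves everything else as raw text, which is why this approach can only improve the output rather than fix it completely.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of QuickDic.
class TemplateSketch {
    // Matches a template without nested braces, e.g. {{l|en|dog}}.
    private static final Pattern TEMPLATE = Pattern.compile("\\{\\{([^{}]*)\\}\\}");

    // Assumed set of tags that add size but little value offline.
    private static final Set<String> DROPPED =
            new HashSet<>(Arrays.asList("audio", "wikipedia"));

    // Converts the text between {{ and }} into display text.
    static String renderTemplate(String body) {
        String[] parts = body.split("\\|", -1);
        String name = parts[0].trim();
        if (DROPPED.contains(name)) {
            return "";
        }
        if (name.equals("l")) {
            // {{l|lang|word}} is shown as just the word.
            return parts.length >= 3 ? parts[2] : "";
        }
        if (name.equals("lb")) {
            // {{lb|lang|label1|label2}} is shown as (label1, label2).
            if (parts.length < 3) {
                return "";
            }
            return "(" + String.join(", ", Arrays.copyOfRange(parts, 2, parts.length)) + ")";
        }
        // Unsupported tags stay as raw text.
        return "{{" + body + "}}";
    }

    // Replaces every non-nested template in the given wikitext.
    static String render(String wikitext) {
        Matcher m = TEMPLATE.matcher(wikitext);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(out, Matcher.quoteReplacement(renderTemplate(m.group(1))));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this sketch, `TemplateSketch.render("{{lb|en|informal}} a {{l|en|dog}}{{audio|en|x.ogg}}")` yields `(informal) a dog`, while a tag it does not know, such as `{{foo|bar}}`, passes through unchanged.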

Huy-Ngo · 2021-01-01T01:41:03Z

I suppose Wikimedia should have a parser for this markup. Maybe you can import it?

ilius · 2021-01-01T04:27:10Z

I have this problem as well in my Python tool: ilius/pyglossary#48

I think using .zim files (from the Kiwix project) is the easiest way to use Wiktionary or Wikipedia offline.
There is libzim.

shaked6540 · 2021-01-28T23:26:15Z

There actually is an easy way to extract the formatted data using https://github.com/tatuylonen/wiktextract

ilius · 2021-01-29T01:45:25Z

That tool simply downloads the rendered HTML from the Wiktionary website one entry at a time.
It does not render it.
It's also in Python. This is a Java project.

shaked6540 · 2021-01-29T11:52:19Z

You use it to extract the information, which you can then convert to the same format this dictionary uses, making it human-readable. I'm using it in my app. There's no README yet, but you can compile it and see for yourself how much cleaner and more readable it is.
