jesus_zen_drod

85

Projects

Personal log

  • consider implementing a consistent syntax for all future BGEs (structured, i.e. XML- or JSON-based)
8 months ago
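
A minimal sketch of what a JSON-based record could look like, written in Python. The field names (`reference`, `date`, `cited_articles`, `text`) and the placeholder values are assumptions made for illustration; the log does not define a schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Decision:
    """Hypothetical structured record for one decision (fields are assumed)."""
    reference: str                      # e.g. a placeholder such as "BGE 000 X 000"
    date: str                           # ISO 8601 date string
    cited_articles: List[str] = field(default_factory=list)
    text: str = ""


def to_json(decision: Decision) -> str:
    """Serialize a record to JSON, keeping non-ASCII characters readable."""
    return json.dumps(asdict(decision), ensure_ascii=False, indent=2)


def from_json(payload: str) -> Decision:
    """Rebuild a record from the JSON produced by to_json."""
    return Decision(**json.loads(payload))
```

Keeping one record type with a round-trip through `to_json` and `from_json` makes it easy to check that nothing is lost when the parser output is stored and reloaded.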

Conclusions on the HTML parser: it works reasonably well so far, though there may be occasional data loss. The text data has too many inconsistencies to be restructured reliably, so human post-processing of the script output is a must.

8 months ago

splitting references to extract their Art./Abs./lit. components.

8 months ago
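
The reference splitting could be sketched with a regular expression as below. This assumes references shaped like "Art. 12 Abs. 3 lit. a ZGB"; the pattern, the function name `split_reference`, and the optional trailing law abbreviation are illustrative assumptions, not the script's actual code, and real data will contain variants this pattern misses.

```python
import re
from typing import Dict, Optional

# Assumed shape: "Art. <number> [Abs. <number>] [lit. <letter>] [<law abbreviation>]"
REF_PATTERN = re.compile(
    r"Art\.\s*(?P<article>\d+[a-z]*)"
    r"(?:\s+Abs\.\s*(?P<paragraph>\d+(?:bis|ter|quater)?))?"
    r"(?:\s+lit\.\s*(?P<letter>[a-z]))?"
    r"(?:\s+(?P<law>[A-Z][A-Za-z]*))?"
)


def split_reference(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the Art./Abs./lit. components of the first reference found.

    Components that are absent from the reference are returned as None;
    None is returned when no reference is found at all.
    """
    match = REF_PATTERN.search(text)
    if match is None:
        return None
    return match.groupdict()
```

For instance, `split_reference("Art. 8 BV")` yields the article and the law abbreviation, with the paragraph and letter left as `None`.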

data pulled from bger.li is unstructured; the parser tries to address this issue

8 months ago

dedicated HTML parser for bger.li

8 months ago
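
As a rough illustration of the text-extraction step such a parser needs, here is a sketch built on Python's standard `html.parser` module. The class and function names are hypothetical, and nothing here reflects the actual markup of bger.li, which the log does not describe.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text chunks, ignoring script and style content."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse runs of whitespace so later matching is simpler.
            text = " ".join(data.split())
            if text:
                self.chunks.append(text)


def extract_text(html: str) -> List[str]:
    """Return the non-empty text chunks of an HTML document, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

The extracted chunks are still unstructured text; any restructuring would happen in a later pass over this list.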
