- consider implementing a consistent, structured syntax (i.e. XML- or JSON-based) for all future BGEs
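A minimal sketch of what a JSON-based syntax for a single BGE could look like, assuming Python for the tooling. Every field name below (`reference`, `date`, `language`, `paragraphs`) is hypothetical and chosen only for illustration; the log does not define a schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Decision:
    """One decision record. All field names are illustrative assumptions."""
    reference: str                 # citation string, as found in the source
    date: str                      # ISO 8601 date string
    language: str                  # language code of the text, e.g. "de"
    paragraphs: List[str] = field(default_factory=list)


def to_json(decision: Decision) -> str:
    # ensure_ascii=False keeps umlauts and accents readable in the output.
    return json.dumps(asdict(decision), ensure_ascii=False, indent=2)


def from_json(text: str) -> Decision:
    # Rebuild the record from its JSON form; unknown keys raise TypeError.
    return Decision(**json.loads(text))
```

Round-tripping a record through `to_json` and `from_json` is a cheap way to check that nothing is lost once the syntax is fixed.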
jesus_zen_drod
Personal log
8 months ago
Conclusions regarding the HTML parser:
- works reasonably well so far, though there may be occasional data loss
- the text data has too many inconsistencies to be restructured reliably; human post-processing of the script output is a must
8 months ago
Data pulled from bger.li is unstructured; the parser tries to address this issue.
8 months ago
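As a rough illustration of the first step such a parser has to take, here is a sketch that reduces an HTML page to clean text lines using only Python's standard library. It is not the script referred to in this log: the class name `TextExtractor`, the helper `extract_lines`, and the choice of block-level tags are assumptions, and the actual markup served by bger.li is not described here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, breaking lines at block-level tags."""

    SKIP = {"script", "style"}                       # content to drop entirely
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}  # assumed set

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def lines(self):
        # Collapse runs of whitespace and discard empty lines.
        text = "".join(self._parts)
        return [" ".join(ln.split()) for ln in text.splitlines() if ln.strip()]


def extract_lines(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines()
```

Calling `extract_lines(page_source)` returns a list of non-empty text lines. Turning those lines back into structured fields is the step where inconsistent source text makes manual review necessary.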