
mistakes and problems in database #2

Open
BugFix opened this issue Nov 17, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@BugFix

BugFix commented Nov 17, 2023

Hello,
I've tried to parse this CSV and insert it into an SQLite database. This fails every time. One reason is wrong data (your source parsing did not capture the full data):
#208 's Obere Wegle;72820;Sonnenbühl;08415091 --> Fußweg 's Obere Wegle;72820;Sonnenbühl;08415091
#209 's Untere Wegle;72820;Sonnenbühl;08415091 --> Fußweg 's Untere Wegle;72820;Sonnenbühl;08415091
#210 's-Heerenberger Str.;46446;Emmerich am Rhein;05154008 --> Alte 's-Heerenberger Str.;46446;Emmerich am Rhein;05154008
This also seems to be wrong:
#214 -4;76187;Karlsruhe;08212000
Another problem is entries like this:
#364 24 Stunden 2009;78183;Hüfingen;08326027
This is not the name of a street. It is tagged as footway, name=24 Stunden 2009. As far as I can see on OSM, it is part of a way (perhaps one that was walked in 24 hours?).
Some other examples:
#370 29 - 38;06184;Kabelsketal;15088150
#371 2;04758;Cavertitz;14730050
#372 2;92536;Pfreimd;09376153
None of these relate to any street.

I have been trying for some years to find good free sources for postal codes and regional keys. I know the sources are hard to parse, especially with nearly 2 million street names.
So: thumbs up for your work.

One hint. I think using ';' both inside values and as the delimiter causes more problems than necessary. You have enclosed such values in '"' to avoid mixing them up. That's OK. But when processing the data, I have to search for the last '"', cut everything up to it as the first field, and can then split the rest by ';' to get the other 3 fields. Replacing the ';' in the name value with ',' while parsing your sources would avoid this extra work.

Best Regards
BugFix

@fstueber
Contributor

fstueber commented Nov 17, 2023

Hi, thanks a lot for the feedback. As for the first 3 examples: that is how the names are stored in OSM.

To solve this, OSM should be updated.

@fstueber
Contributor

About #214. This is the entry in OSM:

<way id="428097109">
  <tag k="access" v="private"/>
  <tag k="highway" v="unclassified"/>
  <tag k="name" v="-4"/>
</way>

This clearly should not be in the result set. I will filter out access=private for the next extraction.
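
The planned `access=private` filter could look roughly like the following Python sketch. It is not the project's actual extraction code: the function names, the `<osm>` wrapper element, and the second way (id `1`, "Hauptstraße") are assumptions added only so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# The way quoted above, wrapped in an <osm> root so it parses as one document,
# plus a hypothetical public way (id "1") for contrast.
OSM_XML = """
<osm>
  <way id="428097109">
    <tag k="access" v="private"/>
    <tag k="highway" v="unclassified"/>
    <tag k="name" v="-4"/>
  </way>
  <way id="1">
    <tag k="highway" v="residential"/>
    <tag k="name" v="Hauptstraße"/>
  </way>
</osm>
"""


def way_tags(way):
    """Collect the <tag k=... v=...> children of a way into a dict."""
    return {tag.get("k"): tag.get("v") for tag in way.findall("tag")}


def public_named_ways(root):
    """Return (id, name) for every named way that is not access=private."""
    result = []
    for way in root.findall("way"):
        tags = way_tags(way)
        if tags.get("access") == "private":
            continue  # skip private ways such as the "-4" entry
        if "name" in tags:
            result.append((way.get("id"), tags["name"]))
    return result


root = ET.fromstring(OSM_XML)
kept = public_named_ways(root)
print(kept)
```

Only the public way remains in `kept`; the private way named "-4" is dropped.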

@fstueber
Contributor

fstueber commented Nov 17, 2023

About #364.

<way id="1053159284">
  <tag k="highway" v="footway"/>
  <tag k="name" v="24 Stunden 2009"/>
  <tag k="surface" v="paved"/>
</way>

This should not be in the result set, yes. I will have to look into how to deal with this.

@fstueber
Contributor

The last 3:

<way id="398541742">
  <tag k="foot" v="yes"/>
  <tag k="highway" v="footway"/>
  <tag k="horse" v="no"/>
  <tag k="name" v="29 - 38"/>
</way>
<way id="34520686">
  <tag k="highway" v="secondary"/>
  <tag k="maxspeed" v="50"/>
  <tag k="name" v="2"/>
  <tag k="ref" v="S 27"/>
</way>
<way id="303202319">
  <tag k="highway" v="footway"/>
  <tag k="name" v="2"/>
  <tag k="surface" v="grass"/>
</way>

Some better filtering is needed.
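
One possible heuristic, sketched in Python, is to reject names that consist only of digits, spaces, and hyphens. This is an assumption about what "better filtering" might mean, not the filter that was actually implemented, and the function name is hypothetical. It would drop "-4", "29 - 38", and "2", but not "24 Stunden 2009", and it could wrongly drop a legitimate street that really has a purely numeric name.

```python
import re

# Matches names made up solely of digits, whitespace and hyphens.
NUMERIC_ONLY = re.compile(r"^[\d\s\-]+$")


def looks_like_street_name(name):
    """Heuristic: False for empty names and for purely numeric ones like '2' or '29 - 38'."""
    stripped = name.strip()
    if not stripped:
        return False
    return NUMERIC_ONLY.match(stripped) is None


names = ["-4", "29 - 38", "2", "24 Stunden 2009", "Hauptstraße"]
kept_names = [n for n in names if looks_like_street_name(n)]
print(kept_names)
```

Cases such as "24 Stunden 2009" would need a different signal, for example the `highway` value of the way, which this sketch does not look at.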

@fstueber
Contributor

About your hint: That's how CSV (in this case with ; as delimiter) is formatted. Or did I miss something?
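
For illustration, a standard CSV reader handles quoted fields that contain the delimiter without any manual searching for the last '"'. The Python sketch below parses two sample rows and loads them into an in-memory SQLite database. The table and column names are assumptions, and the second row ("Beispielweg; Nord") is a made-up example of a quoted name containing ';'.

```python
import csv
import io
import sqlite3

# Layout as in the rows quoted in this issue: name;postal code;locality;regional key.
# The second row is invented to show a quoted value that contains ';'.
sample = (
    "Fußweg 's Obere Wegle;72820;Sonnenbühl;08415091\n"
    '"Beispielweg; Nord";12345;Musterstadt;01234567\n'
)


def read_streets(text):
    """Parse semicolon-delimited CSV text; quoted fields may contain ';'."""
    reader = csv.reader(io.StringIO(text), delimiter=";", quotechar='"')
    return [tuple(row) for row in reader if len(row) == 4]


rows = read_streets(sample)

conn = sqlite3.connect(":memory:")
# TEXT columns keep leading zeros in postal codes and regional keys.
conn.execute(
    "CREATE TABLE streets (name TEXT, postal_code TEXT, locality TEXT, regional_key TEXT)"
)
conn.executemany("INSERT INTO streets VALUES (?, ?, ?, ?)", rows)
count = conn.execute("SELECT COUNT(*) FROM streets").fetchone()[0]
print(count, rows[1][0])
```

Storing the codes as text rather than integers matters for values such as 06184 or 04758, which would otherwise lose their leading zero.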

fstueber added the bug label Nov 17, 2023
@BugFix
Author

BugFix commented Nov 17, 2023

> That's how CSV (in this case with ; as delimiter) is formatted.

You may be right. I think how CSV libraries behave depends on the language. I often use AutoIt to write programs and have my own functions for this.
But I can live with my solution too.
By the way, I'll try Lua and Nim and see how the results turn out.

Thanks for your reply.

@fstueber
Contributor

I don't know AutoIt, but it seems to support regular expressions, which are an alternative way to deal with CSV files.
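
As a rough illustration of the regular-expression approach, here is a Python sketch rather than AutoIt. It splits a line on every ';' that is followed by an even number of '"' characters, which means the ';' sits outside a quoted field. The function name and the sample line with "Beispielweg; Nord" are assumptions, and the sketch does not handle quoted fields that span several lines.

```python
import re

# Split on ';' only when an even number of double quotes follows,
# i.e. when the ';' is not inside a quoted field.
SPLIT_OUTSIDE_QUOTES = re.compile(r';(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_line(line):
    """Split one semicolon-delimited CSV line and strip the surrounding quotes."""
    fields = []
    for raw in SPLIT_OUTSIDE_QUOTES.split(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Remove the enclosing quotes and unescape doubled quotes.
            raw = raw[1:-1].replace('""', '"')
        fields.append(raw)
    return fields


print(split_line('"Beispielweg; Nord";12345;Musterstadt;01234567'))
print(split_line("2;04758;Cavertitz;14730050"))
```

The lookahead rescans the rest of the line for each ';', so a dedicated CSV parser is usually the better choice for a file with nearly 2 million rows.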

@fstueber
Contributor

Please check the new version of the OpenPLZ API: https://www.openplzapi.org/en/change-log/
