OutOfMemoryError with very large (>200MB) MDB files (was: build instructions) #5

Open
hay opened this issue Nov 7, 2018 · 11 comments

hay commented Nov 7, 2018

Hey,
I've used the online version of this tool before; thanks so much for providing it. However, I now have a 350+ MB Access file, so I want to use the tool directly from the command line. Unfortunately, I'm having trouble building a .jar file. I have very little experience with building Java applications, so build instructions in the README would be handy.

Where I'm stuck: I've cloned the repo and run mvn clean install in the root of the directory. This works, and I get an accessconverter-1.1.1.jar file in the target directory. However, when I try running it with java -jar accessconverter-1.1.1.jar I get an error: no main manifest attribute, in accessconverter-1.1.1.jar.
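For context, java -jar reports "no main manifest attribute" when the jar's META-INF/MANIFEST.MF has no Main-Class entry. The following is a minimal sketch for checking that entry in a built jar. The class name ManifestCheck is made up for illustration and is not part of AccessConverter.

```java
import java.io.IOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of AccessConverter.
class ManifestCheck {
    /** Returns the Main-Class declared in the jar's manifest, or null if there is none. */
    static String mainClassOf(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest at all
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    public static void main(String[] args) throws IOException {
        String mainClass = mainClassOf(args[0]);
        System.out.println(mainClass == null
                ? "No Main-Class entry: java -jar cannot start this jar"
                : "Main-Class: " + mainClass);
    }
}
```

If the entry is missing, the jar can still be started with java -cp by naming the main class explicitly, provided the class and its dependencies are on the classpath.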

Is there any way I could solve this error and build a working .jar file?

Owner

clytras commented Nov 7, 2018

Hello. Yes, it would be nice to include build instructions and I will add them at some point, but you don't have to compile your own version to use the current code. I have already compiled it, and you can download it from the releases here on the GitHub repo. Just visit https://github.com/clytras/AccessConverter/releases/download/v1.1/AccessConverter_v1.1.zip to download a zip file containing the latest compiled jar along with its dependencies.

Author

hay commented Nov 7, 2018

Ah, awesome, thank you so much! Funny, I tried looking for pre-built releases but couldn't find them, even though they were available.

hay closed this as completed Nov 7, 2018

Owner

clytras commented Nov 7, 2018

Well, I haven't compiled/uploaded the v1.1.1 release yet. It fixes some bugs when importing huge amounts of data and is already running on the online tool. So if you face any problems with the compiled v1.1 jar, let me know and I'll try to find some time to upload a release for v1.1.1.

EDIT
Here you can find all the releases:
image

Author

hay commented Nov 7, 2018

That would be excellent. I've downloaded the v1.1 release to import the Access file, but it's giving out of memory errors, even if I increase the heap size to 16GB.

Owner

clytras commented Nov 7, 2018

I have uploaded the v1.1.1 release; you can get it here: https://github.com/clytras/AccessConverter/releases/download/v1.1.1/AccessConverter_v1.1.1.zip. However, I don't think it will fix your issue, because for JSON or MySQL conversion it uses an org.apache.commons.text.TextStringBuilder to hold the whole output (the data or the creation query) in memory, so I assume it hits some limit. I now think the text outputs should be written directly to the output file on disk, instead of collecting everything in a string buffer and saving that buffer to the output file at the end. Try v1.1.1 and let me know how it behaves. Are you using the -show-progress flag to see how far it gets before it stops/crashes? What type are you trying to convert to? Can you also try SQLite? The SQLite conversion does not write anything to a text file; it creates a SQLite db and writes directly to it.
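The direct-to-disk idea described above can be sketched roughly as follows. This is not AccessConverter's actual code: StreamingDump and writeStatements are hypothetical names, and the statements are assumed to arrive one at a time from some lazy source such as a row cursor. One plausible limit on the buffered approach is that Java arrays are indexed by int, so a single in-memory character buffer cannot grow past roughly 2^31 characters no matter how large the heap is.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not part of AccessConverter.
class StreamingDump {
    /**
     * Writes each statement to the target file as soon as it is produced,
     * so memory use is bounded by one statement plus the writer's buffer.
     * Returns the number of statements written.
     */
    static long writeStatements(Iterable<String> statements, Path target) throws IOException {
        long count = 0;
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (String statement : statements) {
                out.write(statement);
                out.write('\n');
                count++;
            }
        }
        return count;
    }
}
```

The memory bound only holds if the Iterable really is lazy; passing in a fully materialized list would just move the problem elsewhere.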

Author

hay commented Nov 7, 2018

Awesome, thanks for making that release so quickly.

Unfortunately, I still get out of heap errors. However, this new version does actually show progress when using the -show-progress flag. It quickly runs out of memory when trying to convert to JSON.

I've also tried the convert-sqlite option, and that seems to work better. However, when it gets to 90% or so it also crashes with the same error as when using JSON.

If you want to reproduce the error, the MDB file is over here. It contains all monuments in the Netherlands.

hay reopened this Nov 7, 2018
hay changed the title from "Build instructions" to "OutOfMemoryError with very large (>200MB) MDB files (was: build instructions)" Nov 7, 2018

Owner

clytras commented Nov 8, 2018

Got the MDB file, thanks. I'll try to check it out tomorrow because it's late here in Greece!
I'm surprised that it crashes when converting to SQLite, but I need to change the text buffer methods anyway.
In the meantime, I managed to extract the SQL file with this command:

java -Xmx10g -XX:+UseConcMarkSweepGC -jar ../AccessConverter.jar --access-file "./Extract_MRS_V11.0.03.mdb" --task convert-mysql-dump --output-file "./Extract_MRS.sql" --log-file "./test1.log" -show-progress -no-log
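Note that JVM options such as -Xmx10g in the command above must come before -jar; anything placed after the jar name is passed to the program as an ordinary argument. A small sketch for confirming the heap limit the JVM actually picked up (HeapCheck is a made-up name, not part of AccessConverter):

```java
// Hypothetical helper, not part of AccessConverter.
class HeapCheck {
    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

On Java 11 or later this can be run directly from the source file, for example with java -Xmx10g HeapCheck.java; the reported value is usually somewhat below the requested size.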

Here is the link if you want to download it (200MB):
https://lytrax.io/pub/Extract_MRS.sql

Author

hay commented Nov 8, 2018

Wow, efcharisto! Thanks so much for converting that file!

Owner

clytras commented Nov 9, 2018

You're welcome!
Please do not close this issue. I have to implement a new way to dump the data for the .sql and .json text outputs.

Author

hay commented Nov 11, 2018

Hey Clytras,
I've tried importing the SQL dump you provided, but I think the Access to MySQL conversion has a couple of bugs that prevent insertion into a database. I'm not quite sure what's going on (because I can't look at the original database file), but two things are obviously wrong:

  • It seems that empty fields get translated to nothing in INSERT statements. For an example, see line 122 in the dump ((, '4.476x')). That single comma at the beginning makes MySQL give errors. I guess it should be a ''.
  • Whole INSERT statements seem to get dropped. For an example, see line 143.
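As an illustration of the first point, here is a minimal sketch of rendering a row so that an empty field still produces a valid SQL literal. SqlValues is a hypothetical class, not the converter's code, and whether the original Access field was empty or NULL cannot be told from the dump, so the sketch simply handles both. The escaping is deliberately simplistic; real code would rather rely on prepared statements or the driver's own escaping.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical sketch, not part of AccessConverter.
class SqlValues {
    /** Renders one value as a MySQL string literal; null becomes NULL, "" becomes ''. */
    static String literal(String value) {
        if (value == null) {
            return "NULL";
        }
        // Escape backslashes first, then double any single quotes.
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'";
    }

    /** Renders a row as a parenthesized, comma-separated VALUES tuple. */
    static String tuple(List<String> row) {
        StringJoiner joiner = new StringJoiner(", ", "(", ")");
        for (String value : row) {
            joiner.add(literal(value));
        }
        return joiner.toString();
    }
}
```

With this, a row holding an empty string followed by 4.476x renders as ('', '4.476x') rather than (, '4.476x').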

I wrote a (pretty hacky) Python script to 'fix' those bugs (available here). After running that I could import the SQL dump, but I still need to validate the actual data to see if anything is missing.

Owner

clytras commented Jul 31, 2024

Hey @hay, it's been quite a few years since the last release. I have now tackled the performance issues with exporting huge files, but I can't find the file with your dataset to try it out. If you're interested and still have that file, I'd love to get it and test it with the latest release.
