reader: unzip and untar data without checking the filename #2

s-stepien · 2020-02-24T11:42:02Z

Recently, MaxMind has changed the URL from which the free database can
be downloaded. In the new URL schema, the suffix is given as a URL
parameter, so we need to handle it correctly during unpacking.

Signed-off-by: Sławomir Stępień [email protected]

s-stepien · 2020-02-25T07:04:50Z

Well, this is not the best thing we can do. It would be better for parse the URL, but would be super nice to check the headers (if this is even possible) and detect gnuzip and tar type data. I will take a look at better approach.

Since filename can be also an URL, it's more wise to make decision based on data and not the filename. This commit will try to unzip and untar the data, and if it fails it will pass the data down the pipe. Signed-off-by: Sławomir Stępień <[email protected]>

s-stepien · 2020-02-25T08:27:31Z

Commit updated. I think this is better approach.

mneudert

❤️ 💙 💚 💜 💛

mneudert · 2020-02-25T18:11:10Z

Just out of curiosity I ran a benchmark with the changes:

Benchee.run(
  %{
    "read" => fn -> Geolix.Adapter.MMDB2.Reader.read_database("GeoLite2-City.mmdb") end
  },
  formatters: [{Benchee.Formatters.Console, comparison: false}],
  warmup: 2,
  time: 10
)

Following values where output using an already unpacked City database (50-somewhat MB):

# untar/unzip with suffix check
Name           ips        average  deviation         median         99th %
read        135.48        7.38 ms    ±12.45%        7.21 ms        8.61 ms

# always untar/unzip
Name           ips        average  deviation         median         99th %
read         36.83       27.15 ms    ±11.83%       26.63 ms       45.35 ms

"Future versions™" might provide a more fine grained control on what to run for reading the file but as far as I am concerned these times they should not matter much right now. And it at allows to use the new download locations 👍

s-stepien requested a review from mneudert February 24, 2020 19:53

s-stepien changed the title ~~reader: handle MaxMind URL file extensions~~ reader: unzip and untar data without checking the filename Feb 25, 2020

mneudert approved these changes Feb 25, 2020

View reviewed changes

mneudert merged commit d9dbfd9 into elixir-geolix:master Feb 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reader: unzip and untar data without checking the filename #2

reader: unzip and untar data without checking the filename #2

s-stepien commented Feb 24, 2020

s-stepien commented Feb 25, 2020

s-stepien commented Feb 25, 2020

mneudert left a comment

mneudert commented Feb 25, 2020

reader: unzip and untar data without checking the filename #2

reader: unzip and untar data without checking the filename #2

Conversation

s-stepien commented Feb 24, 2020

s-stepien commented Feb 25, 2020

s-stepien commented Feb 25, 2020

mneudert left a comment

Choose a reason for hiding this comment

mneudert commented Feb 25, 2020