<fstream>: Non-char types are 30 to 50x slower when reading/writing than regular char #817

Closed
jessey-git opened this issue May 10, 2020 · 3 comments · Fixed by #2739
Labels
fixed Something works now, yay! performance Must go faster

Comments

@jessey-git

jessey-git commented May 10, 2020

Describe the bug
Reading and writing to ifstream/ofstream is 30 to 50x slower when using uint8_t or std::byte vs. using regular char.

Attached is a small repro program demonstrating the poor behavior.

This caught me off guard in a project of mine where I attempted to do the "right" thing and was swiftly penalized for it. I tried to represent a buffer as a vector<uint8_t> rather than vector<char>, and suddenly reading/writing that buffer became enormously slow. This was just before std::byte became available. I worked around it using reinterpret_cast<char *>.
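
For readers who want to see what that workaround might look like, here is a minimal sketch, not the code from the original project. It assumes C++17 (for std::byte), keeps the buffer as vector<std::byte>, and goes through the ordinary char-based std::ofstream/std::ifstream with a reinterpret_cast at the I/O boundary. The helper names write_bytes and read_bytes are hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: writes a std::byte buffer through a plain char stream,
// so the stream's element type is exactly char.
bool write_bytes(const char* path, const std::vector<std::byte>& data)
{
  std::ofstream out(path, std::ios::binary);
  if (!out)
  {
    return false;
  }
  // Viewing the buffer through a char pointer is permitted by the aliasing rules.
  out.write(reinterpret_cast<const char*>(data.data()),
            static_cast<std::streamsize>(data.size()));
  return static_cast<bool>(out);
}

// Hypothetical helper: fills an already-sized std::byte buffer from a file.
// Returns true only if exactly data.size() bytes were read.
bool read_bytes(const char* path, std::vector<std::byte>& data)
{
  std::ifstream in(path, std::ios::binary);
  if (!in)
  {
    return false;
  }
  in.read(reinterpret_cast<char*>(data.data()),
          static_cast<std::streamsize>(data.size()));
  return in.gcount() == static_cast<std::streamsize>(data.size());
}
```

The rest of the program can keep using the std::byte buffer. Only the two calls that touch the stream see char.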

Now that std::byte is well established in the ecosystem, I decided to test again to see whether it is affected as well, and it sure is. I can see this becoming a large issue if folks actually embrace std::byte in modern, idiomatic C++.

Compiled with Visual Studio 16.6.0 Preview 6.0

Machine 1:

Write to ofstream<char> 0.9732 ms
Write to ofstream<uint8_t> 43.1419 ms
Write to ofstream<std::byte> 45.2306 ms
Read from ifstream<char> 1.0306 ms
Read from ifstream<uint8_t> 43.053 ms
Read from ifstream<std::byte> 43.2853 ms

Machine 2:

Write to ofstream<char> 0.8127 ms
Write to ofstream<uint8_t> 34.6434 ms
Write to ofstream<std::byte> 34.0428 ms
Read from ifstream<char> 0.9546 ms
Read from ifstream<uint8_t> 32.4374 ms
Read from ifstream<std::byte> 31.9037 ms

Command-line test case
Build and run the attached Visual Studio solution in x64 Release mode.

Expected behavior
Equivalent performance for all similar data types: char, uint8_t, and std::byte.

STL version
16.6.0 Preview 6.0

SlowIO.zip

@StephanTLavavej StephanTLavavej added the performance Must go faster label May 10, 2020
@StephanTLavavej StephanTLavavej changed the title fstream: Non-char types are 30 to 50x slower when reading/writing than regular char <fstream>: Non-char types are 30 to 50x slower when reading/writing than regular char May 10, 2020
@jessey-git
Author

The new char8_t type is also affected in a similar manner. In each case, the problem is that the slow codecvt path is used whenever the element type is not exactly char (despite having the same storage size).

@fsb4000
Contributor

fsb4000 commented Jul 18, 2021

@jessey-git If you add the following lines to your repro:

TType buf[4096];
ifile.rdbuf()->pubsetbuf(buf, 4096);
// same with ofile

then the performance will be the same.

#include <cstddef>
#include <cstdint> // uint8_t
#include <chrono>
#include <iostream>
#include <fstream>
#include <vector>

constexpr int ITER_COUNT = 50;

void write_time(std::chrono::high_resolution_clock::time_point start, char const* const message)
{
  std::chrono::high_resolution_clock::time_point end = std::chrono::high_resolution_clock::now();
  std::chrono::duration<double, std::milli> diff = end - start;
  std::cout << message << " " << diff.count() << " ms\n";
}

template <typename TType>
void test_write(std::vector<TType> const& data, char const* const file, char const* const message)
{
  std::basic_ofstream<TType, std::char_traits<TType>> ofile(file, std::ios::binary);
  TType buf[4096];
  ofile.rdbuf()->pubsetbuf(buf, 4096);
  std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();

  for (int i = 0; i < ITER_COUNT; i++)
  {
    ofile.write(&data[0], data.size());
  }
  ofile.flush();

  write_time(start, message);
  ofile.close();
}

template <typename TType>
void test_read(std::vector<TType>& data, char const* const file, char const* const message)
{
  std::basic_ifstream<TType, std::char_traits<TType>> ifile(file, std::ios::binary);
  TType buf[4096];
  ifile.rdbuf()->pubsetbuf(buf, 4096);
  std::chrono::high_resolution_clock::time_point start = std::chrono::high_resolution_clock::now();

  for (int i = 0; i < ITER_COUNT; i++)
  {
    ifile.read(&data[0], data.size());
  }

  write_time(start, message);
  ifile.close();
}


int main()
{
  constexpr size_t BUFFER_SIZE = 23456;

  std::vector<char> data_char(BUFFER_SIZE, 'x');
  std::vector<uint8_t> data_uint8_t(BUFFER_SIZE, 'x');
  std::vector<std::byte> data_byte(BUFFER_SIZE, std::byte{ 120 });

  // ------------------------
  // TESTS
  // ------------------------

  // ------------
  test_write(data_char, "io_char.dat", "Write to ofstream<char>");
  test_write(data_uint8_t, "io_uint8_t.dat", "Write to ofstream<uint8_t>");
  test_write(data_byte, "io_byte.dat", "Write to ofstream<std::byte>");

  // ------------
  test_read(data_char, "io_char.dat", "Read from ifstream<char>");
  test_read(data_uint8_t, "io_uint8_t.dat", "Read from ifstream<uint8_t>");
  test_read(data_byte, "io_byte.dat", "Read from ifstream<std::byte>");
}

[image]

@fsb4000
Contributor

fsb4000 commented Jul 18, 2021

related issue #605
