Add full support of format string parsing in compile-time API #2129

132 changes: 120 additions & 12 deletions include/fmt/compile.h
@@ -463,6 +463,40 @@ template <typename Char, typename T, int N> struct field {
template <typename Char, typename T, int N>
struct is_compiled_format<field<Char, T, N>> : std::true_type {};

// A replacement field that refers to argument with name.
template <typename Char> struct runtime_named_field {
using char_type = Char;
basic_string_view<Char> name;

template <typename OutputIt, typename T>
constexpr static bool try_format_argument(OutputIt& end, OutputIt out,
basic_string_view<Char> arg_name,
const T& arg) {
if constexpr (!is_named_arg<typename std::remove_cv<T>::type>::value) {
return false;
} else {
if (arg_name == arg.name) {
end = write<Char>(out, arg.value);
return true;
}
return false;
}
}

template <typename OutputIt, typename... Args>
constexpr OutputIt format(OutputIt out, const Args&... args) const {
auto end = out;
bool found = (try_format_argument(end, out, name, args) || ...);
if (!found) {
throw format_error("argument with specified name is not found");
}
return end;
}
};

template <typename Char>
struct is_compiled_format<runtime_named_field<Char>> : std::true_type {};

// A replacement field that refers to argument N and has format specifiers.
template <typename Char, typename T, int N> struct spec_field {
using char_type = Char;
@@ -536,15 +570,51 @@ template <typename T, typename Char> struct parse_specs_result {
int next_arg_id;
};

constexpr int manual_indexing_id() { return -1; }

template <typename T, typename Char>
constexpr parse_specs_result<T, Char> parse_specs(basic_string_view<Char> str,
size_t pos, int arg_id) {
size_t pos, int next_arg_id) {
str.remove_prefix(pos);
auto ctx = basic_format_parse_context<Char>(str, {}, arg_id + 1);
auto ctx = basic_format_parse_context<Char>(str, {}, next_arg_id);
auto f = formatter<T, Char>();
auto end = f.parse(ctx);
return {f, pos + fmt::detail::to_unsigned(end - str.data()) + 1,
ctx.next_arg_id()};
next_arg_id == 0 ? manual_indexing_id() : ctx.next_arg_id()};
}

template <typename Char> struct arg_id_handler {
constexpr void on_error(const char* message) { throw format_error(message); }

constexpr int on_arg_id() {
throw format_error("handler cannot be used for empty arg_id");
Contributor:
"for empty arg_id" -> "with automatic indexing"

Also can this be an assert?

Contributor Author (@alexezeder, Feb 14, 2021):
But it can actually be used for automatic indexing with named identifiers. Both the runtime and (now) compile-time APIs keep automatic indexing when a named argument identifier is used. So it just cannot be used for an unnamed argument identifier in automatic indexing mode, which is what this message is trying to say.
By the way, this function is not called under normal conditions, because the code that invokes this handler ensures that it is used only for numeric or named arguments. As long as that holds, no one will see this message; it only appears if someone breaks the parsing code.

Also can this be an assert?

As I said, it just indicates an internal error, so this compile-time error can be triggered by anything that is not compile-time friendly. I saw several uses of throw format_error(...) elsewhere, so I used it here too.
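
The indexing rules discussed in this thread can be sketched as a small state function. This is a minimal illustration, not the code from the PR: `id_kind`, `is_allowed`, and `next_state` are hypothetical names, and only `manual_indexing_id()` and the checks themselves are taken from the diff. The state `-1` means manual indexing, `0` means no mode has been chosen yet, and a positive value is the next automatic index.

```cpp
#include <cassert>

// Sentinel used by the diff to mark manual indexing mode.
constexpr int manual_indexing_id() { return -1; }

// Hypothetical classification of a replacement field's argument id:
// "{}" or "{:...}", "{0}", and "{name}" respectively.
enum class id_kind { automatic, index, name };

// Mirrors the static_asserts in compile_format_string: a numeric id is only
// allowed in manual mode or before any field was seen; "{}" and "{name}" are
// only allowed outside manual mode.
constexpr bool is_allowed(int state, id_kind kind) {
  return kind == id_kind::index
             ? (state == manual_indexing_id() || state == 0)
             : state != manual_indexing_id();
}

// A numeric id switches to manual mode; "{}" and "{name}" advance the counter.
constexpr int next_state(int state, id_kind kind) {
  return kind == id_kind::index ? manual_indexing_id() : state + 1;
}

static_assert(is_allowed(0, id_kind::automatic), "\"{}\" may come first");
static_assert(is_allowed(1, id_kind::name), "\"{} {name}\" is accepted");
static_assert(!is_allowed(1, id_kind::index), "\"{} {0}\" is rejected");
static_assert(!is_allowed(manual_indexing_id(), id_kind::automatic),
              "\"{0} {}\" is rejected");
```

Under these rules a named field advances the automatic counter just like "{}", which is consistent with the Named test where "{name1} {}" picks up the second argument.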

Contributor:
I'm not entirely sure what you mean by "unnamed argument identifier". Both "{}" and "{:...}" denote automatic indexing which is why I'm suggesting this minor wording change. It doesn't matter much since it's an internal error but a bit more consistent with the wording elsewhere.

it just indicates an internal error

Right, and this is exactly why I suggest using an assert if possible. It would distinguish an internal error from a user error, even though both result in a compilation error. If assert doesn't work for some reason, then throw is OK.
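
The distinction drawn here can be sketched with two small constexpr functions. This is only an illustration under assumptions: `parse_digit`, `digit_value`, and `rejects` are hypothetical names, the standard `assert` stands in for whatever assertion macro the library would use, and C++14 or later is assumed.

```cpp
#include <cassert>
#include <stdexcept>

// User-facing check: bad input is reported with an exception. During constant
// evaluation, reaching the throw makes the expression non-constant, so the
// compiler rejects it.
constexpr int parse_digit(char c) {
  if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
  return c - '0';
}

// Internal invariant: callers are expected to have validated `c` already, so
// a failure here points at a bug in the parsing code, not in the user's input.
constexpr int digit_value(char c) {
  assert(c >= '0' && c <= '9');
  return c - '0';
}

static_assert(parse_digit('7') == 7, "valid input is a constant expression");
static_assert(digit_value('7') == 7, "a passing assert is fine at compile time");
// constexpr int bad = parse_digit('x');  // would not compile: throw is reached

// Runtime helper showing that the same function throws outside constant
// evaluation.
inline bool rejects(char c) {
  try {
    (void)parse_digit(c);
  } catch (const std::invalid_argument&) {
    return true;
  }
  return false;
}
```

Both failures surface as compilation errors in a constant-evaluated context; the difference is what the diagnostic tells the reader about who made the mistake.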

Contributor Author:
Yes, "unnamed argument identifier" sounds a bit strange. 🙂
The problem is probably my misunderstanding of how named arguments work. Once I update this PR (as I wrote here), this wording issue should go away.

Contributor Author:
Done

return 0;
}

constexpr int on_arg_id(int id) {
arg_id = arg_ref<Char>(id);
return 0;
}

constexpr int on_arg_id(basic_string_view<Char> id) {
arg_id = arg_ref<Char>(id);
return 0;
}

arg_ref<Char> arg_id;
};

template <typename Char> struct parse_arg_id_result {
arg_ref<Char> arg_id;
const Char* arg_id_end;
};
Comment on lines +609 to +612
Contributor:
Can we pass begin by reference in parse_arg_id and avoid introducing this struct?

Contributor Author:
Hmm... that would be a reference to the pointer or (IMHO better) a pointer to the pointer. Is that OK?

Contributor:
Sure, I think a reference is better unless it can be null.

Contributor Author (@alexezeder, Feb 16, 2021):
Actually, that is probably impossible: arg_id_end needs to be a constexpr variable, and, more importantly, begin would have to be a non-constexpr variable in that case even though it must be used in a constexpr context.
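
The constraint described here can be sketched as follows. This is a simplified illustration, not the PR's parser: `parse_index`, `parse_index_result`, `index_tag`, `sample_text`, and `parsed_id` are hypothetical names, and the parser only handles decimal digits. Returning both the parsed value and the end position in one struct lets the caller store everything in a single constexpr variable, whose members can then feed template arguments or `if constexpr`.

```cpp
#include <cassert>

// Hypothetical result type: the parsed value plus the position where parsing
// stopped, playing the role that parse_arg_id_result plays in the diff.
struct parse_index_result {
  int value;
  const char* end;
};

// Parses a run of decimal digits. Both outputs travel in the return value,
// so no mutable out-parameter is needed.
constexpr parse_index_result parse_index(const char* begin, const char* end) {
  int value = 0;
  while (begin != end && *begin >= '0' && *begin <= '9') {
    value = value * 10 + (*begin - '0');
    ++begin;
  }
  return {value, begin};
}

constexpr char sample_text[] = "12}";
constexpr auto parsed_id = parse_index(sample_text, sample_text + 3);

// Because parsed_id is a constexpr variable, its members are usable wherever
// a constant expression is required.
template <int N> struct index_tag {};
using tag = index_tag<parsed_id.value>;

static_assert(parsed_id.value == 12, "the index is known at compile time");
static_assert(parsed_id.end - sample_text == 2, "parsing stopped at '}'");
```

With a by-reference out-parameter, the caller would have to pass a modifiable local pointer, and the value of such a variable could not then be used as a template argument.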


template <int ID, typename Char>
constexpr auto parse_arg_id(const Char* begin, const Char* end) {
auto handler = arg_id_handler<Char>{arg_ref<Char>{}};
auto adapter = id_adapter<arg_id_handler<Char>, Char>{handler, 0};
auto arg_id_end = parse_arg_id(begin, end, adapter);
return parse_arg_id_result<Char>{handler.arg_id, arg_id_end};
}

// Compiles a non-empty format string and returns the compiled representation
@@ -558,17 +628,55 @@ constexpr auto compile_format_string(S format_str) {
throw format_error("unmatched '{' in format string");
if constexpr (str[POS + 1] == '{') {
return parse_tail<Args, POS + 2, ID>(make_text(str, POS, 1), format_str);
} else if constexpr (str[POS + 1] == '}') {
using id_type = get_type<ID, Args>;
return parse_tail<Args, POS + 2, ID + 1>(field<char_type, id_type, ID>(),
format_str);
} else if constexpr (str[POS + 1] == ':') {
} else if constexpr (str[POS + 1] == '}' || str[POS + 1] == ':') {
static_assert(ID != manual_indexing_id(),
"cannot switch from manual to automatic argument indexing");
using id_type = get_type<ID, Args>;
constexpr auto result = parse_specs<id_type>(str, POS + 2, ID);
return parse_tail<Args, result.end, result.next_arg_id>(
spec_field<char_type, id_type, ID>{result.fmt}, format_str);
if constexpr (str[POS + 1] == '}') {
constexpr auto next_id =
ID != manual_indexing_id() ? ID + 1 : manual_indexing_id();
return parse_tail<Args, POS + 2, next_id>(
field<char_type, id_type, ID>(), format_str);
} else {
constexpr auto result = parse_specs<id_type>(str, POS + 2, ID + 1);
return parse_tail<Args, result.end, result.next_arg_id>(
spec_field<char_type, id_type, ID>{result.fmt}, format_str);
}
} else {
return unknown_format();
constexpr auto arg_id_result =
parse_arg_id<ID>(str.data() + POS + 1, str.data() + str.size());
constexpr auto arg_id_end_pos = arg_id_result.arg_id_end - str.data();
constexpr char_type c =
arg_id_end_pos != str.size() ? str[arg_id_end_pos] : char_type();
static_assert(c == '}' || c == ':', "missing '}' in format string");
if constexpr (arg_id_result.arg_id.kind == arg_id_kind::index) {
static_assert(
ID == manual_indexing_id() || ID == 0,
"cannot switch from automatic to manual argument indexing");
constexpr auto arg_index = arg_id_result.arg_id.val.index;
using id_type = get_type<arg_index, Args>;
if constexpr (c == '}') {
return parse_tail<Args, arg_id_end_pos + 1, manual_indexing_id()>(
field<char_type, id_type, arg_index>(), format_str);
} else if constexpr (c == ':') {
constexpr auto result =
parse_specs<id_type>(str, arg_id_end_pos + 1, 0);
return parse_tail<Args, result.end, result.next_arg_id>(
spec_field<char_type, id_type, arg_index>{result.fmt},
format_str);
}
} else if constexpr (arg_id_result.arg_id.kind == arg_id_kind::name) {
static_assert(
ID != manual_indexing_id(),
"cannot switch from manual to automatic argument indexing");
if constexpr (c == '}') {
return parse_tail<Args, arg_id_end_pos + 1, ID + 1>(
runtime_named_field<char_type>{arg_id_result.arg_id.val.name},
format_str);
} else if constexpr (c == ':') {
return unknown_format(); // no type info for specs parsing
}
}
}
} else if constexpr (str[POS] == '}') {
if constexpr (POS + 1 == str.size())
129 changes: 125 additions & 4 deletions test/compile-test.cc
@@ -139,13 +139,136 @@ TEST(CompileTest, FormatWideString) {
EXPECT_EQ(L"42", fmt::format(FMT_COMPILE(L"{}"), 42));
}

struct test_custom_formattable {};

FMT_BEGIN_NAMESPACE
template <> struct formatter<test_custom_formattable> {
enum class output_type { two, four } type{output_type::two};

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
auto it = ctx.begin(), end = ctx.end();
while (it != end && *it != '}') {
++it;
}
auto spec = string_view(ctx.begin(), static_cast<size_t>(it - ctx.begin()));
auto tag = string_view("custom");
if (spec.size() == tag.size()) {
bool is_same = true;
for (size_t index = 0; index < spec.size(); ++index) {
if (spec[index] != tag[index]) {
is_same = false;
break;
}
}
type = is_same ? output_type::four : output_type::two;
} else {
type = output_type::two;
}
return it;
}

template <typename FormatContext>
auto format(const test_custom_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
return format_to(ctx.out(), type == output_type::two ? "{:>2}" : "{:>4}",
42);
}
};
FMT_END_NAMESPACE
Contributor:
I suggest using one of the existing formatters, such as the duration formatter, instead of introducing a new one here.

Contributor Author:
One problem here is that the chrono::duration formatter is not ready to be used with the compile-time API because of the format() constness requirement. Should I update it in this PR or in a separate one?
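
The constness requirement mentioned here can be illustrated with a simplified model. It assumes that a compiled field stores the parsed formatter and formats through a const member function, as `runtime_named_field::format` in the diff does; all names below (`const_formatter`, `mutable_formatter`, `compiled_field`, `has_const_format`, `format_compiled`) are hypothetical and the formatter interface is reduced to a single `format(int)` call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical formatter whose format() is const-qualified.
struct const_formatter {
  std::size_t width;
  std::string format(int value) const {
    std::string s = std::to_string(value);
    // Right-align within `width` characters.
    return s.size() < width ? std::string(width - s.size(), ' ') + s : s;
  }
};

// Hypothetical formatter whose format() mutates state, so it is not const.
struct mutable_formatter {
  int calls;
  std::string format(int value) {
    ++calls;
    return std::to_string(value);
  }
};

// Simplified compiled field: it formats through a const member function, so
// fmt.format() must be callable on a const object.
template <typename Formatter> struct compiled_field {
  Formatter fmt;
  std::string format(int value) const { return fmt.format(value); }
};

// Detects whether F::format(int) can be called on a const F.
template <typename F, typename = void>
struct has_const_format : std::false_type {};
template <typename F>
struct has_const_format<
    F, decltype(void(std::declval<const F&>().format(0)))> : std::true_type {};

static_assert(has_const_format<const_formatter>::value,
              "usable inside compiled_field");
static_assert(!has_const_format<mutable_formatter>::value,
              "compiled_field<mutable_formatter>::format would not compile");

inline std::string format_compiled(int value, std::size_t width) {
  const compiled_field<const_formatter> field{const_formatter{width}};
  return field.format(value);
}
```

In this model, making a formatter usable only requires adding `const` to its `format` member, provided the function does not modify the formatter's state.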

Contributor:
Should I update it in this PR or the separate one?

This PR is OK since it should be a small change.

Contributor Author:
Done, using the weirdest-looking format string from chrono-test.


TEST(CompileTest, FormatSpecs) {
EXPECT_EQ("42", fmt::format(FMT_COMPILE("{:x}"), 0x42));
EXPECT_EQ("42", fmt::format(FMT_COMPILE("{}"), test_custom_formattable()));
EXPECT_EQ(" 42",
fmt::format(FMT_COMPILE("{:custom}"), test_custom_formattable()));
}

TEST(CompileTest, DynamicWidth) {
struct test_dynamic_formattable {};

FMT_BEGIN_NAMESPACE
template <> struct formatter<test_dynamic_formattable> {
size_t amount = 0;
detail::arg_ref<char> width_refs[3];

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
amount = static_cast<size_t>(*ctx.begin() - '0');
if (amount >= 1) {
width_refs[0] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 2) {
width_refs[1] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 3) {
width_refs[2] = detail::arg_ref<char>(ctx.next_arg_id());
}
return ctx.begin() + 1;
}

template <typename FormatContext>
auto format(const test_dynamic_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
int widths[3]{};
for (size_t i = 0; i < amount; ++i) {
detail::handle_dynamic_spec<detail::width_checker>(widths[i],
width_refs[i], ctx);
}
if (amount == 1) {
return format_to(ctx.out(), "{:{}}", 41, widths[0]);
} else if (amount == 2) {
return format_to(ctx.out(), "{:{}}{:{}}", 41, widths[0], 42, widths[1]);
} else if (amount == 3) {
return format_to(ctx.out(), "{:{}}{:{}}{:{}}", 41, widths[0], 42,
widths[1], 43, widths[2]);
} else {
throw format_error("formatting error");
}
}
};
FMT_END_NAMESPACE
Contributor:
Same here: the duration formatter has dynamic field support.

Contributor Author:
Actually, the previous comment (about the custom formatter) and this one are not the same.
Yes, the duration formatter has dynamic field support. But as far as I can see, it supports the same set of nested replacement fields as the default formatter, {:{}.{}}. So handling two dynamic fields for the default formatter would probably be enough to pass a test that uses the chrono::duration formatter.
This custom formatter, in contrast, has its own syntax for nested replacement fields (not {:{}.{}}) and uses three of them, so handling only the default dynamic fields would not be enough to pass the test.
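
To make the argument counting concrete, here is a minimal sketch that counts how many arguments a format string consumes under automatic indexing when nested fields use the default brace syntax. The function name `count_auto_fields` is hypothetical, and the sketch assumes a well-formed string in which every unescaped '{' opens a replacement field with no explicit index or name.

```cpp
#include <cassert>
#include <cstddef>

// Counts replacement fields, including nested ones such as the width in
// "{:{}}", in a null-terminated format string. "{{" is treated as an escaped
// brace and skipped.
constexpr std::size_t count_auto_fields(const char* s) {
  std::size_t count = 0;
  while (*s != '\0') {
    if (s[0] == '{' && s[1] == '{') {  // escaped brace, not a field
      s += 2;
      continue;
    }
    if (*s == '{') ++count;
    ++s;
  }
  return count;
}

static_assert(count_auto_fields("{:{}}{:{}}") == 4,
              "two values plus two dynamic widths");
static_assert(count_auto_fields("{:{}.{}}") == 3,
              "value, dynamic width, dynamic precision");
static_assert(count_auto_fields("{{}}") == 0, "escaped braces only");
```

A formatter that claims argument ids by calling next_arg_id() inside parse(), like test_dynamic_formattable above with "{:3}", is invisible to this kind of brace scan. That appears to be why parse_specs in the diff reports ctx.next_arg_id() back to the caller rather than inferring the count from the text.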

Contributor:
I don't think we need to test the implementation of exotic formatter specializations here.

Contributor Author:
Done, using a format string from chrono-test that uses dynamic specs.


TEST(CompileTest, DynamicFormatSpecs) {
EXPECT_EQ(" 42foo ",
fmt::format(FMT_COMPILE("{:{}}{:{}}"), 42, 4, "foo", 5));
EXPECT_EQ(" 41",
fmt::format(FMT_COMPILE("{:1}"), test_dynamic_formattable(), 4));
EXPECT_EQ(" 41 42",
fmt::format(FMT_COMPILE("{:2}"), test_dynamic_formattable(), 3, 5));
EXPECT_EQ(" 41 42 43", fmt::format(FMT_COMPILE("{:3}"),
test_dynamic_formattable(), 5, 3, 4));
}

TEST(CompileTest, ManualOrdering) {
EXPECT_EQ("42", fmt::format(FMT_COMPILE("{0}"), 42));
EXPECT_EQ(" -42", fmt::format(FMT_COMPILE("{0:4}"), -42));
EXPECT_EQ("41 43", fmt::format(FMT_COMPILE("{0} {1}"), 41, 43));
EXPECT_EQ("41 43", fmt::format(FMT_COMPILE("{1} {0}"), 43, 41));
EXPECT_EQ("41 43", fmt::format(FMT_COMPILE("{0} {2}"), 41, 42, 43));
EXPECT_EQ(" 41 43", fmt::format(FMT_COMPILE("{1:{2}} {0:4}"), 43, 41, 4));
EXPECT_EQ("42 42",
fmt::format(FMT_COMPILE("{1} {0:custom}"),
test_custom_formattable(), test_custom_formattable()));
EXPECT_EQ(
"true 42 42 foo 0x1234 foo",
fmt::format(FMT_COMPILE("{0} {1} {2} {3} {4} {5}"), true, 42, 42.0f,
"foo", reinterpret_cast<void*>(0x1234), test_formattable()));
EXPECT_EQ(L"42", fmt::format(FMT_COMPILE(L"{0}"), 42));
}

TEST(CompileTest, Named) {
EXPECT_EQ("41 43", fmt::format(FMT_COMPILE("{name1} {name2}"),
fmt::arg("name1", 41), fmt::arg("name2", 43)));
EXPECT_EQ("41 43",
fmt::format(FMT_COMPILE("{} {name2}"), 41, fmt::arg("name2", 43)));
EXPECT_EQ("41 43",
fmt::format(FMT_COMPILE("{name1} {}"), fmt::arg("name1", 41), 43));
EXPECT_EQ("41 43",
fmt::format(FMT_COMPILE("{name1} {name2}"), fmt::arg("name1", 41),
fmt::arg("name2", 43), fmt::arg("name3", 42)));
EXPECT_EQ("41 43", fmt::format(FMT_COMPILE("{name2} {name1}"),
fmt::arg("name1", 43), fmt::arg("name2", 41)));

EXPECT_THROW(fmt::format(FMT_COMPILE("{invalid}"), fmt::arg("valid", 42)),
fmt::format_error);
}

TEST(CompileTest, FormatTo) {
@@ -174,9 +297,7 @@ TEST(CompileTest, TextAndArg) {
EXPECT_EQ("42!", fmt::format(FMT_COMPILE("{}!"), 42));
}

TEST(CompileTest, Empty) {
EXPECT_EQ("", fmt::format(FMT_COMPILE("")));
}
TEST(CompileTest, Empty) { EXPECT_EQ("", fmt::format(FMT_COMPILE(""))); }
#endif

#if FMT_USE_NONTYPE_TEMPLATE_PARAMETERS