StringFormat Explained
This class is not a "Swiss Army Knife" library like regex, but it is a lot simpler to use and produces more readable code for simple yet mundane string parsing and manipulation tasks.
Without any explanation, see if you can guess what the following code does:
Optional<LogFile> log =
    new StringFormat("/home/{usr}/log/{year}/{month}/{day}/job-{shard_id}.log")
        .parse(
            logFileName,
            (usr, year, month, day, shardId) ->
                LogFile.builder()
                    .setUser(usr)
                    .setDate(parseInt(year), parseInt(month), parseInt(day))
                    .setShard(shardId)
                    .build());
Yes, just trust your intuition: it does exactly what it looks like it does!
(Starting from v6.7, there is also a convenient parseOrThrow() method that throws if the input can't be parsed, with a reasonably informative error message.)
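For contrast, here is a rough sketch of similar parsing written directly against java.util.regex. It is not taken from the library: the class name RegexLogFileDemo and the summary string it returns are hypothetical, and the capture groups only approximate how the placeholders match. Note how the values come back by group number rather than by name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses only java.util.regex, not StringFormat.
final class RegexLogFileDemo {
  // One capture group per placeholder in
  // "/home/{usr}/log/{year}/{month}/{day}/job-{shard_id}.log".
  // Assumption: each placeholder matches one non-empty path segment.
  private static final Pattern LOG_FILE =
      Pattern.compile("/home/([^/]+)/log/([^/]+)/([^/]+)/([^/]+)/job-([^/]+)\\.log");

  /** Returns a short summary of the parsed path, or empty if it doesn't match. */
  static Optional<String> describe(String path) {
    Matcher m = LOG_FILE.matcher(path);
    if (!m.matches()) {
      return Optional.empty();
    }
    // Groups are positional: nothing ties group(3) to "month" except this code.
    String usr = m.group(1);
    int year = Integer.parseInt(m.group(2));
    int month = Integer.parseInt(m.group(3));
    int day = Integer.parseInt(m.group(4));
    String shard = m.group(5);
    return Optional.of(usr + " " + year + "-" + month + "-" + day + " shard " + shard);
  }

  public static void main(String[] args) {
    System.out.println(describe("/home/alice/log/2023/10/31/job-7.log"));
  }
}
```

The regex version needs escaping for the literal dot and manual bookkeeping of group indexes, which is the kind of noise the placeholder syntax avoids.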
Sometimes you need to search for a sub-pattern that may occur zero, one or multiple times in the input string. You can use the scan() method for these use cases. For example, if the input string contains multiple breakpoint specs:
List<Breakpoint> breakpoints =
    new StringFormat("breakpoint: {line={line}, color={color}}")
        .scan(inputString, Breakpoint::new)
        .collect(toList());
Both the parse() and scan() methods have overloads that support from 1 to 6 placeholders.
You can also post-filter to ignore matches that don't satisfy a post-condition. For example, if you want to ignore invalid breakpoint specs, just return null for the invalid matches:
List<Breakpoint> breakpoints =
    new StringFormat("breakpoint: {line={line}, color={color}}")
        .scan(
            inputString,
            (line, color) ->
                isNumeric(line) && isValidColor(color) ? new Breakpoint(line, color) : null)
        .collect(toList());
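To illustrate the "find every occurrence, drop the ones mapped to null" behavior described above, here is a small JDK-only sketch. It does not use StringFormat: the class RegexScanDemo, the input layout ("breakpoint: line=..., color=...") and the "line:color" result strings are all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; mimics scan-with-post-filter using java.util.regex.
final class RegexScanDemo {
  // Assumed input layout, invented for this sketch.
  private static final Pattern BREAKPOINT =
      Pattern.compile("breakpoint: line=(\\w+), color=(\\w+)");

  /** Collects every valid breakpoint found anywhere in the input. */
  static List<String> scan(String input) {
    List<String> found = new ArrayList<>();
    Matcher m = BREAKPOINT.matcher(input);
    while (m.find()) { // zero, one or many occurrences
      String breakpoint = toBreakpoint(m.group(1), m.group(2));
      if (breakpoint != null) { // null means "ignore this match"
        found.add(breakpoint);
      }
    }
    return found;
  }

  // Returns null for an invalid spec, mirroring the post-filter convention.
  private static String toBreakpoint(String line, String color) {
    return line.matches("\\d+") ? line + ":" + color : null;
  }

  public static void main(String[] args) {
    String input =
        "breakpoint: line=12, color=red; "
            + "breakpoint: line=abc, color=blue; "
            + "breakpoint: line=40, color=green";
    System.out.println(scan(input)); // the middle spec is dropped
  }
}
```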
If you use Bazel as your build tool, the compile-time checks are provided out of the box.
If you use Maven, we strongly recommend adding both ErrorProne and the mug-errorprone plugin to your annotationProcessor paths. For example:
<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <annotationProcessorPaths>
            <path>
              <groupId>com.google.errorprone</groupId>
              <artifactId>error_prone_core</artifactId>
              <version>2.23.0</version>
            </path>
            <path>
              <groupId>com.google.mug</groupId>
              <artifactId>mug-errorprone</artifactId>
              <version>6.7</version>
            </path>
          </annotationProcessorPaths>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
This plugin checks against common programming errors including:
- The number of lambda parameters doesn't match the number of format placeholders
- The names of the lambda parameters don't match the placeholders
With the compile-time checks, you can safely define StringFormat instances as private class constants and reference them many lines away.
The compile-time checks make the StringFormat.format() method a safer alternative to String.format() (it's faster too). For example:
private static final StringFormat JOB_ID_FORMAT = new StringFormat("{project_id}@{location}:{job}");
// 200 lines later
.setJobId(JOB_ID_FORMAT.format(projectId, location, job));
Compared to String.format(), the benefits are:
- The format string is more human readable with the placeholder names.
- The StringFormat can be defined as a class constant and safely reused across the file, because the compile-time check ensures that the format arguments match the placeholder names. You can't pass the wrong number of arguments or pass them in the wrong order!
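To make the risk with String.format() concrete, the following JDK-only sketch shows the two mistakes mentioned above. Plain javac accepts both of them. The class and method names (StringFormatPitfalls, swappedArgs, missingArgFailsAtRuntime) and the sample values are hypothetical.

```java
import java.util.MissingFormatArgumentException;

// Hypothetical demo class showing what plain String.format() lets through.
final class StringFormatPitfalls {
  // Compiles without complaint, but location and projectId are swapped,
  // so the resulting job id is silently wrong.
  static String swappedArgs(String projectId, String location, String job) {
    return String.format("%s@%s:%s", location, projectId, job);
  }

  // Also compiles, but the third argument is missing: the mistake only
  // surfaces as an exception when this line actually runs.
  static boolean missingArgFailsAtRuntime(String projectId, String location) {
    try {
      String.format("%s@%s:%s", projectId, location);
      return false;
    } catch (MissingFormatArgumentException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(swappedArgs("my-project", "us-east1", "job1"));
    System.out.println(missingArgFailsAtRuntime("my-project", "us-east1"));
  }
}
```

With "%s" specifiers there is nothing in the format string to indicate which argument belongs where, which is why named placeholders plus a compile-time check help.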
Combining the parsing and formatting capabilities, you can round-trip between POJOs and their string formats.
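The round-trip idea can be sketched without any dependency as follows. TinyTemplate is a toy stand-in written for this example, not the library's StringFormat: it has no compile-time checks, returns placeholder values as a plain list, and its matching rules (each placeholder matches as little as possible) are an assumption. JobId is a hypothetical POJO.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RoundTripDemo {
  // Toy stand-in for illustration only; NOT the real StringFormat.
  static final class TinyTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{[^{}]*\\}");
    private final List<String> literals = new ArrayList<>(); // text around placeholders
    private final Pattern parser;

    TinyTemplate(String template) {
      StringBuilder regex = new StringBuilder();
      Matcher m = PLACEHOLDER.matcher(template);
      int end = 0;
      while (m.find()) {
        String literal = template.substring(end, m.start());
        literals.add(literal);
        // Quote the literal text; each placeholder becomes a reluctant group.
        regex.append(Pattern.quote(literal)).append("(.*?)");
        end = m.end();
      }
      String tail = template.substring(end);
      literals.add(tail);
      regex.append(Pattern.quote(tail));
      parser = Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    String format(Object... args) {
      if (args.length != literals.size() - 1) {
        throw new IllegalArgumentException("wrong number of arguments");
      }
      StringBuilder out = new StringBuilder(literals.get(0));
      for (int i = 0; i < args.length; i++) {
        out.append(args[i]).append(literals.get(i + 1));
      }
      return out.toString();
    }

    Optional<List<String>> parse(String input) {
      Matcher m = parser.matcher(input);
      if (!m.matches()) {
        return Optional.empty();
      }
      List<String> values = new ArrayList<>();
      for (int i = 1; i <= m.groupCount(); i++) {
        values.add(m.group(i));
      }
      return Optional.of(values);
    }
  }

  // Hypothetical POJO whose string form is defined once and used both ways.
  static final class JobId {
    private static final TinyTemplate FORMAT =
        new TinyTemplate("{project_id}@{location}:{job}");
    final String projectId;
    final String location;
    final String job;

    JobId(String projectId, String location, String job) {
      this.projectId = projectId;
      this.location = location;
      this.job = job;
    }

    static Optional<JobId> parse(String s) {
      return FORMAT.parse(s).map(v -> new JobId(v.get(0), v.get(1), v.get(2)));
    }

    @Override public String toString() {
      return FORMAT.format(projectId, location, job);
    }
  }

  public static void main(String[] args) {
    String text = "my-project@us-east1:job1";
    // Parsing and then formatting should give back the original string.
    System.out.println(JobId.parse(text).get().toString().equals(text));
  }
}
```

Keeping a single format constant next to the POJO means parse() and toString() cannot drift apart, since both are derived from the same template.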