Use the new f-string tokens in string formatting #7586

Merged
merged 1 commit into from
Sep 22, 2023
91 changes: 90 additions & 1 deletion crates/ruff_python_formatter/src/expression/string.rs
@@ -138,6 +138,14 @@ impl<'a> FormatString<'a> {

impl<'a> Format<PyFormatContext<'_>> for FormatString<'a> {
    fn fmt(&self, f: &mut PyFormatter) -> FormatResult<()> {
        // TODO(dhruvmanila): With PEP 701, comments can be inside f-strings.
        // This is to mark all of those comments as formatted but we need to
        // figure out how to handle them.
        if matches!(self.string, AnyString::FString(_)) {
            f.context()
                .comments()
                .mark_verbatim_node_comments_formatted(self.string.into());
        }
        let locator = f.context().locator();
        match self.layout {
            StringLayout::Default => {
@@ -166,6 +174,60 @@ impl<'a> Format<PyFormatContext<'_>> for FormatString<'a> {
}
}

/// A builder for the f-string range.
///
/// For now, this is limited to the outermost f-string and doesn't support
/// nested f-strings.
#[derive(Debug, Default)]
struct FStringRangeBuilder {
    start_location: TextSize,
    end_location: TextSize,
    nesting: u32,
}

impl FStringRangeBuilder {
    fn visit_token(&mut self, token: &Tok, range: TextRange) {
        match token {
            Tok::FStringStart => {
                if self.nesting == 0 {
                    self.start_location = range.start();
                }
                self.nesting += 1;
            }
            Tok::FStringEnd => {
                // We can assume that this will never overflow because we know
                // that the program once parsed to a valid AST which means that
                // the start and end tokens for f-strings are balanced.
                self.nesting -= 1;
                if self.nesting == 0 {
                    self.end_location = range.end();
                }
            }
            _ => {}
        }
    }

    /// Returns `true` if the lexer is currently inside an f-string.
    ///
    /// It'll return `false` once the `FStringEnd` token for the outermost
    /// f-string is visited.
    const fn in_fstring(&self) -> bool {
        self.nesting > 0
    }

    /// Returns the complete range of the previously visited f-string.
    ///
    /// This method should only be called once the lexer is outside of any
    /// f-string, otherwise it might return an invalid range.
    ///
    /// It doesn't consume the builder because there can be multiple f-strings
    /// throughout the source code.
    fn finish(&self) -> TextRange {
        debug_assert!(!self.in_fstring());
        TextRange::new(self.start_location, self.end_location)
    }
}

struct FormatStringContinuation<'a> {
string: &'a AnyString<'a>,
}
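
The nesting counter in `FStringRangeBuilder` can be tried outside the formatter. The following is a minimal sketch, not the code from this PR: the `Tok` enum, the `RangeBuilder` struct, and the `outermost_fstring_range` helper are hypothetical stand-ins, and `std::ops::Range<u32>` replaces `TextRange` so that the example compiles on its own.

```rust
use std::ops::Range;

/// Hypothetical, simplified stand-in for the lexer's token type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    FStringStart,
    FStringMiddle,
    FStringEnd,
    String,
    Other,
}

/// Tracks the range of the outermost f-string with a nesting counter
/// (assumed to mirror the builder in the diff above).
#[derive(Debug, Default)]
struct RangeBuilder {
    start: u32,
    end: u32,
    nesting: u32,
}

impl RangeBuilder {
    fn visit_token(&mut self, token: Tok, range: &Range<u32>) {
        match token {
            Tok::FStringStart => {
                // Only the outermost start token sets the start offset.
                if self.nesting == 0 {
                    self.start = range.start;
                }
                self.nesting += 1;
            }
            Tok::FStringEnd => {
                // Assumes balanced start/end tokens, as the original comment does.
                self.nesting -= 1;
                // Only the outermost end token sets the end offset.
                if self.nesting == 0 {
                    self.end = range.end;
                }
            }
            _ => {}
        }
    }

    fn in_fstring(&self) -> bool {
        self.nesting > 0
    }

    fn finish(&self) -> Range<u32> {
        debug_assert!(!self.in_fstring());
        self.start..self.end
    }
}

/// Feeds every token through the builder and returns the range of the
/// last completed outermost f-string.
fn outermost_fstring_range(tokens: &[(Tok, Range<u32>)]) -> Range<u32> {
    let mut builder = RangeBuilder::default();
    for (tok, range) in tokens {
        builder.visit_token(*tok, range);
    }
    builder.finish()
}
```

For a token stream modelling `f"a {f'b'} c"`, the inner `FStringStart`/`FStringEnd` pair only moves the counter from 1 to 2 and back, so the returned range covers the whole outer literal.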
@@ -195,6 +257,10 @@ impl Format<PyFormatContext<'_>> for FormatStringContinuation<'_> {
        // because this is a black preview style.
        let lexer = lex_starts_at(string_content, Mode::Expression, string_range.start());

        // The lexer emits multiple tokens for a single f-string literal. Each token
        // will have its own range but we require the complete range of the f-string.
        let mut fstring_range_builder = FStringRangeBuilder::default();

        let mut joiner = f.join_with(in_parentheses_only_soft_line_break_or_space());

        for token in lexer {
@@ -226,8 +292,31 @@ impl Format<PyFormatContext<'_>> for FormatStringContinuation<'_> {
                }
            };

            fstring_range_builder.visit_token(&token, token_range);

            // We need to ignore all the tokens within the f-string as there can
            // be `String` tokens inside it as well. For example,
            //
            // ```python
            // f"foo {'bar'} foo"
            // #      ^^^^^
            // # Ignore any logic for this `String` token
            // ```
            //
            // Here, we're interested in the complete f-string, not the individual
            // tokens inside it.
            if fstring_range_builder.in_fstring() {
                continue;
            }

            match token {
-               Tok::String { .. } => {
+               Tok::String { .. } | Tok::FStringEnd => {
                    let token_range = if token.is_f_string_end() {
                        fstring_range_builder.finish()
                    } else {
                        token_range
                    };

                    // ```python
                    // (
                    // "a"
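
The loop change in the last hunk skips every token inside an f-string and then handles the closing `FStringEnd` like a plain `String` token, using the full f-string range. A rough sketch of that control flow follows, assuming a hypothetical simplified `Tok` enum and a hypothetical `part_ranges` function that collects one range per part of an implicit concatenation; the real formatter writes each part out instead of collecting ranges.

```rust
use std::ops::Range;

/// Hypothetical, simplified stand-in for the lexer's token type.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    FStringStart,
    FStringMiddle,
    FStringEnd,
    String,
    Other,
}

/// Returns one range per part of an implicitly concatenated string,
/// treating each outermost f-string as a single part.
fn part_ranges(tokens: &[(Tok, Range<u32>)]) -> Vec<Range<u32>> {
    let mut parts = Vec::new();
    let mut depth = 0u32;
    let mut fstring_start = 0u32;

    for (tok, range) in tokens {
        // Update the nesting state first, as `visit_token` does in the diff.
        match tok {
            Tok::FStringStart => {
                if depth == 0 {
                    fstring_start = range.start;
                }
                depth += 1;
            }
            // Assumes balanced tokens, so this never goes below zero.
            Tok::FStringEnd => depth -= 1,
            _ => {}
        }

        // Still inside an f-string: ignore the token, including any
        // `String` token that belongs to a replacement field.
        if depth > 0 {
            continue;
        }

        match tok {
            // A plain string literal is a part on its own.
            Tok::String => parts.push(range.clone()),
            // The outermost end token closes a part spanning the whole f-string.
            Tok::FStringEnd => parts.push(fstring_start..range.end),
            _ => {}
        }
    }
    parts
}
```

With tokens modelling `"a" f"b {'c'} d" "e"`, the inner `'c'` is skipped because the depth is still 1 when it is visited, so the result has three parts rather than four.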