Moving formatting into a delegate #1913

hgrecco · 2024-01-04T01:54:33Z

The purpose of this issue is to have a central place to discuss everything related to formatting in 0.24

In the next version of Pint, all formatting operations are going to be moved into a Delegate. As expressed in #1895:

String formatting is becoming more and more complex as users request different (reasonable) means to represent quantities (both magnitude and units). Right now, formatting is intertwined with the registry. Adding more configuration settings increases the registry API surface and makes it harder to maintain and upgrade. The plan is to create a Formatter Delegate. All formatting settings will work directly on the delegate (for a yet unspecified number of versions, we will map this to existing registry functions for backward compatibility). Also, users will be able to build their own formatter and replace the existing one.

Part of this work has been merged into master, and the rest is being tested in develop.

A few important things:

Pint version 0.24 will be backwards compatible in relation to formatting.
Old expected behavior should not change.
New behavior might be introduced. e.g. Implement sorting units by dimension order #1864
Clear bugs can be fixed. e.g. format_babel doesn't use comma seperator from babel locale #1849, Format doesn't respect width #1798
Existing functions/properties have been rewritten to point to the new API (and marked deprecated).

Please add other issues, or PRs

As of 9e0789c

All tests are passing (except doc building).
Docs are not building. (maybe due to 3)
The separate_format_defaults is not implemented.

Please provide your insights or comments about the API and overall organization.

hgrecco · 2024-01-04T02:28:25Z

@keewis I would really like your feedback on this, particularly into the separate_format_defaults.

keewis · 2024-01-09T21:37:56Z

I think I need to understand your plans a bit better to be able to comment on them. In particular, I'd like to understand how you imagine the delegates to be used (the implementation I can slowly try to understand once that's done), and how this impacts the use of the register_unit_format decorator. (As an aside, I like to think that starting with the way it is used helps a lot with API design).

I'd also recommend thinking about the different ways to customize the format string now: this would make it much easier to make room for features like the "custom modifiers" I had a PR for (which unfortunately is stale now).

The separate_format_defaults was a way to gracefully transition from one way of interpreting the spec to a different one, and as with any deprecation it would be possible to remove it at some point. So I wouldn't worry about that for now, we can put it back later if we think it would break too many people's code.

hgrecco · 2024-01-09T22:22:13Z

My idea is to start pushing composition over inheritance as much as I can in Pint. (I also would like to discuss the idea of adopting protocols, but that is another discussion!)

For the end users, there would not be much of a change. In the next versions, no change at all. Later, certain configurations are going to be within ureg.formatter (i.e. ureg.set_fmt_locale -> ureg.formatter.set_locale).

For advanced users, it would allow them to swap the formatter for anything that implements format_quantity and format_unit (and maybe format_measurement and format_magnitude). See here. This allows them to implement the desired format in an isolated way.

class MyFormatter:
    """Here goes the code
    """ 

ureg.formatter = MyFormatter()

For pint developers, it allows them to better organize the code by having all formatting things in a single place. Also, it could simplify testing because in certain tests we could replace the more complex formatter with something simpler.

We can make this work with register_unit_format by hooking this decorator with the default Formatter object.

hgrecco · 2024-01-09T22:24:55Z

By the way @MichaelTiemannOSC was also trying to implement certain formatting improvements that I think should be easier to implement with this delegate.

keewis · 2024-01-10T15:00:33Z

If I understand correctly, then, you're planning to move (and refactor) the code that is currently in the registry to the default formatter object. With the effect that instead of hard-coding the format parsing and the call of the appropriate formatting function, you'd even be able to customize those?

hgrecco · 2024-01-10T15:55:48Z

If I understand correctly, then, you're planning to move (and refactor) the code that is currently in the registry to the default formatter object.

Exactly, this is partially done in develop. The Formatting facet is gone, and the formatter delegate is in place.

If you desire a specific output format, you can simply swap the formatter object for a custom one without creating a new formatting string.

If you want to create a new spec, you still can.

hgrecco · 2024-01-15T05:25:18Z

I was testing a few ideas. A few comments:

The idea of the formatter is quite nice. It allows you to do a lot of things without hacking pint source code. Example:

import pint

from pint.delegates.formatter.base_formatter import BaseFormatter

class NewFormatter(BaseFormatter):

    def format_quantity(self, quantity, spec: str = "") -> str:
        return "Magnitude: {}\nUnits: {}".format(quantity.m, quantity.u)

ureg = pint.UnitRegistry()
ureg.formatter = NewFormatter()

print(str(3 * ureg.meter))

returns

Magnitude: 3
Units: meter

Some units formatters modify not only the units but also the magnitudes. For example, pretty printing will change * by ×.
format_magnitude is a nice addition.

MichaelTiemannOSC · 2024-01-15T07:44:19Z

The failure of the docs build is not due to #1864. I think it is due to things changing in Sphinx that break various pins within pint.

The changes to allow sorting by dimension order can easily be overridden in child classes as well (dim_order now being just an instance variable of BaseFormatter).

hgrecco · 2024-01-18T04:57:04Z

This is just to show some of the progress made with the new formatter. The new organization not only is cleaner, but some things are fixed automatically. For example:

Old version

 	 en_US 	 3.14 meters²
 	 es_ES 	 3.14 metros²
C 	 en_US 	 3.14 meters²
C 	 es_ES 	 3.14 metros²
P 	 en_US 	 3.14 meters²
P 	 es_ES 	 3.14 metros²
L 	 en_US 	 3.14 \mathrm{meter}^{2}
L 	 es_ES 	 3.14 \mathrm{meter}^{2}
Lx 	 en_US 	 3.14 \si[]{\meter\squared}
Lx 	 es_ES 	 3.14 \si[]{\meter\squared}

New version

 	 en_US 	 3.14 meters ** 2
 	 es_ES 	 3,14 metros ** 2
C 	 en_US 	 3.14 meters**2
C 	 es_ES 	 3,14 metros**2
P 	 en_US 	 3.14 meters²
P 	 es_ES 	 3,14 metros²
L 	 en_US 	 3.14\ \mathrm{meters}^{2}
L 	 es_ES 	 3,14\ \mathrm{metros}^{2}
Lx 	 en_US 	 3.14\ \si[]{\meters\squared}
Lx 	 es_ES 	 3,14\ \si[]{\metros\squared}

Notice how L and Lx now work properly in relation with unit names (they were not handled well before).
Notice also how all numbers have the right decimal separator!
Another thing I noticed after trying different architectures is that locale awareness cannot be just an addition. If you have babel and non-babel classes, you end up with too many of them. That is why babel-related keywords are now in all methods. Formatters are still very easy to build; the example above will work. babel will still be optional, but locale keywords will be needed. Without babel, number formatting will be affected by locale but not the unit names.

Take a look at develop ... (tests are failing but it is shaping up nicely)

hgrecco · 2024-01-19T06:19:04Z

Fewer and fewer tests are failing, but now there is a decision to be made. While I have been trying to make everything backwards compatible, it is becoming hard to keep the same string representation for numbers. As I mentioned before, I am fixing the localized version (that is a change, but a valuable one). The previous behavior was broken.

But there is one more: number of digits.

Are we ok with changing how numbers are formatted?

Until now, numbers were formatted into strings sometimes with str(value) except when a spec was provided (or a numpy array was being formatted). But this leads to unexpected differences in the way numbers are shown. Also, this is not localizable. The way to make numbers localizable with format is using the n type.

But:

>>> str(24.0)
'24.0'
>>> format(24.0, "n")
'24'
>>> str(1.234567890)
'1.23456789'
>>> format(1.234567890, "n")
'1.23457'

In other words, using the n format changes the way magnitudes (and quantities) are printed. We could create a function to handle all the cases, but it is a pain in the neck. Also, using n requires changing the locale, which is not thread safe AFAIK.
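As a rough illustration of what such a function might look like, here is a minimal sketch that keeps the str() representation and only swaps the decimal separator. The name format_number_keep_repr and the decimal_sep parameter are hypothetical and not part of Pint; the separator is passed in explicitly instead of being read from the process locale, so nothing global is changed. It deliberately ignores grouping separators and other locale rules.

```python
def format_number_keep_repr(value, decimal_sep="."):
    """Return str(value) with the decimal point replaced by decimal_sep.

    Hypothetical helper, not Pint API. It does not touch the process
    locale, so it has no global side effects.
    """
    text = str(value)
    if decimal_sep == ".":
        return text
    # str() of int/float contains at most one ".", so a plain replace is enough.
    return text.replace(".", decimal_sep)


print(format_number_keep_repr(24.0, ","))
print(format_number_keep_repr(1.234567890, ","))
```

Unlike format(value, "n"), this keeps the trailing .0 and the full number of digits, which is the behavior difference shown in the session above.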

Another way would be to use babel format_decimal. One very nice thing about this is that it does not require changing the locale to make it work and is therefore fully thread safe. But AFAIK it uses the Locale Data Markup Language specification and not Python's Format Specification Mini-Language. Users will need to learn yet another thing.

Options:

1. Use format with n as the default.
   🟢 works for all python numerical types out of the box
   🟢 localizable
   🟢 builtin, less code to maintain
   🔴 changes current representation
   🔴 not thread safe
   🟢 does not require learning a new mini language
2. Use format with distinct defaults for float and int
   🟡 works for all numerical python types, with distinct code paths
   🟢 localizable
   🔴 requires building a specific function
   🟡 some representations will work, others not (corner cases)
   🔴 not thread safe
   🟢 does not require learning a new mini language
3. Use babel's format_decimal
   🟢 works for all numerical python types out of the box
   🟢 localizable
   🟢 in babel, less code to maintain
   🟡 maybe changes current representation
   🟢 thread safe
   🔴 requires learning a new mini language

Another option would be to explore translating the Python mini language to babel (is something available?), or to use a package that implements a localizable format with arguments, if one is available.

PS: @MichaelTiemannOSC I have a great way to sort units in mind. Simple, easy to implement, and it will be available in all formatters out of the box.

MichaelTiemannOSC · 2024-01-19T07:13:18Z

I read through your questions and your recent commits. I don't have the knowledge myself to answer them, but perhaps a release candidate with the option you think is cleanest (#3?) can get enough user feedback to provide more data.

I look forward to seeing the new sorting idea you have in mind!

dalito · 2024-01-19T08:29:57Z

I am slightly in favor of option 3 but I haven't used babel localization much. So there may be implications that I don't foresee.

hgrecco · 2024-01-20T06:10:02Z

@dalito I think it might be possible to generate UCUM notation! Let's stabilize the API and see.
@MichaelTiemannOSC good idea. I think we should release an alpha or beta version and invite people to try it.

hgrecco · 2024-01-20T06:15:44Z

🎊 All tests are passing

Disclaimer: I had to change a few tests

Proper localization of number 24.1 in localized fr_FR is 24,1
es_AR was changed to the more common es_ES
In the empty format vatios por centímetros should be vatios / centímetros
everything related to default_format and missing specific unit format was labeled as xfail until the behavior is clear

There is still work to do:

docstring
documentation
clean up and organize code
implement sorting
solve/decide number related issue (see above)

I will merge this into main tomorrow and continue working.

hgrecco · 2024-01-20T20:35:30Z

Performance

I was a little worried about the performance hit. But, it actually got faster for most cases.

Formatting 3.4 * ureg.meter ** 2 / ureg.second * ureg.kg

locale - fmt	develop / 0.23
None - D	0.96
None - D~	0.83
None - P	0.99
None - P~	0.84
None - C	0.99
None - C~	0.84
None - H	0.91
None - H~	0.86
None - L	1.01
None - L~	0.93
None - Lx	1.08
None - Lx~	1.09

Comparing a different locale or numeric modifiers is not fair because 0.23 was not complying with these codes for all cases.

hgrecco · 2024-01-22T17:33:16Z

I started to update the documentation: https://pint.readthedocs.io/en/develop/user/formatting.html

hgrecco · 2024-01-22T18:02:53Z

Is there a NIST rule for pluralization of compound units?

1. 2 meter / second
2. 2 meters / second
3. 2 meters / seconds

I think the correct way is 2, but I was not able to find this in the NIST handbook.

hgrecco · 2024-01-22T18:29:01Z

@MichaelTiemannOSC you will notice in develop that the formatter function now has a sort_func argument. This takes any function from Iterable[tuple[str, Number]] to Iterable[tuple[str, Number]].

We need to find a good way to expose this. One option is to have an order or sorting or similar configuration flag in the formatter.

ureg.formatter.order = None
ureg.formatter.order = "alpha"
ureg.formatter.order = "ISO80000"
ureg.formatter.order = sort
ureg.formatter.order = some_function_that_I_made
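For illustration, a function matching the Iterable[tuple[str, Number]] to Iterable[tuple[str, Number]] shape described above could be as small as the following. This is a minimal sketch: the name sort_alphabetically is made up here, and how such a function would actually be wired into the formatter was still open at this point in the discussion.

```python
from collections.abc import Iterable
from numbers import Number


def sort_alphabetically(
    items: Iterable[tuple[str, Number]],
) -> list[tuple[str, Number]]:
    """Sort (unit name, exponent) pairs by unit name.

    Hypothetical example of a sort function; not part of Pint.
    """
    return sorted(items, key=lambda item: item[0])


units = [("second", -1), ("meter", 2), ("kilogram", 1)]
print(sort_alphabetically(units))
# [('kilogram', 1), ('meter', 2), ('second', -1)]
```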

MichaelTiemannOSC · 2024-01-22T18:34:13Z

My reading of this section (https://www.nist.gov/pml/special-publication-811/nist-guide-si-chapter-9-rules-and-style-conventions-spelling-unit-names#97) says:

Unit names have different pluralization rules (which follow English grammar) as compared to unit symbols (which are not pluralized, since ms should be read as the symbol for milliseconds not the plural of meters).
When it comes to pluralizing names, the idiom I've learned is that we only pluralize the last unit name (so Newton meters not Newtons meter nor Newton meter) except in the denominator, where we never pluralize (because denominators are presumed to be singular). Thus Newton meters per second not Newton meters per seconds.
If the numerator is clearly singular (1 meter / second), then we don't say 1 meters / second. But if not (X meters / second), we might say X meters / second => 1 meter/second when X==1.

hgrecco · 2024-01-22T18:41:20Z

How about this: If the magnitude is larger than 1, then pluralize the last unit in the numerator.

MichaelTiemannOSC · 2024-01-22T18:42:42Z

How about this: If the magnitude is larger than 1, then pluralize the last unit in the numerator.

Not quite:

Half a meter per second vs. 0.5 meters per second
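The rule that seems to emerge from this exchange (pluralize the last unit of the numerator unless the magnitude is exactly 1, never the denominator) can be sketched as follows. This is only an illustration of the English rule under discussion: the function name and the plural_forms argument are hypothetical, and appending "s" is a naive fallback.

```python
def pluralize_last_numerator_unit(magnitude, numerator, plural_forms=None):
    """Return the numerator unit names with only the last one pluralized.

    Hypothetical helper, not Pint API. The plural is used whenever the
    magnitude is not exactly 1, so 0.5 gives "meters" as well.
    """
    plural_forms = plural_forms or {}
    if magnitude == 1 or not numerator:
        return list(numerator)
    *head, last = numerator
    # Fall back to a naive "s" suffix when no explicit plural is known.
    return [*head, plural_forms.get(last, last + "s")]


print(pluralize_last_numerator_unit(0.5, ["meter"]))            # ['meters']
print(pluralize_last_numerator_unit(1, ["meter"]))              # ['meter']
print(pluralize_last_numerator_unit(3, ["newton", "meter"]))    # ['newton', 'meters']
```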

hgrecco · 2024-01-22T18:52:57Z

You are right.

More questions:

1. Do we always pluralize or just when localizing?
2. How is / or per selected? By format type (e.g. pretty print uses per but default uses /) or if localization occurs?
3. Does pluralization work the same in 3 meter / second as in 3 meter second^-1?

MichaelTiemannOSC · 2024-01-22T19:14:42Z

You are right.

More questions:

1. Do we always pluralize or just when localizing?

2. How is / or per selected? By format type (e.g. pretty print uses per but default uses /) or if localization occurs?

3. Does pluralization work the same in 3 meter / second as in 3 meter second^-1?

My advice on (1) is to let the respective open source localization experts answer for their locale. I understand that it becomes a political question to ask "when localizing?" because in some sense, any standard must be read within the context of some locale. But I think we'll get feedback fairly quickly on when and how pluralization rules apply in the default case vs. various locales. As long as we have the pluralization infrastructure we can implement what's needed.

On (2), NIST gives this rule for / vs. per: https://www.nist.gov/pml/special-publication-811/nist-guide-si-chapter-9-rules-and-style-conventions-spelling-unit-names#98

On (3), NIST says that both are degenerate. Reflecting on my own idiomatic interpretation, I would say that mathematical formulation creates an entity syntactically distinct from the quantity. That syntactic entity, which is mathematical in nature (meter / second or meter second^-1), does not take plural form. But in the case where the mathematical form is identical to natural english (e.g., meter by itself), then we treat that as part of a spoken rather than formulaic construction, and we say 3 meters rather than 3 meter. When we use per instead of /, it's a spoken expression, not a mathematical one, so English idioms apply: 3 meters per second.

dalito · 2024-01-22T19:18:10Z

I would also say "just when localizing". German has interestingly no plural for units.

hgrecco · 2024-01-22T19:39:33Z

In relation to 1, could be something like this

When locale is None: English number formatting and canonical unit names
When locale is not None: Locale number formatting and translated unit names, pluralizing when appropriate (TBD)

>>> import pint
>>> ureg = pint.UnitRegistry()
>>> q = 2.3 * ureg.meter / ureg.second
>>> q
<Quantity(2.3, 'meter / second')>
>>> ureg.formatter.set_locale(None)
>>> str(q)
'2.3 meter / second'
>>> ureg.formatter.set_locale("en_US")
>>> str(q)
'2.3 meters / second'
>>> ureg.formatter.set_locale("de_DE")
>>> str(q)
'2,3 Meter / Sekunden'

MichaelTiemannOSC · 2024-01-22T20:15:52Z

"Pluralizing where appropriate" meaning '2.3 meters per second' in en_US.

dalito · 2024-01-22T20:25:30Z

The correct German version would be '2,3 Meter / Sekunde' (without plural-n at the end).

hgrecco · 2024-01-22T20:34:53Z

The correct German version would be '2,3 Meter / Sekunde' (without plural-n at the end).

Sorry about that!

hgrecco · 2024-01-22T20:35:19Z

"Pluralizing where appropriate" meaning '2.3 meters per second' in en_US.

This is what I am not sure about. I am pretty sure that when locale is None, / is the right thing, i.e. I think that locale is None is like the definition file: English number formatting, canonical unit names and standard math ops.

But do we make per dependent on the formatting code, or always true when locale is not None?

For example, it could be that

>>> import pint
>>> ureg = pint.UnitRegistry()
>>> q = 2.3 * ureg.meter / ureg.second
>>> q
<Quantity(2.3, 'meter / second')>
>>> ureg.formatter.set_locale("en_US")
>>> f"{q:D}"  # default
'2.3 meters / second'
>>> f"{q:P}"  # pretty
'2.3 meters per second'

dalito · 2024-01-22T20:35:23Z

But thinking about it, some units are pluralized, time units for example when in the numerator, e.g. 1 Sekunde but 2 Sekunden. Hopefully babel knows the rules. I don't know them, just the language.

hgrecco · 2024-01-22T20:36:23Z

(sorry for editing your comments, I pressed the wrong icon!)

hgrecco · 2024-01-22T20:38:00Z

But thinking about it, some units are pluralized, time units for example when in the numerator, e.g. 1 Sekunde but 2 Sekunden. Hopefully babel knows the rules.

I looked at babel and it kind of knows, but not quite. It also does not use Python's string formatting mini language. I actually took part of its code and adapted it. But I can look more into it.

dalito · 2024-01-22T20:43:23Z

Don't worry! Getting German right should not hold up this PR.

hgrecco · 2024-01-22T20:53:51Z

But thinking about it, some units are pluralized, time units for example when in the numerator, e.g. 1 Sekunde but 2 Sekunden. Hopefully babel knows the rules.

I looked at babel and it kind of knows, but not quite. It also does not use Python's string formatting mini language. I actually took part of its code and adapted it. But I can look more into it.

In simple terms:

Works really well for single units
Works well for some compound units (unit A / unit B)
Works very badly for all other compound units

That is why what I have done is pick the single unit translator and use it to build a compound translation.

hgrecco · 2024-01-22T22:41:22Z

@MichaelTiemannOSC you will notice in develop that the formatter function now has a sort_func argument. This takes any function from Iterable[tuple[str, Number]] to Iterable[tuple[str, Number]].

We need to find a good way to expose this. One option is to have an order or sorting or similar configuration flag in the formatter.
ureg.formatter.order = None
ureg.formatter.order = "alpha"
ureg.formatter.order = "ISO80000"
ureg.formatter.order = sort
ureg.formatter.order = some_function_that_I_made

I just realized that this will not work. The high-level API is OK (we need to bikeshed naming, etc.), but internally it won't work. Right now, Pint does the following things:

1. If "~" is present, generate the short form version of the units
2. If locale is present, translate units (includes pluralization)
3. If sorting is requested, sort
4. Build the unit string

This is fine, as long as pluralization works on all units or no units. But if it has to work on only one unit, determined by its position (for example, the last unit of the numerator if there are multiple units), then this will not work and localization will need to be done in two steps:

1. If "~" is present, generate the short form version of the units
2. If locale is present, translate units (includes pluralization)
3. If sorting is requested, sort
4. If locale is present, pluralize according to rule.
5. Build the unit string
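To make the ordering concrete, here is a toy sketch of the two-step localization just described. Everything in it is an assumption for illustration: the lookup tables, the format_units name, and its parameters are made up, and in this sketch the translation step does not pluralize, so that the positional plural can be applied after sorting. It is not how Pint implements format_compound_unit.

```python
# Toy lookup tables (assumptions; Pint would use the registry and babel).
SYMBOLS = {"meter": "m", "newton": "N", "second": "s"}
TRANSLATIONS = {"es_ES": {"meter": "metro", "second": "segundo"}}
PLURALS = {"es_ES": {"metro": "metros", "newton": "newtons", "segundo": "segundos"}}


def format_units(units, magnitude, short=False, locale=None, sort_func=None):
    """Format (name, exponent) pairs following the five steps in order."""
    units = list(units)

    # 1. If "~" was requested, switch to the short form.
    if short:
        units = [(SYMBOLS.get(name, name), exp) for name, exp in units]

    # 2. If a locale is present, translate each unit name (singular here).
    if locale is not None:
        table = TRANSLATIONS.get(locale, {})
        units = [(table.get(name, name), exp) for name, exp in units]

    # 3. If sorting is requested, sort.
    if sort_func is not None:
        units = list(sort_func(units))

    # 4. Pluralize by position: only the last numerator unit, and only
    #    when the magnitude is not exactly 1.
    if locale is not None and magnitude != 1:
        plurals = PLURALS.get(locale, {})
        numerator_positions = [i for i, (_, exp) in enumerate(units) if exp > 0]
        if numerator_positions:
            last = numerator_positions[-1]
            name, exp = units[last]
            units[last] = (plurals.get(name, name), exp)

    # 5. Build the unit string.
    num = " * ".join(n if e == 1 else f"{n} ** {e}" for n, e in units if e > 0)
    den = " * ".join(n if e == -1 else f"{n} ** {-e}" for n, e in units if e < 0)
    return f"{num or '1'} / {den}" if den else num


by_name = lambda pairs: sorted(pairs, key=lambda pair: pair[0])
print(format_units([("second", -1), ("newton", 1), ("meter", 1)], 2,
                   locale="es_ES", sort_func=by_name))
# metro * newtons / segundo
```

Because the plural is applied after sorting, it always lands on whichever unit ends up last in the numerator, which is the reason for splitting localization into two steps.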

hgrecco · 2024-01-22T23:10:56Z

All tests are passing (including doc building and doctest). I am merging this into master (even if we still need to decide about certain things!)

MichaelTiemannOSC · 2024-01-22T23:30:44Z

That's great news. I have once again confused myself and/or my own fork of Pint due to my limited understanding of Git. I will try to sort that out and get a fresh PR against the latest master. Git is my friend 99% of the time, and then 1% of the time it just stabs me in the back. Again, and again, and again.

andrewgsavage · 2024-01-22T23:50:08Z

came across this module, which may be worth supporting for formatting in future: https://sciform.readthedocs.io/en/stable/

hgrecco · 2024-01-23T00:44:05Z

That's great news. I have once again confused myself and/or my own fork of Pint due to my limited understanding of Git. I will try to sort that out and get a fresh PR against the latest master. Git is my friend 99% of the time, and then 1% of the time it just stabs me in the back. Again, and again, and again.

Been there. I feel your pain!

MichaelTiemannOSC · 2024-01-23T03:05:14Z

I'm looking to unify my #1841 test case with some of the new patterns I see in the latest pint test suite:

def test_issues_1841(subtests):
    from pint import UnitRegistry
    from pint.util import UnitsContainer

    ur = UnitRegistry()

    # Each case: unit to format, format spec, expected string.
    for x, spec, result in (
        (ur.Unit(UnitsContainer(hour=1, watt=1)), "P~", "W·h"),
        (ur.Unit(UnitsContainer(ampere=1, volt=1)), "P~", "V·A"),
        (ur.Unit(UnitsContainer(meter=1, newton=1)), "P~", "N·m"),
    ):
        with subtests.test(spec):
            ur.default_format = spec
            assert f"{x}" == result, f"Failed for {spec}, {result}"

To make this work, we cannot pass sort_func from the point of use ("{x}"), but we could set ur.sort_func.

Recapping: all the GenericSPECIALRegistries inherit from GenericPlainRegistry, which sets self.formatter = pint.delegates.Formatter(). And that starts life as FullFormatter, which has the aforementioned default_format instance variable. FullFormatter manages the delegation to the various SUBformatters based on spec. And the SUBformatters access default_format not via inheritance, but by climbing back up through the registry to registry.formatter.default_format.

Since all comparable units have to belong to the same registry, we could define a similar instance variable default_sort_func that the various SUBformatters consistently use to properly output compound units. Otherwise, I see forcing users to do a lot of work to enable/disable dimensional sorting. I'm going to take a crack at that approach, but feel free to interject if you think I'm missing something.

hgrecco · 2024-01-23T19:06:43Z

In my opinion, Formatters only need to have certain functions: format_X, with X being magnitude, quantity, etc. Being modeled after format, each should have at least two arguments in addition to self: the actual object and the corresponding spec string.

If somebody writes:

class SecretFormatter:

    def format_quantity(self, quantity, spec):
        return "**********"

should work. This makes it very easy to replace the formatter for custom cases, prototype ideas, etc.

In my opinion, everything else is an extra and should not be required by the user, nor Pint core (although it should be provided by default Pint Formatter, see below)

This is not the case now, due to backwards compatibility issues (i.e. to be able to do ureg.default_format = at the registry level, you need to make sure that the default formatter has a default_format attribute. The same with locale. And also due to the format_babel methods.)

I was also not able to come up with a simple architecture that:

is fast
is easy to read
and keeps the signature as simple as this: format_quantity(self, quantity, spec)

The Formatter shipped by Pint by default should be very close to the current one (FullFormatter):

there is a way to specify a default format (done)
there is a way to specify a default locale and babel length (done)
there is a way to specify a default sorting order (in your PR)

Not sure if the SubFormatters (which are Formatters themselves :-D) should have this as well. Now they do not have a locale or format or babel_length, they get this through arguments. Should we add one more (sort_func)?

MichaelTiemannOSC · 2024-01-23T19:13:13Z

As you can see from the PR, quantity (and unit) have pathways to registries, which encodes tremendous power and flexibility. I needed to do some magic to make formatter work with arguments stripped of these powers (because all the arguments are reduced to strings and formatting strings). Feel free to comment on the lines of code within the PR about what makes you happy and what makes you sad.

hgrecco · 2024-01-23T19:35:25Z

@MichaelTiemannOSC The PR looks great. Indeed, certain features require access to the registry (the same is true for "~" spec which requires a short form of the unit). I think this is unavoidable. I was thinking to have a function that works as a guard and can be reused.

def get_required_registry(obj):
    # Guard for features that need a registry (e.g. the "~" spec).
    try:
        return obj._REGISTRY
    except Exception:
        raise ValueError("<a nicely written error msg>")

I thought about this a lot. Maybe we should make the formatter only available to registry-bound objects ... but in certain cases you want to format an unbound object and that is fine.

MichaelTiemannOSC · 2024-01-23T19:53:18Z

If the formatter took an optional registry parameter that would make this PR much simpler. I'll work that up to show you.

hgrecco · 2024-01-23T19:55:26Z

If the formatter took an optional registry parameter that would make this PR much simpler. I'll work that up to show you.

The problem with doing it at the registry level is that then every Formatter needs to implement this option. That is why I was promoting something along the lines of

ureg.formatter.sort_func = XXX

MichaelTiemannOSC · 2024-01-23T19:58:20Z

If the formatter took an optional registry parameter that would make this PR much simpler. I'll work that up to show you.

The problem with doing it at the registry level is that then every Formatter needs to implement this option. That is why I was promoting something along the lines of

ureg.formatter.sort_func = XXX

OK...so related to that, since all units must come from the same registry, we can read the registry from the first item that has a meaningful registry (dimensionless might not...and it might not matter if it's the only "dimension"). I'll see if there's something I can still simplify around that idea.

hgrecco · 2024-01-23T20:03:13Z

You are right, if we sort there we have no registry. Take a look at format_compound_unit (pint/pint/delegates/formatter/_format_helpers.py, line 236 in 3cc2d36). Maybe we could figure out together a better way to achieve this, remembering that we must do:

1. If "~" is present, generate the short form version of the units
2. If locale is present, translate units (includes pluralization)
3. If sorting is requested, sort
4. If locale is present, pluralize according to rule.
5. Build the unit string

I decided to split it but maybe it was the wrong call and we should put 1 to 4 together.

keewis · 2024-01-23T20:24:11Z

I'd probably combine 2 and 4 together (i.e. apply locale), which would be mutually exclusive to 1. With that we can first sort (unless you need to sort alphabetically by the unit instead of its dimensions?), format the individual units and finally put together the full string:

1. sort if requested
2. format units
   either: construct short form version of the units
   or: translate units and pluralize according to the locale (might need to have knowledge about the full unit expression)
3. build the unit string

MichaelTiemannOSC · 2024-01-23T20:47:49Z

I've merged sorting into format_compound_unit after translation (so that sorting by alpha works in the target language). I think it really belongs there: #1926.

What I don't understand is why doc build is now failing...shouldn't there be an easy way for me to see that I should merge from upstream if I'm now out-of-date? Anyways...comments welcome.

hgrecco · 2024-03-11T03:42:24Z

I just pushed to develop a new version of the formatter delegate. The main advantage is that now pluralization of localized units works as expected in an easy-to-understand workflow.

It also incorporates the ability to sort units. The current sorting options are: None, by show name (i.e. localized), by unit name (non localized) and by dimensions.

Not sure how this should be handled from the user side. Options are:

1. defined by the formatting code "D", "C", etc ...
2. configurable
   2.a. configurable via formatter attribute
   2.b. extra argument to (some) Formatter function + default 2a
   2.c. new formatting codes (e.g. $n, $s, $u, $d) + default 2a

I still need to work to speed up the code.

hgrecco · 2024-03-14T03:39:34Z

What are your thoughts on this @MichaelTiemannOSC @keewis @dalito @andrewgsavage ?

andrewgsavage · 2024-04-09T21:17:16Z

It needs to be distinct from the formatting code as you may wish for a different order to the sorted version.

My preference is for a configurable default option that can be overridden when needed (i.e. when units don't come out in the order you want). So your option 2c.

I think it is worth adding a 'reversed' option too; that would be the most obvious way to reorder 2 units when they're not in order. Maybe this could be a modifier?
Could also do something for 3 units, eg for units a, b, c the user would have an ordering they want. so have a character (a-e below) in the formatting string to get to the desired order:

a -> b, c, a
b -> c, a, b
c -> b, a, c
d -> c, b, a
e -> a, c, b
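A minimal sketch of how such single-character codes could map to orderings of three units, following the table above. The ORDER_CODES name and the reorder function are hypothetical; nothing like this exists in Pint.

```python
# Each code maps to the indices of the original (a, b, c) order to emit.
ORDER_CODES = {
    "a": (1, 2, 0),  # b, c, a
    "b": (2, 0, 1),  # c, a, b
    "c": (1, 0, 2),  # b, a, c
    "d": (2, 1, 0),  # c, b, a
    "e": (0, 2, 1),  # a, c, b
}


def reorder(units, code):
    """Reorder exactly three units according to a one-letter code."""
    if len(units) != 3:
        raise ValueError("this sketch only handles exactly three units")
    return [units[i] for i in ORDER_CODES[code]]


print(reorder(["a", "b", "c"], "d"))  # ['c', 'b', 'a']
```

The five codes cover every ordering of three units other than the original one, so no sixth code is needed.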

andrewgsavage · 2024-04-09T21:24:48Z

another option could be to have a list of common 'unit pairs' that should be kept together, eg VA, Wh

maybe it would need to be dimensionality pairs to also catch Ws

atravert · 2024-07-23T17:31:57Z

Hello,

Can you let me know whether it is still possible to change the behaviour of default formatters in pint 0.24 ?

in previous versions we did so like this:

from pint import formatting

del formatting._FORMATTERS["P"]

@formatting.register_unit_format("P")
def format_pretty(unit, registry, **options):
    return formatting.formatter(
        unit.items(),
        as_ratio=False,   # default is True
        single_denominator=False,
        product_fmt=".",
        division_fmt="/",
        power_fmt="{}{}",
        parentheses_fmt="({})",
        exp_call=formatting._pretty_fmt_exponent,
    )

in pint 0.24 this does not work anymore. I can still apparently remove the default formatter :
del formatting.REGISTERED_FORMATTERS["P"]

but nothing changes whatever custom format I put in register_unit_format("P")...

keewis · 2024-07-28T06:54:03Z

that's because the default formatter delegate, FullFormatter, has its own registry of formatters (pint/pint/delegates/formatter/full.py, lines 72 to 79 in 67303b8):

    self._formatters = {}
    self._formatters["raw"] = RawFormatter(registry)
    self._formatters["D"] = DefaultFormatter(registry)
    self._formatters["H"] = HTMLFormatter(registry)
    self._formatters["P"] = PrettyFormatter(registry)
    self._formatters["Lx"] = SIunitxFormatter(registry)
    self._formatters["L"] = LatexFormatter(registry)
    self._formatters["C"] = CompactFormatter(registry)

To modify existing behavior, you'd have to remove that particular formatter, as well:

import pint
from pint import formatting

del formatting.REGISTERED_FORMATTERS["P"]

@pint.register_unit_format("P")
def format_pretty(unit, registry, **options):
    return formatting.formatter(
        unit.items(),
        as_ratio=False,   # default is True
        single_denominator=False,
        product_fmt=".",
        division_fmt="/",
        power_fmt="{}{}",
        parentheses_fmt="({})",
        exp_call=formatting._pretty_fmt_exponent,
    )

ureg = pint.UnitRegistry()
del ureg.formatter._formatters["P"]
q = ureg.Quantity(1, "m")
f"{q:~P}"

Not sure if this is a supported use case, though.
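
As an illustration of why both deletions are needed, here is a simplified toy model of the lookup described above. It is not Pint's actual code: `REGISTERED`, `register` and `ToyFormatter` are hypothetical stand-ins, and the lookup order (instance table first, module-level registry second) is an assumption based on the behaviour observed in this thread.

```python
# Module-level registry, standing in for formatting.REGISTERED_FORMATTERS.
REGISTERED = {}


def register(code):
    """Toy counterpart of register_unit_format: store func under a code."""
    def decorator(func):
        REGISTERED[code] = func
        return func
    return decorator


class ToyFormatter:
    """Toy delegate with its own per-instance table of built-in formatters."""

    def __init__(self):
        self._formatters = {"P": lambda unit: f"builtin-pretty({unit})"}

    def format_unit(self, unit, code):
        # Built-ins stored on the instance are consulted first ...
        if code in self._formatters:
            return self._formatters[code](unit)
        # ... and only then the module-level registry.
        if code in REGISTERED:
            return REGISTERED[code](unit)
        raise KeyError(code)


@register("P")
def custom_pretty(unit):
    return f"custom-pretty({unit})"


fmt = ToyFormatter()
shadowed = fmt.format_unit("m", "P")    # the built-in still wins
del fmt._formatters["P"]
unshadowed = fmt.format_unit("m", "P")  # now the registered function is used
```

In this model the registered function is only reachable once the instance-level entry is gone, which matches the need to delete `ureg.formatter._formatters["P"]` in the snippet above.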

atravert · 2024-07-30T21:36:53Z

Thank you very much for your reply @keewis. It works, but it is indeed a dirty hack, and it causes problems later in my case (because ureg.formatter._formatters["P"] is permanently deleted).

So, as suggested in the documentation, I have subclassed the default formatters as follows:

from pint import UnitRegistry
from pint.delegates.formatter.plain import PrettyFormatter
from pint.delegates.formatter.full import FullFormatter
from pint.delegates.formatter._compound_unit_helpers import prepare_compount_unit
from pint.delegates.formatter._format_helpers import formatter, pretty_fmt_exponent



class MyPrettyFormatter(PrettyFormatter):

    def format_unit(self, unit, uspec, sort_func, **babel_kwds) -> str:
        numerator, denominator = prepare_compount_unit(
            unit,
            uspec,
            sort_func=sort_func,
            **babel_kwds,
            registry=self._registry,
        )

        return formatter(
            numerator,
            denominator,
            as_ratio=False,
            single_denominator=False,
            product_fmt=" ",
            division_fmt="/",
            power_fmt="{}{}",
            parentheses_fmt=r"({})",
            exp_call=pretty_fmt_exponent
        )


class MyFullFormatter(FullFormatter):

    default_format: str = "~P"

    def __init__(self, registry: UnitRegistry | None = None):
        super().__init__(registry)

        self._formatters = {}
        self._formatters["P"] = MyPrettyFormatter(registry)


ur = UnitRegistry()
ur.formatter = MyFullFormatter(ur)
ur.formatter._registry = ur


q = 1 * ur.meter ** 2 / ur.second

assert f"{q}" == "1.0 m² s⁻¹"
assert f"{q: ~D}" == ' 1.0 m ** 2 / s'

and this solved my problem. Thank you again for your fast reply.

dalito mentioned this issue Jan 20, 2024: Add pint-to-ucum conversion dalito/ucumvert#11 (Draft)

andrewgsavage mentioned this issue Jan 21, 2024: package maintanence lmfit/uncertainties#180 (Open)

hgrecco mentioned this issue Jan 23, 2024: custom formatters: pass unit modifiers to the formatter #1448 (Open, 4 tasks)

dalito mentioned this issue Apr 28, 2024: add serialization hampusnasstrom/ontopint#2 (Open)
