Examine the distribution of Fonts compile times with fontmake and define "slow compile" #25
Comments
Thoughts about an appropriate environment to use for testing? |
@simoncozens What is your general sense about the consistency of build workflows across the Fonts catalog? Is there a way that we might automate the builds using a standardized build approach? |
Well, this is why we set up the googlefonts-project-template and specifically the gftools-builder, so that there would be a standard approach for upstreams. Here's a quick and dirty count of how many repositories are (probably) using that structure. |
I don't know that we have to nail down the environment; we just need to be sure to capture what it was when reporting any result. Clear instructions on how to test-build as much of Google Fonts as possible would be helpful. IIUC it's something like:
|
The new https://googlefonts.github.io/gf-guide is probably the best place to put docs about how to set up a clean machine. Then template repo can link there. @twardoch has a "font engineering toolbox" meta package or doc about all the kinds of things you want to have installed when jiving around here. Adam where's the best link for that? |
Thanks Dave, that looks like the missing link. Following how to build links from gf-guide led me to https://googlefonts.github.io/gf-guide/tools.md which curiously spews raw markdown at me, maybe it's malformed in a way that makes GH angry and tired? |
googlefonts-project-template is written in such a way (minimal dependencies, automatically installing any required Python modules into a virtual environment) that it shouldn't require any setup to work. But we should probably document that fact.
Fixed it. (Or rather, fixed the link, which should go to the rendered version instead.) |
One more thought: within the next few weeks I will be splitting the noto-source repository into per-project repositories. Each of these repos will have a standard build process, runnable both via CI and from a simple "make build" on the commandline. We will therefore have 158 standardised font projects, some large and some small, that we can use for benchmarking. |
Nice! How would one go about finding those Noto repos? Will they have upstream.yaml files in google/fonts or is some other mechanism required? |
I’ll ensure there’s a JSON feed somewhere off notofonts.github.io |
I spoke with Simon today. He plans to use the same set of build dependency versions across all Noto families and a consistent compile make target in each repository. Local execution of the compiles will be simple. I'll run the compile time tests on the Noto projects when the build workflow updates indicated in #25 (comment) are available. |
Does noto build with gfbuilder or some other mechanism? |
It will be building with an extension of gftools-builder; this is because Noto also has a range of artefacts (hinted and unhinted TTF, OTF and variable where possible, as well as builds that include subsets of Noto Sans/Serif). |
What does "an extension" look like? It's very useful to be able to check out based on upstream.yaml, find the config files [aside: wish they used a common name or location, when I tried grabbing upstream repos they did not seem to], and point the common builder at them. Could we enhance the common builder to support Noto so it's consistent? |
Upstream repos should use Of course there's nothing stopping you using |
Could we enhance the common builder to support Noto so it's consistent? If Noto needs many families, can it simply have many config files? I see other upstream repos doing that. For example, from a quick attempt to grab all the upstreams in google/fonts:
If I checkout all the repository_urls that work I end up with 293 directories. In those directories 121 yaml files that "look like" (have sources and familyName) gfbuilder configs exist.
EDIT: filed google/fonts#4772 requesting a curated list of interesting test cases. I'm out until mid-July for non-work reasons so I figure if we have that by end of July we're in good shape. Since we support Noto on fonts.google.com, if we want to include Noto compile scenarios I would ideally like to see them work consistently: follow upstream.yaml to the repo, then do something consistent to build them. If the build process truly must differ for Noto then maybe upstream.yaml needs to advise me how to build them...? |
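For anyone wanting to repeat the "looks like a gfbuilder config" count above, here is a minimal sketch. It assumes the upstream repos are already checked out under one root directory, and it uses a plain text match for top-level `sources:` and `familyName:` keys rather than a real YAML parse, so treat the result as a rough count. The function names are hypothetical.

```python
import re
from pathlib import Path

# Heuristic only: a file "looks like" a builder config when both keys
# appear at the start of a line. This is a text match, not a YAML parse.
SOURCES_KEY = re.compile(r"^sources\s*:", re.MULTILINE)
FAMILY_KEY = re.compile(r"^familyName\s*:", re.MULTILINE)


def looks_like_builder_config(text: str) -> bool:
    """Return True if the text has top-level `sources` and `familyName` keys."""
    return bool(SOURCES_KEY.search(text)) and bool(FAMILY_KEY.search(text))


def find_builder_configs(root: Path) -> list[Path]:
    """Walk `root` and collect YAML files that pass the heuristic."""
    matches = []
    for pattern in ("*.yaml", "*.yml"):
        for path in root.rglob(pattern):
            try:
                text = path.read_text(encoding="utf-8", errors="replace")
            except OSError:
                # Unreadable entry (or a directory with a YAML-like name).
                continue
            if looks_like_builder_config(text):
                matches.append(path)
    return sorted(matches)
```

Calling `len(find_builder_configs(Path("checkouts")))` on a directory of cloned upstreams (the directory name is just an example) gives a count comparable to the one quoted above, though not necessarily the same number.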
I don't think so. The two things Noto specifically needs are
The second one is obviously the bigger deal, and is very Noto specific. I don't see it as something that is at all useful for other GF library fonts, since we already require them to have a GF Latin core glyph set as part of their sources. (The Noto builder inherits from gftools.builder, so it's not completely doing its own thing.) (I will write up a doc explaining the thinking around the new Noto build/repo system soon. There is a design doc, but some of the ideas have developed over time. A lot of it is designed around the idea of how to make the tooling consistent and manageable when you have hundreds of repos; so for example, the point of using a separate
That's how it works; I thought we were talking about discovery of the build process and I was just saying that the config file should be named However, Noto does follow the consistent googlefonts-project-template approach of having a Makefile with a "make build" target, so you should be able to clone any googlefonts-project-template / noto-project-template repo and run I've added something to produce that JSON file now, too, so once we've pulled the switch, you can build all Noto with
|
Many of the fonts I build for GF have separate Latin sources (I almost never design Latin, so I use an existing Latin design and keep the sources separate for easy updates) and I know at least one other foundry that does the same, so such merge-sources-at-build-time should probably be generally available if it is going to be developed. |
Interesting, I had not imagined merge to be in scope for a fast compiler once the "make lots of statics and merge for VF" step was gone. @simoncozens noob question, where does it say to run
EDIT: I don't seem to see very many Makefile, |
Yeah, this is a different kind of merging. Rather than merging at the binary level, we're merging glyphs from some sources into the font at the UFO level, before the build happens. We do this because we need those glyphs to interact with layout. Take the case of Vedic marks - if you have a Sharada font with above and below anchors for your marks, you want to add the Vedic marks from the Devanagari font. But if you do that with, say, It's not something you would think of as "in scope for compilation", but it is something that needs to happen as part of the process of compiling Noto fonts.
The backstory is that there are kind of two levels of standardisation here. Once upon a time, upstreams could have whatever repo structure and whatever build scripts they wanted so long as they made fonts that worked. The first level of standardisation was to gather up all the build scripts and work out what they had in common; out of that we put together the gftools-builder as a replacement for ad hoc build scripts. The next level of standardisation was to provide a template repository structure so that designers could get their font projects up and running quickly (and with easy build steps and GitHub actions, could get on with designing fonts instead of messing about getting the builds working) and also that when onboarders came alongside the designers, they didn't have to spend time working out where everything was. So that's googlefonts-project-template. The Makefile is part of that repo structure, so any repos created out of So the reason you may be seeing |
(Argh, hit wrong button while editing) Possibly stupid question: Have we defined "compile"? I ask because the profile of different compilation jobs will look different. For example, in gftools-builder, we essentially run fontmake three times, to create variable font and static interpolated OTFs and static interpolated TTFs. There's an obvious speedup there in re-using the by-products (interpolated UFOs) of the OTF run to generate the TTFs. Creating variable fonts has fewer shared by-products compared with compiling statics, because compiling a variable font doesn't require UFO interpolation (which seems, just by gutfeel, to be a pretty slow step). I propose we test for two output profiles: just a variable TTF, and variable TTF + static OTF + static TTF. |
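The two proposed output profiles can be sketched as a small timing harness. This is an illustration under assumptions, not how gftools-builder actually drives fontmake: it assumes `fontmake` is on the PATH, that the source is a `.glyphs` or `.designspace` file, and that the flags are `-g`/`-m` for the source, `-o` for the output format, and `-i` to interpolate static instances (check `fontmake --help` for your version). The profile names and function names are hypothetical.

```python
import subprocess
import sys
import time
from pathlib import Path

# Source-type flags as assumed here; verify against `fontmake --help`.
SOURCE_FLAGS = {".glyphs": "-g", ".designspace": "-m"}


def fontmake_commands(source: str, profile: str) -> list[list[str]]:
    """Build the fontmake command lines for one output profile.

    "narrow" = variable TTF only; "fat" = variable TTF plus
    interpolated static OTFs and interpolated static TTFs.
    """
    suffix = Path(source).suffix
    if suffix not in SOURCE_FLAGS:
        raise ValueError(f"unsupported source type: {suffix!r}")
    base = ["fontmake", SOURCE_FLAGS[suffix], source]
    variable = base + ["-o", "variable"]
    if profile == "narrow":
        return [variable]
    if profile == "fat":
        return [variable, base + ["-o", "otf", "-i"], base + ["-o", "ttf", "-i"]]
    raise ValueError(f"unknown profile: {profile!r}")


def time_commands(commands: list[list[str]]) -> float:
    """Run the commands one after another; return wall-clock seconds."""
    start = time.perf_counter()
    for command in commands:
        subprocess.run(command, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Demo with a no-op command so this runs without fontmake installed.
    print(time_commands([[sys.executable, "-c", "pass"]]))
```

Running the three fat-profile commands separately, as here, does not reuse the interpolated UFOs between the OTF and TTF runs, so it measures the unoptimised case described above.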
In another issue I called those a narrow build and a fat build. Another interesting case is an incremental build. |
Not a stupid question at all. That is very much the goal of the "define slow compile" part of this thread. We want to understand what the distribution of Python pipeline project compile times is (projects yet to be defined; #24 is an attempt to collect information about experiences with "slow compiles" using fontmake) and then investigate why the tails of the distribution take the times that they do. My understanding from an earlier discussion was that we might begin with a broad, multi-script set of projects that include multiple compiled artifacts per project (Noto). This may be an interesting place to start because the question of why some projects compile "fast" and lie on the other end of the distribution might be informative too. |
Do we know if the full Noto catalog covers all of these cases? |
This isn't a factor of the font but a factor of the build process. Noto is currently built using a custom process which only does what we're calling fat builds, because we need a range of artefacts to suit the needs of Android, GF, Linux distros, etc. But I suggest for our data gathering we just use plain fontmake. If you run Now I come to think of it, I remember @madig has already put together a framework to do this data gathering - he had a script which downloaded and built a range of fonts using a variety of compile tools / versions / etc. https://daltonmaag.github.io/pipeline-perf-tracker/results/ |
For this issue I had intended to focus on times for specific fontmake invocations, as @simoncozens suggests above. |
Source inputs: full Noto project catalog? Or are you interested in a single distribution of compile times that includes both build processes on each set of project sources? |
Here's the Fat build disabled for now because of googlefonts/fontmake#912 Results coming in at e.g. https://github.com/simoncozens/time-font-compilation/actions/runs/2666743230 |
IIUC that basically says forget using upstream files or building a large set - something I had hoped would be possible due to upstream.yaml and standardized compilation - entirely and just pick a specific few? In effect it proposes an answer to google/fonts#4772? |
Fair enough; I was trying to make it work with CI where we only have six hours to get everything done. Of course if we run the tests locally, we can run for as long as we like. I'm out of the habit of thinking about building fonts locally. :-) |
OK, here are the build stats for all of Noto: Note that we don't really have hour-long monsters like Roboto Serif or whatever. Most are single-master files with fewer than a couple of hundred glyphs. |
Pretty convincingly linear except for the Urdu Nastaliq outlier. IIUC those times are compiles to VF format. Is the time relationship against glyphs * masters also linear for static instance compiles? And what is the time / (glyphs * masters) slope for static instance compiles relative to VF format compiles? (real time) / (glyphs * masters) seems like it could be a useful ratio for cross-project compile time performance optimization testing. |
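The (real time) / (glyphs * masters) ratio and the slope question can be sketched in a few lines. This assumes each timing record is a dict with `seconds`, `glyphs`, and `masters` keys; those field names are my assumption, not necessarily what the gist uses. Records missing a glyph or master count are dropped, since they cannot be placed on the x axis.

```python
def usable_rows(rows):
    """Keep only records that have non-zero glyph and master counts."""
    return [r for r in rows if r.get("glyphs") and r.get("masters")]


def seconds_per_glyph_master(row):
    """Compile time normalised by glyphs * masters for one record."""
    return row["seconds"] / (row["glyphs"] * row["masters"])


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matching x/y points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slope_for(rows):
    """Slope of seconds against glyphs * masters over the usable records."""
    rows = usable_rows(rows)
    xs = [r["glyphs"] * r["masters"] for r in rows]
    ys = [r["seconds"] for r in rows]
    return fit_line(xs, ys)[0]
```

Running `slope_for` once on the VF timings and once on the static-instance timings gives the two slopes to compare. A single extreme point can dominate a least-squares fit, so it may be worth fitting with and without the outlier.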
I'm trying to graph this data as well, but a bunch of the entries don't have "masters" or "glyphs". They are also clustered in your graph at zero. E.g. "kufi-arabic". |
Right, but that outlier means "All bets are off once your layout rules get very complicated." |
This is great to see! Forgive my density but how do I reproduce the timing collection? Is there a script or something to get https://gist.github.com/simoncozens/c896bd0fae2ae353b2fad63ca425dadd as measured on my machine? |
Here's my terrible build script: https://gist.github.com/simoncozens/173c9d35e28c1c6e43e58405d0c4695b |
It'll probably be linear in glyphs * instances, but I'll run a build. This is where things do get slow in Noto because we have a number of fonts with 18 instances or so. Interpolating instance UFOs takes a while. (Finishing off triangulate would lead to an obvious win here.) |
Here's the data and plot for static builds. https://gist.github.com/simoncozens/4b859f672e047b51da1f127ba250eae6
|
Time scale is minutes, correct? It takes over 11 hours to compile all Noto Sans LGC statics sequentially? |
Seconds. |
You are compiling ~50% of all Noto project static instances in under 2 sec? Or am I misunderstanding your summary stat table? |
More than half of Noto fonts are single-instances with <200 glyphs. So yes, we get through most of them in almost no time. Noto is possibly a skewed data set. :-) But the more interesting data is in the bigger fonts, and we have, I think, shown a linear relationship. |
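A quick way to check how skewed a set of timings is: compare the median with the maximum and count the share of builds under a threshold. This is a minimal sketch over a plain list of compile times in seconds; the function name and the 2-second default are only illustrative.

```python
import statistics


def summarise(times, threshold=2.0):
    """Median, maximum, and share of builds faster than `threshold` seconds."""
    if not times:
        raise ValueError("no timings given")
    under = sum(1 for t in times if t < threshold)
    return {
        "median": statistics.median(times),
        "max": max(times),
        "share_under_threshold": under / len(times),
    }
```

A median far below the maximum, together with a large share under the threshold, matches the picture described above: most builds are nearly free and the interesting cases sit in the tail.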
Define a test environment and analyze the distribution of Fonts catalog compile times with fontmake. We'll use the distribution to analyze outliers and improve our understanding of the concept of "slow". Slow might mean more font formats to compile, more outlines to compile, more complex layout to compile, more breadth of design space to compile, etc.
Related #24