
Evaluate using Profile-Guided Optimization (PGO) and LLVM BOLT #1227

Open
zamazan4ik opened this issue Oct 27, 2023 · 1 comment

Comments

@zamazan4ik

Hi!

Recently I checked Profile-Guided Optimization (PGO) improvements on many projects; all current results are available here. According to multiple tests, PGO can improve performance in many cases (including libraries like pydantic-core). Trying to optimize the Tensorflow Text library could be beneficial, since it could reduce the CPU time spent on routines like text preprocessing.

I can suggest the following action points:

  • Perform PGO benchmarks on Tensorflow Text. If they show improvements, add a note to the documentation about the possible performance gains from building Tensorflow Text with PGO.
  • Provide an easier way (e.g. a build option) to build with PGO. This would help end users and maintainers optimize Tensorflow Text for their own workloads if they decide to rebuild it.
  • Optimize the pre-built binaries (if it's possible to prepare or collect a good-enough training workload).

Since the native part of Tensorflow Text is a C++ library, I think the pydantic-core experience can be reused here; Clang also supports PGO for shared libraries. In this case it should be possible to prepare some text-preprocessing routines, run them to collect PGO profiles, and then use those profiles as training data.
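To make the idea concrete, here is a minimal sketch of the instrumentation-based PGO workflow with Clang. It assumes a plain Clang build of the native library (a Bazel build would pass the same flags via --copt/--linkopt); the file name `tokenizer.cc` and the workload driver `run_tokenization_benchmark` are hypothetical placeholders.

```shell
# 1. Build the shared library with PGO instrumentation.
clang++ -O2 -fPIC -shared -fprofile-generate=./pgo-profiles \
    tokenizer.cc -o libtokenizer.so

# 2. Run a representative text-preprocessing workload so the
#    instrumented library writes .profraw files into ./pgo-profiles.
./run_tokenization_benchmark

# 3. Merge the raw profiles into a single profile file.
llvm-profdata merge -output=tokenizer.profdata ./pgo-profiles/*.profraw

# 4. Rebuild the library, letting Clang optimize hot paths
#    according to the collected profile.
clang++ -O2 -fPIC -shared -fprofile-use=tokenizer.profdata \
    tokenizer.cc -o libtokenizer.so
```

The key design point is step 2: the quality of the final binary depends entirely on how representative the training workload is of real user workloads.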

Maybe testing Post-Link Optimization techniques like LLVM BOLT would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.
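For reference, a typical BOLT flow looks roughly like the sketch below. It assumes a Linux host with `perf` and an LBR-capable CPU for sampling; as above, `run_tokenization_benchmark` is a hypothetical workload driver.

```shell
# Link with relocations preserved so BOLT can rearrange the binary.
clang++ -O2 -fPIC -shared -Wl,--emit-relocs \
    tokenizer.cc -o libtokenizer.so

# Sample branch data while running the workload.
perf record -e cycles:u -j any,u -o perf.data -- ./run_tokenization_benchmark

# Convert the perf data into BOLT's profile format, then
# produce an optimized binary with reordered code layout.
perf2bolt -p perf.data -o tokenizer.fdata libtokenizer.so
llvm-bolt libtokenizer.so -o libtokenizer.bolt.so -data=tokenizer.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort -split-functions
```

BOLT is complementary to PGO: PGO guides the compiler's inlining and branch decisions, while BOLT fixes instruction-cache layout after linking, which is why projects like Clang apply both.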

Here are some examples of how PGO optimization is integrated in other projects:

Many of the examples above are applications rather than libraries, but that shouldn't make a big difference: PGO works well with libraries too.

@cantonios
Collaborator

Text tokenization is not likely to be a bottleneck in any real-world model.
