Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Absolute cell references in formula not updated when inserting/removing rows and columns - Inconsistent with Excel #3368

Closed
1 of 8 tasks
fdjohnston opened this issue Feb 15, 2023 · 7 comments

Comments

@fdjohnston
Copy link
Contributor

This is:

- [x] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

When inserting rows or columns that alter the position of cells referenced absolutely in a formula, the formula is not updated by PHPSpreadsheet. In Excel, absolute references are updated.

What is the expected behavior?

Absolute references should be updated when rows/columns are inserted/removed.

What is the current behavior?

Absolute references are not updated when rows/columns are inserted/removed. Normal references are updated.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

$sheet1 = $spreadsheet->getActiveSheet();
$sheet1->getCell('A1')->setValue(0.24);
$sheet1->getCell('A2')->setValue(0.1);
$sheet1->getCell('A3')->setValue('=$A$1+$A$2');
$sheet1->insertNewRowBefore(1, 2);

var_dump($sheet1->getCell('A5')->getValue());

To align with the behaviour in Excel, the value of A5 should be =$A$3+$A$4.

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Issue appears to be in ReferenceHelper. The updateFormulaReferences function includes an optional param for including absolute references when updating references, however it doesn't appear to ever get set to true.

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

I expect so since this is occurring without writing or opening a file.

Which versions of PhpSpreadsheet and PHP are affected?

1.27.0

@MarkBaker
Copy link
Member

The whole point of absolute references is that they don't change on insert/delete of rows/columns; and that has been my understanding for over 20 years. PhpExcel, and then PhpSpreadsheet has always tried to replicate this behaviour.
Now you're telling me that MS Excel does change absolute references.

Sorry; you've just broken me

@fdjohnston
Copy link
Contributor Author

Sorry, @MarkBaker !

My understanding was that absolute references would remain the same when used in formulas that are dragged or copied to other cells, unlike relative references which will get updated. If absolute references don't update when the structure of the worksheet changes, any addition of columns or rows would instantly break all absolute references, would it not?

I don't want to call your last 20 years of understanding into question (lol), but I just tested Libre Office and Open Office and they both behave the same way.

@MarkBaker
Copy link
Member

I don't want to call your last 20 years of understanding into question (lol), but I just tested Libre Office and Open Office and they both behave the same way.

It's the craziness of believing something for 20 years and then suddenly learning that I've been wrong all that time.
PhpSpreadsheet has extra code to ensure that Absolute References aren't changed, has unit tests to verify that Absolute References remain unchanged; because that was what I believed.

I just suddenly feel so freaking stupid!

Now I need to figure out how to resolve this... it's simple enough to do, but will be a bc break... and while there's been a bc-breaking v2.0 that was almost ready for release (with full handling for Excel's array formulae, among other features) which I can piggy-back for the bc break, that 2.0 version is in major code conflict with the current 1.x code

@fdjohnston
Copy link
Contributor Author

Never a nice feeling; though please don't be too hard on yourself (just saw your Tweets about this issue)! I suppose one of the wonderfully frustrating things about being a developer is that (much like these references) absolute certainty sometimes turns out not to be so absolute 😃

Understanding that 2.0 is coming soon, what does the support roadmap look like for 1.x? Is this something you think is worth fixing in both versions?
I'd be happy to try to help on either version, though I have more experience with 1.x.

@MarkBaker
Copy link
Member

Never a nice feeling; though please don't be too hard on yourself (just saw your Tweets about this issue)! I suppose one of the wonderfully frustrating things about being a developer is that (much like these references) absolute certainty sometimes turns out not to be so absolute 😃

There is a big difference between discovering edge cases, such as how certain Excel functions handle a string literal differently to a string from a cell, and something as fundamental as absolute references though; to the point of having code in place that explicitly handled that belief.

Understanding that 2.0 is coming soon, what does the support roadmap look like for 1.x? Is this something you think is worth fixing in both versions? I'd be happy to try to help on either version, though I have more experience with 1.x.

2.0 isn't fundamentally different, but it does have certain new features (such as the array function handling with the @ implicit intersection operator, spillage, and the # spill operator) that required a bc break (to Value Binders, and I know it will affect developers that use custom Binders). But because that is a bc break, I've taken the opportunity to rationalise a few other idiosyncracies, like removing the historic _ from method/property names, and rationalising case in method names.

I'm already 8 months late with my original sel-imposed deadline; and unfortunately, it's now a long way out of synch with the changes and fixes that have been happening on the 1.x version, that a merge results in nearly 500 files worth of merge conflicts... it will probably take a couple of full working weeks just to resynch it with those changes from 1.x

The roadmap was to maintain new features only in 2.0, and bugfixes in the two branches in parallel until take-up of 2.0 has reached a high enough threshold (>75%) to drop 1.x
I'd estimated that to be about 4-6 months, based on packagist version download figures.

PhpSpreadsheet Version Download Statistics

Maintaining 2 branches would be a lot of extra work; but IMO worth it for the array function handling alone.

Although I am still worried by the number of downloads even in the last month of versions of PhpSpreadsheet going back to the original 1.0 release, and by the number of users downloading PhpSpreadsheet 1.27 against unsupported PHP versions

PhpSpreadsheet PHP Version Download Statistics
(That 0.3% still equates to 300 downloads)

@MarkBaker
Copy link
Member

The same applies to Absolute Cell ranges/references in Named Ranges and Named Formulae

@MarkBaker
Copy link
Member

I still need to investigate all the other references that are updated by the ReferenceHelper (page breaks, groups, autofilter, freeze pane, etc) but PR #3402 should fix the update for all formula calculations (Cell formulae, Named Formulae and Named Ranges)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

2 participants