Skip to content

Commit

Permalink
Change Hash Code for Worksheet
Browse files Browse the repository at this point in the history
Fix PHPOffice#4192. Although that issue can be dealt with by changing user code, it would be better to fix it within PhpSpreadsheet. A cloned worksheet may have a pointer to a spreadsheet to which it is not attached. Code can assume it does belong to the spreadsheet, and throw an exception when the spreadsheet cannot find the worksheet in question. It may also not throw an exception when it should.

In my comments to the issue, I was concerned that adding in the needed protection would add overhead to an extremely common situation (setting a cell's value) in order to avoid a pretty rare problem. However, there are problems with both the accuracy and efficiency of the existing code, and I think any performance losses caused by the additional checks will be offset by the performance gains and accuracy of the new code.

Spreadsheet `getIndex` attempts to find the index of a worksheet within its spreadsheet collection. It does so by comparing the hash codes of each sheet in its collection with the hash code of the sheet it is looking for. Its major problem problem is performance-related, namely that it recomputes the hash code of the target sheet with each iteration.

A more severe problem is the accuracy of the hash code. It generates this by hashing together the sheet title, the string range of its auto-filter, and a character representation of whether sheet protection is enabled. Title should definitely be part of the calculation (it must be unique for all sheets attached to a spreadsheet), but it is not clear why this subset of the other properties of Worksheet is used. It tries to save some cycles by using a `dirty` property to indicate whether re-hashing is necessary. It sets that property whenever the title changes, or when `setProtection` is called. So, it doesn't set it when auto-filter changes, and you can easily bypass `setProtection` when changing any of the `Protection` properties. Not to mention the many other properties of worksheet that can be changed. Additionally, if you clone a worksheet, the clone and the original will have the same hash code, which can lead to problems:
```php
$clone = clone $original;
$spreadsheet->getSheet($spreadsheet->getIndex($clone))
    ->setCellValue('A1', 100);
```
That code will change the value of A1 in the original, not the clone.

The `hash` property in Worksheet will now be calculated immediately when the object is constructed or cloned or unserialized. It will not be recalculated, and there is no longer a need for the `dirty` property, which is removed. Hash will be generated by spl_object_id, which was designed for this purpose. (So was spl_object_hash, but many online references suggest that \_id performs much better than \_hash.) Our problem example above will now throw an Exception, as it should, rather than changing the wrong cell. `setValueExplicit`, the problem in the original issue, will now test that the worksheet is attached to the spreadsheet before doing any style manipulation. In order that this not be a breaking change, `getHashCode` will continue to return string, but it is deprecated in favor of `getHashInt`, and Worksheet will no longer implement IComparable to facilitate the deprecation.

I had a vague hope that this change might help with issue PHPOffice#641. It doesn't.
  • Loading branch information
oleibman committed Oct 31, 2024
1 parent 8799a04 commit 2f445a1
Show file tree
Hide file tree
Showing 6 changed files with 92 additions and 24 deletions.
2 changes: 1 addition & 1 deletion src/PhpSpreadsheet/Cell/Cell.php
Original file line number Diff line number Diff line change
Expand Up @@ -324,7 +324,7 @@ public function setValueExplicit(mixed $value, string $dataType = DataType::TYPE
self::updateIfCellIsTableHeader($this->getParent()?->getParent(), $this, $oldValue, $value);
$worksheet = $this->getWorksheet();
$spreadsheet = $worksheet->getParent();
if (isset($spreadsheet)) {
if (isset($spreadsheet) && $spreadsheet->getIndex($worksheet, true) >= 0) {
$originalSelected = $worksheet->getSelectedCells();
$activeSheetIndex = $spreadsheet->getActiveSheetIndex();
$style = $this->getStyle();
Expand Down
4 changes: 2 additions & 2 deletions src/PhpSpreadsheet/ReferenceHelper.php
Original file line number Diff line number Diff line change
Expand Up @@ -921,7 +921,7 @@ private function updateNamedRange(DefinedName $definedName, Worksheet $worksheet
{
$cellAddress = $definedName->getValue();
$asFormula = ($cellAddress[0] === '=');
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashCode() === $worksheet->getHashCode()) {
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashInt() === $worksheet->getHashInt()) {
/**
* If we delete the entire range that is referenced by a Named Range, MS Excel sets the value to #REF!
* PhpSpreadsheet still only does a basic adjustment, so the Named Range will still reference Cells.
Expand All @@ -940,7 +940,7 @@ private function updateNamedRange(DefinedName $definedName, Worksheet $worksheet

private function updateNamedFormula(DefinedName $definedName, Worksheet $worksheet, string $beforeCellAddress, int $numberOfColumns, int $numberOfRows): void
{
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashCode() === $worksheet->getHashCode()) {
if ($definedName->getWorksheet() !== null && $definedName->getWorksheet()->getHashInt() === $worksheet->getHashInt()) {
/**
* If we delete the entire range that is referenced by a Named Formula, MS Excel sets the value to #REF!
* PhpSpreadsheet still only does a basic adjustment, so the Named Formula will still reference Cells.
Expand Down
8 changes: 6 additions & 2 deletions src/PhpSpreadsheet/Spreadsheet.php
Original file line number Diff line number Diff line change
Expand Up @@ -660,13 +660,17 @@ public function getSheetByNameOrThrow(string $worksheetName): Worksheet
*
* @return int index
*/
public function getIndex(Worksheet $worksheet): int
public function getIndex(Worksheet $worksheet, bool $noThrow = false): int
{
$wsHash = $worksheet->getHashInt();
foreach ($this->workSheetCollection as $key => $value) {
if ($value->getHashCode() === $worksheet->getHashCode()) {
if ($value->getHashInt() === $wsHash) {
return $key;
}
}
if ($noThrow) {
return -1;
}

throw new Exception('Sheet does not exist.');
}
Expand Down
2 changes: 1 addition & 1 deletion src/PhpSpreadsheet/Worksheet/BaseDrawing.php
Original file line number Diff line number Diff line change
Expand Up @@ -414,7 +414,7 @@ public function getHashCode(): string
return md5(
$this->name
. $this->description
. (($this->worksheet === null) ? '' : $this->worksheet->getHashCode())
. (($this->worksheet === null) ? '' : (string) $this->worksheet->getHashInt())
. $this->coordinates
. $this->offsetX
. $this->offsetY
Expand Down
34 changes: 16 additions & 18 deletions src/PhpSpreadsheet/Worksheet/Worksheet.php
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,6 @@
use PhpOffice\PhpSpreadsheet\Comment;
use PhpOffice\PhpSpreadsheet\DefinedName;
use PhpOffice\PhpSpreadsheet\Exception;
use PhpOffice\PhpSpreadsheet\IComparable;
use PhpOffice\PhpSpreadsheet\ReferenceHelper;
use PhpOffice\PhpSpreadsheet\RichText\RichText;
use PhpOffice\PhpSpreadsheet\Shared;
Expand All @@ -32,7 +31,7 @@
use PhpOffice\PhpSpreadsheet\Style\Protection as StyleProtection;
use PhpOffice\PhpSpreadsheet\Style\Style;

class Worksheet implements IComparable
class Worksheet
{
// Break types
public const BREAK_NONE = 0;
Expand Down Expand Up @@ -305,15 +304,10 @@ class Worksheet implements IComparable
*/
private ?Color $tabColor = null;

/**
* Dirty flag.
*/
private bool $dirty = true;

/**
* Hash.
*/
private string $hash;
private int $hash;

/**
* CodeName.
Expand Down Expand Up @@ -355,6 +349,7 @@ public function __construct(?Spreadsheet $parent = null, string $title = 'Worksh
$this->autoFilter = new AutoFilter('', $this);
// Table collection
$this->tableCollection = new ArrayObject();
$this->hash = spl_object_id($this);
}

/**
Expand Down Expand Up @@ -383,6 +378,12 @@ public function __destruct()
unset($this->rowDimensions, $this->columnDimensions, $this->tableCollection, $this->drawingCollection, $this->chartCollection, $this->autoFilter);
}

public function __wakeup(): void
{
$this->hash = spl_object_id($this);
$this->parent = null;
}

/**
* Return the cell collection.
*/
Expand Down Expand Up @@ -897,7 +898,6 @@ public function setTitle(string $title, bool $updateFormulaCellReferences = true

// Set title
$this->title = $title;
$this->dirty = true;

if ($this->parent && $this->parent->getCalculationEngine()) {
// New title
Expand Down Expand Up @@ -1032,7 +1032,6 @@ public function getProtection(): Protection
public function setProtection(Protection $protection): static
{
$this->protection = $protection;
$this->dirty = true;

return $this;
}
Expand Down Expand Up @@ -3014,7 +3013,7 @@ private function validateNamedRange(string $definedName, bool $returnNullIfInval

if ($namedRange->getLocalOnly()) {
$worksheet = $namedRange->getWorksheet();
if ($worksheet === null || $this->getHashCode() !== $worksheet->getHashCode()) {
if ($worksheet === null || $this->hash !== $worksheet->getHashInt()) {
if ($returnNullIfInvalid) {
return null;
}
Expand Down Expand Up @@ -3154,17 +3153,15 @@ public function garbageCollect(): static
}

/**
* Get hash code.
*
* @return string Hash code
* @deprecated 3.5.0 use getHashInt instead.
*/
public function getHashCode(): string
{
if ($this->dirty) {
$this->hash = md5($this->title . $this->autoFilter . ($this->protection->isProtectionEnabled() ? 't' : 'f') . __CLASS__);
$this->dirty = false;
}
return (string) $this->hash;
}

public function getHashInt(): int
{
return $this->hash;
}

Expand Down Expand Up @@ -3493,6 +3490,7 @@ public function __clone()
}
}
}
$this->hash = spl_object_id($this);
}

/**
Expand Down
66 changes: 66 additions & 0 deletions tests/PhpSpreadsheetTests/Worksheet/CloneTest.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
<?php

declare(strict_types=1);

namespace PhpOffice\PhpSpreadsheetTests\Worksheet;

use PhpOffice\PhpSpreadsheet\Exception as SpreadsheetException;
use PhpOffice\PhpSpreadsheet\Spreadsheet;
use PhpOffice\PhpSpreadsheet\Worksheet\Worksheet;
use PHPUnit\Framework\TestCase;

class CloneTest extends TestCase
{
public function testUnattachedIndex(): void
{
$spreadsheet = new Spreadsheet();
$sheet1 = $spreadsheet->getActiveSheet();
$sheet1->getCell('A1')->setValue(10);
$sheet2 = clone $sheet1;
$sheet2->getCell('A1')->setValue(20);
self::assertSame(0, $spreadsheet->getIndex($sheet1));
$idx = $spreadsheet->getIndex($sheet2, true);
self::assertSame(-1, $idx);
$sheet2->setTitle('clone');
$spreadsheet->addSheet($sheet2);
$idx = $spreadsheet->getIndex($sheet2, true);
self::assertSame(1, $idx);
self::assertSame(10, $spreadsheet->getSheet(0)->getCell('A1')->getValue());
self::assertSame(20, $spreadsheet->getSheet(1)->getCell('A1')->getValue());
$spreadsheet->disconnectWorksheets();
}

public function testGetCloneIndex(): void
{
$this->expectException(SpreadsheetException::class);
$this->expectExceptionMessage('Sheet does not exist');
$spreadsheet = new Spreadsheet();
$sheet1 = $spreadsheet->getActiveSheet();
$sheet1->getCell('A1')->setValue(10);
$sheet2 = clone $sheet1;
$spreadsheet->getSheet($spreadsheet->getIndex($sheet2))
->setCellValue('A1', 100);
}

public function testSerialize1(): void
{
// If worksheet attached to spreadsheet, can't serialize it.
$this->expectException(SpreadsheetException::class);
$this->expectExceptionMessage('cannot be serialized');
$spreadsheet = new Spreadsheet();
$sheet1 = $spreadsheet->getActiveSheet();
serialize($sheet1);
}

#[\PHPUnit\Framework\Attributes\RunInSeparateProcess]
public function testSerialize2(): void
{
$sheet1 = new Worksheet();
$sheet1->getCell('A1')->setValue(10);
$serialized = serialize($sheet1);
/** @var Worksheet */
$newSheet = unserialize($serialized);
self::assertSame(10, $newSheet->getCell('A1')->getValue());
self::assertNotEquals($newSheet->getHashInt(), $sheet1->getHashInt());
}
}

0 comments on commit 2f445a1

Please sign in to comment.