diff --git a/CHANGELOG.md b/CHANGELOG.md index 69d4c9ef03..f03be98331 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,11 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org). ### Added -- Nothing yet. +- Add Dynamic valueBinder Property to Spreadsheet and Readers. [Issue #1395](https://github.com/PHPOffice/PhpSpreadsheet/issues/1395) [PR #4185](https://github.com/PHPOffice/PhpSpreadsheet/pull/4185) +- Allow Omitting Chart Border. [Issue #562](https://github.com/PHPOffice/PhpSpreadsheet/issues/562) [PR #4188](https://github.com/PHPOffice/PhpSpreadsheet/pull/4188) ### Changed -- Nothing yet. +- Refactor Xls Reader. [PR #4118](https://github.com/PHPOffice/PhpSpreadsheet/pull/4118) ### Deprecated @@ -31,6 +32,8 @@ and this project adheres to [Semantic Versioning](https://semver.org). - SUMIFS Does Not Require xlfn. [Issue #4182](https://github.com/PHPOffice/PhpSpreadsheet/issues/4182) [PR #4186](https://github.com/PHPOffice/PhpSpreadsheet/pull/4186) - Image Transparency/Opacity with Html Reader Changes. [Discussion #4117](https://github.com/PHPOffice/PhpSpreadsheet/discussions/4117) [PR #4142](https://github.com/PHPOffice/PhpSpreadsheet/pull/4142) - Option to Write Hyperlink Rather Than Label to Csv. [Issue #1412](https://github.com/PHPOffice/PhpSpreadsheet/issues/1412) [PR #4151](https://github.com/PHPOffice/PhpSpreadsheet/pull/4151) +- Invalid Html Due to Cached Filesize. [Issue #1107](https://github.com/PHPOffice/PhpSpreadsheet/issues/1107) [PR #4184](https://github.com/PHPOffice/PhpSpreadsheet/pull/4184) +- Excel 2003 Allows Html Entities. 
[Issue #2157](https://github.com/PHPOffice/PhpSpreadsheet/issues/2157) [PR #4187](https://github.com/PHPOffice/PhpSpreadsheet/pull/4187) ## 2024-09-29 - 3.3.0 (no 3.0.\*, 3.1.\*, 3.2.\*) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index e89e99ec5a..67aa525c60 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -7,7 +7,7 @@ If you would like to contribute, here are some notes and guidelines: - The code must work with all PHP versions that we support. - You can call `composer versions` to test version compatibility. - Code style should be maintained. - - `composer style` will identify any issues with Coding Style`. + - `composer style` will identify any issues with Coding Style. - `composer fix` will fix most issues with Coding Style. - All code changes must be validated by `composer check`. - Please include Unit Tests to verify that a bug exists, and that this PR fixes it. diff --git a/docs/topics/Behind the Mask.md b/docs/topics/Behind the Mask.md index 9648f52dfa..ba77fcc4e9 100644 --- a/docs/topics/Behind the Mask.md +++ b/docs/topics/Behind the Mask.md @@ -117,7 +117,10 @@ If you wish to emulate the MS Excel behaviour, and automatically convert string You can do this by changing the Value Binder, which will then apply every time you set a Cell value. 
```php +// Old method using static property Cell::setValueBinder(new AdvancedValueBinder()); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet->setValueBinder(new AdvancedValueBinder()); // Set Cell C21 using a formatted string value $worksheet->getCell('C20')->setValue('€ -12345.6789'); diff --git a/docs/topics/The Dating Game.md b/docs/topics/The Dating Game.md index 75726614d2..5b0c2812ff 100644 --- a/docs/topics/The Dating Game.md +++ b/docs/topics/The Dating Game.md @@ -257,7 +257,10 @@ $spreadsheet = new Spreadsheet(); $worksheet = $spreadsheet->getActiveSheet(); // Use the Advanced Value Binder so that our string date/time values will be automatically converted // to Excel serialized date/timestamps +// Old method using static property Cell::setValueBinder(new AdvancedValueBinder()); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet->setValueBinder(new AdvancedValueBinder()); // Write our data to the worksheet $worksheet->fromArray($projectHeading); diff --git a/docs/topics/accessing-cells.md b/docs/topics/accessing-cells.md index d377da899d..a9fe3695de 100644 --- a/docs/topics/accessing-cells.md +++ b/docs/topics/accessing-cells.md @@ -518,15 +518,15 @@ style information. The following example demonstrates how to set the value binder in PhpSpreadsheet: ```php -/** PhpSpreadsheet */ -require_once 'src/Boostrap.php'; - -// Set value binder +// Older method using static property \PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); - // Create new Spreadsheet object $spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet(); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet(); +$spreadsheet->setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); + // ... 
// Add some data, resembling some different data types $spreadsheet->getActiveSheet()->setCellValue('A4', 'Percentage value:'); @@ -555,13 +555,20 @@ $stringValueBinder->setNumericConversion(false) ->setBooleanConversion(false) ->setNullConversion(false) ->setFormulaConversion(false); +// Older method using static property \PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( $stringValueBinder ); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet(); +$spreadsheet->setValueBinder( $stringValueBinder ); ``` You can override the current binder when setting individual cell values by specifying a different Binder to use in the Cell's `setValue()` or the Worksheet's `setCellValue()` methods. ```php $spreadsheet = new Spreadsheet(); +// Old method using static property Cell::setValueBinder(new AdvancedValueBinder()); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet->setValueBinder(new AdvancedValueBinder()); $value = '12.5%'; diff --git a/docs/topics/reading-files.md b/docs/topics/reading-files.md index 8f74978d34..4aba019ff7 100644 --- a/docs/topics/reading-files.md +++ b/docs/topics/reading-files.md @@ -755,14 +755,18 @@ So using a Value Binder allows a great deal more flexibility in the loader logic when reading unformatted text files. 
```php -/** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/ -\PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); - $inputFileType = 'Csv'; $inputFileName = './sampleData/example1.tsv'; $reader = \PhpOffice\PhpSpreadsheet\IOFactory::createReader($inputFileType); $reader->setDelimiter("\t"); + +/** Tell PhpSpreadsheet that we want to use the Advanced Value Binder **/ +// Old method using static property +\PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); +// Preferred method using dynamic property since 3.4.0 +$reader->setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); + $spreadsheet = $reader->load($inputFileName); ``` @@ -774,7 +778,7 @@ Loading using a Value Binder applies to: Reader | Y/N |Reader | Y/N |Reader | Y/N ----------|:---:|--------|:---:|--------------|:---: Xlsx | NO | Xls | NO | Xml | NO -Ods | NO | SYLK | NO | Gnumeric | NO +Ods | NO | SYLK | YES | Gnumeric | NO CSV | YES | HTML | YES Note that you can also use the Binder to determine how PhpSpreadsheet identified datatypes for values when you set a cell value without explicitly setting a datatype. diff --git a/docs/topics/recipes.md b/docs/topics/recipes.md index 670714c84c..b5a3927039 100644 --- a/docs/topics/recipes.md +++ b/docs/topics/recipes.md @@ -233,7 +233,10 @@ method that suits you the best. 
Here are some examples: ```php // MySQL-like timestamp '2008-12-31' or date string +// Old method using static property \PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet->setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); $spreadsheet->getActiveSheet() ->setCellValue('D1', '2008-12-31'); @@ -599,7 +602,10 @@ when it sees a newline character in a string that you are inserting in a cell. Just like Microsoft Office Excel. Try this: ```php +// Old method using static property \PhpOffice\PhpSpreadsheet\Cell\Cell::setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); +// Preferred method using dynamic property since 3.4.0 +$spreadsheet->setValueBinder( new \PhpOffice\PhpSpreadsheet\Cell\AdvancedValueBinder() ); $spreadsheet->getActiveSheet()->getCell('A1')->setValue("hello\nworld"); ``` diff --git a/samples/Chart33a/33_Chart_create_area.php b/samples/Chart33a/33_Chart_create_area.php index c6e349aa14..4e6fd4ec46 100644 --- a/samples/Chart33a/33_Chart_create_area.php +++ b/samples/Chart33a/33_Chart_create_area.php @@ -91,6 +91,8 @@ $chart->setTopLeftPosition('A7'); $chart->setBottomRightPosition('H20'); +$chart->setNoBorder(true); + // Add the chart to the worksheet $worksheet->addChart($chart); diff --git a/src/PhpSpreadsheet/Cell/Cell.php b/src/PhpSpreadsheet/Cell/Cell.php index d4e08b639e..1e353db913 100644 --- a/src/PhpSpreadsheet/Cell/Cell.php +++ b/src/PhpSpreadsheet/Cell/Cell.php @@ -112,8 +112,11 @@ public function __construct(mixed $value, ?string $dataType, Worksheet $workshee $dataType = DataType::TYPE_STRING; } $this->dataType = $dataType; - } elseif (self::getValueBinder()->bindValue($this, $value) === false) { - throw new SpreadsheetException('Value could not be bound to cell.'); + } else { + $valueBinder = $worksheet->getParent()?->getValueBinder() ?? 
self::getValueBinder(); + if ($valueBinder->bindValue($this, $value) === false) { + throw new SpreadsheetException('Value could not be bound to cell.'); + } } $this->ignoredErrors = new IgnoredErrors(); } @@ -232,7 +235,8 @@ protected static function updateIfCellIsTableHeader(?Worksheet $workSheet, self */ public function setValue(mixed $value, ?IValueBinder $binder = null): self { - $binder ??= self::getValueBinder(); + // Cells?->Worksheet?->Spreadsheet + $binder ??= $this->parent?->getParent()?->getParent()?->getValueBinder() ?? self::getValueBinder(); if (!$binder->bindValue($this, $value)) { throw new SpreadsheetException('Value could not be bound to cell.'); } diff --git a/src/PhpSpreadsheet/Chart/Chart.php b/src/PhpSpreadsheet/Chart/Chart.php index dae43c95d0..6bcae28c87 100644 --- a/src/PhpSpreadsheet/Chart/Chart.php +++ b/src/PhpSpreadsheet/Chart/Chart.php @@ -106,6 +106,8 @@ class Chart private bool $noFill = false; + private bool $noBorder = false; + private bool $roundedCorners = false; private GridLines $borderLines; @@ -696,6 +698,18 @@ public function setNoFill(bool $noFill): self return $this; } + public function getNoBorder(): bool + { + return $this->noBorder; + } + + public function setNoBorder(bool $noBorder): self + { + $this->noBorder = $noBorder; + + return $this; + } + public function getRoundedCorners(): bool { return $this->roundedCorners; diff --git a/src/PhpSpreadsheet/Reader/BaseReader.php b/src/PhpSpreadsheet/Reader/BaseReader.php index 218e69fbd5..de80834d19 100644 --- a/src/PhpSpreadsheet/Reader/BaseReader.php +++ b/src/PhpSpreadsheet/Reader/BaseReader.php @@ -2,6 +2,7 @@ namespace PhpOffice\PhpSpreadsheet\Reader; +use PhpOffice\PhpSpreadsheet\Cell\IValueBinder; use PhpOffice\PhpSpreadsheet\Exception as PhpSpreadsheetException; use PhpOffice\PhpSpreadsheet\Reader\Exception as ReaderException; use PhpOffice\PhpSpreadsheet\Reader\Security\XmlScanner; @@ -56,6 +57,8 @@ abstract class BaseReader implements IReader protected ?XmlScanner 
$securityScanner = null; + protected ?IValueBinder $valueBinder = null; + public function __construct() { $this->readFilter = new DefaultReadFilter(); @@ -242,4 +245,16 @@ public function listWorksheetNames(string $filename): array return $returnArray; } + + public function getValueBinder(): ?IValueBinder + { + return $this->valueBinder; + } + + public function setValueBinder(?IValueBinder $valueBinder): self + { + $this->valueBinder = $valueBinder; + + return $this; + } } diff --git a/src/PhpSpreadsheet/Reader/Csv.php b/src/PhpSpreadsheet/Reader/Csv.php index c481fe1ab9..33bee9cdfa 100644 --- a/src/PhpSpreadsheet/Reader/Csv.php +++ b/src/PhpSpreadsheet/Reader/Csv.php @@ -263,6 +263,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); // Load into this instance return $this->loadIntoExisting($filename, $spreadsheet); @@ -275,6 +276,7 @@ public function loadSpreadsheetFromString(string $contents): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); // Load into this instance return $this->loadStringOrFile('data://text/plain,' . urlencode($contents), $spreadsheet, true); @@ -413,7 +415,7 @@ private function loadStringOrFile2(string $filename, Spreadsheet $spreadsheet, b // Loop through each line of the file in turn $delimiter = $this->delimiter ?? ''; $rowData = self::getCsv($fileHandle, 0, $delimiter, $this->enclosure, $this->escapeCharacter); - $valueBinder = Cell::getValueBinder(); + $valueBinder = $this->valueBinder ?? 
Cell::getValueBinder(); $preserveBooleanString = method_exists($valueBinder, 'getBooleanConversion') && $valueBinder->getBooleanConversion(); $this->getTrue = Calculation::getTRUE(); $this->getFalse = Calculation::getFALSE(); diff --git a/src/PhpSpreadsheet/Reader/Gnumeric.php b/src/PhpSpreadsheet/Reader/Gnumeric.php index 898e32db5c..d80a87ecc5 100644 --- a/src/PhpSpreadsheet/Reader/Gnumeric.php +++ b/src/PhpSpreadsheet/Reader/Gnumeric.php @@ -226,6 +226,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); $spreadsheet->removeSheetByIndex(0); // Load into this instance diff --git a/src/PhpSpreadsheet/Reader/Html.php b/src/PhpSpreadsheet/Reader/Html.php index 6e9317e783..4e00f7dd3a 100644 --- a/src/PhpSpreadsheet/Reader/Html.php +++ b/src/PhpSpreadsheet/Reader/Html.php @@ -173,6 +173,7 @@ private function readEnding(): string // Phpstan incorrectly flags following line for Php8.2-, corrected in 8.3 $filename = $meta['uri']; //@phpstan-ignore-line + clearstatcache(true, $filename); $size = (int) filesize($filename); if ($size === 0) { return ''; @@ -210,6 +211,7 @@ public function loadSpreadsheetFromFile(string $filename): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); // Load into this instance return $this->loadIntoExisting($filename, $spreadsheet); @@ -792,6 +794,7 @@ public function loadFromString(string $content, ?Spreadsheet $spreadsheet = null throw new Exception('Failed to load content as a DOM Document', 0, $e ?? null); } $spreadsheet = $spreadsheet ?? 
new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); self::loadProperties($dom, $spreadsheet); return $this->loadDocument($dom, $spreadsheet); diff --git a/src/PhpSpreadsheet/Reader/Ods.php b/src/PhpSpreadsheet/Reader/Ods.php index da7b85cee6..85fa08ecf0 100644 --- a/src/PhpSpreadsheet/Reader/Ods.php +++ b/src/PhpSpreadsheet/Reader/Ods.php @@ -234,6 +234,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); $spreadsheet->removeSheetByIndex(0); // Load into this instance diff --git a/src/PhpSpreadsheet/Reader/Slk.php b/src/PhpSpreadsheet/Reader/Slk.php index 2a0b2fcd92..7359f926e3 100644 --- a/src/PhpSpreadsheet/Reader/Slk.php +++ b/src/PhpSpreadsheet/Reader/Slk.php @@ -145,6 +145,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet { // Create new Spreadsheet $spreadsheet = new Spreadsheet(); + $spreadsheet->setValueBinder($this->valueBinder); // Load into this instance return $this->loadIntoExisting($filename, $spreadsheet); diff --git a/src/PhpSpreadsheet/Reader/Xls.php b/src/PhpSpreadsheet/Reader/Xls.php index 833b2550f3..5883b20dad 100644 --- a/src/PhpSpreadsheet/Reader/Xls.php +++ b/src/PhpSpreadsheet/Reader/Xls.php @@ -6,21 +6,16 @@ use PhpOffice\PhpSpreadsheet\Cell\DataType; use PhpOffice\PhpSpreadsheet\Cell\DataValidation; use PhpOffice\PhpSpreadsheet\Exception as PhpSpreadsheetException; -use PhpOffice\PhpSpreadsheet\NamedRange; -use PhpOffice\PhpSpreadsheet\Reader\Xls\ConditionalFormatting; use PhpOffice\PhpSpreadsheet\Reader\Xls\Style\CellFont; use PhpOffice\PhpSpreadsheet\Reader\Xls\Style\FillPattern; use PhpOffice\PhpSpreadsheet\RichText\RichText; use PhpOffice\PhpSpreadsheet\Shared\CodePage; use PhpOffice\PhpSpreadsheet\Shared\Date; use PhpOffice\PhpSpreadsheet\Shared\Escher; -use PhpOffice\PhpSpreadsheet\Shared\Escher\DgContainer\SpgrContainer\SpContainer; -use 
PhpOffice\PhpSpreadsheet\Shared\Escher\DggContainer\BstoreContainer\BSE; use PhpOffice\PhpSpreadsheet\Shared\File; use PhpOffice\PhpSpreadsheet\Shared\OLE; use PhpOffice\PhpSpreadsheet\Shared\OLERead; use PhpOffice\PhpSpreadsheet\Shared\StringHelper; -use PhpOffice\PhpSpreadsheet\Shared\Xls as SharedXls; use PhpOffice\PhpSpreadsheet\Spreadsheet; use PhpOffice\PhpSpreadsheet\Style\Alignment; use PhpOffice\PhpSpreadsheet\Style\Border; @@ -31,7 +26,6 @@ use PhpOffice\PhpSpreadsheet\Style\NumberFormat; use PhpOffice\PhpSpreadsheet\Style\Protection; use PhpOffice\PhpSpreadsheet\Style\Style; -use PhpOffice\PhpSpreadsheet\Worksheet\MemoryDrawing; use PhpOffice\PhpSpreadsheet\Worksheet\PageSetup; use PhpOffice\PhpSpreadsheet\Worksheet\SheetView; use PhpOffice\PhpSpreadsheet\Worksheet\Worksheet; @@ -66,297 +60,175 @@ // Patch code for user-defined named cells supports single cells only. // NOTE: this patch only works for BIFF8 as BIFF5-7 use a different // external sheet reference structure -class Xls extends BaseReader +class Xls extends XlsBase { - private const HIGH_ORDER_BIT = 0x80 << 24; - private const FC000000 = 0xFC << 24; - private const FE000000 = 0xFE << 24; - - // ParseXL definitions - const XLS_BIFF8 = 0x0600; - const XLS_BIFF7 = 0x0500; - const XLS_WORKBOOKGLOBALS = 0x0005; - const XLS_WORKSHEET = 0x0010; - - // record identifiers - const XLS_TYPE_FORMULA = 0x0006; - const XLS_TYPE_EOF = 0x000A; - const XLS_TYPE_PROTECT = 0x0012; - const XLS_TYPE_OBJECTPROTECT = 0x0063; - const XLS_TYPE_SCENPROTECT = 0x00DD; - const XLS_TYPE_PASSWORD = 0x0013; - const XLS_TYPE_HEADER = 0x0014; - const XLS_TYPE_FOOTER = 0x0015; - const XLS_TYPE_EXTERNSHEET = 0x0017; - const XLS_TYPE_DEFINEDNAME = 0x0018; - const XLS_TYPE_VERTICALPAGEBREAKS = 0x001A; - const XLS_TYPE_HORIZONTALPAGEBREAKS = 0x001B; - const XLS_TYPE_NOTE = 0x001C; - const XLS_TYPE_SELECTION = 0x001D; - const XLS_TYPE_DATEMODE = 0x0022; - const XLS_TYPE_EXTERNNAME = 0x0023; - const XLS_TYPE_LEFTMARGIN = 0x0026; - 
const XLS_TYPE_RIGHTMARGIN = 0x0027; - const XLS_TYPE_TOPMARGIN = 0x0028; - const XLS_TYPE_BOTTOMMARGIN = 0x0029; - const XLS_TYPE_PRINTGRIDLINES = 0x002B; - const XLS_TYPE_FILEPASS = 0x002F; - const XLS_TYPE_FONT = 0x0031; - const XLS_TYPE_CONTINUE = 0x003C; - const XLS_TYPE_PANE = 0x0041; - const XLS_TYPE_CODEPAGE = 0x0042; - const XLS_TYPE_DEFCOLWIDTH = 0x0055; - const XLS_TYPE_OBJ = 0x005D; - const XLS_TYPE_COLINFO = 0x007D; - const XLS_TYPE_IMDATA = 0x007F; - const XLS_TYPE_SHEETPR = 0x0081; - const XLS_TYPE_HCENTER = 0x0083; - const XLS_TYPE_VCENTER = 0x0084; - const XLS_TYPE_SHEET = 0x0085; - const XLS_TYPE_PALETTE = 0x0092; - const XLS_TYPE_SCL = 0x00A0; - const XLS_TYPE_PAGESETUP = 0x00A1; - const XLS_TYPE_MULRK = 0x00BD; - const XLS_TYPE_MULBLANK = 0x00BE; - const XLS_TYPE_DBCELL = 0x00D7; - const XLS_TYPE_XF = 0x00E0; - const XLS_TYPE_MERGEDCELLS = 0x00E5; - const XLS_TYPE_MSODRAWINGGROUP = 0x00EB; - const XLS_TYPE_MSODRAWING = 0x00EC; - const XLS_TYPE_SST = 0x00FC; - const XLS_TYPE_LABELSST = 0x00FD; - const XLS_TYPE_EXTSST = 0x00FF; - const XLS_TYPE_EXTERNALBOOK = 0x01AE; - const XLS_TYPE_DATAVALIDATIONS = 0x01B2; - const XLS_TYPE_TXO = 0x01B6; - const XLS_TYPE_HYPERLINK = 0x01B8; - const XLS_TYPE_DATAVALIDATION = 0x01BE; - const XLS_TYPE_DIMENSION = 0x0200; - const XLS_TYPE_BLANK = 0x0201; - const XLS_TYPE_NUMBER = 0x0203; - const XLS_TYPE_LABEL = 0x0204; - const XLS_TYPE_BOOLERR = 0x0205; - const XLS_TYPE_STRING = 0x0207; - const XLS_TYPE_ROW = 0x0208; - const XLS_TYPE_INDEX = 0x020B; - const XLS_TYPE_ARRAY = 0x0221; - const XLS_TYPE_DEFAULTROWHEIGHT = 0x0225; - const XLS_TYPE_WINDOW2 = 0x023E; - const XLS_TYPE_RK = 0x027E; - const XLS_TYPE_STYLE = 0x0293; - const XLS_TYPE_FORMAT = 0x041E; - const XLS_TYPE_SHAREDFMLA = 0x04BC; - const XLS_TYPE_BOF = 0x0809; - const XLS_TYPE_SHEETPROTECTION = 0x0867; - const XLS_TYPE_RANGEPROTECTION = 0x0868; - const XLS_TYPE_SHEETLAYOUT = 0x0862; - const XLS_TYPE_XFEXT = 0x087D; - const XLS_TYPE_PAGELAYOUTVIEW = 
0x088B; - const XLS_TYPE_CFHEADER = 0x01B0; - const XLS_TYPE_CFRULE = 0x01B1; - const XLS_TYPE_UNKNOWN = 0xFFFF; - - // Encryption type - const MS_BIFF_CRYPTO_NONE = 0; - const MS_BIFF_CRYPTO_XOR = 1; - const MS_BIFF_CRYPTO_RC4 = 2; - - // Size of stream blocks when using RC4 encryption - const REKEY_BLOCK = 0x400; - - // should be consistent with Writer\Xls\Style\CellBorder - const BORDER_STYLE_MAP = [ - Border::BORDER_NONE, // => 0x00, - Border::BORDER_THIN, // => 0x01, - Border::BORDER_MEDIUM, // => 0x02, - Border::BORDER_DASHED, // => 0x03, - Border::BORDER_DOTTED, // => 0x04, - Border::BORDER_THICK, // => 0x05, - Border::BORDER_DOUBLE, // => 0x06, - Border::BORDER_HAIR, // => 0x07, - Border::BORDER_MEDIUMDASHED, // => 0x08, - Border::BORDER_DASHDOT, // => 0x09, - Border::BORDER_MEDIUMDASHDOT, // => 0x0A, - Border::BORDER_DASHDOTDOT, // => 0x0B, - Border::BORDER_MEDIUMDASHDOTDOT, // => 0x0C, - Border::BORDER_SLANTDASHDOT, // => 0x0D, - Border::BORDER_OMIT, // => 0x0E, - Border::BORDER_OMIT, // => 0x0F, - ]; - /** * Summary Information stream data. */ - private ?string $summaryInformation = null; + protected ?string $summaryInformation = null; /** * Extended Summary Information stream data. */ - private ?string $documentSummaryInformation = null; + protected ?string $documentSummaryInformation = null; /** * Workbook stream data. (Includes workbook globals substream as well as sheet substreams). */ - private string $data; + protected string $data; /** * Size in bytes of $this->data. */ - private int $dataSize; + protected int $dataSize; /** * Current position in stream. */ - private int $pos; + protected int $pos; /** * Workbook to be returned by the reader. */ - private Spreadsheet $spreadsheet; + protected Spreadsheet $spreadsheet; /** * Worksheet that is currently being built by the reader. */ - private Worksheet $phpSheet; + protected Worksheet $phpSheet; /** * BIFF version. */ - private int $version = 0; - - /** - * Codepage set in the Excel file being read. 
Only important for BIFF5 (Excel 5.0 - Excel 95) - * For BIFF8 (Excel 97 - Excel 2003) this will always have the value 'UTF-16LE'. - */ - private string $codepage = ''; + protected int $version = 0; /** * Shared formats. */ - private array $formats; + protected array $formats; /** * Shared fonts. * * @var Font[] */ - private array $objFonts; + protected array $objFonts; /** * Color palette. */ - private array $palette; + protected array $palette; /** * Worksheets. */ - private array $sheets; + protected array $sheets; /** * External books. */ - private array $externalBooks; + protected array $externalBooks; /** * REF structures. Only applies to BIFF8. */ - private array $ref; + protected array $ref; /** * External names. */ - private array $externalNames; + protected array $externalNames; /** * Defined names. */ - private array $definedname; + protected array $definedname; /** * Shared strings. Only applies to BIFF8. */ - private array $sst; + protected array $sst; /** * Panes are frozen? (in sheet currently being read). See WINDOW2 record. */ - private bool $frozen; + protected bool $frozen; /** * Fit printout to number of pages? (in sheet currently being read). See SHEETPR record. */ - private bool $isFitToPages; + protected bool $isFitToPages; /** * Objects. One OBJ record contributes with one entry. */ - private array $objs; + protected array $objs; /** * Text Objects. One TXO record corresponds with one entry. */ - private array $textObjects; + protected array $textObjects; /** * Cell Annotations (BIFF8). */ - private array $cellNotes; + protected array $cellNotes; /** * The combined MSODRAWINGGROUP data. */ - private string $drawingGroupData; + protected string $drawingGroupData; /** * The combined MSODRAWING data (per sheet). */ - private string $drawingData; + protected string $drawingData; /** * Keep track of XF index. */ - private int $xfIndex; + protected int $xfIndex; /** * Mapping of XF index (that is a cell XF) to final index in cellXf collection. 
*/ - private array $mapCellXfIndex; + protected array $mapCellXfIndex; /** * Mapping of XF index (that is a style XF) to final index in cellStyleXf collection. */ - private array $mapCellStyleXfIndex; + protected array $mapCellStyleXfIndex; /** * The shared formulas in a sheet. One SHAREDFMLA record contributes with one value. */ - private array $sharedFormulas; + protected array $sharedFormulas; /** * The shared formula parts in a sheet. One FORMULA record contributes with one value if it * refers to a shared formula. */ - private array $sharedFormulaParts; + protected array $sharedFormulaParts; /** * The type of encryption in use. */ - private int $encryption = 0; + protected int $encryption = 0; /** * The position in the stream after which contents are encrypted. */ - private int $encryptionStartPos = 0; + protected int $encryptionStartPos = 0; /** * The current RC4 decryption object. */ - private ?Xls\RC4 $rc4Key = null; + protected ?Xls\RC4 $rc4Key = null; /** * The position in the stream that the RC4 decryption object was left at. */ - private int $rc4Pos = 0; + protected int $rc4Pos = 0; /** * The current MD5 context state. @@ -364,1286 +236,1122 @@ class Xls extends BaseReader */ private string $md5Ctxt; // @phpstan-ignore-line - private int $textObjRef; + protected int $textObjRef; - private string $baseCell; + protected string $baseCell; - private bool $activeSheetSet = false; + protected bool $activeSheetSet = false; /** - * Create a new Xls Reader instance. + * Reads names of the worksheets from a file, without parsing the whole file to a PhpSpreadsheet object. */ - public function __construct() + public function listWorksheetNames(string $filename): array { - parent::__construct(); + return (new Xls\ListFunctions())->listWorksheetNames2($filename, $this); } /** - * Can the current IReader read the file? + * Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns). 
*/ - public function canRead(string $filename): bool - { - if (File::testFileNoThrow($filename) === false) { - return false; - } - - try { - // Use ParseXL for the hard work. - $ole = new OLERead(); - - // get excel data - $ole->read($filename); - if ($ole->wrkbook === null) { - throw new Exception('The filename ' . $filename . ' is not recognised as a Spreadsheet file'); - } - - return true; - } catch (PhpSpreadsheetException) { - return false; - } - } - - public function setCodepage(string $codepage): void + public function listWorksheetInfo(string $filename): array { - if (CodePage::validate($codepage) === false) { - throw new PhpSpreadsheetException('Unknown codepage: ' . $codepage); - } - - $this->codepage = $codepage; + return (new Xls\ListFunctions())->listWorksheetInfo2($filename, $this); } - public function getCodepage(): string + /** + * Loads PhpSpreadsheet from file. + */ + protected function loadSpreadsheetFromFile(string $filename): Spreadsheet { - return $this->codepage; + return (new Xls\LoadSpreadsheet())->loadSpreadsheetFromFile2($filename, $this); } /** - * Reads names of the worksheets from a file, without parsing the whole file to a PhpSpreadsheet object. + * Read record data from stream, decrypting as required. 
+ * + * @param string $data Data stream to read from + * @param int $pos Position to start reading from + * @param int $len Record data length + * + * @return string Record data */ - public function listWorksheetNames(string $filename): array + protected function readRecordData(string $data, int $pos, int $len): string { - File::assertFile($filename); - - $worksheetNames = []; - - // Read the OLE file - $this->loadOLE($filename); - - // total byte size of Excel data (workbook global substream + sheet substreams) - $this->dataSize = strlen($this->data); - - $this->pos = 0; - $this->sheets = []; + $data = substr($data, $pos, $len); - // Parse Workbook Global Substream - while ($this->pos < $this->dataSize) { - $code = self::getUInt2d($this->data, $this->pos); + // File not encrypted, or record before encryption start point + if ($this->encryption == self::MS_BIFF_CRYPTO_NONE || $pos < $this->encryptionStartPos) { + return $data; + } - match ($code) { - self::XLS_TYPE_BOF => $this->readBof(), - self::XLS_TYPE_SHEET => $this->readSheet(), - self::XLS_TYPE_EOF => $this->readDefault(), - self::XLS_TYPE_CODEPAGE => $this->readCodepage(), - default => $this->readDefault(), - }; + $recordData = ''; + if ($this->encryption == self::MS_BIFF_CRYPTO_RC4) { + $oldBlock = floor($this->rc4Pos / self::REKEY_BLOCK); + $block = (int) floor($pos / self::REKEY_BLOCK); + $endBlock = (int) floor(($pos + $len) / self::REKEY_BLOCK); - if ($code === self::XLS_TYPE_EOF) { - break; + // Spin an RC4 decryptor to the right spot. If we have a decryptor sitting + // at a point earlier in the current block, re-use it as we can save some time. 
+ if ($block != $oldBlock || $pos < $this->rc4Pos || !$this->rc4Key) { + $this->rc4Key = $this->makeKey($block, $this->md5Ctxt); + $step = $pos % self::REKEY_BLOCK; + } else { + $step = $pos - $this->rc4Pos; } - } + $this->rc4Key->RC4(str_repeat("\0", $step)); - foreach ($this->sheets as $sheet) { - if ($sheet['sheetType'] != 0x00) { - // 0x00: Worksheet, 0x02: Chart, 0x06: Visual Basic module - continue; + // Decrypt record data (re-keying at the end of every block) + while ($block != $endBlock) { + $step = self::REKEY_BLOCK - ($pos % self::REKEY_BLOCK); + $recordData .= $this->rc4Key->RC4(substr($data, 0, $step)); + $data = substr($data, $step); + $pos += $step; + $len -= $step; + ++$block; + $this->rc4Key = $this->makeKey($block, $this->md5Ctxt); } + $recordData .= $this->rc4Key->RC4(substr($data, 0, $len)); - $worksheetNames[] = $sheet['name']; + // Keep track of the position of this decryptor. + // We'll try and re-use it later if we can to speed things up + $this->rc4Pos = $pos + $len; + } elseif ($this->encryption == self::MS_BIFF_CRYPTO_XOR) { + throw new Exception('XOr encryption not supported'); } - return $worksheetNames; + return $recordData; } /** - * Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns). + * Use OLE reader to extract the relevant data streams from the OLE file. 
*/ - public function listWorksheetInfo(string $filename): array + protected function loadOLE(string $filename): void { - File::assertFile($filename); + // OLE reader + $ole = new OLERead(); + // get excel data, + $ole->read($filename); + // Get workbook data: workbook stream + sheet streams + $this->data = $ole->getStream($ole->wrkbook); // @phpstan-ignore-line + // Get summary information data + $this->summaryInformation = $ole->getStream($ole->summaryInformation); + // Get additional document summary information data + $this->documentSummaryInformation = $ole->getStream($ole->documentSummaryInformation); + } - $worksheetInfo = []; + /** + * Read summary information. + */ + protected function readSummaryInformation(): void + { + if (!isset($this->summaryInformation)) { + return; + } - // Read the OLE file - $this->loadOLE($filename); + // offset: 0; size: 2; must be 0xFE 0xFF (UTF-16 LE byte order mark) + // offset: 2; size: 2; + // offset: 4; size: 2; OS version + // offset: 6; size: 2; OS indicator + // offset: 8; size: 16 + // offset: 24; size: 4; section count + //$secCount = self::getInt4d($this->summaryInformation, 24); - // total byte size of Excel data (workbook global substream + sheet substreams) - $this->dataSize = strlen($this->data); + // offset: 28; size: 16; first section's class id: e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 + // offset: 44; size: 4 + $secOffset = self::getInt4d($this->summaryInformation, 44); - // initialize - $this->pos = 0; - $this->sheets = []; + // section header + // offset: $secOffset; size: 4; section length + //$secLength = self::getInt4d($this->summaryInformation, $secOffset); - // Parse Workbook Global Substream - while ($this->pos < $this->dataSize) { - $code = self::getUInt2d($this->data, $this->pos); + // offset: $secOffset+4; size: 4; property count + $countProperties = self::getInt4d($this->summaryInformation, $secOffset + 4); - match ($code) { - self::XLS_TYPE_BOF => $this->readBof(), - self::XLS_TYPE_SHEET => 
$this->readSheet(), - self::XLS_TYPE_EOF => $this->readDefault(), - self::XLS_TYPE_CODEPAGE => $this->readCodepage(), - default => $this->readDefault(), - }; + // initialize code page (used to resolve string values) + $codePage = 'CP1252'; - if ($code === self::XLS_TYPE_EOF) { - break; - } - } + // offset: ($secOffset+8); size: var + // loop through property decarations and properties + for ($i = 0; $i < $countProperties; ++$i) { + // offset: ($secOffset+8) + (8 * $i); size: 4; property ID + $id = self::getInt4d($this->summaryInformation, ($secOffset + 8) + (8 * $i)); - // Parse the individual sheets - foreach ($this->sheets as $sheet) { - if ($sheet['sheetType'] != 0x00) { - // 0x00: Worksheet - // 0x02: Chart - // 0x06: Visual Basic module - continue; - } + // Use value of property id as appropriate + // offset: ($secOffset+12) + (8 * $i); size: 4; offset from beginning of section (48) + $offset = self::getInt4d($this->summaryInformation, ($secOffset + 12) + (8 * $i)); - $tmpInfo = []; - $tmpInfo['worksheetName'] = $sheet['name']; - $tmpInfo['lastColumnLetter'] = 'A'; - $tmpInfo['lastColumnIndex'] = 0; - $tmpInfo['totalRows'] = 0; - $tmpInfo['totalColumns'] = 0; + $type = self::getInt4d($this->summaryInformation, $secOffset + $offset); - $this->pos = $sheet['offset']; + // initialize property value + $value = null; - while ($this->pos <= $this->dataSize - 4) { - $code = self::getUInt2d($this->data, $this->pos); + // extract property value based on property type + switch ($type) { + case 0x02: // 2 byte signed integer + $value = self::getUInt2d($this->summaryInformation, $secOffset + 4 + $offset); - switch ($code) { - case self::XLS_TYPE_RK: - case self::XLS_TYPE_LABELSST: - case self::XLS_TYPE_NUMBER: - case self::XLS_TYPE_FORMULA: - case self::XLS_TYPE_BOOLERR: - case self::XLS_TYPE_LABEL: - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + break; + case 0x03: // 4 byte signed 
integer + $value = self::getInt4d($this->summaryInformation, $secOffset + 4 + $offset); - // move stream pointer to next record - $this->pos += 4 + $length; + break; + case 0x13: // 4 byte unsigned integer + // not needed yet, fix later if necessary + break; + case 0x1E: // null-terminated string prepended by dword string length + $byteLength = self::getInt4d($this->summaryInformation, $secOffset + 4 + $offset); + $value = substr($this->summaryInformation, $secOffset + 8 + $offset, $byteLength); + $value = StringHelper::convertEncoding($value, 'UTF-8', $codePage); + $value = rtrim($value); - $rowIndex = self::getUInt2d($recordData, 0) + 1; - $columnIndex = self::getUInt2d($recordData, 2); + break; + case 0x40: // Filetime (64-bit value representing the number of 100-nanosecond intervals since January 1, 1601) + // PHP-time + $value = OLE::OLE2LocalDate(substr($this->summaryInformation, $secOffset + 4 + $offset, 8)); - $tmpInfo['totalRows'] = max($tmpInfo['totalRows'], $rowIndex); - $tmpInfo['lastColumnIndex'] = max($tmpInfo['lastColumnIndex'], $columnIndex); + break; + case 0x47: // Clipboard format + // not needed yet, fix later if necessary + break; + } - break; - case self::XLS_TYPE_BOF: - $this->readBof(); + switch ($id) { + case 0x01: // Code Page + $codePage = CodePage::numberToName((int) $value); - break; - case self::XLS_TYPE_EOF: - $this->readDefault(); + break; + case 0x02: // Title + $this->spreadsheet->getProperties()->setTitle("$value"); - break 2; - default: - $this->readDefault(); + break; + case 0x03: // Subject + $this->spreadsheet->getProperties()->setSubject("$value"); - break; - } - } + break; + case 0x04: // Author (Creator) + $this->spreadsheet->getProperties()->setCreator("$value"); - $tmpInfo['lastColumnLetter'] = Coordinate::stringFromColumnIndex($tmpInfo['lastColumnIndex'] + 1); - $tmpInfo['totalColumns'] = $tmpInfo['lastColumnIndex'] + 1; + break; + case 0x05: // Keywords + $this->spreadsheet->getProperties()->setKeywords("$value"); - 
$worksheetInfo[] = $tmpInfo; - } + break; + case 0x06: // Comments (Description) + $this->spreadsheet->getProperties()->setDescription("$value"); + + break; + case 0x07: // Template + // Not supported by PhpSpreadsheet + break; + case 0x08: // Last Saved By (LastModifiedBy) + $this->spreadsheet->getProperties()->setLastModifiedBy("$value"); + + break; + case 0x09: // Revision + // Not supported by PhpSpreadsheet + break; + case 0x0A: // Total Editing Time + // Not supported by PhpSpreadsheet + break; + case 0x0B: // Last Printed + // Not supported by PhpSpreadsheet + break; + case 0x0C: // Created Date/Time + $this->spreadsheet->getProperties()->setCreated($value); + + break; + case 0x0D: // Modified Date/Time + $this->spreadsheet->getProperties()->setModified($value); - return $worksheetInfo; + break; + case 0x0E: // Number of Pages + // Not supported by PhpSpreadsheet + break; + case 0x0F: // Number of Words + // Not supported by PhpSpreadsheet + break; + case 0x10: // Number of Characters + // Not supported by PhpSpreadsheet + break; + case 0x11: // Thumbnail + // Not supported by PhpSpreadsheet + break; + case 0x12: // Name of creating application + // Not supported by PhpSpreadsheet + break; + case 0x13: // Security + // Not supported by PhpSpreadsheet + break; + } + } } /** - * Loads PhpSpreadsheet from file. + * Read additional document summary information. 
*/ - protected function loadSpreadsheetFromFile(string $filename): Spreadsheet + protected function readDocumentSummaryInformation(): void { - // Read the OLE file - $this->loadOLE($filename); - - // Initialisations - $this->spreadsheet = new Spreadsheet(); - $this->spreadsheet->removeSheetByIndex(0); // remove 1st sheet - if (!$this->readDataOnly) { - $this->spreadsheet->removeCellStyleXfByIndex(0); // remove the default style - $this->spreadsheet->removeCellXfByIndex(0); // remove the default style + if (!isset($this->documentSummaryInformation)) { + return; } - // Read the summary information stream (containing meta data) - $this->readSummaryInformation(); - - // Read the Additional document summary information stream (containing application-specific meta data) - $this->readDocumentSummaryInformation(); - - // total byte size of Excel data (workbook global substream + sheet substreams) - $this->dataSize = strlen($this->data); - - // initialize - $this->pos = 0; - $this->codepage = $this->codepage ?: CodePage::DEFAULT_CODE_PAGE; - $this->formats = []; - $this->objFonts = []; - $this->palette = []; - $this->sheets = []; - $this->externalBooks = []; - $this->ref = []; - $this->definedname = []; - $this->sst = []; - $this->drawingGroupData = ''; - $this->xfIndex = 0; - $this->mapCellXfIndex = []; - $this->mapCellStyleXfIndex = []; - - // Parse Workbook Global Substream - while ($this->pos < $this->dataSize) { - $code = self::getUInt2d($this->data, $this->pos); - - match ($code) { - self::XLS_TYPE_BOF => $this->readBof(), - self::XLS_TYPE_FILEPASS => $this->readFilepass(), - self::XLS_TYPE_CODEPAGE => $this->readCodepage(), - self::XLS_TYPE_DATEMODE => $this->readDateMode(), - self::XLS_TYPE_FONT => $this->readFont(), - self::XLS_TYPE_FORMAT => $this->readFormat(), - self::XLS_TYPE_XF => $this->readXf(), - self::XLS_TYPE_XFEXT => $this->readXfExt(), - self::XLS_TYPE_STYLE => $this->readStyle(), - self::XLS_TYPE_PALETTE => $this->readPalette(), - self::XLS_TYPE_SHEET 
=> $this->readSheet(), - self::XLS_TYPE_EXTERNALBOOK => $this->readExternalBook(), - self::XLS_TYPE_EXTERNNAME => $this->readExternName(), - self::XLS_TYPE_EXTERNSHEET => $this->readExternSheet(), - self::XLS_TYPE_DEFINEDNAME => $this->readDefinedName(), - self::XLS_TYPE_MSODRAWINGGROUP => $this->readMsoDrawingGroup(), - self::XLS_TYPE_SST => $this->readSst(), - self::XLS_TYPE_EOF => $this->readDefault(), - default => $this->readDefault(), - }; - - if ($code === self::XLS_TYPE_EOF) { - break; - } - } + // offset: 0; size: 2; must be 0xFE 0xFF (UTF-16 LE byte order mark) + // offset: 2; size: 2; + // offset: 4; size: 2; OS version + // offset: 6; size: 2; OS indicator + // offset: 8; size: 16 + // offset: 24; size: 4; section count + //$secCount = self::getInt4d($this->documentSummaryInformation, 24); - // Resolve indexed colors for font, fill, and border colors - // Cannot be resolved already in XF record, because PALETTE record comes afterwards - if (!$this->readDataOnly) { - foreach ($this->objFonts as $objFont) { - if (isset($objFont->colorIndex)) { - $color = Xls\Color::map($objFont->colorIndex, $this->palette, $this->version); - $objFont->getColor()->setRGB($color['rgb']); - } - } + // offset: 28; size: 16; first section's class id: 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae + // offset: 44; size: 4; first section offset + $secOffset = self::getInt4d($this->documentSummaryInformation, 44); - foreach ($this->spreadsheet->getCellXfCollection() as $objStyle) { - // fill start and end color - $fill = $objStyle->getFill(); + // section header + // offset: $secOffset; size: 4; section length + //$secLength = self::getInt4d($this->documentSummaryInformation, $secOffset); - if (isset($fill->startcolorIndex)) { - $startColor = Xls\Color::map($fill->startcolorIndex, $this->palette, $this->version); - $fill->getStartColor()->setRGB($startColor['rgb']); - } - if (isset($fill->endcolorIndex)) { - $endColor = Xls\Color::map($fill->endcolorIndex, $this->palette, 
$this->version); - $fill->getEndColor()->setRGB($endColor['rgb']); - } + // offset: $secOffset+4; size: 4; property count + $countProperties = self::getInt4d($this->documentSummaryInformation, $secOffset + 4); - // border colors - $top = $objStyle->getBorders()->getTop(); - $right = $objStyle->getBorders()->getRight(); - $bottom = $objStyle->getBorders()->getBottom(); - $left = $objStyle->getBorders()->getLeft(); - $diagonal = $objStyle->getBorders()->getDiagonal(); + // initialize code page (used to resolve string values) + $codePage = 'CP1252'; - if (isset($top->colorIndex)) { - $borderTopColor = Xls\Color::map($top->colorIndex, $this->palette, $this->version); - $top->getColor()->setRGB($borderTopColor['rgb']); - } - if (isset($right->colorIndex)) { - $borderRightColor = Xls\Color::map($right->colorIndex, $this->palette, $this->version); - $right->getColor()->setRGB($borderRightColor['rgb']); - } - if (isset($bottom->colorIndex)) { - $borderBottomColor = Xls\Color::map($bottom->colorIndex, $this->palette, $this->version); - $bottom->getColor()->setRGB($borderBottomColor['rgb']); - } - if (isset($left->colorIndex)) { - $borderLeftColor = Xls\Color::map($left->colorIndex, $this->palette, $this->version); - $left->getColor()->setRGB($borderLeftColor['rgb']); - } - if (isset($diagonal->colorIndex)) { - $borderDiagonalColor = Xls\Color::map($diagonal->colorIndex, $this->palette, $this->version); - $diagonal->getColor()->setRGB($borderDiagonalColor['rgb']); - } - } - } - - // treat MSODRAWINGGROUP records, workbook-level Escher - $escherWorkbook = null; - if (!$this->readDataOnly && $this->drawingGroupData) { - $escher = new Escher(); - $reader = new Xls\Escher($escher); - $escherWorkbook = $reader->load($this->drawingGroupData); - } - - // Parse the individual sheets - $this->activeSheetSet = false; - foreach ($this->sheets as $sheet) { - $selectedCells = ''; - if ($sheet['sheetType'] != 0x00) { - // 0x00: Worksheet, 0x02: Chart, 0x06: Visual Basic module - continue; 
- } + // offset: ($secOffset+8); size: var + // loop through property decarations and properties + for ($i = 0; $i < $countProperties; ++$i) { + // offset: ($secOffset+8) + (8 * $i); size: 4; property ID + $id = self::getInt4d($this->documentSummaryInformation, ($secOffset + 8) + (8 * $i)); - // check if sheet should be skipped - if (isset($this->loadSheetsOnly) && !in_array($sheet['name'], $this->loadSheetsOnly)) { - continue; - } + // Use value of property id as appropriate + // offset: 60 + 8 * $i; size: 4; offset from beginning of section (48) + $offset = self::getInt4d($this->documentSummaryInformation, ($secOffset + 12) + (8 * $i)); - // add sheet to PhpSpreadsheet object - $this->phpSheet = $this->spreadsheet->createSheet(); - // Use false for $updateFormulaCellReferences to prevent adjustment of worksheet references in formula - // cells... during the load, all formulae should be correct, and we're simply bringing the worksheet - // name in line with the formula, not the reverse - $this->phpSheet->setTitle($sheet['name'], false, false); - $this->phpSheet->setSheetState($sheet['sheetState']); + $type = self::getInt4d($this->documentSummaryInformation, $secOffset + $offset); - $this->pos = $sheet['offset']; + // initialize property value + $value = null; - // Initialize isFitToPages. May change after reading SHEETPR record. - $this->isFitToPages = false; + // extract property value based on property type + switch ($type) { + case 0x02: // 2 byte signed integer + $value = self::getUInt2d($this->documentSummaryInformation, $secOffset + 4 + $offset); - // Initialize drawingData - $this->drawingData = ''; + break; + case 0x03: // 4 byte signed integer + $value = self::getInt4d($this->documentSummaryInformation, $secOffset + 4 + $offset); - // Initialize objs - $this->objs = []; + break; + case 0x0B: // Boolean + $value = self::getUInt2d($this->documentSummaryInformation, $secOffset + 4 + $offset); + $value = ($value == 0 ? 
false : true); - // Initialize shared formula parts - $this->sharedFormulaParts = []; + break; + case 0x13: // 4 byte unsigned integer + // not needed yet, fix later if necessary + break; + case 0x1E: // null-terminated string prepended by dword string length + $byteLength = self::getInt4d($this->documentSummaryInformation, $secOffset + 4 + $offset); + $value = substr($this->documentSummaryInformation, $secOffset + 8 + $offset, $byteLength); + $value = StringHelper::convertEncoding($value, 'UTF-8', $codePage); + $value = rtrim($value); - // Initialize shared formulas - $this->sharedFormulas = []; + break; + case 0x40: // Filetime (64-bit value representing the number of 100-nanosecond intervals since January 1, 1601) + // PHP-Time + $value = OLE::OLE2LocalDate(substr($this->documentSummaryInformation, $secOffset + 4 + $offset, 8)); - // Initialize text objs - $this->textObjects = []; + break; + case 0x47: // Clipboard format + // not needed yet, fix later if necessary + break; + } - // Initialize cell annotations - $this->cellNotes = []; - $this->textObjRef = -1; + switch ($id) { + case 0x01: // Code Page + $codePage = CodePage::numberToName((int) $value); - while ($this->pos <= $this->dataSize - 4) { - $code = self::getUInt2d($this->data, $this->pos); + break; + case 0x02: // Category + $this->spreadsheet->getProperties()->setCategory("$value"); - switch ($code) { - case self::XLS_TYPE_BOF: - $this->readBof(); + break; + case 0x03: // Presentation Target + // Not supported by PhpSpreadsheet + break; + case 0x04: // Bytes + // Not supported by PhpSpreadsheet + break; + case 0x05: // Lines + // Not supported by PhpSpreadsheet + break; + case 0x06: // Paragraphs + // Not supported by PhpSpreadsheet + break; + case 0x07: // Slides + // Not supported by PhpSpreadsheet + break; + case 0x08: // Notes + // Not supported by PhpSpreadsheet + break; + case 0x09: // Hidden Slides + // Not supported by PhpSpreadsheet + break; + case 0x0A: // MM Clips + // Not supported by 
PhpSpreadsheet + break; + case 0x0B: // Scale Crop + // Not supported by PhpSpreadsheet + break; + case 0x0C: // Heading Pairs + // Not supported by PhpSpreadsheet + break; + case 0x0D: // Titles of Parts + // Not supported by PhpSpreadsheet + break; + case 0x0E: // Manager + $this->spreadsheet->getProperties()->setManager("$value"); - break; - case self::XLS_TYPE_PRINTGRIDLINES: - $this->readPrintGridlines(); + break; + case 0x0F: // Company + $this->spreadsheet->getProperties()->setCompany("$value"); - break; - case self::XLS_TYPE_DEFAULTROWHEIGHT: - $this->readDefaultRowHeight(); + break; + case 0x10: // Links up-to-date + // Not supported by PhpSpreadsheet + break; + } + } + } - break; - case self::XLS_TYPE_SHEETPR: - $this->readSheetPr(); + /** + * Reads a general type of BIFF record. Does nothing except for moving stream pointer forward to next record. + */ + protected function readDefault(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); - break; - case self::XLS_TYPE_HORIZONTALPAGEBREAKS: - $this->readHorizontalPageBreaks(); + // move stream pointer to next record + $this->pos += 4 + $length; + } - break; - case self::XLS_TYPE_VERTICALPAGEBREAKS: - $this->readVerticalPageBreaks(); + /** + * The NOTE record specifies a comment associated with a particular cell. In Excel 95 (BIFF7) and earlier versions, + * this record stores a note (cell note). This feature was significantly enhanced in Excel 97. 
+ */ + protected function readNote(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case self::XLS_TYPE_HEADER: - $this->readHeader(); + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case self::XLS_TYPE_FOOTER: - $this->readFooter(); + if ($this->readDataOnly) { + return; + } - break; - case self::XLS_TYPE_HCENTER: - $this->readHcenter(); + $cellAddress = Xls\Biff8::readBIFF8CellAddress(substr($recordData, 0, 4)); + if ($this->version == self::XLS_BIFF8) { + $noteObjID = self::getUInt2d($recordData, 6); + $noteAuthor = self::readUnicodeStringLong(substr($recordData, 8)); + $noteAuthor = $noteAuthor['value']; + $this->cellNotes[$noteObjID] = [ + 'cellRef' => $cellAddress, + 'objectID' => $noteObjID, + 'author' => $noteAuthor, + ]; + } else { + $extension = false; + if ($cellAddress == '$B$65536') { + // If the address row is -1 and the column is 0, (which translates as $B$65536) then this is a continuation + // note from the previous cell annotation. We're not yet handling this, so annotations longer than the + // max 2048 bytes will probably throw a wobbly. + //$row = self::getUInt2d($recordData, 0); + $extension = true; + $arrayKeys = array_keys($this->phpSheet->getComments()); + $cellAddress = array_pop($arrayKeys); + } - break; - case self::XLS_TYPE_VCENTER: - $this->readVcenter(); + $cellAddress = str_replace('$', '', (string) $cellAddress); + //$noteLength = self::getUInt2d($recordData, 4); + $noteText = trim(substr($recordData, 6)); - break; - case self::XLS_TYPE_LEFTMARGIN: - $this->readLeftMargin(); + if ($extension) { + // Concatenate this extension with the currently set comment for the cell + $comment = $this->phpSheet->getComment($cellAddress); + $commentText = $comment->getText()->getPlainText(); + $comment->setText($this->parseRichText($commentText . 
$noteText)); + } else { + // Set comment for the cell + $this->phpSheet->getComment($cellAddress)->setText($this->parseRichText($noteText)); +// ->setAuthor($author) + } + } + } - break; - case self::XLS_TYPE_RIGHTMARGIN: - $this->readRightMargin(); + /** + * The TEXT Object record contains the text associated with a cell annotation. + */ + protected function readTextObject(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case self::XLS_TYPE_TOPMARGIN: - $this->readTopMargin(); + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case self::XLS_TYPE_BOTTOMMARGIN: - $this->readBottomMargin(); + if ($this->readDataOnly) { + return; + } - break; - case self::XLS_TYPE_PAGESETUP: - $this->readPageSetup(); + // recordData consists of an array of subrecords looking like this: + // grbit: 2 bytes; Option Flags + // rot: 2 bytes; rotation + // cchText: 2 bytes; length of the text (in the first continue record) + // cbRuns: 2 bytes; length of the formatting (in the second continue record) + // followed by the continuation records containing the actual text and formatting + $grbitOpts = self::getUInt2d($recordData, 0); + $rot = self::getUInt2d($recordData, 2); + //$cchText = self::getUInt2d($recordData, 10); + $cbRuns = self::getUInt2d($recordData, 12); + $text = $this->getSplicedRecordData(); - break; - case self::XLS_TYPE_PROTECT: - $this->readProtect(); + $textByte = $text['spliceOffsets'][1] - $text['spliceOffsets'][0] - 1; + $textStr = substr($text['recordData'], $text['spliceOffsets'][0] + 1, $textByte); + // get 1 byte + $is16Bit = ord($text['recordData'][0]); + // it is possible to use a compressed format, + // which omits the high bytes of all characters, if they are all zero + if (($is16Bit & 0x01) === 0) { + $textStr = StringHelper::ConvertEncoding($textStr, 'UTF-8', 'ISO-8859-1'); + } else { + $textStr = 
$this->decodeCodepage($textStr); + } - break; - case self::XLS_TYPE_SCENPROTECT: - $this->readScenProtect(); + $this->textObjects[$this->textObjRef] = [ + 'text' => $textStr, + 'format' => substr($text['recordData'], $text['spliceOffsets'][1], $cbRuns), + 'alignment' => $grbitOpts, + 'rotation' => $rot, + ]; + } - break; - case self::XLS_TYPE_OBJECTPROTECT: - $this->readObjectProtect(); + /** + * Read BOF. + */ + protected function readBof(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = substr($this->data, $this->pos + 4, $length); - break; - case self::XLS_TYPE_PASSWORD: - $this->readPassword(); + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case self::XLS_TYPE_DEFCOLWIDTH: - $this->readDefColWidth(); + // offset: 2; size: 2; type of the following data + $substreamType = self::getUInt2d($recordData, 2); - break; - case self::XLS_TYPE_COLINFO: - $this->readColInfo(); + switch ($substreamType) { + case self::XLS_WORKBOOKGLOBALS: + $version = self::getUInt2d($recordData, 0); + if (($version != self::XLS_BIFF8) && ($version != self::XLS_BIFF7)) { + throw new Exception('Cannot read this Excel file. Version is too old.'); + } + $this->version = $version; - break; - case self::XLS_TYPE_DIMENSION: - $this->readDefault(); + break; + case self::XLS_WORKSHEET: + // do not use this version information for anything + // it is unreliable (OpenOffice doc, 5.8), use only version information from the global stream + break; + default: + // substream, e.g. 
chart + // just skip the entire substream + do { + $code = self::getUInt2d($this->data, $this->pos); + $this->readDefault(); + } while ($code != self::XLS_TYPE_EOF && $this->pos < $this->dataSize); - break; - case self::XLS_TYPE_ROW: - $this->readRow(); + break; + } + } - break; - case self::XLS_TYPE_DBCELL: - $this->readDefault(); - - break; - case self::XLS_TYPE_RK: - $this->readRk(); - - break; - case self::XLS_TYPE_LABELSST: - $this->readLabelSst(); - - break; - case self::XLS_TYPE_MULRK: - $this->readMulRk(); - - break; - case self::XLS_TYPE_NUMBER: - $this->readNumber(); - - break; - case self::XLS_TYPE_FORMULA: - $this->readFormula(); - - break; - case self::XLS_TYPE_SHAREDFMLA: - $this->readSharedFmla(); - - break; - case self::XLS_TYPE_BOOLERR: - $this->readBoolErr(); - - break; - case self::XLS_TYPE_MULBLANK: - $this->readMulBlank(); - - break; - case self::XLS_TYPE_LABEL: - $this->readLabel(); - - break; - case self::XLS_TYPE_BLANK: - $this->readBlank(); - - break; - case self::XLS_TYPE_MSODRAWING: - $this->readMsoDrawing(); - - break; - case self::XLS_TYPE_OBJ: - $this->readObj(); + /** + * FILEPASS. + * + * This record is part of the File Protection Block. It + * contains information about the read/write password of the + * file. All record contents following this record will be + * encrypted. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" + * + * The decryption functions and objects used from here on in + * are based on the source of Spreadsheet-ParseExcel: + * https://metacpan.org/release/Spreadsheet-ParseExcel + */ + protected function readFilepass(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); - break; - case self::XLS_TYPE_WINDOW2: - $this->readWindow2(); + if ($length != 54) { + throw new Exception('Unexpected file pass record length'); + } - break; - case self::XLS_TYPE_PAGELAYOUTVIEW: - $this->readPageLayoutView(); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case self::XLS_TYPE_SCL: - $this->readScl(); + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case self::XLS_TYPE_PANE: - $this->readPane(); + if (!$this->verifyPassword('VelvetSweatshop', substr($recordData, 6, 16), substr($recordData, 22, 16), substr($recordData, 38, 16), $this->md5Ctxt)) { + throw new Exception('Decryption password incorrect'); + } - break; - case self::XLS_TYPE_SELECTION: - $selectedCells = $this->readSelection(); + $this->encryption = self::MS_BIFF_CRYPTO_RC4; - break; - case self::XLS_TYPE_MERGEDCELLS: - $this->readMergedCells(); + // Decryption required from the record after next onwards + $this->encryptionStartPos = $this->pos + self::getUInt2d($this->data, $this->pos + 2); + } - break; - case self::XLS_TYPE_HYPERLINK: - $this->readHyperLink(); + /** + * Make an RC4 decryptor for the given block. 
+ * + * @param int $block Block for which to create decrypto + * @param string $valContext MD5 context state + */ + private function makeKey(int $block, string $valContext): Xls\RC4 + { + $pwarray = str_repeat("\0", 64); - break; - case self::XLS_TYPE_DATAVALIDATIONS: - $this->readDataValidations(); + for ($i = 0; $i < 5; ++$i) { + $pwarray[$i] = $valContext[$i]; + } - break; - case self::XLS_TYPE_DATAVALIDATION: - $this->readDataValidation(); + $pwarray[5] = chr($block & 0xFF); + $pwarray[6] = chr(($block >> 8) & 0xFF); + $pwarray[7] = chr(($block >> 16) & 0xFF); + $pwarray[8] = chr(($block >> 24) & 0xFF); - break; - case self::XLS_TYPE_CFHEADER: - $cellRangeAddresses = $this->readCFHeader(); + $pwarray[9] = "\x80"; + $pwarray[56] = "\x48"; - break; - case self::XLS_TYPE_CFRULE: - $this->readCFRule($cellRangeAddresses ?? []); + $md5 = new Xls\MD5(); + $md5->add($pwarray); - break; - case self::XLS_TYPE_SHEETLAYOUT: - $this->readSheetLayout(); + $s = $md5->getContext(); - break; - case self::XLS_TYPE_SHEETPROTECTION: - $this->readSheetProtection(); + return new Xls\RC4($s); + } - break; - case self::XLS_TYPE_RANGEPROTECTION: - $this->readRangeProtection(); + /** + * Verify RC4 file password. 
+ * + * @param string $password Password to check + * @param string $docid Document id + * @param string $salt_data Salt data + * @param string $hashedsalt_data Hashed salt data + * @param string $valContext Set to the MD5 context of the value + * + * @return bool Success + */ + private function verifyPassword(string $password, string $docid, string $salt_data, string $hashedsalt_data, string &$valContext): bool + { + $pwarray = str_repeat("\0", 64); - break; - case self::XLS_TYPE_NOTE: - $this->readNote(); + $iMax = strlen($password); + for ($i = 0; $i < $iMax; ++$i) { + $o = ord(substr($password, $i, 1)); + $pwarray[2 * $i] = chr($o & 0xFF); + $pwarray[2 * $i + 1] = chr(($o >> 8) & 0xFF); + } + $pwarray[2 * $i] = chr(0x80); + $pwarray[56] = chr(($i << 4) & 0xFF); - break; - case self::XLS_TYPE_TXO: - $this->readTextObject(); + $md5 = new Xls\MD5(); + $md5->add($pwarray); - break; - case self::XLS_TYPE_CONTINUE: - $this->readContinue(); + $mdContext1 = $md5->getContext(); - break; - case self::XLS_TYPE_EOF: - $this->readDefault(); + $offset = 0; + $keyoffset = 0; + $tocopy = 5; - break 2; - default: - $this->readDefault(); + $md5->reset(); - break; - } + while ($offset != 16) { + if ((64 - $offset) < 5) { + $tocopy = 64 - $offset; } + for ($i = 0; $i <= $tocopy; ++$i) { + $pwarray[$offset + $i] = $mdContext1[$keyoffset + $i]; + } + $offset += $tocopy; - // treat MSODRAWING records, sheet-level Escher - if (!$this->readDataOnly && $this->drawingData) { - $escherWorksheet = new Escher(); - $reader = new Xls\Escher($escherWorksheet); - $escherWorksheet = $reader->load($this->drawingData); + if ($offset == 64) { + $md5->add($pwarray); + $keyoffset = $tocopy; + $tocopy = 5 - $tocopy; + $offset = 0; - // get all spContainers in one long array, so they can be mapped to OBJ records - /** @var SpContainer[] $allSpContainers */ - $allSpContainers = method_exists($escherWorksheet, 'getDgContainer') ? 
$escherWorksheet->getDgContainer()->getSpgrContainer()->getAllSpContainers() : []; + continue; } - // treat OBJ records - foreach ($this->objs as $n => $obj) { - // the first shape container never has a corresponding OBJ record, hence $n + 1 - if (isset($allSpContainers[$n + 1])) { - $spContainer = $allSpContainers[$n + 1]; + $keyoffset = 0; + $tocopy = 5; + for ($i = 0; $i < 16; ++$i) { + $pwarray[$offset + $i] = $docid[$i]; + } + $offset += 16; + } - // we skip all spContainers that are a part of a group shape since we cannot yet handle those - if ($spContainer->getNestingLevel() > 1) { - continue; - } + $pwarray[16] = "\x80"; + for ($i = 0; $i < 47; ++$i) { + $pwarray[17 + $i] = "\0"; + } + $pwarray[56] = "\x80"; + $pwarray[57] = "\x0a"; - // calculate the width and height of the shape - /** @var int $startRow */ - [$startColumn, $startRow] = Coordinate::coordinateFromString($spContainer->getStartCoordinates()); - /** @var int $endRow */ - [$endColumn, $endRow] = Coordinate::coordinateFromString($spContainer->getEndCoordinates()); - - $startOffsetX = $spContainer->getStartOffsetX(); - $startOffsetY = $spContainer->getStartOffsetY(); - $endOffsetX = $spContainer->getEndOffsetX(); - $endOffsetY = $spContainer->getEndOffsetY(); - - $width = SharedXls::getDistanceX($this->phpSheet, $startColumn, $startOffsetX, $endColumn, $endOffsetX); - $height = SharedXls::getDistanceY($this->phpSheet, $startRow, $startOffsetY, $endRow, $endOffsetY); - - // calculate offsetX and offsetY of the shape - $offsetX = (int) ($startOffsetX * SharedXls::sizeCol($this->phpSheet, $startColumn) / 1024); - $offsetY = (int) ($startOffsetY * SharedXls::sizeRow($this->phpSheet, $startRow) / 256); - - switch ($obj['otObjType']) { - case 0x19: - // Note - if (isset($this->cellNotes[$obj['idObjID']])) { - //$cellNote = $this->cellNotes[$obj['idObjID']]; - - if (isset($this->textObjects[$obj['idObjID']])) { - $textObject = $this->textObjects[$obj['idObjID']]; - 
$this->cellNotes[$obj['idObjID']]['objTextData'] = $textObject; - } - } + $md5->add($pwarray); + $valContext = $md5->getContext(); - break; - case 0x08: - // picture - // get index to BSE entry (1-based) - $BSEindex = $spContainer->getOPT(0x0104); - - // If there is no BSE Index, we will fail here and other fields are not read. - // Fix by checking here. - // TODO: Why is there no BSE Index? Is this a new Office Version? Password protected field? - // More likely : a uncompatible picture - if (!$BSEindex) { - continue 2; - } + $key = $this->makeKey(0, $valContext); - if ($escherWorkbook) { - $BSECollection = method_exists($escherWorkbook, 'getDggContainer') ? $escherWorkbook->getDggContainer()->getBstoreContainer()->getBSECollection() : []; - $BSE = $BSECollection[$BSEindex - 1]; - $blipType = $BSE->getBlipType(); - - // need check because some blip types are not supported by Escher reader such as EMF - if ($blip = $BSE->getBlip()) { - $ih = imagecreatefromstring($blip->getData()); - if ($ih !== false) { - $drawing = new MemoryDrawing(); - $drawing->setImageResource($ih); - - // width, height, offsetX, offsetY - $drawing->setResizeProportional(false); - $drawing->setWidth($width); - $drawing->setHeight($height); - $drawing->setOffsetX($offsetX); - $drawing->setOffsetY($offsetY); - - switch ($blipType) { - case BSE::BLIPTYPE_JPEG: - $drawing->setRenderingFunction(MemoryDrawing::RENDERING_JPEG); - $drawing->setMimeType(MemoryDrawing::MIMETYPE_JPEG); - - break; - case BSE::BLIPTYPE_PNG: - imagealphablending($ih, false); - imagesavealpha($ih, true); - $drawing->setRenderingFunction(MemoryDrawing::RENDERING_PNG); - $drawing->setMimeType(MemoryDrawing::MIMETYPE_PNG); - - break; - } - - $drawing->setWorksheet($this->phpSheet); - $drawing->setCoordinates($spContainer->getStartCoordinates()); - } - } - } + $salt = $key->RC4($salt_data); + $hashedsalt = $key->RC4($hashedsalt_data); - break; - default: - // other object type - break; - } - } - } + $salt .= "\x80" . 
str_repeat("\0", 47); + $salt[56] = "\x80"; - // treat SHAREDFMLA records - if ($this->version == self::XLS_BIFF8) { - foreach ($this->sharedFormulaParts as $cell => $baseCell) { - /** @var int $row */ - [$column, $row] = Coordinate::coordinateFromString($cell); - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($column, $row, $this->phpSheet->getTitle())) { - $formula = $this->getFormulaFromStructure($this->sharedFormulas[$baseCell], $cell); - $this->phpSheet->getCell($cell)->setValueExplicit('=' . $formula, DataType::TYPE_FORMULA); - } - } - } + $md5->reset(); + $md5->add($salt); + $mdContext2 = $md5->getContext(); - if (!empty($this->cellNotes)) { - foreach ($this->cellNotes as $note => $noteDetails) { - if (!isset($noteDetails['objTextData'])) { - if (isset($this->textObjects[$note])) { - $textObject = $this->textObjects[$note]; - $noteDetails['objTextData'] = $textObject; - } else { - $noteDetails['objTextData']['text'] = ''; - } - } - $cellAddress = str_replace('$', '', $noteDetails['cellRef']); - $this->phpSheet->getComment($cellAddress)->setAuthor($noteDetails['author'])->setText($this->parseRichText($noteDetails['objTextData']['text'])); - } - } - if ($selectedCells !== '') { - $this->phpSheet->setSelectedCells($selectedCells); - } - } - if ($this->activeSheetSet === false) { - $this->spreadsheet->setActiveSheetIndex(0); - } + return $mdContext2 == $hashedsalt; + } - // add the named ranges (defined names) - foreach ($this->definedname as $definedName) { - if ($definedName['isBuiltInName']) { - switch ($definedName['name']) { - case pack('C', 0x06): - // print area - // in general, formula looks like this: Foo!$C$7:$J$66,Bar!$A$1:$IV$2 - $ranges = explode(',', $definedName['formula']); // FIXME: what if sheetname contains comma? 
- - $extractedRanges = []; - $sheetName = ''; - /** @var non-empty-string $range */ - foreach ($ranges as $range) { - // $range should look like one of these - // Foo!$C$7:$J$66 - // Bar!$A$1:$IV$2 - $explodes = Worksheet::extractSheetTitle($range, true); - $sheetName = trim($explodes[0], "'"); - if (!str_contains($explodes[1], ':')) { - $explodes[1] = $explodes[1] . ':' . $explodes[1]; - } - $extractedRanges[] = str_replace('$', '', $explodes[1]); // C7:J66 - } - if ($docSheet = $this->spreadsheet->getSheetByName($sheetName)) { - $docSheet->getPageSetup()->setPrintArea(implode(',', $extractedRanges)); // C7:J66,A1:IV2 - } + /** + * CODEPAGE. + * + * This record stores the text encoding used to write byte + * strings, stored as MS Windows code page identifier. + * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" + */ + protected function readCodepage(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case pack('C', 0x07): - // print titles (repeating rows) - // Assuming BIFF8, there are 3 cases - // 1. repeating rows - // formula looks like this: Sheet!$A$1:$IV$2 - // rows 1-2 repeat - // 2. repeating columns - // formula looks like this: Sheet!$A$1:$B$65536 - // columns A-B repeat - // 3. both repeating rows and repeating columns - // formula looks like this: Sheet!$A$1:$B$65536,Sheet!$A$1:$IV$2 - $ranges = explode(',', $definedName['formula']); // FIXME: what if sheetname contains comma? 
- foreach ($ranges as $range) { - // $range should look like this one of these - // Sheet!$A$1:$B$65536 - // Sheet!$A$1:$IV$2 - if (str_contains($range, '!')) { - $explodes = Worksheet::extractSheetTitle($range, true); - if ($docSheet = $this->spreadsheet->getSheetByName($explodes[0])) { - $extractedRange = $explodes[1]; - $extractedRange = str_replace('$', '', $extractedRange); - - $coordinateStrings = explode(':', $extractedRange); - if (count($coordinateStrings) == 2) { - [$firstColumn, $firstRow] = Coordinate::coordinateFromString($coordinateStrings[0]); - [$lastColumn, $lastRow] = Coordinate::coordinateFromString($coordinateStrings[1]); - - if ($firstColumn == 'A' && $lastColumn == 'IV') { - // then we have repeating rows - $docSheet->getPageSetup()->setRowsToRepeatAtTop([$firstRow, $lastRow]); - } elseif ($firstRow == 1 && $lastRow == 65536) { - // then we have repeating columns - $docSheet->getPageSetup()->setColumnsToRepeatAtLeft([$firstColumn, $lastColumn]); - } - } - } - } - } + // move stream pointer to next record + $this->pos += 4 + $length; - break; - } - } else { - // Extract range - /** @var non-empty-string $formula */ - $formula = $definedName['formula']; - if (str_contains($formula, '!')) { - $explodes = Worksheet::extractSheetTitle($formula, true); - if ( - ($docSheet = $this->spreadsheet->getSheetByName($explodes[0])) - || ($docSheet = $this->spreadsheet->getSheetByName(trim($explodes[0], "'"))) - ) { - $extractedRange = $explodes[1]; - - $localOnly = ($definedName['scope'] === 0) ? false : true; - - $scope = ($definedName['scope'] === 0) ? 
null : $this->spreadsheet->getSheetByName($this->sheets[$definedName['scope'] - 1]['name']); - - $this->spreadsheet->addNamedRange(new NamedRange((string) $definedName['name'], $docSheet, $extractedRange, $localOnly, $scope)); - } - } - // Named Value - // TODO Provide support for named values - } - } - $this->data = ''; + // offset: 0; size: 2; code page identifier + $codepage = self::getUInt2d($recordData, 0); - return $this->spreadsheet; + $this->codepage = CodePage::numberToName($codepage); } /** - * Read record data from stream, decrypting as required. + * DATEMODE. * - * @param string $data Data stream to read from - * @param int $pos Position to start reading from - * @param int $len Record data length + * This record specifies the base date for displaying date + * values. All dates are stored as count of days past this + * base date. In BIFF2-BIFF4 this record is part of the + * Calculation Settings Block. In BIFF5-BIFF8 it is + * stored in the Workbook Globals Substream. * - * @return string Record data + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readRecordData(string $data, int $pos, int $len): string + protected function readDateMode(): void { - $data = substr($data, $pos, $len); + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // File not encrypted, or record before encryption start point - if ($this->encryption == self::MS_BIFF_CRYPTO_NONE || $pos < $this->encryptionStartPos) { - return $data; + // move stream pointer to next record + $this->pos += 4 + $length; + + // offset: 0; size: 2; 0 = base 1900, 1 = base 1904 + Date::setExcelCalendar(Date::CALENDAR_WINDOWS_1900); + $this->spreadsheet->setExcelCalendar(Date::CALENDAR_WINDOWS_1900); + if (ord($recordData[0]) == 1) { + Date::setExcelCalendar(Date::CALENDAR_MAC_1904); + $this->spreadsheet->setExcelCalendar(Date::CALENDAR_MAC_1904); } + } - $recordData = ''; 
- if ($this->encryption == self::MS_BIFF_CRYPTO_RC4) { - $oldBlock = floor($this->rc4Pos / self::REKEY_BLOCK); - $block = (int) floor($pos / self::REKEY_BLOCK); - $endBlock = (int) floor(($pos + $len) / self::REKEY_BLOCK); + /** + * Read a FONT record. + */ + protected function readFont(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // Spin an RC4 decryptor to the right spot. If we have a decryptor sitting - // at a point earlier in the current block, re-use it as we can save some time. - if ($block != $oldBlock || $pos < $this->rc4Pos || !$this->rc4Key) { - $this->rc4Key = $this->makeKey($block, $this->md5Ctxt); - $step = $pos % self::REKEY_BLOCK; - } else { - $step = $pos - $this->rc4Pos; + // move stream pointer to next record + $this->pos += 4 + $length; + + if (!$this->readDataOnly) { + $objFont = new Font(); + + // offset: 0; size: 2; height of the font (in twips = 1/20 of a point) + $size = self::getUInt2d($recordData, 0); + $objFont->setSize($size / 20); + + // offset: 2; size: 2; option flags + // bit: 0; mask 0x0001; bold (redundant in BIFF5-BIFF8) + // bit: 1; mask 0x0002; italic + $isItalic = (0x0002 & self::getUInt2d($recordData, 2)) >> 1; + if ($isItalic) { + $objFont->setItalic(true); } - $this->rc4Key->RC4(str_repeat("\0", $step)); - // Decrypt record data (re-keying at the end of every block) - while ($block != $endBlock) { - $step = self::REKEY_BLOCK - ($pos % self::REKEY_BLOCK); - $recordData .= $this->rc4Key->RC4(substr($data, 0, $step)); - $data = substr($data, $step); - $pos += $step; - $len -= $step; - ++$block; - $this->rc4Key = $this->makeKey($block, $this->md5Ctxt); + // bit: 2; mask 0x0004; underlined (redundant in BIFF5-BIFF8) + // bit: 3; mask 0x0008; strikethrough + $isStrike = (0x0008 & self::getUInt2d($recordData, 2)) >> 3; + if ($isStrike) { + $objFont->setStrikethrough(true); } - $recordData .= $this->rc4Key->RC4(substr($data, 0, 
$len)); - // Keep track of the position of this decryptor. - // We'll try and re-use it later if we can to speed things up - $this->rc4Pos = $pos + $len; - } elseif ($this->encryption == self::MS_BIFF_CRYPTO_XOR) { - throw new Exception('XOr encryption not supported'); - } + // offset: 4; size: 2; colour index + $colorIndex = self::getUInt2d($recordData, 4); + $objFont->colorIndex = $colorIndex; - return $recordData; + // offset: 6; size: 2; font weight + $weight = self::getUInt2d($recordData, 6); // regular=400 bold=700 + if ($weight >= 550) { + $objFont->setBold(true); + } + + // offset: 8; size: 2; escapement type + $escapement = self::getUInt2d($recordData, 8); + CellFont::escapement($objFont, $escapement); + + // offset: 10; size: 1; underline type + $underlineType = ord($recordData[10]); + CellFont::underline($objFont, $underlineType); + + // offset: 11; size: 1; font family + // offset: 12; size: 1; character set + // offset: 13; size: 1; not used + // offset: 14; size: var; font name + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringShort(substr($recordData, 14)); + } else { + $string = $this->readByteStringShort(substr($recordData, 14)); + } + $objFont->setName($string['value']); + + $this->objFonts[] = $objFont; + } } /** - * Use OLE reader to extract the relevant data streams from the OLE file. + * FORMAT. + * + * This record contains information about a number format. + * All FORMAT records occur together in a sequential list. + * + * In BIFF2-BIFF4 other records referencing a FORMAT record + * contain a zero-based index into this list. From BIFF5 on + * the FORMAT record contains the index itself that will be + * used by other records. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function loadOLE(string $filename): void + protected function readFormat(): void { - // OLE reader - $ole = new OLERead(); - // get excel data, - $ole->read($filename); - // Get workbook data: workbook stream + sheet streams - $this->data = $ole->getStream($ole->wrkbook); // @phpstan-ignore-line - // Get summary information data - $this->summaryInformation = $ole->getStream($ole->summaryInformation); - // Get additional document summary information data - $this->documentSummaryInformation = $ole->getStream($ole->documentSummaryInformation); + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + + // move stream pointer to next record + $this->pos += 4 + $length; + + if (!$this->readDataOnly) { + $indexCode = self::getUInt2d($recordData, 0); + + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringLong(substr($recordData, 2)); + } else { + // BIFF7 + $string = $this->readByteStringShort(substr($recordData, 2)); + } + + $formatString = $string['value']; + // Apache Open Office sets wrong case writing to xls - issue 2239 + if ($formatString === 'GENERAL') { + $formatString = NumberFormat::FORMAT_GENERAL; + } + $this->formats[$indexCode] = $formatString; + } } /** - * Read summary information. + * XF - Extended Format. + * + * This record contains formatting information for cells, rows, columns or styles. + * According to https://support.microsoft.com/en-us/help/147732 there are always at least 15 cell style XF + * and 1 cell XF. + * Inspection of Excel files generated by MS Office Excel shows that XF records 0-14 are cell style XF + * and XF record 15 is a cell XF + * We only read the first cell style XF and skip the remaining cell style XF records + * We read all cell XF records. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readSummaryInformation(): void + protected function readXf(): void { - if (!isset($this->summaryInformation)) { - return; - } + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 0; size: 2; must be 0xFE 0xFF (UTF-16 LE byte order mark) - // offset: 2; size: 2; - // offset: 4; size: 2; OS version - // offset: 6; size: 2; OS indicator - // offset: 8; size: 16 - // offset: 24; size: 4; section count - //$secCount = self::getInt4d($this->summaryInformation, 24); + // move stream pointer to next record + $this->pos += 4 + $length; - // offset: 28; size: 16; first section's class id: e0 85 9f f2 f9 4f 68 10 ab 91 08 00 2b 27 b3 d9 - // offset: 44; size: 4 - $secOffset = self::getInt4d($this->summaryInformation, 44); + $objStyle = new Style(); - // section header - // offset: $secOffset; size: 4; section length - //$secLength = self::getInt4d($this->summaryInformation, $secOffset); + if (!$this->readDataOnly) { + // offset: 0; size: 2; Index to FONT record + if (self::getUInt2d($recordData, 0) < 4) { + $fontIndex = self::getUInt2d($recordData, 0); + } else { + // this has to do with that index 4 is omitted in all BIFF versions for some strange reason + // check the OpenOffice documentation of the FONT record + $fontIndex = self::getUInt2d($recordData, 0) - 1; + } + if (isset($this->objFonts[$fontIndex])) { + $objStyle->setFont($this->objFonts[$fontIndex]); + } - // offset: $secOffset+4; size: 4; property count - $countProperties = self::getInt4d($this->summaryInformation, $secOffset + 4); + // offset: 2; size: 2; Index to FORMAT record + $numberFormatIndex = self::getUInt2d($recordData, 2); + if (isset($this->formats[$numberFormatIndex])) { + // then we have user-defined format code + $numberFormat = ['formatCode' => $this->formats[$numberFormatIndex]]; + } elseif (($code = 
NumberFormat::builtInFormatCode($numberFormatIndex)) !== '') { + // then we have built-in format code + $numberFormat = ['formatCode' => $code]; + } else { + // we set the general format code + $numberFormat = ['formatCode' => NumberFormat::FORMAT_GENERAL]; + } + $objStyle->getNumberFormat()->setFormatCode($numberFormat['formatCode']); - // initialize code page (used to resolve string values) - $codePage = 'CP1252'; + // offset: 4; size: 2; XF type, cell protection, and parent style XF + // bit 2-0; mask 0x0007; XF_TYPE_PROT + $xfTypeProt = self::getUInt2d($recordData, 4); + // bit 0; mask 0x01; 1 = cell is locked + $isLocked = (0x01 & $xfTypeProt) >> 0; + $objStyle->getProtection()->setLocked($isLocked ? Protection::PROTECTION_INHERIT : Protection::PROTECTION_UNPROTECTED); - // offset: ($secOffset+8); size: var - // loop through property decarations and properties - for ($i = 0; $i < $countProperties; ++$i) { - // offset: ($secOffset+8) + (8 * $i); size: 4; property ID - $id = self::getInt4d($this->summaryInformation, ($secOffset + 8) + (8 * $i)); + // bit 1; mask 0x02; 1 = Formula is hidden + $isHidden = (0x02 & $xfTypeProt) >> 1; + $objStyle->getProtection()->setHidden($isHidden ? 
Protection::PROTECTION_PROTECTED : Protection::PROTECTION_UNPROTECTED); - // Use value of property id as appropriate - // offset: ($secOffset+12) + (8 * $i); size: 4; offset from beginning of section (48) - $offset = self::getInt4d($this->summaryInformation, ($secOffset + 12) + (8 * $i)); + // bit 2; mask 0x04; 0 = Cell XF, 1 = Cell Style XF + $isCellStyleXf = (0x04 & $xfTypeProt) >> 2; - $type = self::getInt4d($this->summaryInformation, $secOffset + $offset); + // offset: 6; size: 1; Alignment and text break + // bit 2-0, mask 0x07; horizontal alignment + $horAlign = (0x07 & ord($recordData[6])) >> 0; + Xls\Style\CellAlignment::horizontal($objStyle->getAlignment(), $horAlign); - // initialize property value - $value = null; + // bit 3, mask 0x08; wrap text + $wrapText = (0x08 & ord($recordData[6])) >> 3; + Xls\Style\CellAlignment::wrap($objStyle->getAlignment(), $wrapText); - // extract property value based on property type - switch ($type) { - case 0x02: // 2 byte signed integer - $value = self::getUInt2d($this->summaryInformation, $secOffset + 4 + $offset); + // bit 6-4, mask 0x70; vertical alignment + $vertAlign = (0x70 & ord($recordData[6])) >> 4; + Xls\Style\CellAlignment::vertical($objStyle->getAlignment(), $vertAlign); - break; - case 0x03: // 4 byte signed integer - $value = self::getInt4d($this->summaryInformation, $secOffset + 4 + $offset); + if ($this->version == self::XLS_BIFF8) { + // offset: 7; size: 1; XF_ROTATION: Text rotation angle + $angle = ord($recordData[7]); + $rotation = 0; + if ($angle <= 90) { + $rotation = $angle; + } elseif ($angle <= 180) { + $rotation = 90 - $angle; + } elseif ($angle == Alignment::TEXTROTATION_STACK_EXCEL) { + $rotation = Alignment::TEXTROTATION_STACK_PHPSPREADSHEET; + } + $objStyle->getAlignment()->setTextRotation($rotation); - break; - case 0x13: // 4 byte unsigned integer - // not needed yet, fix later if necessary - break; - case 0x1E: // null-terminated string prepended by dword string length - $byteLength = 
self::getInt4d($this->summaryInformation, $secOffset + 4 + $offset); - $value = substr($this->summaryInformation, $secOffset + 8 + $offset, $byteLength); - $value = StringHelper::convertEncoding($value, 'UTF-8', $codePage); - $value = rtrim($value); + // offset: 8; size: 1; Indentation, shrink to cell size, and text direction + // bit: 3-0; mask: 0x0F; indent level + $indent = (0x0F & ord($recordData[8])) >> 0; + $objStyle->getAlignment()->setIndent($indent); - break; - case 0x40: // Filetime (64-bit value representing the number of 100-nanosecond intervals since January 1, 1601) - // PHP-time - $value = OLE::OLE2LocalDate(substr($this->summaryInformation, $secOffset + 4 + $offset, 8)); + // bit: 4; mask: 0x10; 1 = shrink content to fit into cell + $shrinkToFit = (0x10 & ord($recordData[8])) >> 4; + switch ($shrinkToFit) { + case 0: + $objStyle->getAlignment()->setShrinkToFit(false); - break; - case 0x47: // Clipboard format - // not needed yet, fix later if necessary - break; - } + break; + case 1: + $objStyle->getAlignment()->setShrinkToFit(true); - switch ($id) { - case 0x01: // Code Page - $codePage = CodePage::numberToName((int) $value); + break; + } - break; - case 0x02: // Title - $this->spreadsheet->getProperties()->setTitle("$value"); + // offset: 9; size: 1; Flags used for attribute groups - break; - case 0x03: // Subject - $this->spreadsheet->getProperties()->setSubject("$value"); + // offset: 10; size: 4; Cell border lines and background area + // bit: 3-0; mask: 0x0000000F; left style + if ($bordersLeftStyle = Xls\Style\Border::lookup((0x0000000F & self::getInt4d($recordData, 10)) >> 0)) { + $objStyle->getBorders()->getLeft()->setBorderStyle($bordersLeftStyle); + } + // bit: 7-4; mask: 0x000000F0; right style + if ($bordersRightStyle = Xls\Style\Border::lookup((0x000000F0 & self::getInt4d($recordData, 10)) >> 4)) { + $objStyle->getBorders()->getRight()->setBorderStyle($bordersRightStyle); + } + // bit: 11-8; mask: 0x00000F00; top style + if 
($bordersTopStyle = Xls\Style\Border::lookup((0x00000F00 & self::getInt4d($recordData, 10)) >> 8)) { + $objStyle->getBorders()->getTop()->setBorderStyle($bordersTopStyle); + } + // bit: 15-12; mask: 0x0000F000; bottom style + if ($bordersBottomStyle = Xls\Style\Border::lookup((0x0000F000 & self::getInt4d($recordData, 10)) >> 12)) { + $objStyle->getBorders()->getBottom()->setBorderStyle($bordersBottomStyle); + } + // bit: 22-16; mask: 0x007F0000; left color + $objStyle->getBorders()->getLeft()->colorIndex = (0x007F0000 & self::getInt4d($recordData, 10)) >> 16; - break; - case 0x04: // Author (Creator) - $this->spreadsheet->getProperties()->setCreator("$value"); + // bit: 29-23; mask: 0x3F800000; right color + $objStyle->getBorders()->getRight()->colorIndex = (0x3F800000 & self::getInt4d($recordData, 10)) >> 23; - break; - case 0x05: // Keywords - $this->spreadsheet->getProperties()->setKeywords("$value"); + // bit: 30; mask: 0x40000000; 1 = diagonal line from top left to right bottom + $diagonalDown = (0x40000000 & self::getInt4d($recordData, 10)) >> 30 ? true : false; - break; - case 0x06: // Comments (Description) - $this->spreadsheet->getProperties()->setDescription("$value"); + // bit: 31; mask: 0x800000; 1 = diagonal line from bottom left to top right + $diagonalUp = (self::HIGH_ORDER_BIT & self::getInt4d($recordData, 10)) >> 31 ? 
true : false; - break; - case 0x07: // Template - // Not supported by PhpSpreadsheet - break; - case 0x08: // Last Saved By (LastModifiedBy) - $this->spreadsheet->getProperties()->setLastModifiedBy("$value"); + if ($diagonalUp === false) { + if ($diagonalDown === false) { + $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_NONE); + } else { + $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_DOWN); + } + } elseif ($diagonalDown === false) { + $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_UP); + } else { + $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_BOTH); + } - break; - case 0x09: // Revision - // Not supported by PhpSpreadsheet - break; - case 0x0A: // Total Editing Time - // Not supported by PhpSpreadsheet - break; - case 0x0B: // Last Printed - // Not supported by PhpSpreadsheet - break; - case 0x0C: // Created Date/Time - $this->spreadsheet->getProperties()->setCreated($value); + // offset: 14; size: 4; + // bit: 6-0; mask: 0x0000007F; top color + $objStyle->getBorders()->getTop()->colorIndex = (0x0000007F & self::getInt4d($recordData, 14)) >> 0; - break; - case 0x0D: // Modified Date/Time - $this->spreadsheet->getProperties()->setModified($value); + // bit: 13-7; mask: 0x00003F80; bottom color + $objStyle->getBorders()->getBottom()->colorIndex = (0x00003F80 & self::getInt4d($recordData, 14)) >> 7; - break; - case 0x0E: // Number of Pages - // Not supported by PhpSpreadsheet - break; - case 0x0F: // Number of Words - // Not supported by PhpSpreadsheet - break; - case 0x10: // Number of Characters - // Not supported by PhpSpreadsheet - break; - case 0x11: // Thumbnail - // Not supported by PhpSpreadsheet - break; - case 0x12: // Name of creating application - // Not supported by PhpSpreadsheet - break; - case 0x13: // Security - // Not supported by PhpSpreadsheet - break; - } - } - } + // bit: 20-14; mask: 0x001FC000; diagonal color + $objStyle->getBorders()->getDiagonal()->colorIndex = 
(0x001FC000 & self::getInt4d($recordData, 14)) >> 14; - /** - * Read additional document summary information. - */ - private function readDocumentSummaryInformation(): void - { - if (!isset($this->documentSummaryInformation)) { - return; - } + // bit: 24-21; mask: 0x01E00000; diagonal style + if ($bordersDiagonalStyle = Xls\Style\Border::lookup((0x01E00000 & self::getInt4d($recordData, 14)) >> 21)) { + $objStyle->getBorders()->getDiagonal()->setBorderStyle($bordersDiagonalStyle); + } - // offset: 0; size: 2; must be 0xFE 0xFF (UTF-16 LE byte order mark) - // offset: 2; size: 2; - // offset: 4; size: 2; OS version - // offset: 6; size: 2; OS indicator - // offset: 8; size: 16 - // offset: 24; size: 4; section count - //$secCount = self::getInt4d($this->documentSummaryInformation, 24); + // bit: 31-26; mask: 0xFC000000 fill pattern + if ($fillType = FillPattern::lookup((self::FC000000 & self::getInt4d($recordData, 14)) >> 26)) { + $objStyle->getFill()->setFillType($fillType); + } + // offset: 18; size: 2; pattern and background colour + // bit: 6-0; mask: 0x007F; color index for pattern color + $objStyle->getFill()->startcolorIndex = (0x007F & self::getUInt2d($recordData, 18)) >> 0; - // offset: 28; size: 16; first section's class id: 02 d5 cd d5 9c 2e 1b 10 93 97 08 00 2b 2c f9 ae - // offset: 44; size: 4; first section offset - $secOffset = self::getInt4d($this->documentSummaryInformation, 44); + // bit: 13-7; mask: 0x3F80; color index for pattern background + $objStyle->getFill()->endcolorIndex = (0x3F80 & self::getUInt2d($recordData, 18)) >> 7; + } else { + // BIFF5 - // section header - // offset: $secOffset; size: 4; section length - //$secLength = self::getInt4d($this->documentSummaryInformation, $secOffset); + // offset: 7; size: 1; Text orientation and flags + $orientationAndFlags = ord($recordData[7]); - // offset: $secOffset+4; size: 4; property count - $countProperties = self::getInt4d($this->documentSummaryInformation, $secOffset + 4); + // bit: 1-0; 
mask: 0x03; XF_ORIENTATION: Text orientation + $xfOrientation = (0x03 & $orientationAndFlags) >> 0; + switch ($xfOrientation) { + case 0: + $objStyle->getAlignment()->setTextRotation(0); - // initialize code page (used to resolve string values) - $codePage = 'CP1252'; + break; + case 1: + $objStyle->getAlignment()->setTextRotation(Alignment::TEXTROTATION_STACK_PHPSPREADSHEET); - // offset: ($secOffset+8); size: var - // loop through property decarations and properties - for ($i = 0; $i < $countProperties; ++$i) { - // offset: ($secOffset+8) + (8 * $i); size: 4; property ID - $id = self::getInt4d($this->documentSummaryInformation, ($secOffset + 8) + (8 * $i)); + break; + case 2: + $objStyle->getAlignment()->setTextRotation(90); - // Use value of property id as appropriate - // offset: 60 + 8 * $i; size: 4; offset from beginning of section (48) - $offset = self::getInt4d($this->documentSummaryInformation, ($secOffset + 12) + (8 * $i)); + break; + case 3: + $objStyle->getAlignment()->setTextRotation(-90); - $type = self::getInt4d($this->documentSummaryInformation, $secOffset + $offset); + break; + } - // initialize property value - $value = null; + // offset: 8; size: 4; cell border lines and background area + $borderAndBackground = self::getInt4d($recordData, 8); - // extract property value based on property type - switch ($type) { - case 0x02: // 2 byte signed integer - $value = self::getUInt2d($this->documentSummaryInformation, $secOffset + 4 + $offset); + // bit: 6-0; mask: 0x0000007F; color index for pattern color + $objStyle->getFill()->startcolorIndex = (0x0000007F & $borderAndBackground) >> 0; - break; - case 0x03: // 4 byte signed integer - $value = self::getInt4d($this->documentSummaryInformation, $secOffset + 4 + $offset); + // bit: 13-7; mask: 0x00003F80; color index for pattern background + $objStyle->getFill()->endcolorIndex = (0x00003F80 & $borderAndBackground) >> 7; - break; - case 0x0B: // Boolean - $value = 
self::getUInt2d($this->documentSummaryInformation, $secOffset + 4 + $offset); - $value = ($value == 0 ? false : true); + // bit: 21-16; mask: 0x003F0000; fill pattern + $objStyle->getFill()->setFillType(FillPattern::lookup((0x003F0000 & $borderAndBackground) >> 16)); - break; - case 0x13: // 4 byte unsigned integer - // not needed yet, fix later if necessary - break; - case 0x1E: // null-terminated string prepended by dword string length - $byteLength = self::getInt4d($this->documentSummaryInformation, $secOffset + 4 + $offset); - $value = substr($this->documentSummaryInformation, $secOffset + 8 + $offset, $byteLength); - $value = StringHelper::convertEncoding($value, 'UTF-8', $codePage); - $value = rtrim($value); + // bit: 24-22; mask: 0x01C00000; bottom line style + $objStyle->getBorders()->getBottom()->setBorderStyle(Xls\Style\Border::lookup((0x01C00000 & $borderAndBackground) >> 22)); - break; - case 0x40: // Filetime (64-bit value representing the number of 100-nanosecond intervals since January 1, 1601) - // PHP-Time - $value = OLE::OLE2LocalDate(substr($this->documentSummaryInformation, $secOffset + 4 + $offset, 8)); + // bit: 31-25; mask: 0xFE000000; bottom line color + $objStyle->getBorders()->getBottom()->colorIndex = (self::FE000000 & $borderAndBackground) >> 25; - break; - case 0x47: // Clipboard format - // not needed yet, fix later if necessary - break; - } + // offset: 12; size: 4; cell border lines + $borderLines = self::getInt4d($recordData, 12); - switch ($id) { - case 0x01: // Code Page - $codePage = CodePage::numberToName((int) $value); + // bit: 2-0; mask: 0x00000007; top line style + $objStyle->getBorders()->getTop()->setBorderStyle(Xls\Style\Border::lookup((0x00000007 & $borderLines) >> 0)); - break; - case 0x02: // Category - $this->spreadsheet->getProperties()->setCategory("$value"); + // bit: 5-3; mask: 0x00000038; left line style + $objStyle->getBorders()->getLeft()->setBorderStyle(Xls\Style\Border::lookup((0x00000038 & $borderLines) >> 
3)); - break; - case 0x03: // Presentation Target - // Not supported by PhpSpreadsheet - break; - case 0x04: // Bytes - // Not supported by PhpSpreadsheet - break; - case 0x05: // Lines - // Not supported by PhpSpreadsheet - break; - case 0x06: // Paragraphs - // Not supported by PhpSpreadsheet - break; - case 0x07: // Slides - // Not supported by PhpSpreadsheet - break; - case 0x08: // Notes - // Not supported by PhpSpreadsheet - break; - case 0x09: // Hidden Slides - // Not supported by PhpSpreadsheet - break; - case 0x0A: // MM Clips - // Not supported by PhpSpreadsheet - break; - case 0x0B: // Scale Crop - // Not supported by PhpSpreadsheet - break; - case 0x0C: // Heading Pairs - // Not supported by PhpSpreadsheet - break; - case 0x0D: // Titles of Parts - // Not supported by PhpSpreadsheet - break; - case 0x0E: // Manager - $this->spreadsheet->getProperties()->setManager("$value"); - - break; - case 0x0F: // Company - $this->spreadsheet->getProperties()->setCompany("$value"); - - break; - case 0x10: // Links up-to-date - // Not supported by PhpSpreadsheet - break; - } - } - } - - /** - * Reads a general type of BIFF record. Does nothing except for moving stream pointer forward to next record. - */ - private function readDefault(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - - // move stream pointer to next record - $this->pos += 4 + $length; - } - - /** - * The NOTE record specifies a comment associated with a particular cell. In Excel 95 (BIFF7) and earlier versions, - * this record stores a note (cell note). This feature was significantly enhanced in Excel 97. 
- */ - private function readNote(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // bit: 8-6; mask: 0x000001C0; right line style + $objStyle->getBorders()->getRight()->setBorderStyle(Xls\Style\Border::lookup((0x000001C0 & $borderLines) >> 6)); - // move stream pointer to next record - $this->pos += 4 + $length; + // bit: 15-9; mask: 0x0000FE00; top line color index + $objStyle->getBorders()->getTop()->colorIndex = (0x0000FE00 & $borderLines) >> 9; - if ($this->readDataOnly) { - return; - } + // bit: 22-16; mask: 0x007F0000; left line color index + $objStyle->getBorders()->getLeft()->colorIndex = (0x007F0000 & $borderLines) >> 16; - $cellAddress = $this->readBIFF8CellAddress(substr($recordData, 0, 4)); - if ($this->version == self::XLS_BIFF8) { - $noteObjID = self::getUInt2d($recordData, 6); - $noteAuthor = self::readUnicodeStringLong(substr($recordData, 8)); - $noteAuthor = $noteAuthor['value']; - $this->cellNotes[$noteObjID] = [ - 'cellRef' => $cellAddress, - 'objectID' => $noteObjID, - 'author' => $noteAuthor, - ]; - } else { - $extension = false; - if ($cellAddress == '$B$65536') { - // If the address row is -1 and the column is 0, (which translates as $B$65536) then this is a continuation - // note from the previous cell annotation. We're not yet handling this, so annotations longer than the - // max 2048 bytes will probably throw a wobbly. 
- //$row = self::getUInt2d($recordData, 0); - $extension = true; - $arrayKeys = array_keys($this->phpSheet->getComments()); - $cellAddress = array_pop($arrayKeys); + // bit: 29-23; mask: 0x3F800000; right line color index + $objStyle->getBorders()->getRight()->colorIndex = (0x3F800000 & $borderLines) >> 23; } - $cellAddress = str_replace('$', '', (string) $cellAddress); - //$noteLength = self::getUInt2d($recordData, 4); - $noteText = trim(substr($recordData, 6)); - - if ($extension) { - // Concatenate this extension with the currently set comment for the cell - $comment = $this->phpSheet->getComment($cellAddress); - $commentText = $comment->getText()->getPlainText(); - $comment->setText($this->parseRichText($commentText . $noteText)); + // add cellStyleXf or cellXf and update mapping + if ($isCellStyleXf) { + // we only read one style XF record which is always the first + if ($this->xfIndex == 0) { + $this->spreadsheet->addCellStyleXf($objStyle); + $this->mapCellStyleXfIndex[$this->xfIndex] = 0; + } } else { - // Set comment for the cell - $this->phpSheet->getComment($cellAddress)->setText($this->parseRichText($noteText)); -// ->setAuthor($author) + // we read all cell XF records + $this->spreadsheet->addCellXf($objStyle); + $this->mapCellXfIndex[$this->xfIndex] = count($this->spreadsheet->getCellXfCollection()) - 1; } + + // update XF index for when we read next record + ++$this->xfIndex; } } - /** - * The TEXT Object record contains the text associated with a cell annotation. 
- */ - private function readTextObject(): void + protected function readXfExt(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -1651,245 +1359,175 @@ private function readTextObject(): void // move stream pointer to next record $this->pos += 4 + $length; - if ($this->readDataOnly) { - return; - } - - // recordData consists of an array of subrecords looking like this: - // grbit: 2 bytes; Option Flags - // rot: 2 bytes; rotation - // cchText: 2 bytes; length of the text (in the first continue record) - // cbRuns: 2 bytes; length of the formatting (in the second continue record) - // followed by the continuation records containing the actual text and formatting - $grbitOpts = self::getUInt2d($recordData, 0); - $rot = self::getUInt2d($recordData, 2); - //$cchText = self::getUInt2d($recordData, 10); - $cbRuns = self::getUInt2d($recordData, 12); - $text = $this->getSplicedRecordData(); + if (!$this->readDataOnly) { + // offset: 0; size: 2; 0x087D = repeated header - $textByte = $text['spliceOffsets'][1] - $text['spliceOffsets'][0] - 1; - $textStr = substr($text['recordData'], $text['spliceOffsets'][0] + 1, $textByte); - // get 1 byte - $is16Bit = ord($text['recordData'][0]); - // it is possible to use a compressed format, - // which omits the high bytes of all characters, if they are all zero - if (($is16Bit & 0x01) === 0) { - $textStr = StringHelper::ConvertEncoding($textStr, 'UTF-8', 'ISO-8859-1'); - } else { - $textStr = $this->decodeCodepage($textStr); - } + // offset: 2; size: 2 - $this->textObjects[$this->textObjRef] = [ - 'text' => $textStr, - 'format' => substr($text['recordData'], $text['spliceOffsets'][1], $cbRuns), - 'alignment' => $grbitOpts, - 'rotation' => $rot, - ]; - } + // offset: 4; size: 8; not used - /** - * Read BOF. 
- */ - private function readBof(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = substr($this->data, $this->pos + 4, $length); + // offset: 12; size: 2; record version - // move stream pointer to next record - $this->pos += 4 + $length; + // offset: 14; size: 2; index to XF record which this record modifies + $ixfe = self::getUInt2d($recordData, 14); - // offset: 2; size: 2; type of the following data - $substreamType = self::getUInt2d($recordData, 2); + // offset: 16; size: 2; not used - switch ($substreamType) { - case self::XLS_WORKBOOKGLOBALS: - $version = self::getUInt2d($recordData, 0); - if (($version != self::XLS_BIFF8) && ($version != self::XLS_BIFF7)) { - throw new Exception('Cannot read this Excel file. Version is too old.'); - } - $this->version = $version; + // offset: 18; size: 2; number of extension properties that follow + //$cexts = self::getUInt2d($recordData, 18); - break; - case self::XLS_WORKSHEET: - // do not use this version information for anything - // it is unreliable (OpenOffice doc, 5.8), use only version information from the global stream - break; - default: - // substream, e.g. chart - // just skip the entire substream - do { - $code = self::getUInt2d($this->data, $this->pos); - $this->readDefault(); - } while ($code != self::XLS_TYPE_EOF && $this->pos < $this->dataSize); + // start reading the actual extension data + $offset = 20; + while ($offset < $length) { + // extension type + $extType = self::getUInt2d($recordData, $offset); - break; - } - } + // extension length + $cb = self::getUInt2d($recordData, $offset + 2); - /** - * FILEPASS. - * - * This record is part of the File Protection Block. It - * contains information about the read/write password of the - * file. All record contents following this record will be - * encrypted. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" - * - * The decryption functions and objects used from here on in - * are based on the source of Spreadsheet-ParseExcel: - * https://metacpan.org/release/Spreadsheet-ParseExcel - */ - private function readFilepass(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); + // extension data + $extData = substr($recordData, $offset + 4, $cb); - if ($length != 54) { - throw new Exception('Unexpected file pass record length'); - } + switch ($extType) { + case 4: // fill start color + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - // move stream pointer to next record - $this->pos += 4 + $length; + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $fill = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFill(); + $fill->getStartColor()->setRGB($rgb); + $fill->startcolorIndex = null; // normal color index does not apply, discard + } + } - if (!$this->verifyPassword('VelvetSweatshop', substr($recordData, 6, 16), substr($recordData, 22, 16), substr($recordData, 38, 16), $this->md5Ctxt)) { - throw new Exception('Decryption password incorrect'); - } + break; + case 5: // fill end color + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $this->encryption = self::MS_BIFF_CRYPTO_RC4; + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - // Decryption required from the record after next onwards - $this->encryptionStartPos = $this->pos + self::getUInt2d($this->data, $this->pos + 2); - } + // 
modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $fill = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFill(); + $fill->getEndColor()->setRGB($rgb); + $fill->endcolorIndex = null; // normal color index does not apply, discard + } + } - /** - * Make an RC4 decryptor for the given block. - * - * @param int $block Block for which to create decrypto - * @param string $valContext MD5 context state - */ - private function makeKey(int $block, string $valContext): Xls\RC4 - { - $pwarray = str_repeat("\0", 64); - - for ($i = 0; $i < 5; ++$i) { - $pwarray[$i] = $valContext[$i]; - } - - $pwarray[5] = chr($block & 0xFF); - $pwarray[6] = chr(($block >> 8) & 0xFF); - $pwarray[7] = chr(($block >> 16) & 0xFF); - $pwarray[8] = chr(($block >> 24) & 0xFF); - - $pwarray[9] = "\x80"; - $pwarray[56] = "\x48"; - - $md5 = new Xls\MD5(); - $md5->add($pwarray); + break; + case 7: // border color top + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $s = $md5->getContext(); + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - return new Xls\RC4($s); - } + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $top = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getTop(); + $top->getColor()->setRGB($rgb); + $top->colorIndex = null; // normal color index does not apply, discard + } + } - /** - * Verify RC4 file password. 
- * - * @param string $password Password to check - * @param string $docid Document id - * @param string $salt_data Salt data - * @param string $hashedsalt_data Hashed salt data - * @param string $valContext Set to the MD5 context of the value - * - * @return bool Success - */ - private function verifyPassword(string $password, string $docid, string $salt_data, string $hashedsalt_data, string &$valContext): bool - { - $pwarray = str_repeat("\0", 64); + break; + case 8: // border color bottom + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $iMax = strlen($password); - for ($i = 0; $i < $iMax; ++$i) { - $o = ord(substr($password, $i, 1)); - $pwarray[2 * $i] = chr($o & 0xFF); - $pwarray[2 * $i + 1] = chr(($o >> 8) & 0xFF); - } - $pwarray[2 * $i] = chr(0x80); - $pwarray[56] = chr(($i << 4) & 0xFF); + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - $md5 = new Xls\MD5(); - $md5->add($pwarray); + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $bottom = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getBottom(); + $bottom->getColor()->setRGB($rgb); + $bottom->colorIndex = null; // normal color index does not apply, discard + } + } - $mdContext1 = $md5->getContext(); + break; + case 9: // border color left + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $offset = 0; - $keyoffset = 0; - $tocopy = 5; + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - $md5->reset(); + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $left = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getLeft(); + $left->getColor()->setRGB($rgb); 
+ $left->colorIndex = null; // normal color index does not apply, discard + } + } - while ($offset != 16) { - if ((64 - $offset) < 5) { - $tocopy = 64 - $offset; - } - for ($i = 0; $i <= $tocopy; ++$i) { - $pwarray[$offset + $i] = $mdContext1[$keyoffset + $i]; - } - $offset += $tocopy; + break; + case 10: // border color right + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - if ($offset == 64) { - $md5->add($pwarray); - $keyoffset = $tocopy; - $tocopy = 5 - $tocopy; - $offset = 0; + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - continue; - } + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $right = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getRight(); + $right->getColor()->setRGB($rgb); + $right->colorIndex = null; // normal color index does not apply, discard + } + } - $keyoffset = 0; - $tocopy = 5; - for ($i = 0; $i < 16; ++$i) { - $pwarray[$offset + $i] = $docid[$i]; - } - $offset += 16; - } + break; + case 11: // border color diagonal + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $pwarray[16] = "\x80"; - for ($i = 0; $i < 47; ++$i) { - $pwarray[17 + $i] = "\0"; - } - $pwarray[56] = "\x80"; - $pwarray[57] = "\x0a"; + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - $md5->add($pwarray); - $valContext = $md5->getContext(); + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $diagonal = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getDiagonal(); + $diagonal->getColor()->setRGB($rgb); + $diagonal->colorIndex = null; // normal color index does not apply, discard + } + } - $key = $this->makeKey(0, 
$valContext); + break; + case 13: // font color + $xclfType = self::getUInt2d($extData, 0); // color type + $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - $salt = $key->RC4($salt_data); - $hashedsalt = $key->RC4($hashedsalt_data); + if ($xclfType == 2) { + $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - $salt .= "\x80" . str_repeat("\0", 47); - $salt[56] = "\x80"; + // modify the relevant style property + if (isset($this->mapCellXfIndex[$ixfe])) { + $font = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFont(); + $font->getColor()->setRGB($rgb); + $font->colorIndex = null; // normal color index does not apply, discard + } + } - $md5->reset(); - $md5->add($salt); - $mdContext2 = $md5->getContext(); + break; + } - return $mdContext2 == $hashedsalt; + $offset += $cb; + } + } } /** - * CODEPAGE. - * - * This record stores the text encoding used to write byte - * strings, stored as MS Windows code page identifier. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read STYLE record. 
*/ - private function readCodepage(): void + protected function readStyle(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -1897,25 +1535,36 @@ private function readCodepage(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; code page identifier - $codepage = self::getUInt2d($recordData, 0); + if (!$this->readDataOnly) { + // offset: 0; size: 2; index to XF record and flag for built-in style + $ixfe = self::getUInt2d($recordData, 0); - $this->codepage = CodePage::numberToName($codepage); + // bit: 11-0; mask 0x0FFF; index to XF record + //$xfIndex = (0x0FFF & $ixfe) >> 0; + + // bit: 15; mask 0x8000; 0 = user-defined style, 1 = built-in style + $isBuiltIn = (bool) ((0x8000 & $ixfe) >> 15); + + if ($isBuiltIn) { + // offset: 2; size: 1; identifier for built-in style + $builtInId = ord($recordData[2]); + + switch ($builtInId) { + case 0x00: + // currently, we are not using this for anything + break; + default: + break; + } + } + // user-defined; not supported by PhpSpreadsheet + } } /** - * DATEMODE. - * - * This record specifies the base date for displaying date - * values. All dates are stored as count of days past this - * base date. In BIFF2-BIFF4 this record is part of the - * Calculation Settings Block. In BIFF5-BIFF8 it is - * stored in the Workbook Globals Substream. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read PALETTE record. 
*/ - private function readDateMode(): void + protected function readPalette(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -1923,96 +1572,137 @@ private function readDateMode(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; 0 = base 1900, 1 = base 1904 - Date::setExcelCalendar(Date::CALENDAR_WINDOWS_1900); - $this->spreadsheet->setExcelCalendar(Date::CALENDAR_WINDOWS_1900); - if (ord($recordData[0]) == 1) { - Date::setExcelCalendar(Date::CALENDAR_MAC_1904); - $this->spreadsheet->setExcelCalendar(Date::CALENDAR_MAC_1904); + if (!$this->readDataOnly) { + // offset: 0; size: 2; number of following colors + $nm = self::getUInt2d($recordData, 0); + + // list of RGB colors + for ($i = 0; $i < $nm; ++$i) { + $rgb = substr($recordData, 2 + 4 * $i, 4); + $this->palette[] = self::readRGB($rgb); + } } } /** - * Read a FONT record. + * SHEET. + * + * This record is located in the Workbook Globals + * Substream and represents a sheet inside the workbook. + * One SHEET record is written for each sheet. It stores the + * sheet name and a stream offset to the BOF record of the + * respective Sheet Substream within the Workbook Stream. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readFont(): void + protected function readSheet(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // offset: 0; size: 4; absolute stream position of the BOF record of the sheet + // NOTE: not encrypted + $rec_offset = self::getInt4d($this->data, $this->pos + 4); + // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - $objFont = new Font(); - - // offset: 0; size: 2; height of the font (in twips = 1/20 of a point) - $size = self::getUInt2d($recordData, 0); - $objFont->setSize($size / 20); + // offset: 4; size: 1; sheet state + $sheetState = match (ord($recordData[4])) { + 0x00 => Worksheet::SHEETSTATE_VISIBLE, + 0x01 => Worksheet::SHEETSTATE_HIDDEN, + 0x02 => Worksheet::SHEETSTATE_VERYHIDDEN, + default => Worksheet::SHEETSTATE_VISIBLE, + }; - // offset: 2; size: 2; option flags - // bit: 0; mask 0x0001; bold (redundant in BIFF5-BIFF8) - // bit: 1; mask 0x0002; italic - $isItalic = (0x0002 & self::getUInt2d($recordData, 2)) >> 1; - if ($isItalic) { - $objFont->setItalic(true); - } + // offset: 5; size: 1; sheet type + $sheetType = ord($recordData[5]); - // bit: 2; mask 0x0004; underlined (redundant in BIFF5-BIFF8) - // bit: 3; mask 0x0008; strikethrough - $isStrike = (0x0008 & self::getUInt2d($recordData, 2)) >> 3; - if ($isStrike) { - $objFont->setStrikethrough(true); - } + // offset: 6; size: var; sheet name + $rec_name = null; + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringShort(substr($recordData, 6)); + $rec_name = $string['value']; + } elseif ($this->version == self::XLS_BIFF7) { + $string = $this->readByteStringShort(substr($recordData, 6)); + $rec_name = $string['value']; + } - // offset: 4; size: 2; colour index - $colorIndex = self::getUInt2d($recordData, 4); - $objFont->colorIndex = 
$colorIndex; + $this->sheets[] = [ + 'name' => $rec_name, + 'offset' => $rec_offset, + 'sheetState' => $sheetState, + 'sheetType' => $sheetType, + ]; + } - // offset: 6; size: 2; font weight - $weight = self::getUInt2d($recordData, 6); // regular=400 bold=700 - if ($weight >= 550) { - $objFont->setBold(true); - } + /** + * Read EXTERNALBOOK record. + */ + protected function readExternalBook(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 8; size: 2; escapement type - $escapement = self::getUInt2d($recordData, 8); - CellFont::escapement($objFont, $escapement); + // move stream pointer to next record + $this->pos += 4 + $length; - // offset: 10; size: 1; underline type - $underlineType = ord($recordData[10]); - CellFont::underline($objFont, $underlineType); + // offset within record data + $offset = 0; - // offset: 11; size: 1; font family - // offset: 12; size: 1; character set - // offset: 13; size: 1; not used - // offset: 14; size: var; font name - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringShort(substr($recordData, 14)); - } else { - $string = $this->readByteStringShort(substr($recordData, 14)); + // there are 4 types of records + if (strlen($recordData) > 4) { + // external reference + // offset: 0; size: 2; number of sheet names ($nm) + $nm = self::getUInt2d($recordData, 0); + $offset += 2; + + // offset: 2; size: var; encoded URL without sheet name (Unicode string, 16-bit length) + $encodedUrlString = self::readUnicodeStringLong(substr($recordData, 2)); + $offset += $encodedUrlString['size']; + + // offset: var; size: var; list of $nm sheet names (Unicode strings, 16-bit length) + $externalSheetNames = []; + for ($i = 0; $i < $nm; ++$i) { + $externalSheetNameString = self::readUnicodeStringLong(substr($recordData, $offset)); + $externalSheetNames[] = $externalSheetNameString['value']; + $offset += 
$externalSheetNameString['size']; } - $objFont->setName($string['value']); - $this->objFonts[] = $objFont; + // store the record data + $this->externalBooks[] = [ + 'type' => 'external', + 'encodedUrl' => $encodedUrlString['value'], + 'externalSheetNames' => $externalSheetNames, + ]; + } elseif (substr($recordData, 2, 2) == pack('CC', 0x01, 0x04)) { + // internal reference + // offset: 0; size: 2; number of sheets in this document + // offset: 2; size: 2; 0x01 0x04 + $this->externalBooks[] = [ + 'type' => 'internal', + ]; + } elseif (substr($recordData, 0, 4) == pack('vCC', 0x0001, 0x01, 0x3A)) { + // add-in function + // offset: 0; size: 2; 0x0001 + $this->externalBooks[] = [ + 'type' => 'addInFunction', + ]; + } elseif (substr($recordData, 0, 2) == pack('v', 0x0000)) { + // DDE links, OLE links + // offset: 0; size: 2; 0x0000 + // offset: 2; size: var; encoded source document name + $this->externalBooks[] = [ + 'type' => 'DDEorOLE', + ]; } } /** - * FORMAT. - * - * This record contains information about a number format. - * All FORMAT records occur together in a sequential list. - * - * In BIFF2-BIFF4 other records referencing a FORMAT record - * contain a zero-based index into this list. From BIFF5 on - * the FORMAT record contains the index itself that will be - * used by other records. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read EXTERNNAME record.
*/ - private function readFormat(): void + protected function readExternName(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2020,40 +1710,69 @@ private function readFormat(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - $indexCode = self::getUInt2d($recordData, 0); + // external sheet references provided for named cells + if ($this->version == self::XLS_BIFF8) { + // offset: 0; size: 2; options + //$options = self::getUInt2d($recordData, 0); - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringLong(substr($recordData, 2)); - } else { - // BIFF7 - $string = $this->readByteStringShort(substr($recordData, 2)); - } + // offset: 2; size: 2; - $formatString = $string['value']; - // Apache Open Office sets wrong case writing to xls - issue 2239 - if ($formatString === 'GENERAL') { - $formatString = NumberFormat::FORMAT_GENERAL; + // offset: 4; size: 2; not used + + // offset: 6; size: var + $nameString = self::readUnicodeStringShort(substr($recordData, 6)); + + // offset: var; size: var; formula data + $offset = 6 + $nameString['size']; + $formula = $this->getFormulaFromStructure(substr($recordData, $offset)); + + $this->externalNames[] = [ + 'name' => $nameString['value'], + 'formula' => $formula, + ]; + } + } + + /** + * Read EXTERNSHEET record. 
+ */ + protected function readExternSheet(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + + // move stream pointer to next record + $this->pos += 4 + $length; + + // external sheet references provided for named cells + if ($this->version == self::XLS_BIFF8) { + // offset: 0; size: 2; number of following ref structures + $nm = self::getUInt2d($recordData, 0); + for ($i = 0; $i < $nm; ++$i) { + $this->ref[] = [ + // offset: 2 + 6 * $i; index to EXTERNALBOOK record + 'externalBookIndex' => self::getUInt2d($recordData, 2 + 6 * $i), + // offset: 4 + 6 * $i; index to first sheet in EXTERNALBOOK record + 'firstSheetIndex' => self::getUInt2d($recordData, 4 + 6 * $i), + // offset: 6 + 6 * $i; index to last sheet in EXTERNALBOOK record + 'lastSheetIndex' => self::getUInt2d($recordData, 6 + 6 * $i), + ]; } - $this->formats[$indexCode] = $formatString; } } /** - * XF - Extended Format. + * DEFINEDNAME. * - * This record contains formatting information for cells, rows, columns or styles. - * According to https://support.microsoft.com/en-us/help/147732 there are always at least 15 cell style XF - * and 1 cell XF. - * Inspection of Excel files generated by MS Office Excel shows that XF records 0-14 are cell style XF - * and XF record 15 is a cell XF - * We only read the first cell style XF and skip the remaining cell style XF records - * We read all cell XF records. + * This record is part of a Link Table. It contains the name + * and the token array of an internal defined name. Token + * arrays of defined names contain tokens with aberrant + * token classes. 
* * -- "OpenOffice.org's Documentation of the Microsoft * Excel File Format" */ - private function readXf(): void + protected function readDefinedName(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2061,247 +1780,272 @@ private function readXf(): void // move stream pointer to next record $this->pos += 4 + $length; - $objStyle = new Style(); + if ($this->version == self::XLS_BIFF8) { + // retrieves named cells - if (!$this->readDataOnly) { - // offset: 0; size: 2; Index to FONT record - if (self::getUInt2d($recordData, 0) < 4) { - $fontIndex = self::getUInt2d($recordData, 0); - } else { - // this has to do with that index 4 is omitted in all BIFF versions for some strange reason - // check the OpenOffice documentation of the FONT record - $fontIndex = self::getUInt2d($recordData, 0) - 1; - } - if (isset($this->objFonts[$fontIndex])) { - $objStyle->setFont($this->objFonts[$fontIndex]); - } + // offset: 0; size: 2; option flags + $opts = self::getUInt2d($recordData, 0); - // offset: 2; size: 2; Index to FORMAT record - $numberFormatIndex = self::getUInt2d($recordData, 2); - if (isset($this->formats[$numberFormatIndex])) { - // then we have user-defined format code - $numberFormat = ['formatCode' => $this->formats[$numberFormatIndex]]; - } elseif (($code = NumberFormat::builtInFormatCode($numberFormatIndex)) !== '') { - // then we have built-in format code - $numberFormat = ['formatCode' => $code]; - } else { - // we set the general format code - $numberFormat = ['formatCode' => NumberFormat::FORMAT_GENERAL]; - } - $objStyle->getNumberFormat()->setFormatCode($numberFormat['formatCode']); + // bit: 5; mask: 0x0020; 0 = user-defined name, 1 = built-in-name + $isBuiltInName = (0x0020 & $opts) >> 5; - // offset: 4; size: 2; XF type, cell protection, and parent style XF - // bit 2-0; mask 0x0007; XF_TYPE_PROT - $xfTypeProt = self::getUInt2d($recordData, 4); - // bit 0; mask 
0x01; 1 = cell is locked - $isLocked = (0x01 & $xfTypeProt) >> 0; - $objStyle->getProtection()->setLocked($isLocked ? Protection::PROTECTION_INHERIT : Protection::PROTECTION_UNPROTECTED); + // offset: 2; size: 1; keyboard shortcut - // bit 1; mask 0x02; 1 = Formula is hidden - $isHidden = (0x02 & $xfTypeProt) >> 1; - $objStyle->getProtection()->setHidden($isHidden ? Protection::PROTECTION_PROTECTED : Protection::PROTECTION_UNPROTECTED); + // offset: 3; size: 1; length of the name (character count) + $nlen = ord($recordData[3]); - // bit 2; mask 0x04; 0 = Cell XF, 1 = Cell Style XF - $isCellStyleXf = (0x04 & $xfTypeProt) >> 2; + // offset: 4; size: 2; size of the formula data (it can happen that this is zero) + // note: there can also be additional data, this is not included in $flen + $flen = self::getUInt2d($recordData, 4); - // offset: 6; size: 1; Alignment and text break - // bit 2-0, mask 0x07; horizontal alignment - $horAlign = (0x07 & ord($recordData[6])) >> 0; - Xls\Style\CellAlignment::horizontal($objStyle->getAlignment(), $horAlign); + // offset: 8; size: 2; 0=Global name, otherwise index to sheet (1-based) + $scope = self::getUInt2d($recordData, 8); - // bit 3, mask 0x08; wrap text - $wrapText = (0x08 & ord($recordData[6])) >> 3; - Xls\Style\CellAlignment::wrap($objStyle->getAlignment(), $wrapText); - - // bit 6-4, mask 0x70; vertical alignment - $vertAlign = (0x70 & ord($recordData[6])) >> 4; - Xls\Style\CellAlignment::vertical($objStyle->getAlignment(), $vertAlign); + // offset: 14; size: var; Name (Unicode string without length field) + $string = self::readUnicodeString(substr($recordData, 14), $nlen); - if ($this->version == self::XLS_BIFF8) { - // offset: 7; size: 1; XF_ROTATION: Text rotation angle - $angle = ord($recordData[7]); - $rotation = 0; - if ($angle <= 90) { - $rotation = $angle; - } elseif ($angle <= 180) { - $rotation = 90 - $angle; - } elseif ($angle == Alignment::TEXTROTATION_STACK_EXCEL) { - $rotation = 
Alignment::TEXTROTATION_STACK_PHPSPREADSHEET; - } - $objStyle->getAlignment()->setTextRotation($rotation); + // offset: var; size: $flen; formula data + $offset = 14 + $string['size']; + $formulaStructure = pack('v', $flen) . substr($recordData, $offset); - // offset: 8; size: 1; Indentation, shrink to cell size, and text direction - // bit: 3-0; mask: 0x0F; indent level - $indent = (0x0F & ord($recordData[8])) >> 0; - $objStyle->getAlignment()->setIndent($indent); + try { + $formula = $this->getFormulaFromStructure($formulaStructure); + } catch (PhpSpreadsheetException) { + $formula = ''; + $isBuiltInName = 0; + } - // bit: 4; mask: 0x10; 1 = shrink content to fit into cell - $shrinkToFit = (0x10 & ord($recordData[8])) >> 4; - switch ($shrinkToFit) { - case 0: - $objStyle->getAlignment()->setShrinkToFit(false); + $this->definedname[] = [ + 'isBuiltInName' => $isBuiltInName, + 'name' => $string['value'], + 'formula' => $formula, + 'scope' => $scope, + ]; + } + } - break; - case 1: - $objStyle->getAlignment()->setShrinkToFit(true); + /** + * Read MSODRAWINGGROUP record. 
+ */ + protected function readMsoDrawingGroup(): void + { + //$length = self::getUInt2d($this->data, $this->pos + 2); - break; - } + // get spliced record data + $splicedRecordData = $this->getSplicedRecordData(); + $recordData = $splicedRecordData['recordData']; - // offset: 9; size: 1; Flags used for attribute groups + $this->drawingGroupData .= $recordData; + } - // offset: 10; size: 4; Cell border lines and background area - // bit: 3-0; mask: 0x0000000F; left style - if ($bordersLeftStyle = Xls\Style\Border::lookup((0x0000000F & self::getInt4d($recordData, 10)) >> 0)) { - $objStyle->getBorders()->getLeft()->setBorderStyle($bordersLeftStyle); - } - // bit: 7-4; mask: 0x000000F0; right style - if ($bordersRightStyle = Xls\Style\Border::lookup((0x000000F0 & self::getInt4d($recordData, 10)) >> 4)) { - $objStyle->getBorders()->getRight()->setBorderStyle($bordersRightStyle); - } - // bit: 11-8; mask: 0x00000F00; top style - if ($bordersTopStyle = Xls\Style\Border::lookup((0x00000F00 & self::getInt4d($recordData, 10)) >> 8)) { - $objStyle->getBorders()->getTop()->setBorderStyle($bordersTopStyle); - } - // bit: 15-12; mask: 0x0000F000; bottom style - if ($bordersBottomStyle = Xls\Style\Border::lookup((0x0000F000 & self::getInt4d($recordData, 10)) >> 12)) { - $objStyle->getBorders()->getBottom()->setBorderStyle($bordersBottomStyle); - } - // bit: 22-16; mask: 0x007F0000; left color - $objStyle->getBorders()->getLeft()->colorIndex = (0x007F0000 & self::getInt4d($recordData, 10)) >> 16; + /** + * SST - Shared String Table. + * + * This record contains a list of all strings used anywhere + * in the workbook. Each string occurs only once. The + * workbook uses indexes into the list to reference the + * strings. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" + */ + protected function readSst(): void + { + // offset within (spliced) record data + $pos = 0; - // bit: 29-23; mask: 0x3F800000; right color - $objStyle->getBorders()->getRight()->colorIndex = (0x3F800000 & self::getInt4d($recordData, 10)) >> 23; + // Limit global SST position, further control for bad SST Length in BIFF8 data + $limitposSST = 0; - // bit: 30; mask: 0x40000000; 1 = diagonal line from top left to right bottom - $diagonalDown = (0x40000000 & self::getInt4d($recordData, 10)) >> 30 ? true : false; + // get spliced record data + $splicedRecordData = $this->getSplicedRecordData(); - // bit: 31; mask: 0x800000; 1 = diagonal line from bottom left to top right - $diagonalUp = (self::HIGH_ORDER_BIT & self::getInt4d($recordData, 10)) >> 31 ? true : false; + $recordData = $splicedRecordData['recordData']; + $spliceOffsets = $splicedRecordData['spliceOffsets']; - if ($diagonalUp === false) { - if ($diagonalDown === false) { - $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_NONE); - } else { - $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_DOWN); - } - } elseif ($diagonalDown === false) { - $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_UP); - } else { - $objStyle->getBorders()->setDiagonalDirection(Borders::DIAGONAL_BOTH); - } + // offset: 0; size: 4; total number of strings in the workbook + $pos += 4; - // offset: 14; size: 4; - // bit: 6-0; mask: 0x0000007F; top color - $objStyle->getBorders()->getTop()->colorIndex = (0x0000007F & self::getInt4d($recordData, 14)) >> 0; + // offset: 4; size: 4; number of following strings ($nm) + $nm = self::getInt4d($recordData, 4); + $pos += 4; - // bit: 13-7; mask: 0x00003F80; bottom color - $objStyle->getBorders()->getBottom()->colorIndex = (0x00003F80 & self::getInt4d($recordData, 14)) >> 7; + // look up limit position + foreach ($spliceOffsets as $spliceOffset) { + // it can happen 
that the string is empty, therefore we need + // <= and not just < + if ($pos <= $spliceOffset) { + $limitposSST = $spliceOffset; + } + } - // bit: 20-14; mask: 0x001FC000; diagonal color - $objStyle->getBorders()->getDiagonal()->colorIndex = (0x001FC000 & self::getInt4d($recordData, 14)) >> 14; + // loop through the Unicode strings (16-bit length) + for ($i = 0; $i < $nm && $pos < $limitposSST; ++$i) { + // number of characters in the Unicode string + $numChars = self::getUInt2d($recordData, $pos); + $pos += 2; - // bit: 24-21; mask: 0x01E00000; diagonal style - if ($bordersDiagonalStyle = Xls\Style\Border::lookup((0x01E00000 & self::getInt4d($recordData, 14)) >> 21)) { - $objStyle->getBorders()->getDiagonal()->setBorderStyle($bordersDiagonalStyle); - } + // option flags + $optionFlags = ord($recordData[$pos]); + ++$pos; - // bit: 31-26; mask: 0xFC000000 fill pattern - if ($fillType = FillPattern::lookup((self::FC000000 & self::getInt4d($recordData, 14)) >> 26)) { - $objStyle->getFill()->setFillType($fillType); - } - // offset: 18; size: 2; pattern and background colour - // bit: 6-0; mask: 0x007F; color index for pattern color - $objStyle->getFill()->startcolorIndex = (0x007F & self::getUInt2d($recordData, 18)) >> 0; + // bit: 0; mask: 0x01; 0 = compressed; 1 = uncompressed + $isCompressed = (($optionFlags & 0x01) == 0); - // bit: 13-7; mask: 0x3F80; color index for pattern background - $objStyle->getFill()->endcolorIndex = (0x3F80 & self::getUInt2d($recordData, 18)) >> 7; - } else { - // BIFF5 + // bit: 2; mask: 0x04; 0 = ordinary; 1 = Asian phonetic + $hasAsian = (($optionFlags & 0x04) != 0); - // offset: 7; size: 1; Text orientation and flags - $orientationAndFlags = ord($recordData[7]); + // bit: 3; mask: 0x08; 0 = ordinary; 1 = Rich-Text + $hasRichText = (($optionFlags & 0x08) != 0); - // bit: 1-0; mask: 0x03; XF_ORIENTATION: Text orientation - $xfOrientation = (0x03 & $orientationAndFlags) >> 0; - switch ($xfOrientation) { - case 0: -
$objStyle->getAlignment()->setTextRotation(0); + $formattingRuns = 0; + if ($hasRichText) { + // number of Rich-Text formatting runs + $formattingRuns = self::getUInt2d($recordData, $pos); + $pos += 2; + } - break; - case 1: - $objStyle->getAlignment()->setTextRotation(Alignment::TEXTROTATION_STACK_PHPSPREADSHEET); + $extendedRunLength = 0; + if ($hasAsian) { + // size of Asian phonetic setting + $extendedRunLength = self::getInt4d($recordData, $pos); + $pos += 4; + } - break; - case 2: - $objStyle->getAlignment()->setTextRotation(90); + // expected byte length of character array if not split + $len = ($isCompressed) ? $numChars : $numChars * 2; - break; - case 3: - $objStyle->getAlignment()->setTextRotation(-90); + // look up limit position - Check it again to be sure that no error occurs when parsing SST structure + $limitpos = null; + foreach ($spliceOffsets as $spliceOffset) { + // it can happen that the string is empty, therefore we need + // <= and not just < + if ($pos <= $spliceOffset) { + $limitpos = $spliceOffset; - break; + break; } + } - // offset: 8; size: 4; cell border lines and background area - $borderAndBackground = self::getInt4d($recordData, 8); - - // bit: 6-0; mask: 0x0000007F; color index for pattern color - $objStyle->getFill()->startcolorIndex = (0x0000007F & $borderAndBackground) >> 0; - - // bit: 13-7; mask: 0x00003F80; color index for pattern background - $objStyle->getFill()->endcolorIndex = (0x00003F80 & $borderAndBackground) >> 7; + if ($pos + $len <= $limitpos) { + // character array is not split between records - // bit: 21-16; mask: 0x003F0000; fill pattern - $objStyle->getFill()->setFillType(FillPattern::lookup((0x003F0000 & $borderAndBackground) >> 16)); + $retstr = substr($recordData, $pos, $len); + $pos += $len; + } else { + // character array is split between records - // bit: 24-22; mask: 0x01C00000; bottom line style - $objStyle->getBorders()->getBottom()->setBorderStyle(Xls\Style\Border::lookup((0x01C00000 & 
$borderAndBackground) >> 22)); + // first part of character array + $retstr = substr($recordData, $pos, $limitpos - $pos); - // bit: 31-25; mask: 0xFE000000; bottom line color - $objStyle->getBorders()->getBottom()->colorIndex = (self::FE000000 & $borderAndBackground) >> 25; + $bytesRead = $limitpos - $pos; - // offset: 12; size: 4; cell border lines - $borderLines = self::getInt4d($recordData, 12); + // remaining characters in Unicode string + $charsLeft = $numChars - (($isCompressed) ? $bytesRead : ($bytesRead / 2)); - // bit: 2-0; mask: 0x00000007; top line style - $objStyle->getBorders()->getTop()->setBorderStyle(Xls\Style\Border::lookup((0x00000007 & $borderLines) >> 0)); + $pos = $limitpos; - // bit: 5-3; mask: 0x00000038; left line style - $objStyle->getBorders()->getLeft()->setBorderStyle(Xls\Style\Border::lookup((0x00000038 & $borderLines) >> 3)); + // keep reading the characters + while ($charsLeft > 0) { + // look up next limit position, in case the string spans more than one continue record + foreach ($spliceOffsets as $spliceOffset) { + if ($pos < $spliceOffset) { + $limitpos = $spliceOffset; - // bit: 8-6; mask: 0x000001C0; right line style - $objStyle->getBorders()->getRight()->setBorderStyle(Xls\Style\Border::lookup((0x000001C0 & $borderLines) >> 6)); + break; + } + } - // bit: 15-9; mask: 0x0000FE00; top line color index - $objStyle->getBorders()->getTop()->colorIndex = (0x0000FE00 & $borderLines) >> 9; + // repeated option flags + // OpenOffice.org documentation 5.21 + $option = ord($recordData[$pos]); + ++$pos; - // bit: 22-16; mask: 0x007F0000; left line color index - $objStyle->getBorders()->getLeft()->colorIndex = (0x007F0000 & $borderLines) >> 16; + if ($isCompressed && ($option == 0)) { + // 1st fragment compressed + // this fragment compressed + $len = min($charsLeft, $limitpos - $pos); + $retstr .= substr($recordData, $pos, $len); + $charsLeft -= $len; + $isCompressed = true; + } elseif (!$isCompressed && ($option != 0)) { + // 1st fragment
uncompressed + // this fragment uncompressed + $len = min($charsLeft * 2, $limitpos - $pos); + $retstr .= substr($recordData, $pos, $len); + $charsLeft -= $len / 2; + $isCompressed = false; + } elseif (!$isCompressed && ($option == 0)) { + // 1st fragment uncompressed + // this fragment compressed + $len = min($charsLeft, $limitpos - $pos); + for ($j = 0; $j < $len; ++$j) { + $retstr .= $recordData[$pos + $j] + . chr(0); + } + $charsLeft -= $len; + $isCompressed = false; + } else { + // 1st fragment compressed + // this fragment uncompressed + $newstr = ''; + $jMax = strlen($retstr); + for ($j = 0; $j < $jMax; ++$j) { + $newstr .= $retstr[$j] . chr(0); + } + $retstr = $newstr; + $len = min($charsLeft * 2, $limitpos - $pos); + $retstr .= substr($recordData, $pos, $len); + $charsLeft -= $len / 2; + $isCompressed = false; + } - // bit: 29-23; mask: 0x3F800000; right line color index - $objStyle->getBorders()->getRight()->colorIndex = (0x3F800000 & $borderLines) >> 23; + $pos += $len; + } } - // add cellStyleXf or cellXf and update mapping - if ($isCellStyleXf) { - // we only read one style XF record which is always the first - if ($this->xfIndex == 0) { - $this->spreadsheet->addCellStyleXf($objStyle); - $this->mapCellStyleXfIndex[$this->xfIndex] = 0; + // convert to UTF-8 + $retstr = self::encodeUTF16($retstr, $isCompressed); + + // read additional Rich-Text information, if any + $fmtRuns = []; + if ($hasRichText) { + // list of formatting runs + for ($j = 0; $j < $formattingRuns; ++$j) { + // first formatted character; zero-based + $charPos = self::getUInt2d($recordData, $pos + $j * 4); + + // index to font record + $fontIndex = self::getUInt2d($recordData, $pos + 2 + $j * 4); + + $fmtRuns[] = [ + 'charPos' => $charPos, + 'fontIndex' => $fontIndex, + ]; } - } else { - // we read all cell XF records - $this->spreadsheet->addCellXf($objStyle); - $this->mapCellXfIndex[$this->xfIndex] = count($this->spreadsheet->getCellXfCollection()) - 1; + $pos += 4 * $formattingRuns; 
} - // update XF index for when we read next record - ++$this->xfIndex; + // read additional Asian phonetics information, if any + if ($hasAsian) { + // For Asian phonetic settings, we skip the extended string data + $pos += $extendedRunLength; + } + + // store the shared string + $this->sst[] = [ + 'value' => $retstr, + 'fmtRuns' => $fmtRuns, + ]; } + + // getSplicedRecordData() takes care of moving current position in data stream } - private function readXfExt(): void + /** + * Read PRINTGRIDLINES record. + */ + protected function readPrintGridlines(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2309,175 +2053,141 @@ private function readXfExt(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; 0x087D = repeated header - - // offset: 2; size: 2 + if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { + // offset: 0; size: 2; 0 = do not print sheet grid lines; 1 = print sheet gridlines + $printGridlines = (bool) self::getUInt2d($recordData, 0); + $this->phpSheet->setPrintGridlines($printGridlines); + } + } - // offset: 4; size: 8; not used + /** + * Read DEFAULTROWHEIGHT record.
+ */ + protected function readDefaultRowHeight(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 12; size: 2; record version + // move stream pointer to next record + $this->pos += 4 + $length; - // offset: 14; size: 2; index to XF record which this record modifies - $ixfe = self::getUInt2d($recordData, 14); + // offset: 0; size: 2; option flags + // offset: 2; size: 2; default height for unused rows, (twips 1/20 point) + $height = self::getUInt2d($recordData, 2); + $this->phpSheet->getDefaultRowDimension()->setRowHeight($height / 20); + } - // offset: 16; size: 2; not used + /** + * Read SHEETPR record. + */ + protected function readSheetPr(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 18; size: 2; number of extension properties that follow - //$cexts = self::getUInt2d($recordData, 18); + // move stream pointer to next record + $this->pos += 4 + $length; - // start reading the actual extension data - $offset = 20; - while ($offset < $length) { - // extension type - $extType = self::getUInt2d($recordData, $offset); + // offset: 0; size: 2 - // extension length - $cb = self::getUInt2d($recordData, $offset + 2); + // bit: 6; mask: 0x0040; 0 = outline buttons above outline group + $isSummaryBelow = (0x0040 & self::getUInt2d($recordData, 0)) >> 6; + $this->phpSheet->setShowSummaryBelow((bool) $isSummaryBelow); - // extension data - $extData = substr($recordData, $offset + 4, $cb); + // bit: 7; mask: 0x0080; 0 = outline buttons left of outline group + $isSummaryRight = (0x0080 & self::getUInt2d($recordData, 0)) >> 7; + $this->phpSheet->setShowSummaryRight((bool) $isSummaryRight); - switch ($extType) { - case 4: // fill start color - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value 
based on color type) + // bit: 8; mask: 0x100; 0 = scale printout in percent, 1 = fit printout to number of pages + // this corresponds to radio button setting in page setup dialog in Excel + $this->isFitToPages = (bool) ((0x0100 & self::getUInt2d($recordData, 0)) >> 8); + } - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); + /** + * Read HORIZONTALPAGEBREAKS record. + */ + protected function readHorizontalPageBreaks(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $fill = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFill(); - $fill->getStartColor()->setRGB($rgb); - $fill->startcolorIndex = null; // normal color index does not apply, discard - } - } + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 5: // fill end color - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) + if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { + // offset: 0; size: 2; number of the following row index structures + $nm = self::getUInt2d($recordData, 0); - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); + // offset: 2; size: 6 * $nm; list of $nm row index structures + for ($i = 0; $i < $nm; ++$i) { + $r = self::getUInt2d($recordData, 2 + 6 * $i); + $cf = self::getUInt2d($recordData, 2 + 6 * $i + 2); + //$cl = self::getUInt2d($recordData, 2 + 6 * $i + 4); - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $fill = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFill(); - $fill->getEndColor()->setRGB($rgb); - $fill->endcolorIndex = null; // normal color index 
does not apply, discard - } - } + // not sure why two column indexes are necessary? + $this->phpSheet->setBreak([$cf + 1, $r], Worksheet::BREAK_ROW); + } + } + } - break; - case 7: // border color top - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) + /** + * Read VERTICALPAGEBREAKS record. + */ + protected function readVerticalPageBreaks(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); + // move stream pointer to next record + $this->pos += 4 + $length; - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $top = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getTop(); - $top->getColor()->setRGB($rgb); - $top->colorIndex = null; // normal color index does not apply, discard - } - } + if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { + // offset: 0; size: 2; number of the following column index structures + $nm = self::getUInt2d($recordData, 0); - break; - case 8: // border color bottom - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) + // offset: 2; size: 6 * $nm; list of $nm column index structures + for ($i = 0; $i < $nm; ++$i) { + $c = self::getUInt2d($recordData, 2 + 6 * $i); + $rf = self::getUInt2d($recordData, 2 + 6 * $i + 2); + //$rl = self::getUInt2d($recordData, 2 + 6 * $i + 4); - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); + // not sure why two row indexes are necessary? + $this->phpSheet->setBreak([$c + 1, ($rf > 0) ?
$rf : 1], Worksheet::BREAK_COLUMN); + } + } + } - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $bottom = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getBottom(); - $bottom->getColor()->setRGB($rgb); - $bottom->colorIndex = null; // normal color index does not apply, discard - } - } + /** + * Read HEADER record. + */ + protected function readHeader(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 9: // border color left - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) + // move stream pointer to next record + $this->pos += 4 + $length; - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $left = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getLeft(); - $left->getColor()->setRGB($rgb); - $left->colorIndex = null; // normal color index does not apply, discard - } - } - - break; - case 10: // border color right - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $right = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getRight(); - $right->getColor()->setRGB($rgb); - $right->colorIndex = null; // normal color index does not apply, discard - } - } - - break; - case 11: // border color diagonal - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = 
substr($extData, 4, 4); // color value (value based on color type) - - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $diagonal = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getBorders()->getDiagonal(); - $diagonal->getColor()->setRGB($rgb); - $diagonal->colorIndex = null; // normal color index does not apply, discard - } - } - - break; - case 13: // font color - $xclfType = self::getUInt2d($extData, 0); // color type - $xclrValue = substr($extData, 4, 4); // color value (value based on color type) - - if ($xclfType == 2) { - $rgb = sprintf('%02X%02X%02X', ord($xclrValue[0]), ord($xclrValue[1]), ord($xclrValue[2])); - - // modify the relevant style property - if (isset($this->mapCellXfIndex[$ixfe])) { - $font = $this->spreadsheet->getCellXfByIndex($this->mapCellXfIndex[$ixfe])->getFont(); - $font->getColor()->setRGB($rgb); - $font->colorIndex = null; // normal color index does not apply, discard - } - } - - break; + if (!$this->readDataOnly) { + // offset: 0; size: var + // realized that $recordData can be empty even when record exists + if ($recordData) { + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringLong($recordData); + } else { + $string = $this->readByteStringShort($recordData); } - $offset += $cb; + $this->phpSheet->getHeaderFooter()->setOddHeader($string['value']); + $this->phpSheet->getHeaderFooter()->setEvenHeader($string['value']); } } } /** - * Read STYLE record. + * Read FOOTER record. 
*/ - private function readStyle(): void + protected function readFooter(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2486,35 +2196,24 @@ private function readStyle(): void $this->pos += 4 + $length; if (!$this->readDataOnly) { - // offset: 0; size: 2; index to XF record and flag for built-in style - $ixfe = self::getUInt2d($recordData, 0); - - // bit: 11-0; mask 0x0FFF; index to XF record - //$xfIndex = (0x0FFF & $ixfe) >> 0; - - // bit: 15; mask 0x8000; 0 = user-defined style, 1 = built-in style - $isBuiltIn = (bool) ((0x8000 & $ixfe) >> 15); - - if ($isBuiltIn) { - // offset: 2; size: 1; identifier for built-in style - $builtInId = ord($recordData[2]); - - switch ($builtInId) { - case 0x00: - // currently, we are not using this for anything - break; - default: - break; + // offset: 0; size: var + // realized that $recordData can be empty even when record exists + if ($recordData) { + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringLong($recordData); + } else { + $string = $this->readByteStringShort($recordData); } + $this->phpSheet->getHeaderFooter()->setOddFooter($string['value']); + $this->phpSheet->getHeaderFooter()->setEvenFooter($string['value']); } - // user-defined; not supported by PhpSpreadsheet } } /** - * Read PALETTE record. + * Read HCENTER record. 
*/ - private function readPalette(): void + protected function readHcenter(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2523,74 +2222,36 @@ private function readPalette(): void $this->pos += 4 + $length; if (!$this->readDataOnly) { - // offset: 0; size: 2; number of following colors - $nm = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; 0 = print sheet left aligned, 1 = print sheet centered horizontally + $isHorizontalCentered = (bool) self::getUInt2d($recordData, 0); - // list of RGB colors - for ($i = 0; $i < $nm; ++$i) { - $rgb = substr($recordData, 2 + 4 * $i, 4); - $this->palette[] = self::readRGB($rgb); - } + $this->phpSheet->getPageSetup()->setHorizontalCentered($isHorizontalCentered); } } /** - * SHEET. - * - * This record is located in the Workbook Globals - * Substream and represents a sheet inside the workbook. - * One SHEET record is written for each sheet. It stores the - * sheet name and a stream offset to the BOF record of the - * respective Sheet Substream within the Workbook Stream. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read VCENTER record. 
*/ - private function readSheet(): void + protected function readVcenter(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 0; size: 4; absolute stream position of the BOF record of the sheet - // NOTE: not encrypted - $rec_offset = self::getInt4d($this->data, $this->pos + 4); - // move stream pointer to next record $this->pos += 4 + $length; - // offset: 4; size: 1; sheet state - $sheetState = match (ord($recordData[4])) { - 0x00 => Worksheet::SHEETSTATE_VISIBLE, - 0x01 => Worksheet::SHEETSTATE_HIDDEN, - 0x02 => Worksheet::SHEETSTATE_VERYHIDDEN, - default => Worksheet::SHEETSTATE_VISIBLE, - }; - - // offset: 5; size: 1; sheet type - $sheetType = ord($recordData[5]); + if (!$this->readDataOnly) { + // offset: 0; size: 2; 0 = print sheet aligned at top page border, 1 = print sheet vertically centered + $isVerticalCentered = (bool) self::getUInt2d($recordData, 0); - // offset: 6; size: var; sheet name - $rec_name = null; - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringShort(substr($recordData, 6)); - $rec_name = $string['value']; - } elseif ($this->version == self::XLS_BIFF7) { - $string = $this->readByteStringShort(substr($recordData, 6)); - $rec_name = $string['value']; + $this->phpSheet->getPageSetup()->setVerticalCentered($isVerticalCentered); } - - $this->sheets[] = [ - 'name' => $rec_name, - 'offset' => $rec_offset, - 'sheetState' => $sheetState, - 'sheetType' => $sheetType, - ]; } /** - * Read EXTERNALBOOK record. + * Read LEFTMARGIN record. 
*/ - private function readExternalBook(): void + protected function readLeftMargin(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2598,61 +2259,33 @@ private function readExternalBook(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset within record data - $offset = 0; - - // there are 4 types of records - if (strlen($recordData) > 4) { - // external reference - // offset: 0; size: 2; number of sheet names ($nm) - $nm = self::getUInt2d($recordData, 0); - $offset += 2; + if (!$this->readDataOnly) { + // offset: 0; size: 8 + $this->phpSheet->getPageMargins()->setLeft(self::extractNumber($recordData)); + } + } - // offset: 2; size: var; encoded URL without sheet name (Unicode string, 16-bit length) - $encodedUrlString = self::readUnicodeStringLong(substr($recordData, 2)); - $offset += $encodedUrlString['size']; + /** + * Read RIGHTMARGIN record. + */ + protected function readRightMargin(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: var; size: var; list of $nm sheet names (Unicode strings, 16-bit length) - $externalSheetNames = []; - for ($i = 0; $i < $nm; ++$i) { - $externalSheetNameString = self::readUnicodeStringLong(substr($recordData, $offset)); - $externalSheetNames[] = $externalSheetNameString['value']; - $offset += $externalSheetNameString['size']; - } + // move stream pointer to next record + $this->pos += 4 + $length; - // store the record data - $this->externalBooks[] = [ - 'type' => 'external', - 'encodedUrl' => $encodedUrlString['value'], - 'externalSheetNames' => $externalSheetNames, - ]; - } elseif (substr($recordData, 2, 2) == pack('CC', 0x01, 0x04)) { - // internal reference - // offset: 0; size: 2; number of sheet in this document - // offset: 2; size: 2; 0x01 0x04 - $this->externalBooks[] = [ - 'type' => 
'internal', - ]; - } elseif (substr($recordData, 0, 4) == pack('vCC', 0x0001, 0x01, 0x3A)) { - // add-in function - // offset: 0; size: 2; 0x0001 - $this->externalBooks[] = [ - 'type' => 'addInFunction', - ]; - } elseif (substr($recordData, 0, 2) == pack('v', 0x0000)) { - // DDE links, OLE links - // offset: 0; size: 2; 0x0000 - // offset: 2; size: var; encoded source document name - $this->externalBooks[] = [ - 'type' => 'DDEorOLE', - ]; + if (!$this->readDataOnly) { + // offset: 0; size: 8 + $this->phpSheet->getPageMargins()->setRight(self::extractNumber($recordData)); } } /** - * Read EXTERNNAME record. + * Read TOPMARGIN record. */ - private function readExternName(): void + protected function readTopMargin(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2660,33 +2293,33 @@ private function readExternName(): void // move stream pointer to next record $this->pos += 4 + $length; - // external sheet references provided for named cells - if ($this->version == self::XLS_BIFF8) { - // offset: 0; size: 2; options - //$options = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; - - // offset: 4; size: 2; not used + if (!$this->readDataOnly) { + // offset: 0; size: 8 + $this->phpSheet->getPageMargins()->setTop(self::extractNumber($recordData)); + } + } - // offset: 6; size: var - $nameString = self::readUnicodeStringShort(substr($recordData, 6)); + /** + * Read BOTTOMMARGIN record. 
+ */ + protected function readBottomMargin(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: var; size: var; formula data - $offset = 6 + $nameString['size']; - $formula = $this->getFormulaFromStructure(substr($recordData, $offset)); + // move stream pointer to next record + $this->pos += 4 + $length; - $this->externalNames[] = [ - 'name' => $nameString['value'], - 'formula' => $formula, - ]; + if (!$this->readDataOnly) { + // offset: 0; size: 8 + $this->phpSheet->getPageMargins()->setBottom(self::extractNumber($recordData)); } } /** - * Read EXTERNSHEET record. + * Read PAGESETUP record. */ - private function readExternSheet(): void + protected function readPageSetup(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2694,35 +2327,57 @@ private function readExternSheet(): void // move stream pointer to next record $this->pos += 4 + $length; - // external sheet references provided for named cells - if ($this->version == self::XLS_BIFF8) { - // offset: 0; size: 2; number of following ref structures - $nm = self::getUInt2d($recordData, 0); - for ($i = 0; $i < $nm; ++$i) { - $this->ref[] = [ - // offset: 2 + 6 * $i; index to EXTERNALBOOK record - 'externalBookIndex' => self::getUInt2d($recordData, 2 + 6 * $i), - // offset: 4 + 6 * $i; index to first sheet in EXTERNALBOOK record - 'firstSheetIndex' => self::getUInt2d($recordData, 4 + 6 * $i), - // offset: 6 + 6 * $i; index to last sheet in EXTERNALBOOK record - 'lastSheetIndex' => self::getUInt2d($recordData, 6 + 6 * $i), - ]; + if (!$this->readDataOnly) { + // offset: 0; size: 2; paper size + $paperSize = self::getUInt2d($recordData, 0); + + // offset: 2; size: 2; scaling factor + $scale = self::getUInt2d($recordData, 2); + + // offset: 6; size: 2; fit worksheet width to this number of pages, 0 = use as many as needed 
+ $fitToWidth = self::getUInt2d($recordData, 6); + + // offset: 8; size: 2; fit worksheet height to this number of pages, 0 = use as many as needed + $fitToHeight = self::getUInt2d($recordData, 8); + + // offset: 10; size: 2; option flags + + // bit: 0; mask: 0x0001; 0=down then over, 1=over then down + $isOverThenDown = (0x0001 & self::getUInt2d($recordData, 10)); + + // bit: 1; mask: 0x0002; 0=landscape, 1=portrait + $isPortrait = (0x0002 & self::getUInt2d($recordData, 10)) >> 1; + + // bit: 2; mask: 0x0004; 1= paper size, scaling factor, paper orient. not init + // when this bit is set, do not use flags for those properties + $isNotInit = (0x0004 & self::getUInt2d($recordData, 10)) >> 2; + + if (!$isNotInit) { + $this->phpSheet->getPageSetup()->setPaperSize($paperSize); + $this->phpSheet->getPageSetup()->setPageOrder(((bool) $isOverThenDown) ? PageSetup::PAGEORDER_OVER_THEN_DOWN : PageSetup::PAGEORDER_DOWN_THEN_OVER); + $this->phpSheet->getPageSetup()->setOrientation(((bool) $isPortrait) ? PageSetup::ORIENTATION_PORTRAIT : PageSetup::ORIENTATION_LANDSCAPE); + + $this->phpSheet->getPageSetup()->setScale($scale, false); + $this->phpSheet->getPageSetup()->setFitToPage((bool) $this->isFitToPages); + $this->phpSheet->getPageSetup()->setFitToWidth($fitToWidth, false); + $this->phpSheet->getPageSetup()->setFitToHeight($fitToHeight, false); } + + // offset: 16; size: 8; header margin (IEEE 754 floating-point value) + $marginHeader = self::extractNumber(substr($recordData, 16, 8)); + $this->phpSheet->getPageMargins()->setHeader($marginHeader); + + // offset: 24; size: 8; footer margin (IEEE 754 floating-point value) + $marginFooter = self::extractNumber(substr($recordData, 24, 8)); + $this->phpSheet->getPageMargins()->setFooter($marginFooter); } } /** - * DEFINEDNAME. - * - * This record is part of a Link Table. It contains the name - * and the token array of an internal defined name. Token - * arrays of defined names contain tokens with aberrant - * token classes. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * PROTECT - Sheet protection (BIFF2 through BIFF8) + * if this record is omitted, then it also means no sheet protection. */ - private function readDefinedName(): void + protected function readProtect(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -2730,290 +2385,232 @@ private function readDefinedName(): void // move stream pointer to next record $this->pos += 4 + $length; - if ($this->version == self::XLS_BIFF8) { - // retrieves named cells + if ($this->readDataOnly) { + return; + } - // offset: 0; size: 2; option flags - $opts = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; - // bit: 5; mask: 0x0020; 0 = user-defined name, 1 = built-in-name - $isBuiltInName = (0x0020 & $opts) >> 5; + // bit 0, mask 0x01; 1 = sheet is protected + $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; + $this->phpSheet->getProtection()->setSheet((bool) $bool); + } - // offset: 2; size: 1; keyboard shortcut + /** + * SCENPROTECT. 
+ */ + protected function readScenProtect(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 3; size: 1; length of the name (character count) - $nlen = ord($recordData[3]); + // move stream pointer to next record + $this->pos += 4 + $length; - // offset: 4; size: 2; size of the formula data (it can happen that this is zero) - // note: there can also be additional data, this is not included in $flen - $flen = self::getUInt2d($recordData, 4); + if ($this->readDataOnly) { + return; + } - // offset: 8; size: 2; 0=Global name, otherwise index to sheet (1-based) - $scope = self::getUInt2d($recordData, 8); + // offset: 0; size: 2; - // offset: 14; size: var; Name (Unicode string without length field) - $string = self::readUnicodeString(substr($recordData, 14), $nlen); + // bit: 0, mask 0x01; 1 = scenarios are protected + $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; - // offset: var; size: $flen; formula data - $offset = 14 + $string['size']; - $formulaStructure = pack('v', $flen) . substr($recordData, $offset); + $this->phpSheet->getProtection()->setScenarios((bool) $bool); + } - try { - $formula = $this->getFormulaFromStructure($formulaStructure); - } catch (PhpSpreadsheetException) { - $formula = ''; - $isBuiltInName = 0; - } + /** + * OBJECTPROTECT. 
+ */ + protected function readObjectProtect(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - $this->definedname[] = [ - 'isBuiltInName' => $isBuiltInName, - 'name' => $string['value'], - 'formula' => $formula, - 'scope' => $scope, - ]; + // move stream pointer to next record + $this->pos += 4 + $length; + + if ($this->readDataOnly) { + return; } + + // offset: 0; size: 2; + + // bit: 0, mask 0x01; 1 = objects are protected + $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; + + $this->phpSheet->getProtection()->setObjects((bool) $bool); } /** - * Read MSODRAWINGGROUP record. + * PASSWORD - Sheet protection (hashed) password (BIFF2 through BIFF8). */ - private function readMsoDrawingGroup(): void + protected function readPassword(): void { - //$length = self::getUInt2d($this->data, $this->pos + 2); + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // get spliced record data - $splicedRecordData = $this->getSplicedRecordData(); - $recordData = $splicedRecordData['recordData']; + // move stream pointer to next record + $this->pos += 4 + $length; - $this->drawingGroupData .= $recordData; + if (!$this->readDataOnly) { + // offset: 0; size: 2; 16-bit hash value of password + $password = strtoupper(dechex(self::getUInt2d($recordData, 0))); // the hashed password + $this->phpSheet->getProtection()->setPassword($password, true); + } } /** - * SST - Shared String Table. - * - * This record contains a list of all strings used anywhere - * in the workbook. Each string occurs only once. The - * workbook uses indexes into the list to reference the - * strings. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read DEFCOLWIDTH record. 
*/ - private function readSst(): void + protected function readDefColWidth(): void { - // offset within (spliced) record data - $pos = 0; - - // Limit global SST position, further control for bad SST Length in BIFF8 data - $limitposSST = 0; + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // get spliced record data - $splicedRecordData = $this->getSplicedRecordData(); + // move stream pointer to next record + $this->pos += 4 + $length; - $recordData = $splicedRecordData['recordData']; - $spliceOffsets = $splicedRecordData['spliceOffsets']; + // offset: 0; size: 2; default column width + $width = self::getUInt2d($recordData, 0); + if ($width != 8) { + $this->phpSheet->getDefaultColumnDimension()->setWidth($width); + } + } - // offset: 0; size: 4; total number of strings in the workbook - $pos += 4; + /** + * Read COLINFO record. + */ + protected function readColInfo(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // offset: 4; size: 4; number of following strings ($nm) - $nm = self::getInt4d($recordData, 4); - $pos += 4; + // move stream pointer to next record + $this->pos += 4 + $length; - // look up limit position - foreach ($spliceOffsets as $spliceOffset) { - // it can happen that the string is empty, therefore we need - // <= and not just < - if ($pos <= $spliceOffset) { - $limitposSST = $spliceOffset; - } - } + if (!$this->readDataOnly) { + // offset: 0; size: 2; index to first column in range + $firstColumnIndex = self::getUInt2d($recordData, 0); - // loop through the Unicode strings (16-bit length) - for ($i = 0; $i < $nm && $pos < $limitposSST; ++$i) { - // number of characters in the Unicode string - $numChars = self::getUInt2d($recordData, $pos); - $pos += 2; + // offset: 2; size: 2; index to last column in range + $lastColumnIndex = self::getUInt2d($recordData, 2); - 
// option flags - $optionFlags = ord($recordData[$pos]); - ++$pos; + // offset: 4; size: 2; width of the column in 1/256 of the width of the zero character + $width = self::getUInt2d($recordData, 4); - // bit: 0; mask: 0x01; 0 = compressed; 1 = uncompressed - $isCompressed = (($optionFlags & 0x01) == 0); + // offset: 6; size: 2; index to XF record for default column formatting + $xfIndex = self::getUInt2d($recordData, 6); - // bit: 2; mask: 0x02; 0 = ordinary; 1 = Asian phonetic - $hasAsian = (($optionFlags & 0x04) != 0); + // offset: 8; size: 2; option flags + // bit: 0; mask: 0x0001; 1= columns are hidden + $isHidden = (0x0001 & self::getUInt2d($recordData, 8)) >> 0; - // bit: 3; mask: 0x03; 0 = ordinary; 1 = Rich-Text - $hasRichText = (($optionFlags & 0x08) != 0); + // bit: 10-8; mask: 0x0700; outline level of the columns (0 = no outline) + $level = (0x0700 & self::getUInt2d($recordData, 8)) >> 8; - $formattingRuns = 0; - if ($hasRichText) { - // number of Rich-Text formatting runs - $formattingRuns = self::getUInt2d($recordData, $pos); - $pos += 2; - } + // bit: 12; mask: 0x1000; 1 = collapsed + $isCollapsed = (bool) ((0x1000 & self::getUInt2d($recordData, 8)) >> 12); - $extendedRunLength = 0; - if ($hasAsian) { - // size of Asian phonetic setting - $extendedRunLength = self::getInt4d($recordData, $pos); - $pos += 4; - } + // offset: 10; size: 2; not used - // expected byte length of character array if not split - $len = ($isCompressed) ? 
$numChars : $numChars * 2; - - // look up limit position - Check it again to be sure that no error occurs when parsing SST structure - $limitpos = null; - foreach ($spliceOffsets as $spliceOffset) { - // it can happen that the string is empty, therefore we need - // <= and not just < - if ($pos <= $spliceOffset) { - $limitpos = $spliceOffset; + for ($i = $firstColumnIndex + 1; $i <= $lastColumnIndex + 1; ++$i) { + if ($lastColumnIndex == 255 || $lastColumnIndex == 256) { + $this->phpSheet->getDefaultColumnDimension()->setWidth($width / 256); break; } + $this->phpSheet->getColumnDimensionByColumn($i)->setWidth($width / 256); + $this->phpSheet->getColumnDimensionByColumn($i)->setVisible(!$isHidden); + $this->phpSheet->getColumnDimensionByColumn($i)->setOutlineLevel($level); + $this->phpSheet->getColumnDimensionByColumn($i)->setCollapsed($isCollapsed); + if (isset($this->mapCellXfIndex[$xfIndex])) { + $this->phpSheet->getColumnDimensionByColumn($i)->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } } + } + } - if ($pos + $len <= $limitpos) { - // character array is not split between records - - $retstr = substr($recordData, $pos, $len); - $pos += $len; - } else { - // character array is split between records - - // first part of character array - $retstr = substr($recordData, $pos, $limitpos - $pos); + /** + * ROW. + * + * This record contains the properties of a single row in a + * sheet. Rows and cells in a sheet are divided into blocks + * of 32 rows. + * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" + */ + protected function readRow(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - $bytesRead = $limitpos - $pos; + // move stream pointer to next record + $this->pos += 4 + $length; - // remaining characters in Unicode string - $charsLeft = $numChars - (($isCompressed) ? 
$bytesRead : ($bytesRead / 2)); + if (!$this->readDataOnly) { + // offset: 0; size: 2; index of this row + $r = self::getUInt2d($recordData, 0); - $pos = $limitpos; + // offset: 2; size: 2; index to column of the first cell which is described by a cell record - // keep reading the characters - while ($charsLeft > 0) { - // look up next limit position, in case the string span more than one continue record - foreach ($spliceOffsets as $spliceOffset) { - if ($pos < $spliceOffset) { - $limitpos = $spliceOffset; + // offset: 4; size: 2; index to column of the last cell which is described by a cell record, increased by 1 - break; - } - } + // offset: 6; size: 2; - // repeated option flags - // OpenOffice.org documentation 5.21 - $option = ord($recordData[$pos]); - ++$pos; + // bit: 14-0; mask: 0x7FFF; height of the row, in twips = 1/20 of a point + $height = (0x7FFF & self::getUInt2d($recordData, 6)) >> 0; - if ($isCompressed && ($option == 0)) { - // 1st fragment compressed - // this fragment compressed - $len = min($charsLeft, $limitpos - $pos); - $retstr .= substr($recordData, $pos, $len); - $charsLeft -= $len; - $isCompressed = true; - } elseif (!$isCompressed && ($option != 0)) { - // 1st fragment uncompressed - // this fragment uncompressed - $len = min($charsLeft * 2, $limitpos - $pos); - $retstr .= substr($recordData, $pos, $len); - $charsLeft -= $len / 2; - $isCompressed = false; - } elseif (!$isCompressed && ($option == 0)) { - // 1st fragment uncompressed - // this fragment compressed - $len = min($charsLeft, $limitpos - $pos); - for ($j = 0; $j < $len; ++$j) { - $retstr .= $recordData[$pos + $j] - . chr(0); - } - $charsLeft -= $len; - $isCompressed = false; - } else { - // 1st fragment compressed - // this fragment uncompressed - $newstr = ''; - $jMax = strlen($retstr); - for ($j = 0; $j < $jMax; ++$j) { - $newstr .= $retstr[$j] . 
chr(0); - } - $retstr = $newstr; - $len = min($charsLeft * 2, $limitpos - $pos); - $retstr .= substr($recordData, $pos, $len); - $charsLeft -= $len / 2; - $isCompressed = false; - } + // bit: 15: mask: 0x8000; 0 = row has custom height; 1= row has default height + $useDefaultHeight = (0x8000 & self::getUInt2d($recordData, 6)) >> 15; - $pos += $len; - } + if (!$useDefaultHeight) { + $this->phpSheet->getRowDimension($r + 1)->setRowHeight($height / 20); } - // convert to UTF-8 - $retstr = self::encodeUTF16($retstr, $isCompressed); - - // read additional Rich-Text information, if any - $fmtRuns = []; - if ($hasRichText) { - // list of formatting runs - for ($j = 0; $j < $formattingRuns; ++$j) { - // first formatted character; zero-based - $charPos = self::getUInt2d($recordData, $pos + $j * 4); + // offset: 8; size: 2; not used - // index to font record - $fontIndex = self::getUInt2d($recordData, $pos + 2 + $j * 4); + // offset: 10; size: 2; not used in BIFF5-BIFF8 - $fmtRuns[] = [ - 'charPos' => $charPos, - 'fontIndex' => $fontIndex, - ]; - } - $pos += 4 * $formattingRuns; - } + // offset: 12; size: 4; option flags and default row formatting - // read additional Asian phonetics information, if any - if ($hasAsian) { - // For Asian phonetic settings, we skip the extended string data - $pos += $extendedRunLength; - } + // bit: 2-0: mask: 0x00000007; outline level of the row + $level = (0x00000007 & self::getInt4d($recordData, 12)) >> 0; + $this->phpSheet->getRowDimension($r + 1)->setOutlineLevel($level); - // store the shared sting - $this->sst[] = [ - 'value' => $retstr, - 'fmtRuns' => $fmtRuns, - ]; - } + // bit: 4; mask: 0x00000010; 1 = outline group start or ends here... 
and is collapsed + $isCollapsed = (bool) ((0x00000010 & self::getInt4d($recordData, 12)) >> 4); + $this->phpSheet->getRowDimension($r + 1)->setCollapsed($isCollapsed); - // getSplicedRecordData() takes care of moving current position in data stream - } + // bit: 5; mask: 0x00000020; 1 = row is hidden + $isHidden = (0x00000020 & self::getInt4d($recordData, 12)) >> 5; + $this->phpSheet->getRowDimension($r + 1)->setVisible(!$isHidden); - /** - * Read PRINTGRIDLINES record. - */ - private function readPrintGridlines(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // bit: 7; mask: 0x00000080; 1 = row has explicit format + $hasExplicitFormat = (0x00000080 & self::getInt4d($recordData, 12)) >> 7; - // move stream pointer to next record - $this->pos += 4 + $length; + // bit: 27-16; mask: 0x0FFF0000; only applies when hasExplicitFormat = 1; index to XF record + $xfIndex = (0x0FFF0000 & self::getInt4d($recordData, 12)) >> 16; - if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { - // offset: 0; size: 2; 0 = do not print sheet grid lines; 1 = print sheet gridlines - $printGridlines = (bool) self::getUInt2d($recordData, 0); - $this->phpSheet->setPrintGridlines($printGridlines); + if ($hasExplicitFormat && isset($this->mapCellXfIndex[$xfIndex])) { + $this->phpSheet->getRowDimension($r + 1)->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } } } /** - * Read DEFAULTROWHEIGHT record. + * Read RK record + * This record represents a cell that contains an RK value + * (encoded integer or floating-point value). If a + * floating-point value cannot be encoded to an RK value, + * a NUMBER record will be written. This record replaces the + * record INTEGER written in BIFF2. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readDefaultRowHeight(): void + protected function readRk(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3021,42 +2618,43 @@ private function readDefaultRowHeight(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; option flags - // offset: 2; size: 2; default height for unused rows, (twips 1/20 point) - $height = self::getUInt2d($recordData, 2); - $this->phpSheet->getDefaultRowDimension()->setRowHeight($height / 20); - } - - /** - * Read SHEETPR record. - */ - private function readSheetPr(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); - // move stream pointer to next record - $this->pos += 4 + $length; + // offset: 2; size: 2; index to column + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); - // offset: 0; size: 2 + // Read cell? 
+ if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: 4; size: 2; index to XF record + $xfIndex = self::getUInt2d($recordData, 4); - // bit: 6; mask: 0x0040; 0 = outline buttons above outline group - $isSummaryBelow = (0x0040 & self::getUInt2d($recordData, 0)) >> 6; - $this->phpSheet->setShowSummaryBelow((bool) $isSummaryBelow); + // offset: 6; size: 4; RK value + $rknum = self::getInt4d($recordData, 6); + $numValue = self::getIEEE754($rknum); - // bit: 7; mask: 0x0080; 0 = outline buttons left of outline group - $isSummaryRight = (0x0080 & self::getUInt2d($recordData, 0)) >> 7; - $this->phpSheet->setShowSummaryRight((bool) $isSummaryRight); + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add style information + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } - // bit: 8; mask: 0x100; 0 = scale printout in percent, 1 = fit printout to number of pages - // this corresponds to radio button setting in page setup dialog in Excel - $this->isFitToPages = (bool) ((0x0100 & self::getUInt2d($recordData, 0)) >> 8); + // add cell + $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); + } } /** - * Read HORIZONTALPAGEBREAKS record. + * Read LABELSST record + * This record represents a cell that contains a string. It + * replaces the LABEL record and RSTRING record used in + * BIFF2-BIFF5. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readHorizontalPageBreaks(): void + protected function readLabelSst(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3064,53 +2662,84 @@ private function readHorizontalPageBreaks(): void // move stream pointer to next record $this->pos += 4 + $length; - if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { - // offset: 0; size: 2; number of the following row index structures - $nm = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); - // offset: 2; size: 6 * $nm; list of $nm row index structures - for ($i = 0; $i < $nm; ++$i) { - $r = self::getUInt2d($recordData, 2 + 6 * $i); - $cf = self::getUInt2d($recordData, 2 + 6 * $i + 2); - //$cl = self::getUInt2d($recordData, 2 + 6 * $i + 4); + // offset: 2; size: 2; index to column + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); - // not sure why two column indexes are necessary? - $this->phpSheet->setBreak([$cf + 1, $r], Worksheet::BREAK_ROW); - } - } - } + $cell = null; + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: 4; size: 2; index to XF record + $xfIndex = self::getUInt2d($recordData, 4); - /** - * Read VERTICALPAGEBREAKS record. 
- */ - private function readVerticalPageBreaks(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; + // offset: 6; size: 4; index to SST record + $index = self::getInt4d($recordData, 6); - if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { - // offset: 0; size: 2; number of the following column index structures - $nm = self::getUInt2d($recordData, 0); + // add cell + if (($fmtRuns = $this->sst[$index]['fmtRuns']) && !$this->readDataOnly) { + // then we should treat as rich text + $richText = new RichText(); + $charPos = 0; + $sstCount = count($this->sst[$index]['fmtRuns']); + for ($i = 0; $i <= $sstCount; ++$i) { + if (isset($fmtRuns[$i])) { + $text = StringHelper::substring($this->sst[$index]['value'], $charPos, $fmtRuns[$i]['charPos'] - $charPos); + $charPos = $fmtRuns[$i]['charPos']; + } else { + $text = StringHelper::substring($this->sst[$index]['value'], $charPos, StringHelper::countCharacters($this->sst[$index]['value'])); + } - // offset: 2; size: 6 * $nm; list of $nm row index structures - for ($i = 0; $i < $nm; ++$i) { - $c = self::getUInt2d($recordData, 2 + 6 * $i); - $rf = self::getUInt2d($recordData, 2 + 6 * $i + 2); - //$rl = self::getUInt2d($recordData, 2 + 6 * $i + 4); + if (StringHelper::countCharacters($text) > 0) { + if ($i == 0) { // first text run, no style + $richText->createText($text); + } else { + $textRun = $richText->createTextRun($text); + if (isset($fmtRuns[$i - 1])) { + if ($fmtRuns[$i - 1]['fontIndex'] < 4) { + $fontIndex = $fmtRuns[$i - 1]['fontIndex']; + } else { + // this has to do with that index 4 is omitted in all BIFF versions for some strange reason + // check the OpenOffice documentation of the FONT record + $fontIndex = $fmtRuns[$i - 1]['fontIndex'] - 1; + } + if (array_key_exists($fontIndex, $this->objFonts) === false) { + $fontIndex = 
count($this->objFonts) - 1; + } + $textRun->setFont(clone $this->objFonts[$fontIndex]); + } + } + } + } + if ($this->readEmptyCells || trim($richText->getPlainText()) !== '') { + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + $cell->setValueExplicit($richText, DataType::TYPE_STRING); + } + } else { + if ($this->readEmptyCells || trim($this->sst[$index]['value']) !== '') { + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + $cell->setValueExplicit($this->sst[$index]['value'], DataType::TYPE_STRING); + } + } - // not sure why two row indexes are necessary? - $this->phpSheet->setBreak([$c + 1, ($rf > 0) ? $rf : 1], Worksheet::BREAK_COLUMN); + if (!$this->readDataOnly && $cell !== null && isset($this->mapCellXfIndex[$xfIndex])) { + // add style information + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); } } } /** - * Read HEADER record. + * Read MULRK record + * This record represents a cell range containing RK value + * cells. All cells are located in the same row. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readHeader(): void + protected function readMulRk(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3118,52 +2747,52 @@ private function readHeader(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: var - // realized that $recordData can be empty even when record exists - if ($recordData) { - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringLong($recordData); - } else { - $string = $this->readByteStringShort($recordData); - } + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); - $this->phpSheet->getHeaderFooter()->setOddHeader($string['value']); - $this->phpSheet->getHeaderFooter()->setEvenHeader($string['value']); - } - } - } + // offset: 2; size: 2; index to first column + $colFirst = self::getUInt2d($recordData, 2); - /** - * Read FOOTER record. - */ - private function readFooter(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // offset: var; size: 2; index to last column + $colLast = self::getUInt2d($recordData, $length - 2); + $columns = $colLast - $colFirst + 1; - // move stream pointer to next record - $this->pos += 4 + $length; + // offset within record data + $offset = 4; - if (!$this->readDataOnly) { - // offset: 0; size: var - // realized that $recordData can be empty even when record exists - if ($recordData) { - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringLong($recordData); - } else { - $string = $this->readByteStringShort($recordData); + for ($i = 1; $i <= $columns; ++$i) { + $columnString = Coordinate::stringFromColumnIndex($colFirst + $i); + + // Read cell? 
+ if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: var; size: 2; index to XF record + $xfIndex = self::getUInt2d($recordData, $offset); + + // offset: var; size: 4; RK value + $numValue = self::getIEEE754(self::getInt4d($recordData, $offset + 2)); + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add style + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); } - $this->phpSheet->getHeaderFooter()->setOddFooter($string['value']); - $this->phpSheet->getHeaderFooter()->setEvenFooter($string['value']); + + // add cell value + $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); } + + $offset += 6; } } /** - * Read HCENTER record. + * Read NUMBER record + * This record represents a cell that contains a + * floating-point value. + * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readHcenter(): void + protected function readNumber(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3171,37 +2800,40 @@ private function readHcenter(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; 0 = print sheet left aligned, 1 = print sheet centered horizontally - $isHorizontalCentered = (bool) self::getUInt2d($recordData, 0); + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); - $this->phpSheet->getPageSetup()->setHorizontalCentered($isHorizontalCentered); - } - } + // offset: 2; size 2; index to column + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); - /** - * Read VCENTER record. 
- */ - private function readVcenter(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset 4; size: 2; index to XF record + $xfIndex = self::getUInt2d($recordData, 4); - // move stream pointer to next record - $this->pos += 4 + $length; + $numValue = self::extractNumber(substr($recordData, 6, 8)); - if (!$this->readDataOnly) { - // offset: 0; size: 2; 0 = print sheet aligned at top page border, 1 = print sheet vertically centered - $isVerticalCentered = (bool) self::getUInt2d($recordData, 0); + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add cell style + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } - $this->phpSheet->getPageSetup()->setVerticalCentered($isVerticalCentered); + // add cell value + $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); } } /** - * Read LEFTMARGIN record. + * Read FORMULA record + perhaps a following STRING record if formula result is a string + * This record contains the token array and the result of a + * formula cell. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readLeftMargin(): void + protected function readFormula(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3209,67 +2841,131 @@ private function readLeftMargin(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 8 - $this->phpSheet->getPageMargins()->setLeft(self::extractNumber($recordData)); - } - } + // offset: 0; size: 2; row index + $row = self::getUInt2d($recordData, 0); - /** - * Read RIGHTMARGIN record. - */ - private function readRightMargin(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // offset: 2; size: 2; col index + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); - // move stream pointer to next record - $this->pos += 4 + $length; + // offset: 20: size: variable; formula structure + $formulaStructure = substr($recordData, 20); - if (!$this->readDataOnly) { - // offset: 0; size: 8 - $this->phpSheet->getPageMargins()->setRight(self::extractNumber($recordData)); - } - } + // offset: 14: size: 2; option flags, recalculate always, recalculate on open etc. + $options = self::getUInt2d($recordData, 14); - /** - * Read TOPMARGIN record. 
- */ - private function readTopMargin(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // bit: 0; mask: 0x0001; 1 = recalculate always + // bit: 1; mask: 0x0002; 1 = calculate on open + // bit: 3; mask: 0x0008; 1 = part of a shared formula + $isPartOfSharedFormula = (bool) (0x0008 & $options); - // move stream pointer to next record - $this->pos += 4 + $length; + // WARNING: + // We can apparently not rely on $isPartOfSharedFormula. Even when $isPartOfSharedFormula = true + // the formula data may be ordinary formula data, therefore we need to check + // explicitly for the tExp token (0x01) + $isPartOfSharedFormula = $isPartOfSharedFormula && ord($formulaStructure[2]) == 0x01; - if (!$this->readDataOnly) { - // offset: 0; size: 8 - $this->phpSheet->getPageMargins()->setTop(self::extractNumber($recordData)); + if ($isPartOfSharedFormula) { + // part of shared formula which means there will be a formula with a tExp token and nothing else + // get the base cell, grab tExp token + $baseRow = self::getUInt2d($formulaStructure, 3); + $baseCol = self::getUInt2d($formulaStructure, 5); + $this->baseCell = Coordinate::stringFromColumnIndex($baseCol + 1) . ($baseRow + 1); } - } - /** - * Read BOTTOMMARGIN record. - */ - private function readBottomMargin(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + if ($isPartOfSharedFormula) { + // formula is added to this cell after the sheet has been read + $this->sharedFormulaParts[$columnString . 
($row + 1)] = $this->baseCell; + } - // move stream pointer to next record - $this->pos += 4 + $length; + // offset: 16: size: 4; not used - if (!$this->readDataOnly) { - // offset: 0; size: 8 - $this->phpSheet->getPageMargins()->setBottom(self::extractNumber($recordData)); + // offset: 4; size: 2; XF index + $xfIndex = self::getUInt2d($recordData, 4); + + // offset: 6; size: 8; result of the formula + if ((ord($recordData[6]) == 0) && (ord($recordData[12]) == 255) && (ord($recordData[13]) == 255)) { + // String formula. Result follows in appended STRING record + $dataType = DataType::TYPE_STRING; + + // read possible SHAREDFMLA record + $code = self::getUInt2d($this->data, $this->pos); + if ($code == self::XLS_TYPE_SHAREDFMLA) { + $this->readSharedFmla(); + } + + // read STRING record + $value = $this->readString(); + } elseif ( + (ord($recordData[6]) == 1) + && (ord($recordData[12]) == 255) + && (ord($recordData[13]) == 255) + ) { + // Boolean formula. Result is in +2; 0=false, 1=true + $dataType = DataType::TYPE_BOOL; + $value = (bool) ord($recordData[8]); + } elseif ( + (ord($recordData[6]) == 2) + && (ord($recordData[12]) == 255) + && (ord($recordData[13]) == 255) + ) { + // Error formula. Error code is in +2 + $dataType = DataType::TYPE_ERROR; + $value = Xls\ErrorCode::lookup(ord($recordData[8])); + } elseif ( + (ord($recordData[6]) == 3) + && (ord($recordData[12]) == 255) + && (ord($recordData[13]) == 255) + ) { + // Formula result is a null string + $dataType = DataType::TYPE_NULL; + $value = ''; + } else { + // formula result is a number, first 14 bytes like _NUMBER record + $dataType = DataType::TYPE_NUMERIC; + $value = self::extractNumber(substr($recordData, 6, 8)); + } + + $cell = $this->phpSheet->getCell($columnString . 
($row + 1)); + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add cell style + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } + + // store the formula + if (!$isPartOfSharedFormula) { + // not part of shared formula + // add cell value. If we can read formula, populate with formula, otherwise just used cached value + try { + if ($this->version != self::XLS_BIFF8) { + throw new Exception('Not BIFF8. Can only read BIFF8 formulas'); + } + $formula = $this->getFormulaFromStructure($formulaStructure); // get formula in human language + $cell->setValueExplicit('=' . $formula, DataType::TYPE_FORMULA); + } catch (PhpSpreadsheetException) { + $cell->setValueExplicit($value, $dataType); + } + } else { + if ($this->version == self::XLS_BIFF8) { + // do nothing at this point, formula id added later in the code + } else { + $cell->setValueExplicit($value, $dataType); + } + } + + // store the cached calculated value + $cell->setCalculatedValue($value, $dataType === DataType::TYPE_NUMERIC); } } /** - * Read PAGESETUP record. + * Read a SHAREDFMLA record. This function just stores the binary shared formula in the reader, + * which usually contains relative references. + * These will be used to construct the formula in each shared formula part after the sheet is read. 
*/ - private function readPageSetup(): void + protected function readSharedFmla(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3277,57 +2973,30 @@ private function readPageSetup(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; paper size - $paperSize = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; scaling factor - $scale = self::getUInt2d($recordData, 2); - - // offset: 6; size: 2; fit worksheet width to this number of pages, 0 = use as many as needed - $fitToWidth = self::getUInt2d($recordData, 6); - - // offset: 8; size: 2; fit worksheet height to this number of pages, 0 = use as many as needed - $fitToHeight = self::getUInt2d($recordData, 8); - - // offset: 10; size: 2; option flags - - // bit: 0; mask: 0x0001; 0=down then over, 1=over then down - $isOverThenDown = (0x0001 & self::getUInt2d($recordData, 10)); - - // bit: 1; mask: 0x0002; 0=landscape, 1=portrait - $isPortrait = (0x0002 & self::getUInt2d($recordData, 10)) >> 1; - - // bit: 2; mask: 0x0004; 1= paper size, scaling factor, paper orient. not init - // when this bit is set, do not use flags for those properties - $isNotInit = (0x0004 & self::getUInt2d($recordData, 10)) >> 2; + // offset: 0, size: 6; cell range address of the area used by the shared formula, not used for anything + //$cellRange = substr($recordData, 0, 6); + //$cellRange = Xls\Biff5::readBIFF5CellRangeAddressFixed($cellRange); // note: even BIFF8 uses BIFF5 syntax - if (!$isNotInit) { - $this->phpSheet->getPageSetup()->setPaperSize($paperSize); - $this->phpSheet->getPageSetup()->setPageOrder(((bool) $isOverThenDown) ? PageSetup::PAGEORDER_OVER_THEN_DOWN : PageSetup::PAGEORDER_DOWN_THEN_OVER); - $this->phpSheet->getPageSetup()->setOrientation(((bool) $isPortrait) ? 
PageSetup::ORIENTATION_PORTRAIT : PageSetup::ORIENTATION_LANDSCAPE); + // offset: 6, size: 1; not used - $this->phpSheet->getPageSetup()->setScale($scale, false); - $this->phpSheet->getPageSetup()->setFitToPage((bool) $this->isFitToPages); - $this->phpSheet->getPageSetup()->setFitToWidth($fitToWidth, false); - $this->phpSheet->getPageSetup()->setFitToHeight($fitToHeight, false); - } + // offset: 7, size: 1; number of existing FORMULA records for this shared formula + //$no = ord($recordData[7]); - // offset: 16; size: 8; header margin (IEEE 754 floating-point value) - $marginHeader = self::extractNumber(substr($recordData, 16, 8)); - $this->phpSheet->getPageMargins()->setHeader($marginHeader); + // offset: 8, size: var; Binary token array of the shared formula + $formula = substr($recordData, 8); - // offset: 24; size: 8; footer margin (IEEE 754 floating-point value) - $marginFooter = self::extractNumber(substr($recordData, 24, 8)); - $this->phpSheet->getPageMargins()->setFooter($marginFooter); - } + // at this point we only store the shared formula for later use + $this->sharedFormulas[$this->baseCell] = $formula; } /** - * PROTECT - Sheet protection (BIFF2 through BIFF8) - * if this record is omitted, then it also means no sheet protection. + * Read a STRING record from current stream position and advance the stream pointer to next record + * This record is used for storing result from FORMULA record when it is a string, and + * it occurs directly after the FORMULA record. 
+ * + * @return string The string contents as UTF-8 */ - private function readProtect(): void + protected function readString(): string { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3335,21 +3004,26 @@ private function readProtect(): void // move stream pointer to next record $this->pos += 4 + $length; - if ($this->readDataOnly) { - return; + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringLong($recordData); + $value = $string['value']; + } else { + $string = $this->readByteStringLong($recordData); + $value = $string['value']; } - // offset: 0; size: 2; - - // bit 0, mask 0x01; 1 = sheet is protected - $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; - $this->phpSheet->getProtection()->setSheet((bool) $bool); + return $value; } /** - * SCENPROTECT. + * Read BOOLERR record + * This record represents a Boolean value or error value + * cell. + * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readScenProtect(): void + protected function readBoolErr(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3357,45 +3031,58 @@ private function readScenProtect(): void // move stream pointer to next record $this->pos += 4 + $length; - if ($this->readDataOnly) { - return; - } + // offset: 0; size: 2; row index + $row = self::getUInt2d($recordData, 0); - // offset: 0; size: 2; + // offset: 2; size: 2; column index + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); - // bit: 0, mask 0x01; 1 = scenarios are protected - $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; + // Read cell? 
+ if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: 4; size: 2; index to XF record + $xfIndex = self::getUInt2d($recordData, 4); - $this->phpSheet->getProtection()->setScenarios((bool) $bool); - } + // offset: 6; size: 1; the boolean value or error value + $boolErr = ord($recordData[6]); - /** - * OBJECTPROTECT. - */ - private function readObjectProtect(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + // offset: 7; size: 1; 0=boolean; 1=error + $isError = ord($recordData[7]); - // move stream pointer to next record - $this->pos += 4 + $length; + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + switch ($isError) { + case 0: // boolean + $value = (bool) $boolErr; - if ($this->readDataOnly) { - return; - } + // add cell value + $cell->setValueExplicit($value, DataType::TYPE_BOOL); - // offset: 0; size: 2; + break; + case 1: // error type + $value = Xls\ErrorCode::lookup($boolErr); - // bit: 0, mask 0x01; 1 = objects are protected - $bool = (0x01 & self::getUInt2d($recordData, 0)) >> 0; + // add cell value + $cell->setValueExplicit($value, DataType::TYPE_ERROR); - $this->phpSheet->getProtection()->setObjects((bool) $bool); + break; + } + + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add cell style + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } + } } /** - * PASSWORD - Sheet protection (hashed) password (BIFF2 through BIFF8). + * Read MULBLANK record + * This record represents a cell range of empty cells. All + * cells are located in the same row. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readPassword(): void + protected function readMulBlank(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3403,17 +3090,42 @@ private function readPassword(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; 16-bit hash value of password - $password = strtoupper(dechex(self::getUInt2d($recordData, 0))); // the hashed password - $this->phpSheet->getProtection()->setPassword($password, true); + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); + + // offset: 2; size: 2; index to first column + $fc = self::getUInt2d($recordData, 2); + + // offset: 4; size: 2 x nc; list of indexes to XF records + // add style information + if (!$this->readDataOnly && $this->readEmptyCells) { + for ($i = 0; $i < $length / 2 - 3; ++$i) { + $columnString = Coordinate::stringFromColumnIndex($fc + $i + 1); + + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + $xfIndex = self::getUInt2d($recordData, 4 + 2 * $i); + if (isset($this->mapCellXfIndex[$xfIndex])) { + $this->phpSheet->getCell($columnString . ($row + 1))->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } + } + } } + + // offset: 6; size 2; index to last column (not needed) } /** - * Read DEFCOLWIDTH record. + * Read LABEL record + * This record represents a cell that contains a string. In + * BIFF8 it is usually replaced by the LABELSST record. + * Excel still uses this record, if it copies unformatted + * text cells to the clipboard. 
+ * + * -- "OpenOffice.org's Documentation of the Microsoft + * Excel File Format" */ - private function readDefColWidth(): void + protected function readLabel(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3421,17 +3133,43 @@ private function readDefColWidth(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; default column width - $width = self::getUInt2d($recordData, 0); - if ($width != 8) { - $this->phpSheet->getDefaultColumnDimension()->setWidth($width); + // offset: 0; size: 2; index to row + $row = self::getUInt2d($recordData, 0); + + // offset: 2; size: 2; index to column + $column = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($column + 1); + + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: 4; size: 2; XF index + $xfIndex = self::getUInt2d($recordData, 4); + + // add cell value + // todo: what if string is very long? continue record + if ($this->version == self::XLS_BIFF8) { + $string = self::readUnicodeStringLong(substr($recordData, 6)); + $value = $string['value']; + } else { + $string = $this->readByteStringLong(substr($recordData, 6)); + $value = $string['value']; + } + if ($this->readEmptyCells || trim($value) !== '') { + $cell = $this->phpSheet->getCell($columnString . ($row + 1)); + $cell->setValueExplicit($value, DataType::TYPE_STRING); + + if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { + // add cell style + $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } + } } } /** - * Read COLINFO record. + * Read BLANK record. 
*/ - private function readColInfo(): void + protected function readBlank(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3439,59 +3177,83 @@ private function readColInfo(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; index to first column in range - $firstColumnIndex = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; row index + $row = self::getUInt2d($recordData, 0); - // offset: 2; size: 2; index to last column in range - $lastColumnIndex = self::getUInt2d($recordData, 2); + // offset: 2; size: 2; col index + $col = self::getUInt2d($recordData, 2); + $columnString = Coordinate::stringFromColumnIndex($col + 1); - // offset: 4; size: 2; width of the column in 1/256 of the width of the zero character - $width = self::getUInt2d($recordData, 4); + // Read cell? + if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { + // offset: 4; size: 2; XF index + $xfIndex = self::getUInt2d($recordData, 4); - // offset: 6; size: 2; index to XF record for default column formatting - $xfIndex = self::getUInt2d($recordData, 6); + // add style information + if (!$this->readDataOnly && $this->readEmptyCells && isset($this->mapCellXfIndex[$xfIndex])) { + $this->phpSheet->getCell($columnString . ($row + 1))->setXfIndex($this->mapCellXfIndex[$xfIndex]); + } + } + } - // offset: 8; size: 2; option flags - // bit: 0; mask: 0x0001; 1= columns are hidden - $isHidden = (0x0001 & self::getUInt2d($recordData, 8)) >> 0; + /** + * Read MSODRAWING record. 
+ */ + protected function readMsoDrawing(): void + { + //$length = self::getUInt2d($this->data, $this->pos + 2); - // bit: 10-8; mask: 0x0700; outline level of the columns (0 = no outline) - $level = (0x0700 & self::getUInt2d($recordData, 8)) >> 8; + // get spliced record data + $splicedRecordData = $this->getSplicedRecordData(); + $recordData = $splicedRecordData['recordData']; - // bit: 12; mask: 0x1000; 1 = collapsed - $isCollapsed = (bool) ((0x1000 & self::getUInt2d($recordData, 8)) >> 12); + $this->drawingData .= $recordData; + } - // offset: 10; size: 2; not used + /** + * Read OBJ record. + */ + protected function readObj(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - for ($i = $firstColumnIndex + 1; $i <= $lastColumnIndex + 1; ++$i) { - if ($lastColumnIndex == 255 || $lastColumnIndex == 256) { - $this->phpSheet->getDefaultColumnDimension()->setWidth($width / 256); + // move stream pointer to next record + $this->pos += 4 + $length; - break; - } - $this->phpSheet->getColumnDimensionByColumn($i)->setWidth($width / 256); - $this->phpSheet->getColumnDimensionByColumn($i)->setVisible(!$isHidden); - $this->phpSheet->getColumnDimensionByColumn($i)->setOutlineLevel($level); - $this->phpSheet->getColumnDimensionByColumn($i)->setCollapsed($isCollapsed); - if (isset($this->mapCellXfIndex[$xfIndex])) { - $this->phpSheet->getColumnDimensionByColumn($i)->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - } + if ($this->readDataOnly || $this->version != self::XLS_BIFF8) { + return; } + + // recordData consists of an array of subrecords looking like this: + // ft: 2 bytes; ftCmo type (0x15) + // cb: 2 bytes; size in bytes of ftCmo data + // ot: 2 bytes; Object Type + // id: 2 bytes; Object id number + // grbit: 2 bytes; Option Flags + // data: var; subrecord data + + // for now, we are just interested in the second subrecord containing the object type + $ftCmoType = 
self::getUInt2d($recordData, 0); + $cbCmoSize = self::getUInt2d($recordData, 2); + $otObjType = self::getUInt2d($recordData, 4); + $idObjID = self::getUInt2d($recordData, 6); + $grbitOpts = self::getUInt2d($recordData, 8); + + $this->objs[] = [ + 'ftCmoType' => $ftCmoType, + 'cbCmoSize' => $cbCmoSize, + 'otObjType' => $otObjType, + 'idObjID' => $idObjID, + 'grbitOpts' => $grbitOpts, + ]; + $this->textObjRef = $idObjID; } /** - * ROW. - * - * This record contains the properties of a single row in a - * sheet. Rows and cells in a sheet are divided into blocks - * of 32 rows. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read WINDOW2 record. */ - private function readRow(): void + protected function readWindow2(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3499,68 +3261,84 @@ private function readRow(): void // move stream pointer to next record $this->pos += 4 + $length; - if (!$this->readDataOnly) { - // offset: 0; size: 2; index of this row - $r = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; index to column of the first cell which is described by a cell record - - // offset: 4; size: 2; index to column of the last cell which is described by a cell record, increased by 1 + // offset: 0; size: 2; option flags + $options = self::getUInt2d($recordData, 0); - // offset: 6; size: 2; + // offset: 2; size: 2; index to first visible row + //$firstVisibleRow = self::getUInt2d($recordData, 2); - // bit: 14-0; mask: 0x7FFF; height of the row, in twips = 1/20 of a point - $height = (0x7FFF & self::getUInt2d($recordData, 6)) >> 0; + // offset: 4; size: 2; index to first visible column + //$firstVisibleColumn = self::getUInt2d($recordData, 4); + $zoomscaleInPageBreakPreview = 0; + $zoomscaleInNormalView = 0; + if ($this->version === self::XLS_BIFF8) { + // offset: 8; size: 2; not used + // offset: 10; size: 2; cached 
magnification factor in page break preview (in percent); 0 = Default (60%) + // offset: 12; size: 2; cached magnification factor in normal view (in percent); 0 = Default (100%) + // offset: 14; size: 4; not used + if (!isset($recordData[10])) { + $zoomscaleInPageBreakPreview = 0; + } else { + $zoomscaleInPageBreakPreview = self::getUInt2d($recordData, 10); + } - // bit: 15: mask: 0x8000; 0 = row has custom height; 1= row has default height - $useDefaultHeight = (0x8000 & self::getUInt2d($recordData, 6)) >> 15; + if ($zoomscaleInPageBreakPreview === 0) { + $zoomscaleInPageBreakPreview = 60; + } - if (!$useDefaultHeight) { - $this->phpSheet->getRowDimension($r + 1)->setRowHeight($height / 20); + if (!isset($recordData[12])) { + $zoomscaleInNormalView = 0; + } else { + $zoomscaleInNormalView = self::getUInt2d($recordData, 12); } - // offset: 8; size: 2; not used + if ($zoomscaleInNormalView === 0) { + $zoomscaleInNormalView = 100; + } + } - // offset: 10; size: 2; not used in BIFF5-BIFF8 + // bit: 1; mask: 0x0002; 0 = do not show gridlines, 1 = show gridlines + $showGridlines = (bool) ((0x0002 & $options) >> 1); + $this->phpSheet->setShowGridlines($showGridlines); - // offset: 12; size: 4; option flags and default row formatting + // bit: 2; mask: 0x0004; 0 = do not show headers, 1 = show headers + $showRowColHeaders = (bool) ((0x0004 & $options) >> 2); + $this->phpSheet->setShowRowColHeaders($showRowColHeaders); - // bit: 2-0: mask: 0x00000007; outline level of the row - $level = (0x00000007 & self::getInt4d($recordData, 12)) >> 0; - $this->phpSheet->getRowDimension($r + 1)->setOutlineLevel($level); + // bit: 3; mask: 0x0008; 0 = panes are not frozen, 1 = panes are frozen + $this->frozen = (bool) ((0x0008 & $options) >> 3); - // bit: 4; mask: 0x00000010; 1 = outline group start or ends here... 
and is collapsed - $isCollapsed = (bool) ((0x00000010 & self::getInt4d($recordData, 12)) >> 4); - $this->phpSheet->getRowDimension($r + 1)->setCollapsed($isCollapsed); + // bit: 6; mask: 0x0040; 0 = columns from left to right, 1 = columns from right to left + $this->phpSheet->setRightToLeft((bool) ((0x0040 & $options) >> 6)); - // bit: 5; mask: 0x00000020; 1 = row is hidden - $isHidden = (0x00000020 & self::getInt4d($recordData, 12)) >> 5; - $this->phpSheet->getRowDimension($r + 1)->setVisible(!$isHidden); + // bit: 10; mask: 0x0400; 0 = sheet not active, 1 = sheet active + $isActive = (bool) ((0x0400 & $options) >> 10); + if ($isActive) { + $this->spreadsheet->setActiveSheetIndex($this->spreadsheet->getIndex($this->phpSheet)); + $this->activeSheetSet = true; + } - // bit: 7; mask: 0x00000080; 1 = row has explicit format - $hasExplicitFormat = (0x00000080 & self::getInt4d($recordData, 12)) >> 7; + // bit: 11; mask: 0x0800; 0 = normal view, 1 = page break view + $isPageBreakPreview = (bool) ((0x0800 & $options) >> 11); - // bit: 27-16; mask: 0x0FFF0000; only applies when hasExplicitFormat = 1; index to XF record - $xfIndex = (0x0FFF0000 & self::getInt4d($recordData, 12)) >> 16; + //FIXME: set $firstVisibleRow and $firstVisibleColumn - if ($hasExplicitFormat && isset($this->mapCellXfIndex[$xfIndex])) { - $this->phpSheet->getRowDimension($r + 1)->setXfIndex($this->mapCellXfIndex[$xfIndex]); + if ($this->phpSheet->getSheetView()->getView() !== SheetView::SHEETVIEW_PAGE_LAYOUT) { + //NOTE: this setting is inferior to page layout view(Excel2007-) + $view = $isPageBreakPreview ? SheetView::SHEETVIEW_PAGE_BREAK_PREVIEW : SheetView::SHEETVIEW_NORMAL; + $this->phpSheet->getSheetView()->setView($view); + if ($this->version === self::XLS_BIFF8) { + $zoomScale = $isPageBreakPreview ? 
$zoomscaleInPageBreakPreview : $zoomscaleInNormalView; + $this->phpSheet->getSheetView()->setZoomScale($zoomScale); + $this->phpSheet->getSheetView()->setZoomScaleNormal($zoomscaleInNormalView); } } } /** - * Read RK record - * This record represents a cell that contains an RK value - * (encoded integer or floating-point value). If a - * floating-point value cannot be encoded to an RK value, - * a NUMBER record will be written. This record replaces the - * record INTEGER written in BIFF2. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read PLV Record(Created by Excel2007 or upper). */ - private function readRk(): void + protected function readPageLayoutView(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3568,43 +3346,36 @@ private function readRk(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; rt + //->ignore + //$rt = self::getUInt2d($recordData, 0); + // offset: 2; size: 2; grbitfr + //->ignore + //$grbitFrt = self::getUInt2d($recordData, 2); + // offset: 4; size: 8; reserved + //->ignore - // offset: 2; size: 2; index to column - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); + // offset: 12; size 2; zoom scale + $wScalePLV = self::getUInt2d($recordData, 12); + // offset: 14; size 2; grbit + $grbit = self::getUInt2d($recordData, 14); - // Read cell? 
- if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: 4; size: 2; index to XF record - $xfIndex = self::getUInt2d($recordData, 4); + // decompose grbit + $fPageLayoutView = $grbit & 0x01; + //$fRulerVisible = ($grbit >> 1) & 0x01; //no support + //$fWhitespaceHidden = ($grbit >> 3) & 0x01; //no support - // offset: 6; size: 4; RK value - $rknum = self::getInt4d($recordData, 6); - $numValue = self::getIEEE754($rknum); - - $cell = $this->phpSheet->getCell($columnString . ($row + 1)); - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add style information - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - - // add cell - $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); + if ($fPageLayoutView === 1) { + $this->phpSheet->getSheetView()->setView(SheetView::SHEETVIEW_PAGE_LAYOUT); + $this->phpSheet->getSheetView()->setZoomScale($wScalePLV); //set by Excel2007 only if SHEETVIEW_PAGE_LAYOUT } + //otherwise, we cannot know whether SHEETVIEW_PAGE_LAYOUT or SHEETVIEW_PAGE_BREAK_PREVIEW. } /** - * Read LABELSST record - * This record represents a cell that contains a string. It - * replaces the LABEL record and RSTRING record used in - * BIFF2-BIFF5. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read SCL record. 
*/ - private function readLabelSst(): void + protected function readScl(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3612,137 +3383,134 @@ private function readLabelSst(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); + // offset: 0; size: 2; numerator of the view magnification + $numerator = self::getUInt2d($recordData, 0); - // offset: 2; size: 2; index to column - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); + // offset: 2; size: 2; denominator of the view magnification + $denumerator = self::getUInt2d($recordData, 2); - $cell = null; - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: 4; size: 2; index to XF record - $xfIndex = self::getUInt2d($recordData, 4); + // set the zoom scale (in percent) + $this->phpSheet->getSheetView()->setZoomScale($numerator * 100 / $denumerator); + } - // offset: 6; size: 4; index to SST record - $index = self::getInt4d($recordData, 6); + /** + * Read PANE record. 
+ */ + protected function readPane(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // add cell - if (($fmtRuns = $this->sst[$index]['fmtRuns']) && !$this->readDataOnly) { - // then we should treat as rich text - $richText = new RichText(); - $charPos = 0; - $sstCount = count($this->sst[$index]['fmtRuns']); - for ($i = 0; $i <= $sstCount; ++$i) { - if (isset($fmtRuns[$i])) { - $text = StringHelper::substring($this->sst[$index]['value'], $charPos, $fmtRuns[$i]['charPos'] - $charPos); - $charPos = $fmtRuns[$i]['charPos']; - } else { - $text = StringHelper::substring($this->sst[$index]['value'], $charPos, StringHelper::countCharacters($this->sst[$index]['value'])); - } + // move stream pointer to next record + $this->pos += 4 + $length; - if (StringHelper::countCharacters($text) > 0) { - if ($i == 0) { // first text run, no style - $richText->createText($text); - } else { - $textRun = $richText->createTextRun($text); - if (isset($fmtRuns[$i - 1])) { - if ($fmtRuns[$i - 1]['fontIndex'] < 4) { - $fontIndex = $fmtRuns[$i - 1]['fontIndex']; - } else { - // this has to do with that index 4 is omitted in all BIFF versions for some stra nge reason - // check the OpenOffice documentation of the FONT record - $fontIndex = $fmtRuns[$i - 1]['fontIndex'] - 1; - } - if (array_key_exists($fontIndex, $this->objFonts) === false) { - $fontIndex = count($this->objFonts) - 1; - } - $textRun->setFont(clone $this->objFonts[$fontIndex]); - } - } - } - } - if ($this->readEmptyCells || trim($richText->getPlainText()) !== '') { - $cell = $this->phpSheet->getCell($columnString . ($row + 1)); - $cell->setValueExplicit($richText, DataType::TYPE_STRING); - } - } else { - if ($this->readEmptyCells || trim($this->sst[$index]['value']) !== '') { - $cell = $this->phpSheet->getCell($columnString . 
($row + 1)); - $cell->setValueExplicit($this->sst[$index]['value'], DataType::TYPE_STRING); - } - } + if (!$this->readDataOnly) { + // offset: 0; size: 2; position of vertical split + $px = self::getUInt2d($recordData, 0); - if (!$this->readDataOnly && $cell !== null && isset($this->mapCellXfIndex[$xfIndex])) { - // add style information - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + // offset: 2; size: 2; position of horizontal split + $py = self::getUInt2d($recordData, 2); + + // offset: 4; size: 2; top most visible row in the bottom pane + $rwTop = self::getUInt2d($recordData, 4); + + // offset: 6; size: 2; first visible left column in the right pane + $colLeft = self::getUInt2d($recordData, 6); + + if ($this->frozen) { + // frozen panes + $cell = Coordinate::stringFromColumnIndex($px + 1) . ($py + 1); + $topLeftCell = Coordinate::stringFromColumnIndex($colLeft + 1) . ($rwTop + 1); + $this->phpSheet->freezePane($cell, $topLeftCell); } + // unfrozen panes; split windows; not supported by PhpSpreadsheet core } } /** - * Read MULRK record - * This record represents a cell range containing RK value - * cells. All cells are located in the same row. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read SELECTION record. There is one such record for each pane in the sheet. 
*/ - private function readMulRk(): void + protected function readSelection(): string { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); + $selectedCells = ''; // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); + if (!$this->readDataOnly) { + // offset: 0; size: 1; pane identifier + //$paneId = ord($recordData[0]); - // offset: 2; size: 2; index to first column - $colFirst = self::getUInt2d($recordData, 2); + // offset: 1; size: 2; index to row of the active cell + //$r = self::getUInt2d($recordData, 1); - // offset: var; size: 2; index to last column - $colLast = self::getUInt2d($recordData, $length - 2); - $columns = $colLast - $colFirst + 1; + // offset: 3; size: 2; index to column of the active cell + //$c = self::getUInt2d($recordData, 3); - // offset within record data - $offset = 4; + // offset: 5; size: 2; index into the following cell range list to the + // entry that contains the active cell + //$index = self::getUInt2d($recordData, 5); - for ($i = 1; $i <= $columns; ++$i) { - $columnString = Coordinate::stringFromColumnIndex($colFirst + $i); + // offset: 7; size: var; cell range address list containing all selected cell ranges + $data = substr($recordData, 7); + $cellRangeAddressList = Xls\Biff5::readBIFF5CellRangeAddressList($data); // note: also BIFF8 uses BIFF5 syntax - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: var; size: 2; index to XF record - $xfIndex = self::getUInt2d($recordData, $offset); + $selectedCells = $cellRangeAddressList['cellRangeAddresses'][0]; - // offset: var; size: 4; RK value - $numValue = self::getIEEE754(self::getInt4d($recordData, $offset + 2)); - $cell = $this->phpSheet->getCell($columnString . 
($row + 1)); - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add style - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } + // first row '1' + last row '16384' indicates that full column is selected (apparently also in BIFF8!) + if (preg_match('/^([A-Z]+1\:[A-Z]+)16384$/', $selectedCells)) { + $selectedCells = (string) preg_replace('/^([A-Z]+1\:[A-Z]+)16384$/', '${1}1048576', $selectedCells); + } - // add cell value - $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); + // first row '1' + last row '65536' indicates that full column is selected + if (preg_match('/^([A-Z]+1\:[A-Z]+)65536$/', $selectedCells)) { + $selectedCells = (string) preg_replace('/^([A-Z]+1\:[A-Z]+)65536$/', '${1}1048576', $selectedCells); } - $offset += 6; + // first column 'A' + last column 'IV' indicates that full row is selected + if (preg_match('/^(A\d+\:)IV(\d+)$/', $selectedCells)) { + $selectedCells = (string) preg_replace('/^(A\d+\:)IV(\d+)$/', '${1}XFD${2}', $selectedCells); + } + + $this->phpSheet->setSelectedCells($selectedCells); + } + + return $selectedCells; + } + + private function includeCellRangeFiltered(string $cellRangeAddress): bool + { + $includeCellRange = true; + if ($this->getReadFilter() !== null) { + $includeCellRange = false; + $rangeBoundaries = Coordinate::getRangeBoundaries($cellRangeAddress); + ++$rangeBoundaries[1][0]; + for ($row = $rangeBoundaries[0][1]; $row <= $rangeBoundaries[1][1]; ++$row) { + for ($column = $rangeBoundaries[0][0]; $column != $rangeBoundaries[1][0]; ++$column) { + if ($this->getReadFilter()->readCell($column, $row, $this->phpSheet->getTitle())) { + $includeCellRange = true; + + break 2; + } + } + } } + + return $includeCellRange; } /** - * Read NUMBER record - * This record represents a cell that contains a - * floating-point value. + * MERGEDCELLS. + * + * This record contains the addresses of merged cell ranges + * in the current sheet. 
* * -- "OpenOffice.org's Documentation of the Microsoft * Excel File Format" */ - private function readNumber(): void + protected function readMergedCells(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); @@ -3750,2618 +3518,971 @@ private function readNumber(): void // move stream pointer to next record $this->pos += 4 + $length; - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); - - // offset: 2; size 2; index to column - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); - - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset 4; size: 2; index to XF record - $xfIndex = self::getUInt2d($recordData, 4); - - $numValue = self::extractNumber(substr($recordData, 6, 8)); - - $cell = $this->phpSheet->getCell($columnString . ($row + 1)); - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add cell style - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); + if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { + $cellRangeAddressList = Xls\Biff8::readBIFF8CellRangeAddressList($recordData); + foreach ($cellRangeAddressList['cellRangeAddresses'] as $cellRangeAddress) { + if ( + (str_contains($cellRangeAddress, ':')) + && ($this->includeCellRangeFiltered($cellRangeAddress)) + ) { + $this->phpSheet->mergeCells($cellRangeAddress, Worksheet::MERGE_CELL_CONTENT_HIDE); + } } - - // add cell value - $cell->setValueExplicit($numValue, DataType::TYPE_NUMERIC); } } /** - * Read FORMULA record + perhaps a following STRING record if formula result is a string - * This record contains the token array and the result of a - * formula cell. - * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" + * Read HYPERLINK record. 
*/ - private function readFormula(): void + protected function readHyperLink(): void { $length = self::getUInt2d($this->data, $this->pos + 2); $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - // move stream pointer to next record + // move stream pointer forward to next record $this->pos += 4 + $length; - // offset: 0; size: 2; row index - $row = self::getUInt2d($recordData, 0); + if (!$this->readDataOnly) { + // offset: 0; size: 8; cell range address of all cells containing this hyperlink + try { + $cellRange = Xls\Biff8::readBIFF8CellRangeAddressFixed($recordData); + } catch (PhpSpreadsheetException) { + return; + } - // offset: 2; size: 2; col index - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); + // offset: 8, size: 16; GUID of StdLink - // offset: 20: size: variable; formula structure - $formulaStructure = substr($recordData, 20); + // offset: 24, size: 4; unknown value - // offset: 14: size: 2; option flags, recalculate always, recalculate on open etc. - $options = self::getUInt2d($recordData, 14); - - // bit: 0; mask: 0x0001; 1 = recalculate always - // bit: 1; mask: 0x0002; 1 = calculate on open - // bit: 2; mask: 0x0008; 1 = part of a shared formula - $isPartOfSharedFormula = (bool) (0x0008 & $options); - - // WARNING: - // We can apparently not rely on $isPartOfSharedFormula. Even when $isPartOfSharedFormula = true - // the formula data may be ordinary formula data, therefore we need to check - // explicitly for the tExp token (0x01) - $isPartOfSharedFormula = $isPartOfSharedFormula && ord($formulaStructure[2]) == 0x01; - - if ($isPartOfSharedFormula) { - // part of shared formula which means there will be a formula with a tExp token and nothing else - // get the base cell, grab tExp token - $baseRow = self::getUInt2d($formulaStructure, 3); - $baseCol = self::getUInt2d($formulaStructure, 5); - $this->baseCell = Coordinate::stringFromColumnIndex($baseCol + 1) . 
($baseRow + 1); - } - - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - if ($isPartOfSharedFormula) { - // formula is added to this cell after the sheet has been read - $this->sharedFormulaParts[$columnString . ($row + 1)] = $this->baseCell; - } - - // offset: 16: size: 4; not used - - // offset: 4; size: 2; XF index - $xfIndex = self::getUInt2d($recordData, 4); - - // offset: 6; size: 8; result of the formula - if ((ord($recordData[6]) == 0) && (ord($recordData[12]) == 255) && (ord($recordData[13]) == 255)) { - // String formula. Result follows in appended STRING record - $dataType = DataType::TYPE_STRING; - - // read possible SHAREDFMLA record - $code = self::getUInt2d($this->data, $this->pos); - if ($code == self::XLS_TYPE_SHAREDFMLA) { - $this->readSharedFmla(); - } - - // read STRING record - $value = $this->readString(); - } elseif ( - (ord($recordData[6]) == 1) - && (ord($recordData[12]) == 255) - && (ord($recordData[13]) == 255) - ) { - // Boolean formula. Result is in +2; 0=false, 1=true - $dataType = DataType::TYPE_BOOL; - $value = (bool) ord($recordData[8]); - } elseif ( - (ord($recordData[6]) == 2) - && (ord($recordData[12]) == 255) - && (ord($recordData[13]) == 255) - ) { - // Error formula. Error code is in +2 - $dataType = DataType::TYPE_ERROR; - $value = Xls\ErrorCode::lookup(ord($recordData[8])); - } elseif ( - (ord($recordData[6]) == 3) - && (ord($recordData[12]) == 255) - && (ord($recordData[13]) == 255) - ) { - // Formula result is a null string - $dataType = DataType::TYPE_NULL; - $value = ''; - } else { - // forumla result is a number, first 14 bytes like _NUMBER record - $dataType = DataType::TYPE_NUMERIC; - $value = self::extractNumber(substr($recordData, 6, 8)); - } - - $cell = $this->phpSheet->getCell($columnString . 
($row + 1)); - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add cell style - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - - // store the formula - if (!$isPartOfSharedFormula) { - // not part of shared formula - // add cell value. If we can read formula, populate with formula, otherwise just used cached value - try { - if ($this->version != self::XLS_BIFF8) { - throw new Exception('Not BIFF8. Can only read BIFF8 formulas'); - } - $formula = $this->getFormulaFromStructure($formulaStructure); // get formula in human language - $cell->setValueExplicit('=' . $formula, DataType::TYPE_FORMULA); - } catch (PhpSpreadsheetException) { - $cell->setValueExplicit($value, $dataType); - } - } else { - if ($this->version == self::XLS_BIFF8) { - // do nothing at this point, formula id added later in the code - } else { - $cell->setValueExplicit($value, $dataType); - } - } - - // store the cached calculated value - $cell->setCalculatedValue($value, $dataType === DataType::TYPE_NUMERIC); - } - } - - /** - * Read a SHAREDFMLA record. This function just stores the binary shared formula in the reader, - * which usually contains relative references. - * These will be used to construct the formula in each shared formula part after the sheet is read. 
- */ - private function readSharedFmla(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0, size: 6; cell range address of the area used by the shared formula, not used for anything - //$cellRange = substr($recordData, 0, 6); - //$cellRange = $this->readBIFF5CellRangeAddressFixed($cellRange); // note: even BIFF8 uses BIFF5 syntax - - // offset: 6, size: 1; not used - - // offset: 7, size: 1; number of existing FORMULA records for this shared formula - //$no = ord($recordData[7]); - - // offset: 8, size: var; Binary token array of the shared formula - $formula = substr($recordData, 8); - - // at this point we only store the shared formula for later use - $this->sharedFormulas[$this->baseCell] = $formula; - } - - /** - * Read a STRING record from current stream position and advance the stream pointer to next record - * This record is used for storing result from FORMULA record when it is a string, and - * it occurs directly after the FORMULA record. - * - * @return string The string contents as UTF-8 - */ - private function readString(): string - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringLong($recordData); - $value = $string['value']; - } else { - $string = $this->readByteStringLong($recordData); - $value = $string['value']; - } - - return $value; - } - - /** - * Read BOOLERR record - * This record represents a Boolean value or error value - * cell. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" - */ - private function readBoolErr(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; row index - $row = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; column index - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); - - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: 4; size: 2; index to XF record - $xfIndex = self::getUInt2d($recordData, 4); - - // offset: 6; size: 1; the boolean value or error value - $boolErr = ord($recordData[6]); - - // offset: 7; size: 1; 0=boolean; 1=error - $isError = ord($recordData[7]); - - $cell = $this->phpSheet->getCell($columnString . ($row + 1)); - switch ($isError) { - case 0: // boolean - $value = (bool) $boolErr; - - // add cell value - $cell->setValueExplicit($value, DataType::TYPE_BOOL); - - break; - case 1: // error type - $value = Xls\ErrorCode::lookup($boolErr); - - // add cell value - $cell->setValueExplicit($value, DataType::TYPE_ERROR); - - break; - } - - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add cell style - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - } - } - - /** - * Read MULBLANK record - * This record represents a cell range of empty cells. All - * cells are located in the same row. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" - */ - private function readMulBlank(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; index to first column - $fc = self::getUInt2d($recordData, 2); - - // offset: 4; size: 2 x nc; list of indexes to XF records - // add style information - if (!$this->readDataOnly && $this->readEmptyCells) { - for ($i = 0; $i < $length / 2 - 3; ++$i) { - $columnString = Coordinate::stringFromColumnIndex($fc + $i + 1); - - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - $xfIndex = self::getUInt2d($recordData, 4 + 2 * $i); - if (isset($this->mapCellXfIndex[$xfIndex])) { - $this->phpSheet->getCell($columnString . ($row + 1))->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - } - } - } - - // offset: 6; size 2; index to last column (not needed) - } - - /** - * Read LABEL record - * This record represents a cell that contains a string. In - * BIFF8 it is usually replaced by the LABELSST record. - * Excel still uses this record, if it copies unformatted - * text cells to the clipboard. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" - */ - private function readLabel(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; index to row - $row = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; index to column - $column = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($column + 1); - - // Read cell? - if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: 4; size: 2; XF index - $xfIndex = self::getUInt2d($recordData, 4); - - // add cell value - // todo: what if string is very long? continue record - if ($this->version == self::XLS_BIFF8) { - $string = self::readUnicodeStringLong(substr($recordData, 6)); - $value = $string['value']; - } else { - $string = $this->readByteStringLong(substr($recordData, 6)); - $value = $string['value']; - } - if ($this->readEmptyCells || trim($value) !== '') { - $cell = $this->phpSheet->getCell($columnString . ($row + 1)); - $cell->setValueExplicit($value, DataType::TYPE_STRING); - - if (!$this->readDataOnly && isset($this->mapCellXfIndex[$xfIndex])) { - // add cell style - $cell->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - } - } - } - - /** - * Read BLANK record. - */ - private function readBlank(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; row index - $row = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; col index - $col = self::getUInt2d($recordData, 2); - $columnString = Coordinate::stringFromColumnIndex($col + 1); - - // Read cell? 
- if (($this->getReadFilter() !== null) && $this->getReadFilter()->readCell($columnString, $row + 1, $this->phpSheet->getTitle())) { - // offset: 4; size: 2; XF index - $xfIndex = self::getUInt2d($recordData, 4); - - // add style information - if (!$this->readDataOnly && $this->readEmptyCells && isset($this->mapCellXfIndex[$xfIndex])) { - $this->phpSheet->getCell($columnString . ($row + 1))->setXfIndex($this->mapCellXfIndex[$xfIndex]); - } - } - } - - /** - * Read MSODRAWING record. - */ - private function readMsoDrawing(): void - { - //$length = self::getUInt2d($this->data, $this->pos + 2); - - // get spliced record data - $splicedRecordData = $this->getSplicedRecordData(); - $recordData = $splicedRecordData['recordData']; - - $this->drawingData .= $recordData; - } - - /** - * Read OBJ record. - */ - private function readObj(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if ($this->readDataOnly || $this->version != self::XLS_BIFF8) { - return; - } - - // recordData consists of an array of subrecords looking like this: - // ft: 2 bytes; ftCmo type (0x15) - // cb: 2 bytes; size in bytes of ftCmo data - // ot: 2 bytes; Object Type - // id: 2 bytes; Object id number - // grbit: 2 bytes; Option Flags - // data: var; subrecord data - - // for now, we are just interested in the second subrecord containing the object type - $ftCmoType = self::getUInt2d($recordData, 0); - $cbCmoSize = self::getUInt2d($recordData, 2); - $otObjType = self::getUInt2d($recordData, 4); - $idObjID = self::getUInt2d($recordData, 6); - $grbitOpts = self::getUInt2d($recordData, 6); - - $this->objs[] = [ - 'ftCmoType' => $ftCmoType, - 'cbCmoSize' => $cbCmoSize, - 'otObjType' => $otObjType, - 'idObjID' => $idObjID, - 'grbitOpts' => $grbitOpts, - ]; - $this->textObjRef = $idObjID; - } - - /** - * Read WINDOW2 record. 
- */ - private function readWindow2(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; option flags - $options = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; index to first visible row - //$firstVisibleRow = self::getUInt2d($recordData, 2); - - // offset: 4; size: 2; index to first visible colum - //$firstVisibleColumn = self::getUInt2d($recordData, 4); - $zoomscaleInPageBreakPreview = 0; - $zoomscaleInNormalView = 0; - if ($this->version === self::XLS_BIFF8) { - // offset: 8; size: 2; not used - // offset: 10; size: 2; cached magnification factor in page break preview (in percent); 0 = Default (60%) - // offset: 12; size: 2; cached magnification factor in normal view (in percent); 0 = Default (100%) - // offset: 14; size: 4; not used - if (!isset($recordData[10])) { - $zoomscaleInPageBreakPreview = 0; - } else { - $zoomscaleInPageBreakPreview = self::getUInt2d($recordData, 10); - } - - if ($zoomscaleInPageBreakPreview === 0) { - $zoomscaleInPageBreakPreview = 60; - } - - if (!isset($recordData[12])) { - $zoomscaleInNormalView = 0; - } else { - $zoomscaleInNormalView = self::getUInt2d($recordData, 12); - } - - if ($zoomscaleInNormalView === 0) { - $zoomscaleInNormalView = 100; - } - } - - // bit: 1; mask: 0x0002; 0 = do not show gridlines, 1 = show gridlines - $showGridlines = (bool) ((0x0002 & $options) >> 1); - $this->phpSheet->setShowGridlines($showGridlines); - - // bit: 2; mask: 0x0004; 0 = do not show headers, 1 = show headers - $showRowColHeaders = (bool) ((0x0004 & $options) >> 2); - $this->phpSheet->setShowRowColHeaders($showRowColHeaders); - - // bit: 3; mask: 0x0008; 0 = panes are not frozen, 1 = panes are frozen - $this->frozen = (bool) ((0x0008 & $options) >> 3); - - // bit: 6; mask: 0x0040; 0 = columns from left to right, 1 = columns from right 
to left - $this->phpSheet->setRightToLeft((bool) ((0x0040 & $options) >> 6)); - - // bit: 10; mask: 0x0400; 0 = sheet not active, 1 = sheet active - $isActive = (bool) ((0x0400 & $options) >> 10); - if ($isActive) { - $this->spreadsheet->setActiveSheetIndex($this->spreadsheet->getIndex($this->phpSheet)); - $this->activeSheetSet = true; - } - - // bit: 11; mask: 0x0800; 0 = normal view, 1 = page break view - $isPageBreakPreview = (bool) ((0x0800 & $options) >> 11); - - //FIXME: set $firstVisibleRow and $firstVisibleColumn - - if ($this->phpSheet->getSheetView()->getView() !== SheetView::SHEETVIEW_PAGE_LAYOUT) { - //NOTE: this setting is inferior to page layout view(Excel2007-) - $view = $isPageBreakPreview ? SheetView::SHEETVIEW_PAGE_BREAK_PREVIEW : SheetView::SHEETVIEW_NORMAL; - $this->phpSheet->getSheetView()->setView($view); - if ($this->version === self::XLS_BIFF8) { - $zoomScale = $isPageBreakPreview ? $zoomscaleInPageBreakPreview : $zoomscaleInNormalView; - $this->phpSheet->getSheetView()->setZoomScale($zoomScale); - $this->phpSheet->getSheetView()->setZoomScaleNormal($zoomscaleInNormalView); - } - } - } - - /** - * Read PLV Record(Created by Excel2007 or upper). 
- */ - private function readPageLayoutView(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; rt - //->ignore - //$rt = self::getUInt2d($recordData, 0); - // offset: 2; size: 2; grbitfr - //->ignore - //$grbitFrt = self::getUInt2d($recordData, 2); - // offset: 4; size: 8; reserved - //->ignore - - // offset: 12; size 2; zoom scale - $wScalePLV = self::getUInt2d($recordData, 12); - // offset: 14; size 2; grbit - $grbit = self::getUInt2d($recordData, 14); - - // decomprise grbit - $fPageLayoutView = $grbit & 0x01; - //$fRulerVisible = ($grbit >> 1) & 0x01; //no support - //$fWhitespaceHidden = ($grbit >> 3) & 0x01; //no support - - if ($fPageLayoutView === 1) { - $this->phpSheet->getSheetView()->setView(SheetView::SHEETVIEW_PAGE_LAYOUT); - $this->phpSheet->getSheetView()->setZoomScale($wScalePLV); //set by Excel2007 only if SHEETVIEW_PAGE_LAYOUT - } - //otherwise, we cannot know whether SHEETVIEW_PAGE_LAYOUT or SHEETVIEW_PAGE_BREAK_PREVIEW. - } - - /** - * Read SCL record. - */ - private function readScl(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // offset: 0; size: 2; numerator of the view magnification - $numerator = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; numerator of the view magnification - $denumerator = self::getUInt2d($recordData, 2); - - // set the zoom scale (in percent) - $this->phpSheet->getSheetView()->setZoomScale($numerator * 100 / $denumerator); - } - - /** - * Read PANE record. 
- */ - private function readPane(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if (!$this->readDataOnly) { - // offset: 0; size: 2; position of vertical split - $px = self::getUInt2d($recordData, 0); - - // offset: 2; size: 2; position of horizontal split - $py = self::getUInt2d($recordData, 2); - - // offset: 4; size: 2; top most visible row in the bottom pane - $rwTop = self::getUInt2d($recordData, 4); - - // offset: 6; size: 2; first visible left column in the right pane - $colLeft = self::getUInt2d($recordData, 6); - - if ($this->frozen) { - // frozen panes - $cell = Coordinate::stringFromColumnIndex($px + 1) . ($py + 1); - $topLeftCell = Coordinate::stringFromColumnIndex($colLeft + 1) . ($rwTop + 1); - $this->phpSheet->freezePane($cell, $topLeftCell); - } - // unfrozen panes; split windows; not supported by PhpSpreadsheet core - } - } - - /** - * Read SELECTION record. There is one such record for each pane in the sheet. 
- */ - private function readSelection(): string - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - $selectedCells = ''; - - // move stream pointer to next record - $this->pos += 4 + $length; - - if (!$this->readDataOnly) { - // offset: 0; size: 1; pane identifier - //$paneId = ord($recordData[0]); - - // offset: 1; size: 2; index to row of the active cell - //$r = self::getUInt2d($recordData, 1); - - // offset: 3; size: 2; index to column of the active cell - //$c = self::getUInt2d($recordData, 3); - - // offset: 5; size: 2; index into the following cell range list to the - // entry that contains the active cell - //$index = self::getUInt2d($recordData, 5); - - // offset: 7; size: var; cell range address list containing all selected cell ranges - $data = substr($recordData, 7); - $cellRangeAddressList = $this->readBIFF5CellRangeAddressList($data); // note: also BIFF8 uses BIFF5 syntax - - $selectedCells = $cellRangeAddressList['cellRangeAddresses'][0]; - - // first row '1' + last row '16384' indicates that full column is selected (apparently also in BIFF8!) 
- if (preg_match('/^([A-Z]+1\:[A-Z]+)16384$/', $selectedCells)) { - $selectedCells = (string) preg_replace('/^([A-Z]+1\:[A-Z]+)16384$/', '${1}1048576', $selectedCells); - } - - // first row '1' + last row '65536' indicates that full column is selected - if (preg_match('/^([A-Z]+1\:[A-Z]+)65536$/', $selectedCells)) { - $selectedCells = (string) preg_replace('/^([A-Z]+1\:[A-Z]+)65536$/', '${1}1048576', $selectedCells); - } - - // first column 'A' + last column 'IV' indicates that full row is selected - if (preg_match('/^(A\d+\:)IV(\d+)$/', $selectedCells)) { - $selectedCells = (string) preg_replace('/^(A\d+\:)IV(\d+)$/', '${1}XFD${2}', $selectedCells); - } - - $this->phpSheet->setSelectedCells($selectedCells); - } - - return $selectedCells; - } - - private function includeCellRangeFiltered(string $cellRangeAddress): bool - { - $includeCellRange = true; - if ($this->getReadFilter() !== null) { - $includeCellRange = false; - $rangeBoundaries = Coordinate::getRangeBoundaries($cellRangeAddress); - ++$rangeBoundaries[1][0]; - for ($row = $rangeBoundaries[0][1]; $row <= $rangeBoundaries[1][1]; ++$row) { - for ($column = $rangeBoundaries[0][0]; $column != $rangeBoundaries[1][0]; ++$column) { - if ($this->getReadFilter()->readCell($column, $row, $this->phpSheet->getTitle())) { - $includeCellRange = true; - - break 2; - } - } - } - } - - return $includeCellRange; - } - - /** - * MERGEDCELLS. - * - * This record contains the addresses of merged cell ranges - * in the current sheet. 
- * - * -- "OpenOffice.org's Documentation of the Microsoft - * Excel File Format" - */ - private function readMergedCells(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if ($this->version == self::XLS_BIFF8 && !$this->readDataOnly) { - $cellRangeAddressList = $this->readBIFF8CellRangeAddressList($recordData); - foreach ($cellRangeAddressList['cellRangeAddresses'] as $cellRangeAddress) { - if ( - (str_contains($cellRangeAddress, ':')) - && ($this->includeCellRangeFiltered($cellRangeAddress)) - ) { - $this->phpSheet->mergeCells($cellRangeAddress, Worksheet::MERGE_CELL_CONTENT_HIDE); - } - } - } - } - - /** - * Read HYPERLINK record. - */ - private function readHyperLink(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer forward to next record - $this->pos += 4 + $length; - - if (!$this->readDataOnly) { - // offset: 0; size: 8; cell range address of all cells containing this hyperlink - try { - $cellRange = $this->readBIFF8CellRangeAddressFixed($recordData); - } catch (PhpSpreadsheetException) { - return; - } - - // offset: 8, size: 16; GUID of StdLink - - // offset: 24, size: 4; unknown value - - // offset: 28, size: 4; option flags - // bit: 0; mask: 0x00000001; 0 = no link or extant, 1 = file link or URL - $isFileLinkOrUrl = (0x00000001 & self::getUInt2d($recordData, 28)) >> 0; - - // bit: 1; mask: 0x00000002; 0 = relative path, 1 = absolute path or URL - //$isAbsPathOrUrl = (0x00000001 & self::getUInt2d($recordData, 28)) >> 1; - - // bit: 2 (and 4); mask: 0x00000014; 0 = no description - $hasDesc = (0x00000014 & self::getUInt2d($recordData, 28)) >> 2; - - // bit: 3; mask: 0x00000008; 0 = no text, 1 = has text - $hasText = (0x00000008 & self::getUInt2d($recordData, 28)) 
>> 3; - - // bit: 7; mask: 0x00000080; 0 = no target frame, 1 = has target frame - $hasFrame = (0x00000080 & self::getUInt2d($recordData, 28)) >> 7; - - // bit: 8; mask: 0x00000100; 0 = file link or URL, 1 = UNC path (inc. server name) - $isUNC = (0x00000100 & self::getUInt2d($recordData, 28)) >> 8; - - // offset within record data - $offset = 32; - - if ($hasDesc) { - // offset: 32; size: var; character count of description text - $dl = self::getInt4d($recordData, 32); - // offset: 36; size: var; character array of description text, no Unicode string header, always 16-bit characters, zero terminated - //$desc = self::encodeUTF16(substr($recordData, 36, 2 * ($dl - 1)), false); - $offset += 4 + 2 * $dl; - } - if ($hasFrame) { - $fl = self::getInt4d($recordData, $offset); - $offset += 4 + 2 * $fl; - } - - // detect type of hyperlink (there are 4 types) - $hyperlinkType = null; - - if ($isUNC) { - $hyperlinkType = 'UNC'; - } elseif (!$isFileLinkOrUrl) { - $hyperlinkType = 'workbook'; - } elseif (ord($recordData[$offset]) == 0x03) { - $hyperlinkType = 'local'; - } elseif (ord($recordData[$offset]) == 0xE0) { - $hyperlinkType = 'URL'; - } - - switch ($hyperlinkType) { - case 'URL': - // section 5.58.2: Hyperlink containing a URL - // e.g. http://example.org/index.php - - // offset: var; size: 16; GUID of URL Moniker - $offset += 16; - // offset: var; size: 4; size (in bytes) of character array of the URL including trailing zero word - $us = self::getInt4d($recordData, $offset); - $offset += 4; - // offset: var; size: $us; character array of the URL, no Unicode string header, always 16-bit characters, zero-terminated - $url = self::encodeUTF16(substr($recordData, $offset, $us - 2), false); - $nullOffset = strpos($url, chr(0x00)); - if ($nullOffset) { - $url = substr($url, 0, $nullOffset); - } - $url .= $hasText ? 
'#' : ''; - $offset += $us; - - break; - case 'local': - // section 5.58.3: Hyperlink to local file - // examples: - // mydoc.txt - // ../../somedoc.xls#Sheet!A1 - - // offset: var; size: 16; GUI of File Moniker - $offset += 16; - - // offset: var; size: 2; directory up-level count. - $upLevelCount = self::getUInt2d($recordData, $offset); - $offset += 2; - - // offset: var; size: 4; character count of the shortened file path and name, including trailing zero word - $sl = self::getInt4d($recordData, $offset); - $offset += 4; - - // offset: var; size: sl; character array of the shortened file path and name in 8.3-DOS-format (compressed Unicode string) - $shortenedFilePath = substr($recordData, $offset, $sl); - $shortenedFilePath = self::encodeUTF16($shortenedFilePath, true); - $shortenedFilePath = substr($shortenedFilePath, 0, -1); // remove trailing zero - - $offset += $sl; - - // offset: var; size: 24; unknown sequence - $offset += 24; - - // extended file path - // offset: var; size: 4; size of the following file link field including string lenth mark - $sz = self::getInt4d($recordData, $offset); - $offset += 4; - - $extendedFilePath = ''; - // only present if $sz > 0 - if ($sz > 0) { - // offset: var; size: 4; size of the character array of the extended file path and name - $xl = self::getInt4d($recordData, $offset); - $offset += 4; - - // offset: var; size 2; unknown - $offset += 2; - - // offset: var; size $xl; character array of the extended file path and name. - $extendedFilePath = substr($recordData, $offset, $xl); - $extendedFilePath = self::encodeUTF16($extendedFilePath, false); - $offset += $xl; - } - - // construct the path - $url = str_repeat('..\\', $upLevelCount); - $url .= ($sz > 0) ? $extendedFilePath : $shortenedFilePath; // use extended path if available - $url .= $hasText ? 
'#' : ''; - - break; - case 'UNC': - // section 5.58.4: Hyperlink to a File with UNC (Universal Naming Convention) Path - // todo: implement - return; - case 'workbook': - // section 5.58.5: Hyperlink to the Current Workbook - // e.g. Sheet2!B1:C2, stored in text mark field - $url = 'sheet://'; - - break; - default: - return; - } - - if ($hasText) { - // offset: var; size: 4; character count of text mark including trailing zero word - $tl = self::getInt4d($recordData, $offset); - $offset += 4; - // offset: var; size: var; character array of the text mark without the # sign, no Unicode header, always 16-bit characters, zero-terminated - $text = self::encodeUTF16(substr($recordData, $offset, 2 * ($tl - 1)), false); - $url .= $text; - } - - // apply the hyperlink to all the relevant cells - foreach (Coordinate::extractAllCellReferencesInRange($cellRange) as $coordinate) { - $this->phpSheet->getCell($coordinate)->getHyperLink()->setUrl($url); - } - } - } - - /** - * Read DATAVALIDATIONS record. - */ - private function readDataValidations(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - //$recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer forward to next record - $this->pos += 4 + $length; - } - - /** - * Read DATAVALIDATION record. 
- */ - private function readDataValidation(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer forward to next record - $this->pos += 4 + $length; - - if ($this->readDataOnly) { - return; - } - - // offset: 0; size: 4; Options - $options = self::getInt4d($recordData, 0); - - // bit: 0-3; mask: 0x0000000F; type - $type = (0x0000000F & $options) >> 0; - $type = Xls\DataValidationHelper::type($type); - - // bit: 4-6; mask: 0x00000070; error type - $errorStyle = (0x00000070 & $options) >> 4; - $errorStyle = Xls\DataValidationHelper::errorStyle($errorStyle); - - // bit: 7; mask: 0x00000080; 1= formula is explicit (only applies to list) - // I have only seen cases where this is 1 - //$explicitFormula = (0x00000080 & $options) >> 7; - - // bit: 8; mask: 0x00000100; 1= empty cells allowed - $allowBlank = (0x00000100 & $options) >> 8; - - // bit: 9; mask: 0x00000200; 1= suppress drop down arrow in list type validity - $suppressDropDown = (0x00000200 & $options) >> 9; - - // bit: 18; mask: 0x00040000; 1= show prompt box if cell selected - $showInputMessage = (0x00040000 & $options) >> 18; - - // bit: 19; mask: 0x00080000; 1= show error box if invalid values entered - $showErrorMessage = (0x00080000 & $options) >> 19; - - // bit: 20-23; mask: 0x00F00000; condition operator - $operator = (0x00F00000 & $options) >> 20; - $operator = Xls\DataValidationHelper::operator($operator); - - if ($type === null || $errorStyle === null || $operator === null) { - return; - } - - // offset: 4; size: var; title of the prompt box - $offset = 4; - $string = self::readUnicodeStringLong(substr($recordData, $offset)); - $promptTitle = $string['value'] !== chr(0) ? 
$string['value'] : ''; - $offset += $string['size']; - - // offset: var; size: var; title of the error box - $string = self::readUnicodeStringLong(substr($recordData, $offset)); - $errorTitle = $string['value'] !== chr(0) ? $string['value'] : ''; - $offset += $string['size']; - - // offset: var; size: var; text of the prompt box - $string = self::readUnicodeStringLong(substr($recordData, $offset)); - $prompt = $string['value'] !== chr(0) ? $string['value'] : ''; - $offset += $string['size']; - - // offset: var; size: var; text of the error box - $string = self::readUnicodeStringLong(substr($recordData, $offset)); - $error = $string['value'] !== chr(0) ? $string['value'] : ''; - $offset += $string['size']; - - // offset: var; size: 2; size of the formula data for the first condition - $sz1 = self::getUInt2d($recordData, $offset); - $offset += 2; - - // offset: var; size: 2; not used - $offset += 2; - - // offset: var; size: $sz1; formula data for first condition (without size field) - $formula1 = substr($recordData, $offset, $sz1); - $formula1 = pack('v', $sz1) . $formula1; // prepend the length - - try { - $formula1 = $this->getFormulaFromStructure($formula1); - - // in list type validity, null characters are used as item separators - if ($type == DataValidation::TYPE_LIST) { - $formula1 = str_replace(chr(0), ',', $formula1); - } - } catch (PhpSpreadsheetException $e) { - return; - } - $offset += $sz1; - - // offset: var; size: 2; size of the formula data for the first condition - $sz2 = self::getUInt2d($recordData, $offset); - $offset += 2; - - // offset: var; size: 2; not used - $offset += 2; - - // offset: var; size: $sz2; formula data for second condition (without size field) - $formula2 = substr($recordData, $offset, $sz2); - $formula2 = pack('v', $sz2) . 
$formula2; // prepend the length - - try { - $formula2 = $this->getFormulaFromStructure($formula2); - } catch (PhpSpreadsheetException) { - return; - } - $offset += $sz2; - - // offset: var; size: var; cell range address list with - $cellRangeAddressList = $this->readBIFF8CellRangeAddressList(substr($recordData, $offset)); - $cellRangeAddresses = $cellRangeAddressList['cellRangeAddresses']; - - foreach ($cellRangeAddresses as $cellRange) { - $stRange = $this->phpSheet->shrinkRangeToFit($cellRange); - foreach (Coordinate::extractAllCellReferencesInRange($stRange) as $coordinate) { - $objValidation = $this->phpSheet->getCell($coordinate)->getDataValidation(); - $objValidation->setType($type); - $objValidation->setErrorStyle($errorStyle); - $objValidation->setAllowBlank((bool) $allowBlank); - $objValidation->setShowInputMessage((bool) $showInputMessage); - $objValidation->setShowErrorMessage((bool) $showErrorMessage); - $objValidation->setShowDropDown(!$suppressDropDown); - $objValidation->setOperator($operator); - $objValidation->setErrorTitle($errorTitle); - $objValidation->setError($error); - $objValidation->setPromptTitle($promptTitle); - $objValidation->setPrompt($prompt); - $objValidation->setFormula1($formula1); - $objValidation->setFormula2($formula2); - } - } - } - - /** - * Read SHEETLAYOUT record. Stores sheet tab color information. - */ - private function readSheetLayout(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if (!$this->readDataOnly) { - // offset: 0; size: 2; repeated record identifier 0x0862 - - // offset: 2; size: 10; not used - - // offset: 12; size: 4; size of record data - // Excel 2003 uses size of 0x14 (documented), Excel 2007 uses size of 0x28 (not documented?) 
- $sz = self::getInt4d($recordData, 12); - - switch ($sz) { - case 0x14: - // offset: 16; size: 2; color index for sheet tab - $colorIndex = self::getUInt2d($recordData, 16); - $color = Xls\Color::map($colorIndex, $this->palette, $this->version); - $this->phpSheet->getTabColor()->setRGB($color['rgb']); - - break; - case 0x28: - // TODO: Investigate structure for .xls SHEETLAYOUT record as saved by MS Office Excel 2007 - return; - } - } - } - - /** - * Read SHEETPROTECTION record (FEATHEADR). - */ - private function readSheetProtection(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - if ($this->readDataOnly) { - return; - } - - // offset: 0; size: 2; repeated record header - - // offset: 2; size: 2; FRT cell reference flag (=0 currently) - - // offset: 4; size: 8; Currently not used and set to 0 - - // offset: 12; size: 2; Shared feature type index (2=Enhanced Protetion, 4=SmartTag) - $isf = self::getUInt2d($recordData, 12); - if ($isf != 2) { - return; - } - - // offset: 14; size: 1; =1 since this is a feat header - - // offset: 15; size: 4; size of rgbHdrSData - - // rgbHdrSData, assume "Enhanced Protection" - // offset: 19; size: 2; option flags - $options = self::getUInt2d($recordData, 19); - - // bit: 0; mask 0x0001; 1 = user may edit objects, 0 = users must not edit objects - // Note - do not negate $bool - $bool = (0x0001 & $options) >> 0; - $this->phpSheet->getProtection()->setObjects((bool) $bool); - - // bit: 1; mask 0x0002; edit scenarios - // Note - do not negate $bool - $bool = (0x0002 & $options) >> 1; - $this->phpSheet->getProtection()->setScenarios((bool) $bool); - - // bit: 2; mask 0x0004; format cells - $bool = (0x0004 & $options) >> 2; - $this->phpSheet->getProtection()->setFormatCells(!$bool); - - // bit: 3; mask 0x0008; format columns - $bool = (0x0008 & $options) >> 3; - 
$this->phpSheet->getProtection()->setFormatColumns(!$bool); - - // bit: 4; mask 0x0010; format rows - $bool = (0x0010 & $options) >> 4; - $this->phpSheet->getProtection()->setFormatRows(!$bool); - - // bit: 5; mask 0x0020; insert columns - $bool = (0x0020 & $options) >> 5; - $this->phpSheet->getProtection()->setInsertColumns(!$bool); - - // bit: 6; mask 0x0040; insert rows - $bool = (0x0040 & $options) >> 6; - $this->phpSheet->getProtection()->setInsertRows(!$bool); - - // bit: 7; mask 0x0080; insert hyperlinks - $bool = (0x0080 & $options) >> 7; - $this->phpSheet->getProtection()->setInsertHyperlinks(!$bool); - - // bit: 8; mask 0x0100; delete columns - $bool = (0x0100 & $options) >> 8; - $this->phpSheet->getProtection()->setDeleteColumns(!$bool); - - // bit: 9; mask 0x0200; delete rows - $bool = (0x0200 & $options) >> 9; - $this->phpSheet->getProtection()->setDeleteRows(!$bool); - - // bit: 10; mask 0x0400; select locked cells - // Note that this is opposite of most of above. - $bool = (0x0400 & $options) >> 10; - $this->phpSheet->getProtection()->setSelectLockedCells((bool) $bool); - - // bit: 11; mask 0x0800; sort cell range - $bool = (0x0800 & $options) >> 11; - $this->phpSheet->getProtection()->setSort(!$bool); - - // bit: 12; mask 0x1000; auto filter - $bool = (0x1000 & $options) >> 12; - $this->phpSheet->getProtection()->setAutoFilter(!$bool); - - // bit: 13; mask 0x2000; pivot tables - $bool = (0x2000 & $options) >> 13; - $this->phpSheet->getProtection()->setPivotTables(!$bool); - - // bit: 14; mask 0x4000; select unlocked cells - // Note that this is opposite of most of above. - $bool = (0x4000 & $options) >> 14; - $this->phpSheet->getProtection()->setSelectUnlockedCells((bool) $bool); - - // offset: 21; size: 2; not used - } - - /** - * Read RANGEPROTECTION record - * Reading of this record is based on Microsoft Office Excel 97-2000 Binary File Format Specification, - * where it is referred to as FEAT record. 
- */ - private function readRangeProtection(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer to next record - $this->pos += 4 + $length; - - // local pointer in record data - $offset = 0; - - if (!$this->readDataOnly) { - $offset += 12; - - // offset: 12; size: 2; shared feature type, 2 = enhanced protection, 4 = smart tag - $isf = self::getUInt2d($recordData, 12); - if ($isf != 2) { - // we only read FEAT records of type 2 - return; - } - $offset += 2; - - $offset += 5; - - // offset: 19; size: 2; count of ref ranges this feature is on - $cref = self::getUInt2d($recordData, 19); - $offset += 2; - - $offset += 6; - - // offset: 27; size: 8 * $cref; list of cell ranges (like in hyperlink record) - $cellRanges = []; - for ($i = 0; $i < $cref; ++$i) { - try { - $cellRange = $this->readBIFF8CellRangeAddressFixed(substr($recordData, 27 + 8 * $i, 8)); - } catch (PhpSpreadsheetException) { - return; - } - $cellRanges[] = $cellRange; - $offset += 8; - } - - // offset: var; size: var; variable length of feature specific data - //$rgbFeat = substr($recordData, $offset); - $offset += 4; - - // offset: var; size: 4; the encrypted password (only 16-bit although field is 32-bit) - $wPassword = self::getInt4d($recordData, $offset); - $offset += 4; - - // Apply range protection to sheet - if ($cellRanges) { - $this->phpSheet->protectCells(implode(' ', $cellRanges), ($wPassword === 0) ? '' : strtoupper(dechex($wPassword)), true); - } - } - } - - /** - * Read a free CONTINUE record. Free CONTINUE record may be a camouflaged MSODRAWING record - * When MSODRAWING data on a sheet exceeds 8224 bytes, CONTINUE records are used instead. Undocumented. - * In this case, we must treat the CONTINUE record as a MSODRAWING record. 
- */ - private function readContinue(): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // check if we are reading drawing data - // this is in case a free CONTINUE record occurs in other circumstances we are unaware of - if ($this->drawingData == '') { - // move stream pointer to next record - $this->pos += 4 + $length; - - return; - } - - // check if record data is at least 4 bytes long, otherwise there is no chance this is MSODRAWING data - if ($length < 4) { - // move stream pointer to next record - $this->pos += 4 + $length; - - return; - } - - // dirty check to see if CONTINUE record could be a camouflaged MSODRAWING record - // look inside CONTINUE record to see if it looks like a part of an Escher stream - // we know that Escher stream may be split at least at - // 0xF003 MsofbtSpgrContainer - // 0xF004 MsofbtSpContainer - // 0xF00D MsofbtClientTextbox - $validSplitPoints = [0xF003, 0xF004, 0xF00D]; // add identifiers if we find more - - $splitPoint = self::getUInt2d($recordData, 2); - if (in_array($splitPoint, $validSplitPoints)) { - // get spliced record data (and move pointer to next record) - $splicedRecordData = $this->getSplicedRecordData(); - $this->drawingData .= $splicedRecordData['recordData']; - - return; - } - - // move stream pointer to next record - $this->pos += 4 + $length; - } - - /** - * Reads a record from current position in data stream and continues reading data as long as CONTINUE - * records are found. Splices the record data pieces and returns the combined string as if record data - * is in one piece. - * Moves to next current position in data stream to start of next record different from a CONtINUE record. 
- */ - private function getSplicedRecordData(): array - { - $data = ''; - $spliceOffsets = []; - - $i = 0; - $spliceOffsets[0] = 0; - - do { - ++$i; - - // offset: 0; size: 2; identifier - //$identifier = self::getUInt2d($this->data, $this->pos); - // offset: 2; size: 2; length - $length = self::getUInt2d($this->data, $this->pos + 2); - $data .= $this->readRecordData($this->data, $this->pos + 4, $length); - - $spliceOffsets[$i] = $spliceOffsets[$i - 1] + $length; - - $this->pos += 4 + $length; - $nextIdentifier = self::getUInt2d($this->data, $this->pos); - } while ($nextIdentifier == self::XLS_TYPE_CONTINUE); - - return [ - 'recordData' => $data, - 'spliceOffsets' => $spliceOffsets, - ]; - } - - /** - * Convert formula structure into human readable Excel formula like 'A3+A5*5'. - * - * @param string $formulaStructure The complete binary data for the formula - * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. with shared formulas - * - * @return string Human readable formula - */ - private function getFormulaFromStructure(string $formulaStructure, string $baseCell = 'A1'): string - { - // offset: 0; size: 2; size of the following formula data - $sz = self::getUInt2d($formulaStructure, 0); - - // offset: 2; size: sz - $formulaData = substr($formulaStructure, 2, $sz); - - // offset: 2 + sz; size: variable (optional) - if (strlen($formulaStructure) > 2 + $sz) { - $additionalData = substr($formulaStructure, 2 + $sz); - } else { - $additionalData = ''; - } - - return $this->getFormulaFromData($formulaData, $additionalData, $baseCell); - } - - /** - * Take formula data and additional data for formula and return human readable formula. - * - * @param string $formulaData The binary data for the formula itself - * @param string $additionalData Additional binary data going with the formula - * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. 
with shared formulas - * - * @return string Human readable formula - */ - private function getFormulaFromData(string $formulaData, string $additionalData = '', string $baseCell = 'A1'): string - { - // start parsing the formula data - $tokens = []; - - while ($formulaData !== '' && $token = $this->getNextToken($formulaData, $baseCell)) { - $tokens[] = $token; - $formulaData = substr($formulaData, $token['size']); - } - - $formulaString = $this->createFormulaFromTokens($tokens, $additionalData); - - return $formulaString; - } - - /** - * Take array of tokens together with additional data for formula and return human readable formula. - * - * @param string $additionalData Additional binary data going with the formula - * - * @return string Human readable formula - */ - private function createFormulaFromTokens(array $tokens, string $additionalData): string - { - // empty formula? - if (empty($tokens)) { - return ''; - } - - $formulaStrings = []; - foreach ($tokens as $token) { - // initialize spaces - $space0 = $space0 ?? ''; // spaces before next token, not tParen - $space1 = $space1 ?? ''; // carriage returns before next token, not tParen - $space2 = $space2 ?? ''; // spaces before opening parenthesis - $space3 = $space3 ?? ''; // carriage returns before opening parenthesis - $space4 = $space4 ?? ''; // spaces before closing parenthesis - $space5 = $space5 ?? 
''; // carriage returns before closing parenthesis - - switch ($token['name']) { - case 'tAdd': // addition - case 'tConcat': // addition - case 'tDiv': // division - case 'tEQ': // equality - case 'tGE': // greater than or equal - case 'tGT': // greater than - case 'tIsect': // intersection - case 'tLE': // less than or equal - case 'tList': // less than or equal - case 'tLT': // less than - case 'tMul': // multiplication - case 'tNE': // multiplication - case 'tPower': // power - case 'tRange': // range - case 'tSub': // subtraction - $op2 = array_pop($formulaStrings); - $op1 = array_pop($formulaStrings); - $formulaStrings[] = "$op1$space1$space0{$token['data']}$op2"; - unset($space0, $space1); - - break; - case 'tUplus': // unary plus - case 'tUminus': // unary minus - $op = array_pop($formulaStrings); - $formulaStrings[] = "$space1$space0{$token['data']}$op"; - unset($space0, $space1); - - break; - case 'tPercent': // percent sign - $op = array_pop($formulaStrings); - $formulaStrings[] = "$op$space1$space0{$token['data']}"; - unset($space0, $space1); - - break; - case 'tAttrVolatile': // indicates volatile function - case 'tAttrIf': - case 'tAttrSkip': - case 'tAttrChoose': - // token is only important for Excel formula evaluator - // do nothing - break; - case 'tAttrSpace': // space / carriage return - // space will be used when next token arrives, do not alter formulaString stack - switch ($token['data']['spacetype']) { - case 'type0': - $space0 = str_repeat(' ', $token['data']['spacecount']); - - break; - case 'type1': - $space1 = str_repeat("\n", $token['data']['spacecount']); - - break; - case 'type2': - $space2 = str_repeat(' ', $token['data']['spacecount']); - - break; - case 'type3': - $space3 = str_repeat("\n", $token['data']['spacecount']); - - break; - case 'type4': - $space4 = str_repeat(' ', $token['data']['spacecount']); - - break; - case 'type5': - $space5 = str_repeat("\n", $token['data']['spacecount']); - - break; - } - - break; - case 
'tAttrSum': // SUM function with one parameter - $op = array_pop($formulaStrings); - $formulaStrings[] = "{$space1}{$space0}SUM($op)"; - unset($space0, $space1); - - break; - case 'tFunc': // function with fixed number of arguments - case 'tFuncV': // function with variable number of arguments - if ($token['data']['function'] != '') { - // normal function - $ops = []; // array of operators - for ($i = 0; $i < $token['data']['args']; ++$i) { - $ops[] = array_pop($formulaStrings); - } - $ops = array_reverse($ops); - $formulaStrings[] = "$space1$space0{$token['data']['function']}(" . implode(',', $ops) . ')'; - unset($space0, $space1); - } else { - // add-in function - $ops = []; // array of operators - for ($i = 0; $i < $token['data']['args'] - 1; ++$i) { - $ops[] = array_pop($formulaStrings); - } - $ops = array_reverse($ops); - $function = array_pop($formulaStrings); - $formulaStrings[] = "$space1$space0$function(" . implode(',', $ops) . ')'; - unset($space0, $space1); - } - - break; - case 'tParen': // parenthesis - $expression = array_pop($formulaStrings); - $formulaStrings[] = "$space3$space2($expression$space5$space4)"; - unset($space2, $space3, $space4, $space5); - - break; - case 'tArray': // array constant - $constantArray = self::readBIFF8ConstantArray($additionalData); - $formulaStrings[] = $space1 . $space0 . 
$constantArray['value']; - $additionalData = substr($additionalData, $constantArray['size']); // bite of chunk of additional data - unset($space0, $space1); - - break; - case 'tMemArea': - // bite off chunk of additional data - $cellRangeAddressList = $this->readBIFF8CellRangeAddressList($additionalData); - $additionalData = substr($additionalData, $cellRangeAddressList['size']); - $formulaStrings[] = "$space1$space0{$token['data']}"; - unset($space0, $space1); - - break; - case 'tArea': // cell range address - case 'tBool': // boolean - case 'tErr': // error code - case 'tInt': // integer - case 'tMemErr': - case 'tMemFunc': - case 'tMissArg': - case 'tName': - case 'tNameX': - case 'tNum': // number - case 'tRef': // single cell reference - case 'tRef3d': // 3d cell reference - case 'tArea3d': // 3d cell range reference - case 'tRefN': - case 'tAreaN': - case 'tStr': // string - $formulaStrings[] = "$space1$space0{$token['data']}"; - unset($space0, $space1); - - break; - } - } - $formulaString = $formulaStrings[0]; - - return $formulaString; - } - - /** - * Fetch next token from binary formula data. - * - * @param string $formulaData Formula data - * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. 
with shared formulas - */ - private function getNextToken(string $formulaData, string $baseCell = 'A1'): array - { - // offset: 0; size: 1; token id - $id = ord($formulaData[0]); // token id - $name = false; // initialize token name - - switch ($id) { - case 0x03: - $name = 'tAdd'; - $size = 1; - $data = '+'; - - break; - case 0x04: - $name = 'tSub'; - $size = 1; - $data = '-'; - - break; - case 0x05: - $name = 'tMul'; - $size = 1; - $data = '*'; - - break; - case 0x06: - $name = 'tDiv'; - $size = 1; - $data = '/'; - - break; - case 0x07: - $name = 'tPower'; - $size = 1; - $data = '^'; - - break; - case 0x08: - $name = 'tConcat'; - $size = 1; - $data = '&'; - - break; - case 0x09: - $name = 'tLT'; - $size = 1; - $data = '<'; - - break; - case 0x0A: - $name = 'tLE'; - $size = 1; - $data = '<='; - - break; - case 0x0B: - $name = 'tEQ'; - $size = 1; - $data = '='; - - break; - case 0x0C: - $name = 'tGE'; - $size = 1; - $data = '>='; - - break; - case 0x0D: - $name = 'tGT'; - $size = 1; - $data = '>'; - - break; - case 0x0E: - $name = 'tNE'; - $size = 1; - $data = '<>'; - - break; - case 0x0F: - $name = 'tIsect'; - $size = 1; - $data = ' '; - - break; - case 0x10: - $name = 'tList'; - $size = 1; - $data = ','; - - break; - case 0x11: - $name = 'tRange'; - $size = 1; - $data = ':'; - - break; - case 0x12: - $name = 'tUplus'; - $size = 1; - $data = '+'; - - break; - case 0x13: - $name = 'tUminus'; - $size = 1; - $data = '-'; - - break; - case 0x14: - $name = 'tPercent'; - $size = 1; - $data = '%'; - - break; - case 0x15: // parenthesis - $name = 'tParen'; - $size = 1; - $data = null; - - break; - case 0x16: // missing argument - $name = 'tMissArg'; - $size = 1; - $data = ''; - - break; - case 0x17: // string - $name = 'tStr'; - // offset: 1; size: var; Unicode string, 8-bit string length - $string = self::readUnicodeStringShort(substr($formulaData, 1)); - $size = 1 + $string['size']; - $data = self::UTF8toExcelDoubleQuoted($string['value']); - - break; - case 0x19: // 
Special attribute - // offset: 1; size: 1; attribute type flags: - switch (ord($formulaData[1])) { - case 0x01: - $name = 'tAttrVolatile'; - $size = 4; - $data = null; - - break; - case 0x02: - $name = 'tAttrIf'; - $size = 4; - $data = null; - - break; - case 0x04: - $name = 'tAttrChoose'; - // offset: 2; size: 2; number of choices in the CHOOSE function ($nc, number of parameters decreased by 1) - $nc = self::getUInt2d($formulaData, 2); - // offset: 4; size: 2 * $nc - // offset: 4 + 2 * $nc; size: 2 - $size = 2 * $nc + 6; - $data = null; - - break; - case 0x08: - $name = 'tAttrSkip'; - $size = 4; - $data = null; - - break; - case 0x10: - $name = 'tAttrSum'; - $size = 4; - $data = null; - - break; - case 0x40: - case 0x41: - $name = 'tAttrSpace'; - $size = 4; - // offset: 2; size: 2; space type and position - $spacetype = match (ord($formulaData[2])) { - 0x00 => 'type0', - 0x01 => 'type1', - 0x02 => 'type2', - 0x03 => 'type3', - 0x04 => 'type4', - 0x05 => 'type5', - default => throw new Exception('Unrecognized space type in tAttrSpace token'), - }; - // offset: 3; size: 1; number of inserted spaces/carriage returns - $spacecount = ord($formulaData[3]); - - $data = ['spacetype' => $spacetype, 'spacecount' => $spacecount]; - - break; - default: - throw new Exception('Unrecognized attribute flag in tAttr token'); - } - - break; - case 0x1C: // error code - // offset: 1; size: 1; error code - $name = 'tErr'; - $size = 2; - $data = Xls\ErrorCode::lookup(ord($formulaData[1])); - - break; - case 0x1D: // boolean - // offset: 1; size: 1; 0 = false, 1 = true; - $name = 'tBool'; - $size = 2; - $data = ord($formulaData[1]) ? 
'TRUE' : 'FALSE'; - - break; - case 0x1E: // integer - // offset: 1; size: 2; unsigned 16-bit integer - $name = 'tInt'; - $size = 3; - $data = self::getUInt2d($formulaData, 1); - - break; - case 0x1F: // number - // offset: 1; size: 8; - $name = 'tNum'; - $size = 9; - $data = self::extractNumber(substr($formulaData, 1)); - $data = str_replace(',', '.', (string) $data); // in case non-English locale - - break; - case 0x20: // array constant - case 0x40: - case 0x60: - // offset: 1; size: 7; not used - $name = 'tArray'; - $size = 8; - $data = null; - - break; - case 0x21: // function with fixed number of arguments - case 0x41: - case 0x61: - $name = 'tFunc'; - $size = 3; - // offset: 1; size: 2; index to built-in sheet function - switch (self::getUInt2d($formulaData, 1)) { - case 2: - $function = 'ISNA'; - $args = 1; - - break; - case 3: - $function = 'ISERROR'; - $args = 1; - - break; - case 10: - $function = 'NA'; - $args = 0; - - break; - case 15: - $function = 'SIN'; - $args = 1; - - break; - case 16: - $function = 'COS'; - $args = 1; - - break; - case 17: - $function = 'TAN'; - $args = 1; + // offset: 28, size: 4; option flags + // bit: 0; mask: 0x00000001; 0 = no link or extant, 1 = file link or URL + $isFileLinkOrUrl = (0x00000001 & self::getUInt2d($recordData, 28)) >> 0; - break; - case 18: - $function = 'ATAN'; - $args = 1; + // bit: 1; mask: 0x00000002; 0 = relative path, 1 = absolute path or URL + //$isAbsPathOrUrl = (0x00000001 & self::getUInt2d($recordData, 28)) >> 1; - break; - case 19: - $function = 'PI'; - $args = 0; + // bit: 2 (and 4); mask: 0x00000014; 0 = no description + $hasDesc = (0x00000014 & self::getUInt2d($recordData, 28)) >> 2; - break; - case 20: - $function = 'SQRT'; - $args = 1; + // bit: 3; mask: 0x00000008; 0 = no text, 1 = has text + $hasText = (0x00000008 & self::getUInt2d($recordData, 28)) >> 3; - break; - case 21: - $function = 'EXP'; - $args = 1; + // bit: 7; mask: 0x00000080; 0 = no target frame, 1 = has target frame + $hasFrame 
= (0x00000080 & self::getUInt2d($recordData, 28)) >> 7; - break; - case 22: - $function = 'LN'; - $args = 1; + // bit: 8; mask: 0x00000100; 0 = file link or URL, 1 = UNC path (inc. server name) + $isUNC = (0x00000100 & self::getUInt2d($recordData, 28)) >> 8; - break; - case 23: - $function = 'LOG10'; - $args = 1; + // offset within record data + $offset = 32; - break; - case 24: - $function = 'ABS'; - $args = 1; + if ($hasDesc) { + // offset: 32; size: var; character count of description text + $dl = self::getInt4d($recordData, 32); + // offset: 36; size: var; character array of description text, no Unicode string header, always 16-bit characters, zero terminated + //$desc = self::encodeUTF16(substr($recordData, 36, 2 * ($dl - 1)), false); + $offset += 4 + 2 * $dl; + } + if ($hasFrame) { + $fl = self::getInt4d($recordData, $offset); + $offset += 4 + 2 * $fl; + } - break; - case 25: - $function = 'INT'; - $args = 1; + // detect type of hyperlink (there are 4 types) + $hyperlinkType = null; - break; - case 26: - $function = 'SIGN'; - $args = 1; + if ($isUNC) { + $hyperlinkType = 'UNC'; + } elseif (!$isFileLinkOrUrl) { + $hyperlinkType = 'workbook'; + } elseif (ord($recordData[$offset]) == 0x03) { + $hyperlinkType = 'local'; + } elseif (ord($recordData[$offset]) == 0xE0) { + $hyperlinkType = 'URL'; + } - break; - case 27: - $function = 'ROUND'; - $args = 2; + switch ($hyperlinkType) { + case 'URL': + // section 5.58.2: Hyperlink containing a URL + // e.g. 
http://example.org/index.php - break; - case 30: - $function = 'REPT'; - $args = 2; + // offset: var; size: 16; GUID of URL Moniker + $offset += 16; + // offset: var; size: 4; size (in bytes) of character array of the URL including trailing zero word + $us = self::getInt4d($recordData, $offset); + $offset += 4; + // offset: var; size: $us; character array of the URL, no Unicode string header, always 16-bit characters, zero-terminated + $url = self::encodeUTF16(substr($recordData, $offset, $us - 2), false); + $nullOffset = strpos($url, chr(0x00)); + if ($nullOffset) { + $url = substr($url, 0, $nullOffset); + } + $url .= $hasText ? '#' : ''; + $offset += $us; - break; - case 31: - $function = 'MID'; - $args = 3; + break; + case 'local': + // section 5.58.3: Hyperlink to local file + // examples: + // mydoc.txt + // ../../somedoc.xls#Sheet!A1 - break; - case 32: - $function = 'LEN'; - $args = 1; + // offset: var; size: 16; GUI of File Moniker + $offset += 16; - break; - case 33: - $function = 'VALUE'; - $args = 1; + // offset: var; size: 2; directory up-level count. 
+ $upLevelCount = self::getUInt2d($recordData, $offset); + $offset += 2; - break; - case 34: - $function = 'TRUE'; - $args = 0; + // offset: var; size: 4; character count of the shortened file path and name, including trailing zero word + $sl = self::getInt4d($recordData, $offset); + $offset += 4; - break; - case 35: - $function = 'FALSE'; - $args = 0; + // offset: var; size: sl; character array of the shortened file path and name in 8.3-DOS-format (compressed Unicode string) + $shortenedFilePath = substr($recordData, $offset, $sl); + $shortenedFilePath = self::encodeUTF16($shortenedFilePath, true); + $shortenedFilePath = substr($shortenedFilePath, 0, -1); // remove trailing zero - break; - case 38: - $function = 'NOT'; - $args = 1; + $offset += $sl; - break; - case 39: - $function = 'MOD'; - $args = 2; + // offset: var; size: 24; unknown sequence + $offset += 24; - break; - case 40: - $function = 'DCOUNT'; - $args = 3; + // extended file path + // offset: var; size: 4; size of the following file link field including string lenth mark + $sz = self::getInt4d($recordData, $offset); + $offset += 4; - break; - case 41: - $function = 'DSUM'; - $args = 3; + $extendedFilePath = ''; + // only present if $sz > 0 + if ($sz > 0) { + // offset: var; size: 4; size of the character array of the extended file path and name + $xl = self::getInt4d($recordData, $offset); + $offset += 4; - break; - case 42: - $function = 'DAVERAGE'; - $args = 3; + // offset: var; size 2; unknown + $offset += 2; - break; - case 43: - $function = 'DMIN'; - $args = 3; + // offset: var; size $xl; character array of the extended file path and name. + $extendedFilePath = substr($recordData, $offset, $xl); + $extendedFilePath = self::encodeUTF16($extendedFilePath, false); + $offset += $xl; + } - break; - case 44: - $function = 'DMAX'; - $args = 3; + // construct the path + $url = str_repeat('..\\', $upLevelCount); + $url .= ($sz > 0) ? 
$extendedFilePath : $shortenedFilePath; // use extended path if available + $url .= $hasText ? '#' : ''; - break; - case 45: - $function = 'DSTDEV'; - $args = 3; + break; + case 'UNC': + // section 5.58.4: Hyperlink to a File with UNC (Universal Naming Convention) Path + // todo: implement + return; + case 'workbook': + // section 5.58.5: Hyperlink to the Current Workbook + // e.g. Sheet2!B1:C2, stored in text mark field + $url = 'sheet://'; - break; - case 48: - $function = 'TEXT'; - $args = 2; + break; + default: + return; + } - break; - case 61: - $function = 'MIRR'; - $args = 3; + if ($hasText) { + // offset: var; size: 4; character count of text mark including trailing zero word + $tl = self::getInt4d($recordData, $offset); + $offset += 4; + // offset: var; size: var; character array of the text mark without the # sign, no Unicode header, always 16-bit characters, zero-terminated + $text = self::encodeUTF16(substr($recordData, $offset, 2 * ($tl - 1)), false); + $url .= $text; + } - break; - case 63: - $function = 'RAND'; - $args = 0; + // apply the hyperlink to all the relevant cells + foreach (Coordinate::extractAllCellReferencesInRange($cellRange) as $coordinate) { + $this->phpSheet->getCell($coordinate)->getHyperLink()->setUrl($url); + } + } + } - break; - case 65: - $function = 'DATE'; - $args = 3; + /** + * Read DATAVALIDATIONS record. + */ + protected function readDataValidations(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + //$recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 66: - $function = 'TIME'; - $args = 3; + // move stream pointer forward to next record + $this->pos += 4 + $length; + } - break; - case 67: - $function = 'DAY'; - $args = 1; + /** + * Read DATAVALIDATION record. 
+ */ + protected function readDataValidation(): void + { + (new Xls\DataValidationHelper())->readDataValidation2($this); + } - break; - case 68: - $function = 'MONTH'; - $args = 1; + /** + * Read SHEETLAYOUT record. Stores sheet tab color information. + */ + protected function readSheetLayout(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 69: - $function = 'YEAR'; - $args = 1; + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 71: - $function = 'HOUR'; - $args = 1; + if (!$this->readDataOnly) { + // offset: 0; size: 2; repeated record identifier 0x0862 - break; - case 72: - $function = 'MINUTE'; - $args = 1; + // offset: 2; size: 10; not used - break; - case 73: - $function = 'SECOND'; - $args = 1; + // offset: 12; size: 4; size of record data + // Excel 2003 uses size of 0x14 (documented), Excel 2007 uses size of 0x28 (not documented?) + $sz = self::getInt4d($recordData, 12); - break; - case 74: - $function = 'NOW'; - $args = 0; + switch ($sz) { + case 0x14: + // offset: 16; size: 2; color index for sheet tab + $colorIndex = self::getUInt2d($recordData, 16); + $color = Xls\Color::map($colorIndex, $this->palette, $this->version); + $this->phpSheet->getTabColor()->setRGB($color['rgb']); - break; - case 75: - $function = 'AREAS'; - $args = 1; + break; + case 0x28: + // TODO: Investigate structure for .xls SHEETLAYOUT record as saved by MS Office Excel 2007 + return; + } + } + } - break; - case 76: - $function = 'ROWS'; - $args = 1; + /** + * Read SHEETPROTECTION record (FEATHEADR). 
+ */ + protected function readSheetProtection(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 77: - $function = 'COLUMNS'; - $args = 1; + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 83: - $function = 'TRANSPOSE'; - $args = 1; + if ($this->readDataOnly) { + return; + } - break; - case 86: - $function = 'TYPE'; - $args = 1; + // offset: 0; size: 2; repeated record header - break; - case 97: - $function = 'ATAN2'; - $args = 2; + // offset: 2; size: 2; FRT cell reference flag (=0 currently) - break; - case 98: - $function = 'ASIN'; - $args = 1; + // offset: 4; size: 8; Currently not used and set to 0 - break; - case 99: - $function = 'ACOS'; - $args = 1; + // offset: 12; size: 2; Shared feature type index (2=Enhanced Protection, 4=SmartTag) + $isf = self::getUInt2d($recordData, 12); + if ($isf != 2) { + return; + } - break; - case 105: - $function = 'ISREF'; - $args = 1; + // offset: 14; size: 1; =1 since this is a feat header - break; - case 111: - $function = 'CHAR'; - $args = 1; + // offset: 15; size: 4; size of rgbHdrSData - break; - case 112: - $function = 'LOWER'; - $args = 1; + // rgbHdrSData, assume "Enhanced Protection" + // offset: 19; size: 2; option flags + $options = self::getUInt2d($recordData, 19); - break; - case 113: - $function = 'UPPER'; - $args = 1; + // bit: 0; mask 0x0001; 1 = user may edit objects, 0 = users must not edit objects + // Note - do not negate $bool + $bool = (0x0001 & $options) >> 0; + $this->phpSheet->getProtection()->setObjects((bool) $bool); - break; - case 114: - $function = 'PROPER'; - $args = 1; + // bit: 1; mask 0x0002; edit scenarios + // Note - do not negate $bool + $bool = (0x0002 & $options) >> 1; + $this->phpSheet->getProtection()->setScenarios((bool) $bool); - break; - case 117: - $function = 'EXACT'; - $args = 2; + // bit: 2; mask 0x0004; format cells + $bool = (0x0004
& $options) >> 2; + $this->phpSheet->getProtection()->setFormatCells(!$bool); - break; - case 118: - $function = 'TRIM'; - $args = 1; + // bit: 3; mask 0x0008; format columns + $bool = (0x0008 & $options) >> 3; + $this->phpSheet->getProtection()->setFormatColumns(!$bool); - break; - case 119: - $function = 'REPLACE'; - $args = 4; + // bit: 4; mask 0x0010; format rows + $bool = (0x0010 & $options) >> 4; + $this->phpSheet->getProtection()->setFormatRows(!$bool); - break; - case 121: - $function = 'CODE'; - $args = 1; + // bit: 5; mask 0x0020; insert columns + $bool = (0x0020 & $options) >> 5; + $this->phpSheet->getProtection()->setInsertColumns(!$bool); - break; - case 126: - $function = 'ISERR'; - $args = 1; + // bit: 6; mask 0x0040; insert rows + $bool = (0x0040 & $options) >> 6; + $this->phpSheet->getProtection()->setInsertRows(!$bool); - break; - case 127: - $function = 'ISTEXT'; - $args = 1; + // bit: 7; mask 0x0080; insert hyperlinks + $bool = (0x0080 & $options) >> 7; + $this->phpSheet->getProtection()->setInsertHyperlinks(!$bool); - break; - case 128: - $function = 'ISNUMBER'; - $args = 1; + // bit: 8; mask 0x0100; delete columns + $bool = (0x0100 & $options) >> 8; + $this->phpSheet->getProtection()->setDeleteColumns(!$bool); - break; - case 129: - $function = 'ISBLANK'; - $args = 1; + // bit: 9; mask 0x0200; delete rows + $bool = (0x0200 & $options) >> 9; + $this->phpSheet->getProtection()->setDeleteRows(!$bool); - break; - case 130: - $function = 'T'; - $args = 1; + // bit: 10; mask 0x0400; select locked cells + // Note that this is opposite of most of above. 
+ $bool = (0x0400 & $options) >> 10; + $this->phpSheet->getProtection()->setSelectLockedCells((bool) $bool); - break; - case 131: - $function = 'N'; - $args = 1; + // bit: 11; mask 0x0800; sort cell range + $bool = (0x0800 & $options) >> 11; + $this->phpSheet->getProtection()->setSort(!$bool); - break; - case 140: - $function = 'DATEVALUE'; - $args = 1; + // bit: 12; mask 0x1000; auto filter + $bool = (0x1000 & $options) >> 12; + $this->phpSheet->getProtection()->setAutoFilter(!$bool); - break; - case 141: - $function = 'TIMEVALUE'; - $args = 1; + // bit: 13; mask 0x2000; pivot tables + $bool = (0x2000 & $options) >> 13; + $this->phpSheet->getProtection()->setPivotTables(!$bool); - break; - case 142: - $function = 'SLN'; - $args = 3; + // bit: 14; mask 0x4000; select unlocked cells + // Note that this is opposite of most of above. + $bool = (0x4000 & $options) >> 14; + $this->phpSheet->getProtection()->setSelectUnlockedCells((bool) $bool); - break; - case 143: - $function = 'SYD'; - $args = 4; + // offset: 21; size: 2; not used + } - break; - case 162: - $function = 'CLEAN'; - $args = 1; + /** + * Read RANGEPROTECTION record + * Reading of this record is based on Microsoft Office Excel 97-2000 Binary File Format Specification, + * where it is referred to as FEAT record. 
+ */ + protected function readRangeProtection(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 163: - $function = 'MDETERM'; - $args = 1; + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 164: - $function = 'MINVERSE'; - $args = 1; + // local pointer in record data + $offset = 0; - break; - case 165: - $function = 'MMULT'; - $args = 2; + if (!$this->readDataOnly) { + $offset += 12; - break; - case 184: - $function = 'FACT'; - $args = 1; + // offset: 12; size: 2; shared feature type, 2 = enhanced protection, 4 = smart tag + $isf = self::getUInt2d($recordData, 12); + if ($isf != 2) { + // we only read FEAT records of type 2 + return; + } + $offset += 2; - break; - case 189: - $function = 'DPRODUCT'; - $args = 3; + $offset += 5; - break; - case 190: - $function = 'ISNONTEXT'; - $args = 1; + // offset: 19; size: 2; count of ref ranges this feature is on + $cref = self::getUInt2d($recordData, 19); + $offset += 2; - break; - case 195: - $function = 'DSTDEVP'; - $args = 3; + $offset += 6; - break; - case 196: - $function = 'DVARP'; - $args = 3; + // offset: 27; size: 8 * $cref; list of cell ranges (like in hyperlink record) + $cellRanges = []; + for ($i = 0; $i < $cref; ++$i) { + try { + $cellRange = Xls\Biff8::readBIFF8CellRangeAddressFixed(substr($recordData, 27 + 8 * $i, 8)); + } catch (PhpSpreadsheetException) { + return; + } + $cellRanges[] = $cellRange; + $offset += 8; + } - break; - case 198: - $function = 'ISLOGICAL'; - $args = 1; + // offset: var; size: var; variable length of feature specific data + //$rgbFeat = substr($recordData, $offset); + $offset += 4; - break; - case 199: - $function = 'DCOUNTA'; - $args = 3; + // offset: var; size: 4; the encrypted password (only 16-bit although field is 32-bit) + $wPassword = self::getInt4d($recordData, $offset); + $offset += 4; - break; - case 207: - $function = 'REPLACEB'; 
- $args = 4; + // Apply range protection to sheet + if ($cellRanges) { + $this->phpSheet->protectCells(implode(' ', $cellRanges), ($wPassword === 0) ? '' : strtoupper(dechex($wPassword)), true); + } + } + } - break; - case 210: - $function = 'MIDB'; - $args = 3; + /** + * Read a free CONTINUE record. Free CONTINUE record may be a camouflaged MSODRAWING record + * When MSODRAWING data on a sheet exceeds 8224 bytes, CONTINUE records are used instead. Undocumented. + * In this case, we must treat the CONTINUE record as a MSODRAWING record. + */ + protected function readContinue(): void + { + $length = self::getUInt2d($this->data, $this->pos + 2); + $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 211: - $function = 'LENB'; - $args = 1; + // check if we are reading drawing data + // this is in case a free CONTINUE record occurs in other circumstances we are unaware of + if ($this->drawingData == '') { + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 212: - $function = 'ROUNDUP'; - $args = 2; + return; + } - break; - case 213: - $function = 'ROUNDDOWN'; - $args = 2; + // check if record data is at least 4 bytes long, otherwise there is no chance this is MSODRAWING data + if ($length < 4) { + // move stream pointer to next record + $this->pos += 4 + $length; - break; - case 214: - $function = 'ASC'; - $args = 1; + return; + } - break; - case 215: - $function = 'DBCS'; - $args = 1; + // dirty check to see if CONTINUE record could be a camouflaged MSODRAWING record + // look inside CONTINUE record to see if it looks like a part of an Escher stream + // we know that Escher stream may be split at least at + // 0xF003 MsofbtSpgrContainer + // 0xF004 MsofbtSpContainer + // 0xF00D MsofbtClientTextbox + $validSplitPoints = [0xF003, 0xF004, 0xF00D]; // add identifiers if we find more - break; - case 221: - $function = 'TODAY'; - $args = 0; + $splitPoint = self::getUInt2d($recordData, 2); + if 
(in_array($splitPoint, $validSplitPoints)) { + // get spliced record data (and move pointer to next record) + $splicedRecordData = $this->getSplicedRecordData(); + $this->drawingData .= $splicedRecordData['recordData']; - break; - case 229: - $function = 'SINH'; - $args = 1; + return; + } - break; - case 230: - $function = 'COSH'; - $args = 1; + // move stream pointer to next record + $this->pos += 4 + $length; + } - break; - case 231: - $function = 'TANH'; - $args = 1; + /** + * Reads a record from current position in data stream and continues reading data as long as CONTINUE + * records are found. Splices the record data pieces and returns the combined string as if record data + * is in one piece. + * Moves the current position in the data stream to the start of the next record that is not a CONTINUE record. + */ + private function getSplicedRecordData(): array + { + $data = ''; + $spliceOffsets = []; - break; - case 232: - $function = 'ASINH'; - $args = 1; + $i = 0; + $spliceOffsets[0] = 0; - break; - case 233: - $function = 'ACOSH'; - $args = 1; + do { + ++$i; - break; - case 234: - $function = 'ATANH'; - $args = 1; + // offset: 0; size: 2; identifier + //$identifier = self::getUInt2d($this->data, $this->pos); + // offset: 2; size: 2; length + $length = self::getUInt2d($this->data, $this->pos + 2); + $data .= $this->readRecordData($this->data, $this->pos + 4, $length); - break; - case 235: - $function = 'DGET'; - $args = 3; + $spliceOffsets[$i] = $spliceOffsets[$i - 1] + $length; - break; - case 244: - $function = 'INFO'; - $args = 1; + $this->pos += 4 + $length; + $nextIdentifier = self::getUInt2d($this->data, $this->pos); + } while ($nextIdentifier == self::XLS_TYPE_CONTINUE); - break; - case 252: - $function = 'FREQUENCY'; - $args = 2; + return [ + 'recordData' => $data, + 'spliceOffsets' => $spliceOffsets, + ]; + } - break; - case 261: - $function = 'ERROR.TYPE'; - $args = 1; + /** + * Convert formula structure into human readable Excel formula like 'A3+A5*5'.
+ * + * @param string $formulaStructure The complete binary data for the formula + * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. with shared formulas + * + * @return string Human readable formula + */ + protected function getFormulaFromStructure(string $formulaStructure, string $baseCell = 'A1'): string + { + // offset: 0; size: 2; size of the following formula data + $sz = self::getUInt2d($formulaStructure, 0); - break; - case 271: - $function = 'GAMMALN'; - $args = 1; + // offset: 2; size: sz + $formulaData = substr($formulaStructure, 2, $sz); - break; - case 273: - $function = 'BINOMDIST'; - $args = 4; + // offset: 2 + sz; size: variable (optional) + if (strlen($formulaStructure) > 2 + $sz) { + $additionalData = substr($formulaStructure, 2 + $sz); + } else { + $additionalData = ''; + } - break; - case 274: - $function = 'CHIDIST'; - $args = 2; + return $this->getFormulaFromData($formulaData, $additionalData, $baseCell); + } - break; - case 275: - $function = 'CHIINV'; - $args = 2; + /** + * Take formula data and additional data for formula and return human readable formula. + * + * @param string $formulaData The binary data for the formula itself + * @param string $additionalData Additional binary data going with the formula + * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. 
with shared formulas + * + * @return string Human readable formula + */ + private function getFormulaFromData(string $formulaData, string $additionalData = '', string $baseCell = 'A1'): string + { + // start parsing the formula data + $tokens = []; - break; - case 276: - $function = 'COMBIN'; - $args = 2; + while ($formulaData !== '' && $token = $this->getNextToken($formulaData, $baseCell)) { + $tokens[] = $token; + $formulaData = substr($formulaData, $token['size']); + } - break; - case 277: - $function = 'CONFIDENCE'; - $args = 3; + $formulaString = $this->createFormulaFromTokens($tokens, $additionalData); - break; - case 278: - $function = 'CRITBINOM'; - $args = 3; + return $formulaString; + } - break; - case 279: - $function = 'EVEN'; - $args = 1; + /** + * Take array of tokens together with additional data for formula and return human readable formula. + * + * @param string $additionalData Additional binary data going with the formula + * + * @return string Human readable formula + */ + private function createFormulaFromTokens(array $tokens, string $additionalData): string + { + // empty formula? + if (empty($tokens)) { + return ''; + } - break; - case 280: - $function = 'EXPONDIST'; - $args = 3; + $formulaStrings = []; + foreach ($tokens as $token) { + // initialize spaces + $space0 = $space0 ?? ''; // spaces before next token, not tParen + $space1 = $space1 ?? ''; // carriage returns before next token, not tParen + $space2 = $space2 ?? ''; // spaces before opening parenthesis + $space3 = $space3 ?? ''; // carriage returns before opening parenthesis + $space4 = $space4 ?? ''; // spaces before closing parenthesis + $space5 = $space5 ?? 
''; // carriage returns before closing parenthesis - break; - case 281: - $function = 'FDIST'; - $args = 3; + switch ($token['name']) { + case 'tAdd': // addition + case 'tConcat': // concatenation + case 'tDiv': // division + case 'tEQ': // equality + case 'tGE': // greater than or equal + case 'tGT': // greater than + case 'tIsect': // intersection + case 'tLE': // less than or equal + case 'tList': // list (union) + case 'tLT': // less than + case 'tMul': // multiplication + case 'tNE': // not equal + case 'tPower': // power + case 'tRange': // range + case 'tSub': // subtraction + $op2 = array_pop($formulaStrings); + $op1 = array_pop($formulaStrings); + $formulaStrings[] = "$op1$space1$space0{$token['data']}$op2"; + unset($space0, $space1); - break; - case 282: - $function = 'FINV'; - $args = 3; + break; + case 'tUplus': // unary plus + case 'tUminus': // unary minus + $op = array_pop($formulaStrings); + $formulaStrings[] = "$space1$space0{$token['data']}$op"; + unset($space0, $space1); - break; - case 283: - $function = 'FISHER'; - $args = 1; + break; + case 'tPercent': // percent sign + $op = array_pop($formulaStrings); + $formulaStrings[] = "$op$space1$space0{$token['data']}"; + unset($space0, $space1); - break; - case 284: - $function = 'FISHERINV'; - $args = 1; + break; + case 'tAttrVolatile': // indicates volatile function + case 'tAttrIf': + case 'tAttrSkip': + case 'tAttrChoose': + // token is only important for Excel formula evaluator + // do nothing + break; + case 'tAttrSpace': // space / carriage return + // space will be used when next token arrives, do not alter formulaString stack + switch ($token['data']['spacetype']) { + case 'type0': + $space0 = str_repeat(' ', $token['data']['spacecount']); - break; - case 285: - $function = 'FLOOR'; - $args = 2; + break; + case 'type1': + $space1 = str_repeat("\n", $token['data']['spacecount']); - break; - case 286: - $function = 'GAMMADIST'; - $args = 4; + break; + case 'type2': + $space2 =
str_repeat(' ', $token['data']['spacecount']); - break; - case 287: - $function = 'GAMMAINV'; - $args = 3; + break; + case 'type3': + $space3 = str_repeat("\n", $token['data']['spacecount']); - break; - case 288: - $function = 'CEILING'; - $args = 2; + break; + case 'type4': + $space4 = str_repeat(' ', $token['data']['spacecount']); - break; - case 289: - $function = 'HYPGEOMDIST'; - $args = 4; + break; + case 'type5': + $space5 = str_repeat("\n", $token['data']['spacecount']); - break; - case 290: - $function = 'LOGNORMDIST'; - $args = 3; + break; + } - break; - case 291: - $function = 'LOGINV'; - $args = 3; + break; + case 'tAttrSum': // SUM function with one parameter + $op = array_pop($formulaStrings); + $formulaStrings[] = "{$space1}{$space0}SUM($op)"; + unset($space0, $space1); - break; - case 292: - $function = 'NEGBINOMDIST'; - $args = 3; + break; + case 'tFunc': // function with fixed number of arguments + case 'tFuncV': // function with variable number of arguments + if ($token['data']['function'] != '') { + // normal function + $ops = []; // array of operators + for ($i = 0; $i < $token['data']['args']; ++$i) { + $ops[] = array_pop($formulaStrings); + } + $ops = array_reverse($ops); + $formulaStrings[] = "$space1$space0{$token['data']['function']}(" . implode(',', $ops) . ')'; + unset($space0, $space1); + } else { + // add-in function + $ops = []; // array of operators + for ($i = 0; $i < $token['data']['args'] - 1; ++$i) { + $ops[] = array_pop($formulaStrings); + } + $ops = array_reverse($ops); + $function = array_pop($formulaStrings); + $formulaStrings[] = "$space1$space0$function(" . implode(',', $ops) . 
')'; + unset($space0, $space1); + } - break; - case 293: - $function = 'NORMDIST'; - $args = 4; + break; + case 'tParen': // parenthesis + $expression = array_pop($formulaStrings); + $formulaStrings[] = "$space3$space2($expression$space5$space4)"; + unset($space2, $space3, $space4, $space5); - break; - case 294: - $function = 'NORMSDIST'; - $args = 1; + break; + case 'tArray': // array constant + $constantArray = Xls\Biff8::readBIFF8ConstantArray($additionalData); + $formulaStrings[] = $space1 . $space0 . $constantArray['value']; + $additionalData = substr($additionalData, $constantArray['size']); // bite off chunk of additional data + unset($space0, $space1); - break; - case 295: - $function = 'NORMINV'; - $args = 3; + break; + case 'tMemArea': + // bite off chunk of additional data + $cellRangeAddressList = Xls\Biff8::readBIFF8CellRangeAddressList($additionalData); + $additionalData = substr($additionalData, $cellRangeAddressList['size']); + $formulaStrings[] = "$space1$space0{$token['data']}"; + unset($space0, $space1); - break; - case 296: - $function = 'NORMSINV'; - $args = 1; + break; + case 'tArea': // cell range address + case 'tBool': // boolean + case 'tErr': // error code + case 'tInt': // integer + case 'tMemErr': + case 'tMemFunc': + case 'tMissArg': + case 'tName': + case 'tNameX': + case 'tNum': // number + case 'tRef': // single cell reference + case 'tRef3d': // 3d cell reference + case 'tArea3d': // 3d cell range reference + case 'tRefN': + case 'tAreaN': + case 'tStr': // string + $formulaStrings[] = "$space1$space0{$token['data']}"; + unset($space0, $space1); - break; - case 297: - $function = 'STANDARDIZE'; - $args = 3; + break; + } + } + $formulaString = $formulaStrings[0]; - break; - case 298: - $function = 'ODD'; - $args = 1; + return $formulaString; + } - break; - case 299: - $function = 'PERMUT'; - $args = 2; + /** + * Fetch next token from binary formula data.
+ * + * @param string $formulaData Formula data + * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. with shared formulas + */ + private function getNextToken(string $formulaData, string $baseCell = 'A1'): array + { + // offset: 0; size: 1; token id + $id = ord($formulaData[0]); // token id + $name = false; // initialize token name - break; - case 300: - $function = 'POISSON'; - $args = 3; + switch ($id) { + case 0x03: + $name = 'tAdd'; + $size = 1; + $data = '+'; - break; - case 301: - $function = 'TDIST'; - $args = 3; + break; + case 0x04: + $name = 'tSub'; + $size = 1; + $data = '-'; - break; - case 302: - $function = 'WEIBULL'; - $args = 4; + break; + case 0x05: + $name = 'tMul'; + $size = 1; + $data = '*'; - break; - case 303: - $function = 'SUMXMY2'; - $args = 2; + break; + case 0x06: + $name = 'tDiv'; + $size = 1; + $data = '/'; - break; - case 304: - $function = 'SUMX2MY2'; - $args = 2; + break; + case 0x07: + $name = 'tPower'; + $size = 1; + $data = '^'; - break; - case 305: - $function = 'SUMX2PY2'; - $args = 2; + break; + case 0x08: + $name = 'tConcat'; + $size = 1; + $data = '&'; - break; - case 306: - $function = 'CHITEST'; - $args = 2; + break; + case 0x09: + $name = 'tLT'; + $size = 1; + $data = '<'; - break; - case 307: - $function = 'CORREL'; - $args = 2; + break; + case 0x0A: + $name = 'tLE'; + $size = 1; + $data = '<='; - break; - case 308: - $function = 'COVAR'; - $args = 2; + break; + case 0x0B: + $name = 'tEQ'; + $size = 1; + $data = '='; - break; - case 309: - $function = 'FORECAST'; - $args = 3; + break; + case 0x0C: + $name = 'tGE'; + $size = 1; + $data = '>='; - break; - case 310: - $function = 'FTEST'; - $args = 2; + break; + case 0x0D: + $name = 'tGT'; + $size = 1; + $data = '>'; - break; - case 311: - $function = 'INTERCEPT'; - $args = 2; + break; + case 0x0E: + $name = 'tNE'; + $size = 1; + $data = '<>'; - break; - case 312: - $function = 'PEARSON'; - $args = 2; + break; + case 0x0F: + $name = 
'tIsect'; + $size = 1; + $data = ' '; - break; - case 313: - $function = 'RSQ'; - $args = 2; + break; + case 0x10: + $name = 'tList'; + $size = 1; + $data = ','; - break; - case 314: - $function = 'STEYX'; - $args = 2; + break; + case 0x11: + $name = 'tRange'; + $size = 1; + $data = ':'; - break; - case 315: - $function = 'SLOPE'; - $args = 2; + break; + case 0x12: + $name = 'tUplus'; + $size = 1; + $data = '+'; - break; - case 316: - $function = 'TTEST'; - $args = 4; + break; + case 0x13: + $name = 'tUminus'; + $size = 1; + $data = '-'; - break; - case 325: - $function = 'LARGE'; - $args = 2; + break; + case 0x14: + $name = 'tPercent'; + $size = 1; + $data = '%'; - break; - case 326: - $function = 'SMALL'; - $args = 2; + break; + case 0x15: // parenthesis + $name = 'tParen'; + $size = 1; + $data = null; - break; - case 327: - $function = 'QUARTILE'; - $args = 2; + break; + case 0x16: // missing argument + $name = 'tMissArg'; + $size = 1; + $data = ''; - break; - case 328: - $function = 'PERCENTILE'; - $args = 2; + break; + case 0x17: // string + $name = 'tStr'; + // offset: 1; size: var; Unicode string, 8-bit string length + $string = self::readUnicodeStringShort(substr($formulaData, 1)); + $size = 1 + $string['size']; + $data = self::UTF8toExcelDoubleQuoted($string['value']); - break; - case 331: - $function = 'TRIMMEAN'; - $args = 2; + break; + case 0x19: // Special attribute + // offset: 1; size: 1; attribute type flags: + switch (ord($formulaData[1])) { + case 0x01: + $name = 'tAttrVolatile'; + $size = 4; + $data = null; break; - case 332: - $function = 'TINV'; - $args = 2; + case 0x02: + $name = 'tAttrIf'; + $size = 4; + $data = null; break; - case 337: - $function = 'POWER'; - $args = 2; + case 0x04: + $name = 'tAttrChoose'; + // offset: 2; size: 2; number of choices in the CHOOSE function ($nc, number of parameters decreased by 1) + $nc = self::getUInt2d($formulaData, 2); + // offset: 4; size: 2 * $nc + // offset: 4 + 2 * $nc; size: 2 + $size = 2 * $nc + 6; 
+ $data = null; break; - case 342: - $function = 'RADIANS'; - $args = 1; + case 0x08: + $name = 'tAttrSkip'; + $size = 4; + $data = null; break; - case 343: - $function = 'DEGREES'; - $args = 1; + case 0x10: + $name = 'tAttrSum'; + $size = 4; + $data = null; break; - case 346: - $function = 'COUNTIF'; - $args = 2; + case 0x40: + case 0x41: + $name = 'tAttrSpace'; + $size = 4; + // offset: 2; size: 2; space type and position + $spacetype = match (ord($formulaData[2])) { + 0x00 => 'type0', + 0x01 => 'type1', + 0x02 => 'type2', + 0x03 => 'type3', + 0x04 => 'type4', + 0x05 => 'type5', + default => throw new Exception('Unrecognized space type in tAttrSpace token'), + }; + // offset: 3; size: 1; number of inserted spaces/carriage returns + $spacecount = ord($formulaData[3]); - break; - case 347: - $function = 'COUNTBLANK'; - $args = 1; + $data = ['spacetype' => $spacetype, 'spacecount' => $spacecount]; break; - case 350: - $function = 'ISPMT'; - $args = 4; + default: + throw new Exception('Unrecognized attribute flag in tAttr token'); + } - break; - case 351: - $function = 'DATEDIF'; - $args = 3; + break; + case 0x1C: // error code + // offset: 1; size: 1; error code + $name = 'tErr'; + $size = 2; + $data = Xls\ErrorCode::lookup(ord($formulaData[1])); - break; - case 352: - $function = 'DATESTRING'; - $args = 1; + break; + case 0x1D: // boolean + // offset: 1; size: 1; 0 = false, 1 = true; + $name = 'tBool'; + $size = 2; + $data = ord($formulaData[1]) ? 
'TRUE' : 'FALSE'; - break; - case 353: - $function = 'NUMBERSTRING'; - $args = 2; + break; + case 0x1E: // integer + // offset: 1; size: 2; unsigned 16-bit integer + $name = 'tInt'; + $size = 3; + $data = self::getUInt2d($formulaData, 1); - break; - case 360: - $function = 'PHONETIC'; - $args = 1; + break; + case 0x1F: // number + // offset: 1; size: 8; + $name = 'tNum'; + $size = 9; + $data = self::extractNumber(substr($formulaData, 1)); + $data = str_replace(',', '.', (string) $data); // in case non-English locale - break; - case 368: - $function = 'BAHTTEXT'; - $args = 1; + break; + case 0x20: // array constant + case 0x40: + case 0x60: + // offset: 1; size: 7; not used + $name = 'tArray'; + $size = 8; + $data = null; - break; - default: - throw new Exception('Unrecognized function in formula'); + break; + case 0x21: // function with fixed number of arguments + case 0x41: + case 0x61: + $name = 'tFunc'; + $size = 3; + // offset: 1; size: 2; index to built-in sheet function + $mapping = Xls\Mappings::TFUNC_MAPPINGS[self::getUInt2d($formulaData, 1)] ?? 
null; + if ($mapping === null) { + throw new Exception('Unrecognized function in formula'); } - $data = ['function' => $function, 'args' => $args]; + $data = ['function' => $mapping[0], 'args' => $mapping[1]]; break; case 0x22: // function with variable number of arguments @@ -6373,97 +4494,10 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr $args = ord($formulaData[1]); // offset: 2: size: 2; index to built-in sheet function $index = self::getUInt2d($formulaData, 2); - $function = match ($index) { - 0 => 'COUNT', - 1 => 'IF', - 4 => 'SUM', - 5 => 'AVERAGE', - 6 => 'MIN', - 7 => 'MAX', - 8 => 'ROW', - 9 => 'COLUMN', - 11 => 'NPV', - 12 => 'STDEV', - 13 => 'DOLLAR', - 14 => 'FIXED', - 28 => 'LOOKUP', - 29 => 'INDEX', - 36 => 'AND', - 37 => 'OR', - 46 => 'VAR', - 49 => 'LINEST', - 50 => 'TREND', - 51 => 'LOGEST', - 52 => 'GROWTH', - 56 => 'PV', - 57 => 'FV', - 58 => 'NPER', - 59 => 'PMT', - 60 => 'RATE', - 62 => 'IRR', - 64 => 'MATCH', - 70 => 'WEEKDAY', - 78 => 'OFFSET', - 82 => 'SEARCH', - 100 => 'CHOOSE', - 101 => 'HLOOKUP', - 102 => 'VLOOKUP', - 109 => 'LOG', - 115 => 'LEFT', - 116 => 'RIGHT', - 120 => 'SUBSTITUTE', - 124 => 'FIND', - 125 => 'CELL', - 144 => 'DDB', - 148 => 'INDIRECT', - 167 => 'IPMT', - 168 => 'PPMT', - 169 => 'COUNTA', - 183 => 'PRODUCT', - 193 => 'STDEVP', - 194 => 'VARP', - 197 => 'TRUNC', - 204 => 'USDOLLAR', - 205 => 'FINDB', - 206 => 'SEARCHB', - 208 => 'LEFTB', - 209 => 'RIGHTB', - 216 => 'RANK', - 219 => 'ADDRESS', - 220 => 'DAYS360', - 222 => 'VDB', - 227 => 'MEDIAN', - 228 => 'SUMPRODUCT', - 247 => 'DB', - 255 => '', - 269 => 'AVEDEV', - 270 => 'BETADIST', - 272 => 'BETAINV', - 317 => 'PROB', - 318 => 'DEVSQ', - 319 => 'GEOMEAN', - 320 => 'HARMEAN', - 321 => 'SUMSQ', - 322 => 'KURT', - 323 => 'SKEW', - 324 => 'ZTEST', - 329 => 'PERCENTRANK', - 330 => 'MODE', - 336 => 'CONCATENATE', - 344 => 'SUBTOTAL', - 345 => 'SUMIF', - 354 => 'ROMAN', - 358 => 'GETPIVOTDATA', - 359 => 'HYPERLINK', - 361 => 
'AVERAGEA', - 362 => 'MAXA', - 363 => 'MINA', - 364 => 'STDEVPA', - 365 => 'VARPA', - 366 => 'STDEVA', - 367 => 'VARA', - default => throw new Exception('Unrecognized function in formula'), - }; + $function = Xls\Mappings::TFUNCV_MAPPINGS[$index] ?? null; + if ($function === null) { + throw new Exception('Unrecognized function in formula'); + } $data = ['function' => $function, 'args' => $args]; break; @@ -6483,7 +4517,7 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr case 0x64: $name = 'tRef'; $size = 5; - $data = $this->readBIFF8CellAddress(substr($formulaData, 1, 4)); + $data = Xls\Biff8::readBIFF8CellAddress(substr($formulaData, 1, 4)); break; case 0x25: // cell range reference to cells in the same sheet (2d) @@ -6491,7 +4525,7 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr case 0x65: $name = 'tArea'; $size = 9; - $data = $this->readBIFF8CellRangeAddress(substr($formulaData, 1, 8)); + $data = Xls\Biff8::readBIFF8CellRangeAddress(substr($formulaData, 1, 8)); break; case 0x26: // Constant reference sub-expression @@ -6531,7 +4565,7 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr case 0x6C: $name = 'tRefN'; $size = 5; - $data = $this->readBIFF8CellAddressB(substr($formulaData, 1, 4), $baseCell); + $data = Xls\Biff8::readBIFF8CellAddressB(substr($formulaData, 1, 4), $baseCell); break; case 0x2D: // Relative 2d range reference @@ -6539,7 +4573,7 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr case 0x6D: $name = 'tAreaN'; $size = 9; - $data = $this->readBIFF8CellRangeAddressB(substr($formulaData, 1, 8), $baseCell); + $data = Xls\Biff8::readBIFF8CellRangeAddressB(substr($formulaData, 1, 8), $baseCell); break; case 0x39: // External name @@ -6565,7 +4599,7 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr // offset: 1; size: 2; index to REF entry $sheetRange = 
$this->readSheetRangeByRefIndex(self::getUInt2d($formulaData, 1)); // offset: 3; size: 4; cell address - $cellAddress = $this->readBIFF8CellAddress(substr($formulaData, 3, 4)); + $cellAddress = Xls\Biff8::readBIFF8CellAddress(substr($formulaData, 3, 4)); $data = "$sheetRange!$cellAddress"; } catch (PhpSpreadsheetException) { @@ -6584,349 +4618,25 @@ private function getNextToken(string $formulaData, string $baseCell = 'A1'): arr // offset: 1; size: 2; index to REF entry $sheetRange = $this->readSheetRangeByRefIndex(self::getUInt2d($formulaData, 1)); // offset: 3; size: 8; cell address - $cellRangeAddress = $this->readBIFF8CellRangeAddress(substr($formulaData, 3, 8)); + $cellRangeAddress = Xls\Biff8::readBIFF8CellRangeAddress(substr($formulaData, 3, 8)); $data = "$sheetRange!$cellRangeAddress"; } catch (PhpSpreadsheetException) { // deleted sheet reference - $data = '#REF!'; - } - - break; - // Unknown cases // don't know how to deal with - default: - throw new Exception('Unrecognized token ' . sprintf('%02X', $id) . ' in formula'); - } - - return [ - 'id' => $id, - 'name' => $name, - 'size' => $size, - 'data' => $data, - ]; - } - - /** - * Reads a cell address in BIFF8 e.g. 'A2' or '$A$2' - * section 3.3.4. - */ - private function readBIFF8CellAddress(string $cellAddressStructure): string - { - // offset: 0; size: 2; index to row (0... 65535) (or offset (-32768... 32767)) - $row = self::getUInt2d($cellAddressStructure, 0) + 1; - - // offset: 2; size: 2; index to column or column offset + relative flags - // bit: 7-0; mask 0x00FF; column index - $column = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($cellAddressStructure, 2)) + 1); - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($cellAddressStructure, 2))) { - $column = '$' . 
$column; - } - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($cellAddressStructure, 2))) { - $row = '$' . $row; - } - - return $column . $row; - } - - /** - * Reads a cell address in BIFF8 for shared formulas. Uses positive and negative values for row and column - * to indicate offsets from a base cell - * section 3.3.4. - * - * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. with shared formulas - */ - private function readBIFF8CellAddressB(string $cellAddressStructure, string $baseCell = 'A1'): string - { - [$baseCol, $baseRow] = Coordinate::coordinateFromString($baseCell); - $baseCol = Coordinate::columnIndexFromString($baseCol) - 1; - $baseRow = (int) $baseRow; - - // offset: 0; size: 2; index to row (0... 65535) (or offset (-32768... 32767)) - $rowIndex = self::getUInt2d($cellAddressStructure, 0); - $row = self::getUInt2d($cellAddressStructure, 0) + 1; - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($cellAddressStructure, 2))) { - // offset: 2; size: 2; index to column or column offset + relative flags - // bit: 7-0; mask 0x00FF; column index - $colIndex = 0x00FF & self::getUInt2d($cellAddressStructure, 2); - - $column = Coordinate::stringFromColumnIndex($colIndex + 1); - $column = '$' . $column; - } else { - // offset: 2; size: 2; index to column or column offset + relative flags - // bit: 7-0; mask 0x00FF; column index - $relativeColIndex = 0x00FF & self::getInt2d($cellAddressStructure, 2); - $colIndex = $baseCol + $relativeColIndex; - $colIndex = ($colIndex < 256) ? $colIndex : $colIndex - 256; - $colIndex = ($colIndex >= 0) ? $colIndex : $colIndex + 256; - $column = Coordinate::stringFromColumnIndex($colIndex + 1); - } - - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($cellAddressStructure, 2))) { - $row = '$' . 
$row; - } else { - $rowIndex = ($rowIndex <= 32767) ? $rowIndex : $rowIndex - 65536; - $row = $baseRow + $rowIndex; - } - - return $column . $row; - } - - /** - * Reads a cell range address in BIFF5 e.g. 'A2:B6' or 'A1' - * always fixed range - * section 2.5.14. - */ - private function readBIFF5CellRangeAddressFixed(string $subData): string - { - // offset: 0; size: 2; index to first row - $fr = self::getUInt2d($subData, 0) + 1; - - // offset: 2; size: 2; index to last row - $lr = self::getUInt2d($subData, 2) + 1; - - // offset: 4; size: 1; index to first column - $fc = ord($subData[4]); - - // offset: 5; size: 1; index to last column - $lc = ord($subData[5]); - - // check values - if ($fr > $lr || $fc > $lc) { - throw new Exception('Not a cell range address'); - } - - // column index to letter - $fc = Coordinate::stringFromColumnIndex($fc + 1); - $lc = Coordinate::stringFromColumnIndex($lc + 1); - - if ($fr == $lr && $fc == $lc) { - return "$fc$fr"; - } - - return "$fc$fr:$lc$lr"; - } - - /** - * Reads a cell range address in BIFF8 e.g. 'A2:B6' or 'A1' - * always fixed range - * section 2.5.14. - */ - private function readBIFF8CellRangeAddressFixed(string $subData): string - { - // offset: 0; size: 2; index to first row - $fr = self::getUInt2d($subData, 0) + 1; - - // offset: 2; size: 2; index to last row - $lr = self::getUInt2d($subData, 2) + 1; - - // offset: 4; size: 2; index to first column - $fc = self::getUInt2d($subData, 4); - - // offset: 6; size: 2; index to last column - $lc = self::getUInt2d($subData, 6); - - // check values - if ($fr > $lr || $fc > $lc) { - throw new Exception('Not a cell range address'); - } - - // column index to letter - $fc = Coordinate::stringFromColumnIndex($fc + 1); - $lc = Coordinate::stringFromColumnIndex($lc + 1); - - if ($fr == $lr && $fc == $lc) { - return "$fc$fr"; - } - - return "$fc$fr:$lc$lr"; - } - - /** - * Reads a cell range address in BIFF8 e.g. 
'A2:B6' or '$A$2:$B$6' - * there are flags indicating whether column/row index is relative - * section 3.3.4. - */ - private function readBIFF8CellRangeAddress(string $subData): string - { - // todo: if cell range is just a single cell, should this funciton - // not just return e.g. 'A1' and not 'A1:A1' ? - - // offset: 0; size: 2; index to first row (0... 65535) (or offset (-32768... 32767)) - $fr = self::getUInt2d($subData, 0) + 1; - - // offset: 2; size: 2; index to last row (0... 65535) (or offset (-32768... 32767)) - $lr = self::getUInt2d($subData, 2) + 1; - - // offset: 4; size: 2; index to first column or column offset + relative flags - - // bit: 7-0; mask 0x00FF; column index - $fc = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($subData, 4)) + 1); - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($subData, 4))) { - $fc = '$' . $fc; - } - - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($subData, 4))) { - $fr = '$' . $fr; - } - - // offset: 6; size: 2; index to last column or column offset + relative flags - - // bit: 7-0; mask 0x00FF; column index - $lc = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($subData, 6)) + 1); - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($subData, 6))) { - $lc = '$' . $lc; - } - - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($subData, 6))) { - $lr = '$' . $lr; - } - - return "$fc$fr:$lc$lr"; - } - - /** - * Reads a cell range address in BIFF8 for shared formulas. Uses positive and negative values for row and column - * to indicate offsets from a base cell - * section 3.3.4. 
- * - * @param string $baseCell Base cell - * - * @return string Cell range address - */ - private function readBIFF8CellRangeAddressB(string $subData, string $baseCell = 'A1'): string - { - [$baseCol, $baseRow] = Coordinate::indexesFromString($baseCell); - $baseCol = $baseCol - 1; - - // TODO: if cell range is just a single cell, should this funciton - // not just return e.g. 'A1' and not 'A1:A1' ? - - // offset: 0; size: 2; first row - $frIndex = self::getUInt2d($subData, 0); // adjust below - - // offset: 2; size: 2; relative index to first row (0... 65535) should be treated as offset (-32768... 32767) - $lrIndex = self::getUInt2d($subData, 2); // adjust below - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($subData, 4))) { - // absolute column index - // offset: 4; size: 2; first column with relative/absolute flags - // bit: 7-0; mask 0x00FF; column index - $fcIndex = 0x00FF & self::getUInt2d($subData, 4); - $fc = Coordinate::stringFromColumnIndex($fcIndex + 1); - $fc = '$' . $fc; - } else { - // column offset - // offset: 4; size: 2; first column with relative/absolute flags - // bit: 7-0; mask 0x00FF; column index - $relativeFcIndex = 0x00FF & self::getInt2d($subData, 4); - $fcIndex = $baseCol + $relativeFcIndex; - $fcIndex = ($fcIndex < 256) ? $fcIndex : $fcIndex - 256; - $fcIndex = ($fcIndex >= 0) ? $fcIndex : $fcIndex + 256; - $fc = Coordinate::stringFromColumnIndex($fcIndex + 1); - } - - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($subData, 4))) { - // absolute row index - $fr = $frIndex + 1; - $fr = '$' . $fr; - } else { - // row offset - $frIndex = ($frIndex <= 32767) ? 
$frIndex : $frIndex - 65536; - $fr = $baseRow + $frIndex; - } - - // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) - if (!(0x4000 & self::getUInt2d($subData, 6))) { - // absolute column index - // offset: 6; size: 2; last column with relative/absolute flags - // bit: 7-0; mask 0x00FF; column index - $lcIndex = 0x00FF & self::getUInt2d($subData, 6); - $lc = Coordinate::stringFromColumnIndex($lcIndex + 1); - $lc = '$' . $lc; - } else { - // column offset - // offset: 6; size: 2; last column with relative/absolute flags - // bit: 7-0; mask 0x00FF; column index - $relativeLcIndex = 0x00FF & self::getInt2d($subData, 6); - $lcIndex = $baseCol + $relativeLcIndex; - $lcIndex = ($lcIndex < 256) ? $lcIndex : $lcIndex - 256; - $lcIndex = ($lcIndex >= 0) ? $lcIndex : $lcIndex + 256; - $lc = Coordinate::stringFromColumnIndex($lcIndex + 1); - } - - // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) - if (!(0x8000 & self::getUInt2d($subData, 6))) { - // absolute row index - $lr = $lrIndex + 1; - $lr = '$' . $lr; - } else { - // row offset - $lrIndex = ($lrIndex <= 32767) ? $lrIndex : $lrIndex - 65536; - $lr = $baseRow + $lrIndex; - } - - return "$fc$fr:$lc$lr"; - } - - /** - * Read BIFF8 cell range address list - * section 2.5.15. - */ - private function readBIFF8CellRangeAddressList(string $subData): array - { - $cellRangeAddresses = []; - - // offset: 0; size: 2; number of the following cell range addresses - $nm = self::getUInt2d($subData, 0); - - $offset = 2; - // offset: 2; size: 8 * $nm; list of $nm (fixed) cell range addresses - for ($i = 0; $i < $nm; ++$i) { - $cellRangeAddresses[] = $this->readBIFF8CellRangeAddressFixed(substr($subData, $offset, 8)); - $offset += 8; - } - - return [ - 'size' => 2 + 8 * $nm, - 'cellRangeAddresses' => $cellRangeAddresses, - ]; - } - - /** - * Read BIFF5 cell range address list - * section 2.5.15. 
- */ - private function readBIFF5CellRangeAddressList(string $subData): array - { - $cellRangeAddresses = []; - - // offset: 0; size: 2; number of the following cell range addresses - $nm = self::getUInt2d($subData, 0); + $data = '#REF!'; + } - $offset = 2; - // offset: 2; size: 6 * $nm; list of $nm (fixed) cell range addresses - for ($i = 0; $i < $nm; ++$i) { - $cellRangeAddresses[] = $this->readBIFF5CellRangeAddressFixed(substr($subData, $offset, 6)); - $offset += 6; + break; + // Unknown cases // don't know how to deal with + default: + throw new Exception('Unrecognized token ' . sprintf('%02X', $id) . ' in formula'); } return [ - 'size' => 2 + 6 * $nm, - 'cellRangeAddresses' => $cellRangeAddresses, + 'id' => $id, + 'name' => $name, + 'size' => $size, + 'data' => $data, ]; } @@ -6936,7 +4646,7 @@ private function readBIFF5CellRangeAddressList(string $subData): array * It can also happen that the REF structure uses the -1 (FFFF) code to indicate deleted sheets, * in which case an Exception is thrown. */ - private function readSheetRangeByRefIndex(int $index): string|false + protected function readSheetRangeByRefIndex(int $index): string|false { if (isset($this->ref[$index])) { $type = $this->externalBooks[$this->ref[$index]['externalBookIndex']]['type']; @@ -6980,124 +4690,11 @@ private function readSheetRangeByRefIndex(int $index): string|false return false; } - /** - * read BIFF8 constant value array from array data - * returns e.g. ['value' => '{1,2;3,4}', 'size' => 40] - * section 2.5.8. 
- */ - private static function readBIFF8ConstantArray(string $arrayData): array - { - // offset: 0; size: 1; number of columns decreased by 1 - $nc = ord($arrayData[0]); - - // offset: 1; size: 2; number of rows decreased by 1 - $nr = self::getUInt2d($arrayData, 1); - $size = 3; // initialize - $arrayData = substr($arrayData, 3); - - // offset: 3; size: var; list of ($nc + 1) * ($nr + 1) constant values - $matrixChunks = []; - for ($r = 1; $r <= $nr + 1; ++$r) { - $items = []; - for ($c = 1; $c <= $nc + 1; ++$c) { - $constant = self::readBIFF8Constant($arrayData); - $items[] = $constant['value']; - $arrayData = substr($arrayData, $constant['size']); - $size += $constant['size']; - } - $matrixChunks[] = implode(',', $items); // looks like e.g. '1,"hello"' - } - $matrix = '{' . implode(';', $matrixChunks) . '}'; - - return [ - 'value' => $matrix, - 'size' => $size, - ]; - } - - /** - * read BIFF8 constant value which may be 'Empty Value', 'Number', 'String Value', 'Boolean Value', 'Error Value' - * section 2.5.7 - * returns e.g. ['value' => '5', 'size' => 9]. - */ - private static function readBIFF8Constant(string $valueData): array - { - // offset: 0; size: 1; identifier for type of constant - $identifier = ord($valueData[0]); - - switch ($identifier) { - case 0x00: // empty constant (what is this?) - $value = ''; - $size = 9; - - break; - case 0x01: // number - // offset: 1; size: 8; IEEE 754 floating-point value - $value = self::extractNumber(substr($valueData, 1, 8)); - $size = 9; - - break; - case 0x02: // string value - // offset: 1; size: var; Unicode string, 16-bit string length - $string = self::readUnicodeStringLong(substr($valueData, 1)); - $value = '"' . $string['value'] . 
'"'; - $size = 1 + $string['size']; - - break; - case 0x04: // boolean - // offset: 1; size: 1; 0 = FALSE, 1 = TRUE - if (ord($valueData[1])) { - $value = 'TRUE'; - } else { - $value = 'FALSE'; - } - $size = 9; - - break; - case 0x10: // error code - // offset: 1; size: 1; error code - $value = Xls\ErrorCode::lookup(ord($valueData[1])); - $size = 9; - - break; - default: - throw new PhpSpreadsheetException('Unsupported BIFF8 constant'); - } - - return [ - 'value' => $value, - 'size' => $size, - ]; - } - - /** - * Extract RGB color - * OpenOffice.org's Documentation of the Microsoft Excel File Format, section 2.5.4. - * - * @param string $rgb Encoded RGB value (4 bytes) - */ - private static function readRGB(string $rgb): array - { - // offset: 0; size 1; Red component - $r = ord($rgb[0]); - - // offset: 1; size: 1; Green component - $g = ord($rgb[1]); - - // offset: 2; size: 1; Blue component - $b = ord($rgb[2]); - - // HEX notation, e.g. 'FF00FC' - $rgb = sprintf('%02X%02X%02X', $r, $g, $b); - - return ['rgb' => $rgb]; - } - /** * Read byte string (8-bit string length) * OpenOffice documentation: 2.5.2. */ - private function readByteStringShort(string $subData): array + protected function readByteStringShort(string $subData): array { // offset: 0; size: 1; length of the string (character count) $ln = ord($subData[0]); @@ -7115,7 +4712,7 @@ private function readByteStringShort(string $subData): array * Read byte string (16-bit string length) * OpenOffice documentation: 2.5.2. */ - private function readByteStringLong(string $subData): array + protected function readByteStringLong(string $subData): array { // offset: 0; size: 2; length of the string (character count) $ln = self::getUInt2d($subData, 0); @@ -7130,207 +4727,7 @@ private function readByteStringLong(string $subData): array ]; } - /** - * Extracts an Excel Unicode short string (8-bit string length) - * OpenOffice documentation: 2.5.3 - * function will automatically find out where the Unicode string ends. 
- */ - private static function readUnicodeStringShort(string $subData): array - { - // offset: 0: size: 1; length of the string (character count) - $characterCount = ord($subData[0]); - - $string = self::readUnicodeString(substr($subData, 1), $characterCount); - - // add 1 for the string length - ++$string['size']; - - return $string; - } - - /** - * Extracts an Excel Unicode long string (16-bit string length) - * OpenOffice documentation: 2.5.3 - * this function is under construction, needs to support rich text, and Asian phonetic settings. - */ - private static function readUnicodeStringLong(string $subData): array - { - // offset: 0: size: 2; length of the string (character count) - $characterCount = self::getUInt2d($subData, 0); - - $string = self::readUnicodeString(substr($subData, 2), $characterCount); - - // add 2 for the string length - $string['size'] += 2; - - return $string; - } - - /** - * Read Unicode string with no string length field, but with known character count - * this function is under construction, needs to support rich text, and Asian phonetic settings - * OpenOffice.org's Documentation of the Microsoft Excel File Format, section 2.5.3. - */ - private static function readUnicodeString(string $subData, int $characterCount): array - { - // offset: 0: size: 1; option flags - // bit: 0; mask: 0x01; character compression (0 = compressed 8-bit, 1 = uncompressed 16-bit) - $isCompressed = !((0x01 & ord($subData[0])) >> 0); - - // bit: 2; mask: 0x04; Asian phonetic settings - //$hasAsian = (0x04) & ord($subData[0]) >> 2; - - // bit: 3; mask: 0x08; Rich-Text settings - //$hasRichText = (0x08) & ord($subData[0]) >> 3; - - // offset: 1: size: var; character array - // this offset assumes richtext and Asian phonetic settings are off which is generally wrong - // needs to be fixed - $value = self::encodeUTF16(substr($subData, 1, $isCompressed ? 
$characterCount : 2 * $characterCount), $isCompressed); - - return [ - 'value' => $value, - 'size' => $isCompressed ? 1 + $characterCount : 1 + 2 * $characterCount, // the size in bytes including the option flags - ]; - } - - /** - * Convert UTF-8 string to string surounded by double quotes. Used for explicit string tokens in formulas. - * Example: hello"world --> "hello""world". - * - * @param string $value UTF-8 encoded string - */ - private static function UTF8toExcelDoubleQuoted(string $value): string - { - return '"' . str_replace('"', '""', $value) . '"'; - } - - /** - * Reads first 8 bytes of a string and return IEEE 754 float. - * - * @param string $data Binary string that is at least 8 bytes long - */ - private static function extractNumber(string $data): int|float - { - $rknumhigh = self::getInt4d($data, 4); - $rknumlow = self::getInt4d($data, 0); - $sign = ($rknumhigh & self::HIGH_ORDER_BIT) >> 31; - $exp = (($rknumhigh & 0x7FF00000) >> 20) - 1023; - $mantissa = (0x100000 | ($rknumhigh & 0x000FFFFF)); - $mantissalow1 = ($rknumlow & self::HIGH_ORDER_BIT) >> 31; - $mantissalow2 = ($rknumlow & 0x7FFFFFFF); - $value = $mantissa / 2 ** (20 - $exp); - - if ($mantissalow1 != 0) { - $value += 1 / 2 ** (21 - $exp); - } - - if ($mantissalow2 != 0) { - $value += $mantissalow2 / 2 ** (52 - $exp); - } - if ($sign) { - $value *= -1; - } - - return $value; - } - - private static function getIEEE754(int $rknum): float|int - { - if (($rknum & 0x02) != 0) { - $value = $rknum >> 2; - } else { - // changes by mmp, info on IEEE754 encoding from - // research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html - // The RK format calls for using only the most significant 30 bits - // of the 64 bit floating point value. The other 34 bits are assumed - // to be 0 so we use the upper 30 bits of $rknum as follows... 
- $sign = ($rknum & self::HIGH_ORDER_BIT) >> 31; - $exp = ($rknum & 0x7FF00000) >> 20; - $mantissa = (0x100000 | ($rknum & 0x000FFFFC)); - $value = $mantissa / 2 ** (20 - ($exp - 1023)); - if ($sign) { - $value = -1 * $value; - } - //end of changes by mmp - } - if (($rknum & 0x01) != 0) { - $value /= 100; - } - - return $value; - } - - /** - * Get UTF-8 string from (compressed or uncompressed) UTF-16 string. - */ - private static function encodeUTF16(string $string, bool $compressed = false): string - { - if ($compressed) { - $string = self::uncompressByteString($string); - } - - return StringHelper::convertEncoding($string, 'UTF-8', 'UTF-16LE'); - } - - /** - * Convert UTF-16 string in compressed notation to uncompressed form. Only used for BIFF8. - */ - private static function uncompressByteString(string $string): string - { - $uncompressedString = ''; - $strLen = strlen($string); - for ($i = 0; $i < $strLen; ++$i) { - $uncompressedString .= $string[$i] . "\0"; - } - - return $uncompressedString; - } - - /** - * Convert string to UTF-8. Only used for BIFF5. - */ - private function decodeCodepage(string $string): string - { - return StringHelper::convertEncoding($string, 'UTF-8', $this->codepage); - } - - /** - * Read 16-bit unsigned integer. - */ - public static function getUInt2d(string $data, int $pos): int - { - return ord($data[$pos]) | (ord($data[$pos + 1]) << 8); - } - - /** - * Read 16-bit signed integer. - */ - public static function getInt2d(string $data, int $pos): int - { - return unpack('s', $data[$pos] . $data[$pos + 1])[1]; // @phpstan-ignore-line - } - - /** - * Read 32-bit signed integer. 
- */ - public static function getInt4d(string $data, int $pos): int - { - // FIX: represent numbers correctly on 64-bit system - // http://sourceforge.net/tracker/index.php?func=detail&aid=1487372&group_id=99160&atid=623334 - // Changed by Andreas Rehm 2006 to ensure correct result of the <<24 block on 32 and 64bit systems - $_or_24 = ord($data[$pos + 3]); - if ($_or_24 >= 128) { - // negative number - $_ord_24 = -abs((256 - $_or_24) << 24); - } else { - $_ord_24 = ($_or_24 & 127) << 24; - } - - return ord($data[$pos]) | (ord($data[$pos + 1]) << 8) | (ord($data[$pos + 2]) << 16) | $_ord_24; - } - - private function parseRichText(string $is): RichText + protected function parseRichText(string $is): RichText { $value = new RichText(); $value->createText($is); @@ -7356,286 +4753,14 @@ public function getMapCellStyleXfIndex(): array * * @see https://www.openoffice.org/sc/excelfileformat.pdf Search for CFHEADER followed by CFRULE */ - private function readCFHeader(): array - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer forward to next record - $this->pos += 4 + $length; - - if ($this->readDataOnly) { - return []; - } - - // offset: 0; size: 2; Rule Count -// $ruleCount = self::getUInt2d($recordData, 0); - - // offset: var; size: var; cell range address list with - $cellRangeAddressList = ($this->version == self::XLS_BIFF8) - ? 
$this->readBIFF8CellRangeAddressList(substr($recordData, 12)) - : $this->readBIFF5CellRangeAddressList(substr($recordData, 12)); - $cellRangeAddresses = $cellRangeAddressList['cellRangeAddresses']; - - return $cellRangeAddresses; - } - - private function readCFRule(array $cellRangeAddresses): void - { - $length = self::getUInt2d($this->data, $this->pos + 2); - $recordData = $this->readRecordData($this->data, $this->pos + 4, $length); - - // move stream pointer forward to next record - $this->pos += 4 + $length; - - if ($this->readDataOnly) { - return; - } - - // offset: 0; size: 2; Options - $cfRule = self::getUInt2d($recordData, 0); - - // bit: 8-15; mask: 0x00FF; type - $type = (0x00FF & $cfRule) >> 0; - $type = ConditionalFormatting::type($type); - - // bit: 0-7; mask: 0xFF00; type - $operator = (0xFF00 & $cfRule) >> 8; - $operator = ConditionalFormatting::operator($operator); - - if ($type === null || $operator === null) { - return; - } - - // offset: 2; size: 2; Size1 - $size1 = self::getUInt2d($recordData, 2); - - // offset: 4; size: 2; Size2 - $size2 = self::getUInt2d($recordData, 4); - - // offset: 6; size: 4; Options - $options = self::getInt4d($recordData, 6); - - $style = new Style(false, true); // non-supervisor, conditional - $noFormatSet = true; - //$this->getCFStyleOptions($options, $style); - - $hasFontRecord = (bool) ((0x04000000 & $options) >> 26); - $hasAlignmentRecord = (bool) ((0x08000000 & $options) >> 27); - $hasBorderRecord = (bool) ((0x10000000 & $options) >> 28); - $hasFillRecord = (bool) ((0x20000000 & $options) >> 29); - $hasProtectionRecord = (bool) ((0x40000000 & $options) >> 30); - // note unexpected values for following 4 - $hasBorderLeft = !(bool) (0x00000400 & $options); - $hasBorderRight = !(bool) (0x00000800 & $options); - $hasBorderTop = !(bool) (0x00001000 & $options); - $hasBorderBottom = !(bool) (0x00002000 & $options); - - $offset = 12; - - if ($hasFontRecord === true) { - $fontStyle = substr($recordData, $offset, 118); - 
$this->getCFFontStyle($fontStyle, $style); - $offset += 118; - $noFormatSet = false; - } - - if ($hasAlignmentRecord === true) { - //$alignmentStyle = substr($recordData, $offset, 8); - //$this->getCFAlignmentStyle($alignmentStyle, $style); - $offset += 8; - } - - if ($hasBorderRecord === true) { - $borderStyle = substr($recordData, $offset, 8); - $this->getCFBorderStyle($borderStyle, $style, $hasBorderLeft, $hasBorderRight, $hasBorderTop, $hasBorderBottom); - $offset += 8; - $noFormatSet = false; - } - - if ($hasFillRecord === true) { - $fillStyle = substr($recordData, $offset, 4); - $this->getCFFillStyle($fillStyle, $style); - $offset += 4; - $noFormatSet = false; - } - - if ($hasProtectionRecord === true) { - //$protectionStyle = substr($recordData, $offset, 4); - //$this->getCFProtectionStyle($protectionStyle, $style); - $offset += 2; - } - - $formula1 = $formula2 = null; - if ($size1 > 0) { - $formula1 = $this->readCFFormula($recordData, $offset, $size1); - if ($formula1 === null) { - return; - } - - $offset += $size1; - } - - if ($size2 > 0) { - $formula2 = $this->readCFFormula($recordData, $offset, $size2); - if ($formula2 === null) { - return; - } - - $offset += $size2; - } - - $this->setCFRules($cellRangeAddresses, $type, $operator, $formula1, $formula2, $style, $noFormatSet); - } - - /*private function getCFStyleOptions(int $options, Style $style): void - { - }*/ - - private function getCFFontStyle(string $options, Style $style): void - { - $fontSize = self::getInt4d($options, 64); - if ($fontSize !== -1) { - $style->getFont()->setSize($fontSize / 20); // Convert twips to points - } - $options68 = self::getInt4d($options, 68); - $options88 = self::getInt4d($options, 88); - - if (($options88 & 2) === 0) { - $bold = self::getUInt2d($options, 72); // 400 = normal, 700 = bold - if ($bold !== 0) { - $style->getFont()->setBold($bold >= 550); - } - if (($options68 & 2) !== 0) { - $style->getFont()->setItalic(true); - } - } - if (($options88 & 0x80) === 0) { - if 
(($options68 & 0x80) !== 0) { - $style->getFont()->setStrikethrough(true); - } - } - - $color = self::getInt4d($options, 80); - - if ($color !== -1) { - $style->getFont()->getColor()->setRGB(Xls\Color::map($color, $this->palette, $this->version)['rgb']); - } - } - - /*private function getCFAlignmentStyle(string $options, Style $style): void - { - }*/ - - private function getCFBorderStyle(string $options, Style $style, bool $hasBorderLeft, bool $hasBorderRight, bool $hasBorderTop, bool $hasBorderBottom): void - { - $valueArray = unpack('V', $options); - $value = is_array($valueArray) ? $valueArray[1] : 0; - $left = $value & 15; - $right = ($value >> 4) & 15; - $top = ($value >> 8) & 15; - $bottom = ($value >> 12) & 15; - $leftc = ($value >> 16) & 0x7F; - $rightc = ($value >> 23) & 0x7F; - $valueArray = unpack('V', substr($options, 4)); - $value = is_array($valueArray) ? $valueArray[1] : 0; - $topc = $value & 0x7F; - $bottomc = ($value & 0x3F80) >> 7; - if ($hasBorderLeft) { - $style->getBorders()->getLeft() - ->setBorderStyle(self::BORDER_STYLE_MAP[$left]); - $style->getBorders()->getLeft()->getColor() - ->setRGB(Xls\Color::map($leftc, $this->palette, $this->version)['rgb']); - } - if ($hasBorderRight) { - $style->getBorders()->getRight() - ->setBorderStyle(self::BORDER_STYLE_MAP[$right]); - $style->getBorders()->getRight()->getColor() - ->setRGB(Xls\Color::map($rightc, $this->palette, $this->version)['rgb']); - } - if ($hasBorderTop) { - $style->getBorders()->getTop() - ->setBorderStyle(self::BORDER_STYLE_MAP[$top]); - $style->getBorders()->getTop()->getColor() - ->setRGB(Xls\Color::map($topc, $this->palette, $this->version)['rgb']); - } - if ($hasBorderBottom) { - $style->getBorders()->getBottom() - ->setBorderStyle(self::BORDER_STYLE_MAP[$bottom]); - $style->getBorders()->getBottom()->getColor() - ->setRGB(Xls\Color::map($bottomc, $this->palette, $this->version)['rgb']); - } - } - - private function getCFFillStyle(string $options, Style $style): void - { - 
$fillPattern = self::getUInt2d($options, 0); - // bit: 10-15; mask: 0xFC00; type - $fillPattern = (0xFC00 & $fillPattern) >> 10; - $fillPattern = FillPattern::lookup($fillPattern); - $fillPattern = $fillPattern === Fill::FILL_NONE ? Fill::FILL_SOLID : $fillPattern; - - if ($fillPattern !== Fill::FILL_NONE) { - $style->getFill()->setFillType($fillPattern); - - $fillColors = self::getUInt2d($options, 2); - - // bit: 0-6; mask: 0x007F; type - $color1 = (0x007F & $fillColors) >> 0; - - // bit: 7-13; mask: 0x3F80; type - $color2 = (0x3F80 & $fillColors) >> 7; - if ($fillPattern === Fill::FILL_SOLID) { - $style->getFill()->getStartColor()->setRGB(Xls\Color::map($color2, $this->palette, $this->version)['rgb']); - } else { - $style->getFill()->getStartColor()->setRGB(Xls\Color::map($color1, $this->palette, $this->version)['rgb']); - $style->getFill()->getEndColor()->setRGB(Xls\Color::map($color2, $this->palette, $this->version)['rgb']); - } - } - } - - /*private function getCFProtectionStyle(string $options, Style $style): void - { - }*/ - - private function readCFFormula(string $recordData, int $offset, int $size): float|int|string|null + protected function readCFHeader(): array { - try { - $formula = substr($recordData, $offset, $size); - $formula = pack('v', $size) . $formula; // prepend the length - - $formula = $this->getFormulaFromStructure($formula); - if (is_numeric($formula)) { - return (str_contains($formula, '.')) ? 
(float) $formula : (int) $formula; - } - - return $formula; - } catch (PhpSpreadsheetException) { - return null; - } + return (new Xls\ConditionalFormatting())->readCFHeader2($this); } - private function setCFRules(array $cellRanges, string $type, string $operator, null|float|int|string $formula1, null|float|int|string $formula2, Style $style, bool $noFormatSet): void + protected function readCFRule(array $cellRangeAddresses): void { - foreach ($cellRanges as $cellRange) { - $conditional = new Conditional(); - $conditional->setNoFormatSet($noFormatSet); - $conditional->setConditionType($type); - $conditional->setOperatorType($operator); - $conditional->setStopIfTrue(true); - if ($formula1 !== null) { - $conditional->addCondition($formula1); - } - if ($formula2 !== null) { - $conditional->addCondition($formula2); - } - $conditional->setStyle($style); - - $conditionalStyles = $this->phpSheet->getStyle($cellRange)->getConditionalStyles(); - $conditionalStyles[] = $conditional; - - $this->phpSheet->getStyle($cellRange)->setConditionalStyles($conditionalStyles); - } + (new Xls\ConditionalFormatting())->readCFRule2($cellRangeAddresses, $this); } public function getVersion(): int diff --git a/src/PhpSpreadsheet/Reader/Xls/Biff5.php b/src/PhpSpreadsheet/Reader/Xls/Biff5.php new file mode 100644 index 0000000000..ef9619eec5 --- /dev/null +++ b/src/PhpSpreadsheet/Reader/Xls/Biff5.php @@ -0,0 +1,69 @@ + $lr || $fc > $lc) { + throw new ReaderException('Not a cell range address'); + } + + // column index to letter + $fc = Coordinate::stringFromColumnIndex($fc + 1); + $lc = Coordinate::stringFromColumnIndex($lc + 1); + + if ($fr == $lr && $fc == $lc) { + return "$fc$fr"; + } + + return "$fc$fr:$lc$lr"; + } + + /** + * Read BIFF5 cell range address list + * section 2.5.15. 
+ */ + public static function readBIFF5CellRangeAddressList(string $subData): array + { + $cellRangeAddresses = []; + + // offset: 0; size: 2; number of the following cell range addresses + $nm = self::getUInt2d($subData, 0); + + $offset = 2; + // offset: 2; size: 6 * $nm; list of $nm (fixed) cell range addresses + for ($i = 0; $i < $nm; ++$i) { + $cellRangeAddresses[] = self::readBIFF5CellRangeAddressFixed(substr($subData, $offset, 6)); + $offset += 6; + } + + return [ + 'size' => 2 + 6 * $nm, + 'cellRangeAddresses' => $cellRangeAddresses, + ]; + } +} diff --git a/src/PhpSpreadsheet/Reader/Xls/Biff8.php b/src/PhpSpreadsheet/Reader/Xls/Biff8.php new file mode 100644 index 0000000000..f95c0b5ea5 --- /dev/null +++ b/src/PhpSpreadsheet/Reader/Xls/Biff8.php @@ -0,0 +1,365 @@ + '{1,2;3,4}', 'size' => 40] + * section 2.5.8. + */ + protected static function readBIFF8ConstantArray(string $arrayData): array + { + // offset: 0; size: 1; number of columns decreased by 1 + $nc = ord($arrayData[0]); + + // offset: 1; size: 2; number of rows decreased by 1 + $nr = self::getUInt2d($arrayData, 1); + $size = 3; // initialize + $arrayData = substr($arrayData, 3); + + // offset: 3; size: var; list of ($nc + 1) * ($nr + 1) constant values + $matrixChunks = []; + for ($r = 1; $r <= $nr + 1; ++$r) { + $items = []; + for ($c = 1; $c <= $nc + 1; ++$c) { + $constant = self::readBIFF8Constant($arrayData); + $items[] = $constant['value']; + $arrayData = substr($arrayData, $constant['size']); + $size += $constant['size']; + } + $matrixChunks[] = implode(',', $items); // looks like e.g. '1,"hello"' + } + $matrix = '{' . implode(';', $matrixChunks) . '}'; + + return [ + 'value' => $matrix, + 'size' => $size, + ]; + } + + /** + * read BIFF8 constant value which may be 'Empty Value', 'Number', 'String Value', 'Boolean Value', 'Error Value' + * section 2.5.7 + * returns e.g. ['value' => '5', 'size' => 9]. 
+ */ + private static function readBIFF8Constant(string $valueData): array + { + // offset: 0; size: 1; identifier for type of constant + $identifier = ord($valueData[0]); + + switch ($identifier) { + case 0x00: // empty constant (what is this?) + $value = ''; + $size = 9; + + break; + case 0x01: // number + // offset: 1; size: 8; IEEE 754 floating-point value + $value = self::extractNumber(substr($valueData, 1, 8)); + $size = 9; + + break; + case 0x02: // string value + // offset: 1; size: var; Unicode string, 16-bit string length + $string = self::readUnicodeStringLong(substr($valueData, 1)); + $value = '"' . $string['value'] . '"'; + $size = 1 + $string['size']; + + break; + case 0x04: // boolean + // offset: 1; size: 1; 0 = FALSE, 1 = TRUE + if (ord($valueData[1])) { + $value = 'TRUE'; + } else { + $value = 'FALSE'; + } + $size = 9; + + break; + case 0x10: // error code + // offset: 1; size: 1; error code + $value = ErrorCode::lookup(ord($valueData[1])); + $size = 9; + + break; + default: + throw new ReaderException('Unsupported BIFF8 constant'); + } + + return [ + 'value' => $value, + 'size' => $size, + ]; + } + + /** + * Read BIFF8 cell range address list + * section 2.5.15. + */ + public static function readBIFF8CellRangeAddressList(string $subData): array + { + $cellRangeAddresses = []; + + // offset: 0; size: 2; number of the following cell range addresses + $nm = self::getUInt2d($subData, 0); + + $offset = 2; + // offset: 2; size: 8 * $nm; list of $nm (fixed) cell range addresses + for ($i = 0; $i < $nm; ++$i) { + $cellRangeAddresses[] = self::readBIFF8CellRangeAddressFixed(substr($subData, $offset, 8)); + $offset += 8; + } + + return [ + 'size' => 2 + 8 * $nm, + 'cellRangeAddresses' => $cellRangeAddresses, + ]; + } + + /** + * Reads a cell address in BIFF8 e.g. 'A2' or '$A$2' + * section 3.3.4. + */ + protected static function readBIFF8CellAddress(string $cellAddressStructure): string + { + // offset: 0; size: 2; index to row (0... 
65535) (or offset (-32768... 32767)) + $row = self::getUInt2d($cellAddressStructure, 0) + 1; + + // offset: 2; size: 2; index to column or column offset + relative flags + // bit: 7-0; mask 0x00FF; column index + $column = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($cellAddressStructure, 2)) + 1); + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($cellAddressStructure, 2))) { + $column = '$' . $column; + } + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($cellAddressStructure, 2))) { + $row = '$' . $row; + } + + return $column . $row; + } + + /** + * Reads a cell address in BIFF8 for shared formulas. Uses positive and negative values for row and column + * to indicate offsets from a base cell + * section 3.3.4. + * + * @param string $baseCell Base cell, only needed when formula contains tRefN tokens, e.g. with shared formulas + */ + protected static function readBIFF8CellAddressB(string $cellAddressStructure, string $baseCell = 'A1'): string + { + [$baseCol, $baseRow] = Coordinate::coordinateFromString($baseCell); + $baseCol = Coordinate::columnIndexFromString($baseCol) - 1; + $baseRow = (int) $baseRow; + + // offset: 0; size: 2; index to row (0... 65535) (or offset (-32768... 32767)) + $rowIndex = self::getUInt2d($cellAddressStructure, 0); + $row = self::getUInt2d($cellAddressStructure, 0) + 1; + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($cellAddressStructure, 2))) { + // offset: 2; size: 2; index to column or column offset + relative flags + // bit: 7-0; mask 0x00FF; column index + $colIndex = 0x00FF & self::getUInt2d($cellAddressStructure, 2); + + $column = Coordinate::stringFromColumnIndex($colIndex + 1); + $column = '$' . 
$column; + } else { + // offset: 2; size: 2; index to column or column offset + relative flags + // bit: 7-0; mask 0x00FF; column index + $relativeColIndex = 0x00FF & self::getInt2d($cellAddressStructure, 2); + $colIndex = $baseCol + $relativeColIndex; + $colIndex = ($colIndex < 256) ? $colIndex : $colIndex - 256; + $colIndex = ($colIndex >= 0) ? $colIndex : $colIndex + 256; + $column = Coordinate::stringFromColumnIndex($colIndex + 1); + } + + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($cellAddressStructure, 2))) { + $row = '$' . $row; + } else { + $rowIndex = ($rowIndex <= 32767) ? $rowIndex : $rowIndex - 65536; + $row = $baseRow + $rowIndex; + } + + return $column . $row; + } + + /** + * Reads a cell range address in BIFF8 e.g. 'A2:B6' or 'A1' + * always fixed range + * section 2.5.14. + */ + protected static function readBIFF8CellRangeAddressFixed(string $subData): string + { + // offset: 0; size: 2; index to first row + $fr = self::getUInt2d($subData, 0) + 1; + + // offset: 2; size: 2; index to last row + $lr = self::getUInt2d($subData, 2) + 1; + + // offset: 4; size: 2; index to first column + $fc = self::getUInt2d($subData, 4); + + // offset: 6; size: 2; index to last column + $lc = self::getUInt2d($subData, 6); + + // check values + if ($fr > $lr || $fc > $lc) { + throw new ReaderException('Not a cell range address'); + } + + // column index to letter + $fc = Coordinate::stringFromColumnIndex($fc + 1); + $lc = Coordinate::stringFromColumnIndex($lc + 1); + + if ($fr == $lr && $fc == $lc) { + return "$fc$fr"; + } + + return "$fc$fr:$lc$lr"; + } + + /** + * Reads a cell range address in BIFF8 e.g. 'A2:B6' or '$A$2:$B$6' + * there are flags indicating whether column/row index is relative + * section 3.3.4. + */ + protected static function readBIFF8CellRangeAddress(string $subData): string + { + // todo: if cell range is just a single cell, should this function + // not just return e.g.
'A1' and not 'A1:A1' ? + + // offset: 0; size: 2; index to first row (0... 65535) (or offset (-32768... 32767)) + $fr = self::getUInt2d($subData, 0) + 1; + + // offset: 2; size: 2; index to last row (0... 65535) (or offset (-32768... 32767)) + $lr = self::getUInt2d($subData, 2) + 1; + + // offset: 4; size: 2; index to first column or column offset + relative flags + + // bit: 7-0; mask 0x00FF; column index + $fc = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($subData, 4)) + 1); + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($subData, 4))) { + $fc = '$' . $fc; + } + + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($subData, 4))) { + $fr = '$' . $fr; + } + + // offset: 6; size: 2; index to last column or column offset + relative flags + + // bit: 7-0; mask 0x00FF; column index + $lc = Coordinate::stringFromColumnIndex((0x00FF & self::getUInt2d($subData, 6)) + 1); + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($subData, 6))) { + $lc = '$' . $lc; + } + + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($subData, 6))) { + $lr = '$' . $lr; + } + + return "$fc$fr:$lc$lr"; + } + + /** + * Reads a cell range address in BIFF8 for shared formulas. Uses positive and negative values for row and column + * to indicate offsets from a base cell + * section 3.3.4. + * + * @param string $baseCell Base cell + * + * @return string Cell range address + */ + protected static function readBIFF8CellRangeAddressB(string $subData, string $baseCell = 'A1'): string + { + [$baseCol, $baseRow] = Coordinate::indexesFromString($baseCell); + $baseCol = $baseCol - 1; + + // TODO: if cell range is just a single cell, should this function + // not just return e.g. 'A1' and not 'A1:A1' ?
+ + // offset: 0; size: 2; first row + $frIndex = self::getUInt2d($subData, 0); // adjust below + + // offset: 2; size: 2; relative index to last row (0... 65535) should be treated as offset (-32768... 32767) + $lrIndex = self::getUInt2d($subData, 2); // adjust below + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($subData, 4))) { + // absolute column index + // offset: 4; size: 2; first column with relative/absolute flags + // bit: 7-0; mask 0x00FF; column index + $fcIndex = 0x00FF & self::getUInt2d($subData, 4); + $fc = Coordinate::stringFromColumnIndex($fcIndex + 1); + $fc = '$' . $fc; + } else { + // column offset + // offset: 4; size: 2; first column with relative/absolute flags + // bit: 7-0; mask 0x00FF; column index + $relativeFcIndex = 0x00FF & self::getInt2d($subData, 4); + $fcIndex = $baseCol + $relativeFcIndex; + $fcIndex = ($fcIndex < 256) ? $fcIndex : $fcIndex - 256; + $fcIndex = ($fcIndex >= 0) ? $fcIndex : $fcIndex + 256; + $fc = Coordinate::stringFromColumnIndex($fcIndex + 1); + } + + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($subData, 4))) { + // absolute row index + $fr = $frIndex + 1; + $fr = '$' . $fr; + } else { + // row offset + $frIndex = ($frIndex <= 32767) ? $frIndex : $frIndex - 65536; + $fr = $baseRow + $frIndex; + } + + // bit: 14; mask 0x4000; (1 = relative column index, 0 = absolute column index) + if (!(0x4000 & self::getUInt2d($subData, 6))) { + // absolute column index + // offset: 6; size: 2; last column with relative/absolute flags + // bit: 7-0; mask 0x00FF; column index + $lcIndex = 0x00FF & self::getUInt2d($subData, 6); + $lc = Coordinate::stringFromColumnIndex($lcIndex + 1); + $lc = '$' .
$lc; + } else { + // column offset + // offset: 6; size: 2; last column with relative/absolute flags + // bit: 7-0; mask 0x00FF; column index + $relativeLcIndex = 0x00FF & self::getInt2d($subData, 6); + $lcIndex = $baseCol + $relativeLcIndex; + $lcIndex = ($lcIndex < 256) ? $lcIndex : $lcIndex - 256; + $lcIndex = ($lcIndex >= 0) ? $lcIndex : $lcIndex + 256; + $lc = Coordinate::stringFromColumnIndex($lcIndex + 1); + } + + // bit: 15; mask 0x8000; (1 = relative row index, 0 = absolute row index) + if (!(0x8000 & self::getUInt2d($subData, 6))) { + // absolute row index + $lr = $lrIndex + 1; + $lr = '$' . $lr; + } else { + // row offset + $lrIndex = ($lrIndex <= 32767) ? $lrIndex : $lrIndex - 65536; + $lr = $baseRow + $lrIndex; + } + + return "$fc$fr:$lc$lr"; + } +} diff --git a/src/PhpSpreadsheet/Reader/Xls/ConditionalFormatting.php b/src/PhpSpreadsheet/Reader/Xls/ConditionalFormatting.php index fbd31d565e..324d12cb21 100644 --- a/src/PhpSpreadsheet/Reader/Xls/ConditionalFormatting.php +++ b/src/PhpSpreadsheet/Reader/Xls/ConditionalFormatting.php @@ -2,9 +2,14 @@ namespace PhpOffice\PhpSpreadsheet\Reader\Xls; +use PhpOffice\PhpSpreadsheet\Exception as PhpSpreadsheetException; +use PhpOffice\PhpSpreadsheet\Reader\Xls; +use PhpOffice\PhpSpreadsheet\Reader\Xls\Style\FillPattern; use PhpOffice\PhpSpreadsheet\Style\Conditional; +use PhpOffice\PhpSpreadsheet\Style\Fill; +use PhpOffice\PhpSpreadsheet\Style\Style; -class ConditionalFormatting +class ConditionalFormatting extends Xls { /** * @var array @@ -46,4 +51,291 @@ public static function operator(int $operator): ?string return null; } + + /** + * Parse conditional formatting blocks. 
+ * + * @see https://www.openoffice.org/sc/excelfileformat.pdf Search for CFHEADER followed by CFRULE + */ + protected function readCFHeader2(Xls $xls): array + { + $length = self::getUInt2d($xls->data, $xls->pos + 2); + $recordData = $xls->readRecordData($xls->data, $xls->pos + 4, $length); + + // move stream pointer forward to next record + $xls->pos += 4 + $length; + + if ($xls->readDataOnly) { + return []; + } + + // offset: 0; size: 2; Rule Count +// $ruleCount = self::getUInt2d($recordData, 0); + + // offset: var; size: var; cell range address list with + $cellRangeAddressList = ($xls->version == self::XLS_BIFF8) + ? Biff8::readBIFF8CellRangeAddressList(substr($recordData, 12)) + : Biff5::readBIFF5CellRangeAddressList(substr($recordData, 12)); + $cellRangeAddresses = $cellRangeAddressList['cellRangeAddresses']; + + return $cellRangeAddresses; + } + + protected function readCFRule2(array $cellRangeAddresses, Xls $xls): void + { + $length = self::getUInt2d($xls->data, $xls->pos + 2); + $recordData = $xls->readRecordData($xls->data, $xls->pos + 4, $length); + + // move stream pointer forward to next record + $xls->pos += 4 + $length; + + if ($xls->readDataOnly) { + return; + } + + // offset: 0; size: 2; Options + $cfRule = self::getUInt2d($recordData, 0); + + // bit: 0-7; mask: 0x00FF; type + $type = (0x00FF & $cfRule) >> 0; + $type = self::type($type); + + // bit: 8-15; mask: 0xFF00; operator + $operator = (0xFF00 & $cfRule) >> 8; + $operator = self::operator($operator); + + if ($type === null || $operator === null) { + return; + } + + // offset: 2; size: 2; Size1 + $size1 = self::getUInt2d($recordData, 2); + + // offset: 4; size: 2; Size2 + $size2 = self::getUInt2d($recordData, 4); + + // offset: 6; size: 4; Options + $options = self::getInt4d($recordData, 6); + + $style = new Style(false, true); // non-supervisor, conditional + $noFormatSet = true; + //$xls->getCFStyleOptions($options, $style); + + $hasFontRecord = (bool) ((0x04000000 & $options) >> 26); +
$hasAlignmentRecord = (bool) ((0x08000000 & $options) >> 27); + $hasBorderRecord = (bool) ((0x10000000 & $options) >> 28); + $hasFillRecord = (bool) ((0x20000000 & $options) >> 29); + $hasProtectionRecord = (bool) ((0x40000000 & $options) >> 30); + // note unexpected values for following 4 + $hasBorderLeft = !(bool) (0x00000400 & $options); + $hasBorderRight = !(bool) (0x00000800 & $options); + $hasBorderTop = !(bool) (0x00001000 & $options); + $hasBorderBottom = !(bool) (0x00002000 & $options); + + $offset = 12; + + if ($hasFontRecord === true) { + $fontStyle = substr($recordData, $offset, 118); + $this->getCFFontStyle($fontStyle, $style, $xls); + $offset += 118; + $noFormatSet = false; + } + + if ($hasAlignmentRecord === true) { + //$alignmentStyle = substr($recordData, $offset, 8); + //$this->getCFAlignmentStyle($alignmentStyle, $style, $xls); + $offset += 8; + } + + if ($hasBorderRecord === true) { + $borderStyle = substr($recordData, $offset, 8); + $this->getCFBorderStyle($borderStyle, $style, $hasBorderLeft, $hasBorderRight, $hasBorderTop, $hasBorderBottom, $xls); + $offset += 8; + $noFormatSet = false; + } + + if ($hasFillRecord === true) { + $fillStyle = substr($recordData, $offset, 4); + $this->getCFFillStyle($fillStyle, $style, $xls); + $offset += 4; + $noFormatSet = false; + } + + if ($hasProtectionRecord === true) { + //$protectionStyle = substr($recordData, $offset, 4); + //$this->getCFProtectionStyle($protectionStyle, $style, $xls); + $offset += 2; + } + + $formula1 = $formula2 = null; + if ($size1 > 0) { + $formula1 = $this->readCFFormula($recordData, $offset, $size1, $xls); + if ($formula1 === null) { + return; + } + + $offset += $size1; + } + + if ($size2 > 0) { + $formula2 = $this->readCFFormula($recordData, $offset, $size2, $xls); + if ($formula2 === null) { + return; + } + + $offset += $size2; + } + + $this->setCFRules($cellRangeAddresses, $type, $operator, $formula1, $formula2, $style, $noFormatSet, $xls); + } + + /*private function 
getCFStyleOptions(int $options, Style $style, Xls $xls): void + { + }*/ + + private function getCFFontStyle(string $options, Style $style, Xls $xls): void + { + $fontSize = self::getInt4d($options, 64); + if ($fontSize !== -1) { + $style->getFont()->setSize($fontSize / 20); // Convert twips to points + } + $options68 = self::getInt4d($options, 68); + $options88 = self::getInt4d($options, 88); + + if (($options88 & 2) === 0) { + $bold = self::getUInt2d($options, 72); // 400 = normal, 700 = bold + if ($bold !== 0) { + $style->getFont()->setBold($bold >= 550); + } + if (($options68 & 2) !== 0) { + $style->getFont()->setItalic(true); + } + } + if (($options88 & 0x80) === 0) { + if (($options68 & 0x80) !== 0) { + $style->getFont()->setStrikethrough(true); + } + } + + $color = self::getInt4d($options, 80); + + if ($color !== -1) { + $style->getFont()->getColor()->setRGB(Color::map($color, $xls->palette, $xls->version)['rgb']); + } + } + + /*private function getCFAlignmentStyle(string $options, Style $style, Xls $xls): void + { + }*/ + + private function getCFBorderStyle(string $options, Style $style, bool $hasBorderLeft, bool $hasBorderRight, bool $hasBorderTop, bool $hasBorderBottom, Xls $xls): void + { + $valueArray = unpack('V', $options); + $value = is_array($valueArray) ? $valueArray[1] : 0; + $left = $value & 15; + $right = ($value >> 4) & 15; + $top = ($value >> 8) & 15; + $bottom = ($value >> 12) & 15; + $leftc = ($value >> 16) & 0x7F; + $rightc = ($value >> 23) & 0x7F; + $valueArray = unpack('V', substr($options, 4)); + $value = is_array($valueArray) ? 
$valueArray[1] : 0; + $topc = $value & 0x7F; + $bottomc = ($value & 0x3F80) >> 7; + if ($hasBorderLeft) { + $style->getBorders()->getLeft() + ->setBorderStyle(self::BORDER_STYLE_MAP[$left]); + $style->getBorders()->getLeft()->getColor() + ->setRGB(Color::map($leftc, $xls->palette, $xls->version)['rgb']); + } + if ($hasBorderRight) { + $style->getBorders()->getRight() + ->setBorderStyle(self::BORDER_STYLE_MAP[$right]); + $style->getBorders()->getRight()->getColor() + ->setRGB(Color::map($rightc, $xls->palette, $xls->version)['rgb']); + } + if ($hasBorderTop) { + $style->getBorders()->getTop() + ->setBorderStyle(self::BORDER_STYLE_MAP[$top]); + $style->getBorders()->getTop()->getColor() + ->setRGB(Color::map($topc, $xls->palette, $xls->version)['rgb']); + } + if ($hasBorderBottom) { + $style->getBorders()->getBottom() + ->setBorderStyle(self::BORDER_STYLE_MAP[$bottom]); + $style->getBorders()->getBottom()->getColor() + ->setRGB(Color::map($bottomc, $xls->palette, $xls->version)['rgb']); + } + } + + private function getCFFillStyle(string $options, Style $style, Xls $xls): void + { + $fillPattern = self::getUInt2d($options, 0); + // bit: 10-15; mask: 0xFC00; type + $fillPattern = (0xFC00 & $fillPattern) >> 10; + $fillPattern = FillPattern::lookup($fillPattern); + $fillPattern = $fillPattern === Fill::FILL_NONE ? 
Fill::FILL_SOLID : $fillPattern; + + if ($fillPattern !== Fill::FILL_NONE) { + $style->getFill()->setFillType($fillPattern); + + $fillColors = self::getUInt2d($options, 2); + + // bit: 0-6; mask: 0x007F; type + $color1 = (0x007F & $fillColors) >> 0; + + // bit: 7-13; mask: 0x3F80; type + $color2 = (0x3F80 & $fillColors) >> 7; + if ($fillPattern === Fill::FILL_SOLID) { + $style->getFill()->getStartColor()->setRGB(Color::map($color2, $xls->palette, $xls->version)['rgb']); + } else { + $style->getFill()->getStartColor()->setRGB(Color::map($color1, $xls->palette, $xls->version)['rgb']); + $style->getFill()->getEndColor()->setRGB(Color::map($color2, $xls->palette, $xls->version)['rgb']); + } + } + } + + /*private function getCFProtectionStyle(string $options, Style $style, Xls $xls): void + { + }*/ + + private function readCFFormula(string $recordData, int $offset, int $size, Xls $xls): float|int|string|null + { + try { + $formula = substr($recordData, $offset, $size); + $formula = pack('v', $size) . $formula; // prepend the length + + $formula = $xls->getFormulaFromStructure($formula); + if (is_numeric($formula)) { + return (str_contains($formula, '.')) ? 
(float) $formula : (int) $formula; + } + + return $formula; + } catch (PhpSpreadsheetException) { + return null; + } + } + + private function setCFRules(array $cellRanges, string $type, string $operator, null|float|int|string $formula1, null|float|int|string $formula2, Style $style, bool $noFormatSet, Xls $xls): void + { + foreach ($cellRanges as $cellRange) { + $conditional = new Conditional(); + $conditional->setNoFormatSet($noFormatSet); + $conditional->setConditionType($type); + $conditional->setOperatorType($operator); + $conditional->setStopIfTrue(true); + if ($formula1 !== null) { + $conditional->addCondition($formula1); + } + if ($formula2 !== null) { + $conditional->addCondition($formula2); + } + $conditional->setStyle($style); + + $conditionalStyles = $xls->phpSheet->getStyle($cellRange)->getConditionalStyles(); + $conditionalStyles[] = $conditional; + + $xls->phpSheet->getStyle($cellRange)->setConditionalStyles($conditionalStyles); + } + } } diff --git a/src/PhpSpreadsheet/Reader/Xls/DataValidationHelper.php b/src/PhpSpreadsheet/Reader/Xls/DataValidationHelper.php index 874e6994fd..a0c897efc9 100644 --- a/src/PhpSpreadsheet/Reader/Xls/DataValidationHelper.php +++ b/src/PhpSpreadsheet/Reader/Xls/DataValidationHelper.php @@ -2,9 +2,12 @@ namespace PhpOffice\PhpSpreadsheet\Reader\Xls; +use PhpOffice\PhpSpreadsheet\Cell\Coordinate; use PhpOffice\PhpSpreadsheet\Cell\DataValidation; +use PhpOffice\PhpSpreadsheet\Exception as PhpSpreadsheetException; +use PhpOffice\PhpSpreadsheet\Reader\Xls; -class DataValidationHelper +class DataValidationHelper extends Xls { /** * @var array @@ -69,4 +72,141 @@ public static function operator(int $operator): ?string return null; } + + /** + * Read DATAVALIDATION record. 
+ */ + protected function readDataValidation2(Xls $xls): void + { + $length = self::getUInt2d($xls->data, $xls->pos + 2); + $recordData = $xls->readRecordData($xls->data, $xls->pos + 4, $length); + + // move stream pointer forward to next record + $xls->pos += 4 + $length; + + if ($xls->readDataOnly) { + return; + } + + // offset: 0; size: 4; Options + $options = self::getInt4d($recordData, 0); + + // bit: 0-3; mask: 0x0000000F; type + $type = (0x0000000F & $options) >> 0; + $type = self::type($type); + + // bit: 4-6; mask: 0x00000070; error type + $errorStyle = (0x00000070 & $options) >> 4; + $errorStyle = self::errorStyle($errorStyle); + + // bit: 7; mask: 0x00000080; 1= formula is explicit (only applies to list) + // I have only seen cases where this is 1 + //$explicitFormula = (0x00000080 & $options) >> 7; + + // bit: 8; mask: 0x00000100; 1= empty cells allowed + $allowBlank = (0x00000100 & $options) >> 8; + + // bit: 9; mask: 0x00000200; 1= suppress drop down arrow in list type validity + $suppressDropDown = (0x00000200 & $options) >> 9; + + // bit: 18; mask: 0x00040000; 1= show prompt box if cell selected + $showInputMessage = (0x00040000 & $options) >> 18; + + // bit: 19; mask: 0x00080000; 1= show error box if invalid values entered + $showErrorMessage = (0x00080000 & $options) >> 19; + + // bit: 20-23; mask: 0x00F00000; condition operator + $operator = (0x00F00000 & $options) >> 20; + $operator = self::operator($operator); + + if ($type === null || $errorStyle === null || $operator === null) { + return; + } + + // offset: 4; size: var; title of the prompt box + $offset = 4; + $string = self::readUnicodeStringLong(substr($recordData, $offset)); + $promptTitle = $string['value'] !== chr(0) ? $string['value'] : ''; + $offset += $string['size']; + + // offset: var; size: var; title of the error box + $string = self::readUnicodeStringLong(substr($recordData, $offset)); + $errorTitle = $string['value'] !== chr(0) ? 
$string['value'] : ''; + $offset += $string['size']; + + // offset: var; size: var; text of the prompt box + $string = self::readUnicodeStringLong(substr($recordData, $offset)); + $prompt = $string['value'] !== chr(0) ? $string['value'] : ''; + $offset += $string['size']; + + // offset: var; size: var; text of the error box + $string = self::readUnicodeStringLong(substr($recordData, $offset)); + $error = $string['value'] !== chr(0) ? $string['value'] : ''; + $offset += $string['size']; + + // offset: var; size: 2; size of the formula data for the first condition + $sz1 = self::getUInt2d($recordData, $offset); + $offset += 2; + + // offset: var; size: 2; not used + $offset += 2; + + // offset: var; size: $sz1; formula data for first condition (without size field) + $formula1 = substr($recordData, $offset, $sz1); + $formula1 = pack('v', $sz1) . $formula1; // prepend the length + + try { + $formula1 = $xls->getFormulaFromStructure($formula1); + + // in list type validity, null characters are used as item separators + if ($type == DataValidation::TYPE_LIST) { + $formula1 = str_replace(chr(0), ',', $formula1); + } + } catch (PhpSpreadsheetException $e) { + return; + } + $offset += $sz1; + + // offset: var; size: 2; size of the formula data for the second condition + $sz2 = self::getUInt2d($recordData, $offset); + $offset += 2; + + // offset: var; size: 2; not used + $offset += 2; + + // offset: var; size: $sz2; formula data for second condition (without size field) + $formula2 = substr($recordData, $offset, $sz2); + $formula2 = pack('v', $sz2) .
$formula2; // prepend the length + + try { + $formula2 = $xls->getFormulaFromStructure($formula2); + } catch (PhpSpreadsheetException) { + return; + } + $offset += $sz2; + + // offset: var; size: var; cell range address list with + $cellRangeAddressList = Biff8::readBIFF8CellRangeAddressList(substr($recordData, $offset)); + $cellRangeAddresses = $cellRangeAddressList['cellRangeAddresses']; + + foreach ($cellRangeAddresses as $cellRange) { + $stRange = $xls->phpSheet->shrinkRangeToFit($cellRange); + foreach (Coordinate::extractAllCellReferencesInRange($stRange) as $coordinate) { + $objValidation = $xls->phpSheet->getCell($coordinate)->getDataValidation(); + $objValidation->setType($type); + $objValidation->setErrorStyle($errorStyle); + $objValidation->setAllowBlank((bool) $allowBlank); + $objValidation->setShowInputMessage((bool) $showInputMessage); + $objValidation->setShowErrorMessage((bool) $showErrorMessage); + $objValidation->setShowDropDown(!$suppressDropDown); + $objValidation->setOperator($operator); + $objValidation->setErrorTitle($errorTitle); + $objValidation->setError($error); + $objValidation->setPromptTitle($promptTitle); + $objValidation->setPrompt($prompt); + $objValidation->setFormula1($formula1); + $objValidation->setFormula2($formula2); + } + } + } } diff --git a/src/PhpSpreadsheet/Reader/Xls/ListFunctions.php b/src/PhpSpreadsheet/Reader/Xls/ListFunctions.php new file mode 100644 index 0000000000..2d8778ff25 --- /dev/null +++ b/src/PhpSpreadsheet/Reader/Xls/ListFunctions.php @@ -0,0 +1,158 @@ +loadOLE($filename); + + // total byte size of Excel data (workbook global substream + sheet substreams) + $xls->dataSize = strlen($xls->data); + + $xls->pos = 0; + $xls->sheets = []; + + // Parse Workbook Global Substream + while ($xls->pos < $xls->dataSize) { + $code = self::getUInt2d($xls->data, $xls->pos); + + match ($code) { + self::XLS_TYPE_BOF => $xls->readBof(), + self::XLS_TYPE_SHEET => $xls->readSheet(), + self::XLS_TYPE_EOF => $xls->readDefault(), 
+ self::XLS_TYPE_CODEPAGE => $xls->readCodepage(), + default => $xls->readDefault(), + }; + + if ($code === self::XLS_TYPE_EOF) { + break; + } + } + + foreach ($xls->sheets as $sheet) { + if ($sheet['sheetType'] != 0x00) { + // 0x00: Worksheet, 0x02: Chart, 0x06: Visual Basic module + continue; + } + + $worksheetNames[] = $sheet['name']; + } + + return $worksheetNames; + } + + /** + * Return worksheet info (Name, Last Column Letter, Last Column Index, Total Rows, Total Columns). + */ + protected function listWorksheetInfo2(string $filename, Xls $xls): array + { + File::assertFile($filename); + + $worksheetInfo = []; + + // Read the OLE file + $xls->loadOLE($filename); + + // total byte size of Excel data (workbook global substream + sheet substreams) + $xls->dataSize = strlen($xls->data); + + // initialize + $xls->pos = 0; + $xls->sheets = []; + + // Parse Workbook Global Substream + while ($xls->pos < $xls->dataSize) { + $code = self::getUInt2d($xls->data, $xls->pos); + + match ($code) { + self::XLS_TYPE_BOF => $xls->readBof(), + self::XLS_TYPE_SHEET => $xls->readSheet(), + self::XLS_TYPE_EOF => $xls->readDefault(), + self::XLS_TYPE_CODEPAGE => $xls->readCodepage(), + default => $xls->readDefault(), + }; + + if ($code === self::XLS_TYPE_EOF) { + break; + } + } + + // Parse the individual sheets + foreach ($xls->sheets as $sheet) { + if ($sheet['sheetType'] != 0x00) { + // 0x00: Worksheet + // 0x02: Chart + // 0x06: Visual Basic module + continue; + } + + $tmpInfo = []; + $tmpInfo['worksheetName'] = $sheet['name']; + $tmpInfo['lastColumnLetter'] = 'A'; + $tmpInfo['lastColumnIndex'] = 0; + $tmpInfo['totalRows'] = 0; + $tmpInfo['totalColumns'] = 0; + + $xls->pos = $sheet['offset']; + + while ($xls->pos <= $xls->dataSize - 4) { + $code = self::getUInt2d($xls->data, $xls->pos); + + switch ($code) { + case self::XLS_TYPE_RK: + case self::XLS_TYPE_LABELSST: + case self::XLS_TYPE_NUMBER: + case self::XLS_TYPE_FORMULA: + case self::XLS_TYPE_BOOLERR: + case 
self::XLS_TYPE_LABEL: + $length = self::getUInt2d($xls->data, $xls->pos + 2); + $recordData = $xls->readRecordData($xls->data, $xls->pos + 4, $length); + + // move stream pointer to next record + $xls->pos += 4 + $length; + + $rowIndex = self::getUInt2d($recordData, 0) + 1; + $columnIndex = self::getUInt2d($recordData, 2); + + $tmpInfo['totalRows'] = max($tmpInfo['totalRows'], $rowIndex); + $tmpInfo['lastColumnIndex'] = max($tmpInfo['lastColumnIndex'], $columnIndex); + + break; + case self::XLS_TYPE_BOF: + $xls->readBof(); + + break; + case self::XLS_TYPE_EOF: + $xls->readDefault(); + + break 2; + default: + $xls->readDefault(); + + break; + } + } + + $tmpInfo['lastColumnLetter'] = Coordinate::stringFromColumnIndex($tmpInfo['lastColumnIndex'] + 1); + $tmpInfo['totalColumns'] = $tmpInfo['lastColumnIndex'] + 1; + + $worksheetInfo[] = $tmpInfo; + } + + return $worksheetInfo; + } +} diff --git a/src/PhpSpreadsheet/Reader/Xls/LoadSpreadsheet.php b/src/PhpSpreadsheet/Reader/Xls/LoadSpreadsheet.php new file mode 100644 index 0000000000..aeda44aa22 --- /dev/null +++ b/src/PhpSpreadsheet/Reader/Xls/LoadSpreadsheet.php @@ -0,0 +1,671 @@ +loadOLE($filename); + + // Initialisations + $xls->spreadsheet = new Spreadsheet(); + $xls->spreadsheet->setValueBinder($this->valueBinder); + $xls->spreadsheet->removeSheetByIndex(0); // remove 1st sheet + if (!$xls->readDataOnly) { + $xls->spreadsheet->removeCellStyleXfByIndex(0); // remove the default style + $xls->spreadsheet->removeCellXfByIndex(0); // remove the default style + } + + // Read the summary information stream (containing meta data) + $xls->readSummaryInformation(); + + // Read the Additional document summary information stream (containing application-specific meta data) + $xls->readDocumentSummaryInformation(); + + // total byte size of Excel data (workbook global substream + sheet substreams) + $xls->dataSize = strlen($xls->data); + + // initialize + $xls->pos = 0; + $xls->codepage = $xls->codepage ?: 
CodePage::DEFAULT_CODE_PAGE; + $xls->formats = []; + $xls->objFonts = []; + $xls->palette = []; + $xls->sheets = []; + $xls->externalBooks = []; + $xls->ref = []; + $xls->definedname = []; + $xls->sst = []; + $xls->drawingGroupData = ''; + $xls->xfIndex = 0; + $xls->mapCellXfIndex = []; + $xls->mapCellStyleXfIndex = []; + + // Parse Workbook Global Substream + while ($xls->pos < $xls->dataSize) { + $code = self::getUInt2d($xls->data, $xls->pos); + + match ($code) { + self::XLS_TYPE_BOF => $xls->readBof(), + self::XLS_TYPE_FILEPASS => $xls->readFilepass(), + self::XLS_TYPE_CODEPAGE => $xls->readCodepage(), + self::XLS_TYPE_DATEMODE => $xls->readDateMode(), + self::XLS_TYPE_FONT => $xls->readFont(), + self::XLS_TYPE_FORMAT => $xls->readFormat(), + self::XLS_TYPE_XF => $xls->readXf(), + self::XLS_TYPE_XFEXT => $xls->readXfExt(), + self::XLS_TYPE_STYLE => $xls->readStyle(), + self::XLS_TYPE_PALETTE => $xls->readPalette(), + self::XLS_TYPE_SHEET => $xls->readSheet(), + self::XLS_TYPE_EXTERNALBOOK => $xls->readExternalBook(), + self::XLS_TYPE_EXTERNNAME => $xls->readExternName(), + self::XLS_TYPE_EXTERNSHEET => $xls->readExternSheet(), + self::XLS_TYPE_DEFINEDNAME => $xls->readDefinedName(), + self::XLS_TYPE_MSODRAWINGGROUP => $xls->readMsoDrawingGroup(), + self::XLS_TYPE_SST => $xls->readSst(), + self::XLS_TYPE_EOF => $xls->readDefault(), + default => $xls->readDefault(), + }; + + if ($code === self::XLS_TYPE_EOF) { + break; + } + } + + // Resolve indexed colors for font, fill, and border colors + // Cannot be resolved already in XF record, because PALETTE record comes afterwards + if (!$xls->readDataOnly) { + foreach ($xls->objFonts as $objFont) { + if (isset($objFont->colorIndex)) { + $color = Color::map($objFont->colorIndex, $xls->palette, $xls->version); + $objFont->getColor()->setRGB($color['rgb']); + } + } + + foreach ($xls->spreadsheet->getCellXfCollection() as $objStyle) { + // fill start and end color + $fill = $objStyle->getFill(); + + if 
(isset($fill->startcolorIndex)) { + $startColor = Color::map($fill->startcolorIndex, $xls->palette, $xls->version); + $fill->getStartColor()->setRGB($startColor['rgb']); + } + if (isset($fill->endcolorIndex)) { + $endColor = Color::map($fill->endcolorIndex, $xls->palette, $xls->version); + $fill->getEndColor()->setRGB($endColor['rgb']); + } + + // border colors + $top = $objStyle->getBorders()->getTop(); + $right = $objStyle->getBorders()->getRight(); + $bottom = $objStyle->getBorders()->getBottom(); + $left = $objStyle->getBorders()->getLeft(); + $diagonal = $objStyle->getBorders()->getDiagonal(); + + if (isset($top->colorIndex)) { + $borderTopColor = Color::map($top->colorIndex, $xls->palette, $xls->version); + $top->getColor()->setRGB($borderTopColor['rgb']); + } + if (isset($right->colorIndex)) { + $borderRightColor = Color::map($right->colorIndex, $xls->palette, $xls->version); + $right->getColor()->setRGB($borderRightColor['rgb']); + } + if (isset($bottom->colorIndex)) { + $borderBottomColor = Color::map($bottom->colorIndex, $xls->palette, $xls->version); + $bottom->getColor()->setRGB($borderBottomColor['rgb']); + } + if (isset($left->colorIndex)) { + $borderLeftColor = Color::map($left->colorIndex, $xls->palette, $xls->version); + $left->getColor()->setRGB($borderLeftColor['rgb']); + } + if (isset($diagonal->colorIndex)) { + $borderDiagonalColor = Color::map($diagonal->colorIndex, $xls->palette, $xls->version); + $diagonal->getColor()->setRGB($borderDiagonalColor['rgb']); + } + } + } + + // treat MSODRAWINGGROUP records, workbook-level Escher + $escherWorkbook = null; + if (!$xls->readDataOnly && $xls->drawingGroupData) { + $escher = new SharedEscher(); + $reader = new Escher($escher); + $escherWorkbook = $reader->load($xls->drawingGroupData); + } + + // Parse the individual sheets + $xls->activeSheetSet = false; + foreach ($xls->sheets as $sheet) { + $selectedCells = ''; + if ($sheet['sheetType'] != 0x00) { + // 0x00: Worksheet, 0x02: Chart, 0x06: Visual 
Basic module + continue; + } + + // check if sheet should be skipped + if (isset($xls->loadSheetsOnly) && !in_array($sheet['name'], $xls->loadSheetsOnly)) { + continue; + } + + // add sheet to PhpSpreadsheet object + $xls->phpSheet = $xls->spreadsheet->createSheet(); + // Use false for $updateFormulaCellReferences to prevent adjustment of worksheet references in formula + // cells... during the load, all formulae should be correct, and we're simply bringing the worksheet + // name in line with the formula, not the reverse + $xls->phpSheet->setTitle($sheet['name'], false, false); + $xls->phpSheet->setSheetState($sheet['sheetState']); + + $xls->pos = $sheet['offset']; + + // Initialize isFitToPages. May change after reading SHEETPR record. + $xls->isFitToPages = false; + + // Initialize drawingData + $xls->drawingData = ''; + + // Initialize objs + $xls->objs = []; + + // Initialize shared formula parts + $xls->sharedFormulaParts = []; + + // Initialize shared formulas + $xls->sharedFormulas = []; + + // Initialize text objs + $xls->textObjects = []; + + // Initialize cell annotations + $xls->cellNotes = []; + $xls->textObjRef = -1; + + while ($xls->pos <= $xls->dataSize - 4) { + $code = self::getUInt2d($xls->data, $xls->pos); + + switch ($code) { + case self::XLS_TYPE_BOF: + $xls->readBof(); + + break; + case self::XLS_TYPE_PRINTGRIDLINES: + $xls->readPrintGridlines(); + + break; + case self::XLS_TYPE_DEFAULTROWHEIGHT: + $xls->readDefaultRowHeight(); + + break; + case self::XLS_TYPE_SHEETPR: + $xls->readSheetPr(); + + break; + case self::XLS_TYPE_HORIZONTALPAGEBREAKS: + $xls->readHorizontalPageBreaks(); + + break; + case self::XLS_TYPE_VERTICALPAGEBREAKS: + $xls->readVerticalPageBreaks(); + + break; + case self::XLS_TYPE_HEADER: + $xls->readHeader(); + + break; + case self::XLS_TYPE_FOOTER: + $xls->readFooter(); + + break; + case self::XLS_TYPE_HCENTER: + $xls->readHcenter(); + + break; + case self::XLS_TYPE_VCENTER: + $xls->readVcenter(); + + break; + case 
self::XLS_TYPE_LEFTMARGIN: + $xls->readLeftMargin(); + + break; + case self::XLS_TYPE_RIGHTMARGIN: + $xls->readRightMargin(); + + break; + case self::XLS_TYPE_TOPMARGIN: + $xls->readTopMargin(); + + break; + case self::XLS_TYPE_BOTTOMMARGIN: + $xls->readBottomMargin(); + + break; + case self::XLS_TYPE_PAGESETUP: + $xls->readPageSetup(); + + break; + case self::XLS_TYPE_PROTECT: + $xls->readProtect(); + + break; + case self::XLS_TYPE_SCENPROTECT: + $xls->readScenProtect(); + + break; + case self::XLS_TYPE_OBJECTPROTECT: + $xls->readObjectProtect(); + + break; + case self::XLS_TYPE_PASSWORD: + $xls->readPassword(); + + break; + case self::XLS_TYPE_DEFCOLWIDTH: + $xls->readDefColWidth(); + + break; + case self::XLS_TYPE_COLINFO: + $xls->readColInfo(); + + break; + case self::XLS_TYPE_DIMENSION: + $xls->readDefault(); + + break; + case self::XLS_TYPE_ROW: + $xls->readRow(); + + break; + case self::XLS_TYPE_DBCELL: + $xls->readDefault(); + + break; + case self::XLS_TYPE_RK: + $xls->readRk(); + + break; + case self::XLS_TYPE_LABELSST: + $xls->readLabelSst(); + + break; + case self::XLS_TYPE_MULRK: + $xls->readMulRk(); + + break; + case self::XLS_TYPE_NUMBER: + $xls->readNumber(); + + break; + case self::XLS_TYPE_FORMULA: + $xls->readFormula(); + + break; + case self::XLS_TYPE_SHAREDFMLA: + $xls->readSharedFmla(); + + break; + case self::XLS_TYPE_BOOLERR: + $xls->readBoolErr(); + + break; + case self::XLS_TYPE_MULBLANK: + $xls->readMulBlank(); + + break; + case self::XLS_TYPE_LABEL: + $xls->readLabel(); + + break; + case self::XLS_TYPE_BLANK: + $xls->readBlank(); + + break; + case self::XLS_TYPE_MSODRAWING: + $xls->readMsoDrawing(); + + break; + case self::XLS_TYPE_OBJ: + $xls->readObj(); + + break; + case self::XLS_TYPE_WINDOW2: + $xls->readWindow2(); + + break; + case self::XLS_TYPE_PAGELAYOUTVIEW: + $xls->readPageLayoutView(); + + break; + case self::XLS_TYPE_SCL: + $xls->readScl(); + + break; + case self::XLS_TYPE_PANE: + $xls->readPane(); + + break; + case 
self::XLS_TYPE_SELECTION: + $selectedCells = $xls->readSelection(); + + break; + case self::XLS_TYPE_MERGEDCELLS: + $xls->readMergedCells(); + + break; + case self::XLS_TYPE_HYPERLINK: + $xls->readHyperLink(); + + break; + case self::XLS_TYPE_DATAVALIDATIONS: + $xls->readDataValidations(); + + break; + case self::XLS_TYPE_DATAVALIDATION: + $xls->readDataValidation(); + + break; + case self::XLS_TYPE_CFHEADER: + $cellRangeAddresses = $xls->readCFHeader(); + + break; + case self::XLS_TYPE_CFRULE: + $xls->readCFRule($cellRangeAddresses ?? []); + + break; + case self::XLS_TYPE_SHEETLAYOUT: + $xls->readSheetLayout(); + + break; + case self::XLS_TYPE_SHEETPROTECTION: + $xls->readSheetProtection(); + + break; + case self::XLS_TYPE_RANGEPROTECTION: + $xls->readRangeProtection(); + + break; + case self::XLS_TYPE_NOTE: + $xls->readNote(); + + break; + case self::XLS_TYPE_TXO: + $xls->readTextObject(); + + break; + case self::XLS_TYPE_CONTINUE: + $xls->readContinue(); + + break; + case self::XLS_TYPE_EOF: + $xls->readDefault(); + + break 2; + default: + $xls->readDefault(); + + break; + } + } + + // treat MSODRAWING records, sheet-level Escher + if (!$xls->readDataOnly && $xls->drawingData) { + $escherWorksheet = new SharedEscher(); + $reader = new Escher($escherWorksheet); + $escherWorksheet = $reader->load($xls->drawingData); + + // get all spContainers in one long array, so they can be mapped to OBJ records + /** @var SpContainer[] $allSpContainers */ + $allSpContainers = method_exists($escherWorksheet, 'getDgContainer') ? 
$escherWorksheet->getDgContainer()->getSpgrContainer()->getAllSpContainers() : []; + } + + // treat OBJ records + foreach ($xls->objs as $n => $obj) { + // the first shape container never has a corresponding OBJ record, hence $n + 1 + if (isset($allSpContainers[$n + 1])) { + $spContainer = $allSpContainers[$n + 1]; + + // we skip all spContainers that are a part of a group shape since we cannot yet handle those + if ($spContainer->getNestingLevel() > 1) { + continue; + } + + // calculate the width and height of the shape + /** @var int $startRow */ + [$startColumn, $startRow] = Coordinate::coordinateFromString($spContainer->getStartCoordinates()); + /** @var int $endRow */ + [$endColumn, $endRow] = Coordinate::coordinateFromString($spContainer->getEndCoordinates()); + + $startOffsetX = $spContainer->getStartOffsetX(); + $startOffsetY = $spContainer->getStartOffsetY(); + $endOffsetX = $spContainer->getEndOffsetX(); + $endOffsetY = $spContainer->getEndOffsetY(); + + $width = SharedXls::getDistanceX($xls->phpSheet, $startColumn, $startOffsetX, $endColumn, $endOffsetX); + $height = SharedXls::getDistanceY($xls->phpSheet, $startRow, $startOffsetY, $endRow, $endOffsetY); + + // calculate offsetX and offsetY of the shape + $offsetX = (int) ($startOffsetX * SharedXls::sizeCol($xls->phpSheet, $startColumn) / 1024); + $offsetY = (int) ($startOffsetY * SharedXls::sizeRow($xls->phpSheet, $startRow) / 256); + + switch ($obj['otObjType']) { + case 0x19: + // Note + if (isset($xls->cellNotes[$obj['idObjID']])) { + //$cellNote = $xls->cellNotes[$obj['idObjID']]; + + if (isset($xls->textObjects[$obj['idObjID']])) { + $textObject = $xls->textObjects[$obj['idObjID']]; + $xls->cellNotes[$obj['idObjID']]['objTextData'] = $textObject; + } + } + + break; + case 0x08: + // picture + // get index to BSE entry (1-based) + $BSEindex = $spContainer->getOPT(0x0104); + + // If there is no BSE Index, we will fail here and other fields are not read. + // Fix by checking here. 
+ // TODO: Why is there no BSE Index? Is this a new Office Version? Password protected field? + // More likely: an incompatible picture + if (!$BSEindex) { + continue 2; + } + + if ($escherWorkbook) { + $BSECollection = method_exists($escherWorkbook, 'getDggContainer') ? $escherWorkbook->getDggContainer()->getBstoreContainer()->getBSECollection() : []; + $BSE = $BSECollection[$BSEindex - 1]; + $blipType = $BSE->getBlipType(); + + // need check because some blip types are not supported by Escher reader such as EMF + if ($blip = $BSE->getBlip()) { + $ih = imagecreatefromstring($blip->getData()); + if ($ih !== false) { + $drawing = new MemoryDrawing(); + $drawing->setImageResource($ih); + + // width, height, offsetX, offsetY + $drawing->setResizeProportional(false); + $drawing->setWidth($width); + $drawing->setHeight($height); + $drawing->setOffsetX($offsetX); + $drawing->setOffsetY($offsetY); + + switch ($blipType) { + case BSE::BLIPTYPE_JPEG: + $drawing->setRenderingFunction(MemoryDrawing::RENDERING_JPEG); + $drawing->setMimeType(MemoryDrawing::MIMETYPE_JPEG); + + break; + case BSE::BLIPTYPE_PNG: + imagealphablending($ih, false); + imagesavealpha($ih, true); + $drawing->setRenderingFunction(MemoryDrawing::RENDERING_PNG); + $drawing->setMimeType(MemoryDrawing::MIMETYPE_PNG); + + break; + } + + $drawing->setWorksheet($xls->phpSheet); + $drawing->setCoordinates($spContainer->getStartCoordinates()); + } + } + } + + break; + default: + // other object type + break; + } + } + } + + // treat SHAREDFMLA records + if ($xls->version == self::XLS_BIFF8) { + foreach ($xls->sharedFormulaParts as $cell => $baseCell) { + /** @var int $row */ + [$column, $row] = Coordinate::coordinateFromString($cell); + if (($xls->getReadFilter() !== null) && $xls->getReadFilter()->readCell($column, $row, $xls->phpSheet->getTitle())) { + $formula = $xls->getFormulaFromStructure($xls->sharedFormulas[$baseCell], $cell); + $xls->phpSheet->getCell($cell)->setValueExplicit('=' . 
$formula, DataType::TYPE_FORMULA); + } + } + } + + if (!empty($xls->cellNotes)) { + foreach ($xls->cellNotes as $note => $noteDetails) { + if (!isset($noteDetails['objTextData'])) { + if (isset($xls->textObjects[$note])) { + $textObject = $xls->textObjects[$note]; + $noteDetails['objTextData'] = $textObject; + } else { + $noteDetails['objTextData']['text'] = ''; + } + } + $cellAddress = str_replace('$', '', $noteDetails['cellRef']); + $xls->phpSheet->getComment($cellAddress)->setAuthor($noteDetails['author'])->setText($xls->parseRichText($noteDetails['objTextData']['text'])); + } + } + if ($selectedCells !== '') { + $xls->phpSheet->setSelectedCells($selectedCells); + } + } + if ($xls->activeSheetSet === false) { + $xls->spreadsheet->setActiveSheetIndex(0); + } + + // add the named ranges (defined names) + foreach ($xls->definedname as $definedName) { + if ($definedName['isBuiltInName']) { + switch ($definedName['name']) { + case pack('C', 0x06): + // print area + // in general, formula looks like this: Foo!$C$7:$J$66,Bar!$A$1:$IV$2 + $ranges = explode(',', $definedName['formula']); // FIXME: what if sheetname contains comma? + + $extractedRanges = []; + $sheetName = ''; + /** @var non-empty-string $range */ + foreach ($ranges as $range) { + // $range should look like one of these + // Foo!$C$7:$J$66 + // Bar!$A$1:$IV$2 + $explodes = Worksheet::extractSheetTitle($range, true); + $sheetName = trim($explodes[0], "'"); + if (!str_contains($explodes[1], ':')) { + $explodes[1] = $explodes[1] . ':' . $explodes[1]; + } + $extractedRanges[] = str_replace('$', '', $explodes[1]); // C7:J66 + } + if ($docSheet = $xls->spreadsheet->getSheetByName($sheetName)) { + $docSheet->getPageSetup()->setPrintArea(implode(',', $extractedRanges)); // C7:J66,A1:IV2 + } + + break; + case pack('C', 0x07): + // print titles (repeating rows) + // Assuming BIFF8, there are 3 cases + // 1. repeating rows + // formula looks like this: Sheet!$A$1:$IV$2 + // rows 1-2 repeat + // 2. 
repeating columns + // formula looks like this: Sheet!$A$1:$B$65536 + // columns A-B repeat + // 3. both repeating rows and repeating columns + // formula looks like this: Sheet!$A$1:$B$65536,Sheet!$A$1:$IV$2 + $ranges = explode(',', $definedName['formula']); // FIXME: what if sheetname contains comma? + foreach ($ranges as $range) { + // $range should look like one of these + // Sheet!$A$1:$B$65536 + // Sheet!$A$1:$IV$2 + if (str_contains($range, '!')) { + $explodes = Worksheet::extractSheetTitle($range, true); + if ($docSheet = $xls->spreadsheet->getSheetByName($explodes[0])) { + $extractedRange = $explodes[1]; + $extractedRange = str_replace('$', '', $extractedRange); + + $coordinateStrings = explode(':', $extractedRange); + if (count($coordinateStrings) == 2) { + [$firstColumn, $firstRow] = Coordinate::coordinateFromString($coordinateStrings[0]); + [$lastColumn, $lastRow] = Coordinate::coordinateFromString($coordinateStrings[1]); + + if ($firstColumn == 'A' && $lastColumn == 'IV') { + // then we have repeating rows + $docSheet->getPageSetup()->setRowsToRepeatAtTop([$firstRow, $lastRow]); + } elseif ($firstRow == 1 && $lastRow == 65536) { + // then we have repeating columns + $docSheet->getPageSetup()->setColumnsToRepeatAtLeft([$firstColumn, $lastColumn]); + } + } + } + } + } + + break; + } + } else { + // Extract range + /** @var non-empty-string $formula */ + $formula = $definedName['formula']; + if (str_contains($formula, '!')) { + $explodes = Worksheet::extractSheetTitle($formula, true); + if ( + ($docSheet = $xls->spreadsheet->getSheetByName($explodes[0])) + || ($docSheet = $xls->spreadsheet->getSheetByName(trim($explodes[0], "'"))) + ) { + $extractedRange = $explodes[1]; + + $localOnly = ($definedName['scope'] === 0) ? false : true; + + $scope = ($definedName['scope'] === 0) ? 
null : $xls->spreadsheet->getSheetByName($xls->sheets[$definedName['scope'] - 1]['name']); + + $xls->spreadsheet->addNamedRange(new NamedRange((string) $definedName['name'], $docSheet, $extractedRange, $localOnly, $scope)); + } + } + // Named Value + // TODO Provide support for named values + } + } + $xls->data = ''; + + return $xls->spreadsheet; + } +} diff --git a/src/PhpSpreadsheet/Reader/Xls/Mappings.php b/src/PhpSpreadsheet/Reader/Xls/Mappings.php new file mode 100644 index 0000000000..7a6be8e86a --- /dev/null +++ b/src/PhpSpreadsheet/Reader/Xls/Mappings.php @@ -0,0 +1,271 @@ + ['ISNA', 1], + 3 => ['ISERROR', 1], + 10 => ['NA', 0], + 15 => ['SIN', 1], + 16 => ['COS', 1], + 17 => ['TAN', 1], + 18 => ['ATAN', 1], + 19 => ['PI', 0], + 20 => ['SQRT', 1], + 21 => ['EXP', 1], + 22 => ['LN', 1], + 23 => ['LOG10', 1], + 24 => ['ABS', 1], + 25 => ['INT', 1], + 26 => ['SIGN', 1], + 27 => ['ROUND', 2], + 30 => ['REPT', 2], + 31 => ['MID', 3], + 32 => ['LEN', 1], + 33 => ['VALUE', 1], + 34 => ['TRUE', 0], + 35 => ['FALSE', 0], + 38 => ['NOT', 1], + 39 => ['MOD', 2], + 40 => ['DCOUNT', 3], + 41 => ['DSUM', 3], + 42 => ['DAVERAGE', 3], + 43 => ['DMIN', 3], + 44 => ['DMAX', 3], + 45 => ['DSTDEV', 3], + 48 => ['TEXT', 2], + 61 => ['MIRR', 3], + 63 => ['RAND', 0], + 65 => ['DATE', 3], + 66 => ['TIME', 3], + 67 => ['DAY', 1], + 68 => ['MONTH', 1], + 69 => ['YEAR', 1], + 71 => ['HOUR', 1], + 72 => ['MINUTE', 1], + 73 => ['SECOND', 1], + 74 => ['NOW', 0], + 75 => ['AREAS', 1], + 76 => ['ROWS', 1], + 77 => ['COLUMNS', 1], + 83 => ['TRANSPOSE', 1], + 86 => ['TYPE', 1], + 97 => ['ATAN2', 2], + 98 => ['ASIN', 1], + 99 => ['ACOS', 1], + 105 => ['ISREF', 1], + 111 => ['CHAR', 1], + 112 => ['LOWER', 1], + 113 => ['UPPER', 1], + 114 => ['PROPER', 1], + 117 => ['EXACT', 2], + 118 => ['TRIM', 1], + 119 => ['REPLACE', 4], + 121 => ['CODE', 1], + 126 => ['ISERR', 1], + 127 => ['ISTEXT', 1], + 128 => ['ISNUMBER', 1], + 129 => ['ISBLANK', 1], + 130 => ['T', 1], + 131 => ['N', 1], + 140 => 
['DATEVALUE', 1], + 141 => ['TIMEVALUE', 1], + 142 => ['SLN', 3], + 143 => ['SYD', 4], + 162 => ['CLEAN', 1], + 163 => ['MDETERM', 1], + 164 => ['MINVERSE', 1], + 165 => ['MMULT', 2], + 184 => ['FACT', 1], + 189 => ['DPRODUCT', 3], + 190 => ['ISNONTEXT', 1], + 195 => ['DSTDEVP', 3], + 196 => ['DVARP', 3], + 198 => ['ISLOGICAL', 1], + 199 => ['DCOUNTA', 3], + 207 => ['REPLACEB', 4], + 210 => ['MIDB', 3], + 211 => ['LENB', 1], + 212 => ['ROUNDUP', 2], + 213 => ['ROUNDDOWN', 2], + 214 => ['ASC', 1], + 215 => ['DBCS', 1], + 221 => ['TODAY', 0], + 229 => ['SINH', 1], + 230 => ['COSH', 1], + 231 => ['TANH', 1], + 232 => ['ASINH', 1], + 233 => ['ACOSH', 1], + 234 => ['ATANH', 1], + 235 => ['DGET', 3], + 244 => ['INFO', 1], + 252 => ['FREQUENCY', 2], + 261 => ['ERROR.TYPE', 1], + 271 => ['GAMMALN', 1], + 273 => ['BINOMDIST', 4], + 274 => ['CHIDIST', 2], + 275 => ['CHIINV', 2], + 276 => ['COMBIN', 2], + 277 => ['CONFIDENCE', 3], + 278 => ['CRITBINOM', 3], + 279 => ['EVEN', 1], + 280 => ['EXPONDIST', 3], + 281 => ['FDIST', 3], + 282 => ['FINV', 3], + 283 => ['FISHER', 1], + 284 => ['FISHERINV', 1], + 285 => ['FLOOR', 2], + 286 => ['GAMMADIST', 4], + 287 => ['GAMMAINV', 3], + 288 => ['CEILING', 2], + 289 => ['HYPGEOMDIST', 4], + 290 => ['LOGNORMDIST', 3], + 291 => ['LOGINV', 3], + 292 => ['NEGBINOMDIST', 3], + 293 => ['NORMDIST', 4], + 294 => ['NORMSDIST', 1], + 295 => ['NORMINV', 3], + 296 => ['NORMSINV', 1], + 297 => ['STANDARDIZE', 3], + 298 => ['ODD', 1], + 299 => ['PERMUT', 2], + 300 => ['POISSON', 3], + 301 => ['TDIST', 3], + 302 => ['WEIBULL', 4], + 303 => ['SUMXMY2', 2], + 304 => ['SUMX2MY2', 2], + 305 => ['SUMX2PY2', 2], + 306 => ['CHITEST', 2], + 307 => ['CORREL', 2], + 308 => ['COVAR', 2], + 309 => ['FORECAST', 3], + 310 => ['FTEST', 2], + 311 => ['INTERCEPT', 2], + 312 => ['PEARSON', 2], + 313 => ['RSQ', 2], + 314 => ['STEYX', 2], + 315 => ['SLOPE', 2], + 316 => ['TTEST', 4], + 325 => ['LARGE', 2], + 326 => ['SMALL', 2], + 327 => ['QUARTILE', 2], + 328 => 
['PERCENTILE', 2], + 331 => ['TRIMMEAN', 2], + 332 => ['TINV', 2], + 337 => ['POWER', 2], + 342 => ['RADIANS', 1], + 343 => ['DEGREES', 1], + 346 => ['COUNTIF', 2], + 347 => ['COUNTBLANK', 1], + 350 => ['ISPMT', 4], + 351 => ['DATEDIF', 3], + 352 => ['DATESTRING', 1], + 353 => ['NUMBERSTRING', 2], + 360 => ['PHONETIC', 1], + 368 => ['BAHTTEXT', 1], + ]; + + /** + * Map tFuncV values (functions with variable number of arguments). + * Key is tFuncV value. + * Value is Excel function name. + */ + const TFUNCV_MAPPINGS = [ + 0 => 'COUNT', + 1 => 'IF', + 4 => 'SUM', + 5 => 'AVERAGE', + 6 => 'MIN', + 7 => 'MAX', + 8 => 'ROW', + 9 => 'COLUMN', + 11 => 'NPV', + 12 => 'STDEV', + 13 => 'DOLLAR', + 14 => 'FIXED', + 28 => 'LOOKUP', + 29 => 'INDEX', + 36 => 'AND', + 37 => 'OR', + 46 => 'VAR', + 49 => 'LINEST', + 50 => 'TREND', + 51 => 'LOGEST', + 52 => 'GROWTH', + 56 => 'PV', + 57 => 'FV', + 58 => 'NPER', + 59 => 'PMT', + 60 => 'RATE', + 62 => 'IRR', + 64 => 'MATCH', + 70 => 'WEEKDAY', + 78 => 'OFFSET', + 82 => 'SEARCH', + 100 => 'CHOOSE', + 101 => 'HLOOKUP', + 102 => 'VLOOKUP', + 109 => 'LOG', + 115 => 'LEFT', + 116 => 'RIGHT', + 120 => 'SUBSTITUTE', + 124 => 'FIND', + 125 => 'CELL', + 144 => 'DDB', + 148 => 'INDIRECT', + 167 => 'IPMT', + 168 => 'PPMT', + 169 => 'COUNTA', + 183 => 'PRODUCT', + 193 => 'STDEVP', + 194 => 'VARP', + 197 => 'TRUNC', + 204 => 'USDOLLAR', + 205 => 'FINDB', + 206 => 'SEARCHB', + 208 => 'LEFTB', + 209 => 'RIGHTB', + 216 => 'RANK', + 219 => 'ADDRESS', + 220 => 'DAYS360', + 222 => 'VDB', + 227 => 'MEDIAN', + 228 => 'SUMPRODUCT', + 247 => 'DB', + 255 => '', + 269 => 'AVEDEV', + 270 => 'BETADIST', + 272 => 'BETAINV', + 317 => 'PROB', + 318 => 'DEVSQ', + 319 => 'GEOMEAN', + 320 => 'HARMEAN', + 321 => 'SUMSQ', + 322 => 'KURT', + 323 => 'SKEW', + 324 => 'ZTEST', + 329 => 'PERCENTRANK', + 330 => 'MODE', + 336 => 'CONCATENATE', + 344 => 'SUBTOTAL', + 345 => 'SUMIF', + 354 => 'ROMAN', + 358 => 'GETPIVOTDATA', + 359 => 'HYPERLINK', + 361 => 'AVERAGEA', + 362 => 
'MAXA', + 363 => 'MINA', + 364 => 'STDEVPA', + 365 => 'VARPA', + 366 => 'STDEVA', + 367 => 'VARA', + ]; +} diff --git a/src/PhpSpreadsheet/Reader/XlsBase.php b/src/PhpSpreadsheet/Reader/XlsBase.php new file mode 100644 index 0000000000..e6969db04d --- /dev/null +++ b/src/PhpSpreadsheet/Reader/XlsBase.php @@ -0,0 +1,397 @@ + 0x00, + Border::BORDER_THIN, // => 0x01, + Border::BORDER_MEDIUM, // => 0x02, + Border::BORDER_DASHED, // => 0x03, + Border::BORDER_DOTTED, // => 0x04, + Border::BORDER_THICK, // => 0x05, + Border::BORDER_DOUBLE, // => 0x06, + Border::BORDER_HAIR, // => 0x07, + Border::BORDER_MEDIUMDASHED, // => 0x08, + Border::BORDER_DASHDOT, // => 0x09, + Border::BORDER_MEDIUMDASHDOT, // => 0x0A, + Border::BORDER_DASHDOTDOT, // => 0x0B, + Border::BORDER_MEDIUMDASHDOTDOT, // => 0x0C, + Border::BORDER_SLANTDASHDOT, // => 0x0D, + Border::BORDER_OMIT, // => 0x0E, + Border::BORDER_OMIT, // => 0x0F, + ]; + + /** + * Codepage set in the Excel file being read. Only important for BIFF5 (Excel 5.0 - Excel 95) + * For BIFF8 (Excel 97 - Excel 2003) this will always have the value 'UTF-16LE'. + */ + protected string $codepage = ''; + + public function setCodepage(string $codepage): void + { + if (CodePage::validate($codepage) === false) { + throw new PhpSpreadsheetException('Unknown codepage: ' . $codepage); + } + + $this->codepage = $codepage; + } + + public function getCodepage(): string + { + return $this->codepage; + } + + /** + * Can the current IReader read the file? + */ + public function canRead(string $filename): bool + { + if (File::testFileNoThrow($filename) === false) { + return false; + } + + try { + // Use ParseXL for the hard work. + $ole = new OLERead(); + + // get excel data + $ole->read($filename); + if ($ole->wrkbook === null) { + throw new Exception('The filename ' . $filename . 
' is not recognised as a Spreadsheet file'); + } + + return true; + } catch (PhpSpreadsheetException) { + return false; + } + } + + /** + * Extract RGB color + * OpenOffice.org's Documentation of the Microsoft Excel File Format, section 2.5.4. + * + * @param string $rgb Encoded RGB value (4 bytes) + */ + protected static function readRGB(string $rgb): array + { + // offset: 0; size 1; Red component + $r = ord($rgb[0]); + + // offset: 1; size: 1; Green component + $g = ord($rgb[1]); + + // offset: 2; size: 1; Blue component + $b = ord($rgb[2]); + + // HEX notation, e.g. 'FF00FC' + $rgb = sprintf('%02X%02X%02X', $r, $g, $b); + + return ['rgb' => $rgb]; + } + + /** + * Extracts an Excel Unicode short string (8-bit string length) + * OpenOffice documentation: 2.5.3 + * function will automatically find out where the Unicode string ends. + */ + protected static function readUnicodeStringShort(string $subData): array + { + // offset: 0: size: 1; length of the string (character count) + $characterCount = ord($subData[0]); + + $string = self::readUnicodeString(substr($subData, 1), $characterCount); + + // add 1 for the string length + ++$string['size']; + + return $string; + } + + /** + * Extracts an Excel Unicode long string (16-bit string length) + * OpenOffice documentation: 2.5.3 + * this function is under construction, needs to support rich text, and Asian phonetic settings. 
+ */ + protected static function readUnicodeStringLong(string $subData): array + { + // offset: 0: size: 2; length of the string (character count) + $characterCount = self::getUInt2d($subData, 0); + + $string = self::readUnicodeString(substr($subData, 2), $characterCount); + + // add 2 for the string length + $string['size'] += 2; + + return $string; + } + + /** + * Read Unicode string with no string length field, but with known character count + * this function is under construction, needs to support rich text, and Asian phonetic settings + * OpenOffice.org's Documentation of the Microsoft Excel File Format, section 2.5.3. + */ + protected static function readUnicodeString(string $subData, int $characterCount): array + { + // offset: 0: size: 1; option flags + // bit: 0; mask: 0x01; character compression (0 = compressed 8-bit, 1 = uncompressed 16-bit) + $isCompressed = !((0x01 & ord($subData[0])) >> 0); + + // bit: 2; mask: 0x04; Asian phonetic settings + //$hasAsian = (0x04) & ord($subData[0]) >> 2; + + // bit: 3; mask: 0x08; Rich-Text settings + //$hasRichText = (0x08) & ord($subData[0]) >> 3; + + // offset: 1: size: var; character array + // this offset assumes richtext and Asian phonetic settings are off which is generally wrong + // needs to be fixed + $value = self::encodeUTF16(substr($subData, 1, $isCompressed ? $characterCount : 2 * $characterCount), $isCompressed); + + return [ + 'value' => $value, + 'size' => $isCompressed ? 1 + $characterCount : 1 + 2 * $characterCount, // the size in bytes including the option flags + ]; + } + + /** + * Convert UTF-8 string to string surrounded by double quotes. Used for explicit string tokens in formulas. + * Example: hello"world --> "hello""world". + * + * @param string $value UTF-8 encoded string + */ + protected static function UTF8toExcelDoubleQuoted(string $value): string + { + return '"' . str_replace('"', '""', $value) . '"'; + } + + /** + * Reads the first 8 bytes of a string and returns an IEEE 754 float. 
+ * + * @param string $data Binary string that is at least 8 bytes long + */ + protected static function extractNumber(string $data): int|float + { + $rknumhigh = self::getInt4d($data, 4); + $rknumlow = self::getInt4d($data, 0); + $sign = ($rknumhigh & self::HIGH_ORDER_BIT) >> 31; + $exp = (($rknumhigh & 0x7FF00000) >> 20) - 1023; + $mantissa = (0x100000 | ($rknumhigh & 0x000FFFFF)); + $mantissalow1 = ($rknumlow & self::HIGH_ORDER_BIT) >> 31; + $mantissalow2 = ($rknumlow & 0x7FFFFFFF); + $value = $mantissa / 2 ** (20 - $exp); + + if ($mantissalow1 != 0) { + $value += 1 / 2 ** (21 - $exp); + } + + if ($mantissalow2 != 0) { + $value += $mantissalow2 / 2 ** (52 - $exp); + } + if ($sign) { + $value *= -1; + } + + return $value; + } + + protected static function getIEEE754(int $rknum): float|int + { + if (($rknum & 0x02) != 0) { + $value = $rknum >> 2; + } else { + // changes by mmp, info on IEEE754 encoding from + // research.microsoft.com/~hollasch/cgindex/coding/ieeefloat.html + // The RK format calls for using only the most significant 30 bits + // of the 64 bit floating point value. The other 34 bits are assumed + // to be 0 so we use the upper 30 bits of $rknum as follows... + $sign = ($rknum & self::HIGH_ORDER_BIT) >> 31; + $exp = ($rknum & 0x7FF00000) >> 20; + $mantissa = (0x100000 | ($rknum & 0x000FFFFC)); + $value = $mantissa / 2 ** (20 - ($exp - 1023)); + if ($sign) { + $value = -1 * $value; + } + //end of changes by mmp + } + if (($rknum & 0x01) != 0) { + $value /= 100; + } + + return $value; + } + + /** + * Get UTF-8 string from (compressed or uncompressed) UTF-16 string. + */ + protected static function encodeUTF16(string $string, bool $compressed = false): string + { + if ($compressed) { + $string = self::uncompressByteString($string); + } + + return StringHelper::convertEncoding($string, 'UTF-8', 'UTF-16LE'); + } + + /** + * Convert UTF-16 string in compressed notation to uncompressed form. Only used for BIFF8. 
+ */ + protected static function uncompressByteString(string $string): string + { + $uncompressedString = ''; + $strLen = strlen($string); + for ($i = 0; $i < $strLen; ++$i) { + $uncompressedString .= $string[$i] . "\0"; + } + + return $uncompressedString; + } + + /** + * Convert string to UTF-8. Only used for BIFF5. + */ + protected function decodeCodepage(string $string): string + { + return StringHelper::convertEncoding($string, 'UTF-8', $this->codepage); + } + + /** + * Read 16-bit unsigned integer. + */ + public static function getUInt2d(string $data, int $pos): int + { + return ord($data[$pos]) | (ord($data[$pos + 1]) << 8); + } + + /** + * Read 16-bit signed integer. + */ + public static function getInt2d(string $data, int $pos): int + { + return unpack('s', $data[$pos] . $data[$pos + 1])[1]; // @phpstan-ignore-line + } + + /** + * Read 32-bit signed integer. + */ + public static function getInt4d(string $data, int $pos): int + { + // FIX: represent numbers correctly on 64-bit system + // http://sourceforge.net/tracker/index.php?func=detail&aid=1487372&group_id=99160&atid=623334 + // Changed by Andreas Rehm 2006 to ensure correct result of the <<24 block on 32 and 64bit systems + $_or_24 = ord($data[$pos + 3]); + if ($_or_24 >= 128) { + // negative number + $_ord_24 = -abs((256 - $_or_24) << 24); + } else { + $_ord_24 = ($_or_24 & 127) << 24; + } + + return ord($data[$pos]) | (ord($data[$pos + 1]) << 8) | (ord($data[$pos + 2]) << 16) | $_ord_24; + } +} diff --git a/src/PhpSpreadsheet/Reader/Xlsx.php b/src/PhpSpreadsheet/Reader/Xlsx.php index 3ada95ee40..d86102c3c9 100644 --- a/src/PhpSpreadsheet/Reader/Xlsx.php +++ b/src/PhpSpreadsheet/Reader/Xlsx.php @@ -395,6 +395,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet // Initialisations $excel = new Spreadsheet(); + $excel->setValueBinder($this->valueBinder); $excel->removeSheetByIndex(0); $addingFirstCellStyleXf = true; $addingFirstCellXf = true; @@ -914,6 +915,7 @@ protected 
function loadSpreadsheetFromFile(string $filename): Spreadsheet $value = self::castToString($c); if (is_numeric($value)) { $value += 0; + $cellDataType = DataType::TYPE_NUMERIC; } } else { // Formula diff --git a/src/PhpSpreadsheet/Reader/Xlsx/Chart.php b/src/PhpSpreadsheet/Reader/Xlsx/Chart.php index 94d9c6af50..146944150f 100644 --- a/src/PhpSpreadsheet/Reader/Xlsx/Chart.php +++ b/src/PhpSpreadsheet/Reader/Xlsx/Chart.php @@ -95,6 +95,7 @@ public function readChart(SimpleXMLElement $chartElements, string $chartName): \ $gapWidth = null; $useUpBars = null; $useDownBars = null; + $noBorder = false; foreach ($chartElementsC as $chartElementKey => $chartElement) { switch ($chartElementKey) { case 'spPr': @@ -108,6 +109,9 @@ public function readChart(SimpleXMLElement $chartElements, string $chartName): \ if (isset($children->ln)) { $chartBorderLines = new GridLines(); $this->readLineStyle($chartElementsC, $chartBorderLines); + if (isset($children->ln->noFill)) { + $noBorder = true; + } } break; @@ -470,6 +474,7 @@ public function readChart(SimpleXMLElement $chartElements, string $chartName): \ if ($chartBorderLines !== null) { $chart->setBorderLines($chartBorderLines); } + $chart->setNoBorder($noBorder); $chart->setRoundedCorners($roundedCorners); if (is_bool($autoTitleDeleted)) { $chart->setAutoTitleDeleted($autoTitleDeleted); diff --git a/src/PhpSpreadsheet/Reader/Xml.php b/src/PhpSpreadsheet/Reader/Xml.php index e0f2187476..ab7c0f7c80 100644 --- a/src/PhpSpreadsheet/Reader/Xml.php +++ b/src/PhpSpreadsheet/Reader/Xml.php @@ -44,6 +44,18 @@ public function __construct() { parent::__construct(); $this->securityScanner = XmlScanner::getInstance($this); + /** @var callable */ + $unentity = [self::class, 'unentity']; + $this->securityScanner->setAdditionalCallback($unentity); + } + + public static function unentity(string $contents): string + { + $contents = preg_replace('/&(amp|lt|gt|quot|apos);/', "\u{fffe}\u{feff}\$1;", trim($contents)) ?? 
$contents;
+        $contents = html_entity_decode($contents, ENT_NOQUOTES | ENT_SUBSTITUTE | ENT_HTML401, 'UTF-8');
+        $contents = str_replace("\u{fffe}\u{feff}", '&', $contents);
+
+        return $contents;
     }
 
     private string $fileContents = '';
@@ -242,6 +254,7 @@ public function loadSpreadsheetFromString(string $contents): Spreadsheet
     {
         // Create new Spreadsheet
         $spreadsheet = new Spreadsheet();
+        $spreadsheet->setValueBinder($this->valueBinder);
         $spreadsheet->removeSheetByIndex(0);
 
         // Load into this instance
@@ -255,6 +268,7 @@ protected function loadSpreadsheetFromFile(string $filename): Spreadsheet
     {
         // Create new Spreadsheet
         $spreadsheet = new Spreadsheet();
+        $spreadsheet->setValueBinder($this->valueBinder);
         $spreadsheet->removeSheetByIndex(0);
 
         // Load into this instance
@@ -512,9 +526,6 @@ public function loadIntoExisting(string $filename, Spreadsheet $spreadsheet, boo
                         if (isset($cell_ss['StyleID'])) {
                             $style = (string) $cell_ss['StyleID'];
                             if ((isset($this->styles[$style])) && (!empty($this->styles[$style]))) {
-                                //if (!$spreadsheet->getActiveSheet()->cellExists($columnID . $rowID)) {
-                                //    $spreadsheet->getActiveSheet()->getCell($columnID . $rowID)->setValue(null);
-                                //}
                                 $spreadsheet->getActiveSheet()->getStyle($cellRange)
                                     ->applyFromArray($this->styles[$style]);
                             }
diff --git a/src/PhpSpreadsheet/Spreadsheet.php b/src/PhpSpreadsheet/Spreadsheet.php
index c0a3a9bc21..6ec75c33cc 100644
--- a/src/PhpSpreadsheet/Spreadsheet.php
+++ b/src/PhpSpreadsheet/Spreadsheet.php
@@ -4,6 +4,7 @@
 
 use JsonSerializable;
 use PhpOffice\PhpSpreadsheet\Calculation\Calculation;
+use PhpOffice\PhpSpreadsheet\Cell\IValueBinder;
 use PhpOffice\PhpSpreadsheet\Document\Properties;
 use PhpOffice\PhpSpreadsheet\Document\Security;
 use PhpOffice\PhpSpreadsheet\Reader\Xlsx as XlsxReader;
@@ -173,6 +174,8 @@ class Spreadsheet implements JsonSerializable
 
     private Theme $theme;
 
+    private ?IValueBinder $valueBinder = null;
+
     public function getTheme(): Theme
     {
         return $this->theme;
@@ -1591,4 +1594,16 @@ public function getLegacyDrawing(Worksheet $worksheet): ?string
     {
         return $this->unparsedLoadedData['sheets'][$worksheet->getCodeName()]['legacyDrawing'] ?? null;
     }
+
+    public function getValueBinder(): ?IValueBinder
+    {
+        return $this->valueBinder;
+    }
+
+    public function setValueBinder(?IValueBinder $valueBinder): self
+    {
+        $this->valueBinder = $valueBinder;
+
+        return $this;
+    }
 }
diff --git a/src/PhpSpreadsheet/Writer/Xlsx/Chart.php b/src/PhpSpreadsheet/Writer/Xlsx/Chart.php
index ca16557a88..3bbcb3a883 100644
--- a/src/PhpSpreadsheet/Writer/Xlsx/Chart.php
+++ b/src/PhpSpreadsheet/Writer/Xlsx/Chart.php
@@ -118,7 +118,7 @@ public function writeChart(\PhpOffice\PhpSpreadsheet\Chart\Chart $chart, bool $c
             $this->writeColor($objWriter, $fillColor);
         }
         $borderLines = $chart->getBorderLines();
-        $this->writeLineStyles($objWriter, $borderLines);
+        $this->writeLineStyles($objWriter, $borderLines, $chart->getNoBorder());
         $this->writeEffects($objWriter, $borderLines);
         $objWriter->endElement(); // c:spPr
 
diff --git a/tests/PhpSpreadsheetTests/Chart/Issue562Test.php b/tests/PhpSpreadsheetTests/Chart/Issue562Test.php
new file mode 100644
index 0000000000..21bb903fa6
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Chart/Issue562Test.php
@@ -0,0 +1,128 @@
+setIncludeCharts(true);
+    }
+
+    public function writeCharts(XlsxWriter $writer): void
+    {
+        $writer->setIncludeCharts(true);
+    }
+
+    /**
+     * @dataProvider providerNoBorder
+     */
+    public function testNoBorder(?bool $noBorder, bool $expectedResult): void
+    {
+        $spreadsheet = new Spreadsheet();
+        $worksheet = $spreadsheet->getActiveSheet();
+        $worksheet->fromArray(
+            [
+                ['', 2010, 2011, 2012],
+                ['Q1', 12, 15, 21],
+                ['Q2', 56, 73, 86],
+                ['Q3', 52, 61, 69],
+                ['Q4', 30, 32, 0],
+            ]
+        );
+
+        $dataSeriesLabels = [
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_STRING, 'Worksheet!$B$1', null, 1), // 2010
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_STRING, 'Worksheet!$C$1', null, 1), // 2011
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_STRING, 'Worksheet!$D$1', null, 1), // 2012
+        ];
+
+        $xAxisTickValues = [
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_STRING, 'Worksheet!$A$2:$A$5', null, 4), // Q1 to Q4
+        ];
+
+        $dataSeriesValues = [
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_NUMBER, 'Worksheet!$B$2:$B$5', null, 4),
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_NUMBER, 'Worksheet!$C$2:$C$5', null, 4),
+            new DataSeriesValues(DataSeriesValues::DATASERIES_TYPE_NUMBER, 'Worksheet!$D$2:$D$5', null, 4),
+        ];
+
+        // Build the dataseries
+        $series = new DataSeries(
+            DataSeries::TYPE_AREACHART, // plotType
+            DataSeries::GROUPING_PERCENT_STACKED, // plotGrouping
+            range(0, count($dataSeriesValues) - 1), // plotOrder
+            $dataSeriesLabels, // plotLabel
+            $xAxisTickValues, // plotCategory
+            $dataSeriesValues // plotValues
+        );
+
+        $plotArea = new PlotArea(null, [$series]);
+        $legend = new ChartLegend(ChartLegend::POSITION_TOPRIGHT, null, false);
+
+        $title = new Title('Test %age-Stacked Area Chart');
+        $yAxisLabel = new Title('Value ($k)');
+
+        $chart = new Chart(
+            'chart1', // name
+            $title, // title
+            $legend, // legend
+            $plotArea, // plotArea
+            true, // plotVisibleOnly
+            DataSeries::EMPTY_AS_GAP, // displayBlanksAs
+            null, // xAxisLabel
+            $yAxisLabel // yAxisLabel
+        );
+
+        // Set the position where the chart should appear in the worksheet
+        $chart->setTopLeftPosition('A7');
+        $chart->setBottomRightPosition('H20');
+
+        if ($noBorder !== null) {
+            $chart->setNoBorder($noBorder);
+        }
+
+        // Add the chart to the worksheet
+        $worksheet->addChart($chart);
+
+        /** @var callable */
+        $callableReader = [$this, 'readCharts'];
+        /** @var callable */
+        $callableWriter = [$this, 'writeCharts'];
+        $reloadedSpreadsheet = $this->writeAndReload($spreadsheet, 'Xlsx', $callableReader, $callableWriter);
+        $spreadsheet->disconnectWorksheets();
+
+        $sheet = $reloadedSpreadsheet->getActiveSheet();
+        $charts2 = $sheet->getChartCollection();
+        self::assertCount(1, $charts2);
+        $chart2 = $charts2[0];
+        self::assertNotNull($chart2);
+        self::assertSame($expectedResult, $chart2->getNoBorder());
+
+        $reloadedSpreadsheet->disconnectWorksheets();
+    }
+
+    public static function providerNoBorder(): array
+    {
+        return [
+            [true, true],
+            [false, false],
+            [null, false],
+        ];
+    }
+}
diff --git a/tests/PhpSpreadsheetTests/Reader/Csv/BinderTest.php b/tests/PhpSpreadsheetTests/Reader/Csv/BinderTest.php
new file mode 100644
index 0000000000..95d8d5cb98
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Reader/Csv/BinderTest.php
@@ -0,0 +1,54 @@
+loadSpreadsheetFromString($data);
+        $sheet1 = $spreadsheet1->getActiveSheet();
+        $sheet1->getCell('A3')->setValueExplicit(7, DataType::TYPE_STRING);
+        $sheet1->getCell('B3')->setValueExplicit(8, DataType::TYPE_NUMERIC);
+        $sheet1->setCellValue('C3', 9);
+        $sheet1->fromArray([10, 11, 12], null, 'A4');
+        $expected1 = [
+            [1, 2, 3],
+            [4, 5, 6],
+            ['7', 8, 9],
+            [10, 11, 12],
+        ];
+        self::AssertSame($expected1, $sheet1->toArray(null, false, false));
+
+        $reader2 = new Csv();
+        $reader2->setValueBinder(new StringValueBinder());
+        $spreadsheet2 = $reader2->loadSpreadsheetFromString($data);
+        $sheet2 = $spreadsheet2->getActiveSheet();
+        $sheet2->getCell('A3')->setValueExplicit(7, DataType::TYPE_STRING);
+        $sheet2->getCell('B3')->setValueExplicit(8, DataType::TYPE_NUMERIC);
+        $sheet2->setCellValue('C3', 9);
+        $sheet2->fromArray([10, 11, 12], null, 'A4');
+        $expected2 = [
+            ['1', '2', '3'],
+            ['4', '5', '6'],
+            ['7', 8, '9'],
+            ['10', '11', '12'],
+        ];
+        self::AssertSame($expected2, $sheet2->toArray(null, false, false));
+
+        $spreadsheet1->disconnectWorksheets();
+        $spreadsheet2->disconnectWorksheets();
+    }
+}
diff --git a/tests/PhpSpreadsheetTests/Reader/Html/BinderTest.php b/tests/PhpSpreadsheetTests/Reader/Html/BinderTest.php
new file mode 100644
index 0000000000..cbf1c4fdcd
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Reader/Html/BinderTest.php
@@ -0,0 +1,58 @@
+
+
+            123
+            456
+
+
+            EOF;
+        $reader1 = new Html();
+        $spreadsheet1 = $reader1->loadFromString($data);
+        $sheet1 = $spreadsheet1->getActiveSheet();
+        $sheet1->getCell('A3')->setValueExplicit(7, DataType::TYPE_STRING);
+        $sheet1->getCell('B3')->setValueExplicit(8, DataType::TYPE_NUMERIC);
+        $sheet1->setCellValue('C3', 9);
+        $sheet1->fromArray([10, 11, 12], null, 'A4');
+        $expected1 = [
+            [1, 2, 3],
+            [4, 5, 6],
+            ['7', 8, 9],
+            [10, 11, 12],
+        ];
+        self::AssertSame($expected1, $sheet1->toArray(null, false, false));
+
+        $reader2 = new Html();
+        $reader2->setValueBinder(new StringValueBinder());
+        $spreadsheet2 = $reader2->loadFromString($data);
+        $sheet2 = $spreadsheet2->getActiveSheet();
+        $sheet2->getCell('A3')->setValueExplicit(7, DataType::TYPE_STRING);
+        $sheet2->getCell('B3')->setValueExplicit(8, DataType::TYPE_NUMERIC);
+        $sheet2->setCellValue('C3', 9);
+        $sheet2->fromArray([10, 11, 12], null, 'A4');
+        $expected2 = [
+            ['1', '2', '3'],
+            ['4', '5', '6'],
+            ['7', 8, '9'],
+            ['10', '11', '12'],
+        ];
+        self::AssertSame($expected2, $sheet2->toArray(null, false, false));
+
+        $spreadsheet1->disconnectWorksheets();
+        $spreadsheet2->disconnectWorksheets();
+    }
+}
diff --git a/tests/PhpSpreadsheetTests/Reader/Html/Issue1107Test.php b/tests/PhpSpreadsheetTests/Reader/Html/Issue1107Test.php
new file mode 100644
index 0000000000..8668500839
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Reader/Html/Issue1107Test.php
@@ -0,0 +1,46 @@
+outfile !== '') {
+            unlink($this->outfile);
+            $this->outfile = '';
+        }
+    }
+
+    public function testIssue1107(): void
+    {
+        // failure due to cached file size
+        $outstr = str_repeat('a', 1023) . "\n";
+        $allout = str_repeat($outstr, 10);
+        $this->outfile = $outfile = File::temporaryFilename();
+        file_put_contents($outfile, $allout);
+        self::assertSame(10240, filesize($outfile));
+        $spreadsheet = new Spreadsheet();
+        $sheet = $spreadsheet->getActiveSheet();
+        $sheet->getCell('A1')->setValue(1);
+        $writer = new HtmlWriter($spreadsheet);
+        $writer->save($outfile);
+        $spreadsheet->disconnectWorksheets();
+        $reader = new HtmlReader();
+        $spreadsheet2 = $reader->load($outfile);
+        $sheet2 = $spreadsheet2->getActiveSheet();
+        self::assertSame(1, $sheet2->getCell('A1')->getValue());
+
+        $spreadsheet2->disconnectWorksheets();
+    }
+}
diff --git a/tests/PhpSpreadsheetTests/Reader/Slk/BinderTest.php b/tests/PhpSpreadsheetTests/Reader/Slk/BinderTest.php
new file mode 100644
index 0000000000..4be6753f11
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Reader/Slk/BinderTest.php
@@ -0,0 +1,30 @@
+load($infile);
+        $sheet = $spreadsheet->getActiveSheet();
+        $expected1 = [[1, 2], [3, '']];
+        self::assertSame($expected1, $sheet->toArray(null, false, false));
+        $reader2 = new Slk();
+        $reader2->setValueBinder(new StringValueBinder());
+        $spreadsheet2 = $reader2->load($infile);
+        $sheet2 = $spreadsheet2->getActiveSheet();
+        $expected2 = [['1', '2'], ['3', '']];
+        self::assertSame($expected2, $sheet2->toArray(null, false, false));
+        $spreadsheet->disconnectWorksheets();
+        $spreadsheet2->disconnectWorksheets();
+    }
+}
diff --git a/tests/PhpSpreadsheetTests/Reader/Xml/HtmlEntitiesLoadTest.php b/tests/PhpSpreadsheetTests/Reader/Xml/HtmlEntitiesLoadTest.php
new file mode 100644
index 0000000000..197e008416
--- /dev/null
+++ b/tests/PhpSpreadsheetTests/Reader/Xml/HtmlEntitiesLoadTest.php
@@ -0,0 +1,29 @@
+', $contents);
+        $reader = new XmlReader();
+        $spreadsheet = $reader->load($infile);
+        $sheet = $spreadsheet->getActiveSheet();
+        self::assertSame('Τέλεια όραση χωρίς γυαλιά', $sheet->getCell('E2')->getValue());
+        $g2 = $sheet->getCell('G2')->getValue();
+        self::assertStringContainsString('
', $g2); + $spreadsheet->disconnectWorksheets(); + } +} diff --git a/tests/data/Reader/Xml/issue.2157.small.xml b/tests/data/Reader/Xml/issue.2157.small.xml new file mode 100644 index 0000000000..93b831f34f --- /dev/null +++ b/tests/data/Reader/Xml/issue.2157.small.xml @@ -0,0 +1,109 @@ + + + + + +id +isbn +barcode +url +title +short_description +description +category +weight +pages +publication_date +3d_photo +photo +pdf_preview +flipping_book +offer +active +availability +size +country +subject +retail_price +wholesale_price +ebook-isbn +ebook-barcode +ebook-retail_price +ebook-wholesale_price +ebook-urn +ebook-drm +ebook-format +ebook-active +author1-name +author1-url +author1-photo +author1-description +author1-link +author2-name +author2-url +author2-photo +author2-description +author2-link +translator-name +translator-photo +translator-description +editor-name +editor-photo +editor-description + + +2 +978-960-364-004-2 +9789603640042 +https://www.dioptra.gr/vivlio/ygeia-diatrofi/teleia-orasi-xoris-gualia/ +Τέλεια όραση χωρίς γυαλιά +Με τη μέθοδο Bates βελτιώνουμε την όρασή μας ή προλαμβάνουμε τα προβλήματα του πιο ευαίσθητου οργάνου του σώματος – των ματιών. +Φορώντας τα γυαλιά «καταλαβαίνεις αμέσως τη διαφορά». Δεν νοιάζεσαι αν θα χρειαστείς αργότερα πιο δυνατούς φακούς, γιατί νομίζεις ότι δεν υπάρχει θεραπεία για τα ταλαιπωρημένα μάτια.</br> +Κι όμως τα πράγματα δεν είναι έτσι.</br> +Οι περισσότεροι από αυτούς που φορούν γυαλιά θα μπορούσαν να τα είχαν αποφύγει. 
Ακόμα και τώρα μπορούν να βελτιώσουν την όρασή τους με τη μέθοδο Bates.</br> +Η μέθοδος Bates είναι μια σειρά ασκήσεων και τεχνικών που χαλαρώνουν τους οφθαλμολογικούς μυς και τους επανεκπαιδεύουν να εστιάζουν αποτελεσματικά τις φωτεινές ακτίνες που εισέρχονται στο μάτι, επιτρέποντάς μας να βλέπουμε καλά και καθαρά χωρίς γυαλιά.</br> +Εφαρμόστε σήμερα κιόλας, τη μέθοδο Bates για να βελτιώσετε την όρασή σας ή να προλάβετε τα προβλήματα του πιο ευαίσθητου οργάνου του σώματος – των ματιών σας.</br> +Υγεία - Διατροφή +0.272 +160 +2002-06-03 00:00:00 +https://www.dioptra.gr/Images/Products/004_list.jpg +https://www.dioptra.gr/Images/Products/004.jpg + + +0 +1 +1 +140 x 205 +Ελλάδα +- +13.19 +12.44 +- +- +- +- +- +- +- +- +George Kypreotakis +https://www.dioptra.gr/suggrafeas/171/ + + + +- +- +- +- +- +- +- +- +- +- +- + +
+
+