Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#508 Adding Object Mapping #509

Merged
merged 24 commits into from
Nov 2, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .php-cs-fixer.php
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@
'ordered_imports' => ['imports_order' => ['class', 'function', 'const'], 'sort_algorithm' => 'alpha'],
'phpdoc_add_missing_param_annotation' => ['only_untyped' => true],
'phpdoc_align' => ['align' => 'left'],
'phpdoc_no_empty_return' => true,
'phpdoc_no_empty_return' => false,
'phpdoc_order' => true,
'phpdoc_scalar' => true,
'phpdoc_to_comment' => true,
Expand Down
6 changes: 4 additions & 2 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,13 +8,15 @@ All Notable changes to `Csv` will be documented in this file

- `TabulatDataReader::value`
- `TabulatDataReader::select`
- `TabulatDataReader::map`
- `TabulatDataReader::matching`
- `TabulatDataReader::matchingFirst`
- `TabulatDataReader::matchingFirstOrFail`
- `FragmentFinder` to implement [RFC7111](https://www.rfc-editor.org/rfc/rfc7111)
- `League\Csv\FragmentFinder` to implement [RFC7111](https://www.rfc-editor.org/rfc/rfc7111)
- `ResultSet::fromRecords`
- `Stream::setMaxLineLen`
- `Stream::getMaxLineLen`
- `League\Csv\Serializer` to allow casting records to objects [#508](https://github.com/thephpleague/csv/issues/508)

### Deprecated

Expand All @@ -26,7 +28,7 @@ It's usage will trigger a `E_USER_DEPRECATED` call.

- The optional `$header` argument for `TabularDataReader;;getRecords` becomes a full mapper between the records column offset and the column names [#498](https://github.com/thephpleague/csv/issues/498)
- `ResultSet` constructor now allows the records to be an `array`.
- to the internal `Stream` object it will throw a `RuntimeException` if the rewind action fails
- The internal `Stream` object it will throw a `RuntimeException` if the rewind action fails
- if calls to `fseek` fails (returns `-1` ) a new `RuntimeException` will be thrown too.
- `Stream` can iterate and return the full line respecting `SplFielObject` flags. Previously it only returned the CSV records.

Expand Down
200 changes: 198 additions & 2 deletions docs/9.0/reader/tabular-data-reader.md
Original file line number Diff line number Diff line change
Expand Up @@ -117,6 +117,202 @@ and its value will represent its header value.
This means that you can re-arrange the column order as well as removing or adding column to the
returned iterator. Added column will only contain the `null` value.

### Mapping records to objects

<p class="message-notice">New in version <code>9.12.0</code></p>

If you prefer working with objects instead of typed arrays it is possible to convert each record using
the `map` method. This method will cast each array record into your specified object. To do so,
the method excepts:

- as its sole argument the name of the class the array will be deserialized in;
- information on how to convert cell value into object properties using dedicated attributes;

As an example if we assume we have the following CSV document:

```csv
date,temperature,place
2011-01-01,1,Galway
2011-01-02,-1,Galway
2011-01-03,0,Galway
2011-01-01,6,Berkeley
2011-01-02,8,Berkeley
2011-01-03,5,Berkeley
```

We can define a PHP DTO using the following class and the attributes.

```php
<?php

use League\Csv\Serializer\Cell;
use League\Csv\Serializer\Record;

#[Record]
final readonly class Weather
{
public function __construct(
public float $temperature,
public Place $place,
#[Cell(castArguments: ['format' => '!Y-m-d'])]
public DateTimeImmutable $date;
) {
}
}

enum Place
{
case Berkeley;
case Galway;
}
```

To get instances of your object, you now can call the `map` method as show below:

```php
$csv = Reader::createFromString($document);
$csv->setHeaderOffset(0);
foreach ($csv->map(Weather::class) as $weather) {
// each $weather entry will be an instance of the Weather class;
}
```

The `Record` attribute is responsible for converting record cell values into the appropriate instance
properties. This means that in order to use the `Record` attribute you are required to have
a `TabularDataReader` with a non-empty header.

The deserialization engine is able to cast the tabular data value into
the appropriate type if its value is a `string` or `null` and the object public properties ares typed with

- `null`
- `mixed`
- a scalar type (support for `true` and `false` type is also present)
- any `Enum` object (backed or not)
- `DateTime`, `DateTimeImmuntable` and any class that extends those two classes.

When converting to a date object you can fine tune the conversion by optionally specifying the date
format and timezone. You can do so using the `Cell` attribute. This attribute will override the automatic
resolution and enable fine-tuning type casting on the property level.

```php
use League\Csv\Serializer\Cell;
use Carbon\CarbonImmutable;

#[Cell(
offset:'date',
cast:Serializer\CastToDate::class,
castArguments: [
'format' => '!Y-m-d',
'timezone' => 'Africa/Nairobi'
])
]
public CarbonImmutable $observedOn;
```

The above rule can be translated in plain english like this:

> convert the value of the record cell named `date` into a `CarbonImmutable` object to
> inject into the `observedOn` property of the class using the date format `!Y-m-d` and the `Africa/Nairobi`
> timezone and the `CarbonImmutable::class`.

The `Cell` attribute differs from the `Record` attribute as it can be used:

- on class properties and methods (public, protected or private).
- with tabular data **without header** (in absence of header you are required to specify the offset number).

The `Cell` attribute can take up to three (3) arguments which are all optional:

- The `offset` argument which tell the engine which cell to use via its numeric or name offset. If not present
the property name or the name of the first argument of the `setter` method will be used. In such case,
the tabular data must be using a non-empty header.
- The `cast` argument which accept the name of a class implementing the `TypeCasting` interface and responsible
for type casting the cell value.
- The `castArguments` which enable controlling typecasting by providing extra arguments to the `TypeCasting` class constructor

In any cases, if type casting fails, an exception will be thrown.

The library comes bundles with three (3) type casting classes which relies on the property type information:

- `CastToBuiltInType`: converts the cell value to a scalar type or `null`, `true` depending on the property type information.
- `CastToDate`: converts the cell value into a PHP `DateTimeInterface` implementing object. You can optionally specify the date format and its timezone if needed.
- `CastToEnum`: converts the cell value into a PHP `Enum`.

You can also provide your own class to typecast the cell value according to your own rules. To do so, first,
specify your casting with the attribute:

```php
use League\Csv\Serializer\Cell;
#[Cell(
offset: 'rating',
cast: IntegerRangeCasting::class,
castArguments: ['min' => 0, 'max' => 5, 'default' => 2]
)]
private int $ratingScore;
```

The `IntegerRangeCasting` will convert cell value and return data between `0` and `5` and default to `2` if
the value is wrong or invalid. To allow your object to cast the cell value to your liking it needs to
implement the `TypeCasting` interface. To do so, you must define a `toVariable` method that will return
the correct value once converted.

```php
use League\Csv\Serializer\TypeCasting;
use League\Csv\Serializer\TypeCastingFailed;

/**
* @implements TypeCasting<int|null>
*/
readonly class IntegerRangeCasting implements TypeCasting
{
public function __construct(
private int $min,
private int $max,
private int $default,
) {
if ($max < $min) {
throw new LogicException('The maximum value can not be lesser than the minimum value.');
}
}

public function toVariable(?string $value, string $type): ?int
{
// if the property is declared as nullable we exist early
if (in_array($value, ['', null], true) && str_starts_with($type, '?')) {
return null;
}

//the type casting class must only work with property declared as integer
if ('int' !== ltrim($type, '?')) {
throw new TypeCastingFailed('The class '. self::class . ' can only work with integer typed property.');
}

return filter_var(
$value,
FILTER_VALIDATE_INT,
['options' => ['min' => $this->min, 'max' => $this->max, 'default' => $this->default]]
);
}
}
```

As you have probably noticed, the class constructor arguments are given to the `Column` attribute via the
`castArguments` which can provide more fine-grained behaviour.

Last but not least if you only which to convert a single record, you can do so using the `Serializer::map` static
method.

```php
use League\Csv\Serializer;

$record = [
'date' => '2023-10-30',
'temperature' => '-1.5',
'place' => 'Berkeley',
];

$weather = Serializer::map(Weather::class, $record);
```

### value, first and nth

You may access any record using its offset starting at `0` in the collection using the `nth` method.
Expand Down Expand Up @@ -291,7 +487,7 @@ closure.
use League\Csv\Reader;
use League\Csv\Writer;

$writer = Writer::createFromString('');
$writer = Writer::createFromString();
$reader = Reader::createFromPath('/path/to/my/file.csv', 'r');
$reader->each(function (array $record, int $offset) use ($writer) {
if ($offset < 10) {
Expand Down Expand Up @@ -323,7 +519,7 @@ $resultSet = ResultSet::createFromTabularDataReader($reader);

$nbTotalCells = $resultSet->recude(fn (?int $carry, array $records) => ($carry ?? 0) + count($records));

//$records contains the total number of celle contains in the $resultSet
//$records contains the total number of cells contains in the $resultSet
```

The closure is similar as the one used with `array_reduce`.
Expand Down
14 changes: 14 additions & 0 deletions src/Reader.php
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@
use Closure;
use Iterator;
use JsonSerializable;
use League\Csv\Serializer\MappingFailed;
use League\Csv\Serializer\TypeCastingFailed;
use SplFileObject;

use function array_filter;
Expand Down Expand Up @@ -403,6 +405,18 @@ public function select(string|int ...$columns): TabularDataReader
return new ResultSet($this->combineHeader($this->prepareRecords(), $this->computeHeader($header)), $finalHeader);
}

/**
* @param class-string $className
*
* @throws TypeCastingFailed
* @throws MappingFailed
* @throws Exception
*/
public function map(string $className): Iterator
{
return (new Serializer($className, $this->getHeader()))->deserializeAll($this);
}

/**
* @param array<string> $header
*
Expand Down
13 changes: 13 additions & 0 deletions src/ResultSet.php
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@
use Generator;
use Iterator;
use JsonSerializable;
use League\Csv\Serializer\MappingFailed;
use League\Csv\Serializer\TypeCastingFailed;
use LimitIterator;

use function array_filter;
Expand Down Expand Up @@ -112,6 +114,17 @@ public function getIterator(): Iterator
return $this->getRecords();
}

/**
* @param class-string $className
*
* @throws TypeCastingFailed
* @throws MappingFailed
*/
public function map(string $className): Iterator
{
return (new Serializer($className, $this->getHeader()))->deserializeAll($this);
}

/**
* @param Closure(array<mixed>, array-key=): mixed $closure
*/
Expand Down
Loading