Add writers for xdr and ascii files #40

Merged on Jun 20, 2024 (124 commits)
Commits
d43adb7
Add xdr and ascii writers
trossi Jan 26, 2024
846817e
Assert that there is no unexpected value
trossi Nov 27, 2023
2331898
Include non-zero gp value in representation
trossi Jan 26, 2024
ad13676
Add initial Python to R converter
trossi Nov 27, 2023
ee7e635
Simplify string ndarray handling
trossi Nov 27, 2023
a17e1ad
Add converter for dict
trossi Nov 27, 2023
bac485e
Add converter for matrices
trossi Nov 29, 2023
816a2a4
Add function for building r lists
trossi Nov 27, 2023
1bdaad8
Convert tuple to list
trossi Nov 27, 2023
fc88b62
Use consistent function naming
trossi Nov 27, 2023
7a87f0f
Convert arrays of any dimensionality
trossi Nov 27, 2023
e632d87
Add converter for basic types
trossi Nov 27, 2023
9a539f4
Add compression
trossi Nov 27, 2023
a34f71c
Use functions instead of a class
trossi Nov 28, 2023
bbdb899
Add docstrings
trossi Nov 28, 2023
19a073c
Add Python-to-R converter functions
trossi Nov 28, 2023
7be83bf
Add docstring
trossi Nov 28, 2023
ac3fde5
Rename write_all() to write_r_data()
trossi Nov 28, 2023
328fb39
Fix flake8 errors
trossi Jan 26, 2024
41e3834
Update docstrings
trossi Jan 26, 2024
5631e4b
Fix ruff
trossi Jan 26, 2024
f15c495
Add support for RDA files (in addition to RDS)
trossi Feb 27, 2024
295d29c
Add initial test for writing
trossi Feb 29, 2024
c975802
Write RDA magic with other magic
trossi Feb 29, 2024
89bd71f
Avoid temporary files in test
trossi Feb 29, 2024
9453eee
Include test for ascii files
trossi Feb 29, 2024
192362c
Do not include trailing .0 for doubles
trossi Feb 29, 2024
98c29b0
Loop over all test files
trossi Feb 29, 2024
bd0550f
Mark not-implemented features xfail
trossi Feb 29, 2024
cb15e4c
Fix testing Windows-generated ascii files
trossi Feb 29, 2024
6532e38
Fix altreps
trossi Feb 29, 2024
646b32d
Fix writing NA string
trossi Feb 29, 2024
09aa9dd
Include filenames in pytest output
trossi Feb 29, 2024
ab5915a
Add debugging output
trossi Feb 29, 2024
ebef7d1
Remove redundant writer functions
trossi Feb 29, 2024
f58532d
Enable expression type
trossi Feb 29, 2024
e68536f
Enable builtin type
trossi Feb 29, 2024
cd30d46
Fix empty (None) string
trossi Feb 29, 2024
b153a3d
Add test for conversion
trossi Feb 29, 2024
5a9994b
Add option to change version information
trossi Feb 29, 2024
70187e2
Fix rda files
trossi Feb 29, 2024
dea4548
Set gp flag as plain int
trossi Feb 29, 2024
c1f5647
Use tuple
trossi Feb 29, 2024
4bace9f
Fix extra info in version 2 format
trossi Feb 29, 2024
6b5247b
Skip altreps
trossi Feb 29, 2024
65d1468
Generalize list builder
trossi Feb 29, 2024
b97d650
Use same encoding as in input
trossi Feb 29, 2024
499f706
Skip ambiguous tests
trossi Feb 29, 2024
fe7580e
Add support for R expression and language
trossi Mar 1, 2024
d99a003
Add equality operator for comparing numpy arrays
trossi Mar 1, 2024
dc0141f
Skip more ambiguous test files
trossi Mar 1, 2024
0ae79e0
Print also python data for debugging
trossi Mar 1, 2024
fb48601
Remove unused variable
trossi Mar 1, 2024
0d8d96a
Compare string presentations too
trossi Mar 4, 2024
534fa47
Allow debug printing in tests
trossi Mar 4, 2024
7d15bc9
Fix ruff
trossi Mar 4, 2024
b82aa04
Check that integers are small enough
trossi Mar 4, 2024
2bbf4eb
Fix mypy
trossi Mar 4, 2024
902a1fa
Add missing parameters to docstring
trossi Mar 4, 2024
cd5c6e7
Use RDS by default
trossi Mar 4, 2024
8aec705
Simplify loops
trossi Mar 4, 2024
05c4eac
Fix typos
trossi Mar 4, 2024
c0a118e
Remove debug prints from tests
trossi Mar 26, 2024
e0d5e6d
Add test for writing files with compression
trossi Mar 27, 2024
9bba69b
Use binary mode for writing ascii files
trossi Mar 27, 2024
733d6ca
Fix mypy
trossi Mar 27, 2024
386f907
Do not open temporary file twice
trossi Apr 15, 2024
fbdbbe0
Add high-level writer functions
trossi May 6, 2024
14c7f94
Rename format to file_format
trossi May 6, 2024
e8a5fad
Use high-level interface
trossi May 6, 2024
ad5ffc9
Add test for bad RDA creation
trossi May 6, 2024
5c36c73
Add tests for bad encodings
trossi May 6, 2024
b2b9b2b
Include testing None values
trossi May 6, 2024
0829615
Improve error messages
trossi May 6, 2024
5c27b04
Raise NotImplementedError on untested code
trossi May 6, 2024
df5b463
Remove unused function
trossi May 6, 2024
4beb10c
Add test for writing too large integers
trossi May 6, 2024
5e51d1f
Raise error instead of warning for unwritten tag
trossi May 6, 2024
3d815f1
Exclude type checking blocks from coverage report
trossi May 6, 2024
aa06572
Update authors
trossi May 6, 2024
76c13ab
Rename writer files to unparser
trossi May 16, 2024
a6af5fd
Rename writing to unparsing
trossi May 16, 2024
1c1cbfb
Add unparse_data() function returning bytes
trossi May 16, 2024
c129729
Fix function names in docstring
trossi May 16, 2024
6e9b022
Use Literal type for file format
trossi Jun 4, 2024
6d504d7
Use Literal type for compression
trossi Jun 4, 2024
07f4393
Use None for no compression
trossi Jun 4, 2024
f94996a
Use Literal type for file type
trossi Jun 4, 2024
746848d
Simplify handling RDA magic
trossi Jun 4, 2024
d834a8a
Simplify handling RDA vs RDS difference
trossi Jun 5, 2024
35c7ba8
Simplify type names
trossi Jun 5, 2024
3b201d2
Add test for unparsing bad rda
trossi Jun 5, 2024
86958d6
Simplify handling version integers in write
trossi Jun 6, 2024
49f70a1
Use hex representation for R versions
trossi Jun 6, 2024
7509ecb
Use callback protocol for type hint
trossi Jun 6, 2024
6347d17
Fix docstring style
trossi Jun 6, 2024
7eb953b
Fix mypy in Python 3.12
trossi Jun 6, 2024
b421ed0
Fix mypy
trossi Jun 6, 2024
6c72967
Fix mypy
trossi Jun 6, 2024
9b3093c
Remove unnecessary casting
trossi Jun 6, 2024
84cff21
Add test file for empty list
trossi Jun 6, 2024
eb57e98
Add test file for empty named list
trossi Jun 6, 2024
d539483
Fix parsing file format
trossi Jun 6, 2024
d82995f
Fix converting empty dict
trossi Jun 6, 2024
1fef633
Use file type parser function
trossi Jun 6, 2024
b7fdb39
Clarify variable names
trossi Jun 6, 2024
08c513f
Add tests for parsing empty lists
trossi Jun 6, 2024
0e001ea
Allow only non-empty data
trossi Jun 6, 2024
c35c0e5
Add test for failing to create empty RDA file
trossi Jun 6, 2024
5bc262f
Fix style
trossi Jun 6, 2024
f84b4aa
Include received type in the error message
trossi Jun 6, 2024
7aa1d31
Clarify the conversion of np.array([None])
trossi Jun 6, 2024
38c1ff6
Simplify no-error context
trossi Jun 6, 2024
cf78843
Rewrite file checker with existing functions
trossi Jun 6, 2024
53f6864
Fix mypy
trossi Jun 6, 2024
d4c961f
Clarify compression handling
trossi Jun 6, 2024
4f45cba
Add test for parsing NaN and Inf
trossi Jun 6, 2024
2bc73ce
Fix comparing NaN values
trossi Jun 6, 2024
5ab4cd5
Fix writing NaN and Inf
trossi Jun 6, 2024
565b383
Fix formatting alignment
trossi Jun 6, 2024
68b4e55
Simplify packing bits
trossi Jun 6, 2024
442fd3a
Fix error message
trossi Jun 17, 2024
42ac9b8
Use Literal type and lower case for encodings
trossi Jun 17, 2024
7cf643f
Fix integer range check in NumPy 2.0
trossi Jun 17, 2024
10 changes: 8 additions & 2 deletions pyproject.toml
@@ -10,7 +10,8 @@ keywords = [
   "dataset",
 ]
 authors = [
-  {name = "Carlos Ramos Carreño", email = "[email protected]"},
+  {name = "Carlos Ramos Carreño"},
+  {name = "Tuomas Rossi"},
 ]
 maintainers = [
   {name = "Carlos Ramos Carreño", email = "[email protected]"},
@@ -103,6 +104,11 @@ addopts = "--doctest-modules --doctest-glob='*.rst'"
 doctest_optionflags = "NORMALIZE_WHITESPACE ELLIPSIS"
 norecursedirs = ".* build dist *.egg venv .svn _build docs/auto_examples examples asv_benchmarks"

+[tool.coverage.report]
+exclude_also = [
+  "if TYPE_CHECKING:",
+]
+
 [tool.ruff.lint]
 select = [
   "ALL",
@@ -144,4 +150,4 @@ max-args = 7
 include = ["rdata*"]

 [tool.setuptools.dynamic]
-version = {attr = "rdata.__version__"}
+version = {attr = "rdata.__version__"}
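The new `[tool.coverage.report]` entry excludes `if TYPE_CHECKING:` blocks from the coverage report. A minimal sketch of why that is useful follows; the helper `describe_path` is hypothetical and not part of this PR. `TYPE_CHECKING` is `False` at runtime, so the guarded imports never execute and would otherwise be reported as missed lines.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only a static type checker treats this branch as taken. At runtime
    # TYPE_CHECKING is False, so these imports are never executed and a
    # coverage tool would count them as uncovered without an exclusion.
    import os
    from typing import Any


def describe_path(path: os.PathLike[Any] | str) -> str:
    """Return the path as text (hypothetical helper for illustration)."""
    # The annotation above is never evaluated at runtime because of the
    # __future__ import, so the guarded names are not needed here.
    return str(path)


print(TYPE_CHECKING)
print(describe_path("test.rds"))
```

The same pattern appears in `rdata/_write.py` in this PR, where `os`, `Any`, `Encoding`, `Compression` and `FileFormat` are imported only for annotations.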
1 change: 1 addition & 0 deletions rdata/__init__.py
@@ -6,6 +6,7 @@

 from . import conversion as conversion, parser as parser, testing as testing
 from ._read import read_rda as read_rda, read_rds as read_rds
+from ._write import write_rda as write_rda, write_rds as write_rds

 if TYPE_CHECKING:
     from .parser._parser import Traversable
125 changes: 125 additions & 0 deletions rdata/_write.py
@@ -0,0 +1,125 @@
+"""Functions to perform conversion and unparsing in one step."""
+from __future__ import annotations
+
+from typing import TYPE_CHECKING
+
+from .conversion import build_r_data, convert_to_r_object, convert_to_r_object_for_rda
+from .conversion.to_r import DEFAULT_FORMAT_VERSION
+from .unparser import unparse_file
+
+if TYPE_CHECKING:
+    import os
+    from typing import Any
+
+    from .conversion.to_r import Encoding
+    from .unparser import Compression, FileFormat
+
+
+def write_rds(
+    path: os.PathLike[Any] | str,
+    data: Any,  # noqa: ANN401
+    *,
+    file_format: FileFormat = "xdr",
+    compression: Compression = "gzip",
+    encoding: Encoding = "utf-8",
+    format_version: int = DEFAULT_FORMAT_VERSION,
+) -> None:
+    """
+    Write an RDS file.
+
+    This is a convenience function that wraps
+    :func:`rdata.conversion.convert_to_r_object`,
+    :func:`rdata.conversion.build_r_data`,
+    and :func:`rdata.unparser.unparse_file`,
+    as it is the common use case.
+
+    Args:
+        path: File path to be written.
+        data: Python data object.
+        file_format: File format.
+        compression: Compression.
+        encoding: Encoding to be used for strings within data.
+        format_version: File format version.
+
+    See Also:
+        :func:`write_rda`: Similar function that writes an RDA or RDATA file.
+
+    Examples:
+        Write a Python object to an RDS file.
+
+        >>> import rdata
+        >>>
+        >>> data = ["hello", 1, 2.2, 3.3+4.4j]
+        >>> rdata.write_rds("test.rds", data)
+    """
+    r_object = convert_to_r_object(
+        data,
+        encoding=encoding,
+    )
+    r_data = build_r_data(
+        r_object,
+        encoding=encoding,
+        format_version=format_version,
+    )
+    unparse_file(
+        path,
+        r_data,
+        file_type="rds",
+        file_format=file_format,
+        compression=compression,
+    )
+
+
+def write_rda(
+    path: os.PathLike[Any] | str,
+    data: dict[str, Any],
+    *,
+    file_format: FileFormat = "xdr",
+    compression: Compression = "gzip",
+    encoding: Encoding = "utf-8",
+    format_version: int = DEFAULT_FORMAT_VERSION,
+) -> None:
+    """
+    Write an RDA or RDATA file.
+
+    This is a convenience function that wraps
+    :func:`rdata.conversion.convert_to_r_object_for_rda`,
+    :func:`rdata.conversion.build_r_data`,
+    and :func:`rdata.unparser.unparse_file`,
+    as it is the common use case.
+
+    Args:
+        path: File path to be written.
+        data: Python dictionary with data and variable names.
+        file_format: File format.
+        compression: Compression.
+        encoding: Encoding to be used for strings within data.
+        format_version: File format version.
+
+    See Also:
+        :func:`write_rds`: Similar function that writes an RDS file.
+
+    Examples:
+        Write a Python dictionary to an RDA file.
+
+        >>> import rdata
+        >>>
+        >>> data = {"name": "hello", "values": [1, 2.2, 3.3+4.4j]}
+        >>> rdata.write_rda("test.rda", data)
+    """
+    r_object = convert_to_r_object_for_rda(
+        data,
+        encoding=encoding,
+    )
+    r_data = build_r_data(
+        r_object,
+        encoding=encoding,
+        format_version=format_version,
+    )
+    unparse_file(
+        path,
+        r_data,
+        file_type="rda",
+        file_format=file_format,
+        compression=compression,
+    )
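To give a rough feel for what the `file_format` and `compression` keywords select, here is a simplified, standard-library-only sketch. It is not rdata's unparser: `encode_int`, `maybe_compress` and the two `Literal` aliases are hypothetical stand-ins, and the aliases list only the values visible in this diff (`"xdr"`, `"ascii"`, `"gzip"`, `None`). The sketch assumes the usual XDR convention of 4-byte big-endian integers and a plain decimal text line for the ascii case.

```python
from __future__ import annotations

import gzip
import struct
from typing import Literal

# Hypothetical aliases; the real ones live in rdata.unparser and may differ.
FileFormat = Literal["xdr", "ascii"]
Compression = Literal["gzip", None]


def encode_int(value: int, file_format: FileFormat = "xdr") -> bytes:
    """Encode one integer in a simplified xdr-like or ascii-like form."""
    # Reject values that do not fit in a signed 32-bit integer.
    if not -2**31 <= value < 2**31:
        msg = f"integer {value} does not fit in 32 bits"
        raise ValueError(msg)
    if file_format == "xdr":
        # XDR stores an integer as 4 bytes in big-endian order.
        return struct.pack(">i", value)
    if file_format == "ascii":
        # Text form: decimal digits followed by a newline.
        return f"{value}\n".encode("ascii")
    msg = f"unknown file format: {file_format!r}"
    raise ValueError(msg)


def maybe_compress(payload: bytes, compression: Compression = "gzip") -> bytes:
    """Apply the selected compression, or return the payload unchanged."""
    if compression is None:
        return payload
    if compression == "gzip":
        return gzip.compress(payload)
    msg = f"unknown compression: {compression!r}"
    raise ValueError(msg)


print(encode_int(1, "xdr"))
print(encode_int(1, "ascii"))
print(maybe_compress(encode_int(1, "ascii"), None))
```

With the real functions in this PR, the corresponding choice would be made through the keyword arguments shown in the signatures above, for example `file_format="ascii"` together with `compression=None`.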
5 changes: 5 additions & 0 deletions rdata/conversion/__init__.py
@@ -24,3 +24,8 @@
     factor_constructor as factor_constructor,
     ts_constructor as ts_constructor,
 )
+from .to_r import (
+    build_r_data as build_r_data,
+    convert_to_r_object as convert_to_r_object,
+    convert_to_r_object_for_rda as convert_to_r_object_for_rda,
+)