Fix row slicing with sort_and_finalize throw on column slicing when the column group is larger than the segment column size #1838

vasil-pashov · 2024-09-18T16:34:49Z

Reference Issues/PRs

What does this implement or fix?

Any other comments?

Checklist

Checklist for code changes...

Have you updated the relevant docstrings, documentation and copyright notice?
Is this contribution tested against all ArcticDB's features?
Do all exceptions introduced raise appropriate error messages?
Are API changes highlighted in the PR description?
Is the PR labelled as enhancement or bug so it appears in autogenerated release notes?

python/tests/unit/arcticdb/version_store/test_sort_merge.py

DrNickClarke · 2024-09-19T08:27:23Z

cpp/arcticdb/version/version_core.cpp

@@ -1456,6 +1456,14 @@ VersionedItem sort_merge_impl(
        "Finalizing staged data is not allowed with empty staging area"
    );

+    user_input::check<ErrorCode::E_INVALID_USER_ARGUMENT>(
+        write_options.dynamic_schema || pipeline_context->staged_descriptor_->field_count() < write_options.column_group_size,
+        "Sorting and finalizing staged data is not implemented in the case when column slicing would appear. The "


Should this say "not yet implemented"? We do have a plan to solve this in the future.

alexowens90 · 2024-09-19T08:26:57Z

cpp/arcticdb/version/version_core.cpp

+        "input DataFrame has {} fields which is more than the column group size ({}) set in the library options",
+        pipeline_context->staged_descriptor_->field_count(),
+        write_options.column_group_size
+    );


This shouldn't be an error? It should just write segments wider than the lib config setting
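
For readers following the thread, the condition being debated can be restated as a small predicate. This is only a Python paraphrase of the C++ check in the diff above; the function name is hypothetical and the parameters simply mirror `write_options.dynamic_schema`, `staged_descriptor_->field_count()` and `write_options.column_group_size`.

```python
def column_slicing_check_passes(dynamic_schema, field_count, column_group_size):
    """Hypothetical paraphrase of the guard in sort_merge_impl.

    The check passes (no error is raised) when dynamic schema is enabled,
    or when the staged data has strictly fewer fields than the column group size.
    """
    return dynamic_schema or field_count < column_group_size
```

Note that, as written in the diff, a field count equal to the column group size also fails the check, since the comparison is strict.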

alexowens90 · 2024-09-19T08:32:17Z

python/tests/unit/arcticdb/version_store/test_sort_merge.py

+class TestSlicing:
+    def test_long_append_segment(self, lmdb_library):
+        set_config_int('Merge.SegmentSize', 5)


Use a context manager to ensure it is reset at the end of the test: config_context(name, value), defined in ArcticDB/python/arcticdb/util/test.py (line 122 in 008fa54).

alexowens90 · 2024-09-19T08:34:03Z

python/tests/unit/arcticdb/version_store/test_sort_merge.py

+    assert set(lib.list_symbols()) == set([sym, sym_2])
+
+class TestSlicing:


Can we also have a test where the individual staged segments are smaller than the segment row count, but when they are compacted they are larger?
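
To make the requested case concrete, here is a small arithmetic sketch. It assumes that compaction concatenates the staged rows and re-slices them into segments of at most the configured row count; the function name and that slicing policy are assumptions for illustration, not a description of ArcticDB's internals.

```python
def row_slices(staged_row_counts, segment_row_size):
    """Row counts of the output segments after concatenating staged rows
    and re-slicing them into chunks of at most segment_row_size rows (assumed policy)."""
    total = sum(staged_row_counts)
    full, rest = divmod(total, segment_row_size)
    return [segment_row_size] * full + ([rest] if rest else [])


# Three staged segments of 2 rows each, all below a segment size of 5.
staged = [2, 2, 2]
slices = row_slices(staged, 5)  # 6 rows in total -> one slice of 5 and one of 1
```

Each staged segment is individually below the limit, yet the compacted result needs two row slices, which is the situation the extra test would cover.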

Fix row slicing with sort_and_finalize throw on column slicing when the column group is larger than the segment column size

12649f0

Use config maps to improve test timings

b9b357a

vasil-pashov marked this pull request as ready for review September 19, 2024 07:55

vasil-pashov requested review from alexowens90, willdealtry and poodlewars as code owners September 19, 2024 07:55

DrNickClarke reviewed Sep 19, 2024

alexowens90 requested changes Sep 19, 2024
